Summary
We hit a permanent process deadlock in a long-running asyncio CLI using rich.Live with the default redirect_stdout=True / redirect_stderr=True and auto_refresh=True.
The app is Harbor, a high-concurrency benchmark runner. The immediate trigger was noisy third-party output and an aiohttp.ClientSession.__del__ warning during cancellation cleanup, but the observed deadlock appears to be a Rich lock-order inversion between Live._lock and Console._lock.
I understand redirecting global stdout/stderr during live rendering is inherently tricky. If this usage pattern is unsupported, a docs warning would help. If it is expected to be safe, I think there is a real deadlock here.
Environment
- Python: 3.13.3
- Rich: 14.2.0
- aiohttp: 3.13.4
- LiteLLM: 1.86.2
- Terminal: tmux pane
- Platform: Linux
What happened
The CLI froze permanently: the Live spinner stopped, no progress counters advanced, and no job files were written for 9+ hours. The process was still alive but all useful work had stopped.
At the time of the freeze:
- 46 TLS sockets owned by the process were in CLOSE-WAIT.
- Many asyncio tasks had recently been cancelled by asyncio.wait_for(...) timeouts.
- aiohttp.ClientSession.__del__ was calling the event loop exception handler for an unclosed session.
- Third-party code was also printing directly to stdout.
Observed stacks
Abbreviated py-spy dump output from the wedged process:
Thread 2525706 (idle): "MainThread"
    process_renderables (rich/live.py:281)
    print (rich/console.py:1708)
    write (rich/file_proxy.py:47)
    emit (logging/__init__.py:1153)
    handle (logging/__init__.py:1026)
    callHandlers (logging/__init__.py:1744)
    handle (logging/__init__.py:1680)
    _log (logging/__init__.py:1664)
    error (logging/__init__.py:1548)
    default_exception_handler (asyncio/base_events.py:1865)
    call_exception_handler (asyncio/base_events.py:1891)
    __del__ (aiohttp/client.py:465)
    ...
    print (rich/console.py:1724)
    write (rich/file_proxy.py:47)
    get_llm_provider (.../litellm/.../get_llm_provider_logic.py:503)
    completion_cost (.../litellm/cost_calculator.py:1255)

Thread 2525755 (idle): "Thread-1"
    render_lines (rich/console.py:1375)
    __rich_console__ (rich/live_render.py:81)
    render (rich/console.py:1345)
    print (rich/console.py:1724)
    refresh (rich/live.py:267)
    run (rich/live.py:38)
Another worker thread was also in redirected stdout rendering:
Thread 2526618 (idle): "asyncio_27"
    render_lines (rich/console.py:1375)
    __rich_console__ (rich/live_render.py:81)
    render (rich/console.py:1345)
    print (rich/console.py:1724)
    write (rich/file_proxy.py:47)
    ... third-party library print path ...
Suspected lock inversion
From Rich 14.2.0 source:
- Live._RefreshThread.run() enters with self.live._lock: and then calls self.live.refresh().
- Live.refresh() calls self.console.print(...) while still inside the live lock path.
- Console.print() eventually needs Console._lock in rendering/writing paths.
- FileProxy.write() calls console.print(...) for redirected stdout/stderr.
- Console.print() invokes live render hooks; Live.process_renderables() takes Live._lock.
The observed deadlock looks like this:
main thread:
    owns Console._lock
    -> FileProxy/logging path re-enters Rich
    -> waits for Live._lock in Live.process_renderables()
refresh thread:
    owns Live._lock
    -> calls console.print() from Live.refresh()
    -> waits for Console._lock
So the two threads wait forever:
    main thread:    Console._lock -> waits Live._lock
    refresh thread: Live._lock    -> waits Console._lock
Linux syscall state supported this: the hot Rich threads were parked in futex waits, not blocked on network IO.
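To make the suspected ordering problem concrete, here is a minimal sketch of the same two-lock inversion using plain threading primitives. It does not import or call Rich: live_lock and console_lock are stand-ins for Live._lock and Console._lock, the worker function and barriers are hypothetical scaffolding, and the acquire timeout exists only so the demo terminates instead of hanging the way our process did.

```python
import threading

# Stand-ins for Live._lock and Console._lock (assumed here to be re-entrant locks).
live_lock = threading.RLock()
console_lock = threading.RLock()

# Barriers force the bad interleaving deterministically for the demo.
both_hold_first = threading.Barrier(2, timeout=5)
both_tried_second = threading.Barrier(2, timeout=5)

results = {}


def worker(name, first, second):
    """Take `first`, then try to take `second` while still holding `first`."""
    with first:
        both_hold_first.wait()  # each thread now owns its first lock
        # Real code would block here forever; the timeout lets the demo finish.
        acquired = second.acquire(timeout=0.5)
        results[name] = acquired
        if acquired:
            second.release()
        # Keep holding `first` until the other thread has also finished trying.
        both_tried_second.wait()


main_like = threading.Thread(target=worker, args=("main", console_lock, live_lock))
refresh_like = threading.Thread(target=worker, args=("refresh", live_lock, console_lock))
main_like.start()
refresh_like.start()
main_like.join()
refresh_like.join()

for name in sorted(results):
    print(name, "acquired second lock:", results[name])
```

Neither thread can get its second lock while the other holds it. Without the timeout, both would stay parked in futex waits, which matches what we saw. A re-entrant lock does not help, because the contention is between two different threads.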
Why stderr redirection matters
aiohttp.ClientSession.__del__ calls loop.call_exception_handler(...), and asyncio's default handler logs with Python logging. If no explicit handler captures it, Python's fallback stderr handler writes to the current sys.stderr. During Live, Rich has replaced sys.stderr with FileProxy, so finalizer/error logging can re-enter Rich at arbitrary points during a render.
Third-party print(...) calls can do the same through redirected stdout.
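The fallback behaviour can be checked in isolation with a small sketch that uses only the standard library. The logger name demo.unhandled and the helper function are hypothetical, and an io.StringIO stands in for Rich's FileProxy. The point is only that a record with no handler to receive it goes to whatever sys.stderr is at the moment of the call.

```python
import contextlib
import io
import logging


def log_without_handlers(message: str) -> str:
    """Log an error on a handler-less logger and return what reached sys.stderr."""
    logger = logging.getLogger("demo.unhandled")  # hypothetical logger name
    logger.handlers.clear()
    logger.propagate = False  # keep the demo independent of any root handlers

    fake_stderr = io.StringIO()  # stands in for the object Rich installs as sys.stderr
    with contextlib.redirect_stderr(fake_stderr):
        # No handler is found, so logging falls back to its last-resort
        # handler, which looks up sys.stderr at emit time.
        logger.error(message)
    return fake_stderr.getvalue()


captured = log_without_handlers("Unclosed client session")
print(repr(captured))
```

In our case the replacement object was FileProxy, so the same fallback path ended in Console.print() rather than a plain write.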
Workaround
For our app, disabling stdout/stderr redirection appears to be the right defensive workaround:
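    from rich.console import Group
    from rich.live import Live

    # loading_progress and running_progress are the app's existing Progress objects.
    with Live(
        Group(loading_progress, running_progress),
        refresh_per_second=10,
        redirect_stdout=False,  # leave sys.stdout alone while Live is active
        redirect_stderr=False,  # leave sys.stderr alone while Live is active
    ):
        ...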
We are also suppressing noisy third-party stdout output separately.
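For the stdout suppression, one possible approach is sketched below. This is not necessarily what Harbor does: noisy_cost_lookup is a hypothetical stand-in for a third-party call that prints, and call_quietly is an illustrative helper. Note that contextlib.redirect_stdout swaps sys.stdout for the whole process, so it is only reasonable around short synchronous calls, not across awaits or in code that races with other threads printing.

```python
import contextlib
import io


def noisy_cost_lookup(model: str) -> float:
    """Hypothetical stand-in for a third-party function that prints to stdout."""
    print(f"Provider lookup for {model}")
    return 0.25


def call_quietly(func, *args, **kwargs):
    """Run func with stdout captured; return (result, captured_text)."""
    sink = io.StringIO()
    with contextlib.redirect_stdout(sink):
        result = func(*args, **kwargs)
    return result, sink.getvalue()


cost, swallowed = call_quietly(noisy_cost_lookup, "demo-model")
print(cost)
```

Keeping the captured text rather than discarding it means it can still be sent to a log file if it turns out to matter.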
Question
Is Live(..., redirect_stdout=True, redirect_stderr=True, auto_refresh=True) intended to be safe when arbitrary third-party threads/tasks may print or log to stdout/stderr during rendering?
If yes, I think the lock ordering in Live / Console / FileProxy can deadlock. If no, would you accept a docs warning around redirected stdout/stderr in multi-threaded or high-concurrency asyncio applications?