If you can swing it (i.e. you don't need to block on IO indefinitely), I'd suggest just the simple coordination model:

  * Some atomic bool controls whether the thread should stop;
  * The thread doesn't make any unbounded wait syscalls;
  * And the thread uses pthread_cond_wait (or equivalent C++ std wrappers) in place of sleeping while idle.
To kill the thread, set the stop flag and cond_signal the condvar. (Under the hood on Linux, this uses futex.)
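
A minimal sketch of that model with the C++ std wrappers (work_available and do_bounded_work are hypothetical stand-ins; note the stop flag is read under the condvar's mutex, for reasons the replies below get into):

    #include <condition_variable>
    #include <mutex>

    std::mutex m;
    std::condition_variable cv;
    bool stop = false;            // guarded by m

    bool work_available();        // hypothetical, supplied elsewhere
    void do_bounded_work();       // point 2: must not wait unboundedly

    void worker() {
        for (;;) {
            std::unique_lock<std::mutex> lock(m);
            // Sleep while idle; wakes on notify (or spuriously).
            cv.wait(lock, [] { return stop || work_available(); });
            if (stop) return;
            lock.unlock();
            do_bounded_work();
        }
    }

    void request_stop() {
        { std::lock_guard<std::mutex> lock(m); stop = true; }
        cv.notify_one();          // futex wake under the hood on Linux
    }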


The tricky part is really point 2: it can be harder than it looks (e.g. even simple file I/O can hit network drives). Async IO can really shine here, though designing async cancellation isn't exactly trivial either.


> To kill the thread, set the stop flag and cond_signal the condvar

This is a race condition. When you "spin" on a condition variable, the stop flag you check must be guarded by the same mutex you give to cond_wait.

See this article for a thorough explanation:

https://zeux.io/2024/03/23/condvars-atomic/
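
For concreteness, a sketch of the window (hypothetical names; notify_one doesn't need the mutex, so with an unguarded flag the signal can land between the check and the wait):

    #include <atomic>
    #include <condition_variable>
    #include <mutex>

    std::mutex m;
    std::condition_variable cv;
    std::atomic<bool> stop{false};   // NOT guarded by m: this is the bug

    void worker_racy() {
        std::unique_lock<std::mutex> lock(m);
        while (!stop.load()) {       // (1) reads false
            // If the stopper runs "stop = true; cv.notify_one();"
            // right here, the notification is lost...
            cv.wait(lock);           // (2) ...and this waits forever.
        }
    }

Setting the flag while holding m closes the window: the worker owns m across the check-then-wait, so the stopper can't slip in between (1) and (2).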


Relying heavily on a check for an atomic bool is prone to race conditions. I think it's cleaner to structure the event loop as a message queue and have a queued message that indicates it's time to stop.
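
One possible shape for such a loop (Message/Work/Stop and handle are hypothetical):

    #include <condition_variable>
    #include <mutex>
    #include <queue>
    #include <variant>

    struct Work { int id; };
    struct Stop {};
    using Message = std::variant<Work, Stop>;

    std::mutex m;
    std::condition_variable cv;
    std::queue<Message> q;

    void handle(const Work&);     // hypothetical work processor

    void post(Message msg) {
        { std::lock_guard<std::mutex> lock(m); q.push(std::move(msg)); }
        cv.notify_one();
    }

    void event_loop() {
        for (;;) {
            std::unique_lock<std::mutex> lock(m);
            cv.wait(lock, [] { return !q.empty(); });
            Message msg = std::move(q.front());
            q.pop();
            lock.unlock();
            if (std::holds_alternative<Stop>(msg)) return;  // time to stop
            handle(std::get<Work>(msg));
        }
    }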


Queuing a stop means you have to process the queue before stopping. That certainly is stopping cleanly, but if you wanted to stop the thread because its queue was too long and the work requests were stale, it doesn't help much.

You could maybe allow a queue-skipping feature to be used for stop messages... But if it's only for stop messages, set an atomic stop bool, then send a stop message (see the sketch below). If the thread just misses the stop bool and waits for messages, you'll get the stop message; if the queue is large, you'll get the stop bool.
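
A compact sketch of that hybrid, reusing the hypothetical queue types from the sketch above; checking the flag before touching the queue means a deep backlog can't delay the stop, while the Stop message covers the idle case:

    #include <atomic>

    std::atomic<bool> stop_flag{false};
    Message pop_blocking();       // hypothetical: cv-wait + pop as above

    void request_stop() {
        stop_flag.store(true);    // seen even if the queue is deep
        post(Stop{});             // wakes the loop if it's idle
    }

    void event_loop_hybrid() {
        for (;;) {
            if (stop_flag.load()) return;       // large queue: bail early
            Message msg = pop_blocking();       // idle: Stop message wakes us
            if (std::holds_alternative<Stop>(msg)) return;
            handle(std::get<Work>(msg));
        }
    }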

ps, hi


> Relying heavily on a check for an atomic bool is prone to race conditions.

It is not, actually. This extremely simple protocol is race-free.


Calling pthread_cond_signal without acquiring the mutex can lead to a lost wakeup. And of course you can't really acquire a mutex in an async-signal-safe context like an interrupt handler.

Without the signaling thread acquiring a mutex, you might end up signaling after T2 has checked the boolean, but before it has called cond_wait.

But this can be solved by processing the async signal in a deferred manner from some other watcher thread.
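
A common shape for that deferral (minimal POSIX sketch; the watcher consumes the signal synchronously with sigwait, so unlike a raw handler it is free to take the mutex):

    #include <condition_variable>
    #include <mutex>
    #include <thread>
    #include <pthread.h>
    #include <signal.h>

    std::mutex m;
    std::condition_variable cv;
    bool stop = false;   // guarded by m

    void signal_watcher(sigset_t set) {
        int sig;
        sigwait(&set, &sig);   // blocks until SIGTERM arrives
        { std::lock_guard<std::mutex> lock(m); stop = true; }
        cv.notify_all();       // safe here: not in handler context
    }

    int main() {
        sigset_t set;
        sigemptyset(&set);
        sigaddset(&set, SIGTERM);
        pthread_sigmask(SIG_BLOCK, &set, nullptr);  // before spawning threads
        std::thread watcher(signal_watcher, set);
        // ... spawn workers that cond_wait on (m, cv, stop) ...
        watcher.join();
    }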


Every event loop is subject to the blocked-due-to-long-running-computation issue. It bites ...


The same is true if you're repeatedly polling an atomic boolean in an event loop.


How so? It takes only a couple of machine cycles to poll a boolean.

(And what other kind of boolean is there, besides atomic? It's either true or it's false, and if nothing can set it back to false once it goes true, I don't see the hazard. It's a CPU, not an FPGA.)


The type is named atomic, but atomicity is not its only useful property. The atomic types also give control over the memory ordering, defaulting to sequentially consistent (seq_cst, the strongest).

Without memory order guarantees enforced by memory barriers, a write to the boolean in thread A is not guaranteed to be observed by thread B. That matters both after initialization (thread A sets the boolean to false, but thread B may observe true, false, or an invalid value) and after the transition (thread B may fail to observe that the boolean has flipped from false to true).

[edit: I'm not sure the above reasoning actually matters; as stated already by parent, "It's a CPU, not an FPGA"; modern multicore shared-memory CPUs have coherent caches]


> Without memory order guarantees enforced by memory barriers, a write to the boolean in thread A is not guaranteed to be observed by thread B.

No, that's not correct. Memory ordering doesn't influence how fast a write is propagated to other cores, that's what cache coherency is for. Memory ordering of an access only matters in relation to accesses on other memory locations. There's a great introduction to this topic by Mara Bos: https://marabos.nl/atomics/memory-ordering.html
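
The classic illustration of that distinction (hypothetical producer/consumer): the acquire/release pair constrains when the write to data becomes visible relative to the flag; it does not make the flag itself propagate any faster:

    #include <atomic>

    int data = 0;                    // plain, non-atomic
    std::atomic<bool> ready{false};

    void producer() {
        data = 42;
        ready.store(true, std::memory_order_release);
    }

    void consumer() {
        while (!ready.load(std::memory_order_acquire)) { /* spin */ }
        // The acquire load that saw true synchronizes-with the release
        // store, so reading data here is guaranteed to yield 42.
        int value = data;
        (void)value;
    }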


Indeed. I started to figure this out, hence my edit. Thanks for the link.

There are hypothetical, historical, and special-purpose architectures which don't have cache coherency (or implement it differently enough to matter here), but for all practical purposes, it seems that all modern, general-purpose architectures implement it.


Well, it'll be observed the next time through the loop. If that matters, then it's true that this technique isn't desirable.


Without atomic, the compiler is free to assume there won't be a next time and optimize the check into an infinite loop (in the old days, you'd mark it volatile instead)


True enough, but volatile still works, of course.


The atomic load is free on x86 with relaxed ordering, which is sufficient for this use case.
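
For a lone stop flag with no other shared state hanging off it, that might look like (do_bounded_work hypothetical):

    #include <atomic>

    std::atomic<bool> stop{false};
    void do_bounded_work();   // hypothetical

    void worker() {
        // Compiles to a plain load on x86; no fence needed for relaxed.
        while (!stop.load(std::memory_order_relaxed)) {
            do_bounded_work();
        }
    }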


disagree. i think it then becomes too tempting down the line for someone to add a message with blocking processing.

a simple clear loop that looks for a requested stop flag with a confirmed stop flag works pretty well. this can be built into a synchronous "stop" function for the caller that sets the flag and then does a timed wait on the confirmation (using condition variables and pthread_cond_timedwait or waitforxxxobject if you're on windows).
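
a sketch of that requested/confirmed pair (names hypothetical; std::condition_variable::wait_for stands in for pthread_cond_timedwait here):

    #include <chrono>
    #include <condition_variable>
    #include <mutex>

    std::mutex m;
    std::condition_variable cv;
    bool stop_requested = false;   // guarded by m
    bool stopped = false;          // guarded by m: the confirmation

    void do_bounded_work();        // hypothetical

    void worker() {
        for (;;) {
            {
                std::lock_guard<std::mutex> lock(m);
                if (stop_requested) break;
            }
            do_bounded_work();
        }
        { std::lock_guard<std::mutex> lock(m); stopped = true; }
        cv.notify_all();
    }

    // synchronous stop for the caller: true if the worker confirmed in time
    bool stop(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(m);
        stop_requested = true;
        return cv.wait_for(lock, timeout, [] { return stopped; });
    }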


Making your check less stable doesn't prevent this.

The examples in this article IIRC were something like this.

   while (check_flag())
   {
       do_stuff();
       sleep_like_a_moron_instead_of_proper_blocking_mechanism(1s);  // 1s: std::chrono_literals
   }
You're still going to be arbitrarily delayed if do_stuff() (or one of its callees, maybe deep inside the stack) delays, or if the sleep call does.

If you can't accept this, maybe don't play with threads, they are dangerous.


that's the point. use nonblocking io and an event polling mechanism with a timeout to keep an eye on an exit flag - that's all you need to handle clean shutdowns.

i think on windows you can wait on both the sockets/file descriptors and condition variables with the same waitforxxxobject blocking mechanism. on linux you can do libevent, epoll, select or pthread_cond_timedwait. all of these have "release on event or after timeout" semantics. you can use eventfd to combine them.
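
a sketch of the eventfd combination (linux-only; exit_flag and stop_fd are hypothetical names): the stop eventfd sits in the same epoll set as the sockets, and the short timeout is the belt-and-braces fallback:

    #include <atomic>
    #include <cstdint>
    #include <sys/epoll.h>
    #include <sys/eventfd.h>
    #include <unistd.h>

    std::atomic<bool> exit_flag{false};
    int stop_fd = eventfd(0, EFD_NONBLOCK);

    void request_stop() {
        exit_flag.store(true);
        uint64_t one = 1;
        (void)write(stop_fd, &one, sizeof one);   // wakes epoll_wait at once
    }

    void event_loop(int epfd) {
        epoll_event ev{};
        ev.events = EPOLLIN;
        ev.data.fd = stop_fd;
        epoll_ctl(epfd, EPOLL_CTL_ADD, stop_fd, &ev);

        epoll_event events[16];
        while (!exit_flag.load()) {                // re-checked every pass
            int n = epoll_wait(epfd, events, 16, 20 /* ms timeout */);
            for (int i = 0; i < n; ++i) {
                if (events[i].data.fd == stop_fd) return;
                // ... handle readiness on events[i].data.fd ...
            }
        }
    }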

i would not ever recommend relying on signals and writing custom cleanup handlers for them (!).

unless they're blocked waiting for an external event, most system calls tend to return in a reasonable amount of time. handle the external event blocking scenario (stuff that select waits for) and you're basically there. moreover, if you're looking to exit cleanly, you probably don't want to take your chances interrupting syscalls with signals (!) anyway.

> If you can't accept this, maybe don't play with threads, they are dangerous.

too late. when i first started playing with threads, linux didn't really support them.


> use nonblocking io and an event polling mechanism

Not incompatible with what I said.

> with a timeout to keep an eye on an exit flag

This is the stupid part. You will burn CPU cycles waking up spuriously for timeouts with no work to do. Setting the flag won't wake up the event loop until the timeout hits, adding pointless delay.

You want signalling an exit to actually wake up your event loop. Then you also don't need a timeout.

I.e. you should make your "ask to exit" code use the same wakeup mechanism as the work queue, which is what I said at the beginning. Not burning CPU polling a volatile bool in memory on the side.


> This is the stupid part. You will burn CPU cycles waking up spuriously for timeouts with no work to do. Setting the flag won't wake up the event loop until the timeout hits, adding pointless delay.

it's the smart part. waking up at 50hz or 100hz is essentially free, and if there's an os bug or other race that causes the one-time "wake up to exit" event to get lost, the system will still shut down cleanly with largely imperceptible delay. it also means the code can be ported to systems that don't support combined condition variable/fd semantics.


> You want to make signalling an exit to actually wake up your event loop.

This is exactly what condwait + condsignal do.



