OPTIM: sink: reduce contention on sink_announce_dropped()
author Willy Tarreau <w@1wt.eu>
Thu, 18 Sep 2025 06:38:34 +0000 (08:38 +0200)
committer Christopher Faulet <cfaulet@haproxy.com>
Wed, 1 Oct 2025 14:48:35 +0000 (16:48 +0200)
commit 10c4edd89f0c80b25eecb0d1ee8cf3cab7212be0
tree d90c71e22e6fc78277cb3ddfef79536e33ba9a13
parent cf36dd667b3262019f528f58d11457973efe47e3

perf top shows that sink_announce_dropped() consumes most of the CPU
on a 128-thread x86 system. Digging further reveals that the atomic
fetch_or() on the dropped field used to detect the presence of another
thread is entirely responsible for this. Indeed, the compiler implements
it using a CAS that loops without relaxing and makes all threads wait
until they can synchronize on this one, only to discover later that
another thread is there and they need to give up.

Let's just replace this with a hand-crafted CAS loop that will detect
*before* attempting the CAS if another thread is there. Doing so
achieves the same goal without forcing threads to agree. With this
simple change, the sustained request rate on h1 with all traces
enabled rose from 110k/s to 244k/s!
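
To illustrate the idea, here is a minimal sketch in C11 atomics. It is not
the code from this patch: the function names, the ANNOUNCER_BIT flag and
the layout of the counter (bit 0 marking that a thread is already
announcing) are assumptions made for the example. The first function shows
the unconditional fetch_or() pattern; the second reads the value first and
only attempts the CAS when no other thread is seen.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical layout: bit 0 of the counter marks that a thread is already
 * announcing dropped events. This is an assumption for illustration only.
 */
#define ANNOUNCER_BIT 1u

/* Unconditional pattern: every caller performs an atomic read-modify-write,
 * even when the bit is already set, so all threads fight for the cache line.
 * Returns true if the caller obtained the bit.
 */
static bool claim_with_fetch_or(atomic_uint *dropped)
{
    unsigned int old = atomic_fetch_or(dropped, ANNOUNCER_BIT);

    return !(old & ANNOUNCER_BIT);
}

/* Check-first pattern: a plain load detects another thread *before* any CAS
 * is attempted, so late arrivals give up without writing to the cache line.
 * Returns true if the caller obtained the bit.
 */
static bool claim_with_cas(atomic_uint *dropped)
{
    unsigned int old = atomic_load_explicit(dropped, memory_order_relaxed);

    do {
        if (old & ANNOUNCER_BIT)
            return false; /* someone else is there, nothing to do */
        /* on failure, the CAS reloads 'old' and the test above runs again */
    } while (!atomic_compare_exchange_weak(dropped, &old, old | ANNOUNCER_BIT));

    return true;
}

/* Clears the bit once the announcing thread is done. */
static void release_claim(atomic_uint *dropped)
{
    atomic_fetch_and(dropped, ~ANNOUNCER_BIT);
}
```

Both variants give the same result to the caller; the difference is only
that the second one lets threads that lose the race leave after a read
instead of after a successful atomic write.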

This should be backported to stable releases where it's often needed
to help debugging.

(cherry picked from commit 4431e3bd26f5e7af6e229d1d06bbc2749c2272c0)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
(cherry picked from commit e7c281fc6e11e774d0f4010758f59f6016c406d7)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
(cherry picked from commit a61a2d3c5a16dcc9e02e03a2076e852ec03bbbea)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
src/sink.c