MEDIUM: queue: refine the locking in process_srv_queue()
author Willy Tarreau <w@1wt.eu>
Fri, 18 Jun 2021 17:08:23 +0000 (19:08 +0200)
committer Willy Tarreau <w@1wt.eu>
Tue, 22 Jun 2021 16:41:55 +0000 (18:41 +0200)
commit 1b648c857bb9e0fb857e86838bcca0c9ed01e2bd
tree 22591f87744ff5be7dc7cdd49ac044ee6e4fc551
parent 3e92a31783b545dd58c4be6c588808763e0042bc

The lock in process_srv_queue() was placed around the whole loop to
avoid the cost of taking/releasing it multiple times. But in practice
almost all calls to this function only dequeue a single connection, so
that argument doesn't really stand. However, by placing the lock inside
the loop, we make it possible to release it before manipulating the
pendconn and waking the task up. That's what this patch does.
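
The idea can be illustrated with a minimal, self-contained sketch. This is
not the code from src/queue.c: struct item, struct queue, pop_locked(),
wake() and drain() are hypothetical names, and a pthread mutex stands in
for whatever lock the server queue really uses. It only shows the lock being
taken and released once per iteration, around the dequeue alone, so that the
per-item work runs with the lock already dropped.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a pending connection (not struct pendconn). */
struct item {
    struct item *next;
    int woken;              /* set once the wake-up work has been done */
};

/* Hypothetical queue protected by a single lock. */
struct queue {
    pthread_mutex_t lock;
    struct item *head;
};

/* Detach the first item, or return NULL. Caller must hold q->lock. */
static struct item *pop_locked(struct queue *q)
{
    struct item *it = q->head;

    if (it)
        q->head = it->next;
    return it;
}

/* Per-item work that does not need the queue lock; it stands in for
 * manipulating the pendconn and waking the task up.
 */
static void wake(struct item *it)
{
    it->next = NULL;
    it->woken = 1;
}

/* Dequeue at most <max> items. The lock is held only around the pop,
 * so wake() runs unlocked. Returns the number of items dequeued.
 */
static int drain(struct queue *q, int max)
{
    int done = 0;

    assert(q != NULL);
    while (done < max) {
        struct item *it;

        pthread_mutex_lock(&q->lock);
        it = pop_locked(q);
        pthread_mutex_unlock(&q->lock);

        if (!it)
            break;          /* queue is empty */

        wake(it);           /* done outside the critical section */
        done++;
    }
    return done;
}
```

When most calls dequeue a single item, this costs the same number of lock
operations as locking around the whole loop, while shortening the time the
lock is held.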

This increases the performance from 431k to 491k req/s (about 14%) on
16 threads with 20 servers under leastconn.

The performance profile changes from this:
  14.09%  haproxy             [.] process_srv_queue
  10.22%  haproxy             [.] fwlc_srv_reposition
   6.39%  haproxy             [.] fwlc_get_next_server
   3.97%  haproxy             [.] pendconn_dequeue
   3.84%  haproxy             [.] pendconn_add

to this:
  13.03%  haproxy             [.] fwlc_srv_reposition
   8.08%  haproxy             [.] fwlc_get_next_server
   3.62%  haproxy             [.] process_srv_queue
   1.78%  haproxy             [.] pendconn_dequeue
   1.74%  haproxy             [.] pendconn_add

The difference is even slightly more visible in roundrobin, which
does not have a take_conn() call.
src/queue.c