Repositories - haproxy-2.1.git/log

BUILD: thread: add parenthesis around values of locking macros

clang just failed on fd.c with this error:

  src/fd.c:491:9: error: logical not is only applied to the left hand side of this comparison [-Werror,-Wlogical-not-parentheses]
          while (HA_SPIN_TRYLOCK(OTHER_LOCK, &log_lock) != 0) {
                 ^                                      ~~
That's because this expands to this:

          while (!pl_try_s(&log_lock) != 0) {

Let's just add parenthesis in the TRYLOCK macros to avoid this.
This may need to be backported if commit df187875d ("BUG/MEDIUM: log:
don't hold the log lock during writev() on a file descriptor") is
backported as well as it seems to be the first one to trigger it.

(cherry picked from commit db57a142c31016ff3e0dd533cb2b4de14445781e)
[wt: applied by hand to common/hathread.h since context significantly differed]
Signed-off-by: Willy Tarreau <w@1wt.eu>

MINOR: pools: increase MAX_BASE_POOLS to 64

When not sharing pools (i.e. when building with -DDEBUG_DONT_SHARE_POOLS)
we have about 47 pools right now, while MAX_BASE_POOLS is only 32, meaning
that only the first 32 ones will benefit from a per-thread cache entry.
This totally kills performance when pools are not shared (roughly -20%).

Let's double the limit to gain some margin, and make it possible to set
it as a build option.

It might be useful to backport this to stable versions as they're likely
to be affected as well.

(cherry picked from commit daf8aa62a8d0210f81db5c93235da19d5ed22ab3)
[wt: applied to include/common/memory.h]
Signed-off-by: Willy Tarreau <w@1wt.eu>

BUG/MINOR: threads: Don't forget to init each thread toremove_lock.

Don't forget to use HA_SPIN_INIT() on each toremove_lock, or DEBUG_THREAD may
not work reliably with it.

This should be backported to 2.1 and 2.0.

(cherry picked from commit f21695bd8b59d582efc99c85f9a1afac200eda81)
[wt: the lock is called toremove_lock[i] in 2.1]
Signed-off-by: Willy Tarreau <w@1wt.eu>

REGEST: Add reg tests about error files

2 reg tests are added. The first one ensures the declaration of errors in a
proxy is fonctionnal. It declares http-errors sections and declare error files
using the errorfile and the errorfiles directives, both in the default section
and the frontend sections. The second one ensures it is possible to use a custom
error file for an HTTP deny rule.

(cherry picked from commit a5afb0bf3668851d130d75ccd7dbf4fcadd3a2b7)
[wt: needed since the backport of 60837d340 ("REGTEST: Add a simple
script to tests errorfile directives in proxy sections")]
Signed-off-by: Willy Tarreau <w@1wt.eu>

BUILD: ebtree: fix build on libmusl after recent introduction of eb_memcmp()

In order to address a corner case with memory compares, commit 853926a
("BUG/MEDIUM: ebtree: use a byte-per-byte memcmp() to compare memory
blocks") introduced a new eb_memcmp() function, which uses ssize_t for
an offset. However this one is not known by default in ebtree for versions
prior to 2.2, and it depends on the libs. Musl requires unistd to be included
to have it.

This patch just adds this harmless include, which was verified to address
issue #760. There is no mainline commit ID as 2.2 and above already include
unistd.h as part of haproxy/api.h. It must be backported to older versions
(2.0 already has the commit above).

BUG/MEDIUM: channel: Be aware of SHUTW_NOW flag when output data are peeked

The CF_SHUTW_NOW flag must be handled the same way than the CF_SHUTW flag in
co_getblk_nc() and co_getline_nc() functions. It is especally important when we
try to peek a line from outgoing data. In this case, an unfinished line is
blocked an nothing is peeked if the CF_SHUTW_NOW flag is set. But the blocked
data pevent the transition to CF_SHUTW.

The above functions are only used by LUA cosockets. Because of this bug, we may
experienced wakeups in loop of the cosocket's io handler if we try to read a
line on a closed socket with a pending unfinished line (no LF found at the end).

This patch should fix issue #744. It must be backported to all supported
versions.

(cherry picked from commit f706a794d8b8d854cba4414adb6f9a1297c6fd02)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
(cherry picked from commit b95c6fe4a0b23b7ed69ea99e48680d29997fbdd7)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

BUG/MEDIUM: server: fix possibly uninitialized state file on close

Previous fix dc6e8a9a7 ("BUG/MEDIUM: server: resolve state file handle
leak on reload") traded a bug for another one, now we get this warning
when building server.c, which is valid since f is not necessarily
initialized (e.g. if no global state file is passed):

  src/server.c: In function 'apply_server_state':
  src/server.c:3272:3: warning: 'f' may be used uninitialized in this function [-Wmaybe-uninitialized]
     fclose(f);
   ^~~~~~~~~

Let's initialize it first. This whole code block should really be
splitted, cleaned up and reorganized as it's possible that other
similar bugs are hidden in it.

This must be backported to the same branches the commit above is
backported to (likely 2.2 and 2.1).

(cherry picked from commit 2d067f93fbbecc2b25bb0374a1cd2552299f19f0)
Signed-off-by: Willy Tarreau <w@1wt.eu>
(cherry picked from commit 4881a40915f618a8b7455f5229f46724e6f9a70b)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

BUG/MEDIUM: server: resolve state file handle leak on reload

During reload, server state file is read and file handle is not released
this was indepently reported in #738 and #660.

partially resolves #660. This should be backported to 2.2 and 2.1.

(cherry picked from commit dc6e8a9a7b19052b8fd8ab7f9a84a4a95b4a6d5b)
Signed-off-by: Willy Tarreau <w@1wt.eu>
(cherry picked from commit d74774bca53a2f4c30995d01eeb028dde313fc29)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

BUG/MEDIUM: fcgi-app: fix memory leak in fcgi_flt_http_headers

When the loop is continued early, the memory for param_rule is not freed. This
can leak memory per request, which will eventually consume all available memory
on the server.

This patch should fix the issue #750. It must be backported as far as 2.1.

(cherry picked from commit b605a736b0766030746b50c322879d929fcebacf)
Signed-off-by: Willy Tarreau <w@1wt.eu>
(cherry picked from commit 4f472bedc8c696f9c18da0eb16904a98a37f076c)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

BUG/MEDIUM: log: issue mixing sampled to not sampled log servers.

A boolean was mistakenly declared 'static THREAD_LOCAL' causing
the probe of a log to a 'not sampled' log server conditionned by
the last evaluated 'sampled log' server test on the same thread.

This results to unpredictable drops of logs on 'not sampled'
log servers as soon a 'sampled' log server is declared.

This patch removes the static THREAD_LOCAL attribute from this
boolean, fixing the issue and allowing to mix 'sampled' and
'not sampled' servers.

This fix should be backported in any branches which includes
the log sampling feature.

(cherry picked from commit 2f4cc28e0f1c8c3deca8b1b7fa446c6523dcc93c)
Signed-off-by: Willy Tarreau <w@1wt.eu>
(cherry picked from commit 9ce61f6d578876f1eda258d4cfad997e09c140a4)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

BUG/MINOR: mux-fcgi: Set flags on the right stream field for empty FCGI_STDOUT

In fcgi_strm_handle_empty_stdout(), the FCGI_SF_ES_RCVD flag is set on "->state"
stream field instead of "->flags". It is obviously wrong. This bug is not
noticeable because the right state is set in the fcgi_process_demux() function a
bit later.

This patch must be backported as far as 2.1.

(cherry picked from commit 3b3096ede1b52007fa49a563436df8ee9323d78c)
Signed-off-by: Willy Tarreau <w@1wt.eu>
(cherry picked from commit ca5e8c058e157ee7d549014705e23e3be6fc505c)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

BUG/MINOR: mux-fcgi: Set conn state to RECORD_P when skipping the record padding

When the padding of a "stream" record (STDOUT or STDERR) is skipped, we must set
the connection state to RECORD_P. It is especially important if the padding is
not fully received.

This patch must be backported as far as 2.1.

(cherry picked from commit 6c99d3baeabef9aa3a798b2251215f1026b48751)
Signed-off-by: Willy Tarreau <w@1wt.eu>
(cherry picked from commit 82cffcd95ae7dc81387344e69432c1776d02102c)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

BUG/MINOR: mux-fcgi: Handle empty STDERR record

As mentionned in the FastCGI specification, FCGI "streams" are series of
non-empty stream records (length != 0), followed by an empty one. It is properly
handled for FCGI_STDOUT records, but not for FCGI_STDERR ones. If an empty
FCGI_STDERR record is received, the connection is blocked waiting for data which
will never come.

To fix the bug, when an empty FCGI_STDERR record is received, we drop it, eating
the padding if any.

This patch should fix the issue #743. It must be backported as far as 2.1.

(cherry picked from commit 7f85433a912529c5cda1629001b34fd2b2e54758)
Signed-off-by: Willy Tarreau <w@1wt.eu>
(cherry picked from commit bd6dc07038f4c6b552559ccde41d31ff9a7dbf6d)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

BUG/MEDIUM: mux-h1: Continue to process request when switching in tunnel mode

When input data are processed, if the request is switched in tunnel mode on a
protocol upgrade, we must continue the processing. Otherwise, pending input data
will only be processed on the next wakeup. So when new input data are received,
on a timeout expiration or shutdown. Worst, if the input buffer is full when it
happens, only a timeout or a shutdown will unblock the situation.

This patch should fix the issue #737. It must be backported as far as 1.9. The
bug does not seem to affect the 2.0 and 1.9 because, on a protocol upgrade, the
request is switched in tunnel mode when the response is sent to the client. But
the bug is present, so the backport remains necessary.

(cherry picked from commit 23021ad7f1d9d0e924aa5e5c6c4103b51a805af0)
Signed-off-by: Willy Tarreau <w@1wt.eu>
(cherry picked from commit 4814c74842d13dcf8d948910902f248263f80eed)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

CONTRIB: da: fix memory leak in dummy function da_atlas_open()

The dummy function takes care of doing a bit of work using a malloc()
to avoid returning a constant but it doesn't free the tested pointer,
which coverity noticed in issue #741. Let's free it before testing it
for the return value.

This may be backported but is not important since this code is only
present to allow to build the device detection code and not to actually
run it.

(cherry picked from commit 62fd12149f73feb4e9c93a961670d527436a6e33)
Signed-off-by: Willy Tarreau <w@1wt.eu>
(cherry picked from commit 7bd62cad8b711ed0b39ca00bf74ca326909ff73d)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

BUG/MEDIUM: lists: add missing store barrier in MT_LIST_ADD/MT_LIST_ADDQ

The torture test run for previous commit 787dc20 ("BUG/MEDIUM: lists: add
missing store barrier on MT_LIST_BEHEAD()") finally broke again after 34M
connections. It appeared that MT_LIST_ADD and MT_LIST_ADDQ were suffering
from the same missing barrier when restoring the original pointers before
giving up, when checking if the element was already added. This is indeed
something which seldom happens with the shared scheduler, in case two
threads simultaneously try to wake up the same tasklet.

With a store barrier there after reverting the pointers, the torture test
survived 750M connections on the NanoPI-Fire3, so it looks good this time.
Probably that MT_LIST_BEHEAD should be added to test-list.c since it seems
to be more sensitive to concurrent accesses with MT_LIST_ADDQ.

It's worth noting that there is no barrier between the last two pointers
update, while there is one in MT_LIST_POP and MT_LIST_BEHEAD, the latter
having shown to be needed, but I cannot demonstrate why we would need
one here. Given that the code seems solid here, let's stick to what is
shown to work.

This fix should be backported to 2.1, just for the sake of safety since
the issue couldn't be triggered there, but it could change with the
compiler or when backporting a fix for example.

(cherry picked from commit 3b8f9b7b880397db03bfa54b26c8b6dbf6b5dc5e)
Signed-off-by: Willy Tarreau <w@1wt.eu>
(cherry picked from commit ae8b8361ce2d50df0fb8c2d4b4d23e13ff497029)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

BUG/MEDIUM: lists: add missing store barrier on MT_LIST_BEHEAD()

When running multi-threaded tests on my NanoPI-Fire3 (8 A53 cores), I
managed to occasionally get either a bus error or a segfault in the
scheduler, but only when running at a high connection rate (injecting
on a tcp-request connection reject rule). The bug is rare and happens
around once per million connections. I could never reproduce it with
less than 4 threads nor on A72 cores.

Haproxy 2.1.0 would also fail there but not 2.1.7.

Every time the crash happened with the TL_URGENT task list corrupted,
though it was not immediately after the LIST_SPLICE() call, indicating
background activity survived the MT_LIST_BEHEAD() operation. This queue
is where the shared runqueue is transferred, and the shared runqueue
gets fast inter-thread tasklet wakeups from idle conn takeover and new
connections.

Comparing the MT_LIST_BEHEAD() and MT_LIST_DEL() implementations, it's
quite obvious that a few barriers are missing from the former, and these
will simply fail on weakly ordered caches.

Two store barriers were added before the break() on failure, to match
what is done on the normal path. Missing them almost always results in
a segfault which is quite rare but consistent (after ~3M connections).

The 3rd one before updating n->prev seems intuitively needed though I
coudln't make the code fail without it. It's present in MT_LIST_DEL so
better not be needlessly creative. The last one is the most important
one, and seems to be the one that helps a concurrent MT_LIST_ADDQ()
detect a late failure and try again. With this, the code survives at
least 30M connections.

Interestingly the exact same issue was addressed in 2.0-dev2 for MT_LIST_DEL
with commit 690d2ad4d ("BUG/MEDIUM: list: add missing store barriers when
updating elements and head").

This fix must be backported to 2.1 as MT_LIST_BEHEAD() is also used there.

It's only tagged as medium because it will only affect entry-level CPUs
like Cortex A53 (x86 are not affected), and requires load levels that are
very hard to achieve on such machines to trigger it. In practice it's
unlikely anyone will ever hit it.

(cherry picked from commit 787dc20952aecdfb9322fb511f07d1d21b5733c7)
Signed-off-by: Willy Tarreau <w@1wt.eu>
(cherry picked from commit 826c7b73820b44b210f97b0c24320c2aa6990bf5)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

BUG/MINOR: sample: Free str.area in smp_check_const_meth

Given the following example configuration:

    listen foo
     mode http
     bind *:8080
     http-request set-var(txn.leak) meth(GET)
     server x example.com:80

Running a configuration check with valgrind reports:

    ==25992== 4 bytes in 1 blocks are definitely lost in loss record 1 of 344
    ==25992==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
    ==25992==    by 0x4E239D: my_strndup (tools.c:2261)
    ==25992==    by 0x581E20: make_arg_list (arg.c:253)
    ==25992==    by 0x4DE91D: sample_parse_expr (sample.c:890)
    ==25992==    by 0x58E304: parse_store (vars.c:772)
    ==25992==    by 0x566A3F: parse_http_req_cond (http_rules.c:95)
    ==25992==    by 0x4A4CE6: cfg_parse_listen (cfgparse-listen.c:1339)
    ==25992==    by 0x494C59: readcfgfile (cfgparse.c:2049)
    ==25992==    by 0x545145: init (haproxy.c:2029)
    ==25992==    by 0x421E42: main (haproxy.c:3175)

After this patch is applied the leak is gone as expected.

This is a fairly minor leak, but it can add up for many uses of the `bool()`
sample fetch. The bug most likely exists since the `bool()` sample fetch was
introduced in commit cc103299c75c530ab3637a1698306145bdc85552. The fix may
be backported to HAProxy 1.6+.

(cherry picked from commit 041a626a8acfd53aa3710cd3d620f7f9f01fe893)
Signed-off-by: Willy Tarreau <w@1wt.eu>
(cherry picked from commit 761200b597cb36378527ba8cca1ad10a5d09a922)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

BUG/MINOR: sample: Free str.area in smp_check_const_bool

Given the following example configuration:

    listen foo
     mode http
     bind *:8080
     http-request set-var(txn.leak) bool(1)
     server x example.com:80

Running a configuration check with valgrind reports:

    ==24233== 2 bytes in 1 blocks are definitely lost in loss record 1 of 345
    ==24233==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
    ==24233==    by 0x4E238D: my_strndup (tools.c:2261)
    ==24233==    by 0x581E10: make_arg_list (arg.c:253)
    ==24233==    by 0x4DE90D: sample_parse_expr (sample.c:890)
    ==24233==    by 0x58E2F4: parse_store (vars.c:772)
    ==24233==    by 0x566A2F: parse_http_req_cond (http_rules.c:95)
    ==24233==    by 0x4A4CE6: cfg_parse_listen (cfgparse-listen.c:1339)
    ==24233==    by 0x494C59: readcfgfile (cfgparse.c:2049)
    ==24233==    by 0x545135: init (haproxy.c:2029)
    ==24233==    by 0x421E42: main (haproxy.c:3175)

After this patch is applied the leak is gone as expected.

This is a fairly minor leak, but it can add up for many uses of the `bool()`
sample fetch. The bug most likely exists since the `bool()` sample fetch was
introduced in commit cc103299c75c530ab3637a1698306145bdc85552. The fix may
be backported to HAProxy 1.6+.

(cherry picked from commit c7d8a86f2f8ccdcaada18a5e84bd5cb8c27dac51)
Signed-off-by: Willy Tarreau <w@1wt.eu>
(cherry picked from commit 300f65dfa5d5bdb51cd3283d92bf84793f959b86)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

DOC: configuration: remove obsolete mentions of H2 being converted to HTTP/1.x

The first H2 implementation in version 1.8 used to turn HTTP/2 requests
to HTTP/1.1, causing many limitations. This is not true anymore and we
don't suffer from the lack of server-side H2 nor are we forced to close
mode anymore, so let's remove such obsolete mentions.

This could be backported to 2.0.

(cherry picked from commit 253c2519c2d3e394cc71630331dd79ceb14c57ec)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

MINOR: http: Add support for http 413 status

Add 413 http "payload too large" status code. This will allow 413 to be
used in deny_status and errorfile.

(cherry picked from commit 85048f80c9c1ca22732951e83c01ab0b31f5371f)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

[Cf: A "connection: close" has been added in the HTTP message to be consistent
with other error codes]

BUG/MINOR: backend: Remove CO_FL_SESS_IDLE if a client remains on the last server

When a connection is picked from the session server list because the proxy or
the session are marked to use the last requested server, if it is idle, we must
marked it as used removing the CO_FL_SESS_IDLE flag and decrementing the session
idle_conns counter.

This patch must be backported as far as 1.9.

(cherry picked from commit e91a526c8fe7bf3b655057d41e8b72a2011b7046)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

BUG/MEDIUM: connection: Continue to recv data to a pipe when the FD is not ready

Don't wait a FD is ready to receive data to try to get data to a pipe to be sure
to not be blocked. Indeed, after a call to raw_sock_to_pipe(), the readiness for
a FD may be disabled because we read enough data (fd_done_recv) or because
nothing was received (fd_cant_recv). In the first case, the readiness is not
re-enabled after a successful send on the opposite side.

On the 2.2, calls to fd_done_recvt() was removed from raw_sock functions in the
commit 1113116b4 ("MEDIUM: raw-sock: remove obsolete calls to
fd_{cant,cond,done}_{send,recv}"). But it is a bit dangerous to directely
backport it in a stable version because many changes were perforned on the
connection layer in the 2.2. So to unblock the situation, it is safer to just
remove the test on the FD at the beginning of raw_sock_to_pipe(). At worst, we
have an extra syscall from time to time.

This patch must be backported to 2.0.

MINOR: connection: move the CO_FL_WAIT_ROOM cleanup to the reader only

CO_FL_WAIT_ROOM is set by the splicing function in raw_sock, and cleared
by the stream-int when splicing is disabled, as well as in
conn_refresh_polling_flags() so that a new call to ->rcv_pipe() could
be attempted by the I/O callbacks called from conn_fd_handler(). This
clearing in conn_refresh_polling_flags() makes no sense anymore and is
in no way related to the polling at all.

Since we don't call them from there anymore it's better to clear it
before attempting to receive, and to set it again later. So let's move
this operation where it should be, in raw_sock_to_pipe() so that it's
now symmetric. It was also placed in raw_sock_to_buf() so that we're
certain that it gets cleared if an attempt to splice is replaced with
a subsequent attempt to recv(). And these were currently already achieved
by the call to conn_refresh_polling_flags(). Now it could theorically be
removed from the stream-int.

(cherry picked from commit e2a0eeca778115adc2e6dd7c89bb36181f349ca1)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

[Cf: This patch is mandatory in the 2.1 to remove the flag CO_FL_WAIT_ROOM
because it is not removed after a successfull send.]

BUG/MEDIUM: mux-h1: Subscribe rather than waking up in h1_rcv_buf()

At the end of h1_rcv_buf(), if the message was not fully processed (not in
HTTP_MSG_DONE state), it means some data are missing. But instead of waking up
the H1 connection, it is safer to subscribe for reads. Especially if the
splicing is enabled (or if we are waiting the input buffer is flushed). Because
in that case, waking up the H1 connection leads to a ping-pong loop with
h1_recv() function. Indeed, in h1_recv(), if the splicing is enabled, no receive
is performed and the H1 stream is woken up.

This bug is specific to the 2.1. Thus, there is no upstream commit ID for this
patch. But in the 2.2, it is part of the commit 2c1f37d35 ("OPTIM: mux-h1:
subscribe rather than waking up at a few other places"). The 2.0 must be
evaluated to know if this patch must be backported or not.

MINOR: mux-h1: Improve traces about the splicing

Trace messages have been added when the CS_FL_MAY_SPLICE flag is set or unset
and when the splicing is really enabled for the H1 connection.

This patch may be backpored to 2.1 to ease debugging.

(cherry picked from commit 27182297c763fb6f4818c9872692df51d4032d6b)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

BUG/MEDIUM: mux-h1: Disable splicing for the conn-stream if read0 is received

The CS_FL_MAY_SPLICE flag must be unset for the conn-stream if a read0 is
received while reading on the kernel pipe. It is mandatory when some data was
also received. Otherwise, this flag prevent the call to the h1 rcv_buf()
callback. Thus the read0 will never be handled by the h1 multiplexer leading to
a freeze of the session until a timeout is reached.

This patch must be backported to 2.1 and 2.0.

(cherry picked from commit a131a8fe934b40324fd8a16d237616331ea0088f)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

BUG/MINOR: mux-h1: Disable splicing only if input data was processed

In h1_rcv_buf(), the splicing is systematically disabled if it was previously
enabled. When it happens, if the splicing is enabled it means the channel's
buffer was empty before calling h1_rcv_buf(). Thus, the only reason to disable
the splicing at this step is when some input data have just been processed.

This patch may be backported to 2.1 and 2.0.

(cherry picked from commit 7b7016bf6e69f61074684e9145e1a6784b6a9d03)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

BUG/MINOR: mux-h1: Don't read data from a pipe if the mux is unable to receive

In h1_rcv_pipe(), if the mux is unable to receive data, for instance because the
multiplexer is blocked on input waiting the other side (BUSY mode), no receive
must be performed.

This patch must be backported to 2.1 and 2.0.

(cherry picked from commit 0060be9ccfc76f8c6a9c29ce99d18a37d85bcc1d)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

BUG/MINOR: mux-h1: Fix the splicing in TUNNEL mode

In the commit 17ccd1a35 ("BUG/MEDIUM: connection: add a mux flag to indicate
splice usability"), The CS_FL_MAY_SPLICE flags was added to notify the upper
layer that the mux is able to use the splicing. But this was only done for the
payload in a message, in HTTP_MSG_DATA state. But the splicing is also possible
in TUNNEL mode, in HTTP_MSG_TUNNEL state. In addition, the splicing ability is
always disabled for chunked messages.

This patch must be backported to 2.1 and 2.0.

(cherry picked from commit 2eaf30955f3619e2f262ba948313b8c2cb40ee81)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

BUG/MINOR: http_act: don't check capture id in backend (2)

Please refer to commit 19a69b3740702ce5503a063e9dfbcea5b9187d27 for all the
details. This follow up commit fixes the `http-response capture` case, the
previous one only fixed the `http-request capture` one. The documentation was
already updated and the change to `check_http_res_capture` is identical to
the `check_http_req_capture` change.

This patch must be backported together with 19a69b3740702ce5503a063e9dfbcea5b9187d27.
Most likely this is 1.6+.

(cherry picked from commit f3f4aa0266809c729ca98171a34de8c9bdd2c9f3)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

BUILD: haproxy: fix build error when RLIMIT_AS is not set

As reported in issue #724, openbsd fails to build in haproxy.c
due to a faulty comma in the middle of a warning message. This code
is only compiled when RLIMIT_AS is not defined, which seems to be
rare these days.

This may be backported to older versions as the problem was likely
introduced when strict limits were added.

(cherry picked from commit ab8b6a45be2f76efb28f7f0ad18b69549c77b4d6)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

DOC: configuration: fix alphabetical ordering for tune.pool-{high,low}-fd-ratio

In addition they were in the wrong alphabetical order in the doc. They
were added in 2.0 by commit 88698d966 ("MEDIUM: connections: Add a way
to control the number of idling connections.") so this must be backported
to 2.0.

(cherry picked from commit 83ca305ddc2222ec269e18a9df6d519ed0f08ae8)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

DOC: configuration: add missing index entries for tune.pool-{low,high}-fd-ratio

These two keywords didn't have an entry in the index. They were added in
2.0 by commit 88698d966 ("MEDIUM: connections: Add a way to control the
number of idling connections.") so this must be backported to 2.0.

(cherry picked from commit a8e2d9790548f395c5f3782117ef2eef350c7c39)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

BUG/MINOR: proxy: always initialize the trash in show servers state

Actually the cleanup in commit 6ff8143f7 ("BUG/MINOR: proxy: fix
dump_server_state()'s misuse of the trash") allowed to spot that the
trash is never reset when dumping a servers state. I couldn't manage
to make it dump garbage even with large setups but didn't find either
where it's cleared between successive calls while other handlers do
explicitly invoke chunk_reset(), so it seems to happen a bit by luck.

Let's use chunk_printf() here for each turn, it makes things clearer.

This could be backported along with previous patch, especially if any
user reports occasional garbage appearing in the show servers output.

(cherry picked from commit df2a0305f24a58f56d44e1d12ea247a5ff243757)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

BUG/MINOR: proxy: fix dump_server_state()'s misuse of the trash

dump_server_state() claims to dump into a buffer but instead it writes
into a buffer then dumps the trash into the channel, so it only supports
being called with buf=&trash and doesn't need this buffer. There doesn't
seem to be any current impact of this mistake since the function is called
from one location only.

A backport may be performed if it helps fixing other bugs but it will not
fix an existing bug by itself.

(cherry picked from commit 6ff8143f7c87ad40d7369119467d779642702ad5)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

BUG/MEDIUM: pattern: Add a trailing \0 to match strings only if possible

In pat_match_str() and pat_math_beg() functions, a trailing zero is
systematically added at the end of the string, even if the buffer is not large
enough to accommodate it. It is a possible buffer overflow. For instance, when
the alpn is matched against a list of strings, the sample fetch is filled with a
non-null terminated string returned by the SSL library. No trailing zero must be
added at the end of this string, because it is outside the buffer.

So, to fix the bug, a trailing zero is added only if the buffer is large enough
to accommodate it. Otherwise, the sample fetch is duplicated. smp_dup() function
adds a trailing zero to the duplicated string, truncating it if it is too long.

This patch should fix the issue #718. It must be backported to all supported
versions.

(cherry picked from commit b4cf7ab9bc413bbb956e225f903959bff17e4049)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

DOC: ssl: add "allow-0rtt" and "ciphersuites" in crt-list

Support for "allow-0rtt" and "ciphersuites" exists for crt-list.

Fix issue #721.

Should be backported as far as 1.8.

(cherry picked from commit 5d03639ba6fa9e7eee8af8fe489101de65d7f6f1)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

MINOR: cli: make "show sess" stop at the last known session

"show sess" and particularly "show sess all" can be very slow when dumping
lots of information, and while dumping, new sessions might appear, making
the output really endless. When threads are used, this causes a double
problem:
  - all threads are paused during the dump, so an overly long dump degrades
    the quality of service ;

  - since all threads are paused, more events get postponed, possibly
    resulting in more streams to be dumped on next invocation of the dump
    function.

This patch addresses this long-lasting issue by doing something simple:
the CLI's stream is moved at the end of the steams list, serving as an
identifiable marker to end the dump, because all entries past it were
added after the command was entered. As a result, the CLI's stream always
appears as the last one.

It may make sense to backport this to stable branches where dumping live
streams is difficult as well.

(cherry picked from commit c6e7a1b8e9da7813661096b096c65b137b72e35f)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

BUG/MEDIUM: fetch: Fix hdr_ip misparsing IPv4 addresses due to missing NUL

The IPv4 code did not take into account that the header value might not
contain the trailing NUL byte, possibly reading stray data after the header
value, failing the parse and testing the IPv6 branch. That one adds the
missing NUL, but fails to parse IPv4 addresses.

Fix this issue by always adding the trailing NUL.

The bug was reported on GitHub as issue #715.

It's not entirely clear when this bug started appearing, possibly earlier
versions of smp_fetch_hdr guaranteed the NUL termination. However the
addition of the NUL in the IPv6 case was added together with IPv6 support,
hinting that at that point in time the NUL was not guaranteed.

The commit that added IPv6 support was 69fa99292e689e355080d83ab19db4698b7c502b
which first appeared in HAProxy 1.5. This patch should be backported to
1.5+, taking into account the various buffer / chunk changes and the movement
across different files.

(cherry picked from commit 5cd00873f46f49f78587d4cc770e5ce7e979cc9c)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

REGTEST: ssl: add some ssl_c_* sample fetches test

Test the following ssl sample fetches:

ssl_c_der, ssl_c_sha1,hex, ssl_c_notafter, ssl_c_notbefore,
ssl_c_sig_alg, ssl_c_i_dn, ssl_c_s_dn, ssl_c_serial,hex, ssl_c_key_alg,
ssl_c_version

This reg-test could be used as far as haproxy 1.6.

(cherry picked from commit 86bc9539f29df77231584ba17cc1796680fce92f)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

REGTEST: ssl: tests the ssl_f_* sample fetches

Test the following ssl sample fetches:

ssl_f_der, ssl_f_sha1,hex, ssl_f_notafter, ssl_f_notbefore,
ssl_f_sig_alg, ssl_f_i_dn, ssl_f_s_dn, ssl_f_serial,hex, ssl_f_key_alg,
ssl_f_version

This reg-test could be used as far as haproxy 1.5.

(cherry picked from commit 6e56b2cd8a673ace60e9fa3fcde8fd1ecef3d22d)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

MINOR: spoe: Don't systematically create new applets if processing rate is low

When an event must be processed, we decide to create a new SPOE applet if there
is no idle applet at all or if the processing rate is lower than the number of
waiting events. But when the processing rate is very low (< 1 event/second), a
new applet is created independently of the number of idle applets.

Now, when there is at least one idle applet when there is only one event to
process, no new applet is created.

This patch is related to the issue #690.

(cherry picked from commit eccfa612e8c9fd566f94dea9b0213b8b0d2d0531)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

BUG/MINOR: http_ana: clarify connection pointer check on L7 retry

Coverity reports a possible null deref in issue #703. It seems this
cannot happen as in order to have a CF_READ_ERROR we'd need to have
attempted a recv() which implies a conn_stream, thus conn cannot be
NULL anymore. But at least one line tests for conn and the other one
not, which is confusing. So let's add a check for conn before
dereferencing it.

This needs to be backported to 2.1 and 2.0. Note that in 2.0 it's
in proto_htx.c.

(cherry picked from commit ee99aaf1f9bad06475aaf8c5dc3c0986a9899952)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

BUG/MINOR: spoe: correction of setting bits for analyzer

When a SPOE filter starts the response analyze, the wrong flag is tested on the
pre_analyzers bit field. AN_RES_INSPECT must be tested instead of
SPOE_EV_ON_TCP_RSP.

This patch must be backported to all versions with the SPOE support, i.e as far
as 1.7.

(cherry picked from commit 88403266e5c38b5fbe278a25304cbdc735ae50fe)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

REGTEST: Add a simple script to tests errorfile directives in proxy sections

This script is compatible with all HAProxy versions. It does not depend on 2.2
features.

(cherry picked from commit 60837d340ccf0c715b0b06db347d66109c20accf)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

BUG/MINOR: systemd: Wait for network to be online

Change systemd service file to wait for network to be completely
online. This solves two problems:

If haproxy is configured to bind to IP address(es) that are not yet
assigned, haproxy would previously fail. The workaround is to use
"option transparent".

If haproxy us configured to use a resolver to resolve servers via DNS,
haproxy would previously fail due to the fact that the network is not
fully online yet. This is the most compelling reason for this patch.

Signed-off-by: Ryan O'Hara <rohara@redhat.com>
Acked-by: Lukas Tribus <lukas@ltri.eu>
(cherry picked from commit f49a6049b88b7a9f0f0a18b076f34ab83820445a)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

MEDIUM: map: make the "clear map" operation yield

As reported in issue #419, a "clear map" operation on a very large map
can take a lot of time and freeze the entire process for several seconds.

This patch makes sure that pat_ref_prune() can regularly yield after
clearing some entries so that the rest of the process continues to work.
The first part, the removal of the patterns, can take quite some time
by itself in one run but it's still relatively fast. It may block for
up to 100ms for 16M IP addresses in a tree typically. This change needed
to declare an I/O handler for the clear operation so that we can get
back to it after yielding.

The second part can be much slower because it deconstructs the elements
and its users, but it iterates progressively so we can yield less often
here.

The patch was tested with traffic in parallel sollicitating the map being
released and showed no problem. Some traffic will definitely notice an
incomplete map but the filling is already not atomic anyway thus this is
not different.

It may be backported to stable versions once sufficiently tested for side
effects, at least as far as 2.0 in order to avoid the watchdog triggering
when the process is frozen there. For a better behaviour, all these
prune_* functions should support yielding so that the callers have a
chance to continue also yield in turn.

(cherry picked from commit d1d005d7f6b894ed243c177937626d888b3d03e1)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

REGTEST: http-rules: test spaces in ACLs with master CLI

Do the tests for spaces on the CLI with the master CLI.

Could be backported as far as 2.0 once the required patches are applied.

(cherry picked from commit 5bb21b1d29f58875fbac05a9fdfd49f84de991d8)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

REGTEST: http-rules: test spaces in ACLs

This reg-test tests the spaces in an ACL file, it tries to add new
entries with spaces from the CLI

This reg-test could backported in all stable branches if the fix for
spaces on the CLI was backported.

(cherry picked from commit 398c5f39ee768e1b0a97b13e1a6c31ce19a6e80c)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

BUG/MINOR: mworker/cli: fix semicolon escaping in master CLI

Fix the semicolon escaping which must be handled in the master CLI,
the commands were wrongly splitted and could be forwarded partially to
the target CLI.

(cherry picked from commit 02c255e64bd528dfa9f77dc15912d692c2213cc5)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

BUG/MINOR: mworker/cli: fix the escaping in the master CLI

The master CLI must not do the escaping since it forwards the commands
to another CLI. It should be able to split into words by taking care of
the escaping, but must not remove the forwarded backslashes.

This fix do the same thing as the previous patch applied to the
cli_parse_request() function, by taking care of the escaping during the
word split, but it also remove the part which was removing the
backslashes from the forwarded command.

(cherry picked from commit fe249c3df53127d53d55c3fda8e6be7f22cfa58b)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

BUG/MINOR: cli: allow space escaping on the CLI

It was not possible to escape spaces over the CLI, making impossible the
insertion of new ACL entries with spaces from the CLI.

This patch fixes the escaping of spaces over the CLI.

It is now possible to launch "add acl agents.acl My\ User\ Agent" over
the CLI.

Could be backported in all stable branches.

Should fix issue #400.

(cherry picked from commit b08c6d06e712b148bf396dcf13f7da590a6104c9)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

BUG/MINOR: spoe: add missing key length check before checking key names

The spoe parser fails to check that the decoded key length is large
enough to match a given key but it uses the returned length in memcmp().
So returning "ver" could match "version" for example. In addition this
makes clang 10's ASAN complain because the second argument to memcmp()
is the static key which is shorter than the decoded buffer size, which
in practice has no impact.

I'm still not 100% sure the parser is entirely correct because even
with this fix it cannot parse a key whose name matches the beginning
of another one, but in practice this does not happen. Ideally a
preliminary length check before the comparison would be safer.

This needs to be backported as far as 1.7.

(cherry picked from commit da21ed1662ff7f1610db14b3ce48f19ec37a695b)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

BUG/MEDIUM: ebtree: use a byte-per-byte memcmp() to compare memory blocks

As reported in issue #689, there is a subtle bug in the ebtree code used
to compared memory blocks. It stems from the platform-dependent memcmp()
implementation. Original implementations used to perform a byte-per-byte
comparison and to stop at the first non-matching byte, as in this old
example:

https://www.retro11.de/ouxr/211bsd/usr/src/lib/libc/compat-sys5/memcmp.c.html

The ebtree code has been relying on this to detect the first non-matching
byte when comparing keys. This is made so that a zero-terminated string
can fail to match against a longer string.

Over time, especially with large busses and SIMD instruction sets,
multi-byte comparisons have appeared, making the processor fetch bytes
past the first different byte, which could possibly be a trailing zero.
This means that it's possible to read past the allocated area for a
string if it was allocated by strdup().

This is not correct and definitely confuses address sanitizers. In real
life the problem doesn't have visible consequences. Indeed, multi-byte
comparisons are implemented so that aligned words are loaded (e.g. 512
bits at once to process a cache line at a time). So there is no way such
a multi-byte access will cross a page boundary and end up reading from
an unallocated zone. This is why it was never noticed before.

This patch addresses this by implementing a one-byte-at-a-time memcmp()
variant for ebtree, called eb_memcmp(). It's optimized for both small and
long strings and guarantees to stop after the first non-matching byte. It
only needs 5 instructions in the loop and was measured to be 3.2 times
faster than the glibc's AVX2-optimized memcmp() on short strings (1 to
257 bytes), since that latter one comes with a significant setup cost.
The break-even seems to be at 512 bytes where both version perform
equally, which is way longer than what's used in general here.

This fix should be backported to stable versions and reintegrated into
the ebtree code.

(cherry picked from commit 853926a9ac7f8a98f22121ea596034660e3ae537)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

BUG/MINOR: tcp-rules: tcp-response must check the buffer's fullness

It's unclear why the buffer length wasn't considered when tcp-response
rules were added in 1.5-dev3 with commit 97679e790 ("[MEDIUM] Implement
tcp inspect response rules"). But it's impossible to write working
tcp-response content rules as they're always waiting for the expiration
and do not consider the fact that the buffer is full. It's likely that
tcp-response content rules were only used with HTTP traffic.

This may be backported to stable versions, though it's not very
important considering that that nobody reported this in 10 years.

(cherry picked from commit 55ae1ab9e46bb8c10931cadfe685319e4fa9170c)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

BUG/MINOR: http: make smp_fetch_body() report that the contents may change

The req_body and res_body sample fetch functions forgot to set the
SMP_F_MAY_CHANGE flag, making them unusable in tcp content rules. Now we
set the flag as long as the channel is not full and nothing indicates
the end was reached.

This is marked as a bug because it's unusual for a sample fetch function
to return a final verdict while data my change, but this results from a
limitation that was affecting the legacy mode where it was not possible
to know whether the end was reached without de-chunking the message. In
HTX there is no more reason to limit this. This fix could be backported
to 2.1, and to 2.0 if really needed, though it will only be doable for
HTX, and legacy cannot be fixed.

(cherry picked from commit 9dc92b2650a83a17db9779eadf63cb011be4f8ff)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

MINOR: http: Add 404 to http-request deny

This patch adds http status code 404 Not Found to http-request deny. See
issue #80.

(cherry picked from commit 9205fea13a5e1711a54e6eca68b94128d47c391a)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

MINOR: http: Add 410 to http-request deny

This patch adds http status code 410 Gone to http-request deny. See
issue #80.

(cherry picked from commit 272e29b5cc55baacce96350a5921a5ae204f43d3)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

BUG/MEDIUM: ssl: crt-list must continue parsing on ERR_WARN

The original crt-list parsing was stopping at any non-zero value in
the cfgerr variable, even warnings.

This is an issue as it could lead to a crt-list parsing stopped at the
first warning, then HAProxy launched with a partial crt-list.

A ERR_WARN must continue the parsing. The parsing must be only
stopped on an ERR_CODE.

This commit is 2.1 only since it was fixed
in 2.2 by commit 2954c47 ("MEDIUM: ssl: allow crt-list caching")
and accidently in 2.0 by commit b131c87 ("CLEANUP: ssl: make
ssl_sock_load_cert*() return real error codes") as well as in 1.9 and
1.8.

BUG/MINOR: ssl: fix ssl-{min,max}-ver with openssl < 1.1.0

In bug #676, it was reported that ssl-min-ver SSLv3 does not work in
Amazon environments with OpenSSL 1.0.2.

The reason for this is a patch of Amazon OpenSSL which sets
SSL_OP_NO_SSLv3 in SSL_CTX_new(). Which is kind of a problem with our
implementation of ssl-{min,max}-ver in old openSSL versions, because it
does not try to clear existing version flags.

This patch fixes the bug by cleaning versions flags known by HAProxy in
the SSL_CTX before applying the right ones.

Should be backported as far as 1.8.

(cherry picked from commit d0712f3873546a0c24f3204ad75dd7eacd689602)
[wla: context edit]
Signed-off-by: William Lallemand <wlallemand@haproxy.org>

REGTESTS: Add missing OPENSSL to REQUIRE_OPTIONS for compression/lua_validation

The test uses the `ssl` keyword, add `OPENSSL` as a requirement.

Should be backported to all branches with that test.

(cherry picked from commit 185a64309c4891fbde2a35f95020c64188ad1e97)
Signed-off-by: William Lallemand <wlallemand@haproxy.org>

REGTESTS: Add missing OPENSSL to REQUIRE_OPTIONS for lua/txn_get_priv

The test uses the `ssl` keyword, add `OPENSSL` as a requirement.

Should be backported to all branches with that test.

(cherry picked from commit de5021a133e27a43da1d695876dc7fded0214293)
Signed-off-by: William Lallemand <wlallemand@haproxy.org>

BUILD: make dladdr1 depend on glibc version and not __USE_GNU

Technically speaking the call was implemented in glibc 2.3 so we must
rely on this and not on __USE_GNU which is an internal define of glibc
to track use of GNU_SOURCE.

(cherry picked from commit 62af9c83f9ed2b25e0061798e29e3cccfce5fbdc)
Signed-off-by: William Lallemand <wlallemand@haproxy.org>

BUG/MEDIUM: pattern: fix thread safety of pattern matching

Commit b5997f740 ("MAJOR: threads/map: Make acls/maps thread safe")
introduced a subtle bug in the pattern matching code. In order to cope
with the possibility that another thread might be modifying the pattern's
sample_data while it's being used, we return a thread-local static
sample_data which is a copy of the one found in the matched pattern. The
copy is performed depending on the sample_data's type. But the switch
statement misses some breaks and doesn't set the new sample_data pointer
at the right place, resulting in the original sample_data being restored
at the end before returning.

The net effect overall is that the correct sample_data is returned (hence
functionally speaking the matching works fine) but it's not thread-safe
so any del_map() or set_map() action could modify the pattern on one
thread while it's being used on another one. It doesn't seem likely to
cause a crash but could result in corrupted data appearing where the
value is consumed (e.g. when appended in a header or when logged) or an
ACL occasionally not matching after a map lookup.

This fix should be backported as far as 1.8.

Thanks to Tim for reporting it and to Emeric for the analysis.

(cherry picked from commit 2fc761e827071c4b976403e69836ff063994ee14)
Signed-off-by: William Lallemand <wlallemand@haproxy.org>

BUG/MEDIUM: log: don't hold the log lock during writev() on a file descriptor

In issue #648 a second problem was reported, indicating that some users
mistakenly send the log to an FD mapped on a file. This situation doesn't
even enable O_NONBLOCK and results in huge access times in the order of
milliseconds with the lock held and other threads waiting till the
watchdog fires to unblock the situation.

The problem with files is that O_NONBLOCK is ignored, and we still need
to lock otherwise we can end up with interleaved log messages.

What this patch does is different. Instead of locking all writers, it
uses a trylock so that there's always at most one logger and that other
candidates can simply give up and report a failure, just as would happen
if writev() returned -1 due to a pipe full condition. This solution is
elegant because it gives back the control to haproxy to decide to give
up when it takes too much time, while previously it was the kernel that
used to block the syscall.

However at high log rates (500000 req/s) there was up to 50% dropped logs
due to the contention on the lock. In order to address this, we try to
grab the lock up to 200 times and call ha_thread_relax() on failure. This
results in almost no failure (no more than previously with O_NONBLOCK). A
typical test with 6 competing threads writing to stdout chained to a pipe
to a single process shows around 1000 drops for 10 million logs at 480000
lines per second.

Please note that this doesn't mean that writing to a blocking FD is a good
idea, and it might only be temporarily done on testing environments for
debugging. A file or a terminal will continue to block the writing thread
while others spin a little bit and lose their logs, but the writing thread
will still experience performance-killing latencies.

This patch should be backported to 2.1 and 2.0. The code is in log.c in
2.0, but the principle is the same.

(cherry picked from commit df187875da479c71f12e2e8edb0a2e345f189523)
[wla: the lock is the one of the fd instead of log_lock]
Signed-off-by: William Lallemand <wlallemand@haproxy.org>

[RELEASE] Released version 2.1.7

Released version 2.1.7 with the following main changes :
- BUG/MAJOR: http-htx: Don't forget to copy error messages from defaults sections

BUG/MAJOR: http-htx: Don't forget to copy error messages from defaults sections

A bug was introduced in the commit c9dbbbdfc ("BUG/MEDIUM: http-htx: Duplicate
error messages as raw data instead of string"). No copy is performed in this
patch. The memory is allocated but the copy is not perfomed. It is stupid and it
affects all configuration with errofiles defined in a defaults section.

This patch should fix the issue #669. This bug is specific to the 2.1. Other
versions are not affected. Thus, there is no upstream commit ID for this
patch. And it should not be backported.

[RELEASE] Released version 2.1.6

Released version 2.1.6 with the following main changes :
    - Revert "BUG/MEDIUM: connections: force connections cleanup on server changes"
    - SCRIPTS: publish-release: pass -n to gzip to remove timestamp
    - BUG/MINOR: peers: fix internal/network key type mapping.
    - BUG/MEDIUM: lua: Reset analyse expiration timeout before executing a lua action
    - BUG/MEDIUM: http-htx: Duplicate error messages as raw data instead of string
    - BUG/MEDIUM: hlua: Lock pattern references to perform set/add/del operations
    - BUG/MEDIUM: contrib/prometheus-exporter: Properly set flags to dump metrics
    - BUG/MEDIUM: mworker: fix the copy of options in copy_argv()
    - BUG/MINOR: init: -x can have a parameter starting with a dash
    - BUG/MINOR: init: -S can have a parameter starting with a dash
    - BUG/MEDIUM: mworker: fix the reload with an -- option
    - BUG/MINOR: ssl: fix a trash buffer leak in some error cases
    - BUG/MINOR: mworker: fix a memleak when execvp() failed

BUG/MINOR: mworker: fix a memleak when execvp() failed

Free next_argv when execvp() failed.

Must be backported as far as 1.8.

Should fix issue #668.

(cherry picked from commit 9fc6c97fb30bb9c15d6b6e34c917b0b5aba7486a)
Signed-off-by: William Lallemand <wlallemand@haproxy.org>

BUG/MINOR: ssl: fix a trash buffer leak in some error cases

Fix a trash buffer leak when we can't take the lock of the ckch, or when
"set ssl cert" is wrongly used.

The bug was mentionned in this thread:
https://www.mail-archive.com/haproxy@formilux.org/msg37539.html

The bug was introduced by commit bc6ca7c ("MINOR: ssl/cli: rework 'set
ssl cert' as 'set/commit'").

Must be backported in 2.1.

(cherry picked from commit e5ff4addb2300db60f5a4d1f99abd54b4a1a94f6)
[wla: function cli_parse_set_cert() was moved in 2.2 and is in
src/ssl_sock.c in 2.1]
Signed-off-by: William Lallemand <wlallemand@haproxy.org>

BUG/MEDIUM: mworker: fix the reload with an -- option

When HAProxy is started with a '--' option, all following parameters are
considered configuration files. You can't add new options after a '--'.

The current reload system of the master-worker adds extra options at the
end of the arguments list. Which is a problem if HAProxy was started wih
'--'.

This patch fixes the issue by copying the new option at the beginning of
the arguments list instead of the end.

This patch must be backported as far as 1.8.

(cherry picked from commit 0041741ef7253fcad2fc98f0b2a9968fb4af3574)
[wla: context edit]
Signed-off-by: William Lallemand <wlallemand@haproxy.org>

BUG/MINOR: init: -S can have a parameter starting with a dash

There is no reason the -S option can't take an argument which starts with
a -. This limitation must only be used for options that take a
non-finite list of parameters (-sf/-st)

This can be backported only if the previous patch which fixes
copy_argv() is backported too.

Could be backported as far as 1.9.

(cherry picked from commit a6b3249935cf8e4de53b9940419683ef82cb40b8)
Signed-off-by: William Lallemand <wlallemand@haproxy.org>

BUG/MINOR: init: -x can have a parameter starting with a dash

There is no reason the -x option can't take an argument which starts with
a -. This limitation must only be used for options that take a
non-finite list of parameters (-sf/-st)

This can be backported only if the previous patch which fixes
copy_argv() is backported too.

Could be backported as far as 1.8.

(cherry picked from commit 4f71d304aad5199d9a0179deb34241b4a3665740)
Signed-off-by: William Lallemand <wlallemand@haproxy.org>

BUG/MEDIUM: mworker: fix the copy of options in copy_argv()

The copy_argv() function, which is used to copy and remove some of the
arguments of the command line in order to re-exec() the master process,
is poorly implemented.

The function tries to remove the -x and the -sf/-st options but without
taking into account that some of the options could take a parameter
starting with a dash.

In issue #644, haproxy starts with "-L -xfoo" which is perfectly
correct. However, the re-exec is done without "-xfoo" because the master
tries to remove the "-x" option. Indeed, the copy_argv() function does
not know how much arguments an option can have, and just assume that
everything starting with a dash is an option. So haproxy is exec() with
"-L" but without a parameter, which is wrong and leads to the exit of
the master, with usage().

To fix this issue, copy_argv() must know how much parameters an option
takes, and copy or skip the parameters correctly.

This fix is a first step but it should evolve to a cleaner way of
declaring the options to avoid deduplication of the parsing code, so we
avoid new bugs.

Should be backported with care as far as 1.8, by removing the options
that does not exists in the previous versions.

(cherry picked from commit df6c5a8ffad7ec28ff6987dc6b1e26d68eb89c5e)
Signed-off-by: William Lallemand <wlallemand@haproxy.org>

BUG/MEDIUM: contrib/prometheus-exporter: Properly set flags to dump metrics

When the metrics are dumped, in the main function promex_dump_metrics(), the
appctx flags are set before entering in a new scope, among others things to know
which metrics names and descriptions to use. But, those flags are not restored
when the dump is interrupted because of a full output buffer. If this happens
after the dump of global metrics, it may only lead to extra #TYPE and #HELP
lines. But if this happens during the dump of global metrics, the following
dumps of frontends, backends and servers metrics use names and descriptions of
global ones with the unmatching indexes. This first leads to unexisting metrics
names. For instance, "haproxy_frontend_nbproc". But also to out-of-bound
accesses to name and description arrays because there are more stats fields than
info fields.

It is easy to reproduce the bug using small buffers, setting tune.bufsize to
8192 for instance.

This patch should fix the issue #666. It must be backported as far as 2.0.

(cherry picked from commit efde955bdb7b7479267cfcd823cce26bb3f5c4d3)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

BUG/MEDIUM: hlua: Lock pattern references to perform set/add/del operations

The pattern references lock must be hold to perform set/add/del
operations. Unfortunately, it is not true for the lua functions manipulating acl
and map files.

This patch should fix the issue #664. It must be backported as far as 1.8.

(cherry picked from commit 5cf2dfc5fd08b7ff8d27d6293e2aaa8ec18fc607)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

BUG/MEDIUM: http-htx: Duplicate error messages as raw data instead of string

When the parsing of a proxy starts, the error messages are duplicated from the
default proxy. chunk_dup() function is used to do so. But this function handles
the source buffer as a string and try to add a null-terminated byte to the
destination buffer after the data copy. Since the 2.1, The error messages are
only HTX messages. So the buffer must not be handled as string but as raw
data. When the HTX message is not empty, it is not a problem because the
underlying buffer is considered as full. null-terminated byte cannot be added
and an exact copy is performed. But when the error message is set to /dev/null,
the source buffer as a null size. In this case, size of the destination buffer
is incremented by one. At the end, the destination buffer has a size of 1. It is
an unexpected and undefined state. In http_reply_and_close(), these buffers are
erroneously casted to junk HTX messages leading to undefined behaviors, most
probably to crashes.

This bug is specific to the 2.1. Other versions are not affected. Thus, there is
no upstream commit ID for this patch. And it should not be backported. it should
fix the issue #648 and probably the issue #658.

BUG/MEDIUM: lua: Reset analyse expiration timeout before executing a lua action

Before executing a lua action, the analyse expiration timeout of the
corresponding channel must be reset. Otherwise, when it expires, for instance
because of a call to core.msleep(), if the action yields, an expired timeout
will be used for the stream's task, leading to a loop.

This patch should fix the issue #661. It must be backported in all versions
supporting the lua.

(cherry picked from commit 23308ebf0d603d3f81b4c7a5bdcd82f1cd70c449)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

BUG/MINOR: peers: fix internal/network key type mapping.

Network types were directly and mistakenly mapped on sample types:

This patch fix the doc with values effectively used to keep backward
compatiblitiy on existing implementations.

In addition it adds an internal/network mapping for key types to avoid
further mistakes adding or modifying internals types.

This patch should be backported on all maintained branches,
particularly until v1.8 included for documentation part.

(cherry picked from commit 530ba38a706ed640e43eb44abe5578c7a5826d5d)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

SCRIPTS: publish-release: pass -n to gzip to remove timestamp

It just appeared that the tar.gz we put online are not reproducible
because a timestamp is put by default into the archive. Passing "-n"
to gzip is sufficient to remove this timestamp, so let's do it, and
also make the gzip command configurable for more flexibility. Now
issuing the commands multiple times finally results in the same
archives being produced.

This should be backported to supported stable branches.

(cherry picked from commit 6fab3e6d91bff843dcb289172c8b7a5dd54062ac)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

Revert "BUG/MEDIUM: connections: force connections cleanup on server changes"

As explained by Christopher on github issue #665:
In 2.2, srv->idle_conns and srv->safe_conns are thread-safe lists. But
not in 2.1. So the patch must be reverted or the lists must be changed
to use mt_list instead. The same must be done in 2.0, but the mt_list
does not exist on this version.

I choose to revert it as the original bug is truly revealed in v2.2
after commit 079cb9af22da6 ("MEDIUM: connections: Revamp the way idle
connections are killed")

this should resolve github issue #665

this reverts commit 7eab37b6819af685c647cf5a581e29fca2f3e079.
this reverts commit 3ad3306ec0bcb0cd4ca2b9ba134ed67663473ee8.

The patch was directly introduced on 2.1, there is no upstream commit ID for
this patch.

[RELEASE] Released version 2.1.5

Released version 2.1.5 with the following main changes :
    - BUG/MINOR: protocol_buffer: Wrong maximum shifting.
    - MINOR: ssl: improve the errors when a crt can't be open
    - BUG/MINOR: ssl/cli: memory leak in 'set ssl cert'
    - BUG/MINOR: ssl: memleak of the struct cert_key_and_chain
    - BUG/MINOR: connection: always send address-less LOCAL PROXY connections
    - BUG/MINOR: peers: Incomplete peers sections should be validated.
    - DOC: hashing: update link to hashing functions
    - MINOR: version: Show uname output in display_version()
    - DOC: Improve documentation on http-request set-src
    - BUG/MINOR: ssl: default settings for ssl server options are not used
    - BUG/MEDIUM: http-ana: Handle NTLM messages correctly.
    - BUG/MINOR: tools: fix the i386 version of the div64_32 function
    - BUG/MINOR: http: make url_decode() optionally convert '+' to SP
    - DOC: option logasap does not depend on mode
    - MEDIUM: memory: make pool_gc() run under thread isolation
    - MINOR: contrib: make the peers wireshark dissector a plugin
    - BUG/MINOR: check: Update server address and port to execute an external check
    - MINOR: checks: Add a way to send custom headers and payload during http chekcs
    - BUG/MINOR: checks: Respect the no-check-ssl option
    - BUG/MEDIUM: server/checks: Init server check during config validity check
    - BUG/MINOR: checks: chained expect will not properly wait for enough data
    - BUG/MINOR: obj_type: Handle stream object in obj_base_ptr() function
    - BUG/MINOR: mux-fcgi: Be sure to have a connection as session's origin to use it
    - BUG/MEDIUM: capture: capture-req/capture-res converters crash without a stream
    - BUG/MEDIUM: capture: capture.{req,res}.* crash without a stream
    - BUG/MEDIUM: http: the "http_first_req" sample fetch could crash without a steeam
    - BUG/MEDIUM: http: the "unique-id" sample fetch could crash without a steeam
    - BUG/MEDIUM: sample: make the CPU and latency sample fetches check for a stream
    - BUG/MEDIUM: listener: mark the thread as not stuck inside the loop
    - MINOR: threads: export the POSIX thread ID in panic dumps
    - BUG/MINOR: debug: properly use long long instead of long for the thread ID
    - BUG/MEDIUM: shctx: really check the lock's value while waiting
    - BUG/MEDIUM: shctx: bound the number of loops that can happen around the lock
    - MINOR: stream: report the list of active filters on stream crashes
    - MINOR: haproxy: export run_poll_loop
    - MINOR: tools: add new function dump_addr_and_bytes()
    - MINOR: tools: add resolve_sym_name() to resolve function pointers
    - MINOR: debug: use resolve_sym_name() to dump task handlers
    - MINOR: cli: make "show fd" rely on resolve_sym_name()
    - MEDIUM: debug: add support for dumping backtraces of stuck threads
    - MINOR: debug: call backtrace() once upon startup
    - BUILD: Makefile: include librt before libpthread
    - MINOR: wdt: do not depend on USE_THREAD
    - MINOR: debug: report the number of entries in the backtrace
    - MINOR: debug: improve backtrace() on aarch64 and possibly other systems
    - MINOR: debug: use our own backtrace function on clang+x86_64
    - MINOR: debug: dump the whole trace if we can't spot the starting point
    - BUILD: tools: unbreak resolve_sym_name() on non-GNU platforms
    - BUILD: tools: rely on __ELF__ not USE_DL to enable use of dladdr()
    - BUILD: Makefile: add linux-musl to TARGET
    - REGTEST: ssl: test the client certificate authentication
    - REGTEST: http-rules: Require PCRE or PCRE2 option to run map_redirect script
    - Revert "BUG/MINOR: connection: always send address-less LOCAL PROXY connections"
    - Revert "BUG/MINOR: connection: make sure to correctly tag local PROXY connections"
    - BUG/MINOR: checks/server: use_ssl member must be signed
    - BUG/MINOR: checks: Compute the right HTTP request length for HTTP health checks
    - BUG/MINOR: checks: Remove a warning about http health checks
    - BUG/MEDIUM: mux_fcgi: Free the FCGI connection at the end of fcgi_release()
    - BUG/MEDIUM: mux-fcgi: Fix wrong test on FCGI_CF_KEEP_CONN in fcgi_detach()
    - BUG/MEDIUM: connections: force connections cleanup on server changes
    - BUG/MEDIUM: h1: Don't compare host and authority if only h1 headers are parsed
    - BUG/MEDIUM: ssl: fix the id length check within smp_fetch_ssl_fc_session_id()
    - CLEANUP: connections: align function declaration
    - BUG/MINOR: sample: Set the correct type when a binary is converted to a string
    - BUG/MINOR: threads: fix multiple use of argument inside HA_ATOMIC_CAS()
    - BUG/MINOR: threads: fix multiple use of argument inside HA_ATOMIC_UPDATE_{MIN,MAX}()
    - BUG/MEDIUM: lua: Fix dumping of stick table entries for STD_T_DICT
    - BUG/MINOR: config: Make use_backend and use-server post-parsing less obscur
    - BUG/MINOR: http-ana: fix NTLM response parsing again
    - BUG/MEDIUM: http_ana: make the detection of NTLM variants safer
    - BUG/MINOR: cfgparse: Abort parsing the current line if an invalid \x sequence is encountered
    - BUG/MINOR: pools: use %u not %d to report pool stats in "show pools"
    - BUG/MINOR: pollers: remove uneeded free in global init
    - BUG/MINOR: soft-stop: always wake up waiting threads on stopping
    - BUILD: select: only declare existing local labels to appease clang
    - BUG/MEDIUM: streams: Remove SF_ADDR_SET if we're retrying due to L7 retry.
    - BUG/MEDIUM: stream: Only allow L7 retries when using HTTP.
    - BUG/MINOR: cache: Don't needlessly test "cache" keyword in parse_cache_flt()
    - BUG/MAJOR: mux-fcgi: Stop sending loop if FCGI stream is blocked for any reason
    - BUG/MEDIUM: ring: write-lock the ring while attaching/detaching
    - BUG/MINOR: checks: Respect check-ssl param when a port or an addr is specified
    - BUG/MINOR: server: Fix server_finalize_init() to avoid unused variable
    - DOC: retry-on can only be used with mode http
    - DOC/MINOR: halog: Add long help info for ic flag
    - DOC: SPOE is no longer experimental
    - BUG/MINOR: logs: prevent double line returns in some events.
    - REGTESTS: checks: Fix tls_health_checks when IPv6 addresses are used
    - BUG/MEDIUM: logs: fix trailing zeros on log message.
    - BUG/MINOR: lua: Add missing string length for lua sticktable lookup
    - BUG/MINOR: nameservers: fix error handling in parsing of resolv.conf

BUG/MINOR: nameservers: fix error handling in parsing of resolv.conf

In issue #657, Coverity found a bug in the "nameserver" parser for the
resolv.conf when "parse-resolv-conf" is set. What happens is that if an
unparsable address appears on a "nameserver" line, it will destroy the
previously allocated pointer before reporting the warning, then the next
"nameserver" line will dereference it again and wlil cause a crash. If
the faulty nameserver is the last one, it will only be a memory leak.
Let's just make sure we preserve the pointer when handling the error.
The patch also fixes a typo in the warning.

The bug was introduced in 1.9 with commit 44e609bfa ("MINOR: dns:
Implement `parse-resolv-conf` directive") so the fix needs to be backported
up to 1.9 or 2.0.

(cherry picked from commit 78675252fb6ce195c145afd2862fa53087bf585f)
Signed-off-by: Willy Tarreau <w@1wt.eu>

BUG/MINOR: lua: Add missing string length for lua sticktable lookup

In hlua_stktable_lookup(), the key length is never set so all
stktable:lookup("key") calls return nil from lua.

This patch must be backported as far as 1.9.

[Cf: I slightly updated the patch to use lua_tolstring() instead of
luaL_checkstring() + strlen()]

(cherry picked from commit 31a841c323b0e9692695a57d4747f71a32b9fb68)
Signed-off-by: Willy Tarreau <w@1wt.eu>

BUG/MEDIUM: logs: fix trailing zeros on log message.

This patch removes all trailing LFs and Zeros from
log messages. Previously only the last LF was removed.

It's a regression from e8ea0ae6f6 "BUG/MINOR: logs:
prevent double line returns in some events."

This should fix github issue #654

(cherry picked from commit fa9d780119adc687e58a8b09e55780a0173b4ab4)
Signed-off-by: Willy Tarreau <w@1wt.eu>

REGTESTS: checks: Fix tls_health_checks when IPv6 addresses are used

In tls_health_checks.vtc, when IPv6 addresses are used, A config error is
reported because of the "addr" server parameter. Because there is no specified
port, the IPv6 address must be enclosed into brackets to be properly parsed. It
also works with IPv4 addresses. But instead, a dummy port is added to the addr
parameter. This way, we also check the port parameter, when specified, is used
in priority over the port found in the addr parameter.

This patch should fix the issue #646.

(cherry picked from commit ed48657e0297268d83d4b07027843c0a8ab05d36)
Signed-off-by: Willy Tarreau <w@1wt.eu>

BUG/MINOR: logs: prevent double line returns in some events.

Historically some messages used to already contain the trailing LF but
not all, and __do_send_log adds a new one in needed cases. It also does
trim a trailing LF in certain cases while computing the max message
length, as a result of subtracting 1 to the available room in the
destination buffer. But the way it's done is wrong since some messages
still contain it.

So the code was fixed to always trim the trailing LF from messages if
present, and then only subtract 1 from the destination buffer room
instead of the size..

Note: new sink API is not designed to receive a trailing LF on
event messages

This could be backported to relevant stable versions with particular
care since the logic of the code changed a bit since 1.6 and there
may be other locations that need to be adjusted.

(cherry picked from commit 9e8ea0ae6f63fe1a8119d99b1218ccc095b390ac)
Signed-off-by: Willy Tarreau <w@1wt.eu>

DOC: SPOE is no longer experimental

The SPOE was marked as experiemental since the begining. But, it is no longer
true. This can be an obstacle to its use.

(cherry picked from commit 3b78809de09c6227164f6f48d587f86c827d5de1)
Signed-off-by: Willy Tarreau <w@1wt.eu>

DOC/MINOR: halog: Add long help info for ic flag

Add missing long help text for the ic (ip count) flag

(cherry picked from commit 6112f5ccd28376444c5dfccee9c3fe2be4545070)
Signed-off-by: Willy Tarreau <w@1wt.eu>

DOC: retry-on can only be used with mode http

The documentation for retry-on hints at it being meant to be used
in conjuction with mode http, but since we've a had bug report
involving mode tcp and retry-on, lets make it explicit in the
documentation that it only works with mode http and will be
ignored otherwise.

(cherry picked from commit 5ce3c14aa97396502fe68c6ce42a32ea99361046)
Signed-off-by: Willy Tarreau <w@1wt.eu>

BUG/MINOR: server: Fix server_finalize_init() to avoid unused variable

The variable 'ret' must only be declared When HAProxy is compiled with the SSL
support (more precisely SSL_CTRL_SET_TLSEXT_HOSTNAME must be defined).

No backport needed.

(cherry picked from commit b3b53524addbea79f5928b0bd5c58fd201a3e828)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
[Cf: The patch fixes a bug introduced by the commit 8892e5d3. Thus it must be
backported to the same versions.]

BUG/MINOR: checks: Respect check-ssl param when a port or an addr is specified

When a check port or a check address is specified, the check transport layer is
ignored. So it is impossible to do a SSL check in this case. This bug was
introduced by the commit 8892e5d30 ("BUG/MEDIUM: server/checks: Init server
check during config validity check").

This patch should fix the issue #643. It must be backported to all branches
where the above commit was backported.

(cherry picked from commit 66163ec616bd3dfb06605bfc5a6e640ea7ada7d7)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

BUG/MEDIUM: ring: write-lock the ring while attaching/detaching

The LIST_ADDQ() and LIST_DEL_INIT() calls made to attach/detach a waiter
to the ring were made under a read lock which was sufficient in front of
the writer's write lock. But it's not sufficient against other readers!
Thus theorically multiple "show events" on the same ring buffer on the
CLI could result in a crash, even though for now I couldn't manage to
reproduce it.

This fixes commit 1d181e489c ("MEDIUM: ring: implement a wait mode for
watchers") so it must be backported to 2.1, possibly further if the ring
code gets backported.

(cherry picked from commit 223ddedb467494d78ebf67e0ac19206f767ffe6a)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

BUG/MAJOR: mux-fcgi: Stop sending loop if FCGI stream is blocked for any reason

Because of a typo error in conditions to exit the sending loop, it is possible
to loop infinitely in fcgi_snd_buf() function. Instead of checking the FCGI
stream is not blocked to continue sending data, the FCGI connection is used. So
it is possible to have a stream blocked because there is not enough space in the
mux buffers to copy more data but continue to loop to send more data.

This patch should fix the issue #637. It must be backported to 2.1.

(cherry picked from commit fe410d685ad8bce7399e50695098b69aea07e558)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

BUG/MINOR: cache: Don't needlessly test "cache" keyword in parse_cache_flt()

parse_cache_flt() is the registered callback for the "cache" filter keyword. It
is only called when the "cache" keyword is found on a filter line. So, it is
useless to test the filter name in the callback function.

This patch should fix the issue #634. It may be backported as far as 1.9.

(cherry picked from commit 2a37cdbe6b2cf7e9fa8654ee017af12ceb6015c2)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

BUG/MEDIUM: stream: Only allow L7 retries when using HTTP.

Only allow L7 retries when using HTTP, it only really makes sense for HTTP,
anyway, and as the L7 retries code assume the message will be HTX, it will
crash when used with mode TCP.
This should fix github issue #627.

This should be backported to 2.1 and 2.0.

(cherry picked from commit 7dd7b908db1cb804ffe30ef208f7eac1824c1889)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

BUG/MEDIUM: streams: Remove SF_ADDR_SET if we're retrying due to L7 retry.

In do_l7_retry(), remove the SF_ADDR_SET flag. Otherwise,
assign_server_address() won't be called again, which means for 2.1 or 2.2,
we will always retry to connect to the server that just failed, and for 2.0,
that we will try to use to whatever the address is for the connection,
probably the last server used by that connection before it was pool_free()
and reallocated.

This should be backported to 2.1 and 2.0.

(cherry picked from commit 8cabc9783aaa8309f4ec6f7dd95cd74c61ed2e61)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

BUILD: select: only declare existing local labels to appease clang

Commit 42a50bd19 ("BUG/MINOR: pollers: remove uneeded free in global
init") removed the 'fail_revt' label from the _do_init() function
in src/ev_select.c but left the local label declaration, which makes
clang unhappy and unable to build.

These labels are only historic and unneeded anyway so let's remove them.

This should be backported where 42a50bd19 is backported.

(cherry picked from commit c9c475e40d9104cc12ebc9320cb768bb4794766d)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

BUG/MINOR: soft-stop: always wake up waiting threads on stopping

Currently the soft-stop can lead to old processes remaining alive for as
long as two seconds after receiving a soft-stop signal. What happens is
that when receiving SIGUSR1, one thread (usually the first one) wakes up,
handles the signal, sets "stopping", goes into runn_poll_loop(), and
discovers that stopping is set, so its also sets itself in the
stopping_thread_mask bit mask. After this it sees that other threads are
not yet willing to stop, so it continues to wait.

From there, other threads which were waiting in poll() expire after one
second on poll timeout and enter run_poll_loop() in turn. That's already
one second of wait time. They discover each in turn that they're stopping
and see that other threads are not yet stopping, so they go back waiting.

After the end of the first second, all threads know they're stopping and
have set their bit in stopping_thread_mask. It's only now that those who
started to wait first wake up again on timeout to discover that all other
ones are stopping, and can now quit. One second later all threads will
have done it and the process will quit.

This is effectively strictly larger than one second and up to two seconds.

What the current patch does is simple, when the first thread stops, it sets
its own bit into stopping_thread_mask then wakes up all other threads to do
also set theirs. This kills the first second which corresponds to the time
to discover the stopping state. Second, when a thread exists, it wakes all
other ones again because some might have gone back sleeping waiting for
"jobs" to go down to zero (i.e. closing the last connection). This kills
the last second of wait time.

Thanks to this, as SIGUSR1 now acts instantly again if there's no active
connection, or it stops immediately after the last connection has left if
one was still present.

This should be backported as far as 2.0.

(cherry picked from commit d7a6b2f742340ae64eead154bce80ccb25158f84)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>