Repositories - haproxy-3.1.git/log

BUG/MINOR: log: Be able to use %ID alias at anytime of the stream's evaluation

In a log-format string, using "%[unique-id]" or "%ID" should be equivalent.
However, for the first one, the unique ID is generated when the sample fetch
function is called. For the alias, it is not true. It that case, the
stream's unique ID is generated when the log message is emitted. Otherwise,
by default, the unique id is automatically generated at the end of the HTTP
request analysis.

So, if the alias "%ID" is use in a log-format string anywhere before the end
of the request analysis, the evaluation failed and the ID is considered as
empty. It is not consistent and in contradiction with the "%ID"
documentation.

To fix the issue, instead of evaluating the unique ID when the log message
is emitted, it is now performed on demand when "%ID" format is evaluated.

This patch should fix the issue #3016. It should be backported to all stable
versions. It relies on the following commit:

* BUG/MINOR: stream: Avoid recursive evaluation for unique-id based on itself

(cherry picked from commit 20a82027ceb7a46ce2a0cbe05e40c63d132601f7)
Signed-off-by: Willy Tarreau <w@1wt.eu>
(cherry picked from commit 609d9447e2ecfe6bc4854aa2c3ee9154a97b2710)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

BUG/MINOR: stream: Avoid recursive evaluation for unique-id based on itself

There is nothing that prevent a "unique-id-format" to reference itself,
using '%ID' or '%[unique-id]'. If the sample fetch function is used, it
leads to an infinite loop, calling recursively the function responsible to
generate the unique ID.

One solution is to detect it during the configuration parsing to trigger an
error. With this patch, we just inhibit recursive calls by considering the
unique-id as empty during its evaluation. So "id-%[unique-id]" lf string
will be evaluated as "id-".

This patch must be backported to all stable versions.

(cherry picked from commit fb7b5c8a53cb4f19a223abd20660d47162aa8708)
Signed-off-by: Willy Tarreau <w@1wt.eu>
(cherry picked from commit dfbb00ac1410af65d2befe3e94afb6cfc137b220)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

BUG/MINOR: mux-quic/h3: properly handle too low peer fctl initial stream

Previously, no check on peer flow-control was implemented prior to open
a local QUIC stream. This was a small problem for frontend
implementation, as in this case haproxy as a server never opens
bidirectional streams.

On frontend, the only stream opened by haproxy in this case is for
HTTP/3 control unidirectional data. If the peer uses an initial value
for max uni streams set to 0, it would violate its flow control, and the
peer will probably close the connection. Note however that RFC 9114
mandates that each peer defines minimal initial value so that at least
the control stream can be created.

This commit improves the situation of too low initial max uni streams
value. Now, on HTTP/3 layer initialization, haproxy preemptively checks
flow control limit on streams via a new function
qcc_fctl_avail_streams(). If credit is already expired due to a too
small initial value, haproxy preemptively closes the connection using
H3_ERR_GENERAL_PROTOCOL_ERROR. This behavior is better as haproxy is now
the initiator of the connection closure.

This should be backported up to 2.8.

(cherry picked from commit 805a070ab920d14b22a6b7beac3b0648e684b2d2)
Signed-off-by: Willy Tarreau <w@1wt.eu>
(cherry picked from commit 47a4954c3974b85fc1b30ea723bf1d809e66e630)
[cf: context adjustment]
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

DOC: config: prefer-last-server: add notes for non-deterministic algorithms

Add some notes which load-balancing algorithm can be considered as
deterministic or non-deterministic and add some examples for each type.
This was asked via mailing list to clarify the usage of
prefer-last-server option.

This can be backported to all stable versions.

(cherry picked from commit cdb2f8d780a27778cc23669693a9910e07135e8a)
Signed-off-by: Willy Tarreau <w@1wt.eu>
(cherry picked from commit d6ca14b764039ed1c5d21d2fa83706996fd0a717)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

BUG/MEDIUM: hlua_fcn: ensure systematic watcher cleanup for server list iterator

In 358166a ("BUG/MINOR: hlua_fcn: restore server pairs iterator pointer
consistency"), I wrongly assumed that because the iterator was a temporary
object, no specific cleanup was needed for the watcher.

In fact watcher_detach() is not only relevant for the watcher itself, but
especially for its parent list to remove the current watcher from it.

As iterators are temporary objects, failing to remove their watchers from
the server watcher list causes the server watcher list to be corrupted.

On a normal iteration sequence, the last watcher_next() receives NULL
as target so it successfully detaches the last watcher from the list.
However the corner case here is with interrupted iterators: users are
free to break away from the iteration loop when a specific condition is
met for instance from the lua script, when this happens
hlua_listable_servers_pairs_iterator() doesn't get a chance to detach the
last iterator.

Also, Lua doesn't tell us that the loop was interrupted,
so to fix the issue we rely on the garbage collector to force a last
detach right before the object is freed. To achieve that, watcher_detach()
was slightly modified so that it becomes possible to call it without
knowing if the watcher is already detached or not, if watcher_detach() is
called on a detached watcher, the function does nothing. This way it saves
the caller from having to track the watcher state and makes the API a
little more convenient to use. This way we now systematically call
watcher_detach() for server iterators right before they are garbage
collected.

This was first reported in GH #3055. It can be observed when the server
list is browsed one than more time when it was already browsed from Lua
for a given proxy and the iteration was interrupted before the end. As the
watcher list is corrupted, the common symptom is watcher_attach() or
watcher_next() not ending due to the internal mt_list call looping
forever.

Thanks to GH users @sabretus and @sabretus for their precious help.

It should be backported everywhere 358166a was.

(cherry picked from commit aeff2a3b2a87ae5572e73944eb56e732bb98295f)
Signed-off-by: Aurelien DARRAGON <adarragon@haproxy.com>
(cherry picked from commit 97bb595f25b56bc6d8b68d75a6251a53fa86c4e1)
Signed-off-by: Aurelien DARRAGON <adarragon@haproxy.com>

BUG/MEDIUM: ssl/clienthello: ECDSA with ssl-max-ver TLSv1.2 and no ECDSA ciphers

Patch 23093c72 ("BUG/MINOR: ssl: suboptimal certificate selection with TLSv1.3
and dual ECDSA/RSA") introduced a problem when prioritizing the ECDSA
with TLSv1.3.

Indeed, when a client with TLSv1.3 capabilities announce a list of
ECDSA sigalgs, a list of TLSv1.3 ciphersuites compatible with ECDSA,
but only RSA ciphers for TLSv1.2, and haproxy is configured to a
ssl-max-ver TLSv1.2, then haproxy would use the ECDSA keypair, but the
client wouldn't be able to process it because TLSv1.2 was negociated.

HAProxy would be configured like that:

  ssl-default-bind-options ssl-max-ver TLSv1.2

And a client could be used this way:

  openssl s_client -connect localhost:8443 -cipher ECDHE-ECDSA-AES128-GCM-SHA256 \
          -ciphersuites TLS_AES_256_GCM_SHA384:TLS_CHACHA20_POLY1305_SHA256:TLS_AES_128_GCM_SHA256

This patch fixes the issue by checking if TLSv1.3 was configured before
allowing ECDSA is an TLSv1.3 ciphersuite is in the list.

This could be backported where 23093c72 ("BUG/MINOR: ssl: suboptimal
certificate selection with TLSv1.3 and dual ECDSA/RSA") was backported.
However this is quite sensible and we should wait a bit before the
backport.

This should fix issue #2988

(cherry picked from commit 4a298c6c5c64ecbbc8df1351df4b410216f95828)
Signed-off-by: William Lallemand <wlallemand@haproxy.com>
(cherry picked from commit b552780290616a66ed9eb4247250c7239d159a90)
Signed-off-by: William Lallemand <wlallemand@haproxy.com>

BUG/MEDIUM: check: Set SOCKERR by default when a connection error is reported

When a connection error is reported, we try to collect as much information
as possible on the connection status and the server status is adjusted
accordingly. However, the function does nothing if there is no connection
error and if the healthcheck is not expired yet. It is a problem when an
internal error occurred. It may happen at many places and it is hard to be
sure an error is reported on the connection. And in fact, it is already a
problem when the multiplexer allocation fails. In that case, the healthcheck
is not interrupted as it should be. Concretely, it could only happen when a
connection is established.

It is hard to predict the effects of this bug. It may be unimportant. But it
could probably lead to a crash. To avoid any issue, a SOCKERR status is now
set by default when a connection error is reported. There is no reason to
report a connection error for nothing. So a healthcheck failure must be
reported. There is no "internal error" status. So a socket error is
reported.

This patch must be backport to all stable versions.

(cherry picked from commit 54d74259e9860c7017903221f27525be89983a75)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
(cherry picked from commit 18542f3a8b2c42598ca1e5061507c8393bc3ce2d)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

MINOR: cli: handle EOS/ERROR first

It is not especially a bug fixed. But APPCTX_FL_EOS and APPCTX_FL_ERROR
flags must be handled first. These flags are set by the applet itself and
should mark the end of all processing. So there is not reason to get the
output buffer in first place.

This patch could be backported as far as 3.0.

(cherry picked from commit fb7665552668df368bbc85ef56beca11c356f53a)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
(cherry picked from commit 9a874dc54c3a6eb061f19d0a65e882d17d4b1a3e)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

BUG/MEDIUM: cli: Don't consume data if outbuf is full or not available

The output buffer must be available to process a command, at least to be
able to emit error messages. When this buffer is full or cannot be
allocated, we must wait. In that case, we must take care to notify the SE
will not consume input data. It is important to avoid wakeup in loop,
especially when the client aborts.

When the output buffer is available again and no longer full, and the CLI
applet is waiting for a command line, it must notify it will consume input
data.

This patch must be backported as far as 3.0.

(cherry picked from commit 396f0252bf3a400c9dbfdaf730e994bb44f198dd)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
(cherry picked from commit de668f5eb86f0370357042badc07a90e5d565576)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

BUG/MINOR: http-ana: Properly handle keep-query redirect option if no QS

The keep-query redirect option must do nothing is there is no query-string.
However, there is a bug. When there is no QS, an error is returned, leading
to return a 500-internal-error to the client.

To fix the bug, instead of returning 0 when there is no QS, we just skip the
QS processing.

This patch should fix the issue #3005. It must be backported as far as 3.1.

(cherry picked from commit edb8f2bb60d11cc92895e6cf1cb05e5b6cdc51d7)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
(cherry picked from commit c7fc5c4741e70322cb97bb1d913ff6db86ab0ab2)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

BUG/MINOR: config/server: reject QUIC addresses

QUIC is not implemented on the backend side. To prevent any issue, it is
better to reject any server configured which uses it. This is done via
_srv_parse_init() which is used both for static and dynamic servers.

This should be backported up to all stable versions.

(cherry picked from commit 1ecf2e9babe9d2505cebfc9e0f64454be66c2905)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
(cherry picked from commit 4b2ae7c9c644b5b27af732dc15df3d2fc7558693)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

BUG/MINIR: h1: Fix doc of 'accept-unsafe-...-request' about URI parsing

The description of tests performed on the URI in H1 when
'accept-unsafe-violations-in-http-request' option is wrong. It states that
only characters below 32 and 127 are blocked when this option is set,
suggesting that otherwise, when it is not set, all invalid characters in the
URI, according to the RFC3986, are blocked.

But in fact, it is not true. By default all character below 32 and above 127
are blocked. And when 'accept-unsafe-violations-in-http-request' option is
set, characters above 127 (excluded) are accepted. But characters in
(33..126) are never checked, independently of this option.

This patch should fix the issue #2906. It should be backported as far as
3.0. For older versions, the docuementation could also be clarified because
this part is not really clear.

Note the request URI validation is still under discution because invalid
characters in (33.126) are never checked and some users request a stricter
parsing.

(cherry picked from commit b2f64af3413e623aff2f2413cbbef6ac27f44f21)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
(cherry picked from commit 3b01e537edd2b50c449ede6cb312caa45786ca35)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

BUG/MEDIUM: fd: Use the provided tgid in fd_insert() to get tgroup_info

In fd_insert(), use the provided tgid to ghet the thread group info,
instead of using the one of the current thread, as we may call
fd_insert() from a thread of another thread group, that will happen at
least when binding the listeners. Otherwise we'd end up accessing the
thread mask containing enabled thread of the wrong thread group, which
can lead to crashes if we're binding on threads not present in the
thread group.
This should fix Github issue #2991.

This should be backported up to 2.8.

(cherry picked from commit 6993981cd6e81448cd6a21ca32f21f2b548aa1b3)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
(cherry picked from commit 12989a221bdf829a36f73d25ac947e95af7af59a)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

BUG/MINOR: quic: Missing SSL session object freeing

qc_alloc_ssl_sock_ctx() allocates an SSL_CTX object for each connection. It also
allocates an SSL object. When this function failed, it freed only the SSL_CTX object.
The correct way to free both of them is to call qc_free_ssl_sock_ctx().

Must be backported as far as 2.6.

(cherry picked from commit 6b746330692380f7e88fc757b7124b5a9c88ebd8)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
(cherry picked from commit aaef9e0000fb2fc6011d9a95e6e7f5241863db8f)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

BUG/MINOR: mux-spop: Fix null-pointer deref on SPOP stream allocation failure

When we try to allocate a new SPOP stream, if an error is encountered,
spop_strm_destroy() is called to released the eventually allocated
stream. But, it must only be called if a stream was allocated. If the
reported error is an SPOP stream allocation failure, we must just leave to
avoid null-pointer dereference.

This patch should fix point 1 of the issue #2993. It must be backported as
far as 3.1.

(cherry picked from commit 8c4bb8cab37f72c451bc7685eaf58cb1c2f5fae2)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
(cherry picked from commit f4f45adb3f98570d817c6e63b662001ca95292d3)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

BUG/MEDIUM: check: Requeue healthchecks on I/O events to handle check timeout

When a healthchecks is processed, once the first wakeup passed to start the
check, and as long as the expiration timer is not reached, only I/O events
are able to wake it up. It is an issue when there is a check timeout
defined. Especially if the connect timeout is high and the check timeout is
low. In that case, the healthcheck's task is never requeue to handle any
timeout update. When the connection is established, the check timeout is set
to replace the connect timeout. It is thus possible to report a success
while a timeout should be reported.

So, now, when an I/O event is handled, the healthcheck is requeue, except if
an success or an abort is reported.

Thanks to Thierry Fournier for report and the reproducer.

This patch must be backported to all stable versions.

(cherry picked from commit 7c788f0984623f727a71ae4aee9917ddeac1b59d)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
(cherry picked from commit 73b733e2e0bf6d6e9914c3b58a632f2b4dcb2d8d)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

DOC: config: Fix a typo in 2.7 (Name format for maps and ACLs)

"identified" was used instead of "identifier". May be backported as far as
3.0

(cherry picked from commit 8e8cdf114b1961b008fd108dbe6b6c4aa0ce2b68)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
(cherry picked from commit 81e566388907ca03c56ee30ad745ddd93ce04a97)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

BUILD: tools: properly define ha_dump_backtrace() to avoid a build warning

In resolve_sym_name() we declare a few symbols that we want to be able
to resolve. ha_dump_backtrace() was declared with a struct buffer instead
of a pointer to such a struct, which has no effect since we only want to
get the function's pointer, but produces a build warning with LTO, so
let's fix it.

This can be backported to 3.0.

(cherry picked from commit b88164d9c0eb1540c9b787478162c254ac947e8d)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
(cherry picked from commit 6330c20026e79f54a6ead77d6c9573945a074744)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

BUG/MEDIUM: h1/h2/h3: reject forbidden chars in the Host header field

In continuation with 9a05c1f574 ("BUG/MEDIUM: h2/h3: reject some
forbidden chars in :authority before reassembly") and the discussion
in issue #2941, @DemiMarie rightfully suggested that Host should also
be sanitized, because it is sometimes used in concatenation, such as
this:

http-request set-url https://%[req.hdr(host)]%[pathq]

which was proposed as a workaround for h2 upstream servers that require
:authority here:

https://www.mail-archive.com/haproxy@formilux.org/msg43261.html

The current patch then adds the same check for forbidden chars in the
Host header, using the same function as for the patch above, since in
both cases we validate the host:port part of the authority. This way
we won't reconstruct ambiguous URIs by concatenating Host and path.

Just like the patch above, this can be backported afer a period of
observation.

(cherry picked from commit df00164fdd98d15e832daad34fb23249083bfb9c)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

BUG/MEDIUM: h2/h3: reject some forbidden chars in :authority before reassembly

As discussed here:
   https://github.com/httpwg/http2-spec/pull/936
   https://github.com/haproxy/haproxy/issues/2941

It's important to take care of some special characters in the :authority
pseudo header before reassembling a complete URI, because after assembly
it's too late (e.g. the '/'). This patch does this, both for h2 and h3.

The impact on H2 was measured in the worst case at 0.3% of the request
rate, while the impact on H3 is around 1%, but H3 was about 1% faster
than H2 before and is now on par.

It may be backported after a period of observation, and in this case it
relies on this previous commit:

   MINOR: http: add a function to validate characters of :authority

Thanks to @DemiMarie for reviving this topic in issue #2941 and bringing
new potential interesting cases.

(cherry picked from commit 9a05c1f57490ba3adb378ad8e6e26830425514e7)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

MINOR: http: add a function to validate characters of :authority

As discussed here:
https://github.com/httpwg/http2-spec/pull/936
https://github.com/haproxy/haproxy/issues/2941

It's important to take care of some special characters in the :authority
pseudo header before reassembling a complete URI, because after assembly
it's too late (e.g. the '/').

This patch adds a specific function which was checks all such characters
and their ranges on an ist, and benefits from modern compilers
optimizations that arrange the comparisons into an evaluation tree for
faster match. That's the version that gave the most consistent performance
across various compilers, though some hand-crafted versions using bitmaps
stored in register could be slightly faster but super sensitive to code
ordering, suggesting that the results might vary with future compilers.
This one takes on average 1.2ns per character at 3 GHz (3.6 cycles per
char on avg). The resulting impact on H2 request processing time (small
requests) was measured around 0.3%, from 6.60 to 6.618us per request,
which is a bit high but remains acceptable given that the test only
focused on req rate.

The code was made usable both for H2 and H3.

(cherry picked from commit ebab479cdf34255cd6162d2e843645f88b95327f)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

[RELEASE] Released version 3.1.8

Released version 3.1.8 with the following main changes :
    - BUG/MINOR: quic: do not crash on CRYPTO ncbuf alloc failure
    - BUG/MINOR: cli: Issue an error when too many args are passed for a command
    - BUG/MAJOR: listeners: transfer connection accounting when switching listeners
    - MINOR: applet: add appctx_schedule() macro
    - CLEANUP: dns: remove unused dns_stream_server struct member
    - BUG/MINOR: dns: add tempo between 2 connection attempts for dns servers
    - BUG/MINOR: dns: prevent ds accumulation within dss
    - BUG/MINOR: mux-h1: Don't pretend connection was released for TCP>H1>H2 upgrade
    - BUG/MINOR: mux-h1: Fix trace message in h1_detroy() to not relay on connection
    - BUG/MEDIUM: connections: Report connection closing in conn_create_mux()
    - BUG/MINOR: proxy: only use proxy_inc_fe_cum_sess_ver_ctr() with frontends
    - MINOR: quic: rename min/max fields for congestion window algo
    - MINOR: quic: extend return value during TP parsing
    - BUG/MINOR: quic: use proper error code on missing CID in TPs
    - BUG/MINOR: quic: use proper error code on invalid server TP
    - BUG/MINOR: quic: reject retry_source_cid TP on server side
    - BUG/MINOR: quic: use proper error code on invalid received TP value
    - BUG/MINOR: quic: fix TP reject on invalid max-ack-delay
    - BUG/MINOR: quic: reject invalid max_udp_payload size
    - BUG/MEDIUM: peers: hold the refcnt until updating ts->seen
    - BUG/MINOR: cli: fix too many args detection for commands
    - BUG/MEDIUM: quic: free stream_desc on all data acked
    - BUG/MINOR: ssl/ckch: always free() the previous entry during parsing
    - BUG/MINOR: threads: fix soft-stop without multithreading support
    - BUG/MINOR: hlua: Fix Channel:data() and Channel:line() to respect documentation
    - BUG/MINOR: spoe: Don't report error on applet release if filter is in DONE state
    - BUG/MEDIUM: mux-spop: Respect the negociated max-frame-size value to send frames
    - BUG/MEDIUM: mux-spop: Wait end of handshake to declare a spop connection ready
    - BUG/MINOR: mux-spop: Use the right bitwise operator in spop_ctl()
    - BUG/MINOR: mux-spop: Don't report error for stream if ACK was already received
    - BUG/MINOR: mux-spop: Make the demux stream ID a signed integer
    - BUG/MINOR: mux-spop: Don't open new streams for SPOP connection on error
    - MINOR: mux-spop: Don't set SPOP connection state to FRAME_H after ACK parsing
    - BUG/MEDIUM: mux-spop: Remove frame parsing states from the SPOP connection state
    - BUG/MEDIUM: mux-spop: Properly handle CLOSING state
    - BUG/MEDIUM: spop-conn: Report short read for partial frames payload
    - BUG/MEDIUM: mux-spop; Don't report a read error if there are pending data
    - DEBUG: mux-spop: Review some trace messages to adjust the message or the level
    - BUG/MINOR: sink: detect and warn when using "send-proxy" options with ring servers
    - BUG/MEDIUM: peers: also limit the number of incoming updates
    - DOC: ring: refer to newer RFC5424
    - DOC: config: restore default values for resolvers hold directive
    - DOC: config: recommend disabling libc-based resolution with resolvers
    - MEDIUM: stick-tables: Limit the number of old entries we remove
    - BUG/MINOR: mux-quic: do not decode if conn in error
    - MINOR: quic: refactor BBR API
    - BUG/MINOR: quic: ensure cwnd limits are always enforced
    - BUG/MEDIUM: mux-spop: Properly detect truncated frames on demux to report error
    - DOC: configuration: explicit multi-choice on bind shards option
    - BUG/MINOR: h3: don't insert more than one Host header
    - BUILD: debug: mark ha_crash_now() as attribute(noreturn)
    - CLEANUP: quic: Useless BIO_METHOD initialization
    - MINOR: quic: Add useful error traces about qc_ssl_sess_init() failures
    - BUG/MEDIUM: wdt: always ignore the first watchdog wakeup
    - MEDIUM: hlua: Add function to change the body length of an HTTP Message
    - BUG/MEDIUM: stconn: Disable 0-copy forwarding for filters altering the payload
    - BUG/MEDIUM: server: fix crash after duplicate GUID insertion
    - BUG/MEDIUM: server: fix potential null-deref after previous fix
    - BUG/MAJOR: cache: Crash because of wrong cache entry deleted
    - DOC: configuration: fix the example in crt-store
    - BUG/MINOR: h3: Set HTX flags corresponding to the scheme found in the request
    - BUG/MEDIUM: hlua: Properly detect shudowns for TCP applets based on the new API
    - BUG/MEDIUM: hlua: Fix getline() for TCP applets to work with applet's buffers
    - REGTESTS: Make the script testing conditional set-var compatible with Vtest2
    - REGTESTS: Explicitly allow failing shell commands in some scripts
    - CI: vtest: Rely on VTest2 to run regression tests
    - CI: vtest: Fix the build script to properly work on MaOS
    - BUG/MEDIUM: httpclient: Throw an error if an lua httpclient instance is reused
    - DOC: hlua: Add a note to warn user about httpclient object reuse
    - DOC: hlua: fix a few typos in HTTPMessage.set_body_len() documentation

DOC: hlua: fix a few typos in HTTPMessage.set_body_len() documentation

A few typos were noticed while gathering info for the 3.2 announce
messages, this fixes them, and will probably constitute the last
commit of this release. There's no need to backport it unless commit
94055a5e7 ("MEDIUM: hlua: Add function to change the body length of
an HTTP Message") is backported.

(cherry picked from commit 21ce685fcdc2931cf6f5aafbf1f7cb68c9c7a1a1)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

DOC: hlua: Add a note to warn user about httpclient object reuse

It is not supported to reuse an lua httpclient instance to process several
requests. A new object must be created for each request. Thanks to the
previous patch ("BUG/MEDIUM: httpclient: Throw an error if an lua httpclient
instance is reused"), an error is now reported if this happens. But it is
not obvious for users. So the lua-api docuementation was updated accordingly.

This patch is related to issue #2986. It should be backported with the
commit above.

(cherry picked from commit cb7a2444d149de8e0ce89d7b4a450e5c36822412)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

BUG/MEDIUM: httpclient: Throw an error if an lua httpclient instance is reused

It is not expected/supported to reuse an httpclient instance to process
several requests. A new instance must be created for each request. However,
in lua, there is nothing to prevent a user to create an httpclient object
and use it in a loop to process requests.

That's unfortunate because this will apparently work, the requests will be
sent and a response will be received and processed. However internally some
ressources will be allocated and never released. When the next response is
processed, the ressources allocated for the previous one are definitively
lost.

In this patch we take care to check that the httpclient object was never
used when a request is sent from a lua script by checking
HTTPCLIENT_FS_STARTED flags. This flag is set when a httpclient applet is
spawned to process a request and never removed after that. In lua, the
httpclient applet is created when the request is sent. So, it is the right
place to do this test.

This patch should fix the issue #2986. It should be backported as far as
2.6.

(cherry picked from commit 50fca6f0b7ebee94b5ebe402e37af9d7d945fe64)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

CI: vtest: Fix the build script to properly work on MaOS

"config.h" header file is new in VTest2 and includes must be adapted to be
able to build VTest on MacOS. Let's add "-I." to make it work.

(cherry picked from commit 508e074a320cab94997b1e3a2a6a3c6c2d0501ac)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

CI: vtest: Rely on VTest2 to run regression tests

VTest2 (https://github.com/vtest/VTest2) was released and is a remplacement
for VTest. VTest was archived. So let's use the new version now.

If this commit is backported, the 2 following commits must also be
backported:

* 2808e3577 ("REGTESTS: Explicitly allow failing shell commands in some scripts")
* 82c291124 ("REGTESTS: Make the script testing conditional set-var compatible with Vtest2")

(cherry picked from commit 6a18d28ba20439a94ec38e6670e1b7b2a864bde8)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

REGTESTS: Explicitly allow failing shell commands in some scripts

Vtest2, that should replaced Vtest in few months, will reject any failing
commands in shell blocks. However, some scripts are executing some commands,
expecting an error to be able to parse the error output. So, now use "set
+e" in those scripts to explicitly state failing commads are expected.

It is just used for non-final commands. At the end, the shell block must
still report a success.

(cherry picked from commit 2808e3577fb72b693198c3f4e8f9284cb69675b3)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

REGTESTS: Make the script testing conditional set-var compatible with Vtest2

VTest2 will replaced VTest in few months. There is not so much change
expected. One of them is that a User-Agent header is added by default in all
requests, except if an custom one is already set or if "-nouseragent" option
is used. To still be compatible with VTest, it is not possible to use the
option to avoid the header addition. So, a custom user-agent is added in the
last test of "sample_fetches/cond_set_var.vtc" to be sure it will pass with
Vtest and Vtest2. It is mandatory because the request length is tested.

(cherry picked from commit 82c291124882d75625122e7f81b965fa2853e720)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

BUG/MEDIUM: hlua: Fix getline() for TCP applets to work with applet's buffers

The commit e5e36ce09 ("BUG/MEDIUM: hlua/cli: Fix lua CLI commands to work
with applet's buffers") fixed the TCP applets API to work with applets using
its own buffers. Howver the getline() function was not updated. It could be
an issue for anyone registering a CLI commands reading lines.

This patch should be backported as far as 3.0.

(cherry picked from commit c0ecef71d7ceaf404bc18241bdb878a4e72033b4)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

BUG/MEDIUM: hlua: Properly detect shudowns for TCP applets based on the new API

The internal function responsible to receive data for TCP applets with
internal buffers is buggy. Indeed, for these applets, the buffer API is used
to get data. So there is no tests on the SE to properly detect connection
shutdowns. So, it must be performed by hand after the call to b_getblk_nc().

This patch must be backported as far as 3.0.

(cherry picked from commit c64781c2c8b307fba7499fc70102d4246e850240)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

BUG/MINOR: h3: Set HTX flags corresponding to the scheme found in the request

When a ":scheme" pseudo-header is found in a h3 request, the
HTX_SL_F_HAS_SCHM flag must be set on the HTX message. And if the scheme is
'http' or 'https', the corresponding HTX flag must also be set. So,
respectively, HTX_SL_F_SCHM_HTTP or HTX_SL_F_SCHM_HTTPS.

It is mainly used to send the right ":scheme" pseudo-header value to H2
server on backend side.

This patch could be backported as far as 2.6.

(cherry picked from commit da9792cca85366e3da4d716c0eb4a6a8bb9c1e62)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

DOC: configuration: fix the example in crt-store

Fix a bad example in the crt-store section. site1 does not use the "web"
crt-store but the global one.

Must be backported as far as 3.0 however the section was 3.12 in
previous version.

(cherry picked from commit d607940915b097cdbb85910a0fac50ffafe075e1)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

BUG/MAJOR: cache: Crash because of wrong cache entry deleted

When "vary" is enabled, we can have multiple entries for a given primary
key in the cache tree. There is a limit to how many secondary entries
can be inserted for a given key. When we try to insert a new secondary
entry, if the limit is already reached, we can try to find expired
entries with the same primary key, and if the limit is still reached we
want to abort the current insertion and to remove the node that was just
inserted.

In commit "a29b073: MEDIUM: cache: Add refcount on cache_entry" though,
a regression was introduced. Instead of removing the entry just inserted
as the comments suggested, we removed the second to last entry and
returned NULL. We then reset the eb.key of the cache_entry in the caller
because we assumed that the entry was already removed from the tree.

This means that some entries with an empty key were wrongly kept in the
tree and the last secondary entry, which keeps the number of secondary
entries of a given key was removed.

This ended up causing some crashes later on when we tried to iterate
over the elements of this given key. The crash could occur in multiple
places, either when trying to retrieve an entry or to add some new ones.

This crash was raised in GitHub issue #2950.
The fix should be backported up to 3.0.

(cherry picked from commit 90441e9bfe89c8df5d319f794f2cc0ee6e3797e7)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

BUG/MEDIUM: server: fix potential null-deref after previous fix

A valid build warning was reported in the CI with latest commit b40ce97ecc
("BUG/MEDIUM: server: fix crash after duplicate GUID insertion"). Indeed,
if the first test in the function fails, we branch to the err label
with guid==NULL and will crash there. Let's just test guid before
dereferencing it for freeing.

This needs to be backported to 3.0 as well since the commit above was
meant to go there.

(cherry picked from commit 28c7a22790a587c6a3ee1652188ad6786d59b687)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

BUG/MEDIUM: server: fix crash after duplicate GUID insertion

On "add server", if a GUID is defined, guid_insert() is used to add the
entry into the global GUID tree. If a similar entry already exists, GUID
insertion fails and the server creation is eventually aborted.

A crash could occur in this case because of an invalid memory access via
guid_remove(). The latter is caused via free_server() as the server
insertion is rejected. The invalid occurs on GUID key.

The issue occurs because of guid_insert(). The function properly
deallocates the GUID key on duplicate insertion, but it failed to reset
<guid.node.key> to NULL. This caused the invalid memory access on
guid_remove(). To fix this, ensure that key member is properly resetted
on guid_insert() error path.

This must be backported up to 3.0.

(cherry picked from commit b40ce97ecca0e43c45f6a15e3a7184436d5cb467)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

BUG/MEDIUM: stconn: Disable 0-copy forwarding for filters altering the payload

It is especially a problem with Lua filters, but it is important to disable
the 0-copy forwarding if a filter alters the payload, or at least to be able
to disable it. While the filter is registered on the data filtering, it is
not an issue (and it is the common case) because, there is now way to
fast-forward data at all. But it may be an issue if a filter decides to
alter the payload and to unregister from data filtering. In that case, the
0-copy forwarding can be re-enabled in a hardly precdictable state.

To fix the issue, a SC flags was added to do so. The HTTP compression filter
set it and lua filters too if the body length is changed (via
HTTPMessage.set_body_len()).

Note that it is an issue because of a bad design about the HTX. Many info
about the message are stored in the HTX structure itself. It must be
refactored to move several info to the stream-endpoint descriptor. This
should ease modifications at the stream level, from filter or a TCP/HTTP
rules.

This should be backported as far as 3.0. If necessary, it may be backported
on lower versions, as far as 2.6. In that case, it must be reviewed and
adapted.

(cherry picked from commit f45a632bad6fbcdadeba41038103a5738f8d709d)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

MEDIUM: hlua: Add function to change the body length of an HTTP Message

There was no function for a lua filter to change the body length of an HTTP
Message. But it is mandatory to be able to alter the message payload. It is
not possible update to directly update the message headers because the
internal state of the message must also be updated accordingly.

It is the purpose of HTTPMessage.set_body_len() function. The new body
length myst be passed as argument. If it is an integer, the right
"Content-Length" header is set. If the "chunked" string is used, it forces
the message to be chunked-encoded and in that case the "Transfer-Encoding"
header.

This patch should fix the issue #2837. It could be backported as far as 2.6.

(cherry picked from commit 94055a5e73e9feed8c0aed8cae751e1896be90bd)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

BUG/MEDIUM: wdt: always ignore the first watchdog wakeup

With commit a06c215f08 ("MEDIUM: wdt: always make the faulty thread
report its own warnings"), when the TH_FL_STUCK flag was flipped on,
we'd then go to the panic code instead of giving a second chance like
before the commit. This can trigger rare cases that only happen with
moderate loads like was addressed by commit 24ce001771 ("BUG/MEDIUM:
wdt: fix the stuck detection for warnings"). This is in fact due to
the loss of the common "goto update_and_leave" that used to serve
both the warning code and the flag setting for probation, and it's
apparently what hit Christian in issue #2980.

Let's make sure we exit naturally when turning the bit on for the
first time. Let's also update the confusing comment at the end of
the check that was left over by latest change.

Since the first commit was backported to 3.1, this commit should be
backported there as well.

(cherry picked from commit 0a8bfb5b900f017c17bf6460669a18cf693b5927)
Signed-off-by: Willy Tarreau <w@1wt.eu>

MINOR: quic: Add useful error traces about qc_ssl_sess_init() failures

There were no traces to diagnose qc_ssl_sess_init() failures from QUIC traces.
This patch add calls to TRACE_DEVEL() into qc_ssl_sess_init() and its caller
(qc_alloc_ssl_sock_ctx()). This was useful at least to diagnose SSL context
initialization failures when porting QUIC to the new OpenSSL 3.5 QUIC API.

Should be easily backported as far as 2.6.

(cherry picked from commit 894595b711835dba9a7432fa364ea57806b2bb0c)
Signed-off-by: Willy Tarreau <w@1wt.eu>

CLEANUP: quic: Useless BIO_METHOD initialization

This code is there from QUIC implementation start. It was supposed to
initialize <ha_quic_meth> as a BIO_METHOD static object. But this
BIO_METHOD is not used at all!

Should be backported as far as 2.6 to help integrate the next patches to come.

(cherry picked from commit a2822b17769408ce8fd17a4db4d58e5a4c0a1454)
Signed-off-by: Willy Tarreau <w@1wt.eu>

BUILD: debug: mark ha_crash_now() as attribute(noreturn)

Building on MIPS64 with clang16 incorrectly reports some uninitialized
value warnings in stats-proxy.c due to some calls to ABORT_NOW() where
the compiler didn't know the code wouldn't return. Let's properly mark
the function as noreturn, and take this opportunity for also marking it
unused to avoid possible warnings depending on the build options (if
ABORT_NOW is not used). No backport needed though it will not harm.

(cherry picked from commit 0a91c6dcaebd41ead51f3654ca96d564ccda32af)
Signed-off-by: Willy Tarreau <w@1wt.eu>

BUG/MINOR: h3: don't insert more than one Host header

Let's make sure we drop extraneous Host headers after having compared
them. That also works when :authority was already present. This way,
like for h1 and h2, we only keep one copy of it, while still making
sure that Host matches :authority. This way, if a request has both
:authority and Host, only one Host header will be produced (from
:authority). Note that due to the different organization of the code
and wording along the evolving RFCs, here we also check that all
duplicates are identical, while h2 ignores them as per RFC7540, but
this will be re-unified later.

This should be backported to stable versions, at least 2.8, though
thanks to the existing checks the impact is probably nul.

(cherry picked from commit b84762b3e0c0e7708ddc98ae6b721ed10dc1be30)
Signed-off-by: Willy Tarreau <w@1wt.eu>

DOC: configuration: explicit multi-choice on bind shards option

From the documentation, this wasn't clear enough that shards should
be followed by one of the options number / by-thread / by-group.
Align it with existing options in documentation so that it becomes
more explicit.

(cherry picked from commit 824bb93e18043876a0fa30eeb3de50ea0ccb4ea5)
Signed-off-by: Willy Tarreau <w@1wt.eu>

BUG/MEDIUM: mux-spop: Properly detect truncated frames on demux to report error

There was no test in the demux part to detect truncated frames and to report
an error at the connection level. The SPOP streams were properly switch to
half-closed state. But waiting the associated SPOE applets were woken up and
released, the SPOP connection could be woken up several times for nothing. I
never triggered the watchdog in that case, but it is not excluded.

Now, at the end of the demux function, if a specific test was added to
detect truncated frames to report an error and close the connection.

This patch must be backported to 3.1.

(cherry picked from commit 16314bb93c2c2c47c648da880ab55bf2ac01d9d4)
[wt: update ctx]
Signed-off-by: Willy Tarreau <w@1wt.eu>

BUG/MINOR: quic: ensure cwnd limits are always enforced

Congestion window is limit by a minimal and maximum values which can
never be exceeded. Min value is hardcoded to 2 datagrams as recommended
by the specification. Max value is specified via haproxy configuration.

These values must be respected each time the congestion window size is
adjusted. However, in some rare occasions, limit were not always
enforced. Fix this by implementing wrappers to set or increment the
congestion window. These functions ensure limits are always applied
after the operation.

Additionnally, wrappers also ensure that if window reached a new maximum
value, it is saved in <cwnd_last_max> field.

This should be backported up to 2.6, after a brief period of
observation.

(cherry picked from commit 7bad88c35c7547d52ac170ed9f89f29cccd6c46c)
Signed-off-by: Amaury Denoyelle <adenoyelle@haproxy.com>

MINOR: quic: refactor BBR API

Write minor adjustments to QUIC BBR functions. The objective is to
centralize every modification of path cwnd field.

No functional change. This patch will be useful to simplify
implementation of global QUIC Tx memory usage limitation.

(cherry picked from commit c01d4552881301093d982efbfb4734ba5d4c65b6)
[ad: pick to simplify backport of next commit]
Signed-off-by: Amaury Denoyelle <adenoyelle@haproxy.com>

BUG/MINOR: mux-quic: do not decode if conn in error

Add an early return to qcc_decode_qcs() if QCC instance is flagged on
error and connection is scheduled for immediate closure.

The main objective is to ensure to not trigger BUG_ON() from
qcc_set_error() : if a stream decoding has set the connection error, do
not try to process decoding on other streams as they may also encounter
an error. Thus, the connection is closed asap with the first encountered
error case.

This should be backported up to 2.6, after a period of observation.

(cherry picked from commit 6c5030f703e29bfd8deeace111bcedc6835c7065)
[ad: context adjustement, due to multiple Rx bufs not available in 3.1]
Signed-off-by: Amaury Denoyelle <adenoyelle@haproxy.com>

MEDIUM: stick-tables: Limit the number of old entries we remove

Limit the number of old entries we remove in one call of
stktable_trash_oldest(), as we do so while holding the heavily contended
update write lock, so we'd rather not hold it for too long.
This helps getting stick tables perform better under heavy load.

(cherry picked from commit d2d4c3eb6566145d30eb38dc96b2b79d3f1db8fc)
[wt: backported since situation encountered in 3.1 as well without this
patch. It also contains the definition for STKTABLE_MAX_UPDATES_AT_ONCE
from the previous commit]
Signed-off-by: Willy Tarreau <w@1wt.eu>

DOC: config: recommend disabling libc-based resolution with resolvers

Using both libc and haproxy resolvers can lead to hard to diagnose issues
when their bevahiour diverges; recommend using only one type of resolver.

Should be backported to stable versions.

Link: https://www.mail-archive.com/haproxy@formilux.org/msg45663.html
Co-authored-by: Lukas Tribus <lukas@ltri.eu>
(cherry picked from commit 4e20fab7ac85c0018e77a52657f773982f8c1875)
Signed-off-by: Willy Tarreau <w@1wt.eu>

DOC: config: restore default values for resolvers hold directive

Default values for hold directive (resolver context) used to be documented
but this was lost when the keyword description was reworked in 24b319b
("Default value is 10s for "valid", 0s for "obsolete" and 30s for
others.")

Restoring the part that describes the default value.

It may be backported to all stable versions with 24b319b

(cherry picked from commit 4bceca83fc14269ef8f2528294bc3cdb3fcc6c99)
Signed-off-by: Willy Tarreau <w@1wt.eu>

DOC: ring: refer to newer RFC5424

In the ring configuration example we refer to RFC3164 - the original BSD
syslog protocol without support for structured data (SDATA).

Let's refer to RFC5424 instead so SDATA is by default forwarded if
someone copy & pastes from the documentation:

https://discourse.haproxy.org/t/structured-data-lost-when-forwarding-logs-voa-syslog-forwarding-feature/11741/5

Should be backported to 2.6.

(cherry picked from commit 5f9ce99c799eaa8f58518201b58c32d970b10cdf)
Signed-off-by: Willy Tarreau <w@1wt.eu>

BUG/MEDIUM: peers: also limit the number of incoming updates

There's a configurable limit to the number of messages sent to a
peer (tune.peers.max-updates-at-once), but this one is not applied to
the receive side. While it can usually be OK with default settings,
setups involving a large tune.bufsize (1MB and above) regularly
experience high latencies and even watchdogs during reloads because
the full learning process sends a lot of data that manages to fill
the entire buffer, and due to the compactness of the protocol, 1MB
of buffer can contain more than 100k updates, meaning taking locks
etc during this time, which is not workable.

Let's make sure the receiving side also respects the max-updates-at-once
setting. For this it counts incoming updates, and refrains from
continuing once the limit is reached. It's a bit tricky to do because
after receiving updates we still have to send ours (and possibly some
ACKs) so we cannot just leave the loop.

This issue was reported on 3.1 but it should progressively be backported
to all versions having the max-updates-at-once option available.

(cherry picked from commit b26f3fe54e0ca5d54cc4fbcd771bfefbe728991c)
Signed-off-by: Willy Tarreau <w@1wt.eu>

BUG/MINOR: sink: detect and warn when using "send-proxy" options with ring servers

using "send-proxy" or "send-proxy-v2" option on a ring server is not
relevant nor supported. Worse, on 2.4 it causes haproxy process to
crash as reported in GH #2965.

Let's be more explicit about the fact that this keyword is not supported
under "ring" context by ignoring the option and emitting a warning message
to inform the user about that.

Ideally, we should do the same for peers and log servers. The proper way
would be to check servers options during postparsing but we currently lack
proper cross-type server postparsing hooks. This will come later and thus
will give us a chance to perform the compatibilty checks for server
options depending on proxy type. But for now let's simply fix the "ring"
case since it is the only one that's known to cause a crash.

It may be backported to all stable versions.

(cherry picked from commit 098a5e5c0b9e6821c03945295dd1350e5d834eb3)
Signed-off-by: Willy Tarreau <w@1wt.eu>

DEBUG: mux-spop: Review some trace messages to adjust the message or the level

Some trace messages were not really accurrate, reporting a CLOSED connection
while only an error was reported on it. In addition, an TRACE_ERROR() was
used to report a short read on HELLO/DISCONNECT frames header. But it is not
an error. a TRACE_DEVEL() should be used instead.

This patch could be backported to 3.1 to ease future backports.

(cherry picked from commit e2ae8a74e87d32730e3edba26626518fd33e3a58)
Signed-off-by: Willy Tarreau <w@1wt.eu>

BUG/MEDIUM: mux-spop; Don't report a read error if there are pending data

When an read error is detected, no error must be reported on the SPOP
connection is there are still some data to parse. It is important to be sure
to process all data before reporting the error and be sure to not truncate
received frames. However, we must also take care to handle short read case
to not wait data that will never be received.

This patch must be backported to 3.1.

(cherry picked from commit 6e46f0bf934360186aa878d75f8bf81eaae0c141)
Signed-off-by: Willy Tarreau <w@1wt.eu>

BUG/MEDIUM: spop-conn: Report short read for partial frames payload

When a frame was not fully received, a short read must be reported on the
SPOP connection to help the demux to handle truncated frames. This was
performed for frames truncated on the header part but not on the payload
part. It is now properly detected.

This patch must be backported to 3.1.

(cherry picked from commit 71feb49a9f0ef1142970fca98ca8eebd516928a3)
Signed-off-by: Willy Tarreau <w@1wt.eu>

BUG/MEDIUM: mux-spop: Properly handle CLOSING state

The CLOSING state was not handled at all by the SPOP multiplexer while it is
mandatory when a DISCONNECT frame was sent and the mux should wait for the
DISCONNECT frame in reply from the agent. Thanks to this patch, it should be
fixed.

In addition, if an error occurres during the AGENT HELLO frame parsing, the
SPOP connection is no longer switched to CLOSED state and remains in ERROR
state instead. It is important to be able to send the DISCONNECT frame to
the agent instead of closing the TCP connection immediately.

This patch depends on following commits:

  * BUG/MEDIUM: mux-spop: Remove frame parsing states from the SPOP connection state
  * MINOR: mux-spop: Don't set SPOP connection state to FRAME_H after ACK parsing
  * BUG/MINOR: mux-spop: Don't open new streams for SPOP connection on error
  * BUG/MINOR: mux-spop: Make the demux stream ID a signed integer

All the series must be backported to 3.1.

(cherry picked from commit ddc5f8d92e3613ec7060b0ee8bdb220866182e86)
Signed-off-by: Willy Tarreau <w@1wt.eu>

BUG/MEDIUM: mux-spop: Remove frame parsing states from the SPOP connection state

SPOP_CS_FRAME_H and SPOP_CS_FRAME_P states, that were used to handle frame
parsing, were removed. The demux process now relies on the demux stream ID
to know if it is waiting for the frame header or the frame
payload. Concretly, when the demux stream ID is not set (dsi == -1), the
demuxer is waiting for the next frame header. Otherwise (dsi >= 0), it is
waiting for the frame payload. It is especially important to be able to
properly handle DISCONNECT frames sent by the agents.

SPOP_CS_RUNNING state is introduced to know the hello handshake was finished
and the SPOP connection is able to open SPOP streams and exchange NOTIFY/ACK
frames with the agents.

It depends on the following fixes:

* MINOR: mux-spop: Don't set SPOP connection state to FRAME_H after ACK parsing
* BUG/MINOR: mux-spop: Make the demux stream ID a signed integer

This change will be mandatory for the next fix. It must be backported to 3.1
with the commits above.

(cherry picked from commit a3940614c24049d2c1ee9f686e0579e3a2e49b33)
[wt: adj ctx WRT tevt which is absent from 3.1
Signed-off-by: Willy Tarreau <w@1wt.eu>

MINOR: mux-spop: Don't set SPOP connection state to FRAME_H after ACK parsing

After the ACK frame was parsed, it is useless to set the SPOP connection
state to SPOP_CS_FRAME_H state because this will be automatically handled by
the demux function. If it is not an issue, but this will simplify changes
for the next commit.

(cherry picked from commit 6b0f7de4e34c7da1ec7ae82e8e8a298823e50d72)
Signed-off-by: Willy Tarreau <w@1wt.eu>

BUG/MINOR: mux-spop: Don't open new streams for SPOP connection on error

Till now, only SPOP connections fully closed or those with a TCP connection on
error were concerned. But available streams could be reported for SPOP
connections in error or closing state. But in these states, no NOTIFY frames
will be sent and no ACK frames will be parsed. So, no new SPOP streams should be
opened.

This patch should be backported to 3.1.

(cherry picked from commit 197eaaadfd636441ca6cc5d3e35c52197537bbd6)
Signed-off-by: Willy Tarreau <w@1wt.eu>

BUG/MINOR: mux-spop: Make the demux stream ID a signed integer

The demux stream ID of a SPOP connection, used when received frames are
parsed, must be a signed integer because it is set to -1 when the SPOP
connection is initialized. It will be important for the next fix.

This patch must be backported to 3.1.

(cherry picked from commit cbc10b896e647d5c62d9b021f3e2dd391d14dc4f)
Signed-off-by: Willy Tarreau <w@1wt.eu>

BUG/MINOR: mux-spop: Don't report error for stream if ACK was already received

When a SPOP connection was closed or was in error, an error was
systematically reported on all its SPOP streams. However, SPOP streams that
already received their ACK frame must be excluded. Otherwise if an agent
sends a ACK and close immediately, the ACK will be ignored because the SPOP
stream will handle the error first.

This patch must be backported to 3.1.

(cherry picked from commit 6d68beace5435f24abc4292f22021e4465b4dcee)
Signed-off-by: Willy Tarreau <w@1wt.eu>

BUG/MINOR: mux-spop: Use the right bitwise operator in spop_ctl()

Becaues of a typo, '||' was used instead of '|' to test the SPOP conneciton
flags and decide if the mux is ready or not. The regression was introduced
in the commit fd7ebf117 ("BUG/MEDIUM: mux-spop: Wait end of handshake to
declare a spop connection ready").

This patch must be backported to 3.1 with the commit above.

(cherry picked from commit 2ccfebcebff254c95b6a70e987f3d960b38499c1)
Signed-off-by: Willy Tarreau <w@1wt.eu>

BUG/MEDIUM: mux-spop: Wait end of handshake to declare a spop connection ready

A SPOP connection must not be considered as ready while the hello handshake
is not finished with success. In addition, no error or shutdown must have
been reported for the underlying connection. Otherwise a freshly openned
spop connexion may be reused while it is in fact dead, leading to a
connection retry.

This patch must be backported to 3.1.

(cherry picked from commit fd7ebf117bdefdcfbd1c30a079585867eb22a57d)
Signed-off-by: Willy Tarreau <w@1wt.eu>

BUG/MEDIUM: mux-spop: Respect the negociated max-frame-size value to send frames

When a SPOP connection is opened, the maximum size for frames is negociated.
This negociated size is properly used when a frame is received and if a too
big frame is detected, an error is triggered. However, the same was not
performed on the sending path. No check was performed on frames sent to the
agent. So it was possible to send frames bigger than the maximum size
supported by the the SPOE agent.

Now, the size of NOTIFY and DISCONNECT frames is checked before sending them
to the agent.

Thanks to Miroslav to have reported the issue.

This patch must be backported to 3.1.

(cherry picked from commit ce8c2d359bc338a0ac88f117bca440c93b749f15)
Signed-off-by: Willy Tarreau <w@1wt.eu>

BUG/MINOR: spoe: Don't report error on applet release if filter is in DONE state

When the SPOE applet was released, if a SPOE filter context was still
attached to it, an error was reported to the filter. However, there is no
reason to report an error if the ACK message was already received. Because
of this bug, if the ACK message is received and the SPOE connection is
immediately closed, this prevents the ACK message to be processed.

This patch should be backported to 3.1.

(cherry picked from commit 1cd30c998bce72166855d6b83a8e6865c33095a1)
Signed-off-by: Willy Tarreau <w@1wt.eu>

BUG/MINOR: hlua: Fix Channel:data() and Channel:line() to respect documentation

When the channel API was revisted, the both functions above was added. An
offset can be passed as argument. However, this parameter could be reported
to be out of range if there was not enough input data was received yet. It
is an issue, especially with a tcp rule, because more data could be
received. If an error is reported too early, this prevent the rule to be
reevaluated later. In fact, an error should only be reported if the offset
is part of the output data.

Another issue is about the conditions to report 'nil' instead of an empty
string. 'nil' was reported when no data was found. But it is not aligned
with the documentation. 'nil' must only be returned if no more data cannot
be received and there is no input data at all.

This patch should fix the issue #2716. It should be backported as far as 2.6.

(cherry picked from commit a5de0e15959a241afc9afb39f1b02a2517894f7b)
Signed-off-by: Willy Tarreau <w@1wt.eu>

BUG/MINOR: threads: fix soft-stop without multithreading support

When thread support is disabled ("USE_THREAD=" or "USE_THREAD=0" when
building), soft-stop doesn't work as haproxy never ends after stopping
the proxies.

This used to work fine in the past but suddenly stopped working with
ef422ced91 ("MEDIUM: thread: make stopping_threads per-group and add
stopping_tgroups") because the "break;" instruction under the stopping
condition is never executed when support for multithreading is disabled.

To fix the issue, let's add an "else" block to run the "break;"
instruction when USE_THREAD is not defined.

It should be backported up to 2.8

(cherry picked from commit 7d057e56af86cdc98c5388986931c247dbafa5b3)
Signed-off-by: Willy Tarreau <w@1wt.eu>

BUG/MINOR: ssl/ckch: always free() the previous entry during parsing

The ckch_conf_parse() function is the generic function which parses
crt-store keywords from the crt-store section, and also from a crt-list.

When having multiple time the same keyword, a leak of the previous value
happens. This patch ensure that the previous value is always freed
before overwriting it.

This patch should be backported as far as 3.0.

(cherry picked from commit 311e0aa5c712f38700b7b185c0d5f1aa33c48613)
Signed-off-by: Willy Tarreau <w@1wt.eu>

BUG/MEDIUM: quic: free stream_desc on all data acked

The following patch simplifies qc_stream_desc_ack(). The qc_stream_desc
instance is not freed anymore, even if all data were acknowledged. As
implies by the commit message, the caller is responsible to perform this
cleaning operation.
f4a83fbb14bdd14ed94752a2280a2f40c1b690d2
MINOR: quic: do not remove qc_stream_desc automatically on ACK handling

However, despite the commit instruction, qc_stream_desc_free()
invokation was not moved in the caller. This commit fixes this by adding
it after stream ACK handling. This is performed only when a transfer is
completed : all data is acknowledged and qc_stream_desc has been
released by its MUX stream instance counterpart.

This bug may cause a significant increase in memory usage when dealing
with long running connection. However, there is no memory leak, as every
qc_stream_desc attached to a connection are finally freed when quic_conn
instance is released.

This must be backported up to 3.1.

(cherry picked from commit 3fdb039a99cdee543f4bbbfc983a51eb18f7f1f0)
Signed-off-by: Willy Tarreau <w@1wt.eu>

BUG/MINOR: cli: fix too many args detection for commands

d3f928944 ("BUG/MINOR: cli: Issue an error when too many args are passed
for a command") added a new check to prevent the command to run when
too many arguments are provided. In this case an error is reported.

However it turns out this check (despite marked for backports) was
ineffective prior to 20ec1de21 ("MAJOR: cli: Refacor parsing and
execution of pipelined commands") as 'p' pointer was reset to the end of
the buffer before the check was executed.

Now since 20ec1de21, the check works, but we have another issue: we may
read past initialized bytes in the buffer because 'p' pointer is always
incremented in a while loop without checking if we increment it past 'end'
(This was detected using valgrind)

To fix the issue introduced by 20ec1de21, let's only increment 'p' pointer
if p < end.

For 3.2 this is it, now for older versions, since d3f928944 was marked for
backport, a sligthly different approach is needed:

- conditional p increment must be done in the loop (as in this patch)
- max arg check must moved above "fill unused slots" comment where p is
assigned to the end of the buffer

This patch should be backported with d3f928944.

(cherry picked from commit 976e0bd32f6d87fccf351edf1b5a18f3e0c5c6ab)
Signed-off-by: Willy Tarreau <w@1wt.eu>

BUG/MEDIUM: peers: hold the refcnt until updating ts->seen

In peer_treat_updatemsg(), we call stktable_touch_remote() after
releasing the write lock on the TS, asking it to decrement the
refcnt, then we update ts->seen. Unfortunately this is racy and
causes the issue that Christian reported in issue #2959.

The sequence of events is very hard to trigger manually, but what happens
is the following:

T1.  stktable_touch_remote(table, ts, 1);
      -> at this point the entry is in the mt_list, and the refcnt is zero.

      T2.  stktable_trash_oldest() or process_table_expire()
           -> these can run, because the refcnt is now zero.
              The entry is cleanly deleted and freed.

T1.  HA_ATOMIC_STORE(&ts->seen, 1)
      -> we dereference freed memory.

A first attempt at a fix was made by keeping the refcnt held during
all the time the entry is in the mt_list, but this is expensive as
such entries cannot be purged, causing lots of skips during
trash_oldest_data(). This managed to trigger watchdogs, and was only
hiding the real cause of the problem.

The correct approach clearly is to maintain the ref_cnt until we
touch ->seen. That's what this patch does. It does not decrement
the refcnt, while calling stktable_touch_remote(), and does it
manually after touching ->seen. With this the problem is gone.

Note that a reproducer involves the following:
  - a config with 10 stick-ctr tracking the same table with a
    random key between 10M and 100M depending on the machine.
  - the expiration should be between 10 and 20s. http_req_cnt
    is stored and shared with the peers.
  - 4 total processes with such a config on the local machine,
    each corresponding to a different peer. 3 of the peers are
    bound to half of the cores (all threads) and share the same
    threads; the last process is bound to the other half with
    its own threads.
  - injecting at full load, ~256 conn, on the shared listening
    port. After ~2x expiration time to 1 minute the lone process
    should segfault in pools code due to a corrupted by_lru list.

This problem already exists in earlier versions but the race looks
narrower. Given how difficult it is to trigger on a given machine
in its current form, it's likely that it only happens once in a
while on stable branches. The fix must be backported wherever the
code is similar, and there's no hope to reproduce it to validate
the backport.

Thanks again to Christian for his amazing help!

(cherry picked from commit 006a3acbde309f11190a003c4ac1c026480444e4)
Signed-off-by: Willy Tarreau <w@1wt.eu>

BUG/MINOR: quic: reject invalid max_udp_payload size

Add a checks on received max_udp_payload transport parameters. As
defined per RFC 9000, values below 1200 are invalid, and thus the
connection must be closed with TRANSPORT_PARAMETER_ERROR code.

Prior to this patch, an invalid value was silently ignored.

This should be backported up to 2.6. Note that is relies on previous
patch "MINOR: quic: extend return value on TP parsing".

(cherry picked from commit 4bc7aa548adcd9ee424c65cd346e94f8749dce64)
Signed-off-by: Willy Tarreau <w@1wt.eu>

BUG/MINOR: quic: fix TP reject on invalid max-ack-delay

Checks are implemented on some received transport parameter values,
to reject invalid ones defined per RFC 9000. This is the case for
max_ack_delay parameter.

The check was not properly implemented as it only reject values strictly
greater than the limit set to 2^14. Fix this by rejecting values of 2^14
and above. Also, the proper error code TRANSPORT_PARAMETER_ERROR is now
set.

This should be backported up to 2.6. Note that is relies on previous
patch "MINOR: quic: extend return value on TP parsing".

(cherry picked from commit ffabfb0fc3ad8774024d152fc31a7711a8a9c382)
Signed-off-by: Willy Tarreau <w@1wt.eu>

BUG/MINOR: quic: use proper error code on invalid received TP value

As per RFC 9000, checks must be implemented to reject invalid values for
received transport parameters. Such values are dependent on the
parameter type.

Checks were already implemented for ack_delay_exponent and
active_connection_id_limit, accordingly with the QUIC specification.
However, the connection was closed with an incorrect error code. Fix
this to ensure that TRANSPORT_PARAMETER_ERROR code is used as expected.

This should be backported up to 2.6. Note that is relies on previous
patch "MINOR: quic: extend return value on TP parsing".

(cherry picked from commit b60a17aad768369ab7e328949112b50cd78bc987)
Signed-off-by: Willy Tarreau <w@1wt.eu>

BUG/MINOR: quic: reject retry_source_cid TP on server side

Close the connection on error if retry_source_connection_id transport
parameter is received. This is specified by RFC 9000 as this parameter
must not be emitted by a client. Previously, it was silently ignored.

This should be backported up to 2.6. Note that is relies on previous
patch "MINOR: quic: extend return value on TP parsing".

(cherry picked from commit 10f1f1adce032742d60fe14ee780871c4e6a1db1)
Signed-off-by: Willy Tarreau <w@1wt.eu>

BUG/MINOR: quic: use proper error code on invalid server TP

This commit is similar to the previous one. It fixes the error code
reported when dealing with invalid received transport parameters. This
time, it handles reception of original_destination_connection_id,
preferred_address and stateless_reset_token which must not be emitted by
the client.

This should be backported up to 2.6. Note that is relies on previous
patch "MINOR: quic: extend return value on TP parsing".

(cherry picked from commit a54fdd3d926fabfc438dbaedbd3d08814fb99862)
Signed-off-by: Willy Tarreau <w@1wt.eu>

BUG/MINOR: quic: use proper error code on missing CID in TPs

Handle missing received transport parameter value
initial_source_connection_id / original_destination_connection_id.
Previously, such case would result in an error reported via
quic_transport_params_store(), which triggers a TLS alert converted as
expected as a CONNECTION_CLOSE. The issue is that the error code
reported in the frame was incorrect.

Fix this by returning QUIC_TP_DEC_ERR_INVAL for such conditions. This is
directly handled via quic_transport_params_store() which set the proper
TRANSPORT_PARAMETER_ERROR code for the CONNECTION_CLOSE. However, no
error is reported so the SSL handshake is properly terminated without a
TLS alert. This is enough to ensure that the CONNECTION_CLOSE frame will
be emitted as expected.

This should be backported up to 2.6. Note that is relies on previous
patch "MINOR: quic: extend return value on TP parsing".

(cherry picked from commit df6bd4909e8dfa64337662a78adc39ca702c1dc7)
Signed-off-by: Willy Tarreau <w@1wt.eu>

MINOR: quic: extend return value during TP parsing

Extend API used for QUIC transport parameter decoding. This is done via
the introduction of a dedicated enum to report the various error
condition detected. No functional change should occur with this patch,
as the only returned code is QUIC_TP_DEC_ERR_TRUNC, which results in the
connection closure via a TLS alert.

This patch will be necessary to properly reject transport parameters
with the proper CONNECTION_CLOSE error code. As such, it should be
backported up to 2.6 with the following series.

(cherry picked from commit 294bf26c06404e35edd4ad3381ccb26e835bd7a1)
Signed-off-by: Willy Tarreau <w@1wt.eu>

MINOR: quic: rename min/max fields for congestion window algo

There was some possible confusion between fields related to congestion
window size min and max limit which cannot be exceeded, and the maximum
value previously reached by the window.

Fix this by adopting a new naming scheme. Enforced limit are now renamed
<limit_max>/<limit_min>, while the previously reached max value is
renamed <cwnd_last_max>.

This should be backported up to 3.1.

(cherry picked from commit 2eb1b0cd96f663b9260ee48921612566417b3b8d)
Signed-off-by: Willy Tarreau <w@1wt.eu>

BUG/MINOR: proxy: only use proxy_inc_fe_cum_sess_ver_ctr() with frontends

proxy_inc_fe_cum_sess_ver_ctr() was implemented in 9969adbc
("MINOR: stats: add by HTTP version cumulated number of sessions and
requests")

As its name suggests, it is meant to be called for frontends, not backends

Also, in 9969adbc, when used under h1_init(), a precaution is taken to
ensure that the function is only called with frontends.

However, this precaution was not applied in h2_init() and qc_init().

Due to this, it remains possible to have proxy_inc_fe_cum_sess_ver_ctr()
being called with a backend proxy as parameter. While it did not cause
known issues so far, it is not expected and could result in bugs in the
future. Better fix this by ensuring the function is only called with
frontends.

It may be backported up to 2.8

(cherry picked from commit b39825ee45150415d7ed64b7ce785bb946f727bd)
Signed-off-by: Willy Tarreau <w@1wt.eu>

BUG/MEDIUM: connections: Report connection closing in conn_create_mux()

Add an extra parametre to conn_create_mux(), "closed_connection".
If a pointer is provided, then let it know if the connection was closed.
Callers have no way to determine that otherwise, and we need to know
that, at least in ssl_sock_io_cb(), as if the connection was closed we
need to return NULL, as the tasklet was free'd, otherwise that can lead
to memory corruption and crashes.
This should be backported if 9240cd4a2771245fae4d0d69ef025104b14bfc23
is backported too.

(cherry picked from commit b138eab302531bc40248337132ac01a541a111be)
Signed-off-by: Willy Tarreau <w@1wt.eu>

BUG/MINOR: mux-h1: Fix trace message in h1_detroy() to not relay on connection

h1_destroy() may be called to release a H1C after a multiplexer upgrade. In
that case, the connection is no longer attached to the H1C. It must not be
used in the h1 trace message because the connection context is no longer a H1C.

Because of this bug, when a H1>H2 upgrade is performed, a crash may be
experienced if the H1 traces are enabled.

This patch must be backport to all stable versions.

(cherry picked from commit 7dc4e94830ef8fe9b0ffc2901d63b9f3183ed12c)
Signed-off-by: Willy Tarreau <w@1wt.eu>

BUG/MINOR: mux-h1: Don't pretend connection was released for TCP>H1>H2 upgrade

When an applicative upgrade of the H1 multiplexer is performed, we must not
pretend the connection was released. Indeed, in that case, a H1 stream is
still their with a stream connector attached on it. It must be detached
first before releasing the H1 connection and the underlying connection. So
it is important to not pretend the connection was already released.

Concretely, in that case h1_process() must return 0 instead of -1. It is
minor error because, AFAIK, it is harmless. But it is not correct. So let's
fix it to avoid futur bugs.

To be clear, this happens when a TCP connection is upgraded to H1 connection
and a H2 preface is detected, leading to a second upgrade from H1 to H2.

This patch may be backport to all stable versions.

(cherry picked from commit 2dc334be61a0a9feaa7b844e122c2c4ce37e1b1a)
Signed-off-by: Willy Tarreau <w@1wt.eu>

BUG/MINOR: dns: prevent ds accumulation within dss

when dns session callback (dns_session_release()) is called upon error
(ie: when some pending queries were not sent), we try our best to
re-create the applet in order to preserve the pending queries and give
them a chance to be retried. This is done at the end of
dns_session_release().

However, doing so exposes to an issue: if the error preventing queries
from being sent is still encountered over and over the dns session could
stay there indefinitely. Meanwhile, other dns sessions may be created on
the same dns_stream_server periodically. If previous failing dns sessions
don't terminate but we also keep creating new ones, we end up accumulating
failing sessions on a given dns_stream_server, which can eventually cause
ressource shortage.

This issue was found when trying to address ("BUG/MINOR: dns: add tempo
between 2 connection attempts for dns servers")

To fix it, we track the number of failed consecutive sessions for a given
dns server. When we reach the threshold (set to 100), we consider that the
link to the dns server is broken (at least temporarily) and we force
dns_session_new() to fail, so that we stop creating new sessions until one
of the existing one eventually succeeds.

A workaround for this fix consists in setting the "maxconn" parameter on
nameserver directive (under resolvers section) to a reasonnable value so
that no more than "maxconn" sessions may co-exist on the same server at
a given time.

This may be backported to all stable versions.
("CLEANUP: dns: remove unused dns_stream_server struct member") may be
backported to ease the backport.

(cherry picked from commit 5288b39011b2449bfa896f7932c7702b5a85ee77)
Signed-off-by: Willy Tarreau <w@1wt.eu>

BUG/MINOR: dns: add tempo between 2 connection attempts for dns servers

As reported by Lukas Tribus on the mailing list [1], trying to connect to
a nameserver with invalid network settings causes haproxy to retry a new
connection attempt immediately which eventually causes unexpected CPU usage
on the thread responsible for the applet (namely 100% on one CPU will be
observed).

This can be reproduced with the test config below:

resolvers default
  nameserver ns1 tcp4@8.8.8.8:53 source 192.168.99.99
listen listen
  mode http
  bind :8080
  server s1 www.google.com resolvers default init-addr none

To fix this the issue, we add a temporisation of one second between a new
connection attempt is retried. We do this in dns_session_create() when we
know that the applet was created in the release callback (when previous
query attempt was unsuccessful), which means initial connection is not
affected.

[1]: https://www.mail-archive.com/haproxy@formilux.org/msg45665.html

This should fix GH #2909 and may be backported to all stable versions.
This patch depends on ("MINOR: applet: add appctx_schedule() macro")

(cherry picked from commit 27236f2218e57d839197e5b6479dc8e67e484e32)
Signed-off-by: Willy Tarreau <w@1wt.eu>

CLEANUP: dns: remove unused dns_stream_server struct member

dns_stream_server "max_slots" is unused, let's get rid of it

(cherry picked from commit 14ebe95a10e7fdc003e369b58463a4744e88fa8e)
[wt: just to help with next backport]
Signed-off-by: Willy Tarreau <w@1wt.eu>

MINOR: applet: add appctx_schedule() macro

Just like task_schedule() but for applets to wakeup an applet at a
specific time, leverages _task_schedule() internally

(cherry picked from commit 1ced5ef2fdf9a4b4ab0941849be0f4627066331f)
[wt: needed for next dns backport]
Signed-off-by: Willy Tarreau <w@1wt.eu>

BUG/MAJOR: listeners: transfer connection accounting when switching listeners

Since we made it possible for a bind_conf to listen to multiple thread
groups with shards in 2.8 with commit 9d360604bd ("MEDIUM: listener:
rework thread assignment to consider all groups"), the per-listener
connection count was not properly transferred to the target listener
with the connection when switching to another thread group. This results
in one listener possibly reaching high values and another one possibly
reaching negative values. Usually it's not visible, unless a maxconn is
set on the bind_conf, in which case comparisons will quickly put an end
to the willingness to accept new connections.

This problem only happens when thread groups are enabled, and it seems
very hard to trigger it normally, it only impacts sockets having a single
shard, hence currently the CLI (or any conf with "bind ... shards 1"),
where it can be reproduced with a config having a very low "maxconn" on
the stats socket directive (here, 4), and issuing a few tens of
socat <<< "show activity" in parallel, or sending HTTP connections to a
single-shared listener. Very quickly, haproxy stops accepting connections
and eats CPU in the poller which tries to get its connections accepted.

A BUG_ON(l->nbconn<0) after HA_ATOMIC_DEC() in listener_release() also
helps spotting them better.

Many thanks to Christian Ruppert who once again provided a very accurate
report in GH #2951 with the required data permitting this analysis.

This fix must be backported to 2.8.

(cherry picked from commit f1064c7382e170cd59dd7984808bc436e2b3de7f)
Signed-off-by: Willy Tarreau <w@1wt.eu>

BUG/MINOR: cli: Issue an error when too many args are passed for a command

When a command is parsed to split it in an array of arguments, by default,
at most 64 arguments are supported. But no warning was emitted when there
were too many arguments. Instead, the arguments above the limit were
silently ignored. It could be an issue for some commands, like "add server",
because there was no way to know some arguments were ignored.

Now an error is issued when too many arguments are passed and the command is
not executed.

This patch should be backported to all stable versions.

(cherry picked from commit d3f92894471dd7306f857156a5820c53c10f8d88)
Signed-off-by: Willy Tarreau <w@1wt.eu>

BUG/MINOR: quic: do not crash on CRYPTO ncbuf alloc failure

To handle out-of-order received CRYPTO frames, a ncbuf instance is
allocated. This is done via the helper quic_get_ncbuf().

Buffer allocation was improperly checked. In case b_alloc() fails, it
crashes due to a BUG_ON(). Fix this by removing it. The function now
returns NULL on allocation failure, which is already properly handled in
its caller qc_handle_crypto_frm().

This should fix the last reported crash from github issue #2935.

This must be backported up to 2.6.

(cherry picked from commit 4309a6fbf80240b0880c5adf091f0075c3bcd53f)
Signed-off-by: Amaury Denoyelle <adenoyelle@haproxy.com>

[RELEASE] Released version 3.1.7

Released version 3.1.7 with the following main changes :
    - MINOR: log: support "raw" logformat node typecast
    - BUG/MINOR: peers: fix expire learned from a peer not converted from ms to ticks
    - BUG/MEDIUM: peers: prevent learning expiration too far in futur from unsync node
    - BUG/MEDIUM: mux-quic: fix crash on RS/SS emission if already close local
    - BUG/MINOR: mux-quic: remove extra BUG_ON() in _qcc_send_stream()
    - BUG/MINOR: log: fix gcc warn about truncating NUL terminator while init char arrays
    - BUG/MINOR: config: silence .notice/.warning/.alert in discovery mode
    - DOC: config: fix two missing "content" in "tcp-request" examples
    - BUILD: compiler: undefine the CONCAT() macro if already defined
    - BUG/MINOR: rhttp: fix incorrect dst/dst_port values
    - BUG/MINOR: backend: do not overwrite srv dst address on reuse
    - BUG/MEDIUM: backend: fix reuse with set-dst/set-dst-port
    - BUILD: quic_sock: address a strict-aliasing build warning with gcc 5 and 6
    - DOC: update INSTALL to reflect the minimum compiler version
    - BUG/MEDIUM: stream: Fix a possible freeze during a forced shut on a stream
    - TESTS: Fix build for filltab25.c
    - MINOR: task: add thread safe notification_new and notification_wake variants
    - BUG/MINOR: hlua_fcn: fix potential UAF with Queue:pop_wait()
    - CLEANUP: log: adjust _lf_cbor_encode_byte() comment
    - BUG/MINOR: log: fix CBOR encoding with LOG_VARTEXT_START() + lf_encode_chunk()
    - BUG/MEDIUM: sample: fix risk of overflow when replacing multiple regex back-refs
    - BUG/MINOR: backend: do not use the source port when hashing clientip
    - BUG/MINOR: hlua: fix invalid errmsg use in hlua_init()
    - BUG/MINOR: debug: remove the trailing \n from BUG_ON() statements
    - DOC: config: add the missing "profiling.memory" to the global kw index
    - DOC: config: add the missing "force-cfg-parser-pause" to the global kw index
    - BUG/MINOR: http-ana: Properly detect client abort when forwarding the response
    - BUG/MEDIUM: http-ana: Report 502 from req analyzer only during rsp forwarding
    - MINOR: quic: remove references to burst in quic-cc-algo parsing
    - MINOR: quic: allow BBR testing without pacing
    - MINOR: quic: transform pacing settings into a global option
    - MINOR: quic: define quic_tune
    - BUILD: quic: fix overflow in global tune
    - MINOR: tools: let dump_addr_and_bytes() support dumping before the offset
    - MINOR: debug: in call traces, dump the 8 bytes before the return address, not after
    - MINOR: debug: detect call instructions and show the branch target in backtraces
    - BUG/MEDIUM: h3: trim whitespaces when parsing headers value
    - BUG/MEDIUM: h3: trim whitespaces in header value prior to QPACK encoding
    - BUG/MINOR: h3: filter upgrade connection header
    - BUG/MINOR: h3: reject invalid :path in request
    - BUG/MINOR: h3: reject request URI with invalid characters
    - BUG/MEDIUM: hlua: fix hlua_applet_{http,tcp}_fct() yield regression (lost data)
    - MINOR: compiler: add a __has_builtin() macro to detect features more easily
    - MINOR: compiler: add a new "ASSUME" macro to help the compiler
    - MINOR: compiler: also enable __builtin_assume() for ASSUME()
    - MINOR: compiler: add ASSUME_NONNULL() to tell the compiler a pointer is valid
    - BUILD: makefile: enable backtrace by default on musl
    - BUG/MINOR: threads: set threads_idle and threads_harmless even with no threads
    - BUG/MINOR debug: fix !USE_THREAD_DUMP in ha_thread_dump_fill()
    - BUG/MINOR: wdt/debug: avoid signal re-entrance between debugger and watchdog
    - BUG/MINOR: debug: detect and prevent re-entrance in ha_thread_dump_fill()
    - MINOR: tools: also protect the library name resolution against concurrent accesses
    - MINOR: tools: protect dladdr() against reentrant calls from the debug handler
    - MINOR: debug: protect ha_dump_backtrace() against risks of re-entrance
    - MINOR: tinfo: keep a copy of the pointer to the thread dump buffer
    - MINOR: debug: always reset the dump pointer when done
    - MINOR: debug: remove unused case of thr!=tid in ha_thread_dump_one()
    - MINOR: pass a valid buffer pointer to ha_thread_dump_one()
    - MEDIUM: wdt: always make the faulty thread report its own warnings
    - MINOR: debug: make ha_stuck_warning() only work for the current thread
    - MINOR: debug: make ha_stuck_warning() print the whole message at once
    - MINOR: compiler: rely on builtin detection for __builtin_unreachable()
    - BUG/MINOR: mux-h2: prevent past scheduling with idle connections
    - BUG/MINOR: rhttp: fix reconnect if timeout connect unset
    - BUG/MINOR: rhttp: ensure GOAWAY can be emitted after reversal

BUG/MINOR: rhttp: ensure GOAWAY can be emitted after reversal

GOAWAY emission should not be emitted before preface. Thus, max_id field
from h2c acting as a server is initialized to -1, which prevents its
emission until preface is received from the peer. If acting as a client,
max_id is initialized to a valid value on the first h2s emission.

This causes an issue with reverse HTTP on the active side. First, it
starts as a client, so the peer does not emit a preface but instead a
simple SETTINGS frame. As role are switched, max_id is initialized much
later when the first h2s response is emitted. Thus, if the connection
must be terminated before any stream transfer, GOAWAY cannot be emitted.

To fix this, ensure max_id is initialized to 0 on h2_conn_reverse() for
active connect side. Thus, a GOAWAY indicating that no stream has been
handled can be generated.

Note that passive connect side is not impacted, as it max_id is
initialized thanks to preface reception.

This should be backported up to 2.9.

(cherry picked from commit 9e6f8ce3282fc5461e30adae3c76cbceb2bc327e)
Signed-off-by: Willy Tarreau <w@1wt.eu>

BUG/MINOR: rhttp: fix reconnect if timeout connect unset

Active connect on reverse http relies on connect timeout to detect
connection failure. Thus, if this timeout was unset, connection failure
may not be properly detected.

Fix this by fallback on hardcoded value of 1s for connect if timeout is
unset in the configuration. This is considered as a minor bug, as
haproxy advises against running with timeout unset.

This must be backported up to 2.9.

(cherry picked from commit 2b8da5f9ab8639f4213ee21fc5fd5ca12c53cddb)
Signed-off-by: Willy Tarreau <w@1wt.eu>

BUG/MINOR: mux-h2: prevent past scheduling with idle connections

While reviewing HTTP/2 MUX timeout, it seems there is a possibility that
MUX task is requeued via h2c_update_timeout() with an already expired
date. This can happens with idle connections on two cases :
* first with shut timeout, as timer is not refreshed if already set
* second with http-request and keep-alive timers, which are based on
idle_start

Queuing an already expired task is an undefined behavior. Fix this by
using task_wakeup() instead of task_queue() at the end of
h2c_update_timeout() if such case occurs.

This should be backported up to 2.6.

(cherry picked from commit 3ebdd3ae509236af1f19d80cab98b385a751426b)
Signed-off-by: Willy Tarreau <w@1wt.eu>

MINOR: compiler: rely on builtin detection for __builtin_unreachable()

Due to __builtin_unreachable() only being associated to gcc 4.5 and
above, it turns out it was not enabled for clang. It's not used *that*
much but still a little bit, so let's enable it now. This reduces the
code size by 0.2% and makes it a bit more efficient.

(cherry picked from commit 41fc18b1d1a37e2a6645ea4a8fb4d2d5f009fc0c)
[wt: backported to shut a gcc warning in debug.c with USE_THREAD=0]
Signed-off-by: Willy Tarreau <w@1wt.eu>

MINOR: debug: make ha_stuck_warning() print the whole message at once

It has been noticed quite a few times during troubleshooting and even
testing that warnings can happen in avalanches from multiple threads
at the same time, and that their reporting it interleaved bacause the
output is produced in small chunks. Originally, this code inspired by
the panic code aimed at making sure to log whatever could be emitted
in case it would crash later. But this approach was wrong since writes
are atomic, and performing 5 writes in sequence in each dumping thread
also means that the outputs can be mixed up at 5 different locations
between multiple threads. The output of warnings is never very long,
and the stack-based buffer is 4kB so let's just concatenate everything
in the buffer and emit it at once using a single write(). Now there's
no longer this confusion on the output.

(cherry picked from commit 513397ac8215cdc1c0ab077a4a6809d7bd32f262)
Signed-off-by: Willy Tarreau <w@1wt.eu>

MINOR: debug: make ha_stuck_warning() only work for the current thread

Since we no longer call it with a foreign thread, let's simplify its code
and get rid of the special cases that were relying on ha_thread_dump_fill()
and synchronization with a remote thread. We're not only dumping the
current thread so ha_thread_dump_one() is sufficient.

(cherry picked from commit c16d5415a807f5a46d34ee6aac4c35fd42dac0c0)
Signed-off-by: Willy Tarreau <w@1wt.eu>

MEDIUM: wdt: always make the faulty thread report its own warnings

Warnings remain tricky to deal with, especially for other threads as
they require some inter-thread synchronization that doesn't cope very
well with other parallel activities such as "show threads" for example.

However there is nothing that forces us to handle them this way. The
panic for example is already handled by bouncing the WDT signal to the
faulty thread.

This commit rearranges the WDT handler to make a better used of this
existing signal bouncing feature of the WDT handler so that it's no
longer limited to panics but can also deal with warnings. In order not
to bounce on all wakeups, we only bounce when there is a suspicion,
that is, when the warning timer has been crossed. We'll let the target
thread verify the stuck flag and context switch count by itself to
decide whether or not to panic, warn, or just do nothing and update
the counters.

As a bonus, now all warning traces look the same regardless of the
reporting thread:

   call trace(16):
   |       0x6bc733 <01 00 00 e8 6d e6 de ff]: ha_dump_backtrace+0x73/0x309 > main-0x2570
   |       0x6bd37a <00 00 00 e8 d6 fb ff ff]: ha_thread_dump_fill+0xda/0x104 > ha_thread_dump_one
   |       0x6bd625 <00 00 00 e8 7b fc ff ff]: ha_stuck_warning+0xc5/0x19e > ha_thread_dump_fill
   |       0x7b2b60 <64 8b 3b e8 00 aa f0 ff]: wdt_handler+0x1f0/0x212 > ha_stuck_warning
   | 0x7fd7e2cef3a0 <00 00 00 00 0f 1f 40 00]: libpthread:+0x123a0
   | 0x7ffc6af9e634 <85 a6 00 00 00 0f 01 f9]: linux-vdso:__vdso_gettimeofday+0x34/0x2b0
   |       0x6bad74 <7c 24 10 e8 9c 01 df ff]: sc_conn_io_cb+0x9fa4 > main-0x2400
   |       0x67c457 <89 f2 4c 89 e6 41 ff d0]: main+0x1cf147
   |       0x67d401 <48 89 df e8 8f ed ff ff]: cli_io_handler+0x191/0xb38 > main+0x1cee80
   |       0x6dd605 <40 48 8b 45 60 ff 50 18]: task_process_applet+0x275/0xce9

(cherry picked from commit a06c215f08703b2a2e303b2e23816d4ad09748f0)
Signed-off-by: Willy Tarreau <w@1wt.eu>