Amaury Denoyelle [Thu, 22 Feb 2024 16:18:15 +0000 (17:18 +0100)]
BUG/MAJOR: promex: fix crash on deleted server
The promex applet is used to dump many metrics. Some of them are related to
a server instance. The applet can be interrupted in the middle of a
dump, for example when waiting for output buffer space. In this case, its
context is saved so the dump can be resumed on the correct instance.
A crash can occur if the dump is interrupted during the servers loop. If the
server instance is deleted between two schedulings of the promex applet,
its context will still reference the deleted server on resume.
To fix this, use server refcount to prevent its deletion during parsing.
As <ctx.sv> is manipulated in a lot of places, use a static helper
function to automatically drop previous server refcount and increment
the new one.
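The helper described above might look like the following minimal sketch. The struct layouts and the srv_take()/srv_drop() bodies are simplified placeholders, not HAProxy's actual refcounting code:

```c
struct server { int refcount; };
struct promex_ctx { struct server *sv; };

static void srv_take(struct server *sv) { sv->refcount++; }
static void srv_drop(struct server *sv) { sv->refcount--; }

/* Point <ctx.sv> to <newsv>, taking a reference on the new server and
 * dropping the one held on the previously tracked server, so the new
 * server cannot be deleted while the dump is interrupted. */
static void promex_set_srv(struct promex_ctx *ctx, struct server *newsv)
{
    if (newsv)
        srv_take(newsv);
    if (ctx->sv)
        srv_drop(ctx->sv);
    ctx->sv = newsv;
}
```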
To complete this change, also implement a release callback to ensure
server refcount is decremented in case of early applet abort.
A corresponding fix was first introduced in the 3.0 dev branch. However, due
to massive differences, a new patch has been designed on top of the
current 2.9 tree. For reference, here is the 3.0 commit:
f81c00cd447e21b07ee355fd5c2fb2a4787ea4a0
BUG/MAJOR: promex: fix crash on deleted server
This must be backported to all stable releases.
Willy Tarreau [Thu, 15 Feb 2024 13:53:05 +0000 (14:53 +0100)]
[RELEASE] Released version 2.9.5
Released version 2.9.5 with the following main changes :
- BUG/MINOR: diag: always show the version before dumping a diag warning
- BUG/MINOR: diag: run the final diags before quitting when using -c
- BUG/MINOR: quic: Wrong ack ranges handling when reaching the limit.
- BUILD: quic: Variable name typo inside a BUG_ON().
- BUG/MINOR: ssl: Fix error message after ssl_sock_load_ocsp call
- CLEANUP: quic: Code clarifications for QUIC CUBIC (RFC 9438)
- BUG/MINOR: quic: fix possible integer wrap around in cubic window calculation
- MINOR: quic: Stop using 1024th of a second.
- MINOR: compiler: add a new DO_NOT_FOLD() macro to prevent code folding
- MINOR: debug: make sure calls to ha_crash_now() are never merged
- MINOR: debug: make ABORT_NOW() store the caller's line number when using abort
- BUILD: debug: remove leftover parentheses in ABORT_NOW()
- MINOR: debug: make BUG_ON() catch build errors even without DEBUG_STRICT
- BUG/MINOR: ssl: Duplicate ocsp update mode when dup'ing ckch
- BUG/MINOR: ssl: Clear the ckch instance when deleting a crt-list line
- MINOR: ssl: Use OCSP_CERTID instead of ckch_store in ckch_store_build_certid
- BUG/MEDIUM: ocsp: Separate refcount per instance and per store
- BUG/MINOR: ssl: Destroy ckch instances before the store during deinit
- BUG/MINOR: ssl: Reenable ocsp auto-update after an "add ssl crt-list"
- REGTESTS: ssl: Fix empty line in cli command input
- REGTESTS: ssl: Add OCSP related tests
- DOC: install: recommend pcre2
- DOC: config: fix misplaced "txn.conn_retries"
- DOC: config: fix typos for "bytes_{in,out}"
- DOC: config: fix misplaced "bytes_{in,out}"
- DOC: internal: update missing data types in peers-v2.0.txt
- BUG/MINOR: vars/cli: fix missing LF after "get var" output
- BUG/MEDIUM: cli: fix once for all the problem of missing trailing LFs
- CI: Update to actions/cache@v4
- BUILD: address a few remaining calloc(size, n) cases
- BUG/MEDIUM: pool: fix rare risk of deadlock in pool_flush()
- BUG/MEDIUM: ssl: Fix crash when calling "update ssl ocsp-response" when an update is ongoing
- BUG/MEDIUM: quic: Wrong K CUBIC calculation.
- MINOR: quic: Update K CUBIC calculation (RFC 9438)
- MINOR: quic: Dynamic packet reordering threshold
- MINOR: quic: Add a counter for reordered packets
- BUG/MEDIUM: stconn: Allow expiration update when READ/WRITE event is pending
- BUG/MEDIUM: stconn: Don't check pending shutdown to wake an applet up
- CLEANUP: stconn: Move SE flags set by app layer at the end of the bitfield
- MINOR: stconn: Rename SE_FL_MAY_FASTFWD and reorder bitfield
- MINOR: stconn: Add SE flag to announce zero-copy forwarding on consumer side
- MINOR: muxes: Announce support for zero-copy forwarding on consumer side
- BUG/MAJOR: stconn: Check support for zero-copy forwarding on both sides
- MINOR: muxes/applet: Simplify checks on options to disable zero-copy forwarding
- BUG/MEDIUM: mux-h2: Switch pending error to error if demux buffer is empty
- BUG/MEDIUM: mux-h2: Only Report H2C error on read error if demux buffer is empty
- BUG/MEDIUM: mux-h2: Don't report error on SE if error is only pending on H2C
- BUG/MEDIUM: mux-h2: Don't report error on SE for closed H2 streams
Christopher Faulet [Mon, 18 Dec 2023 17:24:49 +0000 (18:24 +0100)]
BUG/MEDIUM: mux-h2: Don't report error on SE for closed H2 streams
An error on the H2 connection was always reported as an error to the
stream-endpoint descriptor, regardless of the H2 stream state. But it is
a bug to do so for closed streams. Indeed, it led to reporting an "SD--"
termination state for some streams while the response was fully received and
forwarded to the client, at least from the backend's point of view.
Now, errors are no longer reported for H2 streams in closed state.
This patch is related to the three previous ones:
* "BUG/MEDIUM: mux-h2: Don't report error on SE for closed H2 streams"
* "BUG/MEDIUM: mux-h2: Don't report error on SE if error is only pending on H2C"
* "BUG/MEDIUM: mux-h2: Only Report H2C error on read error if demux buffer is empty"
The series should fix a bug reported in issue #2388
(#2388#issuecomment-1855735144). The series should be backported to 2.9 but
only after a period of observation. In theory, older versions are also
affected but this part is pretty sensitive. So don't backport it further
unless someone asks for it.
(cherry picked from commit d9eb6d6680c4baccaced3668f8b0c7fc19246fd1)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Christopher Faulet [Mon, 18 Dec 2023 17:19:33 +0000 (18:19 +0100)]
BUG/MEDIUM: mux-h2: Don't report error on SE if error is only pending on H2C
In h2s_wake_one_stream(), we must not report an error on the stream-endpoint
descriptor if the error is not yet definitive on the H2 connection. A pending
error on the H2 connection means there are potentially remaining data to be
demuxed. It is important not to truncate a message for a stream.
This patch is part of a series that should fix a bug reported in issue #2388
(#2388#issuecomment-1855735144). Backport instructions will be shipped in
the last commit of the series.
(cherry picked from commit 580ffd612348975cf3390ba5b3603c99046c848a)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Christopher Faulet [Mon, 18 Dec 2023 17:09:25 +0000 (18:09 +0100)]
BUG/MEDIUM: mux-h2: Only Report H2C error on read error if demux buffer is empty
It is similar to the previous fix ("BUG/MEDIUM: mux-h2: Don't report H2C
error on read error if dmux buffer is not empty"), but on the receive side.
If the demux buffer is not empty, an error on the TCP connection must not be
immediately reported as an error on the H2 connection. We must be sure to
have tried to demux all data first. Otherwise, messages for one or more
streams may be truncated while all data were already received and are
waiting to be demuxed.
This patch is part of a series that should fix a bug reported in issue #2388
(#2388#issuecomment-1855735144). Backport instructions will be shipped in
the last commit of the series.
(cherry picked from commit 19fb19976f21a370fb20f528e0e570fd1a300f38)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Christopher Faulet [Mon, 18 Dec 2023 16:53:52 +0000 (17:53 +0100)]
BUG/MEDIUM: mux-h2: Switch pending error to error if demux buffer is empty
When an error on the H2 connection is detected while sending data, only a
pending error is reported, waiting for an error or a shutdown on the read
side. However, if a shutdown was already received, the pending error is
switched to a definitive error.
At this stage, we must also wait to have flushed the demux
buffer. Otherwise, if some data must still be demuxed, messages for one or
more streams may be truncated. The flag H2_CF_END_REACHED already exists
to know a shutdown was received and that we can no longer make progress on
the demux side (buffer empty or data truncated). On the sending side, we
should use this flag instead to report a definitive error.
This patch is part of a series that should fix a bug reported in issue #2388
(#2388#issuecomment-1855735144). Backport instructions will be shipped in
the last commit of the series.
(cherry picked from commit 5b78cbae770e9debab186c4bbf2b708eb4561007)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Christopher Faulet [Wed, 14 Feb 2024 14:17:17 +0000 (15:17 +0100)]
MINOR: muxes/applet: Simplify checks on options to disable zero-copy forwarding
Global options to disable zero-copy forwarding are now tested outside the
callbacks responsible for performing the forwarding itself. It is cleaner
this way because we don't try zero-copy forwarding at all if at least one
side does not support it. It is equivalent to what was performed before,
but it is simpler this way.
(cherry picked from commit 081022a0c5877fde06910c02eb58a9a1a3f8c3a8)
[cf: changes about cache and stats applets were removed]
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Christopher Faulet [Wed, 14 Feb 2024 14:29:38 +0000 (15:29 +0100)]
BUG/MAJOR: stconn: Check support for zero-copy forwarding on both sides
There is a nego stage when a producer is ready to forward data to the other
side. At this stage, the zero-copy forwarding may be disabled if the
consumer does not support it. However, there is a flaw in this way of
proceeding. If the channel buffer is not empty, we delay the zero-copy
forwarding to flush all data from the channel first. During this delay,
receives on the endpoint (at the connection level for muxes) are blocked to
be sure to have the opportunity to switch to zero-copy forwarding. It is a
problem if the consumer cannot flush data from the channel's buffer, waiting
for more data for instance.
It is especially annoying with the CLI applet, because this scenario can
happen if a command is partially received, for instance without the LF at
the end. In this case, the CLI applet is blocked because it waits for more
data. The frontend connection is also blocked because the channel's data
must be flushed before trying to receive more data. Worse, this happens
where no timeout is armed. Thus the session is stuck infinitely, client
aborts cannot be detected because receives are blocked, and the applet
cannot abort on its side because there are pending outgoing data. It is
clearly a situation where it is easy to consume all CLI slots.
To fix the issue, thanks to previous commits, we now check zero-copy
forwarding support on both sides before proceeding.
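The nego-stage condition can be sketched as a check that both sides advertise support. The flag values and the helper below are illustrative only; the real flags live in the SE descriptor bitfield:

```c
/* Illustrative flag values only; the real ones sit in the SE
 * descriptor bitfield. */
#define SE_FL_MAY_FASTFWD_PROD 0x1
#define SE_FL_MAY_FASTFWD_CONS 0x2

/* Zero-copy forwarding is attempted only when the producer endpoint
 * and the consumer endpoint both advertise support for it. */
static int may_fast_forward(unsigned int prod_flags, unsigned int cons_flags)
{
    return (prod_flags & SE_FL_MAY_FASTFWD_PROD) &&
           (cons_flags & SE_FL_MAY_FASTFWD_CONS);
}
```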
This patch relies on the following commits:
* MINOR: muxes: Announce support for zero-copy forwarding on consumer side
* MINOR: stconn: Add SE flag to announce zero-copy forwarding on consumer side
* MINOR: stconn: Rename SE_FL_MAY_FASTFWD and reorder bitfield
* CLEANUP: stconn: Move SE flags set by app layer at the end of the bitfield
All the series must be backported to 2.9.
(cherry picked from commit ddf6b7539cc847ae07b06d000c3f11d723ca6036)
[cf: Changes about applet functions were removed]
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Christopher Faulet [Wed, 14 Feb 2024 14:15:09 +0000 (15:15 +0100)]
MINOR: muxes: Announce support for zero-copy forwarding on consumer side
It is unused for now, but the muxes announce their support of zero-copy
forwarding on the consumer side. All muxes, except the fcgi one, support
it.
(cherry picked from commit e2921ffad103cd0cf9a204f56380ce6138fe2c40)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Christopher Faulet [Wed, 14 Feb 2024 14:09:13 +0000 (15:09 +0100)]
MINOR: stconn: Add SE flag to announce zero-copy forwarding on consumer side
The SE_FL_MAY_FASTFWD_CONS flag is added; it will be used by endpoints to
announce their support for zero-copy forwarding on the consumer
side. The flag is not necessarily permanent. However, it will be used this
way for now.
(cherry picked from commit 22d4b0e901fff1392ba9a383271bb619f6309b9e)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Christopher Faulet [Wed, 14 Feb 2024 14:00:30 +0000 (15:00 +0100)]
MINOR: stconn: Rename SE_FL_MAY_FASTFWD and reorder bitfield
To fix a bug, a flag announcing the capability to support zero-copy
forwarding on the consumer side will be added to the SE descriptor. So the
old flag SE_FL_MAY_FASTFWD is renamed to indicate it concerns the producer
side. It is now SE_FL_MAY_FASTFWD_PROD. And to prepare the addition of the
new flag, the bitfield is slightly reordered.
(cherry picked from commit 7598c0ba699cab31cf3df6eb59506c9b5d7edf7e)
[cf: some parts about applets were removed. A flag occurrence was missed in
src/mux_pt.c in the original commit. This was fixed]
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Christopher Faulet [Wed, 14 Feb 2024 13:52:24 +0000 (14:52 +0100)]
CLEANUP: stconn: Move SE flags set by app layer at the end of the bitfield
To fix a bug, some SE flags must be added or renamed. To avoid mixing flags
set by the endpoint and flags set by the app, the second set of flags is
moved to the end of the bitfield, leaving the holes in the middle.
(cherry picked from commit 8cfc11f46188118d9b714b9e4e8dd2b44be18c3d)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Christopher Faulet [Mon, 12 Feb 2024 17:30:33 +0000 (18:30 +0100)]
BUG/MEDIUM: stconn: Don't check pending shutdown to wake an applet up
This reverts commits 0b93ff8c87 ("BUG/MEDIUM: stconn: Wake applets on
sending path if there is a pending shutdown") and 9e394d34e0 ("BUG/MINOR:
stconn: Don't report blocked sends during connection establishment")
because they were not the right fixes.
We must not wake an applet up when a shutdown is pending because it means
some output data are still blocked in the channel buffer. The applet does
not necessarily consume these data. In this case, the applet may be woken up
infinitely, unless it explicitly reports it won't consume data yet.
This patch must be backported as far as 2.8. For older versions, as far as
2.2, it may be backported. If so, a previous fix must be pushed to prevent
an HTTP applet from being stuck. In http_ana.c, in http_end_request() and
http_end_response(), the call to channel_htx_truncate() on the request
channel in case of MSG_ERROR must be replaced by a call to
channel_htx_erase().
(cherry picked from commit 40d98176ba12367ed323edae34373b5f8deaee64)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Christopher Faulet [Mon, 12 Feb 2024 10:37:10 +0000 (11:37 +0100)]
BUG/MEDIUM: stconn: Allow expiration update when READ/WRITE event is pending
When a READ or a WRITE activity is reported on a channel, the corresponding
dates are updated: the last-read-activity date (lra) is updated and the
first-send-block date (fsb) is reset. The event is also reported at the
channel level by setting the CF_READ_EVENT or CF_WRITE_EVENT flag. When one
of these flags is set, it prevents the update of the stream's task
expiration date from sc_notify(). It also prevents the corresponding
timeout from being reported from process_stream().
But this is a problem during the fast-forwarding stage if no expiration
date was set by the stream. Only process_stream() resets these flags. So a
first READ or WRITE event will prevent any update of the stream's
expiration date until a new call to process_stream(). But with no
expiration date, this will only happen on a shutdown/abort event, blocking
the stream for a while.
It is for instance possible to block the stats applet or the cli applet if a
client does not consume the response. The stream may be blocked, the client
timeout is not respected and the stream can only be closed on a client
abort.
So now, we update the stream's expiration date regardless of the reported
READ/WRITE events. It is not a big deal because the lra and fsb dates are
properly updated. It also means an old READ/WRITE event will no longer
prevent the stream from reporting a timeout, which is expected too.
This patch must be backported as far as 2.8. On older versions, timeouts and
stream's expiration date are not updated in the same way and this works as
expected.
(cherry picked from commit 2c672f282d0ad2ba7721339a8a7e1660f1f46600)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Frederic Lecaille [Tue, 13 Feb 2024 20:24:40 +0000 (21:24 +0100)]
MINOR: quic: Add a counter for reordered packets
A packet is considered reordered when it is detected as lost because the
largest acknowledged packet number exceeds its packet number by at least
the packet reordering threshold value.
Add a new ->nb_reordered_pkt quic_loss struct member, at the same location
as the number of lost packets, to count such packets.
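A minimal sketch of the counting rule, with a simplified quic_loss struct carrying only the two counters mentioned above (the real struct holds much more):

```c
struct quic_loss {
    unsigned long long nb_lost_pkt;      /* total packets deemed lost */
    unsigned long long nb_reordered_pkt; /* lost via the reordering threshold */
};

/* Account for packet <pn> being declared lost: it counts as reordered
 * when the largest acknowledged packet number exceeds it by at least
 * the reordering threshold. */
static void account_lost(struct quic_loss *ql, unsigned long long pn,
                         unsigned long long largest_acked,
                         unsigned long long reorder_threshold)
{
    ql->nb_lost_pkt++;
    if (largest_acked >= pn + reorder_threshold)
        ql->nb_reordered_pkt++;
}
```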
Should be backported to 2.6.
(cherry picked from commit 167e38e0e0296e899aa894d2f3db5ba2a0c68cb5)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Frederic Lecaille [Tue, 13 Feb 2024 18:38:46 +0000 (19:38 +0100)]
MINOR: quic: Dynamic packet reordering threshold
Let's say that the largest packet number acknowledged by the peer is #10.
When inspecting the not-yet-acknowledged packets to detect whether they are
lost, a packet is deemed lost at least if the difference between this
largest packet number and its own packet number is greater than or equal to
the packet reordering threshold as defined by RFC 9002. This threshold must
not be less than QUIC_LOSS_PACKET_THRESHOLD (3). With such a value, packets
#7 and older are detected as lost if not acknowledged, contrary to packets
#8 or #9.
So, the packet loss detection is very sensitive to a network characteristic
where not-yet-acknowledged packets are distant from each other by their
packet number differences.
Do not use this static value anymore for the packet reordering threshold
used as a criterion to detect packet loss. Instead, make it depend on the
difference between the number of the last transmitted packet and the number
of the oldest one among the packets still in flight, before they are
inspected to be deemed lost.
Add a new tune.quic.reorder-ratio setting to apply a ratio in percent to
this dynamic packet reordering threshold.
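The dynamic threshold might be computed as in this hedged sketch; the function and parameter names are invented, and only the percent-ratio idea is taken from the commit text:

```c
#define QUIC_LOSS_PACKET_THRESHOLD 3ULL

/* Derive the reordering threshold from the spread between the newest
 * transmitted packet and the oldest packet still in flight, scaled by
 * a percentage (tune.quic.reorder-ratio), and clamped to the RFC 9002
 * minimum of 3. */
static unsigned long long
pkt_reorder_threshold(unsigned long long newest_pn,
                      unsigned long long oldest_in_flight_pn,
                      unsigned int reorder_ratio_pct)
{
    unsigned long long thresh =
        (newest_pn - oldest_in_flight_pn) * reorder_ratio_pct / 100;

    if (thresh < QUIC_LOSS_PACKET_THRESHOLD)
        thresh = QUIC_LOSS_PACKET_THRESHOLD;
    return thresh;
}
```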
Should be backported to 2.6.
(cherry picked from commit eeeb81bb499677ccaf83858dbdde21c9f3db4341)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Frederic Lecaille [Mon, 12 Feb 2024 10:37:17 +0000 (11:37 +0100)]
MINOR: quic: Update K CUBIC calculation (RFC 9438)
The new formula for K CUBIC which arrives with RFC 9438 is as follows:
K = cubic_root((W_max - cwnd_epoch) / C)
Note that W_max is c->last_w_max, and cwnd_epoch is c->cwnd when entering
quic_cubic_update() just after a congestion event.
Must be backported as far as 2.6.
(cherry picked from commit 2ed53ae4a02ba49f529addb9003b747c8ab281b1)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Frederic Lecaille [Mon, 12 Feb 2024 10:07:54 +0000 (11:07 +0100)]
BUG/MEDIUM: quic: Wrong K CUBIC calculation.
The formula for K CUBIC calculation is as follows:
K = cubic_root(W_max * (1 - beta_quic) / C).
Note that this does not match the comment. But the aim of this patch is to
avoid hiding a bug inside another patch updating this K CUBIC calculation.
The unit of C is bytes/s^3 (or segments/s^3), and we want to store K as
milliseconds. So, the conversion inside cubic_root() from seconds to
milliseconds is wrong: the unit used there is bytes/(ms/1000)^3, i.e.
bytes*1000^3/ms^3. That said, it is preferable to compute K in seconds,
then convert to milliseconds, as done by this patch.
Must be backported as far as 2.6.
(cherry picked from commit 406c63ba44666deb7f09a36748039aaafa8a7d5d)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Remi Tricot-Le Breton [Fri, 9 Feb 2024 15:13:43 +0000 (16:13 +0100)]
BUG/MEDIUM: ssl: Fix crash when calling "update ssl ocsp-response" when an update is ongoing
The CLI command "update ssl ocsp-response" was forcefully removing an
OCSP response from the update tree regardless of whether it was in it
beforehand or not. But since the main OCSP update task works by removing
the entry being currently updated from the update tree and then
reinserting it when the update process is over, it meant that in the CLI
command code we were modifying a structure that was already being used.
These concurrent accesses were not properly locked on the "regular"
update case because it was assumed that once an entry was removed from
the update tree, the update task was the only one able to work on it.
Rather than locking the whole update process, an "updating" flag was
added to the certificate_ocsp in order to prevent the "update ssl
ocsp-response" command from trying to update a response already being
updated.
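The "updating" guard can be illustrated with an atomic test-and-set; the struct and function names below are hypothetical, and the real code coordinates with the update task's locking:

```c
#include <stdatomic.h>

struct ocsp_resp { atomic_int updating; };

/* The CLI handler only proceeds if it can atomically claim the
 * response; it fails when the main update task (or another CLI call)
 * already flagged it as being updated. */
static int cli_try_update(struct ocsp_resp *r)
{
    int expected = 0;

    return atomic_compare_exchange_strong(&r->updating, &expected, 1);
}
```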
An easy way to reproduce this crash was to perform two "simultaneous"
calls to "update ssl ocsp-response" on the same certificate. It would
then crash on an eb64_delete call in the main ocsp update task function.
This patch can be backported up to 2.8.
(cherry picked from commit 5e66bf26ecbf6439fafc8ef8857abe22e0874f4d)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Willy Tarreau [Sat, 10 Feb 2024 11:29:53 +0000 (12:29 +0100)]
BUG/MEDIUM: pool: fix rare risk of deadlock in pool_flush()
As reported by github user @JB0925 in issue #2427, there is a possible
crash in pool_flush(). The problem is that if the free_list is not empty
in the first test, and is empty at the moment the xchg() is performed,
for example because another thread called it in parallel, we place a
POOL_BUSY there that is never removed later, causing the next thread to
wait forever.
This was introduced in 2.5 with commit 2a4523f6f ("BUG/MAJOR: pools: fix
possible race with free() in the lockless variant"). It has probably
very rarely been detected, because:
- pool_flush() is only called when stopping is set
- the function does nothing if global pools are disabled, which is
the case on most modern systems with a fast memory allocator.
It's possible to reproduce it by modifying __task_free() to call
pool_flush() on 1% of the calls instead of only when stopping.
The fix is quite simple: it consists in moving the zeroing of the
entry in the break path after verifying that the entry was not already
busy.
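The fixed ordering can be illustrated by this simplified lockless sketch (the names and the POOL_BUSY encoding are assumptions; the real pool code manages per-thread buckets and frees the grabbed list):

```c
#include <stdatomic.h>
#include <stddef.h>

#define POOL_BUSY ((void *)1)

/* Atomically detach the whole free list, marking it POOL_BUSY while
 * doing so. The placeholder is cleared only after the exchange
 * succeeded, so a thread that lost the race can no longer leave
 * POOL_BUSY behind forever. */
static void *pool_take_list(_Atomic(void *) *free_list)
{
    void *entry;

    while (1) {
        entry = atomic_load(free_list);
        if (entry == POOL_BUSY)
            continue;  /* another thread is detaching the list; retry */
        if (atomic_compare_exchange_weak(free_list, &entry, POOL_BUSY))
            break;     /* we now own the whole list */
    }
    atomic_store(free_list, NULL);
    return entry;
}
```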
This must be backported wherever commit 2a4523f6f is.
(cherry picked from commit b746af9990d2f7fe10d200026831b6eb10c4953f)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Willy Tarreau [Sat, 10 Feb 2024 10:35:07 +0000 (11:35 +0100)]
BUILD: address a few remaining calloc(size, n) cases
In issue #2427 Ilya reports that gcc-14 rightfully complains about
sizeof() being placed in the left term of calloc(). There's no impact
but it's a bad pattern that gets copy-pasted over time. Let's fix the
few remaining occurrences (debug.c, halog, udp-perturb).
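For illustration, the preferred argument order, with an invented struct and wrapper:

```c
#include <stdlib.h>

struct entry { int a; char name[32]; };

/* Conventional calloc() order: number of elements first, element size
 * second. Writing calloc(sizeof(struct entry), n) allocates the same
 * amount, but gcc-14 flags the transposed arguments. */
static struct entry *alloc_entries(size_t n)
{
    return calloc(n, sizeof(struct entry));
}
```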
This can be backported to all branches, and the irrelevant parts dropped.
(cherry picked from commit ab8928b9dbbae77602b1cd50b0f7189835df9164)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Tim Duesterhus [Thu, 8 Feb 2024 18:55:23 +0000 (19:55 +0100)]
CI: Update to actions/cache@v4
No functional change, but this upgrade is required, due to the v3 runtime being
deprecated:
> Node.js 16 actions are deprecated. Please update the following actions to use
> Node.js 20: actions/cache@v3. For more information see:
> https://github.blog/changelog/2023-09-22-github-actions-transitioning-from-node-16-to-node-20/.
(cherry picked from commit 4559470728bf12401819814647e144367a3737eb)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Willy Tarreau [Thu, 8 Feb 2024 17:15:23 +0000 (18:15 +0100)]
BUG/MEDIUM: cli: fix once for all the problem of missing trailing LFs
Some commands are still missing their trailing LF, and a few were even
already spotted in the past emitting more than one. The risk of missing
this LF is particularly high, especially when tests are run in
non-interactive mode where the output looks good at first glance. The
problem is that once run in interactive mode, the missing empty line makes
the command appear incomplete, and scripts can wait forever.
Let's tackle the problem at its root: messages emitted at the end must
always end with an LF, and we know some miss it. Thus, in cli_output_msg()
we now start by removing the trailing LFs from the string, and we always
add exactly one. This way the trailing LFs added by correct functions are
silently ignored and all functions become correct.
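The normalization described above can be sketched as follows; the function name and buffer-based signature are invented, unlike the real cli_output_msg() which works on HAProxy buffers:

```c
#include <stddef.h>

/* Strip every trailing LF from <msg>, then append exactly one, so
 * emitters that were already correct are left unchanged and broken
 * ones are fixed. The string is NUL-terminated and the new length is
 * returned; <size> is the total buffer capacity. */
static size_t normalize_trailing_lf(char *msg, size_t len, size_t size)
{
    while (len && msg[len - 1] == '\n')
        len--;
    if (len < size - 1)
        msg[len++] = '\n';
    msg[len] = '\0';
    return len;
}
```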
This would need to be progressively backported to all supported versions
in order to address them all at once, though the risk of breaking a legacy
script relying on the wrong output is never zero. At first it should at
least go as far as the latest LTS (2.8), and maybe another one depending
on user demand. Note that it also requires the previous patch ("BUG/MINOR:
vars/cli: fix missing LF after "get var" output") because it fixes a test
for a bogus output for "get var" in a VTC.
(cherry picked from commit 6219a58d285060b1eaa33c4c56f89a37a33230a4)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Willy Tarreau [Thu, 8 Feb 2024 17:13:32 +0000 (18:13 +0100)]
BUG/MINOR: vars/cli: fix missing LF after "get var" output
"get var" on the CLI was also missing an LF, and the vtest as well, so
that fixing only the code breaks the vtest. This must be backported to
2.4 as the issue was brought with commit
c35eb38f1d ("MINOR: vars/cli:
add a "get var" CLI command to retrieve global variables").
(cherry picked from commit 5d0dd88ac6c01b1e4ecf7e63efa7ab2f7c1325cf)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Willy Tarreau [Fri, 19 Jan 2024 15:55:17 +0000 (16:55 +0100)]
DOC: internal: update missing data types in peers-v2.0.txt
This is apparently the only location where the stored data types are
documented, but it was quite outdated as it stopped at gpc1 rate. This
patch adds the missing types (up to and including gpc_rate).
(cherry picked from commit 668eb9aebf5d751933f05bad161a8f49d1f634b1)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Aurelien DARRAGON [Wed, 7 Feb 2024 16:23:56 +0000 (17:23 +0100)]
DOC: config: fix misplaced "bytes_{in,out}"
Counters are managed at the stream level and also work in TCP mode.
They were found in the Layer 7 section, so they are moved to the Layer 4
section instead.
This could be backported in 2.9 with fa0a304f3 ("DOC: config: add an
index of sample fetch keywords")
(cherry picked from commit ad8625cb6418dabb25fd133fe34fa1d1da035ea5)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Aurelien DARRAGON [Tue, 6 Feb 2024 17:49:28 +0000 (18:49 +0100)]
DOC: config: fix typos for "bytes_{in,out}"
An extra space was placed at the start of the "bytes_out" description,
and dconv was having a hard time properly rendering the text in HTML
format because of it.
Finally, remove an extra line feed.
This should be backported in 2.9 with c7424a1ba ("MINOR: samples:
implement bytes_in and bytes_out samples")
(cherry picked from commit 8126d3b007a9677aa310418dcc41536fa7d74461)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Aurelien DARRAGON [Tue, 6 Feb 2024 17:46:56 +0000 (18:46 +0100)]
DOC: config: fix misplaced "txn.conn_retries"
txn.conn_retries was inserted in the internal states sample table, but it
should belong to the Layer 4 sample table instead (SMP_USE_L4SRV).
This should be backported in 2.9 with fa0a304f3 ("DOC: config: add an
index of sample fetch keywords")
(cherry picked from commit a63e3fdc174b790dd4e63bc7f9623df4986aac75)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Abhijeet Rastogi [Thu, 8 Feb 2024 02:47:42 +0000 (18:47 -0800)]
DOC: install: recommend pcre2
Makefile comments are also updated to recommend the PCRE2 related
options. PCRE1 is EOL by now:
https://www.mail-archive.com/haproxy@formilux.org/msg41326.html
(cherry picked from commit 2192dfa6b69f499443afd0f4ffb74eea550d0ecb)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Remi Tricot-Le Breton [Wed, 7 Feb 2024 15:38:46 +0000 (16:38 +0100)]
REGTESTS: ssl: Add OCSP related tests
Add tests that combine the OCSP update mechanism with the various
preexisting commands that allow manipulating certificates and
crt-lists.
(cherry picked from commit 79d526f6da8ed29a4c5743f66b90e6af34cbd59c)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Remi Tricot-Le Breton [Wed, 7 Feb 2024 15:38:47 +0000 (16:38 +0100)]
REGTESTS: ssl: Fix empty line in cli command input
The 'set ssl cert' command was failing because of empty lines in the
contents of the PEM file used to perform the update.
We were also missing the issuer in the newly created ckch_store, which
then raised an error when committing the transaction.
(cherry picked from commit 66b20aada4ffc5b3956f43d05f03a32033bd86b6)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Remi Tricot-Le Breton [Wed, 7 Feb 2024 15:38:45 +0000 (16:38 +0100)]
BUG/MINOR: ssl: Reenable ocsp auto-update after an "add ssl crt-list"
If a certificate that has an OCSP URI is unused and gets added to a
crt-list with the OCSP auto-update option "on", it would not have been
inserted into the auto-update tree because this insertion only worked on
the first call of the ssl_sock_load_ocsp function.
If the configuration used a crt-list like the following:
cert1.pem *
cert2.pem [ocsp-update on] *
Then calling "del ssl crt-list" on the second line and then reverting
the delete by calling "add ssl crt-list" with the same line, then the
cert2.pem would not appear in the ocsp update list (can be checked
thanks to "show ssl ocsp-updates" command).
This patch ensures that in such a case we still perform the insertion in
the update tree.
This patch can be backported up to branch 2.8.
(cherry picked from commit e29ec2e64984604506a40434e1c529f8102722be)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Remi Tricot-Le Breton [Wed, 7 Feb 2024 15:38:44 +0000 (16:38 +0100)]
BUG/MINOR: ssl: Destroy ckch instances before the store during deinit
The ckch_store's freeing function might end up calling
'ssl_sock_free_ocsp' if the corresponding certificate had OCSP data.
This OCSP cleanup function expects the 'refcount_instance' member of
the certificate_ocsp structure to be 0, meaning that no live
ckch instance keeps a reference on this certificate_ocsp structure.
But since in ckch_store_free we were destroying the ckch_data before
destroying the linked instances, the BUG_ON would fail during a standard
deinit. Reversing the cleanup order fixes the problem.
Must be backported to 2.8.
(cherry picked from commit a290db5706e76f4cdfd20067a8e73805acddeb65)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Remi Tricot-Le Breton [Wed, 7 Feb 2024 15:38:43 +0000 (16:38 +0100)]
BUG/MEDIUM: ocsp: Separate refcount per instance and per store
With the current way OCSP responses are stored, a single OCSP response
is stored (in a certificate_ocsp structure) when it is loaded during a
certificate parsing, and each ckch_inst that references it increments
its refcount. The reference to the certificate_ocsp is actually kept in
the SSL_CTX linked to each ckch_inst, in an ex_data entry that gets
freed when the context is freed.
One of the downside of this implementation is that is every ckch_inst
referencing a certificate_ocsp gets detroyed, then the OCSP response is
removed from the system. So if we were to remove all crt-list lines
containing a given certificate (that has an OCSP response), the response
would be destroyed even if the certificate remains in the system (as an
unused certificate). In such a case, we would want the OCSP response not
to be "usable", since it is not used by any ckch_inst, but still remain
in the OCSP response tree so that if the certificate gets reused (via an
"add ssl crt-list" command for instance), its OCSP response is still
known as well. But we would also like such an entry not to be updated
automatically anymore once no instance uses it. An easy way to do it
could have been to keep a reference to the certificate_ocsp structure in
the ckch_store as well, on top of all the ones in the ckch_instances,
and to remove the ocsp response from the update tree once the refcount
falls to 1, but it would not work because of the way the ocsp response
tree keys are calculated. They are independent of the ckch_store and
are the actual OCSP_CERTIDs, which are a combination of the issuer's name
hash and key hash, and the certificate's serial number. So two copies of
the same certificate but with different names would still point to the
same ocsp response tree entry.
The solution that answers all the needs expressed above is actually
to have two reference counters in the certificate_ocsp structure, one
for the actual ckch instances and one for the ckch stores. If the
instance refcount becomes 0 then we remove the entry from the auto
update tree, and if the store reference becomes 0 we can then remove the
OCSP response from the tree. This allows chaining "del ssl
crt-list" and "add ssl crt-list" CLI commands without losing any
functionality.
Must be backported to 2.8.
(cherry picked from commit
befebf8b5123a1dd88d7d11298d63356f15d87b8)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
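The two-counter scheme described above can be sketched as follows. Struct
and function names here are purely illustrative, not HAProxy's actual code:

```c
#include <assert.h>

/* Hypothetical sketch of the dual-refcount lifecycle: one counter for
 * ckch instances, one for ckch stores. When the last instance goes away
 * the response leaves the auto-update tree only; when the last store
 * goes away the response leaves the response tree entirely. */
struct ocsp_entry {
    int refcount_instance; /* live ckch_inst references */
    int refcount_store;    /* live ckch_store references */
    int in_update_tree;    /* still eligible for auto-update */
    int in_response_tree;  /* still present in the response tree */
};

/* Dropping the last instance ref only disables auto-update. */
static void ocsp_drop_instance_ref(struct ocsp_entry *e)
{
    if (--e->refcount_instance == 0)
        e->in_update_tree = 0;
}

/* Dropping the last store ref removes the response entirely. */
static void ocsp_drop_store_ref(struct ocsp_entry *e)
{
    if (--e->refcount_store == 0)
        e->in_response_tree = 0;
}
```

This is what makes "del ssl crt-list" followed by "add ssl crt-list" safe:
the store reference keeps the entry alive while all instances are gone.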
Remi Tricot-Le Breton [Wed, 7 Feb 2024 15:38:41 +0000 (16:38 +0100)]
MINOR: ssl: Use OCSP_CERTID instead of ckch_store in ckch_store_build_certid
The only useful information taken out of the ckch_store in order to copy
an OCSP certid into a buffer (later used as a key for entries in the
OCSP response tree) is the ocsp_certid field of the ckch_data structure.
We therefore don't need to pass a pointer to the full ckch_store to
ckch_store_build_certid or even any information related to the store
itself.
The ckch_store_build_certid is then converted into a helper function
that simply takes an OCSP_CERTID and converts it into a char buffer.
(cherry picked from commit
28e78a0a74e0b3007e0e01753bd6703c219f7ade)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Remi Tricot-Le Breton [Wed, 7 Feb 2024 15:38:42 +0000 (16:38 +0100)]
BUG/MINOR: ssl: Clear the ckch instance when deleting a crt-list line
When deleting a crt-list line through a "del ssl crt-list" call on the
CLI, we ended up freeing the corresponding ckch instances without fully
clearing their contents. This left some dangling references on other
objects because the attached SSL_CTX was not deleted, nor were the
ex_data referenced by it (OCSP responses for instance).
This patch can be backported up to branch 2.4.
(cherry picked from commit
23cab33b67dd9dc76de936f47dfa23b1a8ee40e5)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Remi Tricot-Le Breton [Wed, 7 Feb 2024 15:38:40 +0000 (16:38 +0100)]
BUG/MINOR: ssl: Duplicate ocsp update mode when dup'ing ckch
When calling ckchs_dup (during a "set ssl cert" CLI command), if the
modified store had OCSP auto update enabled then the new certificate
would not keep the previous update mode and would not appear in the auto
update list.
This patch can be backported to 2.8.
(cherry picked from commit
1fda0a52029ba5fe5ea4e31da7a71ba94b217170)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Willy Tarreau [Mon, 5 Feb 2024 14:06:05 +0000 (15:06 +0100)]
MINOR: debug: make BUG_ON() catch build errors even without DEBUG_STRICT
As seen in previous commit
59acb27001 ("BUILD: quic: Variable name typo
inside a BUG_ON()."), it can sometimes happen that with DEBUG forced
without DEBUG_STRICT, BUG_ON() statements are ignored. Sadly, it means
that typos there are not even build-tested.
This patch makes these statements reference sizeof(cond) to make sure
the condition is parsed. This doesn't result in any code being emitted,
but makes sure the expression is correct so that an issue such as the one
above will fail to build (which was verified).
This may be backported as it can help spot failed backports as well.
(cherry picked from commit
bc70b385fdbc06cbb83ea0404d2836a434311380)
Signed-off-by: Willy Tarreau <w@1wt.eu>
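The sizeof() trick above can be illustrated with a minimal sketch (the
macro name and helper below are hypothetical, not HAProxy's actual bug.h):

```c
#include <assert.h>

/* Even when the check itself is compiled out, referencing sizeof(cond)
 * forces the compiler to parse the expression, so a typo inside it
 * still breaks the build. The sizeof operand is not evaluated and no
 * code is emitted for it. */
#define BUG_ON_DISABLED(cond) do { (void)sizeof(cond); } while (0)

static int check_build_only(int first)
{
    /* writing "fist" instead of "first" here would now be a build
     * error, even though nothing is evaluated at run time */
    BUG_ON_DISABLED(first < 0);
    return first;
}
```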
Aurelien DARRAGON [Mon, 5 Feb 2024 13:41:16 +0000 (14:41 +0100)]
BUILD: debug: remove leftover parentheses in ABORT_NOW()
Since d480b7b ("MINOR: debug: make ABORT_NOW() store the caller's line
number when using abort"), building with 'DEBUG_USE_ABORT' fails with:
|In file included from include/haproxy/api.h:35,
| from include/haproxy/activity.h:26,
| from src/ev_poll.c:20:
|include/haproxy/thread.h: In function 'ha_set_thread':
|include/haproxy/bug.h:107:47: error: expected ';' before '_with_line'
|  107 | #define ABORT_NOW() do { DUMP_TRACE(); abort()_with_line(__LINE__); } while (0)
|      |                                               ^~~~~~~~~~
|include/haproxy/bug.h:129:25: note: in expansion of macro 'ABORT_NOW'
|  129 |   ABORT_NOW(); \
|      |   ^~~~~~~~~
|include/haproxy/bug.h:123:9: note: in expansion of macro '__BUG_ON'
|  123 |  __BUG_ON(cond, file, line, crash, pfx, sfx)
|      |  ^~~~~~~~
|include/haproxy/bug.h:174:30: note: in expansion of macro '_BUG_ON'
|  174 | # define BUG_ON(cond) _BUG_ON (cond, __FILE__, __LINE__, 3, "FATAL: bug ", "")
|      |                       ^~~~~~~
|include/haproxy/thread.h:201:17: note: in expansion of macro 'BUG_ON'
|  201 |  BUG_ON(!thr->ltid_bit);
|      |  ^~~~~~
|compilation terminated due to -Wfatal-errors.
|make: *** [Makefile:1006: src/ev_poll.o] Error 1
This is because of a leftover: abort()_with_line(__LINE__);
^^
Fix it by removing the extra parentheses after 'abort', since the
abort() call is now performed by the abort_with_line() helper function.
This was raised by Ilya in GH #2440.
No backport is needed, unless the above commit gets backported.
(cherry picked from commit
be0165b24951f68e63b327179babebed399f62ee)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Willy Tarreau [Fri, 2 Feb 2024 16:09:09 +0000 (17:09 +0100)]
MINOR: debug: make ABORT_NOW() store the caller's line number when using abort
Placing DO_NOT_FOLD() before abort() only works in -O2 but not in -Os which
continues to place only 5 calls to abort() in h3.o for call places. The
approach taken here is to replace abort() with a new function that wraps
it and stores the line number in the stack. This slightly increases the
code size (+0.1%) but when unwinding a crash, the line number remains
present now. This is a very low cost, especially if we consider that
DEBUG_USE_ABORT is almost only used by code coverage tools and occasional
debugging sessions.
(cherry picked from commit
d480b7be9663643025174eae5e047e42f7bdfc37)
Signed-off-by: Willy Tarreau <w@1wt.eu>
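The wrapper approach above can be sketched as below. To keep the example
testable, this sketch only records the line number instead of actually
calling abort(); names are illustrative, not HAProxy's real helpers:

```c
#include <assert.h>

static unsigned int last_abort_line;

/* Instead of calling abort() directly, a small wrapper keeps the
 * caller's line number in a live stack slot so it remains visible
 * when unwinding a core dump, even if call sites were folded. */
static void abort_with_line(unsigned int line)
{
    volatile unsigned int the_line = line; /* pinned on the stack */

    last_abort_line = the_line; /* recorded here for illustration;
                                 * a real implementation would now
                                 * call abort() */
}

#define MY_ABORT_NOW() abort_with_line(__LINE__)
```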
Willy Tarreau [Fri, 2 Feb 2024 16:05:36 +0000 (17:05 +0100)]
MINOR: debug: make sure calls to ha_crash_now() are never merged
As indicated in previous commit, we don't want calls to ha_crash_now()
to be merged, since it will make gdb return a wrong line number. This
was found to happen with gcc 4.7 and 4.8 in h3.c where 26 calls end up
as only 5 to 18 "ud2" instructions depending on optimizations. By
calling DO_NOT_FOLD() just before provoking the trap, we can reliably
avoid this folding problem. Note that this does not address the case
where abort() is used instead (DEBUG_USE_ABORT).
(cherry picked from commit
2bb192ba911a00d554db592bd5d80ab56e9c7398)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Willy Tarreau [Fri, 2 Feb 2024 16:00:01 +0000 (17:00 +0100)]
MINOR: compiler: add a new DO_NOT_FOLD() macro to prevent code folding
Modern compilers sometimes perform function tail merging and identical
code folding, which consist in merging identical occurrences of same
code paths, generally final ones (e.g. before a return, a jump or an
unreachable statement). In the case of ABORT_NOW(), it can happen that
the compiler merges all of them into a single one in a function,
defeating the purpose of the check which initially was to figure where
the bug occurred.
Here we're creating a DO_NOT_FOLD() macro which makes use of the line
number and passes it as an integer argument to an empty asm() statement.
The effect is a code position dependency which prevents the compiler
from merging the code till that point (though it may still merge the
following code). In practice it's efficient at stopping the compilers
from merging calls to ha_crash_now(), which was the initial purpose.
It may also be used to force certain optimization constructs since it
gives more control to the developer.
(cherry picked from commit
e06e8a2390c42260e2998c50887a09e1a78f7252)
Signed-off-by: Willy Tarreau <w@1wt.eu>
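A minimal sketch of such a macro is shown below (the exact form in
HAProxy's compiler.h may differ):

```c
#include <assert.h>

/* An empty asm statement that consumes the line number as an immediate
 * operand. Each call site thus depends on its source position, which
 * prevents the compiler from merging identical trailing code paths
 * (identical code folding / tail merging). */
#define DO_NOT_FOLD() __asm__ volatile("" :: "i"(__LINE__))

static int crash_site_a(void) { DO_NOT_FOLD(); return 1; }
static int crash_site_b(void) { DO_NOT_FOLD(); return 2; }
```

Without the macro, two functions ending in the same trap instruction may
be merged into one, making gdb report a single bogus location for both.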
Frederic Lecaille [Tue, 6 Feb 2024 17:30:08 +0000 (18:30 +0100)]
MINOR: quic: Stop using 1024th of a second.
Use milliseconds in place of 1024th of a second.
Should be backported as far as 2.6.
(cherry picked from commit
c977b9aa1587e58429a30a26e98e36a8415e7d65)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Frederic Lecaille [Wed, 31 Jan 2024 17:21:34 +0000 (18:21 +0100)]
BUG/MINOR: quic: fix possible integer wrap around in cubic window calculation
Avoid loss of precision when computing K cubic value.
The same issue existed when computing the congestion window value from the
cubic increase function formula, with a possible integer variable wrap-around.
Depends on this commit:
MINOR: quic: Code clarifications for QUIC CUBIC (RFC 9438)
Must be backported as far as 2.6.
(cherry picked from commit
19a66b290e8e6d1a9bda081038189ebbeac0b37b)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Frederic Lecaille [Tue, 30 Jan 2024 10:40:41 +0000 (11:40 +0100)]
CLEANUP: quic: Code clarifications for QUIC CUBIC (RFC 9438)
The first version of our QUIC CUBIC implementation is confusing because it
relies on the TCP CUBIC Linux kernel implementation and references RFC 8312,
which was obsoleted by RFC 9438 (August 2023) after our implementation. RFC
8312 is a little hard to understand; RFC 9438 arrived with many more
clarifications.
RFC 9438 is about "CUBIC for Fast Long-Distance Networks". Our QUIC
implementation is not very well documented and was difficult to reread, so
this patch only adds comments at complicated locations and renames some
macros and variables, with no logic modification at all.
The aim of this patch is thus to land these comments and renamings first, to
avoid embedding too many code modifications in the same big patch. Further
code modifications will come to adapt this CUBIC implementation to RFC 9438.
Rename some macros:
CUBIC_BETA -> CUBIC_BETA_SCALED
CUBIC_C -> CUBIC_C_SCALED
CUBIC_BETA_SCALE_SHIFT -> CUBIC_SCALE_FACTOR_SHIFT (this is the scaling factor
which is used only for CUBIC_BETA_SCALED)
CUBIC_DIFF_TIME_LIMIT -> CUBIC_TIME_LIMIT
CUBIC_ONE_SCALED was added (scaled value of 1).
These cubic struct members were renamed:
->tcp_wnd -> ->W_est
->origin_point -> ->W_target
->epoch_start -> ->t_epoch
->remaining_tcp_inc -> remaining_W_est_inc
Local variables to quic_cubic_update() were renamed:
t -> elapsed_time
diff -> t
delta -> W_cubic_t
Add a graphic curve of the CUBIC increase function.
Add large RFC 9438 extracts related to the 3 different increase
function regions, and the same for fast convergence.
Fix a typo in the reference to QUIC RFC 9002.
Must be backported as far as 2.6 to ease any further modifications to come.
(cherry picked from commit
88d13caa38ccac255b29ae56a37e23280bc459dd)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Remi Tricot-Le Breton [Thu, 1 Feb 2024 10:58:14 +0000 (11:58 +0100)]
BUG/MINOR: ssl: Fix error message after ssl_sock_load_ocsp call
If we were to enable 'ocsp-update' on a certificate that does not have
an OCSP URI, we would exit ssl_sock_load_ocsp with a negative error code
which would raise a misleading error message ("<cert> has an OCSP URI
and OCSP auto-update is set to 'on' ..."). This patch simply fixes the
error message but an error is still raised.
This issue was raised in GitHub #2432.
It can be backported up to branch 2.8.
(cherry picked from commit
73705ac701c6b0a8201ae29fffd9f29520a03b78)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Frederic Lecaille [Mon, 5 Feb 2024 13:31:21 +0000 (14:31 +0100)]
BUILD: quic: Variable name typo inside a BUG_ON().
This build issue was introduced by the following commit, which is a bugfix:
BUG/MINOR: quic: Wrong ack ranges handling when reaching the limit.
A BUG_ON() referenced <fist> variable in place of <first>.
Must be backported as far as 2.6 as the previous commit.
(cherry picked from commit
59acb27001ab225aff7e7d1b24f25e0e17b18ff3)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Frederic Lecaille [Mon, 5 Feb 2024 13:07:51 +0000 (14:07 +0100)]
BUG/MINOR: quic: Wrong ack ranges handling when reaching the limit.
Acknowledgement ranges are used to build ACK frames. To avoid allocating too
many such objects, a limit was set to 32 (QUIC_MAX_ACK_RANGES) by this commit:
MINOR: quic: Do not allocate too much ack ranges
But there was an inversion when removing the oldest range from its tree:
eb64_first() must be used in place of eb64_last(). Note that this patch
only makes this modification, in addition to renaming the <last> variable
to <first>.
This bug causes the following h2load command to block when a request ends up
not being acknowledged by haproxy even though it was correctly served:
/opt/nghttp2/build/bin/h2load --alpn-list h3 -t 1 -c 1 -m 1 -n 100 \
https://127.0.0.1/?s=5m
A question remains to be answered: in such a case, haproxy refuses to
reopen the stream, which is a good thing, but shouldn't haproxy acknowledge
the request (since it was correctly parsed again)?
Note that to be easily reproduced, this setting had to be applied to the client
network interface:
tc qdisc add dev eth1 root netem delay 100ms 1s loss random
Must be backported as far as 2.6.
(cherry picked from commit
0ce61d2f6ddc0232226d135fdfd28b7ad3dfcc0f)
Signed-off-by: Willy Tarreau <w@1wt.eu>
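The ordering fix above (evict the oldest range, i.e. eb64_first(), not the
newest via eb64_last()) can be illustrated with a plain sorted array in
place of HAProxy's eb64 tree. This sketch assumes new ranges always carry
higher offsets than the stored ones, which is the common case:

```c
#include <assert.h>

#define MAX_RANGES 4

/* ranges[] is kept sorted ascending (oldest first). Inserts the new
 * low bound <lo>; when the limit is reached, the FIRST (oldest) entry
 * is evicted, mirroring eb64_first(), not the last one as the buggy
 * code did. Returns the new count. */
static int insert_range(unsigned long long *ranges, int count,
                        unsigned long long lo)
{
    int i;

    if (count == MAX_RANGES) {
        /* shift everything down, dropping the oldest range */
        for (i = 1; i < count; i++)
            ranges[i - 1] = ranges[i];
        count--;
    }
    ranges[count++] = lo;
    return count;
}
```

Evicting the newest range instead would discard exactly the information
needed to acknowledge the most recent packets, which is what made the
h2load transfer above hang.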
Willy Tarreau [Sat, 3 Feb 2024 11:05:08 +0000 (12:05 +0100)]
BUG/MINOR: diag: run the final diags before quitting when using -c
Final diags were added in 2.4 by commit
5a6926dcf ("MINOR: diag: create
cfgdiag module"), but it's called too late in the startup process,
because when "-c" is passed, the call is not made, even though that is its
primary use case. Let's just move the call earlier.
Note that currently the check in this function is limited to verifying
the uniqueness of server cookies in a backend, so it can be backported as
far as 2.4, but there is little value in insisting if it doesn't backport
easily.
(cherry picked from commit
75d64c0d4c9c3efb0e0c9048175231887460e8af)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Willy Tarreau [Sat, 3 Feb 2024 11:01:58 +0000 (12:01 +0100)]
BUG/MINOR: diag: always show the version before dumping a diag warning
Diag warnings were added in 2.4 by commit
7b01a8dbd ("MINOR: global:
define diagnostic mode of execution") but probably due to the split
function that checks for the mode, they did not reuse the emission of
the version string before the first warning, as was brought in 2.2 by
commit
bebd21206 ("MINOR: init: report in "haproxy -c" whether there
were warnings or not"). The effect is that diag warnings are emitted
before the version string if there is no other warning nor error. Let's
just proceed like for the two other ones.
This can be backported to 2.4, though this is of very low importance.
(cherry picked from commit
ced41484018835bacab2bd74506edc8aac05b1ce)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Willy Tarreau [Wed, 31 Jan 2024 15:23:26 +0000 (16:23 +0100)]
[RELEASE] Released version 2.9.4
Released version 2.9.4 with the following main changes :
- BUG/MINOR: h3: fix checking on NULL Tx buffer
- DOC: configuration: fix set-dst in actions keywords matrix
- BUG/MEDIUM: mux-h2: refine connection vs stream error on headers
- MINOR: mux-h2/traces: add a missing trace on connection WU with negative inc
- BUG/MEDIUM: cli: some err/warn msg dumps add LR into CSV output on stat's CLI
- BUG/MINOR: jwt: fix jwt_verify crash on 32-bit archs
- BUG/MINOR: hlua: fix uninitialized var in hlua_core_get_var()
- BUG/MEDIUM: cache: Fix crash when deleting secondary entry
- BUG/MINOR: quic: newreno QUIC congestion control algorithm no more available
- CLEANUP: quic: Remove unused CUBIC_BETA_SCALE_FACTOR_SHIFT macro.
- MINOR: quic: Stop hardcoding a scale shifting value (CUBIC_BETA_SCALE_FACTOR_SHIFT)
- MINOR: quic: extract qc_stream_buf free in a dedicated function
- BUG/MEDIUM: quic: remove unsent data from qc_stream_desc buf
- DOC: configuration: clarify http-request wait-for-body
- BUG/MAJOR: ssl_sock: Always clear retry flags in read/write functions
- MINOR: h3: add traces for stream sending function
- BUG/MEDIUM: h3: do not crash on invalid response status code
- BUG/MEDIUM: qpack: allow 6xx..9xx status codes
- BUG/MEDIUM: quic: fix crash on invalid qc_stream_buf_free() BUG_ON
- BUG/MINOR: h1: Don't support LF only at the end of chunks
- BUG/MEDIUM: h1: Don't support LF only to mark the end of a chunk size
- DOC: httpclient: add dedicated httpclient section
- BUG/MINOR: h1-htx: properly initialize the err_pos field
- BUG/MEDIUM: h1: always reject the NUL character in header values
Willy Tarreau [Wed, 31 Jan 2024 14:10:39 +0000 (15:10 +0100)]
BUG/MEDIUM: h1: always reject the NUL character in header values
Ben Kallus kindly reported that we still hadn't blocked the NUL
character from header values as clarified in RFC9110 and that, even
though there's no known issue related to this, it may one day be
used to construct an attack involving another component.
Actually, both Christopher and I sincerely believed we had done it
prior to releasing 2.9, shame on us for missing that one and thanks
to Ben for the reminder!
The change was applied, it was confirmed to properly reject this NUL
byte from both header and trailer values, and it's still possible to
force it to continue to be supported using the usual pair of unsafe
"option accept-invalid-http-{request|response}" for those who would
like to keep it for whatever reason that wouldn't make sense.
This was tagged medium so that distros also remember to apply it as
a preventive measure.
It should progressively be backported to all versions down to 2.0.
(cherry picked from commit
0d76a284b6abe90b7001284a5953f8f445c30ebe)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Willy Tarreau [Wed, 31 Jan 2024 14:04:11 +0000 (15:04 +0100)]
BUG/MINOR: h1-htx: properly initialize the err_pos field
Trailers are parsed using a temporary h1m struct, likely due to using
distinct h1 parser states. However, the err_pos field that's used to
decide whether or not to enforce option accept-invalid-http-request (or
response) was not initialized in this struct, resulting in using a
random value that may randomly accept or reject a few bad chars. The
impact is very limited in trailers (e.g. no message size is transmitted
there) but we must make sure that the option is respected, at least for
users facing the need for this option there.
The issue was introduced in 2.0 by commit
2d7c5395ed ("MEDIUM: htx:
Add the parsing of trailers of chunked messages"), and the code moved
from mux_h1.c to h1_htx.c in 2.1 with commit
4f0f88a9d0 ("MEDIUM:
mux-h1/h1-htx: move HTX convertion of H1 messages in dedicated file")
so the patch needs to be backported to all stable versions, and the
file adjusted for 2.0.
(cherry picked from commit
44f02d26f05a953ff28be02b74340cb767cab731)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Lukas Tribus [Tue, 30 Jan 2024 21:17:44 +0000 (21:17 +0000)]
DOC: httpclient: add dedicated httpclient section
Move the httpclient keywords into their own section and add an
introductory paragraph explaining them.
See also GitHub issue #2409.
Should be backported to 2.6 ; but note that:
2.7 does not have httpclient.resolvers.disabled
2.6 does not have httpclient.retries and httpclient.timeout.connect
(cherry picked from commit
0ce34f8caa854e357d14938088ab63c562e15ce5)
[wt: adj ctx]
Signed-off-by: Willy Tarreau <w@1wt.eu>
Christopher Faulet [Fri, 26 Jan 2024 15:30:53 +0000 (16:30 +0100)]
BUG/MEDIUM: h1: Don't support LF only to mark the end of a chunk size
It is similar to the previous fix but for the chunk size parsing. But this
one is more annoying because a poorly coded application in front of haproxy
may ignore the last digit before the LF thinking it should be a CR. In this
case it may be out of sync with HAProxy and that could be exploited to
perform some sort or request smuggling attack.
While it seems unlikely, it is safer to forbid LF with CR at the end of a
chunk size.
This patch must be backported to 2.9 and probably to all stable versions
because there is no reason to still support LF without CR in this case.
(cherry picked from commit
4837e998920cbd4e43026e0a638b8ebd71c8018f)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Christopher Faulet [Fri, 26 Jan 2024 15:23:51 +0000 (16:23 +0100)]
BUG/MINOR: h1: Don't support LF only at the end of chunks
When the message is chunked, all chunks must end with a CRLF. However, in
old versions, to support bad client or server implementations, a lone LF
was also accepted. Nowadays this seems useless and can even be considered
an issue. Just forbid a lone LF at the end of chunks; it seems reasonable.
This patch must be backported to 2.9 and probably to all stable versions
because there is no reason to still support LF without CR in this case.
(cherry picked from commit
7b737da8258ebdd84e702a2d65cfd3c423f8e96d)
Signed-off-by: Willy Tarreau <w@1wt.eu>
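The stricter rule above amounts to a check like the following sketch (the
real parser is of course a state machine over the h1m struct; the helper
name here is hypothetical):

```c
#include <assert.h>

/* A chunk ending (and a chunk size line) must terminate with CRLF;
 * a lone LF is now rejected instead of being tolerated. */
static int chunk_ending_valid(const char *p, int len)
{
    return len >= 2 && p[len - 2] == '\r' && p[len - 1] == '\n';
}
```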
Amaury Denoyelle [Mon, 29 Jan 2024 08:18:08 +0000 (09:18 +0100)]
BUG/MEDIUM: quic: fix crash on invalid qc_stream_buf_free() BUG_ON
A recent fix was introduced to ensure unsent data are deleted when a
QUIC MUX stream releases its qc_stream_desc instance. This is necessary
to ensure all used buffers will be released once all ACKs are received.
This is implemented by the following patch :
commit
ad6b13d3177945bf6a85d6dc5af80b8e34ea6191 (quic-dev/qns)
BUG/MEDIUM: quic: remove unsent data from qc_stream_desc buf
Before this patch, buffer removal was done only on ACK reception. ACK
handling is only done in order from the oldest one. A BUG_ON() statement
is present to ensure this assertion remains valid.
This is however not true anymore since the above patch. Indeed, after
unsent data removal, the current buffer may be empty if it did not
contain yet any sent data. In this case, it is not the oldest buffer,
thus the BUG_ON() statement will be triggered.
To fix this, simply remove this BUG_ON() statement. It should not have
any impact as it is safe to remove buffers in any order.
Note that several conditions must be met to trigger this BUG_ON crash :
* a QUIC MUX stream is destroyed before transmitting all of its data
* several buffers must have been previously allocated for this stream so
it happens only for transfers bigger than bufsize
* latency should be high enough to delay ACK reception
This must be backported wherever the above patch is (currently targeted
to 2.6).
(cherry picked from commit
a13989f109033e7139c58fdc78c35ef02576d788)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Amaury Denoyelle [Mon, 29 Jan 2024 12:45:48 +0000 (13:45 +0100)]
BUG/MEDIUM: qpack: allow 6xx..9xx status codes
HTTP status codes outside of 100..599 are considered invalid in HTTP
RFC9110. However, it is explicitly stated that range 600..999 is often
used for internal communication so in practice haproxy must be lenient
with it.
Before this patch, QPACK encoder rejected these values. This resulted in
a connection error. Fix this by extending the range of allowed values
from 100 to 999.
This is linked to github issue #2422. Once again, thanks to @yokim-git
for his help here.
This must be backported up to 2.6.
(cherry picked from commit
7d22c4956c53895e5347d84d43f2a70cbdc01bef)
Signed-off-by: Willy Tarreau <w@1wt.eu>
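The resulting acceptance rule is simply a three-digit range check, as
sketched below (helper name illustrative):

```c
#include <assert.h>

/* RFC 9110 defines status codes 100..599, but 600..999 are commonly
 * used for internal communication, so the encoder now accepts the
 * whole three-digit range instead of erroring out. */
static int status_code_acceptable(int status)
{
    return status >= 100 && status <= 999;
}
```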
Amaury Denoyelle [Mon, 29 Jan 2024 12:47:44 +0000 (13:47 +0100)]
BUG/MEDIUM: h3: do not crash on invalid response status code
A crash occurs in h3_resp_headers_send() if an invalid response code is
received from the backend side. Fix this by properly flagging the
connection on error. This will cause a CONNECTION_CLOSE.
This should fix github issue #2422.
Big thanks to ygkim (@yokim-git) for his help and reactivity. Initially,
GDB reported an invalid source code location due to heavy function
inlining inside h3_snd_buf(). The issue was found after using the -Og flag.
This must be backported up to 2.6.
(cherry picked from commit
5d2fe1871a1ec4ec68a8ed262f4526e02e8e9fc1)
[wt: dropped H3_EV_TX_FRAME from trace flags]
Signed-off-by: Willy Tarreau <w@1wt.eu>
Amaury Denoyelle [Mon, 29 Jan 2024 14:15:27 +0000 (15:15 +0100)]
MINOR: h3: add traces for stream sending function
Replace h3_debug_printf() by real trace for functions used by stream
layer snd_buf callback. A new event type H3_EV_STRM_SEND is created for
the occasion.
This should be backported up to 2.6 to help investigate H3 issues on
stable releases. Note that h3_nego_ff/h3_done_ff definition are not
available from 2.8.
(cherry picked from commit
df5cf9123fff25eaa8693b97e5c219827bd45581)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Olivier Houchard [Sat, 27 Jan 2024 21:58:29 +0000 (22:58 +0100)]
BUG/MAJOR: ssl_sock: Always clear retry flags in read/write functions
It has been found that under some rare error circumstances,
SSL_do_handshake() could return with SSL_ERROR_WANT_READ without
even trying to call the read function, causing permanent wakeups
that prevent the process from sleeping.
It was established that this only happens if the retry flags are
not systematically cleared in both directions upon any I/O attempt,
but, given the lack of documentation on this topic, it is hard to
say if this rather strange behavior is expected or not, otherwise
why wouldn't the library always clear the flags by itself before
proceeding?
In addition, this only seems to affect OpenSSL 1.1.0 and above,
and does not affect wolfSSL nor aws-lc.
A bisection on haproxy showed that this issue was first triggered by
commit
a8955d57ed ("MEDIUM: ssl: provide our own BIO."), which means
that OpenSSL's socket BIO does not have this problem. And this one
does always clear the flags before proceeding. So let's just proceed
the same way. It was verified that it properly fixes the problem,
does not affect other implementations, and doesn't cause any freeze
nor spurious wakeups either.
Many thanks to Valentín Gutiérrez for providing a network capture
showing the incident as well as a reproducer. This is GH issue #2403.
This patch needs to be backported to all versions that include the
commit above, i.e. as far as 2.0.
(cherry picked from commit
1ad19917213fac57ee37e581b0ef137e36c6309d)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Thayne McCombs [Mon, 29 Jan 2024 05:07:32 +0000 (22:07 -0700)]
DOC: configuration: clarify http-request wait-for-body
Make it more explicit what happens in the various scenarios that cause
HAProxy to stop waiting when "http-request wait-for-body" is used.
Also fix a couple of grammatical errors.
Fixes: #2410
Signed-Off-By: Thayne McCombs <astrothayne@gmail.com>
(cherry picked from commit
001bddf48c5e646466bd322817e8b4dc78c3760f)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Amaury Denoyelle [Fri, 26 Jan 2024 13:41:04 +0000 (14:41 +0100)]
BUG/MEDIUM: quic: remove unsent data from qc_stream_desc buf
QCS instances use qc_stream_desc for data buffering on emission. On
stream reset, its Tx channel is closed earlier than expected. This may
leave unsent data in qc_stream_desc.
Before this patch, these unsent data would remain after QCS freeing.
This prevents the buffer from being released, as no ACK reception will remove
them. The buffer is only freed when the whole connection is closed. As
qc_stream_desc buffer is limited per connection, this reduces the buffer
pool for other streams of the same connection. In the worst case if
several streams are reset, this may completely freeze the transfer of
the remaining connection streams.
This bug was reproduced by reducing the connection buffer pool to a
single buffer instance by using the following global statement :
tune.quic.frontend.conn-tx-buffers.limit 1.
Then a QUIC client is used which opens a stream for a large enough
object to ensure data are buffered. The client then emits a STOP_SENDING
before reading all data, which forces the corresponding QCS instance to
be reset. The client then opens a new request but the transfer is
frozen due to this bug.
To fix this, adjust qc_stream_desc API. Add a new argument <final_size>
on qc_stream_desc_release() function. Its value is compared to the
currently buffered offset in latest qc_stream_desc buffer. If
<final_size> is smaller, it means unsent data are present in the
buffer. As such, qc_stream_desc_release() removes them to ensure the
buffer will finally be freed when all ACKs are received. It is also
possible that no data remains immediately, indicating that ACK were
already received. As such, buffer instance is immediately removed by
qc_stream_buf_free().
This must be backported up to 2.6. As this code section is known to be
prone to regressions, a period of observation could be reserved before
distributing it to LTS releases.
(cherry picked from commit
ad6b13d3177945bf6a85d6dc5af80b8e34ea6191)
Signed-off-by: Willy Tarreau <w@1wt.eu>
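The <final_size> comparison described above can be sketched as below.
Struct and field names are illustrative, not the real qc_stream_desc API:

```c
#include <assert.h>

struct stream_buf {
    unsigned long long buffered_ofs; /* offset just past the last buffered byte */
    unsigned long long data;         /* bytes still held in the buffer */
};

/* On release, if the stream's final size is below the buffered offset,
 * the unsent tail is dropped so that the buffer can still be freed once
 * all ACKs for the sent part are received. Returns 1 if the buffer is
 * already empty and can be freed immediately. */
static int release_with_final_size(struct stream_buf *b,
                                   unsigned long long final_size)
{
    if (final_size < b->buffered_ofs) {
        unsigned long long unsent = b->buffered_ofs - final_size;

        b->data = (unsent > b->data) ? 0 : b->data - unsent;
        b->buffered_ofs = final_size;
    }
    return b->data == 0;
}
```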
Amaury Denoyelle [Fri, 26 Jan 2024 13:30:16 +0000 (14:30 +0100)]
MINOR: quic: extract qc_stream_buf free in a dedicated function
On ACK reception, data are removed from buffer via qc_stream_desc_ack().
The buffer can be freed if no more data are left. A new slot is also
accounted in the connection buffer pool. Extract this operation into a
dedicated private function qc_stream_buf_free().
This change should bring no functional change. However it will be useful
for the next patch, which needs to remove a buffer from another function.
This patch is necessary for the following bugfix. As such, it must be
backported with it up to 2.6.
(cherry picked from commit
1da5787db44547590da0f339b797d728f04a01fd)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Frederic Lecaille [Thu, 25 Jan 2024 06:55:33 +0000 (07:55 +0100)]
MINOR: quic: Stop hardcoding a scale shifting value (CUBIC_BETA_SCALE_FACTOR_SHIFT)
Very minor modification to replace a hardcoded value in a statement with a macro.
Should be backported as far as 2.6 to ease any further modification to come.
(cherry picked from commit
96385f40b508a395fcd802a9dff13e968c12433d)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Frederic Lecaille [Thu, 25 Jan 2024 06:47:46 +0000 (07:47 +0100)]
CLEANUP: quic: Remove unused CUBIC_BETA_SCALE_FACTOR_SHIFT macro.
This macro is not used and has a confusing name.
Should be backported as far as 2.6.
(cherry picked from commit
574cf3fe0050a1ef36e1cca4af59a95316d67053)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Frederic Lecaille [Thu, 25 Jan 2024 06:30:40 +0000 (07:30 +0100)]
BUG/MINOR: quic: newreno QUIC congestion control algorithm no more available
There is a typo in the statement to initialize this variable when selecting
newreno as cc algo:
const char *newreno = "newrno";
This would not have happened if #defines had been used in place of several
hardcoded const char * variable values.
Take the opportunity of this patch to use #defines for all the available cc
algorithms.
Must be backported to 2.9.
(cherry picked from commit
b9a163e7e1ded0dc9cda78a9da652f14f424605a)
Signed-off-by: Willy Tarreau <w@1wt.eu>
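The hardening is the classic "macro instead of string literal" pattern: a
misspelled macro name fails to compile, while the misspelled literal
"newrno" compiled silently. A minimal sketch (macro and helper names here
are illustrative, not HAProxy's actual identifiers):

```c
#include <assert.h>
#include <string.h>

/* Defining the algorithm names once means every use site refers to the
 * macro; a typo in the macro name is a build error, a typo inside the
 * single definition is at least confined to one place. */
#define CC_ALGO_NEWRENO "newreno"
#define CC_ALGO_CUBIC   "cubic"

static int is_newreno(const char *algo)
{
    return strcmp(algo, CC_ALGO_NEWRENO) == 0;
}
```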
Remi Tricot-Le Breton [Wed, 24 Jan 2024 15:56:39 +0000 (16:56 +0100)]
BUG/MEDIUM: cache: Fix crash when deleting secondary entry
When a cache is "cold" and multiple clients simultaneously try to access
the same resource we must forward all the requests to the server. Next,
every "duplicated" response will be processed in http_action_store_cache
and we will try to cache every one of them regardless of whether this
response was already cached. In order to avoid having multiple entries
for the same primary key, the logic is then to first delete any
preexisting entry from the cache tree before storing the current one.
The actual previous response content will not be deleted yet though
because if the corresponding row is detached from the "avail" list it
might still be used by a cache applet if it actually performed a lookup
in the cache tree before the new response could be received.
This all means that we can end up using a valid row that references a
cache_entry that was already removed from the cache tree. This does not
pose any problem in regular caches (no 'vary' mechanism enabled) because
the applet only works on the data and not the 'cache_entry' information,
but in the "vary" context, when calling 'http_cache_applet_release' we
might call 'delete_entry' on the given entry which in turn tries to
iterate over all the secondary entries to find the right one in which
the secondary entry counter can be updated. We would then call
eb32_next_dup on an entry that was not in the tree anymore which ended
up crashing.
This crash was introduced by
48f81ec09 ("MAJOR: cache: Delay cache
entry delete in reserve_hot function"), which added the faulty call to
"release_entry" in "http_cache_applet_release".
This issue was raised in GitHub #2417.
This patch must be backported to branch 2.9.
(cherry picked from commit
6b695123327de3fd0843462504ce88293d7eda6d)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Aurelien DARRAGON [Wed, 24 Jan 2024 15:10:55 +0000 (16:10 +0100)]
BUG/MINOR: hlua: fix uninitialized var in hlua_core_get_var()
As raised by Coverity in GH #2223,
f034139bc0 ("MINOR: lua: Allow reading
"proc." scoped vars from LUA core.") causes uninitialized reads due to
smp being passed to vars_get_by_name() without being initialized first.
Indeed, vars_get_by_name() tries to read smp->sess and smp->strm pointers.
As we're only interested in the PROC var scope, it is safe to call
vars_get_by_name() with sess and strm pointers set to NULL, thus we
simply memset smp prior to calling vars_get_by_name() to fix the issue.
This should be backported to 2.9 with
f034139bc0.
(cherry picked from commit
564addcb727bfb3dd46507ec824f11c20c6bb861)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Willy Tarreau [Wed, 24 Jan 2024 09:31:05 +0000 (10:31 +0100)]
BUG/MINOR: jwt: fix jwt_verify crash on 32-bit archs
The jwt_verify converter was added in 2.5 with commit
130e142ee2
("MEDIUM: jwt: Add jwt_verify converter to verify JWT integrity"). It
takes a string on input and returns an integer. It turns out that by
presetting the return value to zero before processing contents, while
the sample data is a union, it overwrites the beginning of the buffer
struct passed on input. On a 64-bit arch it's not an issue because it's
where the allocated size is stored and it's not used in the operation,
which explains why the regtest works. But on 32-bit, both the size and
the pointer are overwritten, causing a NULL pointer to be passed to
jwt_tokenize() which is not designed to support this, hence crashes.
Let's just use a temporary variable to hold the result and move the
output sample initialization to the end of the function.
This should be backported as far as 2.5.
(cherry picked from commit
e41638af33d76660220ce2e3ed613e8a24fb6e55)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Emeric Brun [Tue, 23 Jan 2024 14:44:32 +0000 (15:44 +0100)]
BUG/MEDIUM: cli: some err/warn msg dumps add LR into CSV output on stat's CLI
The initial purpose of CSV stats through the CLI was to make them easily
parsable by scripts. But in some specific cases, error or warning
message strings containing LFs were dumped into cells of this CSV.
This caused parsing failures in several tools. In addition, if a
warning or message contains two successive LFs, they will be dumped
directly, but double LFs tag the end of the response on the CLI and the
client may consider the response truncated.
This patch extends the 'csv_enc_append' and 'csv_enc' functions, used
to format quoted string content according to the RFC, with an additional
parameter to convert multi-line strings to a single line: CRs are skipped
and LFs are replaced with spaces. In addition and optionally, it is
also possible to remove resulting trailing spaces.
The call of this function to fill strings into stat's CSV output is
updated to force this conversion.
This patch should be backported on all supported branches (issue was
already present in v2.0)
(cherry picked from commit
ef02dba7bcf4cf41aaf08fa5cabc8dd2d7537f63)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Willy Tarreau [Wed, 17 Jan 2024 15:56:18 +0000 (16:56 +0100)]
MINOR: mux-h2/traces: add a missing trace on connection WU with negative inc
The test was performed but no trace emitted, which can complicate certain
diagnostics, so let's just add the trace for this rare case. It may safely
be backported though this is really not important.
(cherry picked from commit
87b74697cd58f7780f3585f70a1966553192830f)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Willy Tarreau [Thu, 18 Jan 2024 16:01:45 +0000 (17:01 +0100)]
BUG/MEDIUM: mux-h2: refine connection vs stream error on headers
Commit
7021a8c4d8 ("BUG/MINOR: mux-h2: also count streams for refused
ones") addressed stream counting issues on some error cases but not
completely correctly regarding the conn_err vs stream_err case. Indeed,
contrary to the initial analysis, h2c_dec_hdrs() can set H2_CS_ERROR
when facing some unrecoverable protocol errors, and it's not correct
to send it to strm_err which will only send the RST_STREAM frame and
the subsequent GOAWAY frame is in fact the result of the read timeout.
The difficulty behind this lies in the sequence of output validations
because h2c_dec_hdrs() returns two results at once:
- frame processing status (done/incomplete/failed)
- connection error status
The original ordering requires writing two copies of the exact same
error handling code arranged differently, which the patch above tried
to factor into one. After careful inspection of h2c_dec_hdrs()
and its comments, it's clear that it always returns -1 on failure,
*including* connection errors. This means we can rearrange the test
to get rid of the missing data first, and immediately enter the
no-return zone where both the stream and connection errors can be
checked at the same place, making sure to consistently maintain
error counters. This is way better because we don't have to update
stream counters on the error path anymore. h2spec now passes the
test much faster.
This will need to be backported to the same branches as the commit
above, which was already backported to 2.9.
(cherry picked from commit
e1c8bfd0ed960d3b3dec39e78ad75bec117912d0)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Aurelien DARRAGON [Thu, 18 Jan 2024 14:13:05 +0000 (15:13 +0100)]
DOC: configuration: fix set-dst in actions keywords matrix
Since
d54e8f8107 ("DOC: config: reorganize actions into their own section")
dconv-generated shortlink for "set-dst" in actions keywords matrix is
broken.
This is due to trailing "<expr>" which should not be specified in the
matrix, but only in the actual keyword prototype and description.
This should be backported in 2.9 with
d54e8f8107.
(cherry picked from commit
62ef7966f0312e31e6fb8fc9e572e0e907e28558)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Amaury Denoyelle [Mon, 29 Jan 2024 13:39:19 +0000 (14:39 +0100)]
BUG/MINOR: h3: fix checking on NULL Tx buffer
The following patch was backported to handle gracefully Tx buffer
allocation failure.
BUG/MINOR: h3: close connection on sending alloc errors
However, the backport is not correct. Indeed, mux_get_buf() returns a
pointer to the <qcs.tx.buf> field, which always evaluates as true. An
explicit check using b_is_null() must be done instead.
This must be backported up to 2.6.
Willy Tarreau [Thu, 18 Jan 2024 14:32:19 +0000 (15:32 +0100)]
[RELEASE] Released version 2.9.3
Released version 2.9.3 with the following main changes :
- BUILD: quic: missing include for quic_tp
- BUG/MINOR: mux-quic: do not prevent non-STREAM sending on flow control
- BUG/MINOR: mux-h2: also count streams for refused ones
- BUG/MEDIUM: quic: keylog callback not called (USE_OPENSSL_COMPAT)
Frederic Lecaille [Tue, 16 Jan 2024 09:17:27 +0000 (10:17 +0100)]
BUG/MEDIUM: quic: keylog callback not called (USE_OPENSSL_COMPAT)
This bug impacts only the QUIC OpenSSL compatibility module (USE_QUIC_OPENSSL_COMPAT)
and it was introduced by this commit:
BUG/MINOR: quic: Wrong keylog callback setting.
quic_tls_compat_keylog_callback() was no longer set when the SSL keylog was
enabled by the tune.ssl.keylog setting. This is the callback which sets the
TLS secrets into haproxy.
Set it again when the SSL keylog is not enabled by configuration.
Thank you to @Greg57070 for having reported this issue in GH #2412.
Must be backported as far as 2.8.
(cherry picked from commit
0eaf42a2a47f2ee73045e48274ed98e00aa44dba)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Willy Tarreau [Fri, 12 Jan 2024 17:36:57 +0000 (18:36 +0100)]
BUG/MINOR: mux-h2: also count streams for refused ones
There are a few places where we can reject an incoming stream based on
technical errors such as decoded headers that are too large for the
internal buffers, or memory allocation errors. In this case we send
an RST_STREAM to abort the request, but the total stream counter was
not incremented. That's not really a problem, until one starts trying
to enforce a total stream limit using tune.h2.fe.max-total-streams,
which will not count such faulty streams. Typically a client that
learns too large cookies and tries to replay them in a way that
overflows the maximum buffer size would be rejected and depending on
how they're implemented, they might retry forever.
This patch removes the stream count increment from h2s_new() and moves
it instead to the calling functions, so that it translates the decision
to process a new stream instead of a successfully decoded stream. The
result is that such a bogus client will now be blocked after reaching
the total stream limit.
This can be validated this way:
  global
      tune.h2.fe.max-total-streams 128
      expose-experimental-directives
      trace h2 sink stdout
      trace h2 level developer
      trace h2 verbosity complete
      trace h2 start now

  frontend h
      bind :8080
      mode http
      redirect location /
Sending this will fill frames with 15972 bytes of cookie headers that
expand to 16500 for storage+index once decoded, causing "message too large"
events:
  (dev/h2/mkhdr.sh -t p; dev/h2/mkhdr.sh -t s;
   for sid in {0..1000}; do
     dev/h2/mkhdr.sh -t h -i $((sid*2+1)) -f es,eh \
       -R "828684410f7777772e6578616d706c652e636f6d \
           $(for i in {1..66}; do
               echo -n 60 7F 73 433d $(for j in {1..24}; do
                 echo -n 2e313233343536373839; done);
             done)";
   done) | nc 0 8080
Now it properly stops after sending 128 streams.
This may be backported wherever commit
983ac4397 ("MINOR: mux-h2:
support limiting the total number of H2 streams per connection") is
present, since without it, that commit is less effective.
(cherry picked from commit
7021a8c4d8bd8b280d04f481a153a1487deb08d1)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Amaury Denoyelle [Tue, 9 Jan 2024 10:42:08 +0000 (11:42 +0100)]
BUG/MINOR: mux-quic: do not prevent non-STREAM sending on flow control
Data emitted by QUIC MUX is restrained by the peer flow control. This is
checked on stream and connection level inside qcc_io_send().
The connection level check was placed early in the qcc_io_send()
preamble. However, this also prevents emission of other frames,
STOP_SENDING and RESET_STREAM, until the flow control limitation is
increased by a received MAX_DATA. Note that local flow control frame
emission is done prior in qcc_io_send() and so is not impacted.
In the worst case, if no MAX_DATA is received for some time, this could
delay significantly streams closure and resource free. However, this
should be rare as other peers should anticipate emission of MAX_DATA
before reaching flow control limit. In the end, this is also covered by
the MUX timeout, so the impact should be minimal.
To fix this, move the connection level check directly inside QCS sending
loop. Note that this could cause unnecessary looping when connection
flow control level is reached and no STOP_SENDING/RESET_STREAM are
needed.
This should be backported up to 2.6.
(cherry picked from commit
333f2cababa36f4289493647fed4f7dce0babc77)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Amaury Denoyelle [Wed, 10 Jan 2024 09:57:18 +0000 (10:57 +0100)]
BUILD: quic: missing include for quic_tp
Add missing netinet/in.h required for in_addr/in6_addr types.
This should be backported up to 2.9.
(cherry picked from commit
c121fcef30559ee9bcf0aeb1634c48f87ef4b22c)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Christopher Faulet [Thu, 11 Jan 2024 15:07:02 +0000 (16:07 +0100)]
[RELEASE] Released version 2.9.2
Released version 2.9.2 with the following main changes :
- BUG/MINOR: resolvers: default resolvers fails when network not configured
- DOC: config: Update documentation about local haproxy response
- BUG/MINOR: server: Use the configured address family for the initial resolution
- BUG/MAJOR: stconn: Disable zero-copy forwarding if consumer is shut or in error
- MINOR: stats: store the parent proxy in stats ctx (http)
- BUG/MEDIUM: stats: unhandled switching rules with TCP frontend
- MINOR: server/event_hdl: add server_inetaddr struct to facilitate event data usage
- MINOR: server/event_hdl: update _srv_event_hdl_prepare_inetaddr prototype
- BUG/MINOR: server/event_hdl: propagate map port info through inetaddr event
- DOC: fix typo for fastfwd QUIC option
- BUG/MINOR: mux-quic: always report error to SC on RESET_STREAM emission
- BUG/MINOR: mux-quic: disable fast-fwd if connection on error
- BUG/MINOR: quic: Wrong keylog callback setting.
- BUG/MINOR: quic: Missing call to TLS message callbacks
- MINOR: h3: check connection error during sending
- BUG/MINOR: h3: close connection on header list too big
- MINOR: h3: add traces for connection init stage
- BUG/MINOR: h3: properly handle alloc failure on finalize
- BUG/MINOR: h3: close connection on sending alloc errors
- BUG/MINOR: h3: disable fast-forward on buffer alloc failure
- CI: use semantic version compare for determing "latest" OpenSSL
- MINOR: global: export a way to list build options
- MINOR: debug: add features and build options to "show dev"
- REGTESTS: check attach-srv out of order declaration
- CLEANUP: quic: Remaining useless code into server part
- BUILD: quic: Missing quic_ssl.h header protection
- BUG/MEDIUM: h3: fix incorrect snd_buf return value
- BUG/MEDIUM: stconn: Forward shutdown on write timeout only if it is forwardable
- BUG/MEDIUM: stconn: Set fsb date if zero-copy forwarding is blocked during nego
- BUG/MEDIUM: spoe: Never create new spoe applet if there is no server up
- MINOR: mux-h2: support limiting the total number of H2 streams per connection
- MINOR: ot: logsrv struct becomes logger
- MINOR: ssl: Update ssl_fc_curve/ssl_bc_curve to use SSL_get0_group_name
- CLEANUP: quic: Double quic_dgram_parse() prototype declaration.
- BUG/MINOR: map: list-based matching potential ordering regression
- REGTESTS: add a test to ensure map-ordering is preserved
- DOC: configuration: corrected description of keyword tune.ssl.ocsp-update.mindelay
Miroslav Zagorac [Tue, 9 Jan 2024 19:55:47 +0000 (20:55 +0100)]
DOC: configuration: corrected description of keyword tune.ssl.ocsp-update.mindelay
Deleted the text paragraph in the description of keyword
tune.ssl.ocsp-update.mindelay, which was added in the commit 5843237
"MINOR: ssl: Add global options to modify ocsp update min/max delay",
because it was a copy of the description of tune.ssl.ssl-ctx-cache-size.
(cherry picked from commit
f6ab0446fba1f87b1a703acf98ca4587609bd122)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Aurelien DARRAGON [Mon, 8 Jan 2024 09:25:18 +0000 (10:25 +0100)]
REGTESTS: add a test to ensure map-ordering is preserved
As shown in "BUG/MINOR: map: list-based matching potential ordering
regression", list-based matching types such as dom are affected by the
order in which elements are loaded from the map.
Since this is historical behavior and existing usages depend on it, we
add a test to prevent future regressions.
(cherry picked from commit
1088f0b969e18ee5a8fe829d2071507ddef0d05e)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Aurelien DARRAGON [Wed, 3 Jan 2024 10:54:03 +0000 (11:54 +0100)]
BUG/MINOR: map: list-based matching potential ordering regression
An unexpected side-effect was introduced by 5fea597 ("MEDIUM: map/acl:
Accelerate several functions using pat_ref_elt struct ->head list")
The above commit tried to use eb tree API to manipulate elements as much
as possible in the hope to accelerate some functions.
Prior to 5fea597, pattern_read_from_file() used to iterate over all
elements from the map file in the same order they were seen in the file
(using list_for_each_entry) to push them in the pattern expression.
Now, since eb api is used to iterate over elements, the ordering is lost
very early.
This is known to cause behavior changes with existing setups (same conf
and map file) when compared with previous versions for some list-based
matching methods as described in GH #2400. For instance, the map_dom()
converter may return a different matching key from the one that was
returned by older haproxy versions.
For IP or STR matching, matching is based on tree lookups for better
efficiency, so in this case the ordering is lost in the name of
performance. The order in which they are loaded doesn't matter because
tree ordering is based on the content, it is not positional.
But with some other types, matching is based on list lookups (e.g.: dom),
and the order in which elements are pushed into the list can affect the
matching element that will be returned (in case of multiple matches, since
only the first matching element in the list will be returned).
Despite the documentation not officially stating that the file ordering
should be preserved for list-based matching methods, it's probably best
to be conservative here and stick to historical behavior. Moreover, there
was no performance benefit from using the eb tree api to iterate over
elements in pattern_read_from_file() since all elements are visited
anyway.
This should be backported to 2.9.
(cherry picked from commit
b546bb6d67145db0ebd14c79d72199064208dc2f)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Frédéric Lécaille [Wed, 10 Jan 2024 16:22:24 +0000 (17:22 +0100)]
CLEANUP: quic: Double quic_dgram_parse() prototype declaration.
This function is defined in the RX part (quic_rx.c) and declared in quic_rx.h
header. This is its correct place.
Remove the useless declaration of this function in quic_conn.h.
Should be backported to 2.9, where this double declaration was introduced
when moving quic_dgram_parse() from quic_conn.c to quic_rx.c.
(cherry picked from commit
37d5a26cc5a5ebe1abeb2836f89424e33408a186)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Mariam John [Fri, 29 Dec 2023 17:14:41 +0000 (11:14 -0600)]
MINOR: ssl: Update ssl_fc_curve/ssl_bc_curve to use SSL_get0_group_name
The function `smp_fetch_ssl_fc_ec` gets the curve name used during key
exchange. It currently uses `SSL_get_negotiated_group`, available
since OpenSSL 3.0, to get the nid and derive the short name of the curve
from the nid. In OpenSSL 3.2, a new function, `SSL_get0_group_name`, was
added that directly gives the curve name.
The function `smp_fetch_ssl_fc_ec` has been updated to use
`SSL_get0_group_name` when using OpenSSL >= 3.2, and for versions >= 3.0
and < 3.2 to keep using the old `SSL_get_negotiated_group` to get the
curve name. Another change made is to normalize the return value, so that
`smp_fetch_ssl_fc_ec` returns the curve name in uppercase
(`SSL_get0_group_name` returns the curve name in lowercase and
`SSL_get_negotiated_group` + `OBJ_nid2sn` returns curve name in
uppercase).
Can be backported to 2.8.
(cherry picked from commit
25da2174c62b536b00bc5b0c0bb64ae95fd826b9)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Miroslav Zagorac [Mon, 8 Jan 2024 11:55:11 +0000 (12:55 +0100)]
MINOR: ot: logsrv struct becomes logger
Addition to commit 18da35c "MEDIUM: tree-wide: logsrv struct becomes logger",
when the OpenTracing filter is compiled in debug mode (using OT_DEBUG=1)
then logsrv should be changed to logger here as well.
This patch should be backported to branch 2.9.
(cherry picked from commit
86298c6913fabb5dba48aeaa9befcd957e2e8ccf)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Willy Tarreau [Fri, 13 Oct 2023 16:11:59 +0000 (18:11 +0200)]
MINOR: mux-h2: support limiting the total number of H2 streams per connection
This patch introduces a new setting: tune.h2.fe.max-total-streams. It
sets the HTTP/2 maximum number of total streams processed per incoming
connection. Once this limit is reached, HAProxy will send a graceful GOAWAY
frame informing the client that it will close the connection after all
pending streams have been closed. In practice, clients tend to close as fast
as possible when receiving this, and to establish a new connection for next
requests. Doing this is sometimes useful and desired in situations where
clients stay connected for a very long time and cause some imbalance inside a
farm. For example, in some highly dynamic environments, it is possible that
new load balancers are instantiated on the fly to adapt to a load increase,
and that once the load goes down they should be stopped without breaking
established connections. By setting a limit here, the connections will have
a limited lifetime and will be frequently renewed, with some possibly being
established to other nodes, so that existing resources are quickly released.
The default value is zero, which enforces no limit beyond those implied by
the protocol (2^30 ~= 1.07 billion). Values around 1000 were found to
already cause frequent enough connection renewal without causing any
perceptible latency to most clients. One notable exception here is h2load
which reports errors for all requests that were expected to be sent over
a given connection after it receives a GOAWAY. This is an already known
limitation: https://github.com/nghttp2/nghttp2/issues/981
The patch was made in two parts inside h2_frt_handle_headers():
- the first one, at the end of the function, which verifies if the
configured limit was reached and if it's needed to emit a GOAWAY ;
- the second, just before decoding the stream frame, which verifies if
a previously configured limit was ignored by the client, and closes
the connection if this happens. Indeed, one reason for a connection
to stay alive for too long definitely comes from a stupid bot that
periodically fetches the same resource, scans lots of URLs or tries
to brute-force something. These ones are more likely to just ignore
the last stream ID advertised in GOAWAY than a regular browser, or
a well-behaving client such as curl which respects it. So in order
to make sure we can close the connection we need to enforce the
advertised limit.
Note that a regular client will not face a problem with that because in
the worst case it will have max_concurrent_streams in flight and this
limit is taken into account when calculating the advertised last
acceptable stream ID.
Just a note: it may also be possible to move the first part above to
h2s_frt_stream_new() instead so that it's not processed for trailers,
though it doesn't seem to be more interesting, first because it has
two return points.
This is something that may be backported to 2.9 and 2.8 to offer more
control to those dealing with dynamic infrastructures, especially since
for now we cannot force a connection to be cleanly closed using rules
(e.g. github issues #946, #2146).
(cherry picked from commit
983ac4397d2a6fe94c9d4960dcc8272d87801958)
[wt: also removed the debugging printfs killed in 3.0 by
e19334a343]
Signed-off-by: Willy Tarreau <w@1wt.eu>
Christopher Faulet [Fri, 5 Jan 2024 16:06:48 +0000 (17:06 +0100)]
BUG/MEDIUM: spoe: Never create new spoe applet if there is no server up
This test was already performed when a new message is queued into the
sending queue, but not when the last applet is released, in
spoe_release_appctx(). It is quite an old bug. It was introduced by commit
6f1296b5c7 ("BUG/MEDIUM: spoe: Create a SPOE applet if necessary when the
last one is released").
Because of this bug, new SPOE applets may be created and quickly released
in a loop, because no server is up, as long as at least one message
remains in the sending queue, consuming all the CPU. It is quite visible
if the processing timeout is high.
To fix the bug, conditions to create or not a SPOE applet are now
centralized in spoe_create_appctx(). The test about the max connections per
second and about number of active servers are moved in this function.
This patch must be backported to all stable versions.
(cherry picked from commit
72c23bd4cd3ee3a79955fbdd601206a9fa7d19eb)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Christopher Faulet [Fri, 5 Jan 2024 15:59:06 +0000 (16:59 +0100)]
BUG/MEDIUM: stconn: Set fsb date if zero-copy forwarding is blocked during nego
During the zero-copy forwarding, if the consumer side reports it is blocked,
it means it is blocked on send. At the stream-connector level, the event
must be reported to be sure to set/update the fsb date. Otherwise, write
timeouts cannot be properly reported. If this happens when no other timeout
is armed, this freezes the stream.
This patch must be backported to 2.9.
(cherry picked from commit
7cc41514226a2072b538c859cd83dc01de0aeb3f)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Christopher Faulet [Fri, 5 Jan 2024 15:48:40 +0000 (16:48 +0100)]
BUG/MEDIUM: stconn: Forward shutdown on write timeout only if it is forwardable
The commit
b9c87f8082 ("BUG/MEDIUM: stconn/stream: Forward shutdown on write
timeout") introduced a regression. In sc_cond_forward_shut(), the write
timeout is considered too early to forward the shutdown. In fact, it is
always considered, even if the shutdown is not forwardable yet. It is of
course unexpected. It is especially an issue when a write timeout is
encountered on server side during the connection establishment. In this
case, if the shutdown is forwarded too early on the client side, the
connection is closed before the 503 error is sent.
So the write timeout must indeed be considered to forward the shutdown to
the underlying layer, but only if the shutdown is forwardable. Otherwise, we
should do nothing.
This patch should fix the issue #2404. It must be backported as far as 2.2.
(cherry picked from commit
7eb7ae283557d8467403b9db35605861a0870add)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Amaury Denoyelle [Thu, 4 Jan 2024 10:33:33 +0000 (11:33 +0100)]
BUG/MEDIUM: h3: fix incorrect snd_buf return value
h3_resp_data_send() is used to transcode HTX data into H3 data frames.
If QCS Tx buffer is not aligned when first invoked, two separate frames
may be built, first until buffer end, then with remaining space in
front.
If buffer space is not enough for at least the H3 frame header, -1 is
returned with the flag QC_SF_BLK_MROOM set to await for more room. An
issue arises if this occurs for the second frame : -1 is returned even
though HTX data were properly transcoded and removed on the first step.
This causes snd_buf callback to return an incorrect value to the stream
layer, which in the end will corrupt the channel output buffer.
To fix this, stop considering that not enough remaining space is an
error case. Instead, return 0 if this is encountered for the first frame
or the HTX removed block size for the second one. As QC_SF_BLK_MROOM is
set, this will correctly interrupt H3 encoding. The err label is thus
properly limited to fatal errors which should cause a connection closure.
A new BUG_ON() has been added which should prevent similar issues in the
future.
This issue was detected using the following client :
$ ngtcp2-client --no-quic-dump --no-http-dump --exit-on-all-streams-close \
127.0.0.1 20443 -n2 "http://127.0.0.1:20443/?s=50k"
This triggers the following CHECK_IF statement. Note that it may be
necessary to disable fast forwarding to enforce snd_buf usage.
Thread 1 "haproxy" received signal SIGILL, Illegal instruction.
0x00005555558bc48a in co_data (c=0x5555561ed428) at include/haproxy/channel.h:130
130 CHECK_IF_HOT(c->output > c_data(c));
[ ## gdb ## ] bt
#0 0x00005555558bc48a in co_data (c=0x5555561ed428) at include/haproxy/channel.h:130
#1 0x00005555558c1d69 in sc_conn_send (sc=0x5555561f92d0) at src/stconn.c:1637
#2 0x00005555558c2683 in sc_conn_io_cb (t=0x5555561f7f10, ctx=0x5555561f92d0, state=32832) at src/stconn.c:1824
#3 0x000055555590c48f in run_tasks_from_lists (budgets=0x7fffffffdaa0) at src/task.c:596
#4 0x000055555590cf88 in process_runnable_tasks () at src/task.c:876
#5 0x00005555558aae3b in run_poll_loop () at src/haproxy.c:3049
#6 0x00005555558ab57e in run_thread_poll_loop (data=0x555555d9fa00 <ha_thread_info>) at src/haproxy.c:3251
#7 0x00005555558ad053 in main (argc=6, argv=0x7fffffffddd8) at src/haproxy.c:3948
In case CHECK_IF are not activated, it may cause crash or incorrect
transfers.
This was introduced by the following commit
commit
2144d2418651c1f76b91cc1f6e745feecdefcb00
BUG/MINOR: h3: close connection on sending alloc errors
This must be backported wherever the above patch is.
(cherry picked from commit
14673fe54d91af3ce28ee2d94ca77f626eb1a7ea)
[wt: adjusted context in h3_resp_data_send]
Signed-off-by: Willy Tarreau <w@1wt.eu>
Frédéric Lécaille [Thu, 4 Jan 2024 12:56:44 +0000 (13:56 +0100)]
BUILD: quic: Missing quic_ssl.h header protection
Such "#ifdef USE_QUIC" preprocessor statements are used by QUIC C headers
to avoid inclusion of QUIC headers when QUIC support is not enabled
(by the USE_QUIC make variable). Furthermore, this allows inclusion of
QUIC headers from C files without having to protect them with other
"#ifdef USE_QUIC" statements as follows:
#ifdef USE_QUIC
#include <a QUIC header>
#include <another one QUIC header>
#endif /* USE_QUIC */
So, if this quic_ssl.h header was included by a C file compiled without
QUIC support, this would lead to build errors as follows:
  In file included from <a C file...>:
  include/haproxy/quic_ssl.h:35:35: warning: 'enum ssl_encryption_level_t'
  declared inside parameter list will not be visible outside of this
  definition or declaration
Should be backported to 2.9 to avoid such building issues to come.
(cherry picked from commit
fd178ccdb0e0c22ce784d53762bef33b5709bdaa)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Frédéric Lécaille [Thu, 4 Jan 2024 10:16:06 +0000 (11:16 +0100)]
CLEANUP: quic: Remaining useless code into server part
Remove some QUIC definitions of members from the server structure, as the
haproxy QUIC stack does not support the server part (QUIC client) at all
at this time. Remove the statements related to their initialization.
This patch should be backported as far as 2.6 to save memory.
(cherry picked from commit
860028db47ac2ad38de56c7711ce872245c70b83)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Amaury Denoyelle [Tue, 2 Jan 2024 13:40:25 +0000 (14:40 +0100)]
REGTESTS: check attach-srv out of order declaration
Previous patch fixed a regression which caused some config with
attach-srv to be rejected if the rule was declared before the target
server itself. To better detect this kind of error, mix the declaration
order in the corresponding regtest.
(cherry picked from commit
0627f470df366e9dadf604f5f473e2dc2693200f)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Willy Tarreau [Tue, 2 Jan 2024 10:08:04 +0000 (11:08 +0100)]
MINOR: debug: add features and build options to "show dev"
The "show dev" CLI command is still missing useful elements such as the
build options, SSL version etc. Let's just add the build features and
the build options there so that it's possible to collect all of this
from a running process without having to start the executable with -vv.
This is still dumped all at once from the parsing function since the
output is small. If it were to grow, this would possibly require to be
reworked to support a context.
It might be helpful to backport this to 2.9 since it can help narrow
down certain issues.
(cherry picked from commit
9d869b10dea0f4234eb14787f658d6b64a0a4021)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Willy Tarreau [Tue, 2 Jan 2024 09:56:05 +0000 (10:56 +0100)]
MINOR: global: export a way to list build options
The new function hap_get_next_build_opt() will iterate over the list of
build options. This will be used for debugging, so that the build options
can be retrieved from the CLI.
(cherry picked from commit
afba58f21e7c938b793858180b3bd8d8fd6aef5d)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Ilya Shipitsin [Fri, 29 Dec 2023 22:31:39 +0000 (23:31 +0100)]
CI: use semantic version compare for determing "latest" OpenSSL
Currently "openssl-3.2.0-beta1" wins over "openssl-3.2.0" due to plain
string comparison. Let's switch to a semantic version compare.
(cherry picked from commit
e6d0b87f7f18cd07163687f15b086eb86ed57957)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Amaury Denoyelle [Fri, 22 Dec 2023 15:07:10 +0000 (16:07 +0100)]
BUG/MINOR: h3: disable fast-forward on buffer alloc failure
If QCS Tx buffer cannot be allocated in nego_ff callback, disable
fast-forward for this connection and return immediately. If snd_buf is
later used but a buffer still cannot be allocated, the
connection will be closed on error.
This should fix the coverity report from github issue #2390.
This should be backported up to 2.9.
(cherry picked from commit
cfa6d4cdd058ea112c461349a39d095ba96825e4)
[cf: qcc_get_stream_txbuf() does not exist. Result of mux_get_buf() is tested
instead]
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
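The error path described in this entry can be sketched as below. All struct
and function names are invented for the illustration (the real code goes
through mux_get_buf() in 2.9): on allocation failure the fast-forward
negotiation opts out for the whole connection and reports zero bytes, so
the regular snd_buf path takes over; a second failure there closes the
connection.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

static bool force_alloc_fail; /* test hook to simulate an OOM condition */

struct conn_ctx {
    char *tx_buf;      /* Tx buffer, NULL until allocated */
    bool  ff_disabled; /* fast-forward opt-out flag */
};

static char *try_alloc_txbuf(struct conn_ctx *c, size_t sz)
{
    if (!c->tx_buf && !force_alloc_fail)
        c->tx_buf = malloc(sz);
    return c->tx_buf;
}

/* nego_ff-style callback: returns the byte count accepted for
 * fast-forwarding, or 0 after disabling it on buffer alloc failure.
 */
size_t nego_ff_sketch(struct conn_ctx *c, size_t count)
{
    if (!try_alloc_txbuf(c, 16384)) {
        c->ff_disabled = true; /* fall back to snd_buf from now on */
        return 0;
    }
    return count; /* buffer ready: accept the fast-forward */
}
```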
Amaury Denoyelle [Fri, 22 Dec 2023 08:00:13 +0000 (09:00 +0100)]
BUG/MINOR: h3: close connection on sending alloc errors
When encoding new HTTP/3 frames, QCS Tx buffer must be allocated if
currently NULL. Previously, an allocation failure was not properly
checked, leaving the connection in an unspecified state or, worse,
risking a crash.
Fix this by setting <h3c.err> to H3_INTERNAL_ERROR each time the
allocation fails. This will stop sending and close the connection. In
the future, it may be better to put the connection on pause waiting for
allocation to succeed but this is too complicated to implement for now
in a reliable way.
Along with the current change, the return values of all HTX parsing
functions (h3_resp_*_send) were set to a negative value in case of
error. A new BUG_ON() in h3_snd_buf() ensures that if such a value is
returned, either a connection error is registered (via <h3c.err>) or
the buffer is temporarily full (flag QC_SF_BLK_MROOM).
This should fix github issue #2389.
This should be backported up to 2.6. Note that qcc_get_stream_txbuf()
does not exist in 2.9 and below. mux_get_buf() is its equivalent. An
explicit check b_is_null(&qcs.tx.buf) should be used there.
(cherry picked from commit
2144d2418651c1f76b91cc1f6e745feecdefcb00)
[cf: H3_EV_TX_FRAME removed from trace messages because it does not exist]
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
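The convention this entry introduces can be sketched as follows. This is a
hedged illustration, not HAProxy's real code: names (h3_encode_frame, the
error codes, the flag value) are invented, and the BUG_ON() invariant is
modeled with a plain assert. A frame encoder records H3_INTERNAL_ERROR on
allocation failure and returns a negative value; the caller then checks
that any negative return is explained either by a recorded connection
error or by a temporarily full buffer.

```c
#include <assert.h>

#define H3_NO_ERROR        0
#define H3_INTERNAL_ERROR  0x102
#define QC_SF_BLK_MROOM    0x01  /* "buffer temporarily full" flag */

struct h3_conn { int err; int flags; };

/* Frame-encoding helper: on (simulated) alloc failure, register the
 * internal error on the connection, then report failure to the caller.
 */
static int h3_encode_frame(struct h3_conn *h3c, int simulate_oom)
{
    if (simulate_oom) {
        h3c->err = H3_INTERNAL_ERROR; /* stop sending, close connection */
        return -1;
    }
    return 0;
}

static int h3_snd_buf_sketch(struct h3_conn *h3c, int simulate_oom)
{
    int ret = h3_encode_frame(h3c, simulate_oom);

    /* invariant the new BUG_ON() enforces: a negative return must be
     * explained by a connection error or a temporarily full buffer */
    if (ret < 0)
        assert(h3c->err != H3_NO_ERROR || (h3c->flags & QC_SF_BLK_MROOM));
    return ret;
}
```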