William Lallemand [Tue, 2 Jul 2024 16:23:34 +0000 (18:23 +0200)]
DOC: configuration: more details about the master-worker mode
Add more details about the master-worker mode in the "master-worker"
global keyword.
Should fix issue #2198.
(cherry picked from commit
419b79492a2ae8c9323b907b9d2da85c1208c372)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Christopher Faulet [Mon, 1 Jul 2024 12:33:59 +0000 (14:33 +0200)]
BUG/MEDIUM: hlua/cli: Fix lua CLI commands to work with applet's buffers
In 3.0, the CLI applet was rewritten to use its own buffers. However, the
lua part, used to register CLI commands at runtime, was not updated
accordingly. It means the lua CLI commands still try to write in the channel
buffers. This is of course totally unexepected and not supported. Because of
this bug, the applet hangs intead of returning the command result.
The registration of lua CLI commands relies on the lua TCP applets. So the
send and receive functions were fixed to use the applet's buffer when it is
required and still use the channel buffers otherwies. This way, other lua
TCP applets can still run on the legacy mode, without the applet's buffers.
This patch must be backported to 3.0.
(cherry picked from commit
e5e36ce09722e63ca4542b0b2ff1a1eb905f8208)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Christopher Faulet [Mon, 1 Jul 2024 08:33:03 +0000 (10:33 +0200)]
BUG/MINOR: promex: Remove Help prefix repeated twice for each metric
When the support for modules was added, the function producing the #HELP
line of each metric was refactored. Since then, the prefix "#HELP
<metric-name>" is printed twice because a code block was not removed. It is
now fixed.
This patch must be backported to 3.0.
(cherry picked from commit
b789cef91f5ac6123b130f5474b44fb7c57106ae)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Willy Tarreau [Sun, 30 Jun 2024 04:23:30 +0000 (06:23 +0200)]
BUG/MEDIUM: quic: fix possible exit from qc_check_dcid() without unlocking
Locking of the CID tree was extended in qc_check_dcid() by recent commit
05f59a5 ("BUG/MINOR: quic: fix race condition in qc_check_dcid()") but
there was a direct return from the middle of the function which was not
covered by the unlock, resulting in the function keeping the lock on
success return.
Let's just remove this return and replace it with a variable to merge all
exit paths.
This must be backported wherever the fix above is backported.
(cherry picked from commit
192abc6f834dcd09f310299afe253b17f9985407)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Amaury Denoyelle [Thu, 27 Jun 2024 16:52:23 +0000 (18:52 +0200)]
BUG/MINOR: quic: fix race-condition on trace for CID retrieval
quic_rx_pkt_retrieve_conn() is used when parsing a received datagram
from the listener socket. It returned the quic_conn instance
corresponding to the first packet DCID, unless it is mapped to another
thread.
As expected, global CID tree access is protected by a lock in the
function. However, there is a race condition due to the final trace
where qc instance is dereferenced outside of the lock. Fix this by
adding a new trace under lock protection and remove qc deferencement at
function end.
This may fix first crash of github issue #2607.
This must be backported up to 2.8.
(cherry picked from commit
bbb9f8248e29e89c288ad55a0fb7c71280a335a0)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Amaury Denoyelle [Thu, 27 Jun 2024 16:15:08 +0000 (18:15 +0200)]
BUG/MINOR: quic: fix race condition in qc_check_dcid()
qc_check_dcid() is a function which check that a DCID is associated to
the expected quic_conn instance. This is used for quic_conn socket
receive handler as there is a tiny risk that a datagram to another
connection was received on this socket.
As other operations on global CID tree, a lock must be used to protect
against race condition. However, as previous commit, lock was not held
long enough as CID tree node is accessed outside of the lock region. To
fix this, increase critical section until CID dereferencement is done.
The impact of this bug should be similar to the previous one. However,
risk of crash are even less reduced as it should be extremely rare to
receive datagram for other connections on a quic_conn socket. As such,
most of the time first check condition of qc_check_dcid() is enough.
This may fix first crash of issue github #2607.
This must be backported up to 2.8.
(cherry picked from commit
05f59a51ac5ba193ef37447ac88f74d3019c3399)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Amaury Denoyelle [Thu, 27 Jun 2024 15:15:55 +0000 (17:15 +0200)]
BUG/MEDIUM: quic: fix race-condition in quic_get_cid_tid()
haproxy generates CID for clients which reuse them as DCID on their
packets. These CID are stored in a global tree quic_cid_trees. Each
operation on this tree must be done under lock protection.
quic_get_cid_tid() is a function which lookups a CID in global tree and
return the associated thread ID. This is used on datagram reception on
listener socket before redispatching the datagram to the correct thread.
This function uses a lock to protect quic_cid_trees access. However,
lock region is too small as CID tree node is accessed outside of it. Fix
this by extending lock protection for CID dereferencement until thread
ID is retrieved.
The impact of this bug is unknown, but it may possible cause crashes.
However, it is probably rare as most of datagram reception is done on
quic_conn socket which does not uses quic_get_cid_tid().
This may fix first crash of github issue #2607.
This must be backported up to 2.8.
(cherry picked from commit
72267ff35f7c82f5a32d99a03124b73d95b00a01)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Amaury Denoyelle [Fri, 28 Jun 2024 08:50:19 +0000 (10:50 +0200)]
BUG/MEDIUM: h3: ensure the ":scheme" pseudo header is totally valid
Ensure pseudo-header scheme is only constitued of valid characters
according to RFC 9110. If an invalid value is found, the request is
rejected and stream is resetted.
It's the same as for previous commit "BUG/MEDIUM: h3: ensure the
":method" pseudo header is totally valid" except that this time it
applies to the ":scheme" pseudo header.
This must be backported up to 2.6.
(cherry picked from commit
a3bed52d1f84ba36af66be4317a5f746d498bdf4)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Amaury Denoyelle [Fri, 28 Jun 2024 08:43:19 +0000 (10:43 +0200)]
BUG/MEDIUM: h3: ensure the ":method" pseudo header is totally valid
Ensure pseudo-header method is only constitued of valid characters
according to RFC 9110. If an invalid value is found, the request is
rejected and stream is resetted.
Previously only characters forbidden in headers were rejected (NUL/CR/LF),
but this is insufficient for :method, where some other forbidden chars
might be used to trick a non-compliant backend server into seeing a
different path from the one seen by haproxy. Note that header injection
is not possible though.
This must be backported up to 2.6.
Many thanks to Yuki Mogi of FFRI Security Inc for the detailed report
that allowed to quicky spot, confirm and fix the problem.
(cherry picked from commit
789d4abd7328f0a745d67698e89bbb888d4d9b2c)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Aurelien DARRAGON [Fri, 28 Jun 2024 07:41:56 +0000 (09:41 +0200)]
BUG/MEDIUM: server/dns: prevent DOWN/UP flap upon resolution timeout or error
This is a complementary patch to
c16eba818 ("BUG/MEDIUM: server/dns:
preserve server's port upon resolution timeout or error").
Indeed, since
c16eba818, the port is properly preserved, but unsetting
server's address this way results in server_atomic_sync() function
thinking that we're actually setting a new address and not unsetting
the previous one because addr family is != AF_UNSPEC.
Upon DNS timeout, this could be observed:
[WARNING] (2588257) : Server http/s1 is going DOWN for maintenance (DNS timeout status). 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
[WARNING] (2588257) : Server http/s1 ('test1.localhost') is UP/READY (resolves again).
Notice that server timeouts and then immediately resolves again. Of course
in this case case the server's address was properly set to 0, meaning
that the server will not receive any traffic, but it is confusing and
could result in haproxy temporarily thinking that the server is actually
available while it's not.
To properly fix the issue and restore historical behavior, let's
explicitly set inetaddr's family to AF_UNSPEC after fetching original
server's address.
It should be backported in 3.0 with
c16eba818.
(cherry picked from commit
80aba1d2844165d9c6929d31cc9c2fd2e92286ed)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Willy Tarreau [Thu, 27 Jun 2024 15:53:18 +0000 (17:53 +0200)]
MINOR: activity: make the memory profiling hash size configurable at build time
The MEMPROF_HASH_BITS variable was set to 10 without a possibility to
change it (beyond patching the code). After seeing a few reports already
with "other" being listed and a list with close to 1024 entries, it looks
like it's about time to either increase the hash size, or at least make
it configurable for special cases. As a reminder, in order to remain
fast, the algorithm searches no more than 16 places after the hash, so
when a table is almost full, searches are long and new places are rare.
The present patch just makes it possible to redefine it by passing
"-DMEMPROF_HASH_BITS=11" or "-DMEMPROF_HASH_BITS=12" in CFLAGS, and
moves the definition to defaults.h to make it easier to find. Such
values should be way sufficient for the vast majority of use cases.
Maybe in the future we'd change the default. At least this version
should be backported to ease rebuilds, say, till 2.8 or so.
(cherry picked from commit
290659ffd3a2eead918adc387e8842c59fbff2e7)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Aurelien DARRAGON [Thu, 27 Jun 2024 08:42:44 +0000 (10:42 +0200)]
BUG/MINOR: server: fix first server template name lookup UAF
This is a follow-up for 7223296 ("BUG/MINOR: server: fix first server
template not being indexed").
Indeed, in 7223296 we added a new call to _srv_parse_set_id_from_prefix()
for the first server before handling additional ones. But we actually
overlooked the fact that _srv_parse_set_id_from_prefix() was already
performed at the end of _srv_parse_tmpl_init() for the same server.
Since _srv_parse_set_id_from_prefix() frees srv->id, it results in UAF
when performing name lookups on the first server, because used_server_name
node key still uses the freed string pointer.
The early _srv_parse_set_id_from_prefix() call (added in 7223296) and
the original one perform the same task, except that the new one is
followed by name node insertion logic required for name lookups to work
properly. So let's simply get rid of the old one at the end of the
function.
_srv_parse_set_id_from_prefix() in the 'err:' label was also removed since
is is now useless as well starting with 7223296 and would trigger the same
bug on error paths. Thanks to Amaury for noticing it.
This bug was discovered while trying to address GH issue #2620.
Thanks to @x-yuri for his detailed report (with working repro).
It should be backported in 3.0 with 7223296.
(cherry picked from commit
eec804804212374739556175f81b234d7cc8c6f0)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
William Lallemand [Mon, 1 Jul 2024 10:17:00 +0000 (12:17 +0200)]
DOC: configuration: add details about crt-store in bind "crt" keyword
Add some details about the certificate storage cache system in the "crt"
bind keyword.
This should be backported to 3.0. Fix issue #2618.
(cherry picked from commit
ba37ad41b26a6ba83581821c13426a7fbe4d2494)
Signed-off-by: William Lallemand <wlallemand@haproxy.com>
Christopher Faulet [Tue, 25 Jun 2024 06:22:58 +0000 (08:22 +0200)]
BUG/MEDIUM: stick-table: Decrement the ref count inside lock to kill a session
When we try to kill a session, the shard must be locked before decrementing
the ref count on the session. Otherwise, the ref count can fall to 0 and a
purge task (stktable_trash_oldest or process_table_expire) may release the
session before we have the opportunity to acquire the lock on the shard to
effectively kill the session. This could lead to a double free.
Here is the scenario:
Thread 1 Thread 2
sktsess_kill(ts)
if (ATOMIC_DEC(&ts->ref_cnt) != 0)
return
/* here the ref count is 0 */
stktable_trash_oldest()
LOCK(&sh_lock)
if (!ATOMIC_LOAD(&ts->ref_cnf))
__stksess_free(ts)
UNLOCK(&sh_lock)
/* here the session was released */
LOCK(&sh_lock)
__stksess_free(ts) <--- double free
UNLOCK(&sh_lock)
The bug was introduced in 2.9 by the commit
7968fe3889 ("MEDIUM:
stick-table: change the ref_cnt atomically"). The ref count must be
decremented inside the lock for stksess_kill() and sktsess_kill_if_expired()
function.
This patch should fix the issue #2611. It must be backported as far as 2.9. On
the 2.9, there is no sharding. All the table is locked. The patch will have to
be adapted.
(cherry picked from commit
9357873641c5de29b169848fc1c808747818a1eb)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Aurelien DARRAGON [Fri, 21 Jun 2024 17:12:37 +0000 (19:12 +0200)]
BUG/MINOR: hlua: report proper context upon error in hlua_cli_io_handler_fct()
As a result of copy pasting, hlua_cli_io_handler_fct() used to report lua
exceptions like E_ETMOUT as "Lua converter" instead of "Lua cli".
Let's fix that.
It could be backported to all stable versions.
[ada: for older versions, HLUA_E_BTMOUT case didn't exist so it has to be
skipped]
(cherry picked from commit
185d230e2c615ee723270c81e4eb1eec20181918)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Willy Tarreau [Mon, 24 Jun 2024 09:05:08 +0000 (11:05 +0200)]
DEV: flags/show-fd-to-flags: adapt to recent versions
The script hadn't been updated since it was introduced, and the
hard-coded field 12 doesn't match anymore (it's 16 now). Let's just
use "grep -o cflg..." to extract the desired part more flexibly.
This can be backported at least to 3.0, probably further, but it
will need to be tested prior to this. Better not bring it too far,
it's only used when debugging.
(cherry picked from commit
a14c7d194ad27f9f84c9d42aab953a162999252a)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Amaury Denoyelle [Thu, 20 Jun 2024 15:54:04 +0000 (17:54 +0200)]
BUG/MINOR: quic: fix BUG_ON() on Tx pkt alloc failure
On quic_tx_packet allocation failure, it is possible to trigger BUG_ON()
crash on INITIAL packet building. This statement is responsible to
ensure INITIAL packets are padded to 1.200 bytes as required. If a
packet on higher encryption level allocation fails, PADDING frame cannot
properly encoded, despite the INITIAL packet properly built.
This crash happens due to qc_txb_store() invokation after quic_tx_packet
allocation failure to validate already built packets. However, this
statement is unneeded as qc_purge_tx_buf() is called just after. Simply
remove qc_txb_store() to fix this issue.
This was detected using -dMfail.
This should be backported up to 2.6.
(cherry picked from commit
d5376b7a874776b4d5d79f9b746d4654df796f85)
[cf: ctx adjt]
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Amaury Denoyelle [Thu, 20 Jun 2024 15:51:35 +0000 (17:51 +0200)]
BUG/MINOR: h3: fix BUG_ON() crash on control stream alloc failure
BUG_ON() from qcc_set_error() is triggered on HTTP/3 control stream
allocation failure. This is caused because both h3_finalize() and
qcc_init_stream_local() call qcc_set_error() which is forbidden to
prevent error code erasure.
Fix this by removing qcc_set_error() invocation from h3_finalize() on
allocation failure. Note that this function is still responsible to use
it on SETTING frame emission failure.
This was detected using -dMfail.
This must be backported up to 3.0.
(cherry picked from commit
5718c67c19766c87bb68b7624e1873a887fbbaf1)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Amaury Denoyelle [Thu, 20 Jun 2024 12:41:22 +0000 (14:41 +0200)]
BUG/MINOR: mux-quic: fix crash on qcs SD alloc failure
Since the following commit, sedesc are created since QCS instantiation
in qcs_new().
086e51017e7731ee9820b882fe6e8cc5f0dd5352
BUG/MEDIUM: mux-quic: Create sedesc in same time of the QUIC stream
However, sedesc is initialized before other QCS mandatory fields. If
sedesc allocation fails, a crash would occur on qcs_free() invocation
for QCS early release. To fix this, delay sedesc allocation until
function end.
This bug was detected using -dMfail.
This should be backported up to 2.6.
(cherry picked from commit
3aded1d3752a12af9b8e48f445218230e6967a06)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Amaury Denoyelle [Fri, 21 Jun 2024 12:45:04 +0000 (14:45 +0200)]
BUG/MINOR: h3: fix crash on STOP_SENDING receive after GOAWAY emission
After emitting a HTTP/3 GOAWAY frame, opening of streams higher than
advertised ID was prevented. h3_attach operation would return success
but without allocating H3S stream context for QCS. In addition, the
stream would be immediately scheduled for RESET_STREAM emission.
Despite the immediate stream close, the current is not sufficient enough
and can cause crashes. When of this occurence can be found if
STOP_SENDING is the first frame received for a stream. A crash would
occur under qcc_recv_stop_sending() after h3_attach invokation, when
h3_close() is used which try to access to H3S context.
To fix this, change h3_attach API. In case of success, H3S stream
context is always allocated, even if the stream will be scheduled for
immediate close. This renders the code more reliable.
This crash should be extremely rare, as it can only happen after GOAWAY
emission, which is only used on soft-stop or reload.
This should solve the second crash occurence reported on GH #2607.
This must be backported up to 2.8.
(cherry picked from commit
85838822ba37a92b2dcc43205a07c2b33208b985)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Aurelien DARRAGON [Fri, 21 Jun 2024 15:56:51 +0000 (17:56 +0200)]
DOC: api/event_hdl: small updates, fix an example and add some precisions
Fix an example suggesting that using EVENT_HDL_SUB_TYPE(x, y) with y being
0 was valid. Then add some notes to explain how to use
EVENT_HDL_SUB_FAMILY() and EVENT_HDL_SUB_TYPE() with valid values.
Also mention that the feature is available starting from 2.8 and not 2.7.
Finally, perform some purely cosmetic updates.
This could be backported in 2.8.
(cherry picked from commit
13e0972aeac275137b429163def950af88fecd46)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Amaury Denoyelle [Thu, 20 Jun 2024 09:22:59 +0000 (11:22 +0200)]
SCRIPTS: git-show-backports: do not truncate git-show output
git-show-backports lists a git-show command which can be used to inspect
all commits subject to backport. This command specifies formatting
option to reproduce default git-show output, especially for commit
messages indented with 4 spaces character. However, it also add wrapping
on message line longer than 72 characters. This reduce lisibility of
messages where large info are written such as backtraces.
Improve this by changing git-show format option. Use a limit value of 0
to disable wrapping while preserving indentation.
This could be backported to every stable version to simplify backporting
process.
(cherry picked from commit
b27470fd1d06acd6dc33161e1fdb6743f72770df)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Amaury Denoyelle [Tue, 18 Jun 2024 13:20:32 +0000 (15:20 +0200)]
BUG/MAJOR: quic: fix padding with short packets
QUIC sending functions were extended to be more flexible. Of all the
changes, they support now iterating over a variable instance of QEL
instance of only 2 previously. This change has rendered PADDING emission
less previsible, which was adjusted via the following patch :
a60609f1aa3e5f61d2a2286fdb40ebf6936a80ee
BUG/MINOR: quic: fix padding of INITIAL packets
Its main purpose was to ensure PADDING would only be generated for the
last iterated QEL instance, to avoid unnecessary padding. In parallel, a
BUG_ON() statement ensure that built INITIAL packets are always padded
to 1.200 bytes as necessary before emitted them.
This BUG_ON() statement caused crash in one particular occurence : when
building datagrams that mixed Initial long packets and 1-RTT short
packets. This last occurence type does not have a length field in its
header, contrary to Long packets. This caused a miscalculation for the
necessary padding size, with INITIAL packets not padded enough to reach
the necessary 1.200 bytes size.
This issue was detected on 3.0.2. It can be reproduced by using 0-RTT
combined with latency. Here are the used commands :
$ ngtcp2-client --tp-file=/tmp/ngtcp2-tp.txt \
--session-file=/tmp/ngtcp2-session.txt --exit-on-all-streams-close \
127.0.0.1 20443 "https://[::]/?s=32o"
$ sudo tc qdisc add dev lo root netem latency 500ms
Note that this issue cannot be reproduced on current dev version.
Indeed, it seems that the following patch introduce a slight change in
packet building ordering :
cdfceb10ae136b02e51f9bb346321cf0045d58e0
MINOR: quic: refactor qc_prep_pkts() loop
This must be backported to 3.0.
This should fix github issue #2609.
(cherry picked from commit
c714b6bb55e34c7cd2cb3ff7dbed374e6b6eae65)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Aurelien DARRAGON [Tue, 18 Jun 2024 20:19:30 +0000 (22:19 +0200)]
DOC: management: document ptr lookup for table commands
Add missing documentation and examples for the optional ptr lookup method
for table {show,set,clear} commands introduced in commit
9b2717e7 ("MINOR:
stktable: use {show,set,clear} table with ptr"), as initially described in
GH #2118.
It may be backported in 3.0.
(cherry picked from commit
7422f16da3b84829f2ecf3ff393584b5c5682e06)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
William Lallemand [Tue, 18 Jun 2024 10:08:19 +0000 (12:08 +0200)]
DOC: configuration: fix alphabetical order of bind options
Put the curves, ecdhe, severity-output, v4v6 and v6only keyword at the
right place.
Fix issue #2594.
Could be backported in every stable versions.
(cherry picked from commit
0cc2913aec965dabc579cd90a3d91a440f29967c)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Aurelien DARRAGON [Mon, 17 Jun 2024 16:07:22 +0000 (18:07 +0200)]
BUG/MEDIUM: proxy: fix email-alert invalid free
In
fa90a7d3 ("BUG/MINOR: proxy: fix email-alert leak on deinit()"), I
tried to fix email-alert deinit() leak the simple way by leveraging
existing free_email_alert() helper function which was already used for
freeing email alert settings used in a default section.
However, as described in GH #2608, there is a subtelty that makes
free_email_alert() not suitable for use from free_proxy().
Indeed, proxy 'mailers.name' hint shares the same memory space than the
pointer to the corresponding mailers section (once the proxy is resolved,
name hint is replaced by the pointer to the section). However, since both
values share the same space (through union), we have to take care of not
freeing `mailers.name` once init_email_alert() was called on the proxy.
Unfortunately, free_email_alert() isn't protected against that, causing
double free() during deinit when mailers section is referenced from
multiple proxy sections. Since there is no easy fix, and that the leak in
itself isn't a big deal (
fa90a7d3 was simply an opportunistic fix rather
than a must-have given that the leak only occurs during deinit and not
during runtime), let's actually revert the fix to restore legacy behavior
and prevent deinit errors.
Thanks to @snetat for having reported the issue on Github as well as
providing relevant infos to pinpoint the bug.
It should be backported everywhere
fa90a7d3 was backported.
[ada: for versions prior to 3.0, simply revert the offending commit using
'git revert' as proxy_free_common() first appears in 3.0]
(cherry picked from commit
8e226682be904a6774f65e90bac0b674888cc293)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
William Lallemand [Fri, 14 Jun 2024 14:56:58 +0000 (16:56 +0200)]
REGTESTS: ssl: fix some regtests 'feature cmd' start condition
Since patch fde517b ("REGTESTS: wolfssl: temporarly disable some failing
reg-tests") some 'feature cmd' lines have an extra quotation mark, so
they were disable in every cases.
Must be backported to 2.9.
(cherry picked from commit
6da0879083749d5f098b8b2f4d459a70260491d2)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Aurelien DARRAGON [Thu, 13 Jun 2024 17:31:29 +0000 (19:31 +0200)]
DEBUG: hlua: distinguish burst timeout errors from exec timeout errors
hlua burst timeout was introduced in
58e36e5b1 ("MEDIUM: hlua: introduce
tune.lua.burst-timeout").
It is a safety measure that allows to detect when too much time is spent
on a single lua execution (between 2 interruptions/yields), meaning that
the current thread is not able to perform other tasks. Such scenario
should be avoided because it will cause thread contention which may have
negative performance impact and could cause the watchdog to trigger. When
the burst timeout is exceeded, the current Lua execution is aborted and a
timeout error is reported to the user.
Unfortunately, the same error is currently being reported for cumulative
(AKA execution) timeout and for burst timeout, which may be confusing to
the user.
Indeed, "execution timeout" error historically results from the current
hlua context exceeding the total (cumulative) time it's allowed to run.
It is set per lua context using the dedicated tunables:
- tune.lua.session-timeout
- tune.lua.task-timeout
- tune.lua.service-timeout
We've already faced an user report where the user was able to trigger the
burst timeout and got "Lua task: execution timeout." error while the user
didn't set cumulative timeout. Thus the error was actually confusing
because it was indeed the burst timeout which was causing it due to the
use of cpu-intensive call from within the task without sufficient manual
"yield" keypoints around the cpu-intensive call to ensure it runs on a
dedicated scheduler cycle.
In this patch we make it so burst timeout related errors are reported as
"burst timeout" errors instead of "execution timeout" errors (which
in fact became the generic timeout errors catchall with
58e36e5b1).
To do this, hlua_timer_check() now returns a different value depending if
the exeeded timeout is the burst one or the cumulative one, which allows
us to return either HLUA_E_ETMOUT or HLUA_E_BTMOUT in hlua_ctx_resume().
It should improve the situation described in GH #2356 and may possibly be
backported with
58e36e5b1 to improve error reporting if it applies without
resistance.
(cherry picked from commit
983513d901bb7511ea6b1e8c3bb00d58a9d432f2)
[cf: No reason to backport further]
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Aurelien DARRAGON [Fri, 14 Jun 2024 16:01:45 +0000 (18:01 +0200)]
BUG/MINOR: log: fix broken '+bin' logformat node option
In
12d08cf912 ("BUG/MEDIUM: log: don't ignore disabled node's options"),
while trying to restore historical node option inheritance behavior, I
broke the '+bin' logformat node option recently introduced in
b7c3d8c87c
("MINOR: log: add +bin logformat node option").
Indeed, because of
12d08cf912, LOG_OPT_BIN is not set anymore on
individual nodes even if it was set globally, making the feature unusable.
('+bin' is also used for binary cbor encoding)
What I should have done instead is include LOG_OPT_BIN in the options
inherited from global ones. This is what's being done in this commit.
Misleading comment was adjusted.
It must be backported in 3.0 with
12d08cf912.
(cherry picked from commit
0030f722a2fa574d1e7d90e6f242e4b6a5ace355)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Christopher Faulet [Fri, 14 Jun 2024 13:00:35 +0000 (15:00 +0200)]
[RELEASE] Released version 3.0.2
Released version 3.0.2 with the following main changes :
- MINOR: log: fix "http-send-name-header" ignore warning message
- BUG/MINOR: proxy: fix server_id_hdr_name leak on deinit()
- BUG/MINOR: proxy: fix log_tag leak on deinit()
- BUG/MINOR: proxy: fix email-alert leak on deinit()
- BUG/MINOR: proxy: fix check_{command,path} leak on deinit()
- BUG/MINOR: proxy: fix dyncookie_key leak on deinit()
- BUG/MINOR: proxy: fix source interface and usesrc leaks on deinit()
- BUG/MINOR: proxy: fix header_unique_id leak on deinit()
- BUG/MEDIUM: log: fix lf_expr_postcheck() behavior with default section
- DOC: config: move "hash-key" from proxy to server options
- DOC: config: add missing section hint for "guid" proxy keyword
- DOC: config: add missing context hint for new server and proxy keywords
- BUG/MINOR: promex: Skip resolvers metrics when there is no resolver section
- MINOR: proxy: add proxy_free_common() helper function
- BUG/MEDIUM: proxy: fix UAF with {tcp,http}checks logformat expressions
- CLEANUP: log/proxy: fix comment in proxy_free_common()
- BUG/MAJOR: mux-h1: Prevent any UAF on H1 connection after draining a request
- BUG/MINOR: quic: fix padding of INITIAL packets
- DOC/MINOR: management: add missed -dR and -dv options
- DOC/MINOR: management: add -dZ option
- DOC: management: rename show stats domain cli "dns" to "resolvers"
Aurelien DARRAGON [Wed, 12 Jun 2024 09:41:54 +0000 (11:41 +0200)]
DOC: management: rename show stats domain cli "dns" to "resolvers"
In commit
f8642ee82 ("MEDIUM: resolvers: rename dns extra counters to
resolvers extra counters"), we renamed "dns" counters to "resolvers", but
we forgot to update the documentation accordingly.
This may be backported to all stable versions.
(cherry picked from commit
cf913c2f9019c2264986f38da67bed7bed191a24)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Valentine Krasnobaeva [Wed, 12 Jun 2024 08:39:11 +0000 (10:39 +0200)]
DOC/MINOR: management: add -dZ option
Add some description for missed -dZ command line option in
the "3. Starting HAProxy" chapter.
Need to be backported until 2.9.
(cherry picked from commit
61d66a3d061cfb302f1519e5a774eb7e82f57ab9)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Valentine Krasnobaeva [Wed, 12 Jun 2024 08:39:02 +0000 (10:39 +0200)]
DOC/MINOR: management: add missed -dR and -dv options
Add some description for missed -dR and -dv command line options in
the "3. Starting HAProxy" chapter.
Need to be backported in every stable version.
(cherry picked from commit
27623d8393a3187ca827f752efc1956cbb89cef5)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Amaury Denoyelle [Thu, 30 May 2024 16:06:27 +0000 (18:06 +0200)]
BUG/MINOR: quic: fix padding of INITIAL packets
API for sending has been extended to support emission on more than 2 QEL
instances. However, this has rendered the PADDING emission for INITIAL
packets less previsible. Indeed, if qc_send() is used with empty QEL
instances, a padding frame may be generated before handling the last QEL
registered, which could cause unnecessary padding to be emitted.
This commit simplify PADDING by only activating it for the last QEL
registered. This ensures that no superfluous padding is generated as if
the minimal INITIAL datagram length is reached, padding is resetted
before handling last QEL instance.
This bug is labelled as minor as haproxy already emit big enough INITIAL
packets coalesced with HANDSHAKE one without needing padding. This
however render the padding code difficult to test. Thus, it may be
useful to force emission on INITIAL qel only without coalescing
HANDSHAKE packet. Here is a sample to reproduce it :
--- a/src/quic_conn.c
+++ b/src/quic_conn.c
@@ -794,6 +794,14 @@ struct task *quic_conn_io_cb(struct task *t, void *context, unsigned int state)
}
}
+ if (qc->iel && qel_need_sending(qc->iel, qc)) {
+ struct list empty = LIST_HEAD_INIT(empty);
+ qel_register_send(&send_list, qc->iel, &qc->iel->pktns->tx.frms);
+ if (qc->hel)
+ qel_register_send(&send_list, qc->hel, &empty);
+ qc_send(qc, 0, &send_list);
+ }
+
/* Insert each QEL into sending list if needed. */
list_for_each_entry(qel, &qc->qel_list, list) {
if (qel_need_sending(qel, qc))
This should be backported up to 3.0.
(cherry picked from commit
a60609f1aa3e5f61d2a2286fdb40ebf6936a80ee)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Christopher Faulet [Wed, 12 Jun 2024 12:59:20 +0000 (14:59 +0200)]
BUG/MAJOR: mux-h1: Prevent any UAF on H1 connection after draining a request
Since 2.9, it is possible to drain the request payload from the H1
multiplexer in case of early reply. When this happens, the upper stream is
detached but the H1 stream is not destroyed. Once the whole request is
drained, the end of the detach stage is finished. So the H1 stream is
destroyed and the H1 connection is ready to be reused, if possible,
otherwise it is released.
And here is the issue. If some data of the next request are received with
last bytes of the drained one, parsing of the next request is immediately
started. The previous H1 stream is destroyed and a new one is created to
handle the parsing. At this stage the H1 connection may be released, for
instance because of a parsing error. This case was not properly handled.
Instead of immediately exiting the mux, it was still possible to access the
released H1 connection to refresh its timeouts, leading to a UAF issue.
Many thanks to Annika for her invaluable help on this issue.
The patch should fix the issue #2602. It must be backported as far as 2.9.
(cherry picked from commit
0e09cce0fdf104994b37a492e256d3bc37880ddc)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Aurelien DARRAGON [Tue, 11 Jun 2024 08:52:37 +0000 (10:52 +0200)]
CLEANUP: log/proxy: fix comment in proxy_free_common()
Thanks to previous commit, logformat expressions for default proxies are
also postchecked, adjusting a comment that suggests it's not the case.
(cherry picked from commit
c6931a4f01a29cb4f36e0b70900a6c97a5a2bdda)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Aurelien DARRAGON [Mon, 10 Jun 2024 18:21:02 +0000 (20:21 +0200)]
BUG/MEDIUM: proxy: fix UAF with {tcp,http}checks logformat expressions
When parsing a logformat expression using parse_logformat_string(), the
caller passes the proxy under which the expression is found as argument.
This information allows the logformat expression API to check if the
expression is compatible with the proxy settings.
Since 7a21c3a ("MAJOR: log: implement proper postparsing for logformat
expressions"), the proxy compatibilty checks are postponed after the proxy
is fully parsed to ensure proxy properties are fully resolved for checks
consistency.
The way it works, is that each time parse_logformat_string() is called for
a given expression and proxy, it schedules the expression for postchecking
by appending the expression to the list of pending expression checks on
the proxy (lf_checks struct). Then, when the proxy is called with the
REGISTER_POST_PROXY_CHECK() hook, it iterates over unchecked expressions
and performs the check, then it removes the expression from its list.
However, I overlooked a special case: if a logformat expression is used
on a proxy that is disabled or a default proxy:
REGISTER_POST_PROXY_CHECK() hook is never called. Because of that, lf
expressions may still point to the proxy after the proxy is freed.
For most logformat expressions, this isn't an issue because they are
stored within the proxy itself, but this isn't the case with
{tcp,http}checks logformat expressions: during deinit() sequence, all
proxies are first cleaned up, and only then shared checks are freed.
Because of that, the below config will trigger UAF since 7a21c3a:
uaf.conf:
listen dummy
bind localhost:2222
backend testback
disabled
mode http
option httpchk
http-check send hdr test "test"
http-check expect status 200
haproxy -f uaf.conf -c:
==152096== Invalid write of size 8
==152096== at 0x21C317: lf_expr_deinit (log.c:3491)
==152096== by 0x2334A3: free_tcpcheck_http_hdr (tcpcheck.c:84)
==152096== by 0x2334A3: free_tcpcheck_http_hdr (tcpcheck.c:79)
==152096== by 0x2334A3: free_tcpcheck_http_hdrs (tcpcheck.c:98)
==152096== by 0x23365A: free_tcpcheck.part.0 (tcpcheck.c:130)
==152096== by 0x2338B1: free_tcpcheck (tcpcheck.c:108)
==152096== by 0x2338B1: deinit_tcpchecks (tcpcheck.c:3780)
==152096== by 0x2CF9A4: deinit (haproxy.c:2949)
==152096== by 0x2D0065: deinit_and_exit (haproxy.c:3052)
==152096== by 0x169BC0: main (haproxy.c:3996)
==152096== Address 0x52a8df8 is 6,968 bytes inside a block of size 7,168 free'd
==152096== at 0x484B27F: free (vg_replace_malloc.c:872)
==152096== by 0x2CF8AD: deinit (haproxy.c:2906)
==152096== by 0x2D0065: deinit_and_exit (haproxy.c:3052)
==152096== by 0x169BC0: main (haproxy.c:3996)
To fix the issue, let's ensure in proxy_free_common() that no unchecked
expressions may still point to the proxy after the proxy is freed by
purging the list (DEL_INIT is used to reset list items).
Special thanks to GH user @mhameed who filed a comprehensive issue with
all the relevant information required to reproduce the bug (see GH #2597),
after having first reported the issue on the alpine project bug tracker.
(cherry picked from commit
318c290ff2779c6a8ddf773fb00035266c1f74d4)
[cf: must be backported to 3.0 only]
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Aurelien DARRAGON [Mon, 10 Jun 2024 17:31:19 +0000 (19:31 +0200)]
MINOR: proxy: add proxy_free_common() helper function
As shown by previous patch series, having to free some common proxy
struct members twice (in free_proxy() and proxy_free_defaults()) is
error-prone: we often overlook one of the two free locations when
adding new features.
To prevent such bugs from being introduced in the future, and also avoid
code duplication, we now have a proxy_free_common() function to free all
proxy struct members that are common to all proxy types (either regular or
default ones).
This should greatly improve code maintenance related to proxy freeing
logic.
(cherry picked from commit
005e4ba715cf54204ea2fc752ba4d44a23eb61bd)
[cf: Required in 3.0 to ease backport of a bug fix]
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Christopher Faulet [Wed, 12 Jun 2024 06:42:56 +0000 (08:42 +0200)]
BUG/MINOR: promex: Skip resolvers metrics when there is no resolver section
By default, there is always at least on resolver section, the default one,
based on "/etc/resolv.conf" content. However, it is possible to have no
resolver at all if the file is empty or if any error occurred. Errors are
silently ignored at this stage.
In that case, there was a bug in the Prometheus exporter leading to a crash
because the resolver section list is empty. An invalid resolver entity was
used. To fix the issue we must only take care to not dump resolvers metrics
when there is no resolver.
Thanks to Aurelien to have spotted the offending commit.
This patch should fix the issue #2604. It must be backported to 3.0.
(cherry picked from commit
91fe085943cea52d0c3d04e81f8ecb6a51668b09)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Aurelien DARRAGON [Tue, 11 Jun 2024 14:39:58 +0000 (16:39 +0200)]
DOC: config: add missing context hint for new server and proxy keywords
To stay consistent with the work started in
54627f991 ("DOC: config: add
context hint for proxy keywords") and
3d4e1e682 ("DOC: config: add context
hint for server keywords"), we add missing context hint for "guid" (both
proxy and server) keyword and "hash-key" server keyword that were added
during 3.0 development.
This may be backported in 3.0.
(cherry picked from commit
c157894ba97a40f40f777344041841e423f99c2c)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Aurelien DARRAGON [Tue, 11 Jun 2024 14:52:11 +0000 (16:52 +0200)]
DOC: config: add missing section hint for "guid" proxy keyword
"guid" proxy keyword added in
da754b45 ("MINOR: proxy: implement GUID
support") was lacking the section hint in the keyword description, let's
fix that.
It could be backported in 3.0 with
da754b45.
(cherry picked from commit
aec02320bdb4628839525c0704a327a812db64a4)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Aurelien DARRAGON [Tue, 11 Jun 2024 14:19:04 +0000 (16:19 +0200)]
DOC: config: move "hash-key" from proxy to server options
As reported by Ashley Morris, "hash-key" keyword which was introduced in
commit
faa8c3e0 ("MEDIUM: lb-chash: Deterministic node hashes based on
server address") doesn't belong to proxy keywords and should be found in
5.2 "Server and default-server options" instead.
It should be backported in 3.0 with
faa8c3e0
(cherry picked from commit
cdf1d20e8a8eb1db0141a33ea18227d28abd5026)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Aurelien DARRAGON [Tue, 11 Jun 2024 08:15:36 +0000 (10:15 +0200)]
BUG/MEDIUM: log: fix lf_expr_postcheck() behavior with default section
Since
7a21c3a4ef ("MAJOR: log: implement proper postparsing for logformat
expressions"), logformat expressions stored in a default section are not
postchecked anymore. This is because the REGISTER_POST_PROXY_CHECK() only
evaluates regular proxies. Because of this, proxy options which are
automatically enabled on the proxy depending on the logformat expression
features in use are not set on the default proxy, which means such options
are not passed to the regular proxies that inherit from it (proxies that
and will actually be running the logformat expression during runtime).
Because of that, a logformat expression stored inside a default section
and executed by a regular proxy may not behave properly. Also, since
03ca16f38b ("OPTIM: log: resolve logformat options during postparsing"),
it's even worse because logformat node options postresoving is also
skipped, which may also alter logformat expression encoding feature.
To fix the issue, let's add a special case for default proxies in
parse_logformat_string() and lf_expr_postcheck() so that default proxies
are postchecked on the fly during parsing time in a "relaxed" way as we
cannot assume that the features involved in the logformat expression won't
be compatible with the proxy actually running it since we may have
different types of proxies inheriting from the same default section.
This bug was discovered while trying to address GH #2597.
It should be backported to 3.0 with
7a21c3a4ef and
03ca16f38b.
(cherry picked from commit
e4f122f3f46ff7b6e28f3599e8ec2006e2caac37)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Aurelien DARRAGON [Mon, 10 Jun 2024 17:36:53 +0000 (19:36 +0200)]
BUG/MINOR: proxy: fix header_unique_id leak on deinit()
proxy header_unique_id wasn't cleaned up in proxy_free_defaults(),
resulting in small memory leak if "unique-id-header" was used on a
default proxy section.
It may be backported to all stable versions.
(cherry picked from commit
847c406b9a040193c374daad3269d646dda7dbcc)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Aurelien DARRAGON [Mon, 10 Jun 2024 17:23:36 +0000 (19:23 +0200)]
BUG/MINOR: proxy: fix source interface and usesrc leaks on deinit()
proxy conn_src.iface_name was only freed in proxy_free_defaults(), whereas
proxy conn_src.bind_hdr_name was only freed in free_proxy().
Because of that, using "source usesrc hdr_ip()" in a default proxy, or
"source interface" in a regular or default proxy would cause memory leaks
during deinit.
It may be backported to all stable versions.
(cherry picked from commit
1aa219078dbaeef402b43af24538078fdd875790)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Aurelien DARRAGON [Mon, 10 Jun 2024 16:48:49 +0000 (18:48 +0200)]
BUG/MINOR: proxy: fix dyncookie_key leak on deinit()
proxy dyncookie_key wasn't cleaned up in free_proxy(), resulting in small
memory leak if "dynamic-cookie-key" was used on a regular or default
proxy.
It may be backported to all stable versions.
(cherry picked from commit
6f53df3fcfe3dc7220cff862d3ded1601f642931)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Aurelien DARRAGON [Mon, 10 Jun 2024 16:37:51 +0000 (18:37 +0200)]
BUG/MINOR: proxy: fix check_{command,path} leak on deinit()
proxy check_{command,path} members (used for "external-check" feature)
weren't cleaned up in free_proxy(), resulting in small memory leak if
"external-check command" or "external-check path" were used on a regular
or default proxy.
It may be backported to all stable versions.
(cherry picked from commit
62d0465a96ac847f40e95a9474d8971dac062114)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Aurelien DARRAGON [Mon, 10 Jun 2024 15:01:32 +0000 (17:01 +0200)]
BUG/MINOR: proxy: fix email-alert leak on deinit()
proxy email-alert settings weren't cleaned up in free_proxy(), resulting
in small memory leak if "email-alert to" or "email-alert from" were used
on a regular or default proxy.
It may be backported to all stable versions.
(cherry picked from commit
fa90a7d313703cfc4e1b41d258d0d6d470ffe967)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Aurelien DARRAGON [Mon, 10 Jun 2024 13:54:49 +0000 (15:54 +0200)]
BUG/MINOR: proxy: fix log_tag leak on deinit()
proxy log_tag wasn't cleaned up in free_proxy(), resulting in small
memory leak if "log-tag" was used on a regular or default proxy.
It may be backported to all stable versions.
(cherry picked from commit
77b192ea3682a2ad56418d2e4379bb763e0b427e)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Aurelien DARRAGON [Mon, 10 Jun 2024 16:17:34 +0000 (18:17 +0200)]
BUG/MINOR: proxy: fix server_id_hdr_name leak on deinit()
proxy server_id_hdr_name member (used for "http-send-name-header" option)
wasn't cleaned up in free_proxy(), resulting in small memory leak if
"http-send-name-header" was used on a regular or default proxy.
This may be backported to all stable versions.
(cherry picked from commit
99f340958211881dc6b48b6dc960d126179c4bcc)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Aurelien DARRAGON [Mon, 10 Jun 2024 16:11:49 +0000 (18:11 +0200)]
MINOR: log: fix "http-send-name-header" ignore warning message
Warning message to indicate that the "http-send-name-header" option is
ignored for backend in "mode log" was referenced using its internal
struct wording instead of public name (as seen in the documentation).
Let's fix that.
It may be backported with c7783fb ("MINOR: log/backend: prevent
"http-send-name-header" use with LOG mode") in 2.9.
(cherry picked from commit
e5ccfda9d31db54dfafe618404e79f09883b3ea5)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Christopher Faulet [Mon, 10 Jun 2024 14:15:30 +0000 (16:15 +0200)]
[RELEASE] Released version 3.0.1
Released version 3.0.1 with the following main changes :
- BUG/MINOR: cfgparse: remove the correct option on httpcheck send-state warning
- BUG/MINOR: tcpcheck: report correct error in tcp-check rule parser
- BUG/MINOR: tools: fix possible null-deref in env_expand() on out-of-memory
- DOC: configuration: add an example for keywords from crt-store
- BUG/MINOR: hlua: use CertCache.set() from various hlua contexts
- BUG/MEDIUM: h1-htx: Don't state interim responses are bodyless
- MEDIUM: stconn: Be able to unblock zero-copy data forwarding from done_fastfwd
- BUG/MEDIUM: mux-quic: Unblock zero-copy forwarding if the txbuf can be released
- BUG/MINOR: quic: prevent crash on qc_kill_conn()
- CLEANUP: hlua: use hlua_pusherror() where relevant
- BUG/MINOR: hlua: don't use lua_pushfstring() when we don't expect LJMP
- BUG/MINOR: hlua: fix unsafe hlua_pusherror() usage
- BUG/MINOR: hlua: prevent LJMP in hlua_traceback()
- BUG/MINOR: hlua: fix leak in hlua_ckch_set() error path
- CLEANUP: hlua: simplify ambiguous lua_insert() usage in hlua_ctx_resume()
- BUG/MEDIUM: mux-quic: Don't unblock zero-copy fwding if blocked during nego
- BUG/MEDIUM: ssl: wrong priority whem limiting ECDSA ciphers in ECDSA+RSA configuration
- BUG/MEDIUM: ssl: bad auth selection with TLS1.2 and WolfSSL
- BUG/MINOR: quic: fix computed length of emitted STREAM frames
- BUG/MINOR: quic: ensure Tx buf is always purged
- BUG/MEDIUM: stconn/mux-h1: Fix suspect change causing timeouts
- BUG/MAJOR: mux-h1: Properly copy chunked input data during zero-copy nego
- BUG/MINOR: mux-h1: Use the right variable to set NEGO_FF_FL_EXACT_SIZE flag
Christopher Faulet [Mon, 10 Jun 2024 09:51:07 +0000 (11:51 +0200)]
BUG/MINOR: mux-h1: Use the right variable to set NEGO_FF_FL_EXACT_SIZE flag
Instead of setting this flag on the ones used for the zero-copy negociation,
it is set on the connection flags used for xprt->rcv_buf()
call. Fortunately, there is no real consequence. The only visible effect is
the chunk size that is written on 8 bytes for no reason.
This patch is related to issue #2598. It must be backported to 3.0.
(cherry picked from commit
7bff576ebb5e9cc8d081f0a7f7b9cf657e5a3d13)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Christopher Faulet [Mon, 10 Jun 2024 09:33:08 +0000 (11:33 +0200)]
BUG/MAJOR: mux-h1: Properly copy chunked input data during zero-copy nego
When data are transfered via zero-copy data forwarding, if some data were
already received, we try to immediately tranfer it during the negociation
step. If data are chunked and the chunk size is unknown, 10 bytes are reserved
to write the chunk size during the done step. However, when input data are
finally transferred, the offset is ignored. Data are copied into the output
buffer. But the first 10 bytes are then crushed by the chunk size. Thus the
chunk is truncated leading to a malformed message.
This patch should fix the issue #2598. It must be backported to 3.0.
(cherry picked from commit
e8cc8a60be614c1cf978233b0b97771c9cc8fa20)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
William Manley [Wed, 5 Jun 2024 20:55:11 +0000 (21:55 +0100)]
BUG/MEDIUM: stconn/mux-h1: Fix suspect change causing timeouts
This fixes an issue I've had where if a connection was idle for ~23s
it would get in a bad state. I don't understand this code, so I'm
not sure exactly why it was failing.
I discovered this by bisecting to identify the commit that caused the
regression between 2.9 and 3.0. The commit is
d2c3f8dde7c2474616c0ea51234e6ba9433a4bc1: "MINOR: stconn/connection:
Move shut modes at the SE descriptor level" - a part of v3.0-dev8.
It seems to be an innocent renaming, so I looked through it and this
stood out as suspect:
- if (mode != CO_SHW_NORMAL)
+ if (mode & SE_SHW_NORMAL)
It looks like the not went missing here, so this patch reverses that
condition. It fixes my test.
I don't quite understand what this is doing or is for so I can't write
a regression test or decent commit message. Hopefully someone else
will be able to pick this up from where I've left it.
[CF: This inverts the condition to perform clean shutdowns. This means no
clean shutdown are performed when it should do. This patch must be
backported to 3.0]
(cherry picked from commit
52eb6b23f81d8663ebc6b2fe5d9996916c05ed8f)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Amaury Denoyelle [Fri, 31 May 2024 07:42:13 +0000 (09:42 +0200)]
BUG/MINOR: quic: ensure Tx buf is always purged
quic_conn API for sending was recently refactored. The main objective
was to regroup the different functions present for both handshake and
application emission.
After this refactoring, an optimization was introduced to avoid calling
qc_send() if there was nothing new to emit. However, this prevent the Tx
buffer to be purged if previous sending was interrupted, until new
frames are finally available.
To fix this, simply remove the optimization. qc_send() is thus now
always called in quic_conn IO handlers.
The impact of this bug should be minimal as it happens only on sending
temporary error. However in this case, this could cause extra latency or
even a complete sending freeze in the worst scenario.
This must be backported up to 3.0.
(cherry picked from commit
0ef94e2dff873c0755584d0797c81e1b2c697e52)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Amaury Denoyelle [Wed, 5 Jun 2024 09:37:44 +0000 (11:37 +0200)]
BUG/MINOR: quic: fix computed length of emitted STREAM frames
qc_build_frms() is responsible to encode multiple frames in a single
QUIC packet. It accounts for room left in the buffer packet for each
newly encded frame.
An incorrect computation was performed when encoding a STREAM frame in a
single packet. Frame length was accounted twice which would reduce in
excess the buffer packet room. This caused the remaining built frames to
be reduced with the resulting packet not able to fill the whole MTU.
The impact of this bug should be minimal. It is only present when
multiple frames are encoded in a single packet after a STREAM. However
in this case datagrams built are smaller than expecting, which is
suboptimal for bandwith.
This should be backported up to 2.6.
(cherry picked from commit
50470a5181b6105ad1914c0d4e67794a2f69c80a)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
William Lallemand [Fri, 7 Jun 2024 13:47:15 +0000 (15:47 +0200)]
BUG/MEDIUM: ssl: bad auth selection with TLS1.2 and WolfSSL
The ClientHello callback for WolfSSL introduced in haproxy 2.9, seems
not to behave correctly with TLSv1.2.
In TLSv1.2, this is the cipher that is used to chose the authentication algorithm
(ECDSA or RSA), however an SSL client can send a signature algorithm.
In TLSv1.3, the authentication is not part of the ciphersuites, and
is selected using the signature algorithm.
The mistake in the code is that the signature algorithm in TLSv1.2 are
overwritting the auth that was selected using the ciphers.
This must be backported as far as 2.9.
(cherry picked from commit
711338e1ceb061db0a5c832acdea8edbeafa712f)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
William Lallemand [Wed, 5 Jun 2024 09:37:14 +0000 (11:37 +0200)]
BUG/MEDIUM: ssl: wrong priority whem limiting ECDSA ciphers in ECDSA+RSA configuration
The ClientHello Callback which is used for certificate selection uses
both the signature algorithms and the ciphers sent by the client.
However, when a client is announcing both ECDSA and RSA capabilities
with ECSDA ciphers that are not available on haproxy side and RSA
ciphers that are compatibles, the ECDSA certificate will still be used
but this will result in a "no shared cipher" error, instead of a
fallback on the RSA certificate.
For example, a client could send
'ECDHE-ECDSA-AES128-CCM:ECDHE-RSA-AES256-SHA and HAProxy could be
configured with only 'ECDHE-ECDSA-AES128-SHA:ECDHE-RSA-AES256-SHA'.
This patch fixes the issue by validating that at least one ECDSA cipher
is available on both side before chosing the ECDSA certificate.
This must be backported on all stable versions.
(cherry picked from commit
93cc23a35561cd89b353143d20962dd86aa82a9c)
Signed-off-by: Amaury Denoyelle <adenoyelle@haproxy.com>
Christopher Faulet [Tue, 4 Jun 2024 16:10:51 +0000 (18:10 +0200)]
BUG/MEDIUM: mux-quic: Don't unblock zero-copy fwding if blocked during nego
The previous fix (
792a645ec2 ["BUG/MEDIUM: mux-quic: Unblock zero-copy
forwarding if the txbuf can be released"]) introduced a regression. The
zero-copy data forwarding must only be unblocked if it was blocked by the
producer, after a successful negotiation.
It is important because during a negotiation, the consumer may be blocked
for another reason. Because of the flow control for instance. In that case,
there is not necessarily a TX buffer. And it unexpected to try to release an
unallocated TX buf.
In addition, the same may happen while a TX buf is still in-use. In that
case, it must also not be released. So testing the TX buffer is not the
right solution.
To fix the issue, a new IOBUF flag was added (IOBUF_FL_FF_WANT_ROOM). It
must be set by the producer if it is blocked after a sucessful negotiation
because it needs more room. In that case, we know a buffer was provided by
the consummer. In done_fastfwd() callback function, it is then possible to
safely unblock the zero-copy data forwarding if this flag is set.
This patch must be backported to 3.0 with the commit above.
(cherry picked from commit
9748df29ff655a9626d6e75ea9db79bb9afa2a50)
Signed-off-by: Amaury Denoyelle <adenoyelle@haproxy.com>
Aurelien DARRAGON [Tue, 4 Jun 2024 13:52:23 +0000 (15:52 +0200)]
CLEANUP: hlua: simplify ambiguous lua_insert() usage in hlua_ctx_resume()
'lua_insert(lua->T, -lua_gettop(lua->T))' is actually used to rotate the
top value with the bottom one, thus the code was overkill and the comment
was actually misleading, let's fix that by using explicit equivalent form
(absolute index).
It may be backported with
5508db9a2 ("BUG/MINOR: hlua: fix unsafe
lua_tostring() usage with empty stack") to all stable versions to ease
code maintenance.
(cherry picked from commit
2bde0d64ddf0e32257444f14e69adea8f899b74b)
Signed-off-by: Amaury Denoyelle <adenoyelle@haproxy.com>
Aurelien DARRAGON [Tue, 4 Jun 2024 13:15:29 +0000 (15:15 +0200)]
BUG/MINOR: hlua: fix leak in hlua_ckch_set() error path
in hlua_ckch_commit_yield() and hlua_ckch_set(), when an error occurs,
we enter the error path and try to raise an error from the <err> msg
pointer which must be freed afterwards.
However, the fact that luaL_error() never returns was overlooked, because
of that <err> msg is never freed in such case.
To fix the issue, let's use hlua_pushfstring_safe() helper to push the
err on the lua stack and then free it before throwing the error using
lua_error().
It should be backported up to 2.6 with
30fcca18 ("MINOR: ssl/lua:
CertCache.set() allows to update an SSL certificate file")
(cherry picked from commit
755c2daf0f88885fd6825c55ae59198726c4905e)
Signed-off-by: Amaury Denoyelle <adenoyelle@haproxy.com>
Aurelien DARRAGON [Tue, 4 Jun 2024 08:41:32 +0000 (10:41 +0200)]
BUG/MINOR: hlua: prevent LJMP in hlua_traceback()
Function is often used on error paths where no precaution is taken
against LJMP. Since the function is used on error paths (which include
out-of-memory error paths) the function lua_getinfo() could also raise
a memory exception, causing the process to crash or improper error
handling if the caller isn't prepared against that eventually. Since the
function is only used on rare events (error handling) and is lacking the
__LJMP prototype pefix, let's make it safe by protecting the lua_getinfo()
call so that hlua_traceback() callers may use it safely now (the function
will always succeed, output will be truncated in case of error).
This could be backported to all stable versions.
(cherry picked from commit
365ee28510a11993176ffd22704676732c051a4f)
Signed-off-by: Amaury Denoyelle <adenoyelle@haproxy.com>
Aurelien DARRAGON [Tue, 4 Jun 2024 11:08:59 +0000 (13:08 +0200)]
BUG/MINOR: hlua: fix unsafe hlua_pusherror() usage
Following previous commit's logic: hlua_pusherror() is mainly used
from cleanup paths where the caller isn't protected against LJMPs.
Caller was tempted to think that the function was safe because func
prototype was lacking the __LJMP prefix.
Let's make the function really LJMP-safe by wrapping the sensitive calls
under lua_pcall().
This may be backported to all stable versions.
(cherry picked from commit
f0e5b825cfba3ad710818e46c048ca296978283a)
Signed-off-by: Amaury Denoyelle <adenoyelle@haproxy.com>
Aurelien DARRAGON [Mon, 3 Jun 2024 21:18:24 +0000 (23:18 +0200)]
BUG/MINOR: hlua: don't use lua_pushfstring() when we don't expect LJMP
lua_pushfstring() is used in multiple cleanup paths (upon error) to
push the error message that will be raised by lua_error(). However this
is often done from an unprotected environment, or in the middle of a
cleanup sequence, thus we don't want the function to LJMP! (it may cause
various issues ranging from memory leaks to crashing the process..)
Hopefully this has very few chances of happening but since the use of
lua_pushfstring() is limited to error reporting here, it's ok to use our
own hlua_pushfstring_safe() implementation with a little overhead to
ensure that the function will never LJMP.
This could be backported to all stable versions.
(cherry picked from commit
c0a3c1281fac10ca7a590a4c34d102c8040e97a5)
Signed-off-by: Amaury Denoyelle <adenoyelle@haproxy.com>
Aurelien DARRAGON [Tue, 4 Jun 2024 10:48:45 +0000 (12:48 +0200)]
CLEANUP: hlua: use hlua_pusherror() where relevant
In hlua_map_new(), when error occurs we use a combination of luaL_where,
lua_pushfstring and lua_concat to build the error string before calling
lua_error().
It turns out that we already have the hlua_pusherror() macro which is
exactly made for that purpose so let's use it.
It could be backported to all stable versions to ease code maintenance.
(cherry picked from commit
6e484996c6e3e5d7fc35fef77333b4f64d514fcb)
Signed-off-by: Amaury Denoyelle <adenoyelle@haproxy.com>
Amaury Denoyelle [Tue, 4 Jun 2024 09:56:09 +0000 (11:56 +0200)]
BUG/MINOR: quic: prevent crash on qc_kill_conn()
Ensure idle_timer task is allocated in qc_kill_conn() before waking it
up. It can be NULL if idle timer has already fired but MUX layer is
still present, which prevents immediate quic_conn release.
qc_kill_conn() is only used on send() syscall fatal error to notify
upper layer of an error and close the whole connection asap.
This crash occurence is pretty rare as it relies on timing issues. It
happens only if idle timer occurs before the MUX release (a bigger
client timeout is thus required) and any send() syscall detected error.
For now, it was only reproduced using GDB to interrupt haproxy longer
than the idle timeout.
This should be backported up to 2.6.
(cherry picked from commit
f7ae84e7d1b20201b38348d9dcbaefa47eb29814)
Signed-off-by: Amaury Denoyelle <adenoyelle@haproxy.com>
Christopher Faulet [Tue, 4 Jun 2024 10:04:48 +0000 (12:04 +0200)]
BUG/MEDIUM: mux-quic: Unblock zero-copy forwarding if the txbuf can be released
In done_fastfwd() callback function, if nothing was forwarding while the SD
is blocked, it means there is not enough space in the buffer to proceed. It
may be because there are data to be sent. But it may also be data already
sent waiting for an ack. In this case, no data to be sent by the mux. So the
quic stream is not woken up when data are finally removed from the
buffer. The data forwarding can thus be stuck. This happens when the stats
page is requested in QUIC/H3. Only applets are affected by this issue and
only with the QUIC multiplexer because it is the only mux with already sent
data in the TX buf.
To fix the issue, the idea is to release the txbuf if possible and then
unblock the SD to perform a new zero-copy data forwarding attempt. Doing so,
and thanks to the previous patch ("MEDIUM: applet: Be able to unblock
zero-copy data forwarding from done_fastfwd"), the applet will be woken up.
This patch should fix the issue #2584. It must be backported to 3.0.
(cherry picked from commit
792a645ec21126c74c33820d1e0de63ee98aa810)
Signed-off-by: Amaury Denoyelle <adenoyelle@haproxy.com>
Christopher Faulet [Tue, 4 Jun 2024 09:54:18 +0000 (11:54 +0200)]
MEDIUM: stconn: Be able to unblock zero-copy data forwarding from done_fastfwd
This part is only experienced by applet. When an applet try to forward data
via an iobuf, it may decide to block for any reason even if there is free
space in the buffer. For instance, the stats applet don't procude data if
the buffer is almost full.
However, in this case, it could be good to let the consumer decide a new
attempt is possible because more space was made. So, if IOBUF_FL_FF_BLOCKED
flag is removed by the consumer when done_fastfwd() callback function is
called, the SE_FL_WANT_ROOM flag is removed on the producer sedesc. It is
only done for applets. And thanks to this change, the applet can be woken up
for a new attempt.
This patch is required for a fix on the QUIC multiplexer.
(cherry picked from commit
d2a2014f152c09ff4f28c03a4ff6c5aae000ad5d)
Signed-off-by: Amaury Denoyelle <adenoyelle@haproxy.com>
Christopher Faulet [Mon, 3 Jun 2024 15:46:16 +0000 (17:46 +0200)]
BUG/MEDIUM: h1-htx: Don't state interim responses are bodyless
Interim responses are by definition bodyless. But we must not set the
corresponding HTX start-line flag, beecause the start-line of the final
response is still expected. Setting the flag above too early may lead the
multiplexer on the sending side to consider the message is finished after
the headers of the interim message.
It happens with the H2 multiplexer on frontend side if a "100-Continue" is
received from the server. The interim response is sent and
HTX_SL_F_BODYLESS_RESP flag is evaluated. Then, the headers of the final
response are sent with ES flag, because HTX_SL_F_BODYLESS_RESP flag was seen
too early, leading to a protocol error if the response has a body.
Thanks to grembo for this analysis.
This patch should fix the issue #2587. It must be backported as far as 2.9.
(cherry picked from commit
7c84ee71f77616f569081060d81455c137fc13f5)
Signed-off-by: Amaury Denoyelle <adenoyelle@haproxy.com>
Aurelien DARRAGON [Mon, 3 Jun 2024 14:18:27 +0000 (16:18 +0200)]
BUG/MINOR: hlua: use CertCache.set() from various hlua contexts
Using CertCache.set() from init context wasn't explicitly supported and
caused the process to crash:
crash.lua:
core.register_init(function()
CertCache.set{filename="reg-tests/ssl/set_cafile_client.pem", ocsp=""}
end)
crash.conf:
global
lua-load crash.lua
listen front
bind localhost:9090 ssl crt reg-tests/ssl/set_cafile_client.pem ca-file reg-tests/ssl/set_cafile_interCA1.crt verify none
./haproxy -f crash.conf
[NOTICE] (267993) : haproxy version is 3.0-dev2-640ff6-910
[NOTICE] (267993) : path to executable is ./haproxy
[WARNING] (267993) : config : missing timeouts for proxy 'front'.
| While not properly invalid, you will certainly encounter various problems
| with such a configuration. To fix this, please ensure that all following
| timeouts are set to a non-zero value: 'client', 'connect', 'server'.
[1] 267993 segmentation fault (core dumped) ./haproxy -f crash.conf
This is because in hlua_ckch_set/hlua_ckch_commit_yield, we always
consider that we're being called from a yield-capable runtime context.
As such, hlua_gethlua() is never checked for NULL and we systematically
try to wake hlua->task and yield every 10 instances.
In fact, if we're called from the body or init context (that is, during
haproxy startup), hlua_gethlua() will return NULL, and in this case we
shouldn't care about yielding because it is ok to commit all instances
at once since haproxy is still starting up.
Also, when calling CertCache.set() from a non-yield capable runtime
context (such as hlua fetch context), we kept doing as if the yield
succeeded, resulting in unexpected function termination (operation
would be aborted and the CertCache lock wouldn't be released). Instead,
now we explicitly state in the doc that CertCache.set() cannot be used
from a non-yield capable runtime context, and we raise a runtime error
if it is used that way.
These bugs were discovered by reading the code when trying to address
Svace report documented by @Bbulatov GH #2586.
It should be backported up to 2.6 with
30fcca18 ("MINOR: ssl/lua:
CertCache.set() allows to update an SSL certificate file")
(cherry picked from commit
4f906a9c3824dd424a36f53fc3479f276333d566)
Signed-off-by: Amaury Denoyelle <adenoyelle@haproxy.com>
William Lallemand [Mon, 3 Jun 2024 09:02:23 +0000 (11:02 +0200)]
DOC: configuration: add an example for keywords from crt-store
In ticket #785, people are still confused about how to use the crt-store
load parameters in a crt-list.
This patch adds an example.
This must be backported in 3.0
(cherry picked from commit
c79c3121427ca240f36e1838fba777b8e92ac81f)
Signed-off-by: Amaury Denoyelle <adenoyelle@haproxy.com>
Willy Tarreau [Fri, 31 May 2024 16:52:51 +0000 (18:52 +0200)]
BUG/MINOR: tools: fix possible null-deref in env_expand() on out-of-memory
In GH issue #2586 @Bbulatov reported a theoretical null-deref in
env_expand() in case there's no memory anymore to expand an environment
variable. The function should return NULL in this case so that the only
caller (str2sa_range) sees it. In practice it may only happen during
boot thus is harmless but better fix it since it's easy. This can be
backported to all versions where this applies.
(cherry picked from commit
ba958fb230d4add678913f18eb520d9d5935c968)
Signed-off-by: Amaury Denoyelle <adenoyelle@haproxy.com>
Willy Tarreau [Fri, 31 May 2024 16:37:56 +0000 (18:37 +0200)]
BUG/MINOR: tcpcheck: report correct error in tcp-check rule parser
When parsing tcp-check expect-header, a copy-paste error in the error
message causes the name of the header to be reporetd as the invalid
format string instead of its value. This is really harmless but should
be backported to all versions to help users understand the cause of the
problem when this happens. This was reported in GH issue #2586 by
@Bbulatov.
(cherry picked from commit
8a7afb6964b6f57c8d0481e2e391b7ec88552b26)
Signed-off-by: Amaury Denoyelle <adenoyelle@haproxy.com>
Willy Tarreau [Fri, 31 May 2024 16:30:16 +0000 (18:30 +0200)]
BUG/MINOR: cfgparse: remove the correct option on httpcheck send-state warning
In GH issue #2586 @Bbulatov reported a bug where the http-check
send-state flag is removed from options instead of options2 when
http-check is disabled. It only has an effect when this option is
set and http-check disabled, where it displays a warning indicating
this will be ignored. The option removed instead is srvtcpka when
this happens. It's likely that both options being so minor, nobody
ever faced it.
This can be backported to all versions.
(cherry picked from commit
d8194fab8205364d65cf5364e6daeca517eb75e0)
Signed-off-by: Amaury Denoyelle <adenoyelle@haproxy.com>
Willy Tarreau [Wed, 29 May 2024 12:43:38 +0000 (14:43 +0200)]
[RELEASE] Released version 3.0.0
Released version 3.0.0 with the following main changes :
- MINOR: sample: implement the uptime sample fetch
- CI: scripts: fix build of vtest regarding option -C
- CI: scripts: build vtest using multiple CPUs
- MINOR: log: rename 'log-format tag' to 'log-format alias'
- DOC: config: document logformat item naming and typecasting features
- BUILD: makefile: yearly reordering of objects by build time
- BUILD: fd: errno is also needed without poll()
- DOC: config: fix two typos "RST_STEAM" vs "RST_STREAM"
- DOC: config: refer to the non-deprecated keywords in ocsp-update on/off
- DOC: streamline http-reuse and connection naming definition
- REGTESTS: complete http-reuse test with pool-conn-name
- DOC: config: add %ID logformat alias alternative
- CLEANUP: ssl/ocsp: readable ifdef in ssl_sock_load_ocsp
- BUG/MINOR: ssl/ocsp: init callback func ptr as NULL
- CLEANUP: ssl_sock: move dirty openssl-1.0.2 wrapper to openssl-compat
- BUG/MINOR: activity: fix Delta_calls and Delta_bytes count
- CI: github: upgrade the WolfSSL job to 5.7.0
- DOC: install: update quick build reminders with some missing options
- DOC: install: update the range of tested openssl version to cover 3.3
- DEV: patchbot: prepare for new version 3.1-dev
- MINOR: version: mention that it's 3.0 LTS now.
Willy Tarreau [Wed, 31 May 2023 14:23:56 +0000 (16:23 +0200)]
MINOR: version: mention that it's 3.0 LTS now.
The version will be maintained up to around Q2 2029. Let's
also update the INSTALL file to mention this.
Willy Tarreau [Wed, 29 May 2024 12:38:21 +0000 (14:38 +0200)]
DEV: patchbot: prepare for new version 3.1-dev
The bot will now load the prompt for the upcoming 3.1 version so we have
to rename the files and update their contents to match the current version.
Willy Tarreau [Wed, 29 May 2024 08:23:59 +0000 (10:23 +0200)]
DOC: install: update the range of tested openssl version to cover 3.3
OpenSSL 3.3 is known to work since it's tested on the CI, to let's add
it to the list of known good versions.
Willy Tarreau [Wed, 29 May 2024 06:43:01 +0000 (08:43 +0200)]
DOC: install: update quick build reminders with some missing options
The quick build reminders claimed to present "all options" but were
still missing QUIC. It was also the moment to split FreeBSD and
OpenBSD apart since the latter uses LibreSSL and does not require
the openssl compatibility wrapper. We also replace the hard-coded
number of cpus for the parallel build, by the real number reported
by the system.
William Lallemand [Tue, 28 May 2024 17:16:03 +0000 (19:16 +0200)]
CI: github: upgrade the WolfSSL job to 5.7.0
WolfSSL 5.70 was released in March 2024, let's upgrade our CI job to
this version.
Valentine Krasnobaeva [Tue, 28 May 2024 15:06:24 +0000 (17:06 +0200)]
BUG/MINOR: activity: fix Delta_calls and Delta_bytes count
Thanks to the commit
5714aff4a6bf
"DEBUG: pool: store the memprof bin on alloc() and update it on free()", the
amount of memory allocations and memory "frees" is shown now on the same line,
corresponded to the caller name. This is very convenient to debug memory leaks
(haproxy should run with -dMcaller option).
The implicit drawback of this solution is that we count twice same free_calls
and same free_tot (bytes) values in cli_io_handler_show_profiling(), when
we've calculed tot_free_calls and tot_free_bytes, by adding them to the these
totalizators for p_alloc, malloc and calloc allocator types. See the details
about why this happens in a such way in __pool_free() implementation and
also in the commit message for
5714aff4a6bf.
This double addition of free counters falses 'Delta_calls' and 'Delta_bytes',
sometimes we even noticed that they show negative values.
Same problem was with the calculation of average allocated buffer size for
lines, where we show simultaneously the number of allocated and freed bytes.
Willy Tarreau [Tue, 28 May 2024 17:16:18 +0000 (19:16 +0200)]
CLEANUP: ssl_sock: move dirty openssl-1.0.2 wrapper to openssl-compat
Valentine noticed this ugly SSL_CTX_get_tlsext_status_cb() macro
definition inside ssl_sock.c that is dedicated to openssl-1.0.2 only.
It would be better placed in openssl-compat.h, which is what this
patch does. It also addresses a missing pair of parenthesis and
removes an invalid extra semicolon.
Valentine Krasnobaeva [Tue, 28 May 2024 08:49:51 +0000 (10:49 +0200)]
BUG/MINOR: ssl/ocsp: init callback func ptr as NULL
In ssl_sock_load_ocsp() it is better to initialize local scope variable
'callback' function pointer as NULL, while we are declaring it. According to
SSL_CTX_get_tlsext_status_cb() API, then we will provide a pointer to this
'on stack' variable in order to check, if the callback was already set before:
OpenSSL 1.x.x and 3.x.x:
long SSL_CTX_get_tlsext_status_cb(SSL_CTX *ctx, int (**callback)(SSL *, void *));
long SSL_CTX_set_tlsext_status_cb(SSL_CTX *ctx, int (*callback)(SSL *, void *));
WolfSSL 5.7.0:
typedef int(*tlsextStatusCb)(WOLFSSL* ssl, void*);
WOLFSSL_API int wolfSSL_CTX_get_tlsext_status_cb(WOLFSSL_CTX* ctx, tlsextStatusCb* cb);
WOLFSSL_API int wolfSSL_CTX_set_tlsext_status_cb(WOLFSSL_CTX* ctx, tlsextStatusCb cb);
When this func ptr variable stays uninitialized, haproxy comipled with ASAN
crushes in ssl_sock_load_ocsp():
./haproxy -d -f haproxy.cfg
...
AddressSanitizer:DEADLYSIGNAL
=================================================================
==114919==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000008 (pc 0x5eab8951bb32 bp 0x7ffcdd6d8410 sp 0x7ffcdd6d82e0 T0)
==114919==The signal is caused by a READ memory access.
==114919==Hint: address points to the zero page.
#0 0x5eab8951bb32 in ssl_sock_load_ocsp /home/vk/projects/haproxy/src/ssl_sock.c:1248:22
#1 0x5eab89510d65 in ssl_sock_put_ckch_into_ctx /home/vk/projects/haproxy/src/ssl_sock.c:3389:6
...
This happens, because callback variable is allocated on the stack. As not
being explicitly initialized, it may contain some garbage value at runtime,
due to the linked crypto library update or recompilation.
So, following ssl_sock_load_ocsp code, SSL_CTX_get_tlsext_status_cb() may
fail, callback will still contain its initial garbage value,
'if (!callback) {...' test will put us on the wrong path to access some
ocsp_cbk_arg properties via its pointer, which won't be set and like this
we will finish with segmentation fault.
Must be backported in all stable versions. All versions does not have
the ifdef, the previous cleanup patch is useful starting from the 2.7
version.
Valentine Krasnobaeva [Tue, 28 May 2024 09:01:11 +0000 (11:01 +0200)]
CLEANUP: ssl/ocsp: readable ifdef in ssl_sock_load_ocsp
Due to the support of different TLS/SSL libraries and its different versions,
sometimes we are forced to use different internal typedefs and callback
functions. We strive to avoid this, but time to time "#ifdef... #endif"
become inevitable.
In particular, in ssl_sock_load_ocsp() we define a 'callback' variable, which
will contain a function pointer to our OCSP stapling callback, assigned
further via SSL_CTX_set_tlsext_status_cb() to the intenal SSL context
struct in a linked crypto library.
If this linked crypto library is OpenSSL 1.x.x/3.x.x, for setting and
getting this callback we have the following API signatures
(see doc/man3/SSL_CTX_set_tlsext_status_cb.pod):
long SSL_CTX_get_tlsext_status_cb(SSL_CTX *ctx, int (**callback)(SSL *, void *));
long SSL_CTX_set_tlsext_status_cb(SSL_CTX *ctx, int (*callback)(SSL *, void *));
If we are using WolfSSL, same APIs expect tlsextStatusCb function prototype,
provided via the typedef below (see wolfssl/wolfssl/ssl.h):
typedef int(*tlsextStatusCb)(WOLFSSL* ssl, void*);
WOLFSSL_API int wolfSSL_CTX_get_tlsext_status_cb(WOLFSSL_CTX* ctx, tlsextStatusCb* cb);
WOLFSSL_API int wolfSSL_CTX_set_tlsext_status_cb(WOLFSSL_CTX* ctx, tlsextStatusCb cb);
It seems, that in OpenSSL < 1.0.0, there was no support for OCSP extention, so
no need to set this callback.
Let's avoid #ifndef... #endif for this 'callback' variable definition to keep
things clear. #ifndef... #endif are usually less readable, than
straightforward "#ifdef... #endif".
Aurelien DARRAGON [Tue, 28 May 2024 08:40:46 +0000 (10:40 +0200)]
DOC: config: add %ID logformat alias alternative
unique-id sample fetch may be used instead of %ID alias but it wasn't
mentioned explicitly in the doc.
Amaury Denoyelle [Tue, 28 May 2024 12:59:58 +0000 (14:59 +0200)]
REGTESTS: complete http-reuse test with pool-conn-name
Add new test cases in http_reuse_conn_hash vtest. Ensure new server
parameter "pool-conn-name" is used as expected for idle connection name,
both alone and mixed with a SNI.
Amaury Denoyelle [Tue, 28 May 2024 10:00:32 +0000 (12:00 +0200)]
DOC: streamline http-reuse and connection naming definition
With the introduction of "pool-conn-name", documentation related to
http-reuse was rendered more complex than already, notably with multiple
cross-references between "pool-conn-name" and "sni" server keywords.
Took the opportunity to improve all http-reuse related documentation.
First, "http-reuse" keyword general purpose has been greatly expanded
and reordered.
Then, "pool-conn-name" and "sni" have been clarified, in particular the
relation between them, with the foremost being an advanced usage to the
default SSL SNI case in the context of http-reuse. Also update
attach-srv rule documentation as its name parameter is directly linked
to both "pool-conn-name" and "sni".
Willy Tarreau [Mon, 27 May 2024 18:13:42 +0000 (20:13 +0200)]
DOC: config: refer to the non-deprecated keywords in ocsp-update on/off
The doc for "ocsp-update [ off | on ]" was still referring to
"tune.ssl.ocsp-update.*" instead of "ocsp-update.*". No backport
needed.
Willy Tarreau [Mon, 27 May 2024 17:51:19 +0000 (19:51 +0200)]
DOC: config: fix two typos "RST_STEAM" vs "RST_STREAM"
These were added in 3.0-dev11 by commit
068ce2d5d2 ("MINOR: stconn:
Add samples to retrieve about stream aborts"), no backport needed.
Willy Tarreau [Mon, 27 May 2024 16:56:12 +0000 (18:56 +0200)]
BUILD: fd: errno is also needed without poll()
When building without USE_POLL, fd.c fails on errno because that one is
only included when USE_POLL is set. Let's move it outside of the ifdef.
Willy Tarreau [Mon, 27 May 2024 14:02:54 +0000 (16:02 +0200)]
BUILD: makefile: yearly reordering of objects by build time
Some large files have been split since 2.9 (e.g. stats) and build times
have moved and become less smooth, causing a less even parallel build.
As usual, a small reordering cleans all this up. The effect was less
visible than previous years though.
Aurelien DARRAGON [Mon, 27 May 2024 10:02:27 +0000 (12:02 +0200)]
DOC: config: document logformat item naming and typecasting features
The ability to give a name to a logformat_node (known as logformat item in
the documentation) implemented in
2ed6068f2a ("MINOR: log: custom name for
logformat node") wasn't documented.
The same goes for the ability to force the logformat_node's output type to
a specific type implemented in
1448478d62 ("MINOR: log: explicit
typecasting for logformat nodes")
Let's quickly describe such new usages at the start of the custom log
format section.
Aurelien DARRAGON [Mon, 27 May 2024 08:18:10 +0000 (10:18 +0200)]
MINOR: log: rename 'log-format tag' to 'log-format alias'
In 2.9 we started to introduce an ambiguity in the documentation by
referring to historical log-format variables ('%var') as log-format
tags in
739c4e5b1e ("MINOR: sample: accept_date / request_date return
%Ts / %tr timestamp values") and
454c372b60 ("DOC: configuration: add
sample fetches for timing events").
In fact, we've had this confusion between log-format tag and log-format
var for more than 10 years now, but in 2.9 it was the first time the
confusion was exposed in the documentation.
Indeed, both 'log-format variable' and 'log-format tag' actually refer
to the same feature (that is: '%B' and friends that can be used for
direct access to some log-oriented predefined fetches instead of using
%[expr] with generic sample expressions).
This feature was first implemented in
723b73ad75 ("MINOR: config: Parse
the string of the log-format config keyword") and later documented in
4894040fa ("DOC: log-format documentation"). At that time, it was clear
that we used to name it 'log-format variable'.
But later the same year, 'log-format tag' naming started to appear in
some commit messages (while still referring to the same feature), for
instance with
ffc3fcd6d ("MEDIUM: log: report SSL ciphers and version
in logs using logformat %sslc/%sslv").
Unfortunately in 2.9 when we added (and documented) new log-format
variables we officially started drifting to the misleading 'log-format
tag' naming (perhaps because it was the most recent naming found for
this feature in git log history, or because the confusion has always
been there)
Even worse, in 3.0 this confusion led us to rename all 'var' occurrences
to 'tag' in log-format related code to unify the code with the doc.
Hopefully William quickly noticed that we made a mistake there, but
instead of reverting to historical naming (log-format variable), it was
decided that we must use a different name that is less confusing than
'tags' or 'variables' (tags and variables are keywords that are already
used to designate other features in the code and that are not very
explicit under log-format context today).
Now we refer to '%B' and friends as a logformat alias, which is
essentially a handy way to print some log oriented information in the
log string instead of leveraging '%[expr]' with generic sample expressions
made of fetches and converters. Of course, there are some subtelties, such
as a few log-format aliases that still don't have sample fetch equivalent
for historical reasons, and some aliases that may be a little faster than
their generic sample expression equivalents because most aliases are
pretty much hardcoded in the log building function. But in general
logformat aliases should be simply considered as an alternative to using
expressions (with '%[expr']')
Also, under log-format context, when we want to refer to either an alias
('%alias') or an expression ('%[expr]'), we should use the generic term
'logformat item', which in fact designates a single item within the
logformat string provided by the user. Indeed, a logformat item (whether
is is an alias or an expression) always starts with '%' and may accept
optional flags / arguments
Both the code and the documentation were updated in that sense, hopefully
this will clarify things and prevent future confusions.
Willy Tarreau [Mon, 27 May 2024 09:59:46 +0000 (11:59 +0200)]
CI: scripts: build vtest using multiple CPUs
Now that vtest supports make -j, let's use it to save a bit of time
(the build time is ~6s per test by default).
Willy Tarreau [Mon, 27 May 2024 09:50:31 +0000 (11:50 +0200)]
CI: scripts: fix build of vtest regarding option -C
On Linux, GNU make emits "w" at the beginning of the MAKEFLAGS
variable if -C is passed, which happens since vtest
d6d228bcb3.
In fact it emits any of the command line flags without the leading
'-' in this case. gmake doesn't do that on BSD apparently. It's
documented under Options/Recursion in the GNU make doc. There's
also MFLAGS that could work but it does not contain the variables
definitions. So let's just avoid the -C that we don't really need.
This needs to be backported to stable versions.
William Lallemand [Mon, 27 May 2024 09:06:40 +0000 (11:06 +0200)]
MINOR: sample: implement the uptime sample fetch
'uptime' returns the uptime of the current HAProxy worker in seconds.
Willy Tarreau [Fri, 24 May 2024 15:57:29 +0000 (17:57 +0200)]
[RELEASE] Released version 3.0-dev13
Released version 3.0-dev13 with the following main changes :
- CLEANUP: ssl/cli: remove unused code in dump_crtlist_conf
- MINOR: ssl: check parameter in ckch_conf_cmp()
- BUG/MINOR: ring: free ring's allocated area not ring's usable area when using maps
- DOC: configuration: rework the crt-store load documentation
- DEBUG: tools: add vma_set_name() helper
- DEBUG: shctx: name shared memory using vma_set_name()
- DEBUG: sink: add name hint for memory area used by memory-backed sinks
- DEBUG: pollers: add name hint for large memory areas used by pollers
- DEBUG: errors: add name hint for startup-logs memory area
- DEBUG: fd: add name hint for large memory areas
- MEDIUM: ssl: don't load file by discovering them in crt-store
- DOC: configuration: update the crt-list documentation
- DOC: configuration: add the supported crt-store options in crt-list
- BUG/MEDIUM: proto: fix fd leak in <proto>_connect_server
- MINOR: sock: set conn->err_code in case of EPERM
- BUG/MINOR: http-ana: Don't crush stream termination condition on internal error
- MAJOR: spoe: Let the SPOE back into the game
- BUG/MINOR: connection: parse PROXY TLV for LOCAL mode
- BUG/MINOR: server: free PROXY v2 TLVs on srv drop
- MINOR: rhttp: add log on connection allocation failure
- BUG/MEDIUM: rhttp: fix preconnect on single-thread
- BUG/MINOR: rhttp: prevent listener suspend
- BUG/MINOR: rhttp: fix task_wakeup state
- MINOR: session: define flag to explicitely release listener on free
- MEDIUM: rhttp: create session for active preconnect
- MINOR: rhttp: support PROXY emission on preconnect
- MINOR: connection: support PROXY v2 TLV emission without stream
- MINOR: traces: enumerate the list of levels/verbosities when not found
- BUG/MINOR: sock: fix sock_create_server_socket
- MINOR: proto: fix coding style
- BUG/MAJOR: quic: Crash with TLS_AES_128_CCM_SHA256 (libressl only)
- REGTESTS: scripts: allow to change the vtest timeout
- BUG/MEDIUM: quic_tls: prevent LibreSSL < 4.0 from negotiating CHACHA20_POLY1305
- CI: scripts/build-ssl.sh: loudly fail on unsupported platforms
- BUG/MEDIUM: mux-quic: Create sedesc in same time of the QUIC stream
- MINOR: mux-quic: Set abort info for SC-less QCS on STOP_SENDING frame
- CI: scripts/build-ssl: add a DESTDIR and TMPDIR variable
- CI: scripts/buil-ssl: cleanup the boringssl and quictls build
- MINOR: config: add thread-hard-limit to set an upper bound to nbthread
- BUILD: quic: fix unused variable warning when threads are disabled
- BUG/MEDIUM: stick-tables: Fix race with peers when trashing oldest entries
- BUG/MEDIUM: stick-tables: Fix race with peers when killing a sticky session
- BUG/MEDIUM: stick-tables: make sure never to create two same remote entries
- CLEANUP: stick-tables: remove a few unneeded tests for use_wrlock
- MINOR: stick-tables: remove the uneeded read lock in stksess_free()
- CLEANUP: tools: fix vma_set_name() function comment
- DEBUG: tools: add vma_set_name_id() helper
- DEBUG: pollers/fd: add thread id suffix to per-thread memory areas name hints
- DOC: config: fix aes_gcm_enc() description text
- BUILD: trace: fix warning on null dereference
- MEDIUM: config: prevent communication with privileged ports
- MAJOR: config: prevent QUIC with clients privileged port by default
- BUG/MINOR: quic: adjust restriction for stateless reset emission
- MINOR: quic: clarify doc for quic_recv()
- MINOR: server: generalize sni expr parsing
- MINOR: server: define pool-conn-name keyword
- MEDIUM: connection: use pool-conn-name instead of sni on reuse
- BUG/MINOR: rhttp: initialize session origin after preconnect reversal
- BUG/MEDIUM: server/dns: preserve server's port upon resolution timeout or error
- BUG/MINOR: http-htx: Support default path during scheme based normalization
- BUG/MINOR: server: Don't reset resolver options on a new default-server line
- DOC: quic: specify that connection migration is not supported
- DOC: config: fix incorrect section reference about custom log format
- DOC: config: uniformize the naming and description of custom log format args
- DOC: config: clarify the fact that custom log format is not just for logging
- REGTESTS: acl_cli_spaces: avoid a warning caused by undefined logs
Willy Tarreau [Fri, 24 May 2024 15:50:19 +0000 (17:50 +0200)]
REGTESTS: acl_cli_spaces: avoid a warning caused by undefined logs
There's a warning being reported in this reg test in the detailed startup
logs because of "log global" and "option httplog" while there's no global
section hence no logger. Let's just drop both options since they're not
relevant to this test.
Willy Tarreau [Fri, 24 May 2024 15:06:12 +0000 (17:06 +0200)]
DOC: config: clarify the fact that custom log format is not just for logging
The wording in the Custom log format section was still extremely centered
on logging, but it's about time to mention that these are usable for other
actions as well, otherwise it's very confusing for newcomers who try to
define a variable or header. The updated text also reminds about the risks
of safe encodings that may (rarely) mangle an output string, and encourages
to migrate away from the unquoted definition which is full of backslashes.
It would definitely deserve further improvements and refinements.