Willy Tarreau [Thu, 11 Jun 2020 12:25:47 +0000 (14:25 +0200)]
BUG/MEDIUM: log: don't hold the log lock during writev() on a file descriptor
In issue #648 a second problem was reported, indicating that some users
mistakenly send the log to an FD mapped on a file. This situation doesn't
even enable O_NONBLOCK and results in huge access times in the order of
milliseconds with the lock held and other threads waiting till the
watchdog fires to unblock the situation.
The problem with files is that O_NONBLOCK is ignored, and we still need
to lock otherwise we can end up with interleaved log messages.
What this patch does is different. Instead of locking all writers, it
uses a trylock so that there's always at most one logger and that other
candidates can simply give up and report a failure, just as would happen
if writev() returned -1 due to a pipe full condition. This solution is
elegant because it gives back the control to haproxy to decide to give
up when it takes too much time, while previously it was the kernel that
used to block the syscall.
However at high log rates (500000 req/s) there was up to 50% dropped logs
due to the contention on the lock. In order to address this, we try to
grab the lock up to 200 times and call ha_thread_relax() on failure. This
results in almost no failure (no more than previously with O_NONBLOCK). A
typical test with 6 competing threads writing to stdout chained to a pipe
to a single process shows around 1000 drops for 10 million logs at 480000
lines per second.
Please note that this doesn't mean that writing to a blocking FD is a good
idea, and it might only be temporarily done on testing environments for
debugging. A file or a terminal will continue to block the writing thread
while others spin a little bit and lose their logs, but the writing thread
will still experience performance-killing latencies.
This patch should be backported to 2.1 and 2.0. The code is in log.c in
2.0, but the principle is the same.
(cherry picked from commit
df187875da479c71f12e2e8edb0a2e345f189523)
[wla: the lock is the one of the fd instead of log_lock]
Signed-off-by: William Lallemand <wlallemand@haproxy.org>
Christopher Faulet [Tue, 9 Jun 2020 06:44:30 +0000 (08:44 +0200)]
[RELEASE] Released version 2.1.7
Released version 2.1.7 with the following main changes :
- BUG/MAJOR: http-htx: Don't forget to copy error messages from defaults sections
Christopher Faulet [Mon, 8 Jun 2020 15:09:17 +0000 (17:09 +0200)]
BUG/MAJOR: http-htx: Don't forget to copy error messages from defaults sections
A bug was introduced in the commit
c9dbbbdfc ("BUG/MEDIUM: http-htx: Duplicate
error messages as raw data instead of string"). No copy is performed in this
patch. The memory is allocated but the copy is not perfomed. It is stupid and it
affects all configuration with errofiles defined in a defaults section.
This patch should fix the issue #669. This bug is specific to the 2.1. Other
versions are not affected. Thus, there is no upstream commit ID for this
patch. And it should not be backported.
William Lallemand [Mon, 8 Jun 2020 08:24:54 +0000 (10:24 +0200)]
[RELEASE] Released version 2.1.6
Released version 2.1.6 with the following main changes :
- Revert "BUG/MEDIUM: connections: force connections cleanup on server changes"
- SCRIPTS: publish-release: pass -n to gzip to remove timestamp
- BUG/MINOR: peers: fix internal/network key type mapping.
- BUG/MEDIUM: lua: Reset analyse expiration timeout before executing a lua action
- BUG/MEDIUM: http-htx: Duplicate error messages as raw data instead of string
- BUG/MEDIUM: hlua: Lock pattern references to perform set/add/del operations
- BUG/MEDIUM: contrib/prometheus-exporter: Properly set flags to dump metrics
- BUG/MEDIUM: mworker: fix the copy of options in copy_argv()
- BUG/MINOR: init: -x can have a parameter starting with a dash
- BUG/MINOR: init: -S can have a parameter starting with a dash
- BUG/MEDIUM: mworker: fix the reload with an -- option
- BUG/MINOR: ssl: fix a trash buffer leak in some error cases
- BUG/MINOR: mworker: fix a memleak when execvp() failed
William Lallemand [Mon, 8 Jun 2020 08:01:13 +0000 (10:01 +0200)]
BUG/MINOR: mworker: fix a memleak when execvp() failed
Free next_argv when execvp() failed.
Must be backported as far as 1.8.
Should fix issue #668.
(cherry picked from commit
9fc6c97fb30bb9c15d6b6e34c917b0b5aba7486a)
Signed-off-by: William Lallemand <wlallemand@haproxy.org>
William Lallemand [Mon, 8 Jun 2020 07:40:37 +0000 (09:40 +0200)]
BUG/MINOR: ssl: fix a trash buffer leak in some error cases
Fix a trash buffer leak when we can't take the lock of the ckch, or when
"set ssl cert" is wrongly used.
The bug was mentionned in this thread:
https://www.mail-archive.com/haproxy@formilux.org/msg37539.html
The bug was introduced by commit bc6ca7c ("MINOR: ssl/cli: rework 'set
ssl cert' as 'set/commit'").
Must be backported in 2.1.
(cherry picked from commit
e5ff4addb2300db60f5a4d1f99abd54b4a1a94f6)
[wla: function cli_parse_set_cert() was moved in 2.2 and is in
src/ssl_sock.c in 2.1]
Signed-off-by: William Lallemand <wlallemand@haproxy.org>
William Lallemand [Fri, 5 Jun 2020 12:08:41 +0000 (14:08 +0200)]
BUG/MEDIUM: mworker: fix the reload with an -- option
When HAProxy is started with a '--' option, all following parameters are
considered configuration files. You can't add new options after a '--'.
The current reload system of the master-worker adds extra options at the
end of the arguments list. Which is a problem if HAProxy was started wih
'--'.
This patch fixes the issue by copying the new option at the beginning of
the arguments list instead of the end.
This patch must be backported as far as 1.8.
(cherry picked from commit
0041741ef7253fcad2fc98f0b2a9968fb4af3574)
[wla: context edit]
Signed-off-by: William Lallemand <wlallemand@haproxy.org>
William Lallemand [Thu, 4 Jun 2020 21:49:20 +0000 (23:49 +0200)]
BUG/MINOR: init: -S can have a parameter starting with a dash
There is no reason the -S option can't take an argument which starts with
a -. This limitation must only be used for options that take a
non-finite list of parameters (-sf/-st)
This can be backported only if the previous patch which fixes
copy_argv() is backported too.
Could be backported as far as 1.9.
(cherry picked from commit
a6b3249935cf8e4de53b9940419683ef82cb40b8)
Signed-off-by: William Lallemand <wlallemand@haproxy.org>
William Lallemand [Thu, 4 Jun 2020 21:41:29 +0000 (23:41 +0200)]
BUG/MINOR: init: -x can have a parameter starting with a dash
There is no reason the -x option can't take an argument which starts with
a -. This limitation must only be used for options that take a
non-finite list of parameters (-sf/-st)
This can be backported only if the previous patch which fixes
copy_argv() is backported too.
Could be backported as far as 1.8.
(cherry picked from commit
4f71d304aad5199d9a0179deb34241b4a3665740)
Signed-off-by: William Lallemand <wlallemand@haproxy.org>
William Lallemand [Thu, 4 Jun 2020 15:40:23 +0000 (17:40 +0200)]
BUG/MEDIUM: mworker: fix the copy of options in copy_argv()
The copy_argv() function, which is used to copy and remove some of the
arguments of the command line in order to re-exec() the master process,
is poorly implemented.
The function tries to remove the -x and the -sf/-st options but without
taking into account that some of the options could take a parameter
starting with a dash.
In issue #644, haproxy starts with "-L -xfoo" which is perfectly
correct. However, the re-exec is done without "-xfoo" because the master
tries to remove the "-x" option. Indeed, the copy_argv() function does
not know how much arguments an option can have, and just assume that
everything starting with a dash is an option. So haproxy is exec() with
"-L" but without a parameter, which is wrong and leads to the exit of
the master, with usage().
To fix this issue, copy_argv() must know how much parameters an option
takes, and copy or skip the parameters correctly.
This fix is a first step but it should evolve to a cleaner way of
declaring the options to avoid deduplication of the parsing code, so we
avoid new bugs.
Should be backported with care as far as 1.8, by removing the options
that does not exists in the previous versions.
(cherry picked from commit
df6c5a8ffad7ec28ff6987dc6b1e26d68eb89c5e)
Signed-off-by: William Lallemand <wlallemand@haproxy.org>
Christopher Faulet [Fri, 5 Jun 2020 06:18:56 +0000 (08:18 +0200)]
BUG/MEDIUM: contrib/prometheus-exporter: Properly set flags to dump metrics
When the metrics are dumped, in the main function promex_dump_metrics(), the
appctx flags are set before entering in a new scope, among others things to know
which metrics names and descriptions to use. But, those flags are not restored
when the dump is interrupted because of a full output buffer. If this happens
after the dump of global metrics, it may only lead to extra #TYPE and #HELP
lines. But if this happens during the dump of global metrics, the following
dumps of frontends, backends and servers metrics use names and descriptions of
global ones with the unmatching indexes. This first leads to unexisting metrics
names. For instance, "haproxy_frontend_nbproc". But also to out-of-bound
accesses to name and description arrays because there are more stats fields than
info fields.
It is easy to reproduce the bug using small buffers, setting tune.bufsize to
8192 for instance.
This patch should fix the issue #666. It must be backported as far as 2.0.
(cherry picked from commit
efde955bdb7b7479267cfcd823cce26bb3f5c4d3)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Christopher Faulet [Wed, 3 Jun 2020 16:39:16 +0000 (18:39 +0200)]
BUG/MEDIUM: hlua: Lock pattern references to perform set/add/del operations
The pattern references lock must be hold to perform set/add/del
operations. Unfortunately, it is not true for the lua functions manipulating acl
and map files.
This patch should fix the issue #664. It must be backported as far as 1.8.
(cherry picked from commit
5cf2dfc5fd08b7ff8d27d6293e2aaa8ec18fc607)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Christopher Faulet [Wed, 3 Jun 2020 14:21:04 +0000 (16:21 +0200)]
BUG/MEDIUM: http-htx: Duplicate error messages as raw data instead of string
When the parsing of a proxy starts, the error messages are duplicated from the
default proxy. chunk_dup() function is used to do so. But this function handles
the source buffer as a string and try to add a null-terminated byte to the
destination buffer after the data copy. Since the 2.1, The error messages are
only HTX messages. So the buffer must not be handled as string but as raw
data. When the HTX message is not empty, it is not a problem because the
underlying buffer is considered as full. null-terminated byte cannot be added
and an exact copy is performed. But when the error message is set to /dev/null,
the source buffer as a null size. In this case, size of the destination buffer
is incremented by one. At the end, the destination buffer has a size of 1. It is
an unexpected and undefined state. In http_reply_and_close(), these buffers are
erroneously casted to junk HTX messages leading to undefined behaviors, most
probably to crashes.
This bug is specific to the 2.1. Other versions are not affected. Thus, there is
no upstream commit ID for this patch. And it should not be backported. it should
fix the issue #648 and probably the issue #658.
Christopher Faulet [Tue, 2 Jun 2020 16:46:07 +0000 (18:46 +0200)]
BUG/MEDIUM: lua: Reset analyse expiration timeout before executing a lua action
Before executing a lua action, the analyse expiration timeout of the
corresponding channel must be reset. Otherwise, when it expires, for instance
because of a call to core.msleep(), if the action yields, an expired timeout
will be used for the stream's task, leading to a loop.
This patch should fix the issue #661. It must be backported in all versions
supporting the lua.
(cherry picked from commit
23308ebf0d603d3f81b4c7a5bdcd82f1cd70c449)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Emeric Brun [Tue, 2 Jun 2020 09:17:42 +0000 (11:17 +0200)]
BUG/MINOR: peers: fix internal/network key type mapping.
Network types were directly and mistakenly mapped on sample types:
This patch fix the doc with values effectively used to keep backward
compatiblitiy on existing implementations.
In addition it adds an internal/network mapping for key types to avoid
further mistakes adding or modifying internals types.
This patch should be backported on all maintained branches,
particularly until v1.8 included for documentation part.
(cherry picked from commit
530ba38a706ed640e43eb44abe5578c7a5826d5d)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Willy Tarreau [Sat, 30 May 2020 04:59:07 +0000 (06:59 +0200)]
SCRIPTS: publish-release: pass -n to gzip to remove timestamp
It just appeared that the tar.gz we put online are not reproducible
because a timestamp is put by default into the archive. Passing "-n"
to gzip is sufficient to remove this timestamp, so let's do it, and
also make the gzip command configurable for more flexibility. Now
issuing the commands multiple times finally results in the same
archives being produced.
This should be backported to supported stable branches.
(cherry picked from commit
6fab3e6d91bff843dcb289172c8b7a5dd54062ac)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
William Dauchy [Tue, 2 Jun 2020 14:03:58 +0000 (16:03 +0200)]
Revert "BUG/MEDIUM: connections: force connections cleanup on server changes"
As explained by Christopher on github issue #665:
In 2.2, srv->idle_conns and srv->safe_conns are thread-safe lists. But
not in 2.1. So the patch must be reverted or the lists must be changed
to use mt_list instead. The same must be done in 2.0, but the mt_list
does not exist on this version.
I choose to revert it as the original bug is truly revealed in v2.2
after commit
079cb9af22da6 ("MEDIUM: connections: Revamp the way idle
connections are killed")
this should resolve github issue #665
this reverts commit
7eab37b6819af685c647cf5a581e29fca2f3e079.
this reverts commit
3ad3306ec0bcb0cd4ca2b9ba134ed67663473ee8.
The patch was directly introduced on 2.1, there is no upstream commit ID for
this patch.
William Lallemand [Fri, 29 May 2020 11:17:23 +0000 (13:17 +0200)]
[RELEASE] Released version 2.1.5
Released version 2.1.5 with the following main changes :
- BUG/MINOR: protocol_buffer: Wrong maximum shifting.
- MINOR: ssl: improve the errors when a crt can't be open
- BUG/MINOR: ssl/cli: memory leak in 'set ssl cert'
- BUG/MINOR: ssl: memleak of the struct cert_key_and_chain
- BUG/MINOR: connection: always send address-less LOCAL PROXY connections
- BUG/MINOR: peers: Incomplete peers sections should be validated.
- DOC: hashing: update link to hashing functions
- MINOR: version: Show uname output in display_version()
- DOC: Improve documentation on http-request set-src
- BUG/MINOR: ssl: default settings for ssl server options are not used
- BUG/MEDIUM: http-ana: Handle NTLM messages correctly.
- BUG/MINOR: tools: fix the i386 version of the div64_32 function
- BUG/MINOR: http: make url_decode() optionally convert '+' to SP
- DOC: option logasap does not depend on mode
- MEDIUM: memory: make pool_gc() run under thread isolation
- MINOR: contrib: make the peers wireshark dissector a plugin
- BUG/MINOR: check: Update server address and port to execute an external check
- MINOR: checks: Add a way to send custom headers and payload during http chekcs
- BUG/MINOR: checks: Respect the no-check-ssl option
- BUG/MEDIUM: server/checks: Init server check during config validity check
- BUG/MINOR: checks: chained expect will not properly wait for enough data
- BUG/MINOR: obj_type: Handle stream object in obj_base_ptr() function
- BUG/MINOR: mux-fcgi: Be sure to have a connection as session's origin to use it
- BUG/MEDIUM: capture: capture-req/capture-res converters crash without a stream
- BUG/MEDIUM: capture: capture.{req,res}.* crash without a stream
- BUG/MEDIUM: http: the "http_first_req" sample fetch could crash without a steeam
- BUG/MEDIUM: http: the "unique-id" sample fetch could crash without a steeam
- BUG/MEDIUM: sample: make the CPU and latency sample fetches check for a stream
- BUG/MEDIUM: listener: mark the thread as not stuck inside the loop
- MINOR: threads: export the POSIX thread ID in panic dumps
- BUG/MINOR: debug: properly use long long instead of long for the thread ID
- BUG/MEDIUM: shctx: really check the lock's value while waiting
- BUG/MEDIUM: shctx: bound the number of loops that can happen around the lock
- MINOR: stream: report the list of active filters on stream crashes
- MINOR: haproxy: export run_poll_loop
- MINOR: tools: add new function dump_addr_and_bytes()
- MINOR: tools: add resolve_sym_name() to resolve function pointers
- MINOR: debug: use resolve_sym_name() to dump task handlers
- MINOR: cli: make "show fd" rely on resolve_sym_name()
- MEDIUM: debug: add support for dumping backtraces of stuck threads
- MINOR: debug: call backtrace() once upon startup
- BUILD: Makefile: include librt before libpthread
- MINOR: wdt: do not depend on USE_THREAD
- MINOR: debug: report the number of entries in the backtrace
- MINOR: debug: improve backtrace() on aarch64 and possibly other systems
- MINOR: debug: use our own backtrace function on clang+x86_64
- MINOR: debug: dump the whole trace if we can't spot the starting point
- BUILD: tools: unbreak resolve_sym_name() on non-GNU platforms
- BUILD: tools: rely on __ELF__ not USE_DL to enable use of dladdr()
- BUILD: Makefile: add linux-musl to TARGET
- REGTEST: ssl: test the client certificate authentication
- REGTEST: http-rules: Require PCRE or PCRE2 option to run map_redirect script
- Revert "BUG/MINOR: connection: always send address-less LOCAL PROXY connections"
- Revert "BUG/MINOR: connection: make sure to correctly tag local PROXY connections"
- BUG/MINOR: checks/server: use_ssl member must be signed
- BUG/MINOR: checks: Compute the right HTTP request length for HTTP health checks
- BUG/MINOR: checks: Remove a warning about http health checks
- BUG/MEDIUM: mux_fcgi: Free the FCGI connection at the end of fcgi_release()
- BUG/MEDIUM: mux-fcgi: Fix wrong test on FCGI_CF_KEEP_CONN in fcgi_detach()
- BUG/MEDIUM: connections: force connections cleanup on server changes
- BUG/MEDIUM: h1: Don't compare host and authority if only h1 headers are parsed
- BUG/MEDIUM: ssl: fix the id length check within smp_fetch_ssl_fc_session_id()
- CLEANUP: connections: align function declaration
- BUG/MINOR: sample: Set the correct type when a binary is converted to a string
- BUG/MINOR: threads: fix multiple use of argument inside HA_ATOMIC_CAS()
- BUG/MINOR: threads: fix multiple use of argument inside HA_ATOMIC_UPDATE_{MIN,MAX}()
- BUG/MEDIUM: lua: Fix dumping of stick table entries for STD_T_DICT
- BUG/MINOR: config: Make use_backend and use-server post-parsing less obscur
- BUG/MINOR: http-ana: fix NTLM response parsing again
- BUG/MEDIUM: http_ana: make the detection of NTLM variants safer
- BUG/MINOR: cfgparse: Abort parsing the current line if an invalid \x sequence is encountered
- BUG/MINOR: pools: use %u not %d to report pool stats in "show pools"
- BUG/MINOR: pollers: remove uneeded free in global init
- BUG/MINOR: soft-stop: always wake up waiting threads on stopping
- BUILD: select: only declare existing local labels to appease clang
- BUG/MEDIUM: streams: Remove SF_ADDR_SET if we're retrying due to L7 retry.
- BUG/MEDIUM: stream: Only allow L7 retries when using HTTP.
- BUG/MINOR: cache: Don't needlessly test "cache" keyword in parse_cache_flt()
- BUG/MAJOR: mux-fcgi: Stop sending loop if FCGI stream is blocked for any reason
- BUG/MEDIUM: ring: write-lock the ring while attaching/detaching
- BUG/MINOR: checks: Respect check-ssl param when a port or an addr is specified
- BUG/MINOR: server: Fix server_finalize_init() to avoid unused variable
- DOC: retry-on can only be used with mode http
- DOC/MINOR: halog: Add long help info for ic flag
- DOC: SPOE is no longer experimental
- BUG/MINOR: logs: prevent double line returns in some events.
- REGTESTS: checks: Fix tls_health_checks when IPv6 addresses are used
- BUG/MEDIUM: logs: fix trailing zeros on log message.
- BUG/MINOR: lua: Add missing string length for lua sticktable lookup
- BUG/MINOR: nameservers: fix error handling in parsing of resolv.conf
Willy Tarreau [Thu, 28 May 2020 16:07:10 +0000 (18:07 +0200)]
BUG/MINOR: nameservers: fix error handling in parsing of resolv.conf
In issue #657, Coverity found a bug in the "nameserver" parser for the
resolv.conf when "parse-resolv-conf" is set. What happens is that if an
unparsable address appears on a "nameserver" line, it will destroy the
previously allocated pointer before reporting the warning, then the next
"nameserver" line will dereference it again and wlil cause a crash. If
the faulty nameserver is the last one, it will only be a memory leak.
Let's just make sure we preserve the pointer when handling the error.
The patch also fixes a typo in the warning.
The bug was introduced in 1.9 with commit
44e609bfa ("MINOR: dns:
Implement `parse-resolv-conf` directive") so the fix needs to be backported
up to 1.9 or 2.0.
(cherry picked from commit
78675252fb6ce195c145afd2862fa53087bf585f)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Nathan Neulinger [Wed, 4 Mar 2020 02:32:47 +0000 (20:32 -0600)]
BUG/MINOR: lua: Add missing string length for lua sticktable lookup
In hlua_stktable_lookup(), the key length is never set so all
stktable:lookup("key") calls return nil from lua.
This patch must be backported as far as 1.9.
[Cf: I slightly updated the patch to use lua_tolstring() instead of
luaL_checkstring() + strlen()]
(cherry picked from commit
31a841c323b0e9692695a57d4747f71a32b9fb68)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Emeric Brun [Thu, 28 May 2020 12:21:33 +0000 (14:21 +0200)]
BUG/MEDIUM: logs: fix trailing zeros on log message.
This patch removes all trailing LFs and Zeros from
log messages. Previously only the last LF was removed.
It's a regression from
e8ea0ae6f6 "BUG/MINOR: logs:
prevent double line returns in some events."
This should fix github issue #654
(cherry picked from commit
fa9d780119adc687e58a8b09e55780a0173b4ab4)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Christopher Faulet [Mon, 25 May 2020 05:59:59 +0000 (07:59 +0200)]
REGTESTS: checks: Fix tls_health_checks when IPv6 addresses are used
In tls_health_checks.vtc, when IPv6 addresses are used, A config error is
reported because of the "addr" server parameter. Because there is no specified
port, the IPv6 address must be enclosed into brackets to be properly parsed. It
also works with IPv4 addresses. But instead, a dummy port is added to the addr
parameter. This way, we also check the port parameter, when specified, is used
in priority over the port found in the addr parameter.
This patch should fix the issue #646.
(cherry picked from commit
ed48657e0297268d83d4b07027843c0a8ab05d36)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Emeric Brun [Tue, 12 May 2020 17:33:15 +0000 (19:33 +0200)]
BUG/MINOR: logs: prevent double line returns in some events.
Historically some messages used to already contain the trailing LF but
not all, and __do_send_log adds a new one in needed cases. It also does
trim a trailing LF in certain cases while computing the max message
length, as a result of subtracting 1 to the available room in the
destination buffer. But the way it's done is wrong since some messages
still contain it.
So the code was fixed to always trim the trailing LF from messages if
present, and then only subtract 1 from the destination buffer room
instead of the size..
Note: new sink API is not designed to receive a trailing LF on
event messages
This could be backported to relevant stable versions with particular
care since the logic of the code changed a bit since 1.6 and there
may be other locations that need to be adjusted.
(cherry picked from commit
9e8ea0ae6f63fe1a8119d99b1218ccc095b390ac)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Christopher Faulet [Wed, 13 May 2020 06:25:12 +0000 (08:25 +0200)]
DOC: SPOE is no longer experimental
The SPOE was marked as experiemental since the begining. But, it is no longer
true. This can be an obstacle to its use.
(cherry picked from commit
3b78809de09c6227164f6f48d587f86c827d5de1)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Aleksandar Lazi [Fri, 15 May 2020 20:58:30 +0000 (22:58 +0200)]
DOC/MINOR: halog: Add long help info for ic flag
Add missing long help text for the ic (ip count) flag
(cherry picked from commit
6112f5ccd28376444c5dfccee9c3fe2be4545070)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Jerome Magnin [Wed, 13 May 2020 18:09:57 +0000 (20:09 +0200)]
DOC: retry-on can only be used with mode http
The documentation for retry-on hints at it being meant to be used
in conjuction with mode http, but since we've a had bug report
involving mode tcp and retry-on, lets make it explicit in the
documentation that it only works with mode http and will be
ignored otherwise.
(cherry picked from commit
5ce3c14aa97396502fe68c6ce42a32ea99361046)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Christopher Faulet [Mon, 27 Apr 2020 09:17:10 +0000 (11:17 +0200)]
BUG/MINOR: server: Fix server_finalize_init() to avoid unused variable
The variable 'ret' must only be declared When HAProxy is compiled with the SSL
support (more precisely SSL_CTRL_SET_TLSEXT_HOSTNAME must be defined).
No backport needed.
(cherry picked from commit
b3b53524addbea79f5928b0bd5c58fd201a3e828)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
[Cf: The patch fixes a bug introduced by the commit
8892e5d3. Thus it must be
backported to the same versions.]
Christopher Faulet [Wed, 20 May 2020 20:36:24 +0000 (22:36 +0200)]
BUG/MINOR: checks: Respect check-ssl param when a port or an addr is specified
When a check port or a check address is specified, the check transport layer is
ignored. So it is impossible to do a SSL check in this case. This bug was
introduced by the commit
8892e5d30 ("BUG/MEDIUM: server/checks: Init server
check during config validity check").
This patch should fix the issue #643. It must be backported to all branches
where the above commit was backported.
(cherry picked from commit
66163ec616bd3dfb06605bfc5a6e640ea7ada7d7)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Willy Tarreau [Tue, 19 May 2020 17:21:45 +0000 (19:21 +0200)]
BUG/MEDIUM: ring: write-lock the ring while attaching/detaching
The LIST_ADDQ() and LIST_DEL_INIT() calls made to attach/detach a waiter
to the ring were made under a read lock which was sufficient in front of
the writer's write lock. But it's not sufficient against other readers!
Thus theorically multiple "show events" on the same ring buffer on the
CLI could result in a crash, even though for now I couldn't manage to
reproduce it.
This fixes commit
1d181e489c ("MEDIUM: ring: implement a wait mode for
watchers") so it must be backported to 2.1, possibly further if the ring
code gets backported.
(cherry picked from commit
223ddedb467494d78ebf67e0ac19206f767ffe6a)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Christopher Faulet [Tue, 19 May 2020 13:13:00 +0000 (15:13 +0200)]
BUG/MAJOR: mux-fcgi: Stop sending loop if FCGI stream is blocked for any reason
Because of a typo error in conditions to exit the sending loop, it is possible
to loop infinitely in fcgi_snd_buf() function. Instead of checking the FCGI
stream is not blocked to continue sending data, the FCGI connection is used. So
it is possible to have a stream blocked because there is not enough space in the
mux buffers to copy more data but continue to loop to send more data.
This patch should fix the issue #637. It must be backported to 2.1.
(cherry picked from commit
fe410d685ad8bce7399e50695098b69aea07e558)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Christopher Faulet [Mon, 18 May 2020 09:58:16 +0000 (11:58 +0200)]
BUG/MINOR: cache: Don't needlessly test "cache" keyword in parse_cache_flt()
parse_cache_flt() is the registered callback for the "cache" filter keyword. It
is only called when the "cache" keyword is found on a filter line. So, it is
useless to test the filter name in the callback function.
This patch should fix the issue #634. It may be backported as far as 1.9.
(cherry picked from commit
2a37cdbe6b2cf7e9fa8654ee017af12ceb6015c2)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Olivier Houchard [Wed, 13 May 2020 17:07:20 +0000 (19:07 +0200)]
BUG/MEDIUM: stream: Only allow L7 retries when using HTTP.
Only allow L7 retries when using HTTP, it only really makes sense for HTTP,
anyway, and as the L7 retries code assume the message will be HTX, it will
crash when used with mode TCP.
This should fix github issue #627.
This should be backported to 2.1 and 2.0.
(cherry picked from commit
7dd7b908db1cb804ffe30ef208f7eac1824c1889)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Olivier Houchard [Tue, 12 May 2020 20:18:14 +0000 (22:18 +0200)]
BUG/MEDIUM: streams: Remove SF_ADDR_SET if we're retrying due to L7 retry.
In do_l7_retry(), remove the SF_ADDR_SET flag. Otherwise,
assign_server_address() won't be called again, which means for 2.1 or 2.2,
we will always retry to connect to the server that just failed, and for 2.0,
that we will try to use to whatever the address is for the connection,
probably the last server used by that connection before it was pool_free()
and reallocated.
This should be backported to 2.1 and 2.0.
(cherry picked from commit
8cabc9783aaa8309f4ec6f7dd95cd74c61ed2e61)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Jerome Magnin [Wed, 13 May 2020 13:11:02 +0000 (15:11 +0200)]
BUILD: select: only declare existing local labels to appease clang
Commit
42a50bd19 ("BUG/MINOR: pollers: remove uneeded free in global
init") removed the 'fail_revt' label from the _do_init() function
in src/ev_select.c but left the local label declaration, which makes
clang unhappy and unable to build.
These labels are only historic and unneeded anyway so let's remove them.
This should be backported where
42a50bd19 is backported.
(cherry picked from commit
c9c475e40d9104cc12ebc9320cb768bb4794766d)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Willy Tarreau [Wed, 13 May 2020 11:51:01 +0000 (13:51 +0200)]
BUG/MINOR: soft-stop: always wake up waiting threads on stopping
Currently the soft-stop can lead to old processes remaining alive for as
long as two seconds after receiving a soft-stop signal. What happens is
that when receiving SIGUSR1, one thread (usually the first one) wakes up,
handles the signal, sets "stopping", goes into runn_poll_loop(), and
discovers that stopping is set, so its also sets itself in the
stopping_thread_mask bit mask. After this it sees that other threads are
not yet willing to stop, so it continues to wait.
From there, other threads which were waiting in poll() expire after one
second on poll timeout and enter run_poll_loop() in turn. That's already
one second of wait time. They discover each in turn that they're stopping
and see that other threads are not yet stopping, so they go back waiting.
After the end of the first second, all threads know they're stopping and
have set their bit in stopping_thread_mask. It's only now that those who
started to wait first wake up again on timeout to discover that all other
ones are stopping, and can now quit. One second later all threads will
have done it and the process will quit.
This is effectively strictly larger than one second and up to two seconds.
What the current patch does is simple, when the first thread stops, it sets
its own bit into stopping_thread_mask then wakes up all other threads to do
also set theirs. This kills the first second which corresponds to the time
to discover the stopping state. Second, when a thread exists, it wakes all
other ones again because some might have gone back sleeping waiting for
"jobs" to go down to zero (i.e. closing the last connection). This kills
the last second of wait time.
Thanks to this, as SIGUSR1 now acts instantly again if there's no active
connection, or it stops immediately after the last connection has left if
one was still present.
This should be backported as far as 2.0.
(cherry picked from commit
d7a6b2f742340ae64eead154bce80ccb25158f84)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
William Dauchy [Mon, 11 May 2020 13:20:03 +0000 (15:20 +0200)]
BUG/MINOR: pollers: remove uneeded free in global init
Since commit
d4604adeaa8c ("MAJOR: threads/fd: Make fd stuffs
thread-safe"), we init pollers per thread using a helper. It was still
correct for mono-thread mode until commit
cd7879adc2c4 ("BUG/MEDIUM:
threads: Run the poll loop on the main thread too"). We now use a deinit
helper for all threads, making those free uneeded.
Only poll and select are affected by this very minor issue.
it could be backported from v1.8 to v2.1.
Fixes:
cd7879adc2c4 ("BUG/MEDIUM: threads: Run the poll loop on the main
thread too")
Signed-off-by: William Dauchy <w.dauchy@criteo.com>
(cherry picked from commit
42a50bd19be38962944377c73a7dc496fbd3bbc1)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Willy Tarreau [Wed, 6 May 2020 15:10:37 +0000 (17:10 +0200)]
BUG/MINOR: pools: use %u not %d to report pool stats in "show pools"
In dump_pools_to_trash() we happen to use %d to display unsigned ints!
This has probably been there since "show pools" was introduced so this
fix must be backported to all versions. The impact is negligible since
no pool uses 2 billion entries. It could possibly affect the report of
failed allocation counts but in this case there's a bigger problem to
solved!
(cherry picked from commit
cece694bcca10ec4d81346131297e66e35477dca)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Tim Duesterhus [Thu, 7 May 2020 17:21:31 +0000 (19:21 +0200)]
BUG/MINOR: cfgparse: Abort parsing the current line if an invalid \x sequence is encountered
This fixes OSS Fuzz issue https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=21931.
OSS Fuzz detected a hang on configuration parsing for a 200kB line with a large number of
invalid escape sequences. Most likely due to the amounts of error output generated.
This issue is very minor, because usually generated configurations are to be trusted.
The bug exists since at the very least HAProxy 1.4. The patch may be backported if desired.
(cherry picked from commit
e6291956e7ad622dd2e4ac864caf05c93db417d3)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Willy Tarreau [Thu, 7 May 2020 17:27:02 +0000 (19:27 +0200)]
BUG/MEDIUM: http_ana: make the detection of NTLM variants safer
In issue #511 a problem was reported regarding NTLM and undesired session
sharing. This was caused by an attempt to limit the protection against
NTLM breakage to just NTLM and not properly working schemes in commit
fd9b68c48 ("BUG/MINOR: only mark connections private if NTLM is detected").
Unfortunately as reported in the issue above, the extent of possible
challenges for NTLM is a bit more complex than just the "NTLM" or
"Negotiate" words. There's also "Nego2" and these words can be followed
by a base64 value, which is not validated here. The list of possible
entries doesn't seem to be officially documented but can be reconstructed
from different public documents:
https://docs.microsoft.com/en-us/openspecs/windows_protocols/ms-ntht/
7daaf621-94d9-4942-a70a-
532e81ba293e
https://docs.microsoft.com/en-us/openspecs/windows_protocols/ms-n2ht/
5c1d2bbc-e1d6-458f-9def-
dd258c181310
https://docs.microsoft.com/en-us/openspecs/windows_protocols/ms-n2ht/
9201ed70-d245-41ce-accd-
e609637583bf
https://docs.microsoft.com/en-us/openspecs/windows_protocols/ms-n2ht/
02be79f3-e360-475f-b468-
b96c878c70c7
This patch tries to fix all this on top of previous attempts by making
as private any connection that returns a www-authenticate header starting
with "Nego" or "NTLM". We don't need to be too strict, we really just
want to leave the connection shared if really sure it can be.
This must be backported to 1.8 but will require some adaptations. In
1.9 and 2.0 the check appears both for legacy and HTX. The simplest
thing to do is to look for "Negotiate" and fix all relevant places.
(cherry picked from commit
f1dccedcf63fdc1f6117cafaac89948904cc75b9)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Willy Tarreau [Thu, 7 May 2020 17:10:15 +0000 (19:10 +0200)]
BUG/MINOR: http-ana: fix NTLM response parsing again
Commit
9df188695f ("BUG/MEDIUM: http-ana: Handle NTLM messages correctly.")
tried to address an HTTP-reuse issue reported in github issue #511 by making
sure we properly detect extended NTLM responses, but made the match case-
sensitive while it's a token so it's case insensitive.
This should be backported to the same versions as the commit above.
(cherry picked from commit
49a1d28fcb69b87317ab7ae7f26505c69ec927d9)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Christopher Faulet [Thu, 7 May 2020 13:59:33 +0000 (15:59 +0200)]
BUG/MINOR: config: Make use_backend and use-server post-parsing less obscur
During use_backend and use-server post-parsing, if the log-format string used to
specify the backend or the server is just a single string, the log-format string
(a list, internally) is replaced by the string itself. Because the field is an
union, the list is not emptied according to the rules of art. The element, when
released, is not removed from the list. There is no bug, but it is clearly not
obvious and error prone.
This patch should fix #544. The fix for the use_backend post-parsing may be
backported to all stable releases. use-server is static in 2.1 and prior.
(cherry picked from commit
f82ea4ae4ca4a6fca70a6e874643db887a39f037)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Adis Nezirovic [Tue, 5 May 2020 11:57:28 +0000 (13:57 +0200)]
BUG/MEDIUM: lua: Fix dumping of stick table entries for STD_T_DICT
The issue can easily be reproduced with "stick on" statement
backend BE_NAME
stick-table type ip size 1k
stick on src
and calling dump() method on BE_NAME stick table from Lua
Before the fix, HAProxy would return 500 and log something like
the following:
runtime error: attempt to index a string value from [C] method 'dump'
Where one would expect a Lua table like this:
{
["IP_ADDR"] = {
["server_id"] = 1,
["server_name"] = "srv1"
}
}
This patch needs to backported to 1.9 and later releases.
(cherry picked from commit
ad9f9ed3f48c296a3f0fbe961ab142161c79c461)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Willy Tarreau [Tue, 5 May 2020 14:13:36 +0000 (16:13 +0200)]
BUG/MINOR: threads: fix multiple use of argument inside HA_ATOMIC_UPDATE_{MIN,MAX}()
Just like in previous patch, it happens that HA_ATOMIC_UPDATE_MIN() and
HA_ATOMIC_UPDATE_MAX() would evaluate the (val) argument up to 3 times.
However this time it affects both thread and non-thread versions. It's
strange because the copy was properly performed for the (new) argument
in order to avoid this. Anyway it was done for the "val" one as well.
A quick code inspection showed that this currently has no effect as
these macros are fairly limited in usage.
It would be best to backport this for long-term stability (till 1.8)
but it will not fix an existing bug.
(cherry picked from commit
a4d9ee3d1cc7b321be420742263b67620036bb71)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Willy Tarreau [Tue, 5 May 2020 13:58:00 +0000 (15:58 +0200)]
BUG/MINOR: threads: fix multiple use of argument inside HA_ATOMIC_CAS()
When threads are disabled, HA_ATOMIC_CAS() becomes a simple compound
expression. However this expression presents a problem, which is that
its arguments are evaluated multiple times, once for the comparison
and once again for the assignement. This presents a risk of performing
some side-effect operations twice in the non-threaded case (e.g. in
case of auto-increment or function return).
The macro was rewritten using local copies for arguments like the
other macros do.
Fortunately a complete inspection of the code indicates that this case
currently never happens. It was however responsible for the strict-aliasing
warning emitted when building fd.c without threads but with 64-bit CAS.
This may be backported as far as 1.8 though it will not fix any existing
bug and is more of a long-term safety measure in case a future fix would
depend on this behavior.
(cherry picked from commit
d66345d6b0639dfc42986a4326e6f6f6d18e053e)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Christopher Faulet [Thu, 30 Apr 2020 07:57:40 +0000 (09:57 +0200)]
BUG/MINOR: sample: Set the correct type when a binary is converted to a string
A binary sample data can be converted, implicitly or not, to a string by cutting
the buffer on the first null byte.
I guess this patch should be backported to all stable versions.
(cherry picked from commit
472ad51edeb4bb5ea601d7e296b5d0a0cd1a140a)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
William Dauchy [Mon, 4 May 2020 11:52:40 +0000 (13:52 +0200)]
CLEANUP: connections: align function declaration
srv_cleanup_connections() is supposed to be static, so mark it as so.
This patch should be backported where commit
6318d33ce625
("BUG/MEDIUM: connections: force connections cleanup on server changes")
will be backported, that is to say v1.9 to v2.1.
Fixes:
6318d33ce625 ("BUG/MEDIUM: connections: force connections cleanup
on server changes")
Signed-off-by: William Dauchy <w.dauchy@criteo.com>
(cherry picked from commit
707ad328ef6f9061f2da04f81431bc2fe2b2348a)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Dragan Dosen [Mon, 4 May 2020 07:07:28 +0000 (09:07 +0200)]
BUG/MEDIUM: ssl: fix the id length check within smp_fetch_ssl_fc_session_id()
After we call SSL_SESSION_get_id(), the length of the id in bytes is
stored in "len", which was never checked. This could cause unexpected
behavior when using the "ssl_fc_session_id" or "ssl_bc_session_id"
fetchers (eg. the result can be an empty value).
The issue was introduced with commit 105599c ("BUG/MEDIUM: ssl: fix
several bad pointer aliases in a few sample fetch functions").
This patch must be backported to 2.1, 2.0, and 1.9.
(cherry picked from commit
f35d69e7fc13aab89afcf394c5b96133d3060c1a)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Christopher Faulet [Mon, 4 May 2020 07:01:45 +0000 (09:01 +0200)]
BUG/MEDIUM: h1: Don't compare host and authority if only h1 headers are parsed
When only request headers are parsed, the host header should not be compared to
the request authority because no start-line was parsed. Thus there is no
authority.
Till now this bug was hidden because this parsing mode was only used for the
response in the FCGI multiplexer. Since the HTTP checks refactoring, the request
headers may now also be parsed without the start-line.
This patch fixes the issue #610. It must be backported to 2.1.
(cherry picked from commit
7032a3fd0a71426d338190ec0eeb86f5c547c3dd)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
William Dauchy [Sat, 2 May 2020 19:52:36 +0000 (21:52 +0200)]
BUG/MEDIUM: connections: force connections cleanup on server changes
I've been trying to understand a change of behaviour between v2.2dev5 and
v2.2dev6. Indeed our probe is regularly testing to add and remove
servers on a given backend such as:
# echo "show servers state be_foo" | sudo socat stdio /var/lib/haproxy/stats
113 be_foo 1 srv0 10.236.139.34 2 0 1 1 263 15 3 4 6 0 0 0 - 31255 -
113 be_foo 2 srv1 0.0.0.0 0 1 256 256 0 15 3 0 14 0 0 0 - 0 -
-> curl on the corresponding frontend: reply from server:31255
# echo "set server be_foo/srv1 addr 10.236.139.34 port 31257" | sudo socat stdio /var/lib/haproxy/stats
IP changed from '0.0.0.0' to '10.236.139.34', port changed from '0' to '31257' by 'stats socket command'
# echo "set server be_foo/srv1 weight 256" | sudo socat stdio /var/lib/haproxy/stats
# echo "set server be_foo/srv1 check-port 8500" | sudo socat stdio /var/lib/haproxy/stats
health check port updated.
# echo "set server be_foo/srv1 state ready" | sudo socat stdio /var/lib/haproxy/stats
# echo "show servers state be_foo" | sudo socat stdio /var/lib/haproxy/stats
113 be_foo 1 srv0 10.236.139.34 2 0 1 1 105 15 3 4 6 0 0 0 - 31255 -
113 be_foo 2 srv1 10.236.139.34 2 0 256 256 2319 15 3 2 6 0 0 0 - 31257 -
-> curl on the corresponding frontend: reply for server:31257
(notice the difference of weight)
# echo "set server be_foo/srv1 state maint" | sudo socat stdio /var/lib/haproxy/stats
# echo "set server be_foo/srv1 addr 0.0.0.0 port 0" | sudo socat stdio /var/lib/haproxy/stats
IP changed from '10.236.139.34' to '0.0.0.0', port changed from '31257' to '0' by 'stats socket command'
# echo "show servers state be_foo" | sudo socat stdio /var/lib/haproxy/stats
113 be_foo 1 srv0 10.236.139.34 2 0 1 1 263 15 3 4 6 0 0 0 - 31255 -
113 be_foo 2 srv1 0.0.0.0 0 1 256 256 0 15 3 0 14 0 0 0 - 0 -
-> curl on the corresponding frontend: reply from server:31255
# echo "set server be_foo/srv1 addr 10.236.139.34 port 31256" | sudo socat stdio /var/lib/haproxy/stats
IP changed from '0.0.0.0' to '10.236.139.34', port changed from '0' to '31256' by 'stats socket command'
# echo "set server be_foo/srv1 weight 256" | sudo socat stdio /var/lib/haproxy/stats
# echo "set server be_foo/srv1 check-port 8500" | sudo socat stdio /var/lib/haproxy/stats
health check port updated.
# echo "set server be_foo/srv1 state ready" | sudo socat stdio /var/lib/haproxy/stats
# echo "show servers state be_foo" | sudo socat stdio /var/lib/haproxy/stats
113 be_foo 1 srv0 10.236.139.34 2 0 1 1 105 15 3 4 6 0 0 0 - 31255 -
113 be_foo 2 srv1 10.236.139.34 2 0 256 256 2319 15 3 2 6 0 0 0 - 31256 -
-> curl on the corresponding frontend: reply from server:31257 (!)
Here we indeed would expect to get an anver from server:31256. The issue
is highly linked to the usage of `pool-purge-delay`, with a value which
is higher than the duration of the test, 10s in our case.
a git bisect between dev5 and dev6 seems to show commit
079cb9af22da6 ("MEDIUM: connections: Revamp the way idle connections are killed")
being the origin of this new behaviour.
So if I understand the later correctly, it seems that it was more a
matter of chance that we did not saw the issue earlier.
My patch proposes to force clean idle connections in the two following
cases:
- we set a (still running) server to maintenance
- we change the ip/port of a server
This commit should be backported to 2.1, 2.0, and 1.9.
Signed-off-by: William Dauchy <w.dauchy@criteo.com>
(cherry picked from commit
6318d33ce62580d6ba6d418890e69300715291c4)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Christopher Faulet [Sat, 2 May 2020 07:21:24 +0000 (09:21 +0200)]
BUG/MEDIUM: mux-fcgi: Fix wrong test on FCGI_CF_KEEP_CONN in fcgi_detach()
When a stream is detached from its connection, we try to move the connection in
an idle list to keep it opened, the session one or the server one. But it must
only be done if there is no connection error and if we want to keep it
open. This last statement is true if FCGI_CF_KEEP_CONN flag is set. But the test
is inverted at the stage.
This patch must be backported to 2.1.
(cherry picked from commit
9bcd973a818bcc737f627ef9cef211b33229f18f)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Christopher Faulet [Sat, 2 May 2020 07:17:52 +0000 (09:17 +0200)]
BUG/MEDIUM: mux_fcgi: Free the FCGI connection at the end of fcgi_release()
fcgi_release() function is responsible to release a FCGI connection. But the
release of the connection itself is missing.
This patch must be backported to 2.1.
(cherry picked from commit
8694f25040958cfab9e592bee821bc5769d1a1b5)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Christopher Faulet [Wed, 20 May 2020 14:30:32 +0000 (16:30 +0200)]
BUG/MINOR: checks: Remove a warning about http health checks
A warning message was backported with the commit 104d66a ("MINOR: checks: Add a
way to send custom headers and payload during http chekcs"). We try to avoid to
backport such warning in stable versions. Thus, the warning is removed.
The patch was directly introduced on 2.1, there is no upstream commit ID for
this patch. It must be backported everywhere the abobe commit is.
Christopher Faulet [Wed, 20 May 2020 14:15:02 +0000 (16:15 +0200)]
BUG/MINOR: checks: Compute the right HTTP request length for HTTP health checks
A bug was introduced by the commit 104d66a ("MINOR: checks: Add a way to send
custom headers and payload during http chekcs"). When the method and the uri are
configured for an HTTP health check, but not the version, the request length is
not properly computed. One byte is missing (the space between the method and the
uri).
This patch should fix the issue #641. The patch was directly introduced on 2.1,
there is no upstream commit ID for this patch. It must be backported everywhere
the abobe commit is.
Christopher Faulet [Mon, 27 Apr 2020 10:13:06 +0000 (12:13 +0200)]
BUG/MINOR: checks/server: use_ssl member must be signed
(cherry picked from commit
4a8c026117f2f19bd2f8f0ac64954c8a4f717bc8)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
[Cf: Forgot to mention to backport it with the commit
8a5bf35ce
("BUG/MINOR: checks: Respect the no-check-ssl option")]
Willy Tarreau [Thu, 7 May 2020 16:13:07 +0000 (18:13 +0200)]
Revert "BUG/MINOR: connection: make sure to correctly tag local PROXY connections"
This reverts commit
f1b378487af3f22fd66b8013e52530c9a6c522de.
As explained in this thread, it breaks Dovecot:
https://www.mail-archive.com/haproxy@formilux.org/msg36890.html
Sadly, the proposed fix there in turn broke Apache 2.4. A different
(likely configurable) solution will be needed.
This patch must also be reverted from 2.0.
Willy Tarreau [Thu, 7 May 2020 16:11:49 +0000 (18:11 +0200)]
Revert "BUG/MINOR: connection: always send address-less LOCAL PROXY connections"
This reverts commit
edeae2af45fe5f918031b559b8cc62d3c3abc919.
As reported by Olivier Doucet here, it breaks Apache 2.4:
https://www.mail-archive.com/haproxy@formilux.org/msg37218.html
Christopher Faulet [Wed, 29 Apr 2020 12:32:26 +0000 (14:32 +0200)]
REGTEST: http-rules: Require PCRE or PCRE2 option to run map_redirect script
Only PCRE was specified as required option to execute this script. But PCRE2 is
an valid alternative.
(cherry picked from commit
8aa825a35686520420ad65dcdb1219285a22c059)
Signed-off-by: Willy Tarreau <w@1wt.eu>
William Lallemand [Tue, 28 Apr 2020 19:52:38 +0000 (21:52 +0200)]
REGTEST: ssl: test the client certificate authentication
This reg-test tests the client auth feature of HAProxy for both the
backend and frontend section with a CRL list.
This reg-test uses 2 chained listeners because vtest does not handle the
SSL. Test the frontend client auth and the backend side at the same
time.
It sends 3 requests: one with a correct certificate, one with an expired
one and one which was revoked. The client then checks if we received the
right one with the right error.
Certificates, CA and CRL are expiring in 2050 so it should be fine for
the CI.
This test could be backported as far as HAProxy 1.6
(cherry picked from commit
2e0dbb74126115e91adcf584f671cd9d1952ddd9)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Willy Tarreau [Thu, 16 Apr 2020 13:14:17 +0000 (15:14 +0200)]
BUILD: Makefile: add linux-musl to TARGET
Other users are using musl, namely on Docker. It builds fine with
linux-glibc-legacy but not linux-glibc, which needs to first disable
USE_BACKTRACE. Better add a valid entry for it instead of hacking
around another libc.
(cherry picked from commit
39b2fda9155ee0233585e48701274756337796aa)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Willy Tarreau [Wed, 4 Mar 2020 09:31:58 +0000 (10:31 +0100)]
BUILD: tools: rely on __ELF__ not USE_DL to enable use of dladdr()
The approach was wrong. USE_DL is for the makefile to know if it's required
to append -ldl at link time. Some platforms do not need it (and in fact do
not have it) yet they have a working dladdr(). The real condition is related
to ELF. Given that due to Lua, all platforms that require -ldl already have
USE_DL set, let's replace USE_DL with __ELF__ here and consider that dladdr
is always needed on ELF, which basically is already the case.
(cherry picked from commit
109201fc5c64ea1ae98e8f75f6890efa88d981bf)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Willy Tarreau [Wed, 4 Mar 2020 09:19:36 +0000 (10:19 +0100)]
BUILD: tools: unbreak resolve_sym_name() on non-GNU platforms
resolve_sym_name() doesn't build when USE_DL is set on non-GNU platforms
because "Elf(W)" isn't defined. Since it's only used for dladdr1(), let's
refactor all this so that we can completely ifdef out that part on other
platforms. Now we have a separate function to perform the call depending
on the platform and it only returns the size when available.
(cherry picked from commit
9133e48f2a6fa9deccfe02798dcb2f2abd26ed9e)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Willy Tarreau [Wed, 4 Mar 2020 06:39:32 +0000 (07:39 +0100)]
MINOR: debug: dump the whole trace if we can't spot the starting point
Instead of special-casing the use of the symbol resolving to decide
whether to dump a partial or complete trace, let's simply start over
and dump everything when we reach the end after having found nothing.
It will be more robust against dirty traces as well.
(cherry picked from commit
a91b7946bdd69fd837085f67f085565317efb1b1)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Willy Tarreau [Wed, 4 Mar 2020 10:54:16 +0000 (11:54 +0100)]
MINOR: debug: use our own backtrace function on clang+x86_64
A test on FreeBSD with clang 4 to 8 produces this on a call to a
spinning loop on the CLI:
call trace(5):
| 0x53e2bc [eb 16 48 63 c3 48 c1 e0]: wdt_handler+0x10c
| 0x800e02cfe [e8 5d 83 00 00 8b 18 8b]: libthr:pthread_sigmask+0x53e
with our own function it correctly produces this:
call trace(20):
| 0x53e2dc [eb 16 48 63 c3 48 c1 e0]: wdt_handler+0x10c
| 0x800e02cfe [e8 5d 83 00 00 8b 18 8b]: libthr:pthread_sigmask+0x53e
| 0x800e022bf [48 83 c4 38 5b 41 5c 41]: libthr:pthread_getspecific+0xdef
| 0x7ffffffff003 [48 8d 7c 24 10 6a 00 48]: main+0x7fffffb416f3
| 0x801373809 [85 c0 0f 84 6f ff ff ff]: libc:__sys_gettimeofday+0x199
| 0x801373709 [89 c3 85 c0 75 a6 48 8b]: libc:__sys_gettimeofday+0x99
| 0x801371c62 [83 f8 4e 75 0f 48 89 df]: libc:gettimeofday+0x12
| 0x51fa0a [48 89 df 4c 89 f6 e8 6b]: ha_thread_dump_all_to_trash+0x49a
| 0x4b723b [85 c0 75 09 49 8b 04 24]: mworker_cli_sockpair_new+0xd9b
| 0x4b6c68 [85 c0 75 08 4c 89 ef e8]: mworker_cli_sockpair_new+0x7c8
| 0x532f81 [4c 89 e7 48 83 ef 80 41]: task_run_applet+0xe1
So let's add clang+x86_64 to the list of platforms that will use our
simplified version. As a bonus it will not require to link with
-lexecinfo on FreeBSD and will work out of the box when passing
USE_BACKTRACE=1.
(cherry picked from commit
899e5f69a1085983f52a46cd842921874fd2de36)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Willy Tarreau [Wed, 4 Mar 2020 06:44:06 +0000 (07:44 +0100)]
MINOR: debug: improve backtrace() on aarch64 and possibly other systems
It happens that on aarch64 backtrace() only returns one entry (tested
with gcc 4.7.4, 5.5.0 and 7.4.1). Probably that it refrains from unwinding
the stack due to the risk of hitting a bad pointer. Here we can use
may_access() to know when it's safe, so we can actually unwind the stack
without taking risks. It happens that the faulting function (the one
just after the signal handler) is not listed here, very likely because
the signal handler uses a special stack and did not create a new frame.
So this patch creates a new my_backtrace() function in standard.h that
either calls backtrace() or does its own unrolling. The choice depends
on HA_HAVE_WORKING_BACKTRACE which is set in compat.h based on the build
target.
(cherry picked from commit
13faf16e1ebfc93e49dc5a8948519fad600e1136)
[wt: adjusted context in debug.c]
Signed-off-by: Willy Tarreau <w@1wt.eu>
Willy Tarreau [Wed, 4 Mar 2020 06:38:23 +0000 (07:38 +0100)]
MINOR: debug: report the number of entries in the backtrace
It's useful to get an indication of unresolved stuff or memory
corruption to have the apparent depth of the stack trace in the
output, especially if we dump nothing.
(cherry picked from commit
cdd8074433d8ada867ba9b2bc11db098fd363035)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Willy Tarreau [Wed, 4 Mar 2020 09:53:07 +0000 (10:53 +0100)]
MINOR: wdt: do not depend on USE_THREAD
There is no reason for restricting the use of the watchdog to threads
anymore, as it works perfectly without threads as well.
(cherry picked from commit
e58114e0e5e1ef1f67194bde0b0b8d159bd3ac48)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Willy Tarreau [Wed, 4 Mar 2020 07:31:47 +0000 (08:31 +0100)]
BUILD: Makefile: include librt before libpthread
Statically building on for i386/x86_64 on linux+glibc 2.18 fails in rt with
undefined references to pthread_attr_init and a few others. Let's just swap
the two libs in order to fix this.
(cherry picked from commit
c0bbdc196ded7b6d28221ca9b96f0cf8b41203ab)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Willy Tarreau [Wed, 4 Mar 2020 05:01:40 +0000 (06:01 +0100)]
MINOR: debug: call backtrace() once upon startup
Calling backtrace() will access libgcc at runtime. We don't want to do
it after the chroot, so let's perform a first call to have it ready in
memory for later use.
(cherry picked from commit
0214b45a61cd7cfd59d729b23f497a687192cde6)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Willy Tarreau [Tue, 3 Mar 2020 14:40:23 +0000 (15:40 +0100)]
MEDIUM: debug: add support for dumping backtraces of stuck threads
When a panic() occurs due to a stuck thread, we'll try to dump a
backtrace of this thread if the config directive USE_BACKTRACE is
set (which is the case on linux+glibc). For this we use the
backtrace() call provided by glibc and iterate the pointers through
resolve_sym_name(). In order to minimize the output (which is limited
to one buffer), we only do this for stuck threads, and we start the
dump above ha_panic()/ha_thread_dump_all_to_trash(), and stop when
meeting known points such as main/run_tasks_from_list/run_poll_loop.
If enabled without USE_DL, the dump will be complete with no details
except that pointers will all be given relative to main, which is
still better than nothing.
The new USE_BACKTRACE config option is enabled by default on glibc since
it has been present for ages. When it is set, the export-dynamic linker
option is enabled so that all non-static symbols are properly resolved.
(cherry picked from commit
f5b4e064dcb1f7c97c87b68dbbbf7a4371e05bc7)
[wt: adjusted context in makefile and debug.c ;
s/run_tasks_from_list/process_runnable_tasks for 2.1 and older]
Signed-off-by: Willy Tarreau <w@1wt.eu>
Willy Tarreau [Tue, 3 Mar 2020 16:29:58 +0000 (17:29 +0100)]
MINOR: cli: make "show fd" rely on resolve_sym_name()
This way we can drop all hard-coded iocb matching.
(cherry picked from commit
cf12f2ee6665281342487a0a3517ba3286245550)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Willy Tarreau [Tue, 3 Mar 2020 16:13:02 +0000 (17:13 +0100)]
MINOR: debug: use resolve_sym_name() to dump task handlers
Now in "show threads", the task/tasklet handler will be resolved
using this function, which will provide more detailed results and
will still support offsets to main for unresolved symbols.
(cherry picked from commit
2e89b0930b32b6f694df1c7b35271686acc7cb93)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Willy Tarreau [Tue, 3 Mar 2020 16:09:08 +0000 (17:09 +0100)]
MINOR: tools: add resolve_sym_name() to resolve function pointers
We use various hacks at a few places to try to identify known function
pointers in debugging outputs (show threads & show fd). Let's centralize
this into a new function dedicated to this. It already knows about the
functions matched by "show threads" and "show fd", and when built with
USE_DL, it can rely on dladdr1() to resolve other functions. There are
some limitations, as static functions are not resolved, linking with
-rdynamic is mandatory, and even then some functions will not necessarily
appear. It's possible to do a better job by rebuilding the whole symbol
table from the ELF headers in memory but it's less portable and the gains
are still limited, so this solution remains a reasonable tradeoff.
(cherry picked from commit
eb8b1ca3eb4c8d4688e1a4a1d9c1b91f68324e09)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Willy Tarreau [Tue, 3 Mar 2020 14:57:10 +0000 (15:57 +0100)]
MINOR: tools: add new function dump_addr_and_bytes()
This function dumps <n> bytes from <addr> in hex form into buffer <buf>
enclosed in brackets after the address itself, formatted on 14 chars
including the "0x" prefix. This is meant to be used as a prefix for code
areas. For example: "0x7f10b6557690 [48 c7 c0 0f 00 00 00 0f]: "
It relies on may_access() to know if the bytes are dumpable, otherwise "--"
is emitted. An optional prefix is supported.
(cherry picked from commit
762fb3ec8e05ca7a61b28e8a876c194c9db591ef)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Willy Tarreau [Tue, 3 Mar 2020 13:59:56 +0000 (14:59 +0100)]
MINOR: haproxy: export run_poll_loop
This will help refine debug traces.
(cherry picked from commit
3ebd55ee5154910241cc7541772e039f510b6750)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Willy Tarreau [Fri, 1 May 2020 14:57:02 +0000 (16:57 +0200)]
MINOR: stream: report the list of active filters on stream crashes
Now we very rarely catch spinning streams, and whenever we catch one it
seems a filter is involved, but we currently report no info about them.
Let's print the list of enabled filters on the stream with such a crash
to help with the reports. A typical output will now look like this:
[ALERT] 121/165908 (1110) : A bogus STREAM [0x7fcaf4016a60] is spinning at 2 calls per second and refuses to die, aborting now! Please report this error to developers [strm=0x7fcaf4016a60 src=127.0.0.1 fe=l1 be=l1 dst=<CACHE> rqf=
6dc42000 rqa=48000 rpf=
a0040223 rpa=
24000000 sif=EST,10008 sib=DIS,80110 af=(nil),0 csf=0x7fcaf4023c00,10c000 ab=0x7fcaf40235f0,4 csb=(nil),0 cof=0x7fcaf4016610,1300:H1(0x7fcaf4016840)/RAW((nil))/tcpv4(29) cob=(nil),0:NONE((nil))/NONE((nil))/NONE(0) filters={0x7fcaf4016fb0="cache store filter", 0x7fcaf4017080="compression filter"}]
This may be backported to 2.0.
(cherry picked from commit
9753d61288936db804628c1e0bad20b7593ad384)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Willy Tarreau [Fri, 1 May 2020 11:15:32 +0000 (13:15 +0200)]
BUG/MEDIUM: shctx: bound the number of loops that can happen around the lock
Given that a "count" value of 32M was seen in _shctx_wait4lock(), it
is very important to prevent this from happening again. It's absolutely
essential to prevent the value from growing unbounded because with an
increase of the number of threads, the number of successive failed
attempts will necessarily grow.
Instead now we're scanning all 2^p-1 values from 3 to 255 and are
bounding to count to 255 so that in the worst case each thread tries an
xchg every 255 failed read attempts. That's one every 4 on average per
thread when there are 64 threads, which corresponds to the initial count
of 4 for the first attempt so it seems like a reasonable value to keep a
low latency.
The bug was introduced with the shctx entries in 1.5 so the fix must
be backported to all versions. Before 1.8 the function was called
_shared_context_wait4lock() and was in shctx.c.
(cherry picked from commit
86c6a9221aa5bd7516300f399e87aba09d61de06)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Willy Tarreau [Fri, 1 May 2020 11:05:29 +0000 (13:05 +0200)]
BUG/MEDIUM: shctx: really check the lock's value while waiting
Jérôme reported an amazing crash in the spinlock version of
_shctx_wait4lock() with an extremely high <count> value of 32M! The
root cause is that the function cannot deal with contention on the lock
at all because it forgets to check if the lock's value has changed! As
such, every time it's called due to a contention, it waits twice as
long before trying again and lets the caller check for the contention
by itself.
The correct thing to do is to compare the value again at each loop.
This way it makes sure to mostly perform read accesses on the shared
cache line without writing too often, and to be ready fast enough to
try to grab the lock. And we must not increase the count on success
either!
Unfortunately I'd have expected to see a performance boost on the cache
with this but there was absolutely no change, so it's very likely that
these issues only happen once in a while and are sufficient to derail
the process when they strike, but not to have a permanent performance
impact.
The bug was introduced with the shctx entries in 1.5 so the fix must
be backported to all versions. Before 1.8 the function was called
_shared_context_wait4lock() and was in shctx.c.
(cherry picked from commit
3801bdc3fc1c2c6d596b4c4ee3449a76f2da8654)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Willy Tarreau [Fri, 1 May 2020 10:26:03 +0000 (12:26 +0200)]
BUG/MINOR: debug: properly use long long instead of long for the thread ID
I changed my mind twice on this one and pushed after the last test with
threads disabled, without re-enabling long long, causing this rightful
build warning.
This needs to be backported if the previous commit
ff64d3b027 ("MINOR:
threads: export the POSIX thread ID in panic dumps") is backported as
well.
(cherry picked from commit
f0e5da20e1a6e9b532f7de66079da5fa53e9d01c)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Willy Tarreau [Fri, 1 May 2020 09:28:49 +0000 (11:28 +0200)]
MINOR: threads: export the POSIX thread ID in panic dumps
It is very difficult to map a panic dump against a gdb thread dump
because the thread numbers do not match. However gdb provides the
pthread ID but this one is supposed to be opaque and not to be cast
to a scalar.
This patch provides a fnuction, ha_get_pthread_id() which retrieves
the pthread ID of the indicated thread and casts it to an unsigned
long long so as to lose the least possible amount of information from
it. This is done cleanly using a union to maintain alignment so as
long as these IDs are stored on 1..8 bytes they will be properly
reported. This ID is now presented in the panic dumps so it now
becomes possible to map these threads. When threads are disabled,
zero is returned. For example, this is a panic dump:
Thread 1 is about to kill the process.
*>Thread 1 : id=0x7fe92b825180 act=0 glob=0 wq=1 rq=0 tl=0 tlsz=0 rqsz=0
stuck=1 prof=0 harmless=0 wantrdv=0
cpu_ns: poll=5119122 now=
2009446995 diff=
2004327873
curr_task=0xc99bf0 (task) calls=4 last=0
fct=0x592440(task_run_applet) ctx=0xca9c50(<CLI>)
strm=0xc996a0 src=unix fe=GLOBAL be=GLOBAL dst=<CLI>
rqf=848202 rqa=0 rpf=
80048202 rpa=0 sif=EST,200008 sib=EST,204018
af=(nil),0 csf=0xc9ba40,8200
ab=0xca9c50,4 csb=(nil),0
cof=0xbf0e50,1300:PASS(0xc9cee0)/RAW((nil))/unix_stream(20)
cob=(nil),0:NONE((nil))/NONE((nil))/NONE(0)
call trace(20):
| 0x59e4cf [48 83 c4 10 5b 5d 41 5c]: wdt_handler+0xff/0x10c
| 0x7fe92c170690 [48 c7 c0 0f 00 00 00 0f]: libpthread:+0x13690
| 0x7ffce29519d9 [48 c1 e2 20 48 09 d0 48]: linux-vdso:+0x9d9
| 0x7ffce2951d54 [eb d9 f3 90 e9 1c ff ff]: linux-vdso:__vdso_gettimeofday+0x104/0x133
| 0x57b484 [48 89 e6 48 8d 7c 24 10]: main+0x157114
| 0x50ee6a [85 c0 75 76 48 8b 55 38]: main+0xeaafa
| 0x50f69c [48 63 54 24 20 85 c0 0f]: main+0xeb32c
| 0x59252c [48 c7 c6 d8 ff ff ff 44]: task_run_applet+0xec/0x88c
Thread 2 : id=0x7fe92b6e6700 act=0 glob=0 wq=0 rq=0 tl=0 tlsz=0 rqsz=0
stuck=0 prof=0 harmless=1 wantrdv=0
cpu_ns: poll=786738 now=1086955 diff=300217
curr_task=0
Thread 3 : id=0x7fe92aee5700 act=0 glob=0 wq=0 rq=0 tl=0 tlsz=0 rqsz=0
stuck=0 prof=0 harmless=1 wantrdv=0
cpu_ns: poll=828056 now=1129738 diff=301682
curr_task=0
Thread 4 : id=0x7fe92a6e4700 act=0 glob=0 wq=0 rq=0 tl=0 tlsz=0 rqsz=0
stuck=0 prof=0 harmless=1 wantrdv=0
cpu_ns: poll=818900 now=1153551 diff=334651
curr_task=0
And this is the gdb output:
(gdb) info thr
Id Target Id Frame
* 1 Thread 0x7fe92b825180 (LWP 15234) 0x00007fe92ba81d6b in raise () from /lib64/libc.so.6
2 Thread 0x7fe92b6e6700 (LWP 15235) 0x00007fe92bb56a56 in epoll_wait () from /lib64/libc.so.6
3 Thread 0x7fe92a6e4700 (LWP 15237) 0x00007fe92bb56a56 in epoll_wait () from /lib64/libc.so.6
4 Thread 0x7fe92aee5700 (LWP 15236) 0x00007fe92bb56a56 in epoll_wait () from /lib64/libc.so.6
We can clearly see that while threads 1 and 2 are the same, gdb's
threads 3 and 4 respectively are haproxy's threads 4 and 3.
This may be backported to 2.0 as it removes some confusion in github issues.
(cherry picked from commit
ff64d3b027d968a29b5bac55121bbce591fe54b2)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Willy Tarreau [Fri, 1 May 2020 07:51:11 +0000 (09:51 +0200)]
BUG/MEDIUM: listener: mark the thread as not stuck inside the loop
We tried hard to make sure we report threads as not stuck at various
crucial places, but one of them is special, it's the listener_accept()
function. The reason it is special is because it will loop a certain
number of times (default: 64) accepting incoming connections, allocating
resources, dispatching them to other threads or running L4 rules on them,
and while all of this is supposed to be extremely fast, when the machine
slows down or runs low on memory, the expectedly small delays in malloc()
caused by contention with other threads can quickly accumulate and suddenly
become critical to the point of triggering the watchdog. Furthermore, it
is technically possible to trigger this by pure configuration by setting
a huge tune.maxaccept value, which should not be possible.
Given that each operation isn't related to the same task but to a different
one each time, it is appropriate to mark the thread as not stuck each time
it accepts new work that possibly gets dispatched to other threads which
execute it.
This looks like this could be a good reason for the issue reported in
issue #388.
This fix must be backported to 2.0.
(cherry picked from commit
8d2c98b76c505e199ba3abac632fa98aab9f7b20)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Willy Tarreau [Wed, 29 Apr 2020 09:59:02 +0000 (11:59 +0200)]
BUG/MEDIUM: sample: make the CPU and latency sample fetches check for a stream
cpu_calls, cpu_ns_avg, cpu_ns_tot, lat_ns_avg and lat_ns_tot depend on the
stream to find the current task and must check for it or they may cause a
crash if misused or used in a log-format string after commit
5f940703b3
("MINOR: log: Don't depends on a stream to process samples in log-format
string").
This must be backported as far as 1.9.
(cherry picked from commit
e0dd210cea36972a6f566b9b49d8f0c33340df99)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Willy Tarreau [Wed, 29 Apr 2020 09:50:38 +0000 (11:50 +0200)]
BUG/MEDIUM: http: the "unique-id" sample fetch could crash without a steeam
Since commit
5f940703b3 ("MINOR: log: Don't depends on a stream to process
samples in log-format string") it has become quite obvious that a few sample
fetch functions and converters were still heavily dependent on the presence
of a stream without testing for it.
The unique-id sample fetch function, if called without a stream, will result
in a crash.
This fix adds a check for the stream's existence, and should be backported
to all stable versions up to 1.7.
(cherry picked from commit
a1062a4de8ba300979cc380b69b284f86dc13853)
[wt: adjusted ctx: unique_id was a ptr not an ist in 2.1]
Signed-off-by: Willy Tarreau <w@1wt.eu>
Willy Tarreau [Wed, 29 Apr 2020 09:52:13 +0000 (11:52 +0200)]
BUG/MEDIUM: http: the "http_first_req" sample fetch could crash without a steeam
Since commit
5f940703b3 ("MINOR: log: Don't depends on a stream to process
samples in log-format string") it has become quite obvious that a few sample
fetch functions and converters were still heavily dependent on the presence
of a stream without testing for it.
The http_first_req sample fetch function, if called without a stream, will
result in a crash.
This fix adds a check for the stream's existence, and should be backported
to all stable versions up to 1.6.
(cherry picked from commit
79512b6d8f2f049a86b2332d4f4dc972cc33d631)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Willy Tarreau [Wed, 29 Apr 2020 09:44:54 +0000 (11:44 +0200)]
BUG/MEDIUM: capture: capture.{req,res}.* crash without a stream
Since commit
5f940703b3 ("MINOR: log: Don't depends on a stream to process
samples in log-format string") it has become quite obvious that a few sample
fetch functions and converters were still heavily dependent on the presence
of a stream without testing for it.
The capture.req.hdr, capture.res.hdr, capture.req.method, capture.req.uri,
capture.req.ver and capture.res.ver sample fetches used to assume the
presence of a stream, which is not necessarily the case (especially after
the commit above) and would crash haproxy if incorrectly used. Let's make
sure they check for this stream.
This fix adds a check for the stream's existence, and should be backported
to all stable versions up to 1.6.
(cherry picked from commit
0898c2d2f2c5ae043a443550992efd90ec18b2ab)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Willy Tarreau [Wed, 29 Apr 2020 09:22:08 +0000 (11:22 +0200)]
BUG/MEDIUM: capture: capture-req/capture-res converters crash without a stream
Since commit
5f940703b3 ("MINOR: log: Don't depends on a stream to process
samples in log-format string") it has become quite obvious that a few sample
fetch functions and converters were still heavily dependent on the presence
of a stream without testing for it.
The capture-req and capture-res converters were in this case and could
crash the process if misused.
This fix adds a check for the stream's existence, and should be backported
to all stable versions up to 1.6.
(cherry picked from commit
5575896ba1120f136e680fd669afb5b084faa530)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Christopher Faulet [Fri, 24 Apr 2020 05:19:04 +0000 (07:19 +0200)]
BUG/MINOR: mux-fcgi: Be sure to have a connection as session's origin to use it
When default parameters are set in a request message, we get the client
connection using the session's origin. But it is not necessarily a
conncection. For instance, for health checks, thanks to recent changes, it may
be a check object. At this step, the client connection may be NULL. Thus, we
must be sure to have a client connection before using it.
This patch must be backported to 2.1.
(cherry picked from commit
bb86a0f7be6ba8a248321a59c4b865e01e33f106)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Christopher Faulet [Tue, 21 Apr 2020 09:48:53 +0000 (11:48 +0200)]
BUG/MINOR: obj_type: Handle stream object in obj_base_ptr() function
The stream object (OBJ_TYPE_STREAM) was missing in the switch statement of the
obj_base_ptr() function.
This patch must be backported as far as 2.0.
(cherry picked from commit
a142c1deb401308055f29b9d52a47e5a0ad962e7)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Gaetan Rivet [Thu, 13 Feb 2020 09:30:01 +0000 (10:30 +0100)]
BUG/MINOR: checks: chained expect will not properly wait for enough data
TCP check expect matching strings or binary patterns are able to know
prior to applying their match function whether the available data is
already sufficient to attempt the match or not.
As such, on insufficient data the expect is postponed. This behavior
avoids unnecessary matches when the data could not possibly match.
When chaining expect, upon passing the previous and going onto the next
however, this length check is not done again. Then the match is done and
will necessarily fail, triggering a new wait for more data. The end
result is the same for a slightly higher cost.
Check received data length for all expects in a chain.
This bug exists since the introduction of the feature:
Fixes:
5ecb77f4c76f ("MEDIUM: checks: add send/expect tcp based check")
Version 1.5+ impacted.
(cherry picked from commit
738ee76aa7a9968cb9c9610725c26622c50f0b33)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Christopher Faulet [Thu, 26 Mar 2020 18:48:20 +0000 (19:48 +0100)]
BUG/MEDIUM: server/checks: Init server check during config validity check
The options and directives related to the configuration of checks in a backend
may be defined after the servers declarations. So, initialization of the check
of each server must not be performed during configuration parsing, because some
info may be missing. Instead, it must be done during the configuration validity
check.
Thus, callback functions are registered to be called for each server after the
config validity check, one for the server check and another one for the server
agent-check. In addition deinit callback functions are also registered to
release these checks.
This patch should be backported as far as 1.7. But per-server post_check
callback functions are only supported since the 2.1. And the initcall mechanism
does not exist before the 1.9. Finally, in 1.7, the code is totally
different. So the backport will be harder on older versions.
(cherry picked from commit
8892e5d30b2c20fce04f2f44202bc1f127076c15)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Christopher Faulet [Fri, 27 Mar 2020 17:55:49 +0000 (18:55 +0100)]
BUG/MINOR: checks: Respect the no-check-ssl option
This options is used to force a non-SSL connection to check a SSL server or to
invert a check-ssl option inherited from the default section. The use_ssl field
in the check structure is used to know if a SSL connection must be used
(use_ssl=1) or not (use_ssl=0). The server configuration is used by default.
The problem is that we cannot distinguish the default case (no specific SSL
check option) and the case of an explicit non-SSL check. In both, use_ssl is set
to 0. So the server configuration is always used. For a SSL server, when
no-check-ssl option is set, the check is still performed using a SSL
configuration.
To fix the bug, instead of a boolean value (0=TCP, 1=SSL), we use a ternary value :
* 0 = use server config
* 1 = force SSL
* -1 = force non-SSL
The same is done for the server parameter. It is not really necessary for
now. But it is a good way to know is the server no-ssl option is set.
In addition, the PR_O_TCPCHK_SSL proxy option is no longer used to set use_ssl
to 1 for a check. Instead the flag is directly tested to prepare or destroy the
server SSL context.
This patch should be backported as far as 1.8.
(cherry picked from commit
f61f33a1b274c2a42afd96aab19ee8e1d8b121cc)
[wt: minor context adjustments]
Signed-off-by: Willy Tarreau <w@1wt.eu>
Christopher Faulet [Thu, 9 Apr 2020 06:44:06 +0000 (08:44 +0200)]
MINOR: checks: Add a way to send custom headers and payload during http chekcs
The 'http-check send' directive have been added to add headers and optionnaly a
payload to the request sent during HTTP healthchecks. The request line may be
customized by the "option httpchk" directive but there was not official way to
add extra headers. An old trick consisted to hide these headers at the end of
the version string, on the "option httpchk" line. And it was impossible to add
an extra payload with an "http-check expect" directive because of the
"Connection: close" header appended to the request (See issue #16 for details).
So to make things official and fully support payload additions, the "http-check
send" directive have been added :
option httpchk POST /status HTTP/1.1
http-check send hdr Content-Type "application/json;charset=UTF-8" \
hdr X-test-1 value1 hdr X-test-2 value2 \
body "{id: 1, field: \"value\"}"
When a payload is defined, the Content-Length header is automatically added. So
chunk-encoded requests are not supported yet. For now, there is no special
validity checks on the extra headers.
This patch is inspired by Kiran Gavali's work. It should fix the issue #16 and
as far as possible, it may be backported, at least as far as 1.8.
(cherry picked from commit
8acb1284bcd47301fce0aa45a7083dbcef94cb5f)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Christopher Faulet [Sun, 26 Apr 2020 07:50:31 +0000 (09:50 +0200)]
BUG/MINOR: check: Update server address and port to execute an external check
Server address and port may change at runtime. So the address and port passed as
arguments and as environment variables when an external check is executed must
be updated. The current number of connections on the server was already updated
before executing the command. So the same mechanism is used for the server
address and port. But in addition, command arguments are also updated.
This patch must be backported to all stable versions. It should fix the
issue #577.
(cherry picked from commit
aaae9a0e9952bef30bb0da0eecd714f2db64d545)
Signed-off-by: Willy Tarreau <w@1wt.eu>
William Lallemand [Sat, 25 Apr 2020 20:03:29 +0000 (22:03 +0200)]
MINOR: contrib: make the peers wireshark dissector a plugin
The wireshark dissector could only be build within wireshark, which
means maintaining a wireshark binary just for this dissector. It was not
really convenient to update wireshark because of this.
This patch converts the dissector into a .so plugin which is built with
the .h found in distributions instead of the whole wireshark sources.
(cherry picked from commit
2be58f758402010f35d67279bec7125c9665e936)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Willy Tarreau [Fri, 24 Apr 2020 04:15:24 +0000 (06:15 +0200)]
MEDIUM: memory: make pool_gc() run under thread isolation
pool_gc() causes quite some stress on the memory allocator because
it calls a lot of free() calls while other threads are using malloc().
In addition, pool_gc() needs to take care of possible locking because
it may be called from pool allocators. One way to avoid all this is to
use thread_isolate() to make sure the gc runs alone. By putting less
pressure on the pools and getting rid of the locks, it may even take
less time to complete.
(cherry picked from commit
c0e2ff202bda535aa60f3dc272ccf19a7b4f5094)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Jerome Magnin [Thu, 23 Apr 2020 17:01:17 +0000 (19:01 +0200)]
DOC: option logasap does not depend on mode
The documentation for option logasap misleads into thinking it is
only valid for mode http. It is actually valid for mode tcp too,
so this patch tries to disambiguate the current wording.
(cherry picked from commit
95fb57b92331c211f69af333cf5e043284ffbf25)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Willy Tarreau [Thu, 23 Apr 2020 15:54:47 +0000 (17:54 +0200)]
BUG/MINOR: http: make url_decode() optionally convert '+' to SP
The url_decode() function used by the url_dec converter and a few other
call points is ambiguous on its processing of the '+' character which
itself isn't stable in the spec. This one belongs to the reserved
characters for the query string but not for the path nor the scheme,
in which it must be left as-is. It's only in argument strings that
follow the application/x-www-form-urlencoded encoding that it must be
turned into a space, that is, in query strings and POST arguments.
The problem is that the function is used to process full URLs and
paths in various configs, and to process query strings from the stats
page for example.
This patch updates the function to differentiate the situation where
it's parsing a path and a query string. A new argument indicates if a
query string should be assumed, otherwise it's only assumed after seeing
a question mark.
The various locations in the code making use of this function were
updated to take care of this (most call places were using it to decode
POST arguments).
The url_dec converter is usually called on path or url samples, so it
needs to remain compatible with this and will default to parsing a path
and turning the '+' to a space only after a question mark. However in
situations where it would explicitly be extracted from a POST or a
query string, it now becomes possible to enforce the decoding by passing
a non-null value in argument.
It seems to be what was reported in issue #585. This fix may be
backported to older stable releases.
(cherry picked from commit
62ba9ba6ca259f45f7bd8de436b743b3ad9ac04a)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Willy Tarreau [Thu, 23 Apr 2020 15:08:02 +0000 (17:08 +0200)]
BUG/MINOR: tools: fix the i386 version of the div64_32 function
As reported in issue #596, the edx register isn't marked as clobbered
in div64_32(), which could technically allow gcc to try to reuse it
if it needed a copy of the 32 highest bits of the o1 register after
the operation.
Two attempts were tried, one using a dummy 32-bit local variable to
store the intermediary edx and another one switching to "=A" and making
result a long long. It turns out the former makes the resulting object
code significantly dirtier while the latter makes it better and was
kept. This is due to gcc's difficulties at working with register pairs
mixing 32- and 64- bit values on i386. It was verified that no code
change happened at all on x86_64, armv7, aarch64 nor mips32.
In practice it's only used by the frequency counters so this bug
cannot even be triggered but better fix it.
This may be backported to stable branches though it will not fix any
issue.
(cherry picked from commit
09568fd54d2f091860cafa5173893445cd55c44c)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Olivier Houchard [Wed, 22 Apr 2020 19:51:14 +0000 (21:51 +0200)]
BUG/MEDIUM: http-ana: Handle NTLM messages correctly.
When checking www-authenticate headers, we don't want to just accept
"NTLM" as value, because the server may send "HTLM <base64 value>". Instead,
just check that it starts with NTLM.
This should be backported to 2.1, 2.0, 1.9 and 1.8.
(cherry picked from commit
9df188695fbf1ff17de3861ec5b281365800c7f0)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Jerome Magnin [Wed, 22 Apr 2020 09:40:18 +0000 (11:40 +0200)]
BUG/MINOR: ssl: default settings for ssl server options are not used
Documentation states that default settings for ssl server options can be set
using either ssl-default-server-options or default-server directives. In practice,
not all ssl server options can have default values, such as ssl-min-ver, ssl-max-ver,
etc..
This patch adds the missing ssl options in srv_ssl_settings_cpy() and srv_parse_ssl(),
making it possible to write configurations like the following examples, and have them
behave as expected.
global
ssl-default-server-options ssl-max-ver TLSv1.2
defaults
mode http
listen l1
bind 1.2.3.4:80
default-server ssl verify none
server s1 1.2.3.5:443
listen l2
bind 2.2.3.4:80
default-server ssl verify none ssl-max-ver TLSv1.3 ssl-min-ver TLSv1.2
server s1 1.2.3.6:443
This should be backported as far as 1.8.
This fixes issue #595.
(cherry picked from commit
2e8d52f869ed7673a8274ec7b045161e09350251)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Olivier Doucet [Tue, 21 Apr 2020 07:32:56 +0000 (09:32 +0200)]
DOC: Improve documentation on http-request set-src
This patch adds more explanation on how to use "http-request set-src"
and a link to "option forwardfor".
This patch can be applied to all previous version starting at 1.6
Reviewed-by: Tim Duesterhus <tim@bastelstu.be>
(cherry picked from commit
56e3120f9ee0db8de166f5e6c9cf2ce2fc4c2364)
Signed-off-by: Willy Tarreau <w@1wt.eu>