BUG/MAJOR: resolvers: add other missing references during resolution removal
authorWilly Tarreau <w@1wt.eu>
Mon, 18 Oct 2021 14:49:14 +0000 (16:49 +0200)
committerChristopher Faulet <cfaulet@haproxy.com>
Wed, 3 Nov 2021 10:39:59 +0000 (11:39 +0100)
There is a fundamental design bug in the resolvers code which is that
a list of active resolutions is being walked to try to delete outdated
entries, and that the code responsible for removing them also removes
other elements, including the next one which will be visited by the
list iterator. This randomly causes a use-after-free condition leading
to crashes, infinite loops and various other issues such as random memory
corruption.

A first fix for the memory fix for this was brought by commit 0efc0993e
("BUG/MEDIUM: resolvers: Don't release resolution from a requester
callbacks"). While preparing for more fixes, some code was factored by
commit 11c6c3965 ("MINOR: resolvers: Clean server in a dedicated function
when removing a SRV item"), which inadvertently passed "0" as the "safe"
argument all the time, missing one case of removal protection, instead
of always using "safe". This patch reintroduces the correct argument.

This must be backported with all fixes above.

Cc: Christopher Faulet <cfaulet@haproxy.com>
(cherry picked from commit 2a67aa0a51acc2d3469061b8732faf7e597f9c69)
Signed-off-by: Willy Tarreau <w@1wt.eu>
(cherry picked from commit 3223d8574e37b1baaf1e9000244c5b1c07178e0a)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>

src/dns.c

index b5dec72..5515519 100644 (file)
--- a/src/dns.c
+++ b/src/dns.c
@@ -633,9 +633,9 @@ int dns_read_name(unsigned char *buffer, unsigned char *bufend,
  *
  * Must be called with the DNS lock held.
  */
-static void dns_srvrq_cleanup_srv(struct server *srv)
+static void dns_srvrq_cleanup_srv(struct server *srv, int safe)
 {
-       dns_unlink_resolution(srv->dns_requester, 0);
+       dns_unlink_resolution(srv->dns_requester, safe);
        HA_SPIN_LOCK(SERVER_LOCK, &srv->lock);
        srvrq_update_srv_status(srv, 1);
        free(srv->hostname);
@@ -669,7 +669,7 @@ static struct task *dns_srvrq_expire_task(struct task *t, void *context, unsigne
                goto end;
 
        HA_SPIN_LOCK(DNS_LOCK, &srv->srvrq->resolvers->lock);
-       dns_srvrq_cleanup_srv(srv);
+       dns_srvrq_cleanup_srv(srv, 0);
        HA_SPIN_UNLOCK(DNS_LOCK, &srv->srvrq->resolvers->lock);
 
  end:
@@ -710,7 +710,7 @@ static void dns_check_dns_response(struct dns_resolution *res)
                        else if (item->type == DNS_RTYPE_SRV) {
                                /* Remove any associated server */
                                list_for_each_entry_safe(srv, srvback, &item->attached_servers, srv_rec_item)
-                                       dns_srvrq_cleanup_srv(srv);
+                                       dns_srvrq_cleanup_srv(srv, 0);
                        }
 
                        LIST_DEL(&item->list);
@@ -2020,7 +2020,7 @@ void dns_detach_from_resolution_answer_items(struct dns_resolution *res,  struct
                        if (item->type == DNS_RTYPE_SRV) {
                                list_for_each_entry_safe(srv, srvback, &item->attached_servers, srv_rec_item) {
                                        if (srv->srvrq == srvrq)
-                                               dns_srvrq_cleanup_srv(srv);
+                                               dns_srvrq_cleanup_srv(srv, safe);
                                }
                        }
                }