From 101de50014fd2788217fcc683c5733d664e55cdc Mon Sep 17 00:00:00 2001
From: Baptiste Assmann
Date: Tue, 4 Aug 2020 10:57:21 +0200
Subject: [PATCH] BUG/MAJOR: dns: disabled servers through SRV records never recover

A regression was introduced by 13a9232ebc63fdf357ffcf4fa7a1a5e77a1eac2b
when I added support for the Additional section of SRV responses.

Basically, when a server is managed through the Additional section of
SRV records and it is disabled (because its associated Additional record
has disappeared), it never leaves its MAINT state and so never comes
back to production.
This patch updates the "snr_update_srv_status()" function to clear the
MAINT status when the server now has an IP address, and also ensures
this function is called when parsing Additional records (and associating
them to new servers).

This can cause a severe outage for people using HAProxy + consul (or any
other service registry) through DNS service discovery.

This should fix issue #793.
This should be backported to 2.2.

(cherry picked from commit 87138c3524bc4242dc48cfacba82d34504958e78)
Signed-off-by: Christopher Faulet
(cherry picked from commit e7811e2add39f0329502181ed2d43a52275f697d)
[cf: Must be backported as far as 2.0 because of recent changes on resolvers]
Signed-off-by: Christopher Faulet
---
 src/server.c |    6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/src/server.c b/src/server.c
index 8f2398f..eaae723 100644
--- a/src/server.c
+++ b/src/server.c
@@ -3985,6 +3985,12 @@ int snr_update_srv_status(struct server *s, int has_no_ip)
 
 	/* If resolution is NULL we're dealing with SRV records Additional records */
 	if (resolution == NULL) {
+		/* since this server has an IP, it can go back in production */
+		if (has_no_ip == 0) {
+			srv_clr_admin_flag(s, SRV_ADMF_RMAINT);
+			return 1;
+		}
+
 		if (s->next_admin & SRV_ADMF_RMAINT)
 			return 1;
 
-- 
1.7.10.4
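
Note for reviewers: the state transition described in the commit message
can be sketched in isolation as below. This is a simplified model and not
HAProxy code. The struct layout, the flag value and the helper names
(srv_set_rmaint, srv_clr_rmaint, update_status_additional) are
hypothetical stand-ins. The tail of the branch, which puts the server in
maintenance and returns 0, is not visible in the hunk and is an
assumption made only so the recovery path can be exercised.

```c
/* Hypothetical stand-in; the real flag value in HAProxy may differ. */
#define SRV_ADMF_RMAINT 0x01u

/* Hypothetical minimal server; only the field used by the hunk is modeled. */
struct server {
	unsigned int next_admin; /* pending administrative flags */
};

/* Stand-ins for the admin flag helpers (assumed behavior: set/clear one bit). */
static void srv_set_rmaint(struct server *s) { s->next_admin |= SRV_ADMF_RMAINT; }
static void srv_clr_rmaint(struct server *s) { s->next_admin &= ~SRV_ADMF_RMAINT; }

/*
 * Models only the "resolution == NULL" branch touched by the patch,
 * i.e. a server driven by the Additional section of an SRV response.
 */
static int update_status_additional(struct server *s, int has_no_ip)
{
	/* Added by the patch: the server has an IP again, so leave maintenance. */
	if (has_no_ip == 0) {
		srv_clr_rmaint(s);
		return 1;
	}

	/* Context from the hunk: already in maintenance, nothing to do. */
	if (s->next_admin & SRV_ADMF_RMAINT)
		return 1;

	/* Assumption (not shown in the hunk): enter maintenance. */
	srv_set_rmaint(s);
	return 0;
}
```

Without the first block, a server whose flag is already set always takes
the "already in maintenance" return, whatever the value of has_no_ip,
which is the stuck state reported in the commit message.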