From: Emeric Brun
Date: Tue, 23 Feb 2021 15:50:53 +0000 (+0100)
Subject: BUG/MEDIUM: peers: reset starting point if peers appears longly disconnected
X-Git-Tag: v2.4-dev18~23
X-Git-Url: http://git.haproxy.org/?a=commitdiff_plain;h=d9729da98262f2136ad4eac44c3ec2f710cb4a49;p=haproxy-2.5.git

BUG/MEDIUM: peers: reset starting point if peers appears longly disconnected

If two peers are disconnected and, during this period, they continue to
process a large amount of local updates, then after a reconnection they
may take a long time before they resume pushing their updates, because
the last pushed update would appear internally to be in the future.

This patch fixes this by resetting the cursor on acked updates to the
maximum point considered in the past if it appears to be in the future,
but it means we may lose some updates.

A clean fix would be to update the protocol to be able to signal a
remote peer that it was not updated for too long a period and needs a
full resync, but this is not yet supported by the protocol.

This patch should be backported to all supported branches ( >= 1.6 )
---

diff --git a/src/peers.c b/src/peers.c
index 69c2287..b473a4c 100644
--- a/src/peers.c
+++ b/src/peers.c
@@ -2396,7 +2396,20 @@ static inline void init_accepted_peer(struct peer *peer, struct peers *peers)
 	/* Init cursors */
 	for (st = peer->tables; st ; st = st->next) {
 		st->last_get = st->last_acked = 0;
+		HA_SPIN_LOCK(STK_TABLE_LOCK, &st->table->lock);
+		/* if st->update appears to be in future it means
+		 * that the last acked value is very old and we
+		 * remain unconnected a too long time to use this
+		 * acknowlegement as a reset.
+		 * We should update the protocol to be able to
+		 * signal the remote peer that it needs a full resync.
+		 * Here a partial fix consist to set st->update at
+		 * the max past value
+		 */
+		if ((int)(st->table->localupdate - st->update) < 0)
+			st->update = st->table->localupdate + (2147483648U);
 		st->teaching_origin = st->last_pushed = st->update;
+		HA_SPIN_UNLOCK(STK_TABLE_LOCK, &st->table->lock);
 	}
 
 	/* reset teaching and learning flags to 0 */
@@ -2433,7 +2446,20 @@ static inline void init_connected_peer(struct peer *peer, struct peers *peers)
 	/* Init cursors */
 	for (st = peer->tables; st ; st = st->next) {
 		st->last_get = st->last_acked = 0;
+		HA_SPIN_LOCK(STK_TABLE_LOCK, &st->table->lock);
+		/* if st->update appears to be in future it means
+		 * that the last acked value is very old and we
+		 * remain unconnected a too long time to use this
+		 * acknowlegement as a reset.
+		 * We should update the protocol to be able to
+		 * signal the remote peer that it needs a full resync.
+		 * Here a partial fix consist to set st->update at
+		 * the max past value.
+		 */
+		if ((int)(st->table->localupdate - st->update) < 0)
+			st->update = st->table->localupdate + (2147483648U);
 		st->teaching_origin = st->last_pushed = st->update;
+		HA_SPIN_UNLOCK(STK_TABLE_LOCK, &st->table->lock);
 	}
 
 	/* Init confirm counter */
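
Note on the comparison used in both hunks: the test (int)(localupdate - update) < 0 treats the two counters as positions on a wrapping 32-bit circle, so a cursor that has fallen more than 2^31 updates behind is indistinguishable from one that is ahead. The following is a minimal standalone sketch of that arithmetic, not code from src/peers.c. The helper names cursor_in_future() and clamp_cursor() are hypothetical, the counters are assumed to be 32-bit unsigned values, and no locking is modelled. Like the patch, it relies on the usual two's-complement result when an out-of-range unsigned value is converted to a signed type.

```c
#include <stdint.h>

/* Hypothetical helper (not a HAProxy function): returns 1 when `cursor`
 * appears to be ahead of `head` under wrapping 32-bit comparison, which is
 * the same test the patch applies to st->table->localupdate and st->update.
 */
int cursor_in_future(uint32_t head, uint32_t cursor)
{
    return (int32_t)(head - cursor) < 0;
}

/* Hypothetical helper mirroring the patched assignment: if the cursor
 * appears to be in the future, move it to head + 2^31, the value the patch
 * comment calls "the max past value"; otherwise leave it unchanged.
 */
uint32_t clamp_cursor(uint32_t head, uint32_t cursor)
{
    if (cursor_in_future(head, cursor))
        return head + 2147483648U; /* wraps modulo 2^32 */
    return cursor;
}
```

For example, with head = 3000000000 and a stale cursor of 10, the difference exceeds 2^31, so the cursor is seen as being in the future and is reset to 852516352 (3000000000 + 2^31 modulo 2^32). A cursor of 4294967290 against a head of 5 is only 11 updates behind across the wrap point and is left untouched.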