MINOR: tools: improve word fingerprinting by counting presence
author Willy Tarreau <w@1wt.eu>
Mon, 15 Mar 2021 08:34:27 +0000 (09:34 +0100)
committer Willy Tarreau <w@1wt.eu>
Mon, 15 Mar 2021 08:38:42 +0000 (09:38 +0100)
commit 9294e8822f6f359ea9dba6f19b97fb75a72c433e
tree a634cd557ccf951d6858b1583167362aed4308d8
parent 101df31503e7bef59cd6096cd9eb2d708de7471b

The distance between two words can be high because a sub-word is missing,
and in this case totally unrelated words may be proposed instead because
their average score looks lower thanks to their being shorter. Here we're
introducing the notion of presence of each character so that word sequences
that contain existing sub-words are favored over shorter ones that have
nothing in common. In addition, we no longer distinguish begin/end from a
regular delimiter, since doing so made it harder to spot inverted words.
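
For illustration only, here is a minimal sketch of the idea, not the actual
code from src/tools.c. It assumes a 32x32 table of character-class
transition counters in which row 0 is never used as a "from" class and can
therefore hold one presence flag per class, and it treats the beginning and
end of the word as a regular delimiter. The names sketch_make_fingerprint(),
sketch_fingerprint_distance(), FP_CLASSES, FP_SIZE and FP_DELIM are
hypothetical.

```c
#include <stdint.h>
#include <string.h>

#define FP_CLASSES 32                       /* assumed number of character classes */
#define FP_SIZE    (FP_CLASSES * FP_CLASSES)
#define FP_DELIM   28                       /* assumed class for any delimiter */

/* Map a character to a class: letters 1..26 (case-insensitive), digits 27,
 * anything else is a delimiter. Class 0 is never returned.
 */
static int fp_class(unsigned char c)
{
	if (c >= 'a' && c <= 'z')
		return c - 'a' + 1;
	if (c >= 'A' && c <= 'Z')
		return c - 'A' + 1;
	if (c >= '0' && c <= '9')
		return 27;
	return FP_DELIM;
}

/* Fill <fp> (FP_SIZE bytes) from <word>. Entries 1..31 of row 0 are
 * presence flags; the other rows count "from -> to" transitions. The word
 * is assumed to start and end on a delimiter. Counters are 8-bit and may
 * wrap on very long inputs.
 */
void sketch_make_fingerprint(uint8_t *fp, const char *word)
{
	int from = FP_DELIM;
	int to;

	memset(fp, 0, FP_SIZE);
	for (; *word; word++) {
		to = fp_class((unsigned char)*word);
		fp[to] = 1;                     /* presence of this class */
		fp[FP_CLASSES * from + to]++;   /* transition count */
		from = to;
	}
	fp[FP_CLASSES * from + FP_DELIM]++;     /* end seen as a delimiter */
}

/* Sum of squared differences between two fingerprints; lower is closer. */
int sketch_fingerprint_distance(const uint8_t *a, const uint8_t *b)
{
	int i, d, dist = 0;

	for (i = 0; i < FP_SIZE; i++) {
		d = (int)a[i] - (int)b[i];
		dist += d * d;
	}
	return dist;
}
```

With this sketch, "abcd" is closer to "abcd.efgh" than to the shorter but
unrelated "xy", and "foo.bar" and "bar.foo" yield identical fingerprints
because begin and end count as the same delimiter as ".".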
include/haproxy/tools.h
src/tools.c