I am trying to figure out exactly how weighted terms work in an ISABOUT query in SQL Server.
Here is what I have found so far.
Each query below is shown with the rows it returns:
**QUERY 1 (weight 1):** *Initial ranking*

    SELECT * FROM CONTAINSTABLE(documentParts, title, 'ISABOUT ("e" weight (1) ) ') ORDER BY RANK DESC, [KEY]

    KEY     RANK
    306342  249
    272619  156
    221557  114

**QUERY 2 (weight 0.8):** *Ranking increases, initial order is preserved*

    SELECT * FROM CONTAINSTABLE(documentParts, title, 'ISABOUT ("e" weight (0.8) ) ') ORDER BY RANK DESC, [KEY]

    KEY     RANK
    306342  321
    272619  201
    221557  146

**QUERY 3 (weight 0.2):** *Ranking increases, initial order is preserved*

    SELECT * FROM CONTAINSTABLE(documentParts, title, 'ISABOUT ("e" weight (0.2) ) ') ORDER BY RANK DESC, [KEY]

    KEY     RANK
    306342  998
    272619  877
    221557  692

**QUERY 4 (weight 0.17):** *Ranking decreases, best match is now last; the inverted behavior for these terms begins at 0.17*

    SELECT * FROM CONTAINSTABLE(documentParts, title, 'ISABOUT ("e" weight (0.17) ) ') ORDER BY RANK DESC, [KEY]

    KEY     RANK
    272619  960
    221557  958
    306342  802

**QUERY 5 (weight 0.16):** *Ranking increases, best match is now second*

    SELECT * FROM CONTAINSTABLE(documentParts, title, 'ISABOUT ("e" weight (0.16) ) ') ORDER BY RANK DESC, [KEY]

    KEY     RANK
    272619  978
    306342  935
    221557  841

**QUERY 6 (weight 0.01):** *Ranking decreases, best match is last again*

    SELECT * FROM CONTAINSTABLE(documentParts, title, 'ISABOUT ("e" weight (0.01) ) ') ORDER BY RANK DESC, [KEY]

    KEY     RANK
    221557  105
    272619  77
    306342  50

The best match at weight 1 has a rank of 249, and as the weight goes down to 0.2 the rank of that best match increases to 998.
From 0.2 to 0.17 the rank decreases, and from 0.16 the results are inverted (*the weight values that reproduce this behavior depend on the terms and maybe on the columns searched*).
It seems there is a point where the weight means the opposite, something like "do not include this term".
Do you have any explanation for this behavior?
Why does the ranking increase when the weight decreases?
Why does the ranking decrease after some point until the results are inverted, and how can this point be predicted?
I use a custom "word-breaker" that, when a user searches for something, creates the following query:

    CONTAINSTABLE(documentParts, title, '
        ISABOUT (
            "wordA wordB wordC" weight (0.8),
            "wordA*" NEAR "wordB*" NEAR "wordC*" weight (0.6),
            "wordA*" weight (0.1),
            "wordB*" weight (0.1),
            "wordC*" weight (0.1)
        ) ')

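
A rough sketch of how a search condition of this shape can be assembled from the user's words, written in Python for illustration only. The function name `build_isabout` and its parameters are hypothetical, and this is not the actual word-breaker. It assumes the words are already split, drops any double quotes inside them, and writes the `weight` keyword for every term with no trailing comma, which is what the ISABOUT grammar expects.

```python
def build_isabout(words, phrase_weight=0.8, near_weight=0.6, word_weight=0.1):
    """Build an ISABOUT search condition from already-split search words.

    Hypothetical helper: the default weights mirror the ones in the question.
    """
    # Drop double quotes so each term stays a single quoted full-text term
    # (an assumption made for this sketch, not a complete sanitizer).
    terms = [w.replace('"', "").strip() for w in words]
    terms = [t for t in terms if t]
    if not terms:
        raise ValueError("at least one search word is required")

    # Prefix form of every word, e.g. "wordA*".
    prefixes = ['"{}*"'.format(t) for t in terms]

    # 1) The exact phrase, with the highest weight.
    parts = ['"{}" weight ({})'.format(" ".join(terms), phrase_weight)]

    # 2) All prefixes near each other (only meaningful for two or more words).
    if len(terms) > 1:
        parts.append("{} weight ({})".format(" NEAR ".join(prefixes), near_weight))

    # 3) Each prefix on its own, with the lowest weight.
    parts.extend("{} weight ({})".format(p, word_weight) for p in prefixes)

    return "ISABOUT ( " + ", ".join(parts) + " )"


if __name__ == "__main__":
    print(build_isabout(["wordA", "wordB", "wordC"]))
```

The resulting string would normally be passed to `CONTAINSTABLE` as a parameter rather than concatenated into the SQL text, so that quoting of the outer string literal is not a concern.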
Am I to expect big ranks for the 0.1 words?
Is the following query equivalent to the one above, and am I to expect some weird behavior with the 0.1 rankings?

    CONTAINSTABLE(documentParts, title, '
        ISABOUT ( "wordA wordB wordC" weight (0.8) )
        OR ISABOUT ( "wordA*" NEAR "wordB*" NEAR "wordC*" weight (0.6) )
        OR ISABOUT ( "wordA*" weight (0.1) )
        OR ISABOUT ( "wordB*" weight (0.1) )
        OR ISABOUT ( "wordC*" weight (0.1) )
    ')

George Kosmidis