Quantcast
Channel: Forum SQL Server Database Engine
Viewing all articles
Browse latest Browse all 15694

SQL Server, ISABOUT, weighted terms

$
0
0

I am trying to figure out exactly how weighted terms work in a ISABOUT query in SQL SERVER.
Here is where I currently am:

Each query returns the following rows:


**QUERY 1 (weight 1):** *Initial ranking*  

SELECT * FROM CONTAINSTABLE(documentParts, title, 'ISABOUT ("e" weight (1) ) ') ORDER BY RANK DESC, [KEY]        

KEY     RANK    
306342 249    
272619 156    
221557 114


**QUERY 2 (weight 0.8):**  *Ranking increases, initial order is preserved*

SELECT * FROM CONTAINSTABLE(documentParts, title, 'ISABOUT ("e" weight (0.8) ) ') ORDER BY RANK DESC, [KEY]     

KEY     RANK     
306342  321     
272619  201     
221557  146

**QUERY 3 (weight 0.2):**  *Ranking increases, initial order is preserved*

SELECT * FROM CONTAINSTABLE(documentParts, title, 'ISABOUT ("e" weight (0.2) ) ') ORDER BY RANK DESC, [KEY]

KEY    RANK     
306342 998     
272619 877     
221557 692

**QUERY 4 (weight 0.17):** *Ranking decreases, best match is now last, inverted behavior for these terms begin at 0.17*  

SELECT * FROM CONTAINSTABLE(documentParts, title, 'ISABOUT ("e" weight (0.17) ) ') ORDER BY RANK DESC, [KEY]     
KEY      RANK     
272619   960     
221557   958     
306342   802

**QUERY 5 (weight 0.16):** *Ranking increases, best match is now second*

SELECT * FROM CONTAINSTABLE(documentParts, title, 'ISABOUT ("e" weight (0.17) ) ') ORDER BY RANK DESC, [KEY]   

KEY      RANK     
272619   978     
306342   935     
221557   841

**QUERY 6 (weight 0.01):**  *Ranking decreases, best match is last again*

SELECT * FROM CONTAINSTABLE(documentParts, title, 'ISABOUT ("e" weight (0.01) ) ') ORDER BY RANK DESC, [KEY]      
KEY       RANK     
221557 105     
272619 77     
306342 50


Best match for weight 1 has a rank of 249 and while weight goes down to 0.2 ranking of best match increases to 998.
From 0.2 to 0.17 ranking decreases and from 0.16 results are inverted (*the weight values that reproduce this behavior depend on terms and maybe on columns searched...*)

It seems there is a point where weight means the opposite, something like "do not include this term".
Do you have any explanation of this behavior?
Why ranking increases when weight decreases?
Why ranking decreases after some point until results are inverted and how can you predict this point?

I use a custom "word-breaker", when user searches for something creating the following query:

CONTAINSTABLE(documentParts, title,           '
ISABOUT (              "wordA wordB wordC" weight (0.8),"wordA*" NEAR "wordB*" NEAR "wordC*" weight (0.6),"wordA*" (0.1),"wordB*" (0.1),"wordC*" (0.1), 
 ) ')

Am I to expect big ranks for for 0.1 words?
Is the following query the same as above and am I to expect some weird behavior with the 0.1 rankings?

CONTAINSTABLE(documentParts, title, '         
ISABOUT ( "wordA wordB wordC" weight (0.8) ),
OR ISABOUT ( "wordA*" NEAR "wordB*" NEAR "wordC*" weight (0.6) ),
OR ISABOUT ( "wordA*" (0.1) ),           
OR ISABOUT ( "wordB*" (0.1) ),           
OR ISABOUT ( "wordC*" (0.1) ),           
')


George Kosmidis


Viewing all articles
Browse latest Browse all 15694

Trending Articles



<script src="https://jsc.adskeeper.com/r/s/rssing.com.1596347.js" async> </script>