Similar concepts
Pages with this concept
Similarity |
Page |
Snapshot |
| 178 |
document collections with different sets of queries then we can still use these measures to indicate which system satisfies the user more
...Significance tests: Once we have our retrieval effectiveness figures we may wish to establish that the difference in effectiveness under two conditions is statistically significant
...Parametric tests are inappropriate because we do not know the form of the underlying distribution
...On the face of it, non-parametric tests might provide the answer
... |
| 182 |
19 to 41 (consecutive numbers only)
... |
| 13 |
18 to 36 (consecutive numbers only)
... |
| 179 |
that $D_i$ is continuous and that it is derived from a symmetric distribution, neither of which is normally met in IR data
...It seems therefore that some of the more sophisticated statistical tests are inappropriate
...The way it works is as follows: Let $Z_a(Q_1), Z_a(Q_2), \ldots$ ... $P(Z_a > Z_b) = P(Z_a < Z_b) = \tfrac{1}{2}$. Under this hypothesis we expect the number of pairs which have $Z_a > Z_b$ to equal the number of pairs which have $Z_a < Z_b$
...In IR this test is usually used as a one-tailed test, that is, the alternative hypothesis prescribes the superiority of retrieval under condition a over condition b, or vice versa
...The use of the sign test raises a number of interesting points
... |
| 194 |
CHAN, F
...CHANG, C
...CHOU, C
...CHOW, C
...CLEVERDON, C
...CLEVERDON, C
...CLEVERDON, C
...CLIFFORD, H
...CLIMENSON, W
...COATES, E
...CODD, E
...COLE, A
...Comparative Systems Laboratory, An Inquiry into Testing of Information Retrieval Systems, 3 Vols
...CONOVER, W
...COOPER, M
...COOPER, W
...COOPER, W
...COOPER, W
...COOPER, W
...CORMACK, R
...COX, D
...CROFT, W
...CROFT, W
...CROFT, W
...CROUCH, D
...CUADRA, A
... |
| 146 |
There has been much debate in the past as to whether precision and recall are in fact the appropriate quantities to use as measures of effectiveness
...(1) the most commonly used pair; (2) fairly well understood quantities
...The final question, "How to evaluate?", has a large technical answer
...Before proceeding to the technical details relating to the measurement of effectiveness it is as well to examine more closely the concept of relevance which underlies it
...Relevance: Relevance is a subjective notion
... |
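The sign test quoted in the snapshot for page 179 can be illustrated with a short sketch. It assumes paired per-query effectiveness scores under conditions a and b, tests the null hypothesis P(Za > Zb) = P(Za < Zb) = 1/2 against the one-tailed alternative that a is superior, and drops tied pairs. The handling of ties, the function name `sign_test`, and the sample scores are assumptions made for illustration; none of them comes from the quoted text.

```python
from math import comb


def sign_test(za, zb):
    """One-tailed sign test on paired scores (hypothetical helper).

    Returns (plus, minus, p_value), where p_value is the probability of
    seeing at least `plus` pairs with za > zb if both directions are
    equally likely.
    """
    if len(za) != len(zb):
        raise ValueError("za and zb must hold one score per query")

    plus = sum(1 for a, b in zip(za, zb) if a > b)
    minus = sum(1 for a, b in zip(za, zb) if a < b)
    n = plus + minus  # assumption: tied pairs are discarded

    if n == 0:
        return plus, minus, 1.0  # no untied pairs, so no evidence either way

    # Upper tail of Binomial(n, 1/2): P(X >= plus)
    p_value = sum(comb(n, k) for k in range(plus, n + 1)) / 2 ** n
    return plus, minus, p_value


# Illustrative scores for six queries (made-up numbers, not from the source)
za = [0.5, 0.6, 0.7, 0.4, 0.9, 0.3]
zb = [0.4, 0.5, 0.7, 0.5, 0.6, 0.2]
print(sign_test(za, zb))  # (4, 1, 0.1875)
```

With four pairs favouring a, one favouring b, and one tie, the one-tailed probability is 6/32 = 0.1875, so this made-up sample would not reject the null hypothesis at the usual 5% level.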