Page 108 Concepts and similar pages

Concepts

Similarity Concept
Retrieval effectiveness
Operational information retrieval
Generality
Index term
Information retrieval definition
Automatic document classification
Information retrieval system
Index term weighting
Term
Experimental information retrieval

Similar pages

Similarity Page Snapshot
106 account of past performance ...Consider now a retrieval strategy that has been implemented by means of a matching function M ...It is the aim of every retrieval strategy to retrieve the relevant documents $A$ and withhold the non-relevant documents $\bar{A}$ ...the decision procedure $M(Q, D) - T > 0$ corresponds to a linear discriminant function used to linearly separate two sets $A$ and $\bar{A}$ in $R^t$ ...$M(Q_0, D) > T$ whenever $D \in A$ and $M(Q_0, D) < T$ whenever $D \in \bar{A}$ ...The interesting thing is that starting with any $Q$ we can adjust it iteratively using feedback information so that it will converge to $Q_0$ ...
105 search ...Interactive search formulation A user confronted with an automatic retrieval system is unlikely to be able to express his information need in one go ...(1) the frequency of occurrence in the data base of his search terms; (2) the number of documents likely to be retrieved by his query; (3) alternative and related terms to be the ones used in his search; (4) a small sample of the citations likely to be retrieved; and (5) the terms used to index the citations in (4) ...All this can be conveniently provided to a user during his search session by an interactive retrieval system ...The sample of citations and their indexing will give him some idea of what kind of documents are likely to be retrieved and thus some idea of how effective his search terms have been in expressing his information need ...Examples, both operational and experimental, of systems providing mechanisms of this kind are MEDLINE [11] ...We now look at a mathematical approach to the use of feedback where the system automatically modifies the query ...Feedback The word feedback is normally used to describe the mechanism by which a system can improve its performance on a task by taking
119 and The importance of writing it this way, apart from its simplicity, is that for each document $x$ to calculate $g(x)$ we simply add the coefficients $c_i$ for those index terms that are present, i ...The constant $C$ which has been assumed the same for all documents $x$ will of course vary from query to query, but it can be interpreted as the cut-off applied to the retrieval function ...Let us now turn to the other part of $g(x)$, namely $c_i$, and let us try and interpret it in terms of the conventional contingency table ...There will be one such table for each index term; I have shown it for the index term $i$ although the subscript $i$ has not been used in the cells ...This is in fact the weighting formula F4 used by Robertson and Sparck Jones [1] in their so-called retrospective experiments ...
160 system so that if we were to adopt $\Delta$ as a measure of effectiveness we could be throwing away vital information needed to make an extrapolation to the performance of other systems ...The Cooper model: expected search length In 1968, Cooper [20] stated: The primary function of a retrieval system is conceived to be that of saving its users, to as great an extent as is possible, the labour of perusing and discarding irrelevant documents in their search for relevant ones ...(a) only one relevant document is wanted; (b) some arbitrary number n is wanted; (c) all relevant documents are wanted; (d) a given proportion of the relevant documents is wanted, etc ...Thus, the index is a measure of performance for a query of given type ...The output of a search strategy is assumed to be a weak ordering of documents ...
107 exists there is an iterative procedure which will ensure that $Q$ will converge to $Q_0$ in a finite number of steps ...The iterative procedure is called the fixed increment error correction procedure ...It goes as follows: $Q_i = Q_{i-1} + cD$ if $M(Q_{i-1}, D) - T < 0$ and $D \in A$; $Q_i = Q_{i-1} - cD$ if $M(Q_{i-1}, D) - T > 0$ and $D \in \bar{A}$; and no change is made to $Q_{i-1}$ if it diagnoses correctly ...The situation in actual retrieval is not as simple ...Once again this is not the whole story ...If $M$ is taken to be the cosine function $(Q, D) / (\|Q\| \|D\|)$ then it is easy to show that $\Phi$ is maximised by ... where $c$ is an arbitrary proportionality constant ...
95 Five SEARCH STRATEGIES Introduction So far very little has been said about the actual process by which the required information is located ...All search strategies are based on comparison between the query and the stored documents ...The distinctions made between different kinds of search strategies can sometimes be understood by looking at the query language, that is, the language in which the information need is expressed ...Boolean search A Boolean search strategy retrieves those documents which are true
146 There has been much debate in the past as to whether precision and recall are in fact the appropriate quantities to use as measures of effectiveness ...(1) the most commonly used pair; (2) fairly well understood quantities ...The final question (How to evaluate?) has a large technical answer ...Before proceeding to the technical details relating to the measurement of effectiveness it is as well to examine more closely the concept of relevance which underlies it ...Relevance Relevance is a subjective notion ...
6 language input and storage more feasible ...The reader will have noticed that already, the idea of relevance has slipped into the discussion ...Intellectually it is possible for a human to establish the relevance of a document to a query ...An information retrieval system Let me illustrate by means of a black box what a typical IR system would look like ...Starting with the input side of things ...
109 retrieval ...A now classic paper on the limitations of a Boolean search is Verhoeff et al ...
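
The fixed increment error correction procedure quoted in the page 107 snapshot can be illustrated with a small sketch. It rests on assumptions that are not in the excerpt: the matching function M is taken to be a plain inner product, documents are short term vectors, and the names `inner_product`, `fixed_increment`, and `max_passes`, along with the sample data, are hypothetical. It is meant as an illustration of the update rule, not as the book's implementation.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def inner_product(q: Vector, d: Vector) -> float:
    """Assumed matching function M(Q, D): a plain inner product."""
    return sum(qi * di for qi, di in zip(q, d))


def fixed_increment(
    query: Vector,
    relevant: Sequence[Vector],
    non_relevant: Sequence[Vector],
    threshold: float,
    c: float = 1.0,
    max_passes: int = 100,
) -> Tuple[List[float], bool]:
    """Adjust the query until every document is diagnosed correctly.

    Returns the adjusted query and a flag saying whether a full pass
    with no corrections was reached within max_passes (hypothetical cap).
    """
    q = [float(v) for v in query]
    for _ in range(max_passes):
        changed = False
        for d in relevant:
            # Relevant document scored below the threshold: add c * D.
            if inner_product(q, d) - threshold < 0:
                q = [qi + c * di for qi, di in zip(q, d)]
                changed = True
        for d in non_relevant:
            # Non-relevant document scored above the threshold: subtract c * D.
            if inner_product(q, d) - threshold > 0:
                q = [qi - c * di for qi, di in zip(q, d)]
                changed = True
        if not changed:
            return q, True
    return q, False


# Hypothetical three-term collection: term 1 marks the relevant documents.
relevant_docs = [[1, 1, 0], [1, 0, 0]]
non_relevant_docs = [[0, 1, 1], [0, 0, 1]]
adjusted, converged = fixed_increment(
    [0, 0, 1], relevant_docs, non_relevant_docs, threshold=0.5
)
```

Starting from a query that favours the wrong term, the loop stops once a complete pass makes no correction, which mirrors the "no change if it diagnoses correctly" clause of the procedure. The pass limit is only a guard for data that cannot be linearly separated.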