Page 189 Concepts and similar pages

Concepts

Similarity Concept
Question Answering systems
Ideal test collection
Data retrieval systems
Cluster based retrieval
Term
Document representative
Relevance
Information retrieval system
Data model
Indexing

Similar pages

Similarity Page Snapshot
30 In practice, one seeks some sort of optimal trade-off between representation and discrimination ... The emphasis on representation leads to what one might call a document orientation: that is, a total preoccupation with modelling what the document is about ... This point of view is also adopted by those concerned with defining a concept of information; they assume that once this notion is properly explicated a document can be represented by the information it contains [37] ... The emphasis on discrimination leads to a query orientation ... Automatic keyword classification: Many automatic retrieval systems rely on thesauri to modify queries and document representatives to improve the chance of retrieving relevant documents ...
145 automatic and interactive retrieval system? Studies to gauge this are going on but results are hard to interpret ... It should be apparent now that in evaluating an information retrieval system we are mainly concerned with providing data so that users can make a decision as to (1) whether they want such a system (social question) and (2) whether it will be worth it ... The second question (what to evaluate?) boils down to what can we measure that will reflect the ability of the system to satisfy the user ... (1) The coverage of the collection, that is, the extent to which the system includes relevant matter; (2) the time lag, that is, the average interval between the time the search request is made and the time an answer is given; (3) the form of presentation of the output; (4) the effort involved on the part of the user in obtaining answers to his search requests; (5) the recall of the system, that is, the proportion of relevant material actually retrieved in answer to a search request; (6) the precision of the system, that is, the proportion of retrieved material that is actually relevant ... It is claimed that (1)-(4) are readily assessed ...
6 language input and storage more feasible ... The reader will have noticed that already, the idea of relevance has slipped into the discussion ... Intellectually it is possible for a human to establish the relevance of a document to a query ... An information retrieval system: Let me illustrate by means of a black box what a typical IR system would look like ... Starting with the input side of things ...
190 The lesson that is to be learnt is that should new research get underway it will be very important to have a suitable data base ready ... Information retrieval systems are likely to play an ever increasing part in the community ... One major recent development is that computers and data bases are becoming linked into networks ... (Footnote: A study recommending the provision of such an experimental test bed has recently been completed; see Sparck Jones and van Rijsbergen, Information retrieval test collections, Journal of Documentation, 32, 59-75 (1976).) ... By extending the user population to include the non-specialist, it is likely that an IR system will be expected to provide not just a citation, but a display of the text, or part of it, and perhaps answer simple questions about the retrieved documents ... To bring all this about the document retrieval system will have to be interfaced and integrated with data retrieval systems, to give access to facts related to those in the documents ... Another example can be found in the context of computer-aided instruction, where it is clearly a good idea to give a student access to a document retrieval system which will provide him with further reading on a topic of his immediate interest ...
188 In basing a theory of evaluation on the theory of measurement, is it possible to devise a measure of effectiveness not starting with precision and recall but simply with the set of relevant documents and the set of retrieved documents? If so, can we generalise such a measure to take account of degree of relevance? An alternative derivation of an E-type measure could be done in terms of recall and fallout ... Up to now the measurement of effectiveness has proved fairly intractable to statistical analysis ... I think the Robertson model described in Chapter 7 goes some way to being considered as a reasonable statistical model ... There may be laws of retrieval such as the well-known trade-off between precision and recall that are worth establishing either empirically or by theoretical argument ... 6 ... There is a need for more intensive research into the problems of what to use to represent the content of documents in a computer ... Information retrieval systems, both operational and experimental, have been keyword based ... The major reason for this rather simple-minded approach to document retrieval is a very good one ...
105 search ... Interactive search formulation: A user confronted with an automatic retrieval system is unlikely to be able to express his information need in one go ... (1) the frequency of occurrence in the data base of his search terms; (2) the number of documents likely to be retrieved by his query; (3) alternative and related terms to the ones used in his search; (4) a small sample of the citations likely to be retrieved; and (5) the terms used to index the citations in (4) ... All this can be conveniently provided to a user during his search session by an interactive retrieval system ... The sample of citations and their indexing will give him some idea of what kind of documents are likely to be retrieved and thus some idea of how effective his search terms have been in expressing his information need ... Examples, both operational and experimental, of systems providing mechanisms of this kind are MEDLINE [11] ... We now look at a mathematical approach to the use of feedback where the system automatically modifies the query ... Feedback: The word feedback is normally used to describe the mechanism by which a system can improve its performance on a task by taking
148 relevant to an information need if and only if it contains at least one sentence which is relevant to that need ... Earlier on I stated that this notion of relevance was only of limited use at the moment ... Saracevic [8] has summarised some of the more recent work on probabilistic interpretations of relevance ... Precision and recall, and others: We now leave the speculations about relevance and return to the promised detailed discussion of the measurement of effectiveness ... It is helpful at this point to introduce the famous contingency table which is not really a contingency table at all ...
184 Eight: THE FUTURE. Future research: In the preceding chapters I have tried to bring together some of the more elaborate tools that are used during the design of an experimental information retrieval system ... 1 ... Substantial evidence that large document collections can be handled successfully by means of automatic classification will encourage new work into ways of structuring such collections ... It is therefore of some importance that using the kind of data already in existence, that is, using document descriptions in terms of keywords, we establish that document clustering on large document collections can be both effective and efficient ...
5 frequency of occurrence and co-occurrence of index terms in the relevant and non-relevant documents ... Chapter 7: Evaluation - here I give a traditional view of the measurement of effectiveness followed by an explanation of some of the more promising attempts at improving the art ... Chapter 8: The Future - contains some speculation about the future of IR and tries to pinpoint some areas of research where further work is desperately needed ... Information retrieval: Since the 1940s the problem of information storage and retrieval has attracted increasing attention ... In principle, information storage and retrieval is simple ... When high-speed computers became available for non-numerical work, many thought that a computer would be able to read an entire document collection to extract the relevant documents ...
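
Several of the snapshots above (pages 145, 148 and 188) refer to recall, precision, fallout and the contingency table without showing how they relate. The following is a minimal sketch in Python, not the book's own formulation. It takes recall and precision as defined in the page 145 snapshot. Fallout is assumed here to be the usual proportion of non-relevant documents that are retrieved, since the snapshots only name it. The class and function names are hypothetical, and returning 0.0 when a denominator is empty is a convention chosen for the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contingency:
    """Counts for the 2x2 relevant/retrieved table (names are illustrative)."""
    relevant_retrieved: int
    relevant_missed: int
    nonrelevant_retrieved: int
    nonrelevant_missed: int


def contingency(relevant, retrieved, collection):
    """Build the table from three sets of document identifiers."""
    collection = set(collection)
    # Only documents that are actually in the collection are counted.
    relevant = set(relevant) & collection
    retrieved = set(retrieved) & collection
    return Contingency(
        relevant_retrieved=len(relevant & retrieved),
        relevant_missed=len(relevant - retrieved),
        nonrelevant_retrieved=len(retrieved - relevant),
        nonrelevant_missed=len(collection - relevant - retrieved),
    )


def _ratio(numerator, denominator):
    # Assumed convention: an empty denominator yields 0.0 rather than an error.
    return numerator / denominator if denominator else 0.0


def recall(t):
    """Proportion of relevant material actually retrieved."""
    return _ratio(t.relevant_retrieved, t.relevant_retrieved + t.relevant_missed)


def precision(t):
    """Proportion of retrieved material that is actually relevant."""
    return _ratio(t.relevant_retrieved, t.relevant_retrieved + t.nonrelevant_retrieved)


def fallout(t):
    """Proportion of non-relevant material that is retrieved (assumed definition)."""
    return _ratio(t.nonrelevant_retrieved, t.nonrelevant_retrieved + t.nonrelevant_missed)


if __name__ == "__main__":
    table = contingency(
        relevant={1, 2, 3, 4},
        retrieved={3, 4, 5, 6, 7},
        collection=range(1, 11),
    )
    print(table)
    print(recall(table), precision(table), fallout(table))
```

All three measures are derived from the same four counts, which is why the sketch builds the table first. A measure defined directly on the relevant and retrieved sets, as the page 188 snapshot asks about, would start from those counts as well.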