Similarity | Page | Snapshot |
| 105 |
search
...Interactive search formulation A user confronted with an automatic retrieval system is unlikely to be able to express his information need in one go
...(1) the frequency of occurrence in the data base of his search terms; (2) the number of documents likely to be retrieved by his query; (3) alternative and related terms to the ones used in his search; (4) a small sample of the citations likely to be retrieved; and (5) the terms used to index the citations in (4)
...All this can be conveniently provided to a user during his search session by an interactive retrieval system
...The sample of citations and their indexing will give him some idea of what kind of documents are likely to be retrieved and thus some idea of how effective his search terms have been in expressing his information need
...Examples, both operational and experimental, of systems providing mechanisms of this kind are MEDLINE [11] ...We now look at a mathematical approach to the use of feedback where the system automatically modifies the query
...Feedback The word feedback is normally used to describe the mechanism by which a system can improve its performance on a task by taking |
| 8 |
The process may involve structuring the information in some appropriate way, such as classifying it
...Finally, we come to the output, which is usually a set of citations or document numbers
...IR in perspective This section is not meant to constitute an attempt at an exhaustive and complete account of the historical development of IR
...Since the emphasis in this book is on a particular approach to document representation, I shall restrict myself here to a few remarks about its history
...At this point, it may be convenient to elaborate on the use of keyword
...The use of statistical information about distributions of words in documents was further exploited by Maron and Kuhns [11] who obtained statistical associations between keywords
... |
| 6 |
language input and storage more feasible
...The reader will have noticed that already, the idea of relevance has slipped into the discussion
...Intellectually it is possible for a human to establish the relevance of a document to a query
...An information retrieval system Let me illustrate by means of a black box what a typical IR system would look like
...Starting with the input side of things
... |
| 14 |
Two AUTOMATIC TEXT ANALYSIS Introduction Before a computerised information retrieval system can actually operate to retrieve some information, that information must have already been stored inside the computer
...The starting point of the text analysis process may be the complete document text, an abstract, the title only, or perhaps a list of words only
...The developments and advances in the process of representation have been reviewed every year by the appropriate chapters of Cuadra's Annual Review of Information Science and Technology
... |
| 5 |
frequency of occurrence and co-occurrence of index terms in the relevant and non-relevant documents
...Chapter 7: Evaluation - here I give a traditional view of the measurement of effectiveness followed by an explanation of some of the more promising attempts at improving the art
...Chapter 8: The Future - contains some speculation about the future of IR and tries to pinpoint some areas of research where further work is desperately needed
...Information retrieval Since the 1940s the problem of information storage and retrieval has attracted increasing attention
...In principle, information storage and retrieval is simple
...When high-speed computers became available for non-numerical work, many thought that a computer would be able to read an entire document collection to extract the relevant documents
... |
| 145 |
automatic and interactive retrieval system? Studies to gauge this are going on but results are hard to interpret
...It should be apparent now that in evaluating an information retrieval system we are mainly concerned with providing data so that users can make a decision as to (1) whether they want such a system (social question) and (2) whether it will be worth it
...The second question (what to evaluate?) boils down to what can we measure that will reflect the ability of the system to satisfy the user
...(1) The coverage of the collection, that is, the extent to which the system includes relevant matter; (2) the time lag, that is, the average interval between the time the search request is made and the time an answer is given; (3) the form of presentation of the output; (4) the effort involved on the part of the user in obtaining answers to his search requests; (5) the recall of the system, that is, the proportion of relevant material actually retrieved in answer to a search request; (6) the precision of the system, that is, the proportion of retrieved material that is actually relevant
...It is claimed that (1)-(4) are readily assessed
... |
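The definitions of recall and precision quoted in the snapshot above can be illustrated with a minimal sketch. The function name `recall_precision` and the sample document numbers are hypothetical and not taken from the source. The sketch also assumes that relevance judgements are available as a plain set of document numbers.

```python
def recall_precision(retrieved, relevant):
    """Return (recall, precision) for one search request.

    recall    = relevant documents retrieved / all relevant documents
    precision = relevant documents retrieved / all retrieved documents
    Both inputs are iterables of document numbers (an assumption of this sketch).
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = retrieved & relevant  # relevant documents that were actually retrieved
    recall = len(hits) / len(relevant) if relevant else 0.0
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    return recall, precision


# Hypothetical request: 4 documents retrieved, 3 relevant in the collection.
recall, precision = recall_precision([1, 2, 3, 4], [2, 4, 6])
print(recall, precision)
```

Here two of the three relevant documents are retrieved, and two of the four retrieved documents are relevant. Returning 0.0 when a set is empty is a convention chosen for this sketch. The source does not define that case.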
| 95 |
Five SEARCH STRATEGIES Introduction So far very little has been said about the actual process by which the required information is located
...All search strategies are based on comparison between the query and the stored documents
...The distinctions made between different kinds of search strategies can sometimes be understood by looking at the query language, that is, the language in which the information need is expressed
...Boolean search A Boolean search strategy retrieves those documents which are true |
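The snapshot above is cut off, so the following is only a rough sketch of a Boolean search under the usual reading: a document is retrieved when the query's logical expression holds for its index terms. The names `build_index` and `docs`, and the sample terms, are hypothetical. An inverted index with set operations is one common way to do this, and the source does not say it is the method used.

```python
def build_index(docs):
    """Map each index term to the set of document numbers containing it."""
    index = {}
    for doc_id, terms in docs.items():
        for term in terms:
            index.setdefault(term, set()).add(doc_id)
    return index


# Hypothetical collection: document number -> set of index terms.
docs = {
    1: {"information", "retrieval"},
    2: {"information", "theory"},
    3: {"retrieval", "evaluation"},
}
index = build_index(docs)

# Query: information AND retrieval AND NOT theory
# AND maps to set intersection, NOT to set difference.
result = (index["information"] & index["retrieval"]) - index.get("theory", set())
print(sorted(result))
```

OR would map to set union in the same way.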
| 22 |
entry in the list defining B and PT as equivalent stem endings if the preceding characters match
...The assumption in the context of IR is that if two words have the same underlying stem then they refer to the same concept and should be indexed as such
...It is inevitable that a processing system such as this will produce errors
...My description of the three stages has been deliberately undetailed; only the underlying mechanism has been explained
...Surprisingly, this kind of algorithm is not core-limited but limited instead by its processing time
...The final output from a conflation algorithm is a set of classes, one for each stem detected
...Queries are of course treated in the same way
...Indexing An index language is the language used to describe documents and requests
... |
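The idea that a conflation algorithm outputs a set of classes, one for each stem detected, can be shown with a toy sketch. The suffix list, the minimum stem length and the names `stem` and `conflation_classes` are assumptions made for illustration. This is not the algorithm described in the source, and it does not handle equivalent stem endings such as B and PT.

```python
from collections import defaultdict

# Hypothetical suffix list, checked in order (longer suffixes first).
SUFFIXES = ("ations", "ation", "ings", "ing", "ed", "es", "s")


def stem(word, min_stem=3):
    """Strip the first matching suffix, keeping at least min_stem characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem:
            return word[: -len(suffix)]
    return word


def conflation_classes(words):
    """Group words into classes keyed by their stem."""
    classes = defaultdict(set)
    for word in words:
        classes[stem(word)].add(word.lower())
    return dict(classes)


classes = conflation_classes(["retrieved", "retrieves", "retrieving", "index"])
print(classes)
```

Queries would be passed through the same `stem` function, so that query terms and document terms fall into the same classes.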
| 30 |
In practice, one seeks some sort of optimal trade-off between representation and discrimination
...The emphasis on representation leads to what one might call a document orientation: that is, a total preoccupation with modelling what the document is about
...This point of view is also adopted by those concerned with defining a concept of information; they assume that once this notion is properly explicated a document can be represented by the information it contains [37] ...The emphasis on discrimination leads to a query orientation
...Automatic keyword classification Many automatic retrieval systems rely on thesauri to modify queries and document representatives to improve the chance of retrieving relevant documents
... |