Concept: Stemming


Similar concepts

Similarity

Concept

Automatic indexing

Indexing

Index language

Index language specificity

Indexing specificity

Automatic thesaurus

Automatic abstracting

Indexing exhaustivity

Automatic text analysis

Automatic classification

Pages with this concept

Similarity

Page

Snapshot

Generating document representatives: conflation. Ultimately one would like to develop a text processing system which, by means of computable methods with the minimum of human intervention, will generate from the input text (full text, abstract, or title) a document representative adequate for use in an automatic retrieval system ... Such a system will usually consist of three parts: (1) removal of high frequency words, (2) suffix stripping, (3) detecting equivalent stems ... The removal of high frequency words, 'stop' words or 'fluff' words, is one way of implementing Luhn's upper cut-off ... Table 2 ... The second stage, suffix stripping, is more complicated ... Table 2 ... (1) the length of remaining stem exceeds a given number; the default is usually 2; (2) the stem ending satisfies a certain condition, e ... Many words, which are equivalent in the above sense, map to one morphological form by removing their suffixes ...
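
The three stages named in the snapshot can be sketched roughly as follows. This is a minimal illustration, not the system the excerpt describes: the stop-word list, the suffix list, and the function names are all hypothetical, and "detecting equivalent stems" is simplified here to grouping words that share an identical stem. Only the minimum stem length of 2 comes from the text.

```python
from collections import defaultdict

# Hypothetical stop-word list; a real system would use a much longer one.
STOP_WORDS = frozenset({"the", "of", "and", "a", "is", "in", "to", "for"})

# Hypothetical suffix list, ordered so longer suffixes are tried first.
SUFFIXES = ("ations", "ation", "ing", "ers", "er", "ed", "es", "s")

# From the excerpt: the remaining stem must exceed a given length (default 2).
MIN_STEM_LENGTH = 2


def remove_stop_words(tokens):
    """Stage 1: drop high-frequency words."""
    return [t for t in tokens if t not in STOP_WORDS]


def strip_suffix(word, min_stem_length=MIN_STEM_LENGTH):
    """Stage 2: remove the first listed suffix that leaves a long enough stem."""
    for suffix in SUFFIXES:
        if word.endswith(suffix):
            stem = word[: -len(suffix)]
            if len(stem) > min_stem_length:
                return stem
    return word  # no suffix could be removed safely


def conflate(text):
    """Stage 3 (simplified): group words that reduce to the same stem."""
    tokens = remove_stop_words(text.lower().split())
    classes = defaultdict(set)
    for token in tokens:
        classes[strip_suffix(token)].add(token)
    return dict(classes)


if __name__ == "__main__":
    print(conflate("the indexing of indexed documents and the indexer"))
```

Here "indexing", "indexed", and "indexer" all fall into one class under the stem "index", while a short word such as "used" is left alone because stripping "ed" would leave a stem of only two characters. The second condition mentioned in the excerpt (a test on the stem ending) is truncated in the snapshot and is therefore not modelled.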