Package | Description |
---|---|
org.apache.lucene.analysis.icu.segmentation | Tokenizer that breaks text into words with the Unicode Text Segmentation algorithm. |
Modifier and Type | Class | Description |
---|---|---|
class | DefaultICUTokenizerConfig | Default ICUTokenizerConfig that is generally applicable to many languages. |
Constructor | Description |
---|---|
ICUTokenizer(Reader input, ICUTokenizerConfig config) | Construct a new ICUTokenizer that breaks text into words from the given Reader, using a tailored BreakIterator configuration. |
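
To illustrate what "breaks text into words" with a BreakIterator means, here is a minimal sketch that uses only the JDK's `java.text.BreakIterator` as a stand-in. It is not Lucene's ICUTokenizer and does not use ICU4J or an ICUTokenizerConfig; the class name `WordBreakSketch` and the method `words` are hypothetical names chosen for this example, and the rule for discarding non-word segments is an assumption made for illustration.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper: illustrates BreakIterator-based word segmentation
// with the JDK only. This is NOT the Lucene ICUTokenizer implementation.
public class WordBreakSketch {

    // Returns the segments between word boundaries that contain at least
    // one letter or digit; whitespace and punctuation segments are skipped.
    static List<String> words(String text) {
        BreakIterator it = BreakIterator.getWordInstance(Locale.ROOT);
        it.setText(text);
        List<String> out = new ArrayList<>();
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String segment = text.substring(start, end);
            // Assumption for this sketch: a "word" has a letter or a digit.
            if (segment.codePoints().anyMatch(Character::isLetterOrDigit)) {
                out.add(segment);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(words("Lucene breaks text into words."));
    }
}
```

The ICUTokenizer constructor listed above takes the break rules from the supplied ICUTokenizerConfig, whereas this sketch relies on the JDK's default word rules.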
Copyright © 2000-2018 Apache Software Foundation. All Rights Reserved.