java.lang.Object
  org.apache.lucene.util.AttributeSource
    org.apache.lucene.analysis.TokenStream
      org.apache.lucene.analysis.Tokenizer
        org.apache.lucene.analysis.cjk.CJKTokenizer
@Deprecated public final class CJKTokenizer
CJKTokenizer is designed for Chinese, Japanese, and Korean languages.
The tokens returned are overlapping pairs of adjacent characters (bigrams).
Example: "java C1C2C3C4" will be segmented to: "java" "C1C2" "C2C3" "C3C4".
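Additionally, the following is applied to Latin text (such as English): text is converted to lowercase; numeric digits, '+', '#', and '_' are tokenized as letters; and full-width forms are converted to half-width forms.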
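The segmentation rule described above can be illustrated with a small standalone sketch. This is not the Lucene implementation and does not use any Lucene classes; the class name `CjkBigramSketch` and its helper methods are hypothetical, and the choice of which Unicode scripts count as CJK is an assumption made for the example. It only shows how a run of Latin characters becomes one lowercased token while a run of CJK characters becomes overlapping two-character tokens.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; a hypothetical helper, not the Lucene CJKTokenizer. */
final class CjkBigramSketch {

    /** Folds full-width ASCII variants (U+FF01..U+FF5E) to their half-width forms. */
    static char foldWidth(char c) {
        return (c >= 0xFF01 && c <= 0xFF5E) ? (char) (c - 0xFEE0) : c;
    }

    /** ASCII letters, digits, '+', '#', and '_' are treated as Latin token characters. */
    static boolean isLatinTokenChar(char c) {
        return c < 128 && (Character.isLetterOrDigit(c) || c == '+' || c == '#' || c == '_');
    }

    /** Assumption of this sketch: Han, Hiragana, Katakana, and Hangul count as CJK. */
    static boolean isCjk(char c) {
        Character.UnicodeScript s = Character.UnicodeScript.of(c);
        return s == Character.UnicodeScript.HAN
                || s == Character.UnicodeScript.HIRAGANA
                || s == Character.UnicodeScript.KATAKANA
                || s == Character.UnicodeScript.HANGUL;
    }

    /** Splits text into lowercased Latin tokens and overlapping CJK bigrams. */
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        int n = text.length();
        int i = 0;
        while (i < n) {
            char c = foldWidth(text.charAt(i));
            if (isLatinTokenChar(c)) {
                // Consume the whole Latin run as a single lowercased token.
                StringBuilder sb = new StringBuilder();
                while (i < n && isLatinTokenChar(foldWidth(text.charAt(i)))) {
                    sb.append(Character.toLowerCase(foldWidth(text.charAt(i))));
                    i++;
                }
                tokens.add(sb.toString());
            } else if (isCjk(c)) {
                // Find the end of the CJK run, then emit every adjacent pair.
                int start = i;
                while (i < n && isCjk(text.charAt(i))) {
                    i++;
                }
                if (i - start == 1) {
                    // Sketch assumption: a lone CJK character is emitted by itself.
                    tokens.add(text.substring(start, i));
                } else {
                    for (int j = start; j + 1 < i; j++) {
                        tokens.add(text.substring(j, j + 2));
                    }
                }
            } else {
                // Whitespace, punctuation, and anything else act as separators.
                // Supplementary characters (surrogate pairs) are not handled here.
                i++;
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // "java" followed by four Han characters (written as escapes for portability).
        System.out.println(tokenize("java \u65E5\u672C\u8A9E\u5B66"));
    }
}
```

With the input in `main`, the sketch yields "java" followed by three overlapping two-character tokens, which mirrors the "java" "C1C2" "C2C3" "C3C4" pattern in the example above.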
Nested Class Summary |
---|
Nested classes/interfaces inherited from class org.apache.lucene.util.AttributeSource |
---|
org.apache.lucene.util.AttributeSource.AttributeFactory, org.apache.lucene.util.AttributeSource.State |
Field Summary |
---|
Fields inherited from class org.apache.lucene.analysis.Tokenizer |
---|
input |
Constructor Summary | |
---|---|
CJKTokenizer(org.apache.lucene.util.AttributeSource.AttributeFactory factory, Reader in) | Deprecated. |
CJKTokenizer(org.apache.lucene.util.AttributeSource source, Reader in) | Deprecated. |
CJKTokenizer(Reader in) | Deprecated. Construct a token stream processing the given input. |
Method Summary | |
---|---|
void | end() Deprecated. |
boolean | incrementToken() Deprecated. Returns true for the next token in the stream, or false at EOS. |
void | reset() Deprecated. |
void | reset(Reader reader) Deprecated. |
Methods inherited from class org.apache.lucene.analysis.Tokenizer |
---|
close, correctOffset |
Methods inherited from class org.apache.lucene.util.AttributeSource |
---|
addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, restoreState, toString |
Methods inherited from class java.lang.Object |
---|
clone, finalize, getClass, notify, notifyAll, wait, wait, wait |
Constructor Detail |
---|
public CJKTokenizer(Reader in)
Parameters: in - I/O reader

public CJKTokenizer(org.apache.lucene.util.AttributeSource source, Reader in)

public CJKTokenizer(org.apache.lucene.util.AttributeSource.AttributeFactory factory, Reader in)
Method Detail |
---|
public boolean incrementToken() throws IOException
Specified by: incrementToken in class org.apache.lucene.analysis.TokenStream
Throws: IOException - if a read error occurs

public final void end()
Overrides: end in class org.apache.lucene.analysis.TokenStream

public void reset() throws IOException
Overrides: reset in class org.apache.lucene.analysis.TokenStream
Throws: IOException

public void reset(Reader reader) throws IOException
Overrides: reset in class org.apache.lucene.analysis.Tokenizer
Throws: IOException