A problem using mmseg4j 1.9 with Lucene 4.9, solved by modifying the mmseg4j source

Author: hunanlzg
Published: 2015-07-08 16:56:15

Today, while writing a Lucene 4.9 demo, I used the mmseg4j 1.9 analyzer directly, and the program threw the following exception.

java.lang.IllegalStateException: TokenStream contract violation: reset()/close() call missing, reset() called multiple times, or subclass does not call super.reset(). Please see Javadocs of TokenStream class for more information about the correct consuming workflow.
	at org.apache.lucene.analysis.Tokenizer$1.read(Tokenizer.java:111)
	at java.io.BufferedReader.fill(BufferedReader.java:161)
	at java.io.BufferedReader.read(BufferedReader.java:182)
	at java.io.FilterReader.read(FilterReader.java:65)
	at java.io.PushbackReader.read(PushbackReader.java:90)
	at com.chenlb.mmseg4j.MMSeg.readNext(MMSeg.java:42)
	at com.chenlb.mmseg4j.MMSeg.next(MMSeg.java:64)
	at com.chenlb.mmseg4j.analysis.MMSegTokenizer.incrementToken(MMSegTokenizer.java:64)
	at org.apache.lucene.index.DefaultIndexingChain$PerField.invert(DefaultIndexingChain.java:604)
	at org.apache.lucene.index.DefaultIndexingChain.processField(DefaultIndexingChain.java:342)
	at org.apache.lucene.index.DefaultIndexingChain.processDocument(DefaultIndexingChain.java:301)
	at org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:222)
	at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:450)
	at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1507)
	at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1222)
	at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1203)
	at LuceneDemo.createIndex(LuceneDemo.java:37)
	at LuceneTest.test1(LuceneTest.java:8)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:483)
	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:73)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:46)
	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:180)
	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:41)
	at org.junit.runners.ParentRunner$1.evaluate(ParentRunner.java:173)
	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31)
	at org.junit.runners.ParentRunner.run(ParentRunner.java:220)
	at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:46)
	at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467)
	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)

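A search online suggested that the fix is to modify the mmseg4j source code.

The change is as follows.

Modify the reset method of the MMSegTokenizer class (it really just adds one line):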

	public void reset() throws IOException {
		// lucene 4.0
		// org.apache.lucene.analysis.Tokenizer.setReader(Reader)
		// setReader is called automatically, and input is set automatically.
		super.reset();   // add this line
		mmSeg.reset(input);
	}
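To see why that one line matters, here is a minimal sketch that does not use Lucene at all. It assumes, based on the stack trace (the failure comes from Tokenizer$1.read) and on my understanding of Lucene 4.x, that Tokenizer keeps a guard reader in input until super.reset() swaps in the reader handed to setReader. ModelTokenizer, BrokenTokenizer, FixedTokenizer and ResetContractDemo are hypothetical names made up for this illustration, not Lucene or mmseg4j classes.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Simplified stand-in for a tokenizer base class (an assumption, not Lucene's real code).
abstract class ModelTokenizer {
    // Guard reader: any read that happens before reset() fails loudly.
    private static final Reader ILLEGAL_STATE_READER = new Reader() {
        @Override
        public int read(char[] cbuf, int off, int len) {
            throw new IllegalStateException("TokenStream contract violation: reset() call missing");
        }

        @Override
        public void close() {
        }
    };

    protected Reader input = ILLEGAL_STATE_READER;        // what subclasses read from
    private Reader inputPending = ILLEGAL_STATE_READER;   // set by setReader(), not yet readable

    public final void setReader(Reader reader) {
        this.inputPending = reader;
    }

    // Only this method makes the pending reader visible through input.
    public void reset() throws IOException {
        input = inputPending;
        inputPending = ILLEGAL_STATE_READER;
    }

    public String readAll() throws IOException {
        StringBuilder sb = new StringBuilder();
        int c;
        while ((c = input.read()) != -1) {
            sb.append((char) c);
        }
        return sb.toString();
    }
}

// Mirrors the unpatched tokenizer: reset() is overridden without calling super.reset().
class BrokenTokenizer extends ModelTokenizer {
    @Override
    public void reset() throws IOException {
        // super.reset() is missing, so input is still the guard reader
    }
}

// Mirrors the patched tokenizer: super.reset() runs first.
class FixedTokenizer extends ModelTokenizer {
    @Override
    public void reset() throws IOException {
        super.reset();
    }
}

class ResetContractDemo {
    // Follows the consumer order: setReader, then reset, then read.
    static String consume(ModelTokenizer tokenizer, String text) {
        try {
            tokenizer.setReader(new StringReader(text));
            tokenizer.reset();
            return tokenizer.readAll();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("fixed:  " + consume(new FixedTokenizer(), "lucene"));
        try {
            consume(new BrokenTokenizer(), "lucene");
        } catch (IllegalStateException e) {
            System.out.println("broken: " + e.getMessage());
        }
    }
}
```

In this model the broken subclass fails on its first read, in the same way the real stack trace fails inside MMSeg.readNext, while the fixed subclass reads the text normally.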
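After making the change, compile it to a class file, replace the corresponding class file inside the original jar with the new one, and then repackage the jar.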
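The replace step above can also be done without unpacking the jar by hand. The following is only a sketch under assumptions: JarClassReplacer is a hypothetical helper, the jar and class file paths are whatever you pass on the command line, and the entry name is derived from the class name seen in the stack trace (com.chenlb.mmseg4j.analysis.MMSegTokenizer). It uses the JDK's built-in zip file system to overwrite one entry in place, so back up the original jar first.

```java
import java.io.IOException;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

// Hypothetical helper: overwrites a single entry inside an existing jar.
class JarClassReplacer {

    static void replaceEntry(Path jar, String entryName, Path newClassFile) throws IOException {
        // Open the jar as a file system; the cast picks the (Path, ClassLoader) overload.
        try (FileSystem jarFs = FileSystems.newFileSystem(jar, (ClassLoader) null)) {
            Path target = jarFs.getPath(entryName);
            // Refuse to add a new entry: a typo in entryName should not go unnoticed.
            if (!Files.exists(target)) {
                throw new NoSuchFileException(entryName);
            }
            Files.copy(newClassFile, target, StandardCopyOption.REPLACE_EXISTING);
        } // closing the file system writes the updated jar back to disk
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: JarClassReplacer <jar> <compiled MMSegTokenizer.class>");
            return;
        }
        replaceEntry(
                Paths.get(args[0]),
                "com/chenlb/mmseg4j/analysis/MMSegTokenizer.class",
                Paths.get(args[1]));
    }
}
```

The existence check is a deliberate choice: the goal is to swap one class, so a wrong entry path should fail instead of silently adding a second copy.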
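I'm a newcomer and this is my first time modifying source code. Even though all I did was copy one line from the web, I'm still very happy about it: I learned how to modify source code and how to repackage a jar. Experts, please go easy on me.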

Copyright notice: This is an original article by the blogger and may not be reproduced without the blogger's permission.

Tags: Lucene MM
Source: http://blog.csdn.net/hunanlzg/article/details/37911347
