I would like to clean certain documents out of the index files. To do this, I am trying to retrieve all documents in the system from a Java program. At the moment I use the REST search API to retrieve them, but with this option I find I have to keep paging through the results.
Do you know a way of reading all documents from within a Java program? If I were using Lucene I would read the index directory, but when I try that I get "no segments found in the directory".
Does anyone here have experience of reading all documents from within a Java program?
4 replies Latest Post - 2012-11-16T17:00:46Z by bfoyle
Pinned topic read all documents in a program - java program
Re: read all documents in a program - java program
2012-11-16T10:30:11Z in response to SystemAdmin
Hi Xmax,
Perhaps you can try using SIAPI StreamingSearch. The last time I used that feature was with OmniFind 8.5, but as the sample is still shipped with ICA 3.0 I would guess it still works (even though SIAPI Search is declared deprecated).
Have a look at the samples in <ES_INSTALL_ROOT>\samples\siapi\StreamingSearchExample.java
Your query would be : and theoretically that should return you all the documents...
Re: read all documents in a program - java program
2012-11-16T12:52:06Z in response to SystemAdmin
Hi Marcell,
Thanks for the quick reply. Yes, that has worked for me. Thanks very much. This is in the SIAPI, though. Isn't that going to be deprecated?
In any case, cheers Marcell, and have a good weekend.
bfoyle 060001WDQ360 Posts ACCEPTED ANSWER
Re: read all documents in a program - java program
2012-11-16T17:00:46Z in response to SystemAdmin
I think the streaming portion is the one part we are going to have to keep as it is, precisely because of the problem you are finding with paging in the REST API.
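For anyone who stays on the REST API in the meantime, the paging loop mentioned in the original question could look roughly like the sketch below. It is not taken from the product: `PagedReader`, `readAll` and the `fetchPage` callback are hypothetical names, and `fetchPage` stands in for whatever single REST search call you make with a start offset and a page size. It also assumes the service returns a short or empty page once the results run out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;

// Hypothetical helper, not part of ICA, SIAPI or the REST API.
class PagedReader {

    /**
     * Collects every result by requesting consecutive pages until a page
     * comes back with fewer entries than requested.
     *
     * fetchPage is an assumed stand-in for one search call:
     * (startOffset, pageSize) -> results of that page.
     */
    static <T> List<T> readAll(BiFunction<Integer, Integer, List<T>> fetchPage,
                               int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        List<T> all = new ArrayList<>();
        int start = 0;
        while (true) {
            List<T> page = fetchPage.apply(start, pageSize);
            all.addAll(page);
            // A short (or empty) page means there is nothing left to fetch.
            if (page.size() < pageSize) {
                break;
            }
            start += page.size();
        }
        return all;
    }
}
```

You would pass in a lambda that performs the actual HTTP request and parses one page of results. The loop itself only does the offset bookkeeping, which is exactly the part that streaming search saves you from.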