Creating a Search Collection
Search collections are created using the search-collection-create function. When creating a new collection, it is a good idea to base it on a pre-defined, pre-configured collection in which most of the basic search collection configuration has already been done. This minimizes the amount of dynamic configuration that the new collection needs. The base collection can be created in the Watson Explorer Engine administration tool, which is usually much more convenient than creating and fully configuring a collection through the API. After creating a search collection, you must ensure that the appropriate services for that collection are running, as described in "Starting, Stopping, Managing, and Monitoring Collection-Specific Services".
The default-push and default-broker-push collections are provided as examples of collections to which data is pushed rather than being retrieved by crawling. The default-push collection is the default collection on which collections that are created using the Watson Explorer Engine API are based. This collection will be used automatically if you do not specify a value for the based-on parameter.
Search collection names follow the standard rules for an XML NMTOKEN. They can consist of any combination of letters, digits, combining characters, and extenders, based on the Unicode definitions of these terms. In more practical terms, search collection names typically consist of letters, digits, and the characters '.', '-', '_', and ':'.
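The practical naming rule above can be checked before a create request is sent. The following Java sketch is an approximation only: it accepts Unicode letters, decimal digits, combining marks, and the characters '.', '-', '_', and ':', which is close to but not identical to the formal XML NMTOKEN production. The class name CollectionNames and the method isPlausibleName are hypothetical and are not part of the Watson Explorer Engine API.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of the Watson Explorer Engine API.
class CollectionNames {
    // Approximation of the practical rule: letters, decimal digits,
    // combining marks, and the characters '.', '-', '_', ':'.
    // This is NOT the exact XML NMTOKEN production.
    private static final Pattern PRACTICAL_NAME =
            Pattern.compile("[\\p{L}\\p{Nd}\\p{M}._:\\-]+");

    // Returns true if the whole name consists of allowed characters.
    static boolean isPlausibleName(String name) {
        return name != null && PRACTICAL_NAME.matcher(name).matches();
    }

    public static void main(String[] args) {
        System.out.println(isPlausibleName("my-new-collection")); // true
        System.out.println(isPlausibleName("my new collection")); // false (contains a space)
    }
}
```

A check like this only catches obviously invalid names early; the server remains the authority on which names it accepts.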
When a new collection is created, a source with the same name is also created unless a source by that name already exists. If a source by that name does exist, no exception is generated, but the new collection is not associated with the pre-existing source. (If you want them to be associated, you can manually configure a connection by modifying the source programmatically or in the Watson Explorer Engine administration tool.) When a collection is deleted, Watson Explorer Engine also attempts to delete the corresponding source; no exception is generated if that source does not exist.
Unlike inheritance for projects and other configuration objects, inheritance for collection configurations is done at creation time and is not dynamic. In other words, changes to the configuration of a collection that other collections are based upon do not propagate to the collections that were created from it.
XML message:
<SearchCollectionCreate xmlns="urn:/velocity/types">
  <collection>my-new-collection</collection>
  <based-on>my-base-collection</based-on>
</SearchCollectionCreate>
In C#:
SearchCollectionCreate scc = new SearchCollectionCreate();
scc.collection = COLLECTION;
scc.basedon = BASED_ON;
// Set the live crawl, staging crawl, and cache directories for the new collection.
scc.collectionmeta = new SearchCollectionCreateCollectionmeta();
scc.collectionmeta.vsemeta = new vsemeta();
scc.collectionmeta.vsemeta.vsemetainfo = new vsemetainfo[1];
scc.collectionmeta.vsemeta.vsemetainfo[0] = new vsemetainfo();
scc.collectionmeta.vsemeta.vsemetainfo[0].livecrawldir = "e:\\parent-dir\\" + COLLECTION + "\\crawl0";
scc.collectionmeta.vsemeta.vsemetainfo[0].stagingcrawldir = "e:\\parent-dir\\" + COLLECTION + "\\crawl1";
scc.collectionmeta.vsemeta.vsemetainfo[0].cachedir = "e:\\parent-dir\\" + COLLECTION + "\\cache";
port.SearchCollectionCreate(scc);
In Java:
SearchCollectionCreate scc = new SearchCollectionCreate();
scc.setCollection(COLLECTION);
scc.setBasedOn(BASED_ON);
port.searchCollectionCreate(scc);
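If you send the XML message form of the request rather than using generated client classes, the message can be assembled as a string. The following Java sketch is illustrative only: the class CreateMessage and its methods are hypothetical helpers, not part of the Watson Explorer Engine API, and it assumes that leaving out the based-on element is how "not specifying a value" for that parameter is expressed, in which case the default-push collection would be used as the base.

```java
// Hypothetical helper, not part of the Watson Explorer Engine API.
class CreateMessage {
    // Escapes the characters that are significant in XML element content.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Builds a SearchCollectionCreate message. When basedOn is null or empty,
    // the based-on element is omitted (assumption: the server then falls back
    // to its default base collection).
    static String build(String collection, String basedOn) {
        StringBuilder sb = new StringBuilder();
        sb.append("<SearchCollectionCreate xmlns=\"urn:/velocity/types\">\n");
        sb.append("  <collection>").append(escape(collection)).append("</collection>\n");
        if (basedOn != null && !basedOn.isEmpty()) {
            sb.append("  <based-on>").append(escape(basedOn)).append("</based-on>\n");
        }
        sb.append("</SearchCollectionCreate>");
        return sb.toString();
    }

    public static void main(String[] args) {
        // Explicit base collection.
        System.out.println(build("my-new-collection", "my-base-collection"));
        // No base collection specified.
        System.out.println(build("my-new-collection", null));
    }
}
```

Building the message in one place keeps the escaping and the optional based-on handling consistent across callers.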