search-collection-enqueue

Enqueue content (or URL references) to be processed by the crawler. The SOAP name of this function is SearchCollectionEnqueue.

Synopsis

crawler-service-enqueue-response nodeset search-collection-enqueue(collection, subcollection, crawl-urls, exception-on-failure, crawl-type);
nmtoken collection;
enum subcollection;
crawl-url nodeset crawl-urls;
boolean exception-on-failure;
enum crawl-type;

Parameters

nmtoken collection - The name of the collection to which you will enqueue the XML. (Required)
enum subcollection - The subcollection in which to enqueue. Default value: live. Possible values: live|staging.
crawl-url nodeset crawl-urls - A set of crawl-url nodes, each of which either contains content to be indexed or points to a URL to be crawled. (Required) Please ensure that:
  - You specify a URL for each crawl-url; it is used as a reference for update purposes.
  - If the URL is just a reference and does not need to be fetched, you set the status to complete so that the crawler does not try to fetch the URL.
  - If you are enqueueing a URL multiple times and expect the new enqueue to overwrite the previous ones, you set the enqueue-type to reenqueued.
  - What you enqueue is allowed. This is usually done by setting the default-allow curl-option to allow in the crawler configuration of your collection (which is set by default if you based your collection on default-push) or by ensuring that the URL satisfies the filter requirements set by the seed in that same configuration. You can also force the URL to be allowed by specifying the same default-allow curl-option under the crawl-url node that you are passing.
boolean exception-on-failure - If false, the search-collection-enqueue exception is thrown only for problems related to communicating with the crawler. If true, the exception is thrown if any URL was not successfully enqueued. Default value: false.
enum crawl-type - The mode in which to start the crawler, if it needs to be started. Using resume-and-idle causes the crawler to skip processing of any existing data, allowing for a faster response time. Using resume causes any pending data to be processed before the enqueue, preserving the ordering of pending URLs. Default value: resume-and-idle. Possible values: resume|resume-and-idle.
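
The requirements on crawl-urls can be illustrated with a small Java sketch that assembles the XML for one crawl-url node carrying its own content. The url, status, and enqueue-type settings come from the parameter description above. The enclosing crawl-urls element, the crawl-data child element, and its content-type and encoding attributes are assumptions made for illustration, as are the class and method names; check them against the schema of your installation before relying on them.

```java
public class CrawlUrlSketch {
    /** Escapes characters that are unsafe in XML attribute values and text. */
    static String escapeXml(String s) {
        return s.replace("&", "&amp;")   // must run first so later entities are not re-escaped
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    /**
     * Builds one crawl-url node that carries its own content.
     * status="complete" tells the crawler not to fetch the URL;
     * enqueue-type="reenqueued" asks for earlier enqueues of the same URL to be overwritten.
     * The crawl-data element and its attributes are assumptions, not confirmed by this page.
     */
    static String crawlUrl(String url, String text) {
        return "<crawl-url url=\"" + escapeXml(url) + "\""
             + " status=\"complete\" enqueue-type=\"reenqueued\">"
             + "<crawl-data content-type=\"text/plain\" encoding=\"text\">"
             + escapeXml(text)
             + "</crawl-data></crawl-url>";
    }

    public static void main(String[] args) {
        // Hypothetical URL and content; the wrapping <crawl-urls> element is an assumption.
        String xml = "<crawl-urls>"
                   + crawlUrl("http://example.com/doc?id=1&rev=2", "Hello <world>")
                   + "</crawl-urls>";
        System.out.println(xml);
    }
}
```

The resulting string would be passed as the crawl-urls parameter. Escaping the URL matters because query strings commonly contain an ampersand, which is not valid inside an XML attribute value unless escaped.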

Return Value

crawler-service-enqueue-response nodeset

Exceptions

search-collection-invalid-name
search-collection-enqueue

Authentication

Like all Watson Explorer Engine API functions with the exception of ping, the search-collection-enqueue function requires authentication.

When using REST, you can simply pass v.username and v.password as CGI parameters via HTTP or HTTPS to authenticate the REST call to the search-collection-enqueue function.
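
As a rough sketch of the REST case, the following Java code assembles a query string that carries v.username and v.password alongside the function's own parameters. The parameter names collection, crawl-urls, and crawl-type are taken from this page. The host name, the endpoint path, and the v.app and v.function parameters are assumptions for illustration, and the class and method names are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

public class EnqueueRestUrlSketch {
    /** URL-encodes one query-string component as UTF-8. */
    static String enc(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is supported by every JVM
        }
    }

    /** Joins the parameters into a query string, preserving insertion order. */
    static String buildQuery(Map<String, String> params) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (sb.length() > 0) {
                sb.append('&');
            }
            sb.append(enc(e.getKey())).append('=').append(enc(e.getValue()));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("v.app", "api-rest");                        // assumption: REST application selector
        params.put("v.function", "search-collection-enqueue");  // assumption: function selector
        params.put("v.username", "joe-user");                   // credentials, as described above
        params.put("v.password", "joes-password");
        params.put("collection", "example-collection");         // hypothetical collection name
        params.put("crawl-urls",
                "<crawl-url url=\"http://example.com/doc1\" status=\"complete\"/>");
        params.put("crawl-type", "resume-and-idle");

        // Hypothetical host and path; substitute the endpoint of your own installation.
        String base = "https://engine.example.com/vivisimo/cgi-bin/velocity";
        System.out.println(base + "?" + buildQuery(params));
    }
}
```

Because the credentials travel as ordinary CGI parameters, HTTPS is the safer of the two transports mentioned above.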

When using the SOAP API, you can pass credentials as parameters on an endpoint, or you can use the authentication method that is supported by all Watson Explorer Engine functions. Each function provides a setAuthentication method that can be passed an authentication object supplying the username and password under which the function executes. An example of this in Java for a SOAP call to the search-collection-enqueue function is the following:

    // Credentials under which the function will execute.
    Authentication authentication = new Authentication();
    authentication.setUsername("joe-user");
    authentication.setPassword("joes-password");

    // Attach the credentials to the enqueue request.
    SearchCollectionEnqueue enqueue = new SearchCollectionEnqueue();
    enqueue.setAuthentication(authentication);

A single authentication object is typically reused throughout an application.