search-collection-enqueue-url
Enqueues a URL for processing by the crawler. The SOAP name of this function is SearchCollectionEnqueueUrl.
Synopsis
crawler-service-enqueue-response nodeset search-collection-enqueue-url(collection, subcollection, url, synchronization, enqueue-type, force-allow, crawl-type);
nmtoken collection;
enum subcollection;
string url;
enum synchronization;
enum enqueue-type;
boolean force-allow;
enum crawl-type;
Parameters
- nmtoken collection - The name of the collection to which you will enqueue a URL. (Required)
- enum subcollection - The subcollection in which to enqueue. Default value: live. Possible values: live|staging.
- string url - The URL to enqueue. (Required)
- enum synchronization - Indicates at which point the crawler should return success for an enqueued crawl-url. Default value: enqueued. Possible values: enqueued|indexed|none|to-be-crawled|to-be-indexed.
  - none: immediately after receiving the enqueue.
  - enqueued: after the crawl-url is found to satisfy the crawl conditions and a fetch is attempted.
  - to-be-crawled: after the crawl-url is found to satisfy the crawl conditions, a fetch is attempted, and the crawl-url is committed to secondary storage.
  - to-be-indexed: after the resource at the URL has been crawled, converted, and committed to secondary storage.
  - indexed: after the converted resource has been recorded by the indexer.
- enum enqueue-type - Indicates how the enqueued URL should be processed by the crawler. Default value: none. Possible values: none|forced|reenqueued.
  - none: The URL is subject to all the standard checks: deduplication, URL pattern limits, and expiration.
  - forced: Ignore the duplicates check and URL limits when processing the crawl-url.
  - reenqueued: Ignore the duplicates check, URL limits, and all expiration options when processing the crawl-url.
- boolean force-allow - Force this URL to be allowed (and never filtered), whatever the settings are in the collection crawler configuration. Default value: false.
- enum crawl-type - The mode in which to start the crawler, if it needs to be started. Using resume-and-idle causes the crawler to skip any existing data, allowing for a faster response time. Using resume causes any pending data to be processed before the enqueue, preserving the ordering of pending URLs. Default value: resume-and-idle. Possible values: resume|resume-and-idle.
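The defaults and allowed values listed above can be collected in a small helper before a request is made. The following is a minimal sketch, not part of the Watson Explorer Engine API: the class EnqueueUrlParams and its methods are hypothetical names introduced for illustration, and only the parameter names, defaults, and enum values are taken from the list above.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not part of the product API): holds the documented
// parameters of search-collection-enqueue-url with their documented defaults.
final class EnqueueUrlParams {
    // Allowed values per enum parameter, copied from the parameter list above.
    static final Map<String, List<String>> ALLOWED = Map.of(
        "subcollection", List.of("live", "staging"),
        "synchronization", List.of("enqueued", "indexed", "none", "to-be-crawled", "to-be-indexed"),
        "enqueue-type", List.of("none", "forced", "reenqueued"),
        "crawl-type", List.of("resume", "resume-and-idle"));

    // Builds the full parameter set; collection and url are the two required values.
    static Map<String, String> withDefaults(String collection, String url) {
        if (collection == null || collection.isEmpty() || url == null || url.isEmpty()) {
            throw new IllegalArgumentException("collection and url are required");
        }
        Map<String, String> params = new LinkedHashMap<>();
        params.put("collection", collection);
        params.put("subcollection", "live");
        params.put("url", url);
        params.put("synchronization", "enqueued");
        params.put("enqueue-type", "none");
        params.put("force-allow", "false");
        params.put("crawl-type", "resume-and-idle");
        return params;
    }

    // Overrides one enum parameter, rejecting values the documentation does not list.
    static void set(Map<String, String> params, String name, String value) {
        List<String> allowed = ALLOWED.get(name);
        if (allowed == null || !allowed.contains(value)) {
            throw new IllegalArgumentException(name + " does not accept " + value);
        }
        params.put(name, value);
    }
}
```

For example, a caller that needs to wait until the document is searchable could call EnqueueUrlParams.set(params, "synchronization", "indexed") on the map returned by withDefaults.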
Return Value
- crawler-service-enqueue-response nodeset
Exceptions
- search-collection-invalid-name
- search-collection-enqueue
Authentication
Like all Watson Explorer Engine API functions except for ping, the search-collection-enqueue-url function requires authentication.
When using REST, you can authenticate the call to the search-collection-enqueue-url function by passing v.username and v.password as CGI parameters over HTTP or HTTPS.
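As an illustration of the REST form, the following sketch assembles a request URL with the credentials passed as CGI parameters. Several details are assumptions to verify against your installation: the endpoint shown in the usage note is a placeholder, the v.app=api-rest and v.function parameters reflect a commonly seen form of Engine REST calls rather than anything stated on this page, and the class EnqueueUrlRestExample is a hypothetical name. Only v.username, v.password, collection, and url come from the text above.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: builds (but does not send) a REST request URL.
final class EnqueueUrlRestExample {
    // Percent-encodes each name and value and joins the pairs with '&'.
    static String buildQuery(Map<String, String> params) {
        StringBuilder query = new StringBuilder();
        for (Map.Entry<String, String> entry : params.entrySet()) {
            if (query.length() > 0) {
                query.append('&');
            }
            query.append(URLEncoder.encode(entry.getKey(), StandardCharsets.UTF_8))
                 .append('=')
                 .append(URLEncoder.encode(entry.getValue(), StandardCharsets.UTF_8));
        }
        return query.toString();
    }

    static String buildRequestUrl(String endpoint, String username, String password,
                                  String collection, String url) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("v.app", "api-rest");                            // assumption, see lead-in
        params.put("v.function", "search-collection-enqueue-url"); // assumption, see lead-in
        params.put("v.username", username);
        params.put("v.password", password);
        params.put("collection", collection); // required parameter
        params.put("url", url);               // required parameter
        return endpoint + "?" + buildQuery(params);
    }
}
```

A call such as EnqueueUrlRestExample.buildRequestUrl("https://engine.example.com/vivisimo/cgi-bin/velocity", "joe-user", "joes-password", "example-collection", "http://example.com/page") returns a URL that can be sent with any HTTP client. Because the credentials travel in the query string, HTTPS is the safer of the two transports.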
When using the SOAP API, you can pass credentials as parameters on an endpoint, or you can use the authentication method that is supported by all Watson Explorer Engine functions. Each function object provides a setAuthentication method that accepts an authentication object carrying the user name and password under which the function runs. The following Java example shows this for a SOAP call to the search-collection-enqueue-url function:
// Create the credentials object once.
Authentication authentication = new Authentication();
authentication.setUsername("joe-user");
authentication.setPassword("joes-password");
// Attach the credentials to the function object before invoking it.
SearchCollectionEnqueueUrl foo = new SearchCollectionEnqueueUrl();
foo.setAuthentication(authentication);
A single authentication object would typically be reused throughout each individual application.