Configuring Keyed Database Seeds

The following values can be configured for this type of seed:

  • Host - The hostname or IP address of the database server.
  • Port (optional) - The port on which the database is listening.
  • Username (optional) - Username for accessing this database.
  • Password (optional) - Password for accessing this database.
  • Database system - The type of database to connect to.
  • Database name - The name of the database to retrieve results from.
  • Table to retrieve - The name of a table to retrieve.
  • Key Column (optional) - The name of a column to use as a unique key. The key is used to create a unique URL for each row at crawl time and to update those same URLs on later crawls.
  • Timestamp Column (optional) - The name of a column that contains a timestamp denoting when the row was last updated.
  • Start Time (optional) - The time, specified as seconds since the epoch, from which to start a crawl. Requires the Timestamp Column option to be set. This sets the XML value <value-of-var name="live-crawl-date" />.
  • Fetch Size (optional) - The number of rows to retrieve from the database in each interaction. A larger value reduces network overhead but requires more memory to hold the results while they are being processed.
  • Maximum converted size (optional) - Maximum size of the data for a document. This is the largest block of memory that will be loaded at one time. Increase this limit only if the machine running the crawl has sufficient memory.
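
As an illustration of how Start Time, Timestamp Column, and Fetch Size relate, the following Python sketch computes a Start Time value in seconds since the epoch and then reads only the rows updated at or after that time, a fixed number of rows per round trip. It is not the seed's actual implementation. The in-memory SQLite database, the table name articles, the column names id and updated, and the helper functions to_epoch_seconds and rows_since are all hypothetical, and the sketch assumes the timestamp column stores epoch seconds.

```python
import sqlite3
from datetime import datetime, timezone


def to_epoch_seconds(moment: datetime) -> int:
    """Convert a timezone-aware datetime to whole seconds since the epoch."""
    if moment.tzinfo is None:
        raise ValueError("use a timezone-aware datetime to avoid ambiguity")
    return int(moment.timestamp())


def rows_since(conn, table, timestamp_column, start_time, fetch_size):
    """Yield batches of rows whose timestamp is at or after start_time.

    table and timestamp_column are interpolated into the SQL text, so they
    must come from trusted configuration, never from user input.
    """
    cursor = conn.execute(
        f"SELECT * FROM {table} "
        f"WHERE {timestamp_column} >= ? ORDER BY {timestamp_column}",
        (start_time,),
    )
    while True:
        # Each fetchmany call stands in for one interaction with the
        # database; at most fetch_size rows are held in memory at a time.
        batch = cursor.fetchmany(fetch_size)
        if not batch:
            break
        yield batch


# Hypothetical data: 'id' plays the role of the Key Column and 'updated'
# the role of the Timestamp Column (stored here as epoch seconds).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT, updated INTEGER)"
)
conn.executemany(
    "INSERT INTO articles VALUES (?, ?, ?)",
    [(i, f"article {i}", 1_700_000_000 + i * 3600) for i in range(1, 6)],
)

# Value that would be entered as Start Time for midnight UTC, 15 Nov 2023.
start_time = to_epoch_seconds(datetime(2023, 11, 15, tzinfo=timezone.utc))

batches = list(rows_since(conn, "articles", "updated", start_time, fetch_size=2))
for batch in batches:
    print([row[0] for row in batch])
```

With a fetch size of 2, the four matching rows arrive in two batches. Raising the fetch size means fewer round trips, at the cost of holding more rows in memory at once.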