Connecting to a STIX Bundle data source

Connect the STIX Bundle data source to the platform to enable your applications and dashboards to collect and analyze STIX Bundle security data. Universal Data Insights connectors enable federated search across your security products.

Structured Threat Information eXpression (STIX) is a language and serialization format. A STIX Bundle is a collection of STIX Objects and marking definitions that are grouped in a single container. Organizations can share cyberthreat intelligence (CTI) by using STIX Objects. The marking definitions are used as the requirements for handling and sharing the CTI data.

Before you begin

Ensure that a valid STIX Bundle is generated. To validate a STIX Bundle file, you can use the bundle validator script that is stored in the bundle_validator folder of the STIX-Shifter GitHub repository. To validate and troubleshoot the bundle JSON file, follow the instructions in the README.md file in that folder.

Provide access to a JSON file that contains a valid STIX Bundle. The JSON file can be uploaded to a web server, cloud storage, or to any location that is accessible by HTTP request. Optionally, the URL of the JSON file can have a basic authentication requirement.

Alternatively, you can configure an adapter for demonstration purposes that is not connected to any real data source. This dummy connector always returns sample data in the same way as it would return data from IBM® QRadar, Splunk Enterprise Security, or Carbon Black CB Response. For example, an administrator can configure the connector through the UI and a Data Explorer user can then run queries against the connector and view sample results.

If you have a firewall between your cluster and the data source target, use the IBM Security Edge Gateway to host the containers. The Edge Gateway must be V1.6 or later. For more information, see Setting up Edge Gateway.
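
Before you configure the connection, it can help to confirm that the bundle URL is reachable and, if it is protected, that the basic authentication credentials work. The following is a minimal sketch that uses only the Python standard library. The function names build_bundle_request and fetch_bundle, the example URL, and the 30-second timeout are assumptions for illustration and are not part of the platform.

```python
import base64
import json
import urllib.request
from typing import Optional


def build_bundle_request(url: str,
                         username: Optional[str] = None,
                         password: Optional[str] = None) -> urllib.request.Request:
    """Build a GET request for the bundle URL (hypothetical helper).

    A basic authentication header is added only when both a username
    and a password are supplied.
    """
    request = urllib.request.Request(url, method="GET")
    if username is not None and password is not None:
        raw = f"{username}:{password}".encode("utf-8")
        token = base64.b64encode(raw).decode("ascii")
        request.add_header("Authorization", f"Basic {token}")
    return request


def fetch_bundle(url: str,
                 username: Optional[str] = None,
                 password: Optional[str] = None,
                 timeout: float = 30.0) -> dict:
    """Download the file at the URL and parse it as JSON (hypothetical helper)."""
    request = build_bundle_request(url, username, password)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

For example, fetch_bundle("https://example.com/bundle.json", "user", "secret") would return the parsed bundle as a dictionary if the server accepts the credentials, and raise an error otherwise. Run the check from a host that has the same network access as the cluster or the Edge Gateway, so that the result reflects what the connector will see.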

About this task

To query a STIX Bundle, your bundle must be STIX 2.0 and must contain observed-data objects.
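
The two requirements above can be checked quickly before you connect the data source. The following sketch is not the bundle validator script from the STIX-Shifter repository. It is a small pre-check, with the hypothetical function name find_bundle_problems, that assumes a STIX 2.0 bundle declares "spec_version": "2.0" at the top level and lists its content in an "objects" array.

```python
def find_bundle_problems(bundle: dict) -> list:
    """Return a list of human-readable problems; an empty list means the
    bundle passes this pre-check (hypothetical helper, not a full validator)."""
    problems = []

    if bundle.get("type") != "bundle":
        problems.append("top-level 'type' is not 'bundle'")

    # Assumption: STIX 2.0 bundles carry the version on the bundle itself.
    if bundle.get("spec_version") != "2.0":
        problems.append("bundle 'spec_version' is not '2.0'")

    objects = bundle.get("objects")
    if not isinstance(objects, list):
        objects = []

    has_observed_data = any(
        isinstance(obj, dict) and obj.get("type") == "observed-data"
        for obj in objects
    )
    if not has_observed_data:
        problems.append("no observed-data objects found")

    return problems
```

A bundle that passes this pre-check can still fail full validation, so use the bundle validator script for a complete check.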

Procedure

  1. Go to Menu > Connections > Data sources.
  2. On the Data Sources tab, click Connect a data source.
  3. Click STIX Bundle, then click Next.
  4. Configure the connection to the data source.
    1. In the Data source name field, assign a name to uniquely identify the data source connection.
      You can create multiple connection instances to a data source, so it is useful to clearly distinguish them by name. Only alphanumeric characters and the following special characters are allowed: - . _
    2. In the Data source description field, write a description to indicate the purpose of the data source connection.
      You can create multiple connection instances to a data source, so it is useful to clearly indicate the purpose of each connection by description. Only alphanumeric characters and the following special characters are allowed: - . _
    3. If you have a firewall between your cluster and the data source target, use the Edge Gateway to host the containers. In the Edge gateway (optional) field, specify which Edge Gateway to use.
      It can take up to five minutes for the status of newly deployed data source connections on the Edge Gateway to show as connected.
    4. In the Full URL of a stix-bundle file field, set the URL of the STIX Bundle JSON file so that the platform can communicate with it. This information is required.
      Alternatively, you can use the following URLs to configure a dummy data source connection that is only for demonstration purposes. The STIX Bundle URLs contain sample data for the respective data sources.
      • CloudWatch: https://raw.github.com/opencybersecurityalliance/stix-shifter/develop/data/cybox/aws/aws_cloudwatch_logs_19062020.json
      • QRadar®: https://raw.github.com/opencybersecurityalliance/stix-shifter/develop/data/cybox/qradar/qradar_observed_2000.json
      • Splunk: https://raw.github.com/opencybersecurityalliance/stix-shifter/develop/data/cybox/splunk/splunk_observed_1143.json
      • Carbon Black CB Response: https://raw.github.com/opencybersecurityalliance/stix-shifter/develop/data/cybox/carbon_black/cb_observed_156.json
  5. Set the query parameters to control the behavior of the search query on the data source.
    1. In the Concurrent search limit field, set the number of simultaneous connections that can be made to the data source. The default limit for the number of connections is 4. The value must not be less than 1 and must not be greater than 100.
    2. In the Query search timeout limit field, set the time limit in minutes for how long the query is run on the data source. The default time limit is 30. When the value is set to zero, there is no timeout. The value must not be less than 1 and must not be greater than 120.
    3. In the Query time range field, set the time range in minutes for the search, represented as the last X minutes. The default is 5 minutes. The value must not be less than 1 and must not be greater than 10,000.
  6. Optional: If you need to customize the STIX attributes mapping, click Customize attribute mapping and edit the JSON blob to map new or existing properties to their associated target data source fields.
  7. Configure identity and access.
    1. Click Add a Configuration.
    2. In the Configuration name field, enter a unique name to describe the access configuration and distinguish it from the other access configurations for this data source connection that you might set up. Only alphanumeric characters and the following special characters are allowed: - . _
    3. In the Configuration description field, enter a description of the access configuration that distinguishes it from the other access configurations that you might set up for this data source connection. Only alphanumeric characters and the following special characters are allowed: - . _
    4. Click Edit access and choose which users can connect to the data source and the type of access.
    5. In the Username (optional) field, enter a username with access to the search API.
    6. In the Password (optional) field, enter the password for that username.
    7. Click Add.
    8. To save your configuration and establish the connection, click Done.
    You can see the data source connection configuration that you added under Connections on the data source settings page. A message on the card indicates the status of the connection with the data source.
    When you add a data source, it might take a few minutes before the data source shows as being connected.
    Tip: After you connect a data source, it might take up to 30 seconds to retrieve the data. Before the full data set is returned, the data source might display as unavailable. After the data is returned, the data source shows as being connected, and a polling mechanism periodically validates the connection status. The connection status is valid for 60 seconds after every poll.

    You can add other connection configurations for this data source that have different users and different data access permissions.

  8. To edit your configurations, complete the following steps:
    1. On the Data Sources tab, select the data source connection that you want to edit.
    2. In the Configurations section, click the Edit configuration icon.
    3. Edit the identity and access parameters and click Save.
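
The defaults and limits for the query parameters in step 5 can be summarized in a small sketch, which might be useful if you record planned settings before you enter them in the UI. The class QueryParameters, the LIMITS table, and the function out_of_range_fields are hypothetical names and are not part of the platform. The timeout range follows the stated limits of 1 to 120 minutes; the procedure also mentions a value of zero for no timeout, which this sketch does not model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueryParameters:
    """Planned query settings, initialized to the documented defaults."""
    concurrent_search_limit: int = 4   # simultaneous connections
    search_timeout_minutes: int = 30   # query search timeout limit
    time_range_minutes: int = 5        # query time range, last X minutes


# Inclusive (minimum, maximum) limits as stated in step 5.
LIMITS = {
    "concurrent_search_limit": (1, 100),
    "search_timeout_minutes": (1, 120),
    "time_range_minutes": (1, 10_000),
}


def out_of_range_fields(params: QueryParameters) -> list:
    """Return the names of the fields whose values fall outside the limits."""
    invalid = []
    for name, (low, high) in LIMITS.items():
        value = getattr(params, name)
        if not low <= value <= high:
            invalid.append(name)
    return invalid
```

For example, out_of_range_fields(QueryParameters(time_range_minutes=20000)) reports the time range field, because 20,000 minutes exceeds the documented maximum.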

Results

If you use the supplied URLs to configure a dummy data source connection, as outlined in step 4, the data source contains only sample data. Therefore, queries to the data source return results only if they match the sample data. The following simple example queries return results based on the sample data that is contained in the URLs.

[ ipv4-addr:value != '127.0.0.1' ]
[ network-traffic:src_ref.value != '127.0.0.1' ]
[ network-traffic:dst_port = 443 ]
[ user-account:user_id = 'test' ]
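
To see what kind of data a query such as the dst_port example targets, it can help to look at the shape of an observed-data object. The following sketch does not reproduce how the platform evaluates STIX patterns. It only walks a small, made-up STIX 2.0 bundle and returns the observed-data objects that contain a network-traffic object with a given destination port. The name observed_data_with_dst_port and the SAMPLE_BUNDLE content are assumptions for illustration.

```python
# Hypothetical, heavily trimmed bundle. In STIX 2.0, each observed-data
# object holds its cyber observables in an "objects" dictionary.
SAMPLE_BUNDLE = {
    "type": "bundle",
    "spec_version": "2.0",
    "objects": [
        {
            "type": "observed-data",
            "id": "observed-data--a",
            "objects": {
                "0": {"type": "ipv4-addr", "value": "10.0.0.5"},
                "1": {"type": "network-traffic", "src_ref": "0", "dst_port": 443},
            },
        },
        {
            "type": "observed-data",
            "id": "observed-data--b",
            "objects": {
                "0": {"type": "network-traffic", "dst_port": 80},
            },
        },
        {"type": "identity", "id": "identity--c"},
    ],
}


def observed_data_with_dst_port(bundle: dict, port: int) -> list:
    """Return observed-data objects that include network traffic to the port."""
    matches = []
    for obj in bundle.get("objects", []):
        if obj.get("type") != "observed-data":
            continue
        observables = obj.get("objects", {})
        if any(
            item.get("type") == "network-traffic" and item.get("dst_port") == port
            for item in observables.values()
        ):
            matches.append(obj)
    return matches
```

Calling observed_data_with_dst_port(SAMPLE_BUNDLE, 443) returns only the first observed-data object, which is the kind of record that the query [ network-traffic:dst_port = 443 ] is intended to find.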

What to do next

Test the connection by running a query with IBM Security Data Explorer. To use Data Explorer, you must have data sources that are connected so that the application can run queries and retrieve results across a unified set of data sources. The search results vary depending on the data that is contained in your configured data sources. For more information about how to build a query in Data Explorer, see Build a query.