You can use threat hunting in Data Explorer to import, group, join, and manipulate data to prove or disprove your hypotheses. For example, you might want to know whether your most critical resources are connected to suspicious external DNS servers.
A hunt is a proactive investigation of an unknown threat to prove or disprove a hypothesis. A hunt comprises steps (questions), variables (answers), and snapshots (evidence).
A step is a question that the hunter asks to see some data. A step includes the Kestrel statement and the hunter's comment.
A snapshot is a static view of a variable. Snapshots are saved as evidence to be referenced in reports.
Steps are at the core of a hunt. Each step pairs a Kestrel statement, which retrieves the data, with the hunter's comment. A step name only identifies the step; if you do not provide one, a default name is applied based on the command in the statement.
When you type a Kestrel command, you can use the prompted template to write the Kestrel statement.
Data Explorer supports the following commands:
- GET - search your data sources: Search for data from any of the connected data sources using STIX patterning.
- APPLY - add analytics: Analyze and enrich your data.
- GROUP - group by column: Create a grouped table based on a column within a variable.
- JOIN - combine variables: Combine two variables together using column values.
- DISP - create a snapshot: Select specific columns from a variable to display in a snapshot in reports.
- FIND - find entities within a variable: Find and return entities that are connected to a specified list of entities.
- NEW - create variable: Create a variable with entities that directly come from the specified data.
- SORT - sort column: Reorder entities in a variable and output the same set of entities with the new order to a new variable.
- COPY - copy data to a new variable: Copy an existing variable to create another.
- MERGE - union entities: Merge entities in multiple variables using a logical union.
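The following sketch shows how several of these commands might chain together across the steps of one hunt, using the DNS hypothesis as the example. It follows the syntax of the open-source Kestrel language, so the exact form that Data Explorer accepts may differ; use the prompted template as the authority. The data source name (stixshifter://my_datasource), the IP addresses, the time range, and all variable names are hypothetical placeholders.

```kestrel
# Step 1 (GET): DNS traffic sent to two hypothetical suspicious servers.
# The data source name and the addresses are placeholders, not real values.
dns_conns = GET network-traffic
            FROM stixshifter://my_datasource
            WHERE [network-traffic:dst_port = 53 AND network-traffic:dst_ref.value IN ('198.51.100.7', '203.0.113.53')]
            START t'2024-01-01T00:00:00Z' STOP t'2024-01-02T00:00:00Z'

# Step 2 (FIND): processes that created those connections.
dns_procs = FIND process CREATED dns_conns

# Step 3 (GROUP): group the connections by source address.
by_source = GROUP dns_conns BY src_ref.value

# Step 4 (SORT): order the connections by bytes sent, largest first.
by_volume = SORT dns_conns BY src_byte_count DESC

# Step 5 (DISP): display selected columns as a snapshot for the report.
DISP by_volume ATTR src_ref.value, dst_ref.value, src_byte_count
```

Each statement corresponds to one step, and each assigned name (dns_conns, dns_procs, and so on) is a variable that later steps can reuse.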
Kestrel is a threat hunting language built for finding previously unknown threats. It provides the ability to find anomalies and malicious behavior that went undetected by your existing defenses. Statements can comprise multiple commands and parameters. You must understand Kestrel to properly use threat hunting in Data Explorer. For more information, see Kestrel Threat Hunting Language.
Analytics analyze and enrich security data. You can create Kestrel statements with the APPLY command to run these analytics on variables that you created in previous hunt steps.
- car-enrich - Add the risk score from the Connected Assets and Risk repository. See Connected Assets and Risk connectors.
  - Parameters: None.
- skcluster - Provide clustering results for selected columns. For more information, see Scikit-learn Clustering.
  - columns: List of columns/attributes to pass to the clustering algorithm.
  - method: Name of the clustering algorithm; one of "kmeans" (the default) or "dbscan".
  Tip: kmeans example:
  APPLY skcluster ON var WITH method=kmeans, n_clusters=3, columns=src_byte_count,dst_byte_count
  Tip: dbscan example:
  APPLY skcluster ON var WITH method=dbscan, eps=0.5, columns=src_byte_count,dst_byte_count
- suspicious-process-scoring - Compute a suspicion score for process objects.
  - Parameters: None.
- tis-enrich - Add threat intelligence to objects such as ipv4-addr, domain, url, and file (hashes).
  - Parameters: None.
  Note: Running the command does not necessarily return results, even when the following success message appears on the screen:
  The data enrichment was added successfully.
  The message indicates only that the job completed successfully, even if no results were generated.
- outliers - Compute outlier scores from a data set. Scores closer to 0 are normal; scores closer to 1.0 are abnormal.
  - columns: List of columns/attributes to pass to the outlier detection algorithm.
  - method: Name of the outlier detection algorithm; one of "auto" (the default; automatically determines an appropriate method), "zscore", or "isolationforest". zscore is a univariate method: it can use only a single numerical attribute as input.
  - k: Integer multiplier for the standard deviation (used only when method is "zscore"). Default is 3.
  - contamination: The proportion of outliers in the data set (used only when method is "isolationforest"). Default is "auto".
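As a sketch of how the remaining analytics might be invoked, the following statements mirror the form of the skcluster examples. This assumes that every analytic accepts the same APPLY ... ON ... WITH ... form, which this page does not confirm for outliers. The variable names (assets, ips, procs, conns) stand for variables from earlier hunt steps and are hypothetical, as is the contamination value of 0.05.

```kestrel
# Analytics without parameters need only the target variable.
APPLY car-enrich ON assets
APPLY tis-enrich ON ips
APPLY suspicious-process-scoring ON procs

# zscore is univariate, so pass exactly one numerical column.
# k=3 is the documented default, written out here for clarity.
APPLY outliers ON conns WITH method=zscore, k=3, columns=src_byte_count

# isolationforest can take several columns.
# contamination=0.05 is an illustrative guess that 5% of rows are outliers.
APPLY outliers ON conns WITH method=isolationforest, contamination=0.05, columns=src_byte_count,dst_byte_count
```

If you omit method, the documented default of "auto" selects an outlier detection method for you.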