Machine learning analytic requirements

Machine learning models take time to train and build, and how much time depends on the type of analytic that the model uses.

Machine learning analytics

  • Peer group analytics: These analytics identify users who engage in similar activities and place them into peer groups. Alerts are then generated when a user deviates from their peer group.

    Peer group models do not have a training phase. They have only a model building phase and a scoring phase, because they ingest the previous 30 days of a user’s data.

  • HOUR_TO_WINDOW analytics: The model is cumulative, so each hour is evaluated against a single model for each user, rather than each hour of the day having its own model.

    The hours needed to train this type of model are written as real-world hours. For example, a requirement of 240 hours of data means that 240 real-world hours (10 days) must elapse.

  • HOUR_TO_HOUR analytics: These analytics build a separate model for each user for each hour of the day. For example, there is a 1 PM – 2 PM model, a 2 PM – 3 PM model, and so on.

    The hours required to train are written as hours for each model. Because each hourly model receives only one hour of data per day, a requirement of 10 hours of data means that 10 days (240 real-world hours) must elapse.
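The difference between the two hour-based conventions can be sketched in a few lines of Python. This is an illustrative conversion only, not product code; the function name real_world_hours is hypothetical, and the sketch assumes that each HOUR_TO_HOUR model gains exactly one hour of data per day.

```python
HOURS_PER_DAY = 24


def real_world_hours(analytic_type: str, required_hours: int) -> int:
    """Convert a documented 'hours of data' figure into elapsed wall-clock hours."""
    if analytic_type == "HOUR_TO_WINDOW":
        # The requirement is already written in real-world hours.
        return required_hours
    if analytic_type == "HOUR_TO_HOUR":
        # Assumption: each hour-of-day model gains one hour of data per day,
        # so N hours per model take N days to accumulate.
        return required_hours * HOURS_PER_DAY
    raise ValueError(f"unsupported analytic type: {analytic_type}")


print(real_world_hours("HOUR_TO_WINDOW", 240))  # 240
print(real_world_hours("HOUR_TO_HOUR", 10))     # 240
```

Both examples from the list above resolve to the same elapsed time of 240 real-world hours, or 10 days.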

Analytic terminology
  • minTimeSpan: The minimum number of hours of data that the model requires to train and build.
  • ticketMinSampleCounts: Unlike minTimeSpan, this is the minimum number of hours of data that the model needs before it generates alerts. Peer group analytics do not have this parameter.
  • Number of roles: The minimum and maximum number of roles define a range. The model evaluates every number in that range as a candidate number of clusters of low-level categories to segment users into, and determines which number of clusters is optimal.
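The range search described for the number of roles can be sketched as follows. The source does not state how the model scores each candidate, so the scoring function here (toy_score) is a made-up placeholder, and pick_role_count is a hypothetical helper name rather than part of the product.

```python
from typing import Callable


def pick_role_count(min_roles: int, max_roles: int,
                    score: Callable[[int], float]) -> int:
    """Return the candidate in [min_roles, max_roles] with the highest score."""
    if min_roles > max_roles:
        raise ValueError("min_roles must not exceed max_roles")
    candidates = range(min_roles, max_roles + 1)
    # max() keeps the first candidate on ties, that is, the smallest count.
    return max(candidates, key=score)


def toy_score(k: int) -> float:
    # Hypothetical quality measure that peaks at k = 6; the real
    # criterion used by the model is not documented here.
    return -abs(k - 6)


print(pick_role_count(2, 13, toy_score))  # 6
```

Any per-candidate quality measure could be passed as the score argument; only the exhaustive evaluation over the range is taken from the description above.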

Peer group analytics

The following models use Peer group analytics:

  • Activity Distribution
  • Defined Peer Group
  • Internal Asset Access by Peer Group
  • Internal Destination Port by Peer Group
  • Internal Network Zone by Peer Group
  • Learned Peer Group
  • Process Execution by Peer Group
These are the parameters for a peer group model to build:
  • Type of analytic: Peer Group
  • Minimum number of roles: 2
  • Maximum number of roles: 13
    Note: Learned Peer Group, Defined Peer Group, and Activity Distribution each have a maximum of 20.
  • Minimum number of groups: 5
  • Maximum number of groups: 10
    Note: Learned Peer Group is the only model with a maximum of 20.
  • Minimum number of events: 10
  • Minimum amount of data to build (mintimespan): 7 Days
  • Current set amount of data to build (timespan): 30 Days
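As a rough illustration, the peer group parameters and their per-model exceptions can be captured in a small Python structure. The class, field, and function names below are assumptions made for this sketch and do not reflect the product's actual configuration format.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class PeerGroupParams:
    # Default values taken from the parameter list above.
    min_roles: int = 2
    max_roles: int = 13
    min_groups: int = 5
    max_groups: int = 10
    min_events: int = 10
    min_timespan_days: int = 7
    timespan_days: int = 30


# Models that the notes above list as exceptions.
ROLE_MAX_20 = {"Learned Peer Group", "Defined Peer Group", "Activity Distribution"}
GROUP_MAX_20 = {"Learned Peer Group"}


def params_for(model_name: str) -> PeerGroupParams:
    """Return the defaults with the documented per-model overrides applied."""
    params = PeerGroupParams()
    if model_name in ROLE_MAX_20:
        params = replace(params, max_roles=20)
    if model_name in GROUP_MAX_20:
        params = replace(params, max_groups=20)
    return params


def can_build(days_of_data: int, params: PeerGroupParams) -> bool:
    """True once the minimum amount of data to build is available."""
    return days_of_data >= params.min_timespan_days


learned = params_for("Learned Peer Group")
print(learned.max_roles, learned.max_groups)  # 20 20
print(can_build(6, learned))                  # False
```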

HOUR_TO_WINDOW analytics

The following models use HOUR_TO_WINDOW analytics:
  • DDL Events
  • DML Events
  • Inbound Data Transfer
  • Internal Asset Access
  • Internal Destination Port
  • Internal Network Zone
  • Large HTTP Transfer
  • Outbound Transfer Attempts
  • Outbound Transfer Attempts (by Volume)
  • Process Usage
  • Risk Posture
  • Successful Access and Authentication Activity

These are the parameters for an HOUR_TO_WINDOW model to build:

  • Type of analytic: HOUR_TO_WINDOW
  • Minimum amount of data to build (mintimespan): 240 Hours
  • Minimum amount of data to generate alerts (ticketMinSampleCounts): 240 Hours
  • Current set amount of data to build (timespan): 240 Hours
  • SenseValue: 5

HOUR_TO_HOUR analytics

The following models use HOUR_TO_HOUR analytics:
  • Aggregated Activity
  • Access Activity
  • Authentication Activity
  • Suspicious Activity
These are the parameters for an HOUR_TO_HOUR model to build:
  • Type of analytic: HOUR_TO_HOUR
  • Minimum amount of data to build (mintimespan): 240 Hours
  • Current set amount of data to build (timespan): 240 Hours
  • Minimum amount of data to generate alerts (ticketMinSampleCounts): 10 Hours (10 real-world days)
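The following Python sketch shows one way to reason about which stage an hour-based model is in after a given amount of elapsed time. It is not product code. The class name, the stage labels, and the reading that mintimespan is expressed in real-world hours for both analytic types are assumptions; only the parameter values come from the lists above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HourlyAnalytic:
    kind: str                      # "HOUR_TO_WINDOW" or "HOUR_TO_HOUR"
    min_time_span: int             # assumed to be real-world hours
    ticket_min_sample_counts: int  # hours, in the units documented per kind

    def alert_hours(self) -> int:
        """Real-world hours that must elapse before alerts can be generated."""
        if self.kind == "HOUR_TO_HOUR":
            # Written as hours per hourly model: one hour of data per day.
            return self.ticket_min_sample_counts * 24
        return self.ticket_min_sample_counts

    def stage(self, elapsed_hours: int) -> str:
        if elapsed_hours < self.min_time_span:
            return "collecting"
        if elapsed_hours < self.alert_hours():
            return "built"
        return "alerting"


h2w = HourlyAnalytic("HOUR_TO_WINDOW", 240, 240)
h2h = HourlyAnalytic("HOUR_TO_HOUR", 240, 10)

print(h2w.stage(239))  # collecting
print(h2w.stage(240))  # alerting
print(h2h.stage(240))  # alerting
```

With the values listed, both analytic types become able to build and to generate alerts after the same 240 real-world hours under these assumptions.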