Space-Time-Boxes Node

Space-Time-Boxes (STB) are an extension of Geohashed spatial locations. More specifically, an STB is an alphanumeric string that represents a regularly shaped region of space and time.

For example, the STB dr5ru7|2013-01-01 00:00:00|2013-01-01 00:15:00 is made up of the following three parts:

  • The geohash dr5ru7
  • The start timestamp 2013-01-01 00:00:00
  • The end timestamp 2013-01-01 00:15:00
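
The three-part layout above can be illustrated with a minimal Python sketch that splits an STB string into its parts. The SpaceTimeBox type and the parse_stb function are hypothetical names used only for illustration; they are not part of the node.

```python
from datetime import datetime
from typing import NamedTuple


class SpaceTimeBox(NamedTuple):
    """Hypothetical container for the three parts of an STB string."""
    geohash: str
    start: datetime
    end: datetime


def parse_stb(stb: str) -> SpaceTimeBox:
    # The parts are separated by a vertical bar, as in the example above.
    geohash, start, end = stb.split("|")
    fmt = "%Y-%m-%d %H:%M:%S"
    return SpaceTimeBox(
        geohash,
        datetime.strptime(start, fmt),
        datetime.strptime(end, fmt),
    )


box = parse_stb("dr5ru7|2013-01-01 00:00:00|2013-01-01 00:15:00")
print(box.geohash, box.end - box.start)  # prints: dr5ru7 0:15:00
```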

As an example, you could use space and time information to improve confidence that two entities are the same because they are virtually in the same place at the same time. Alternatively, you could improve the accuracy of relationship identification by showing that two entities are related due to their proximity in space and time.

You can choose the Individual Records or Hangouts mode as appropriate for your requirements. Both modes require the same basic details, as follows:

Latitude field. Select the field that identifies the latitude (in the WGS84 coordinate system).

Longitude field. Select the field that identifies the longitude (in the WGS84 coordinate system).

Timestamp field. Select the field that identifies the time or date.

Individual Record Options

Use this mode to add a field to each record that identifies the record's location at a given time.

Derive. Select one or more densities of space and time from which to derive the new field. See Defining Space-Time-Box density for more information.

Field name extension. Type the extension that you would like added to the new field name(s). You can choose to add this extension as either a Suffix or Prefix.

Hangout Options

A hangout can be thought of as a location and/or time in which an entity is continually or repeatedly found. For example, this could be used to identify a vehicle that makes regular transportation runs and identify any deviations from the norm.

The hangout detector monitors the movement of entities and flags conditions where an entity is observed to be "hanging out" in the area. The hangout detector automatically assigns each flagged hangout to one or more STBs, and uses in-memory entity and event tracking to detect hangouts with optimum efficiency.

STB Density. Select the density of space and time from which to derive the new field. For example, a value of STB_GH4_10MINS would correspond to a four-character geohash box of size approximately 20 km by 20 km and a 10-minute time window. See Defining Space-Time-Box density for more information.
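
To illustrate how a density such as STB_GH4_10MINS maps a record to a box, the following Python sketch encodes a latitude and longitude with the standard geohash algorithm and floors a timestamp to a 10-minute window. This is a sketch under stated assumptions, not the node's implementation: the function names are hypothetical, the output string follows the three-part layout shown earlier, and time windows are assumed to be aligned to midnight.

```python
from datetime import datetime, timedelta

_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard geohash alphabet


def geohash_encode(lat: float, lon: float, precision: int) -> str:
    """Standard geohash: interleave longitude and latitude bisection bits."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits, value = 0, 0
    use_lon = True  # the first bit always refines longitude
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                value = (value << 1) | 1
                lon_lo = mid
            else:
                value <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                value = (value << 1) | 1
                lat_lo = mid
            else:
                value <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bits += 1
        if bits == 5:  # every 5 bits become one base-32 character
            chars.append(_BASE32[value])
            bits, value = 0, 0
    return "".join(chars)


def time_window(ts: datetime, minutes: int):
    """Floor ts to a window boundary (assumption: windows align to midnight)."""
    day_start = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    width = timedelta(minutes=minutes)
    start = day_start + ((ts - day_start) // width) * width
    return start, start + width


def make_stb(lat, lon, ts, precision=4, minutes=10) -> str:
    """Hypothetical helper: build an STB string for a GH4 / 10-minute density."""
    start, end = time_window(ts, minutes)
    gh = geohash_encode(lat, lon, precision)
    return f"{gh}|{start:%Y-%m-%d %H:%M:%S}|{end:%Y-%m-%d %H:%M:%S}"


stb = make_stb(57.64911, 10.40744, datetime(2013, 1, 1, 17, 3, 0))
print(stb)  # prints: u4pr|2013-01-01 17:00:00|2013-01-01 17:10:00
```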

Entity ID field. Select the field that identifies the entity to be tracked for hangouts. Each event is attributed to the entity that is named in this field.

Minimum number of events. An event is a row in the data. Select the minimum number of events that must occur for the entity to be considered to be hanging out. A hangout must also satisfy the Dwell time is at least setting that follows.

Dwell time is at least. Specify the minimum duration over which the entity must dwell in the same location. This can help exclude, for example, a car waiting at a traffic light from being considered as hanging out. A hangout must also qualify based on the previous Minimum number of events field.

Following is more detail about what qualifies as a hangout:

Let e1, ..., en denote all time ordered events that are received from a given entity ID during a time duration (t1, tn). These events qualify as a hangout if:
  • n >= minimum number of events
  • tn - t1 >= minimum dwell time
  • All events e1, ..., en occur in the same STB
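
The three conditions above can be expressed as a short Python predicate. This is a minimal sketch of the strict definition only (no spanning of STB boundaries); the function name and the representation of an event as a (timestamp, STB) pair are assumptions made for illustration.

```python
from datetime import datetime, timedelta


def qualifies_as_hangout(events, min_events, min_dwell):
    """events: list of (timestamp, stb) pairs for one entity, in time order."""
    if not events:
        return False
    n = len(events)
    t1, tn = events[0][0], events[-1][0]
    same_stb = len({stb for _, stb in events}) == 1
    return n >= min_events and (tn - t1) >= min_dwell and same_stb


# Two events, two minutes apart, in the same (hypothetical) STB label.
in_one_box = [
    (datetime(2013, 1, 1, 17, 1), "box_1700"),
    (datetime(2013, 1, 1, 17, 3), "box_1700"),
]
# The same events preceded by one event that falls in the previous time box.
across_boxes = [(datetime(2013, 1, 1, 16, 58), "box_1650")] + in_one_box

print(qualifies_as_hangout(in_one_box, 2, timedelta(minutes=2)))    # prints: True
print(qualifies_as_hangout(across_boxes, 2, timedelta(minutes=2)))  # prints: False
```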

Allow hangouts to span STB boundaries. If this option is selected, the definition of a hangout is less strict and can include, for example, an entity that hangs out in more than one Space-Time-Box. For example, if your STBs are defined as whole hours, selecting this option would recognize an entity that hangs out for an hour as valid, even if the hour consisted of the 30 minutes before midnight and the 30 minutes after midnight. If this option is not selected, 100% of hangout time must be within a single Space-Time-Box.

Min proportion of events in qualifying timebox (%). Only available if Allow hangouts to span STB boundaries is selected. Use this to control the degree to which a hangout reported in one STB might in fact overlap another. Select the minimum proportion of events that must occur within a single STB to identify a hangout. If set to 25%, and the proportion of events is 26%, this qualifies as being a hangout.

For example, suppose you configure the hangout detector to require at least two events (minimum number of events = 2) and a contiguous hover time of at least 2 minutes in a 4-byte-geohash space box and a 10-minute time box (STB_NAME = STB_GH4_10MINS). Suppose a hangout is then detected: the entity hovers in the same 4-byte-geohash space box, and three qualifying events occur at 4:58pm, 5:01pm, and 5:03pm, that is, within a 10-minute time span between 4:57pm and 5:07pm. The qualifying timebox percent value specifies which STBs get credit for the hangout, as follows:
  • 100%. Hangout is reported in the 5:00 - 5:10pm time-box and not in the 4:50 - 5:00pm time-box (events 5:01pm and 5:03pm meet all conditions that are required for a qualifying hangout and 100% of these events occurred in the 5:00 - 5:10 time box).
  • 50%. Hangouts in both the time-boxes are reported (events 5:01pm and 5:03pm meet all conditions that are required for a qualifying hangout and at least 50% of these events occurred in the 4:50 - 5:00 time box and at least 50% of these events occurred in the 5:00 - 5:10 time box).
  • 0%. Hangouts in both the time-boxes are reported.

When 0% is specified, hangout reports include the STBs representing every time box that is touched by the qualifying duration. The qualifying duration needs to be less than or equal to the corresponding duration of the time box in the STB. In other words, there should never be a configuration where a 10-minute STB is configured in tandem with a 20-minute qualifying duration.

A hangout is reported as soon as the qualifying conditions are met, and is not reported more than once per STB. Suppose that three events qualify for a hangout, and 10 total events occur within a qualifying duration all within the same STB. In that case, the hangout is reported when the third qualifying event occurs. None of the additional seven events trigger a hangout report.
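
The report-once behavior described above can be sketched as a small in-memory tracker in Python. This is an illustration under stated assumptions, not the detector's implementation: the HangoutTracker class is hypothetical, it handles only the strict case in which all events fall in one STB, it assumes events arrive in time order, and it never evicts old state.

```python
from collections import defaultdict
from datetime import datetime, timedelta


class HangoutTracker:
    """Hypothetical tracker: reports each (entity, STB) hangout at most once."""

    def __init__(self, min_events, min_dwell):
        self.min_events = min_events
        self.min_dwell = min_dwell
        self._first_seen = {}            # (entity, stb) -> first timestamp
        self._counts = defaultdict(int)  # (entity, stb) -> events seen
        self._reported = set()           # (entity, stb) already reported

    def observe(self, entity_id, stb, ts):
        """Return True only for the event that first completes a hangout."""
        key = (entity_id, stb)
        self._first_seen.setdefault(key, ts)
        self._counts[key] += 1
        if key in self._reported:
            return False
        enough_events = self._counts[key] >= self.min_events
        enough_dwell = ts - self._first_seen[key] >= self.min_dwell
        if enough_events and enough_dwell:
            self._reported.add(key)
            return True
        return False


# Ten events, one minute apart, all in the same (hypothetical) STB label.
tracker = HangoutTracker(min_events=3, min_dwell=timedelta(minutes=2))
start = datetime(2013, 1, 1, 17, 0)
reports = [
    tracker.observe("vehicle_1", "box_1700", start + timedelta(minutes=i))
    for i in range(10)
]
print(reports.index(True), reports.count(True))  # prints: 2 1
```

In this run the hangout is reported on the third event, and the remaining seven events produce no further report.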

Note:
  • The hangout detector's in-memory event data is not shared across processes. Therefore, a particular entity has affinity with a particular hangout detector node. That is, incoming motion data for an entity must always be consistently passed to the hangout detector node tracking that entity, which is ordinarily the same node throughout the run.
  • The hangout detector's in-memory event data is volatile. Whenever the hangout detector is exited and restarted, any work-in-progress hangouts are lost. This means stopping and restarting the process might cause the system to miss reporting real hangouts. A potential remedy involves replaying some of the historical motion data (for example, going back 48 hours and replaying the motion records that are applicable to any node that was restarted).
  • The hangout detector must be fed data in time-sequential order.
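
One possible way to respect the affinity and ordering requirements in these notes is to route each entity to a fixed worker by a stable hash of its ID and to sort events by timestamp before feeding them. The following Python sketch shows that idea only; the function names, the event layout of (entity ID, timestamp) pairs, and the use of a CRC-32 hash are assumptions, not a description of how the product distributes data.

```python
import zlib
from collections import defaultdict
from datetime import datetime


def assign_worker(entity_id: str, num_workers: int) -> int:
    # CRC-32 is stable across processes, unlike Python's built-in hash() for str.
    return zlib.crc32(entity_id.encode("utf-8")) % num_workers


def partition_events(events, num_workers):
    """events: iterable of (entity_id, timestamp). Returns worker -> ordered list."""
    batches = defaultdict(list)
    for event in sorted(events, key=lambda e: e[1]):  # time-sequential order
        batches[assign_worker(event[0], num_workers)].append(event)
    return batches


events = [
    ("vehicle_2", datetime(2013, 1, 1, 17, 3)),
    ("vehicle_1", datetime(2013, 1, 1, 17, 1)),
    ("vehicle_1", datetime(2013, 1, 1, 16, 58)),
    ("vehicle_2", datetime(2013, 1, 1, 17, 0)),
]
batches = partition_events(events, num_workers=4)
for worker, batch in sorted(batches.items()):
    print(worker, [(entity, f"{ts:%H:%M}") for entity, ts in batch])
```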