rekall.interval_set_mapping module

Abstraction for managing multiple different IntervalSets.

This module provides IntervalSetMapping, which is a wrapper around a mapping from keys to IntervalSets. The keys can be anything that is hashable in Python.

The most common use case is to restrict IntervalSet operations to Intervals from the same domain. For example, suppose we use the frame number in a video as the temporal dimension of an Interval. We would then want to keep Intervals from different videos apart, since their temporal dimensions refer to different domains. We could map from the video path, or from a video ID in some database, to the IntervalSet for that video.

The key need not be fixed. For example, if we have IntervalSets over live broadcasts of news, we can use video metadata to recover the absolute UTC time as the temporal dimension. Then it may make sense to re-group all the Intervals into one IntervalSet, or use the TV channel name as the domain key to re-group for downstream processing. IntervalSetMapping provides a convenient mechanism for dynamic re-grouping.

class rekall.interval_set_mapping.IntervalSetMapping(grouped_intervals)

Bases: collections.abc.MutableMapping

A wrapper around a dictionary from key to IntervalSet.

It uses method reflection to expose all the same methods as IntervalSet, and delegates each method call to the underlying IntervalSet of each domain in the collection. When a binary method such as join or minus is called on two IntervalSetMappings, it matches up the IntervalSets by key and performs the method on the underlying IntervalSets for each domain.

For each in-system method of IntervalSet (i.e. the return value is an IntervalSet), the corresponding method on IntervalSetMapping returns an IntervalSetMapping as well.

For each out-of-system method on IntervalSet, namely size, duration, empty, fold, and match, the corresponding method on IntervalSetMapping returns a dictionary from the key to the result of the method call on the underlying IntervalSet.

IntervalSetMapping exposes Python’s getter/setter paradigm as well, so individual IntervalSets can be referenced using bracket notation and their key.

The methods to wrap from IntervalSet are defined by the class constants: UNARY_METHODS, BINARY_METHODS and OUT_OF_SYSTEM_UNARY_METHODS.

Example

Here are some examples of how IntervalSetMapping reflects IntervalSet’s methods:

ism1 = IntervalSetMapping({
    1: IntervalSet(...),
    2: IntervalSet(...),
    10: IntervalSet(...)
})
ism2 = IntervalSetMapping({
    1: IntervalSet(...),
    3: IntervalSet(...),
    10: IntervalSet(...)
})

# Unary methods
ism1.map(mapper) == IntervalSetMapping({
    1: ism1[1].map(mapper),  # IntervalSet
    2: ism1[2].map(mapper),  # IntervalSet
    10: ism1[10].map(mapper) # IntervalSet
})

# Binary methods
ism1.join(ism2, ...) == IntervalSetMapping({
    1: ism1[1].join(ism2[1], ...),   # IntervalSet
    10: ism1[10].join(ism2[10], ...) # IntervalSet
})

# Out of system unary methods:
ism1.size() == {
    1: ism1[1].size(),   # Number
    2: ism1[2].size(),   # Number
    10: ism1[10].size()  # Number
}
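
The reflection described above can be sketched in plain Python. The following is only an illustration under stated assumptions, not rekall's implementation: ToySet and ToyMapping are hypothetical stand-ins for IntervalSet and IntervalSetMapping, intervals are bare (start, end) tuples, and the binary wrapper pairs up only the keys present in both mappings, as in the join example above.

```python
class ToySet:
    """Hypothetical stand-in for IntervalSet; stores (start, end) tuples."""

    def __init__(self, intervals):
        self.intervals = list(intervals)

    def map(self, fn):
        # In-system unary method: returns another ToySet.
        return ToySet(fn(iv) for iv in self.intervals)

    def join(self, other):
        # In-system binary method: keep the overlap of every overlapping pair.
        return ToySet(
            (max(a[0], b[0]), min(a[1], b[1]))
            for a in self.intervals
            for b in other.intervals
            if a[0] < b[1] and b[0] < a[1]
        )

    def size(self):
        # Out-of-system method: returns a plain number, not a ToySet.
        return len(self.intervals)


class ToyMapping:
    """Hypothetical stand-in for IntervalSetMapping."""

    UNARY_METHODS = ["map"]
    BINARY_METHODS = ["join"]
    OUT_OF_SYSTEM_UNARY_METHODS = ["size"]

    def __init__(self, grouped):
        self.grouped = dict(grouped)


def _unary(name):
    def method(self, *args, **kwargs):
        # Apply the method per key and wrap the results in a new mapping.
        return ToyMapping({
            key: getattr(s, name)(*args, **kwargs)
            for key, s in self.grouped.items()
        })
    return method


def _binary(name):
    def method(self, other, *args, **kwargs):
        # Assumption: only keys present in both mappings are paired up.
        shared = self.grouped.keys() & other.grouped.keys()
        return ToyMapping({
            key: getattr(self.grouped[key], name)(other.grouped[key], *args, **kwargs)
            for key in shared
        })
    return method


def _out_of_system(name):
    def method(self, *args, **kwargs):
        # Return a plain dict from key to the per-set result.
        return {
            key: getattr(s, name)(*args, **kwargs)
            for key, s in self.grouped.items()
        }
    return method


# Attach one wrapper per method name listed in the class constants.
for _name in ToyMapping.UNARY_METHODS:
    setattr(ToyMapping, _name, _unary(_name))
for _name in ToyMapping.BINARY_METHODS:
    setattr(ToyMapping, _name, _binary(_name))
for _name in ToyMapping.OUT_OF_SYSTEM_UNARY_METHODS:
    setattr(ToyMapping, _name, _out_of_system(_name))

m1 = ToyMapping({1: ToySet([(0, 5)]), 2: ToySet([(1, 2)]), 10: ToySet([(3, 8)])})
m2 = ToyMapping({1: ToySet([(4, 9)]), 3: ToySet([(0, 1)]), 10: ToySet([(0, 4)])})

sizes = m1.size()      # plain dict keyed by 1, 2 and 10
joined = m1.join(m2)   # ToyMapping holding only the shared keys 1 and 10
shifted = m1.map(lambda iv: (iv[0] + 1, iv[1] + 1))
```

Driving the wrappers from the three lists means that supporting another method only requires adding its name to the appropriate list.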

Attributes:
UNARY_METHODS: List of methods that IntervalSetMapping reflects from IntervalSet and that will return an IntervalSetMapping where the IntervalSet under each group is transformed under the unary operation. See IntervalSet documentation for arguments and behavior for each method.
BINARY_METHODS: List of methods that IntervalSetMapping reflects from IntervalSet and that will take a second IntervalSetMapping and will return an IntervalSetMapping where the binary operation is performed on the two IntervalSets with the same key. See IntervalSet documentation for arguments and behavior for each method.
OUT_OF_SYSTEM_UNARY_METHODS: List of methods that IntervalSetMapping reflects from IntervalSet and that will return a dictionary mapping from IntervalSet keys to return values of the methods.
BINARY_METHODS = ['merge', 'union', 'join', 'minus', 'filter_against', 'collect_by_interval']
OUT_OF_SYSTEM_UNARY_METHODS = ['size', 'duration', 'empty', 'fold', 'match']
UNARY_METHODS = ['filter_size', 'map', 'filter', 'group_by', 'fold_to_set', 'map_payload', 'dilate', 'group_by_axis', 'coalesce', 'split']
add_key_to_payload()

Adds key to payload of each interval in each IntervalSet.

If each interval in an IntervalSet with key K had payload P before, it now has the tuple (P, K) as payload.

Returns: A new IntervalSetMapping with the transformed intervals.

Note

The original IntervalSetMapping is unchanged. This is the same behavior as all unary methods of IntervalSet.
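
The (P, K) transformation can be illustrated without rekall. In this minimal sketch the mapping is assumed to be a plain dict from key to a list of (bounds, payload) tuples, and the function below is a hypothetical stand-in that only mirrors the documented behavior.

```python
def add_key_to_payload(grouped):
    """Return a new dict in which every payload P under key K becomes (P, K)."""
    return {
        key: [(bounds, (payload, key)) for bounds, payload in intervals]
        for key, intervals in grouped.items()
    }


grouped = {
    "video_a": [((0, 10), "cat")],
    "video_b": [((5, 7), "dog")],
}
tagged = add_key_to_payload(grouped)
# tagged["video_a"] now holds ((0, 10), ("cat", "video_a")); grouped is untouched.
```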

classmethod from_intervalset(intervalset, key_fn)

Constructs an IntervalSetMapping from an IntervalSet by grouping by key_fn.

Parameters:
  • intervalset (IntervalSet) – An interval set containing all intervals to put in the mapping.
  • key_fn – A function that takes an interval and returns the domain key.
Returns:

An IntervalSetMapping with the same intervals organized into domains by their key according to key_fn.
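
Grouping by key_fn can be sketched as follows. This is an illustration under assumptions, not the library's code: intervals are represented as plain dicts, the "channel" field is made up for the example, and group_by_key is a hypothetical helper.

```python
from collections import defaultdict


def group_by_key(intervals, key_fn):
    """Group intervals into a dict keyed by key_fn(interval)."""
    grouped = defaultdict(list)
    for interval in intervals:
        grouped[key_fn(interval)].append(interval)
    return dict(grouped)


# Hypothetical intervals with an absolute time range and a channel name.
intervals = [
    {"t1": 0, "t2": 5, "channel": "A"},
    {"t1": 3, "t2": 9, "channel": "B"},
    {"t1": 6, "t2": 8, "channel": "A"},
]
by_channel = group_by_key(intervals, lambda iv: iv["channel"])
```

Because the key is computed from each interval, the same flat collection can be re-grouped under a different key simply by passing a different key_fn.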

classmethod from_iterable(iterable, key_parser, bounds_parser, payload_parser=<function IntervalSetMapping.<lambda>>, progress=False, total=None)

Constructs an IntervalSetMapping from an iterable.

Parameters:
  • iterable – An iterable of arbitrary elements. Each element will become an interval in the collection.
  • key_parser – A function that takes an element in iterable and returns the key for the interval.
  • bounds_parser – A function that takes an element in iterable and returns the bounds for the interval.
  • payload_parser (optional) – A function that takes an element in iterable and returns the payload for the interval. Defaults to producing None for all elements.
  • progress (bool, optional) – Whether to display a progress bar using tqdm. Defaults to False.
  • total (int, optional) – Total number of elements in iterable. Only used to estimate ETA for the progress bar, and only takes effect if progress is True.
Returns:

An IntervalSetMapping constructed from iterable and the parsers provided.

Note

Everything in iterable will be materialized in RAM.
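
The roles of the three parsers can be sketched in plain Python. This is a hedged illustration rather than the library's implementation: build_mapping is a hypothetical helper, bounds are plain (start, end) tuples, the row fields are invented for the example, and the progress bar is omitted.

```python
from collections import defaultdict


def build_mapping(iterable, key_parser, bounds_parser,
                  payload_parser=lambda element: None):
    """Group (bounds, payload) pairs by key; everything is held in memory."""
    grouped = defaultdict(list)
    for element in iterable:
        grouped[key_parser(element)].append(
            (bounds_parser(element), payload_parser(element))
        )
    return dict(grouped)


# Hypothetical rows, e.g. as read from a database query.
rows = [
    {"video_id": 1, "start": 0, "end": 30, "label": "intro"},
    {"video_id": 1, "start": 30, "end": 90, "label": "interview"},
    {"video_id": 2, "start": 10, "end": 20, "label": "ad"},
]

with_payload = build_mapping(
    rows,
    key_parser=lambda r: r["video_id"],
    bounds_parser=lambda r: (r["start"], r["end"]),
    payload_parser=lambda r: r["label"],
)

# Without a payload_parser, every payload defaults to None.
without_payload = build_mapping(
    rows,
    key_parser=lambda r: r["video_id"],
    bounds_parser=lambda r: (r["start"], r["end"]),
)
```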

get_flattened_intervalset()

Get an IntervalSet containing all intervals in all the IntervalSets.
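
Flattening amounts to concatenating the per-key collections. A minimal sketch, assuming the mapping is a plain dict from key to a list of (start, end) tuples; flatten is a hypothetical helper, not the library method.

```python
from itertools import chain


def flatten(grouped):
    """Concatenate the intervals stored under every key into one list."""
    return list(chain.from_iterable(grouped.values()))


grouped = {
    "video_a": [(0, 10), (20, 30)],
    "video_b": [(5, 7)],
}
flat = flatten(grouped)
```

In this sketch the keys are dropped by flattening, so if the key is still needed afterwards it may help to copy it into each payload first.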

get_grouped_intervals()

Get dictionary from key to IntervalSet.

items() → a set-like object providing a view on the mapping's items
keys() → a set-like object providing a view on the mapping's keys
values() → an object providing a view on the mapping's values