Path metrics

Eventstream

Eventstream.path_metrics(metrics, path_id_col=None)

Calculate metrics for each path.

Parameters:
metrics : tuple or list

A metric or a list of metrics to be calculated.

A metric definition can be one of the following types:

  • str. The following metric aliases are supported.
    • len: the number of events in the path.

    • has:TARGET_EVENT: whether the path contains the specified target event.

    • time_to:TARGET_EVENT: the time to the first occurrence of the specified target event.

  • pd.NamedAgg. A named aggregation applied to a single column of the grouped DataFrame. See the pandas documentation for details.

  • Callable. An arbitrary function applied to the grouped DataFrame with the apply method.

A metric should be passed as a tuple of two elements:
  • a metric definition of one of the types listed above.

  • a metric name.

Examples of the metrics:

import pandas as pd

metrics = [
    ('len', 'path_length'),
    ('has:cart', 'has_cart'),
    ('time_to:cart', 'time_to_cart'),
    (lambda _df: (_df['event'] == 'cart').sum(), 'cart_count'),
    (pd.NamedAgg('timestamp', lambda s: len(s.dt.date.unique())), 'active_days')
]
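
For intuition, the example metrics above correspond roughly to the plain pandas operations below. This is a sketch on a toy DataFrame, not the library's implementation: the column names user_id, event, and timestamp are assumptions, and time_to:cart is assumed here to be measured from the first event of the path.

```python
import pandas as pd

# Toy event log; the column names are assumptions made for this sketch.
df = pd.DataFrame({
    "user_id": [1, 1, 1, 2, 2],
    "event": ["main", "catalog", "cart", "main", "catalog"],
    "timestamp": pd.to_datetime([
        "2023-01-01 10:00:00",
        "2023-01-01 10:05:00",
        "2023-01-02 09:00:00",
        "2023-01-01 12:00:00",
        "2023-01-01 12:01:00",
    ]),
})
grouped = df.groupby("user_id")

# 'len': the number of events in each path.
path_length = grouped.size()

# 'has:cart': whether the path contains the target event.
has_cart = df["event"].eq("cart").groupby(df["user_id"]).any()

# 'time_to:cart': time from the path's first event (an assumption)
# to the first 'cart' event; NaT for paths without a 'cart' event.
path_start = grouped["timestamp"].min()
first_cart = df[df["event"] == "cart"].groupby("user_id")["timestamp"].min()
time_to_cart = first_cart.reindex(path_start.index) - path_start

# Callable: an arbitrary function applied to each group.
cart_count = df.groupby("user_id")[["event", "timestamp"]].apply(
    lambda g: (g["event"] == "cart").sum()
)

# pd.NamedAgg: an aggregation over a single column.
active_days = grouped.agg(
    active_days=pd.NamedAgg(
        column="timestamp", aggfunc=lambda s: s.dt.date.nunique()
    )
)["active_days"]
```

Each result is indexed by user_id, so the five Series can be combined column-wise into one table of per-path metrics.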
path_id_col : str, optional

The name of the column that holds the path identifier. Defaults to the user column from the eventstream schema.

Returns:
pd.DataFrame or pd.Series

A DataFrame (for multiple metrics) or Series (for a single metric) with the calculated metric values. The index consists of path ids.
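
The return shape can be illustrated with a small stand-in. The helper path_metrics_sketch below is hypothetical and accepts only callables as metric definitions; session_id is likewise an assumed column, used to show a non-default path_id_col.

```python
import pandas as pd


def path_metrics_sketch(df, metrics, path_id_col="user_id"):
    """Hypothetical stand-in for illustration, not the library's code."""
    # A single metric is a (definition, name) tuple; wrap it in a list.
    if isinstance(metrics, tuple):
        metrics = [metrics]
    # Pass every column except the path id to the metric functions.
    value_cols = [c for c in df.columns if c != path_id_col]
    grouped = df.groupby(path_id_col)[value_cols]
    result = pd.DataFrame({name: grouped.apply(func) for func, name in metrics})
    # One metric gives a Series, several give a DataFrame.
    return result.iloc[:, 0] if len(metrics) == 1 else result


events = pd.DataFrame({
    "user_id": [1, 1, 2, 2, 2],
    "session_id": ["1_1", "1_2", "2_1", "2_1", "2_1"],
    "event": ["main", "catalog", "main", "cart", "cart"],
})

path_length = (lambda g: len(g), "path_length")
cart_count = (lambda g: int((g["event"] == "cart").sum()), "cart_count")

# Two metrics: a DataFrame indexed by user_id.
per_user = path_metrics_sketch(events, [path_length, cart_count])

# One metric, paths identified by session: a Series indexed by session_id.
per_session = path_metrics_sketch(events, path_length, path_id_col="session_id")
```

In both cases the index holds the path ids taken from the grouping column, which matches the description of the return value above.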