What’s new in 3.2.0 (2023-11-13)
New Features
Eventstream
Working with RawDataSchema has been improved. All columns of a raw DataFrame except user_id, event, and timestamp are now treated as custom columns and added to the eventstream automatically. The new custom_cols argument defines a whitelist, so that only the listed columns are added. See the eventstream user guide for details.
EventstreamSchema can be defined as a dictionary. See the eventstream user guide.
Synthetic events path_start and path_end are added automatically to an eventstream, as if AddStartEndEvents had been applied.
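The column-selection rule described above can be sketched in plain Python. This is only an illustration of the rule, not the library's implementation: the helper name resolve_custom_cols and the RESERVED tuple are hypothetical, and the sketch assumes the raw columns carry the default names user_id, event, and timestamp.

```python
# Hypothetical helper illustrating the rule from the release note;
# it is NOT part of the retentioneering API.
RESERVED = ("user_id", "event", "timestamp")  # assumed default column names


def resolve_custom_cols(columns, custom_cols=None):
    """Return the columns that would be treated as custom.

    Without a whitelist, every non-reserved column is custom.
    With a whitelist, only the listed columns are kept.
    """
    if custom_cols is not None:
        return [col for col in columns if col in custom_cols]
    return [col for col in columns if col not in RESERVED]


raw_columns = ["user_id", "event", "timestamp", "session_id", "device"]

all_custom = resolve_custom_cols(raw_columns)
whitelisted = resolve_custom_cols(raw_columns, custom_cols=["device"])
```

Here all_custom contains both extra columns, while whitelisted keeps only the column named in the whitelist.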
Data processors
Added GroupEventsBulk data processor. Now you can apply multiple grouping operations simultaneously.
stream.group_events_bulk(
    {
        'product': lambda _df: _df['event'].str.startswith('product'),
        'delivery': lambda _df: _df['event'].str.startswith('delivery')
    }
)
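To make the effect of several grouping rules more concrete, here is a rough plain-pandas analogy. It does not call the library and is not how GroupEventsBulk is implemented; the sample data and the variable names (rules, grouped) are made up for illustration. Each rule is evaluated against the original frame, which is an assumption; for non-overlapping rules like these, the order of evaluation does not matter.

```python
import pandas as pd

# Made-up sample data for illustration only.
df = pd.DataFrame(
    {
        "user_id": ["u1", "u1", "u1", "u1"],
        "event": ["product_view", "product_add", "delivery_choice", "payment"],
    }
)

# Same shape as the dictionary passed to group_events_bulk:
# new event name -> function returning a boolean mask.
rules = {
    "product": lambda _df: _df["event"].str.startswith("product"),
    "delivery": lambda _df: _df["event"].str.startswith("delivery"),
}

grouped = df.copy()
for new_name, rule in rules.items():
    mask = rule(df)  # evaluated on the original frame (assumption)
    grouped.loc[mask, "event"] = new_name

grouped_events = grouped["event"].tolist()
```

Events that match no rule, such as payment here, keep their original names.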
Added Pipe data processor. It allows you to modify an eventstream as if you were working with a pandas DataFrame.
stream.pipe(lambda _df: _df.assign(new_column=100))
The schema argument is now optional for custom functions of the FilterEvents and GroupEvents data processors.
stream.filter_events(lambda _df: _df['user_id'] == 'user_12345')
The architecture of the data processors was improved and simplified. Some legacy features were removed.
Transition graph
The default value of edges_weight_col and nodes_weight_col is now user_id. This means that the default weights reflect the number of unique users who made a given transition or experienced a given event.
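The difference between counting unique users and counting raw occurrences can be illustrated with plain pandas. This is a minimal sketch on made-up data, not the transition graph's actual computation; the names next_event, edge_users, edge_counts, and node_users are introduced here for illustration.

```python
import pandas as pd

# Made-up sample: u1 makes the A -> B transition twice.
df = pd.DataFrame(
    [
        ("u1", "A", "2023-01-01 00:00:00"),
        ("u1", "B", "2023-01-01 00:00:01"),
        ("u1", "A", "2023-01-01 00:00:02"),
        ("u1", "B", "2023-01-01 00:00:03"),
        ("u2", "A", "2023-01-01 00:00:00"),
        ("u2", "B", "2023-01-01 00:00:01"),
        ("u3", "A", "2023-01-01 00:00:00"),
        ("u3", "C", "2023-01-01 00:00:01"),
    ],
    columns=["user_id", "event", "timestamp"],
)
df["timestamp"] = pd.to_datetime(df["timestamp"])
df = df.sort_values(["user_id", "timestamp"])

# Pair each event with the next event of the same user.
df["next_event"] = df.groupby("user_id")["event"].shift(-1)
transitions = df.dropna(subset=["next_event"])

# Edge weight as the number of unique users who made the transition.
edge_users = transitions.groupby(["event", "next_event"])["user_id"].nunique()
# For comparison: the raw number of transitions.
edge_counts = transitions.groupby(["event", "next_event"]).size()
# Node weight as the number of unique users who experienced the event.
node_users = df.groupby("event")["user_id"].nunique()
```

In this sample the A -> B transition occurs three times but is made by only two distinct users, so a user-based weight gives 2 where an occurrence-based weight gives 3.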