What’s new in 3.1.0 (2023-10-04)#

New Features#

Transition graph#

  • Added the layout_dump parameter to the plot method. It takes the path to a JSON file with saved node positions and applies them, so the nodes keep their relative placement. See the transition graph user guide for details.

stream.transition_graph(layout_dump='/path/to/node_params.json')

Preprocessing graph#

Added the ability to export a preprocessing graph to a configuration file and import it back with two new methods: PreprocessingGraph.export_to_file() and PreprocessingGraph.import_from_file(). See the preprocessing user guide for details.

# save preprocessing config to file
path_to_file = '/path/to/pgraph_config.json'
pgraph.export_to_file(path_to_file)

# create new PreprocessingGraph instance
new_pgraph = stream.preprocessing_graph()

# restore the preserved preprocessing configurations
new_pgraph.import_from_file(path_to_file)

Step matrix#

The default value of the threshold parameter has been changed from 0 to 0.01. As a result, low-frequency events are now collapsed into the THRESHOLDED artificial event by default. See the step matrix user guide for details.
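
To illustrate what the new default means in practice, here is a minimal pure-Python sketch of the thresholding idea. It assumes that an event is collapsed when all of its per-step shares are below the threshold and that the collapsed shares are summed per step. The collapse_rare_events helper and the toy matrix are hypothetical and are not part of the library.

```python
def collapse_rare_events(step_matrix, threshold=0.01, label="THRESHOLDED"):
    """Collapse rows whose shares are all below `threshold` into one row.

    `step_matrix` maps an event name to its list of per-step shares.
    Hypothetical helper for illustration, not the library's implementation.
    """
    kept = {}
    collapsed = None
    for event, shares in step_matrix.items():
        if all(share < threshold for share in shares):
            # Rare event: accumulate its shares into the aggregated row.
            if collapsed is None:
                collapsed = [0.0] * len(shares)
            collapsed = [total + share for total, share in zip(collapsed, shares)]
        else:
            kept[event] = list(shares)
    if collapsed is not None:
        kept[label] = collapsed
    return kept


# Toy step matrix: share of users observed at each of three steps.
toy_matrix = {
    "main": [0.90, 0.500, 0.300],
    "cart": [0.10, 0.492, 0.693],
    "faq": [0.00, 0.005, 0.004],
    "promo": [0.00, 0.003, 0.003],
}
result = collapse_rare_events(toy_matrix)
print(list(result))
```

In this sketch, passing threshold=0 (the previous default) leaves the matrix unchanged, because no share is strictly below zero.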

Eventstream#

Added the events_order parameter to the Eventstream constructor. It sets the order of raw events that share the same timestamp. See the eventstream user guide for details.

import pandas as pd
from retentioneering.eventstream import Eventstream

df = pd.DataFrame(
    [
        ['user_1', 'A', '2023-01-01 00:00:00'],
        ['user_1', 'B', '2023-01-01 00:00:00'],
        ['user_2', 'B', '2023-01-01 00:00:03'],
        ['user_2', 'A', '2023-01-01 00:00:03'],
        ['user_2', 'A', '2023-01-01 00:00:04']
    ],
    columns=['user_id', 'event', 'timestamp']
)
# among events with equal timestamps, "B" is placed before "A"
stream = Eventstream(df, events_order=["B", "A"])
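
The effect of such a tie-breaking order can be sketched without the library. The snippet below is only an illustration of the idea, not the Eventstream implementation: the order_events helper is hypothetical, and placing events that are missing from events_order after the listed ones is an assumption of this sketch.

```python
def order_events(rows, events_order):
    """Sort rows by timestamp, breaking ties by position in `events_order`.

    `rows` holds (user_id, event, timestamp) tuples. Hypothetical helper for
    illustration; unlisted events go after the listed ones (an assumption).
    """
    rank = {event: position for position, event in enumerate(events_order)}
    # ISO-formatted timestamp strings sort chronologically as plain strings.
    return sorted(rows, key=lambda row: (row[2], rank.get(row[1], len(rank))))


rows = [
    ("user_1", "A", "2023-01-01 00:00:00"),
    ("user_1", "B", "2023-01-01 00:00:00"),
    ("user_2", "B", "2023-01-01 00:00:03"),
    ("user_2", "A", "2023-01-01 00:00:03"),
    ("user_2", "A", "2023-01-01 00:00:04"),
]
ordered = order_events(rows, events_order=["B", "A"])
for row in ordered:
    print(row)
```

For both pairs of simultaneous events, B ends up ahead of A, regardless of the order in which the rows were supplied.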

SplitSessions#

Two new parameters, delimiter_events and delimiter_col, have been added. The former splits sessions on either a single separating event or a pair of events. The latter specifies a custom column that contains session identifiers. The data processor automatically inserts session_start and session_end events at the appropriate locations in the eventstream based on the provided column values.

stream.split_sessions(delimiter_events=["session_delimiter"])
stream.split_sessions(delimiter_events=["custom_start", "custom_end"])
stream.split_sessions(delimiter_col="custom_ses_id")

See the data processors user guide for details.
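
As a rough illustration of the delimiter_col behaviour, the following pure-Python sketch wraps each run of rows that share a user and a session identifier in session_start and session_end events. It assumes the input is already ordered by user and time. The mark_sessions helper is hypothetical and does not reproduce the data processor's actual implementation.

```python
def mark_sessions(events):
    """Wrap each run of equal (user_id, session_id) pairs in boundary events.

    `events` is a list of (user_id, event, session_id) tuples ordered by
    user and time. Hypothetical helper for illustration only.
    """
    marked = []
    previous_key = None
    for user_id, event, session_id in events:
        key = (user_id, session_id)
        if key != previous_key:
            if previous_key is not None:
                # Close the session that has just finished.
                marked.append((previous_key[0], "session_end", previous_key[1]))
            marked.append((user_id, "session_start", session_id))
            previous_key = key
        marked.append((user_id, event, session_id))
    if previous_key is not None:
        # Close the last open session.
        marked.append((previous_key[0], "session_end", previous_key[1]))
    return marked


events = [
    ("user_1", "A", 1),
    ("user_1", "B", 1),
    ("user_1", "A", 2),
    ("user_2", "C", 1),
]
marked = mark_sessions(events)
for row in marked:
    print(row)
```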

Improvements#

  • Added support for Python 3.11. See the installation guide for details.

  • Resolved a significant number of warnings.

  • Increased the library’s dependency sustainability.

  • Added support for Safari and Firefox browsers. See the installation guide for details.

  • Stabilized the TransitionGraph and PreprocessingGraph GUIs in popular environments: JupyterLab, Jupyter Notebook, and JupyterHub. See the installation guide for details.

  • Developed a new data processor architecture, resulting in enhanced performance and reduced resource requirements.

  • Added aggregation of custom columns to the CollapseLoops data processor.

Bug fixes#

  • Fixed a bug in Clusters.set_clusters() that assigned user clusters incorrectly because the pd.Series index was ignored.

  • Fixed a bug in the Stattests output that swapped the group labels.

  • Fixed a bug in StepMatrix that raised an exception when the target and groups arguments were used simultaneously.