SplitSessions

Data processor

class retentioneering.data_processors_lib.split_sessions.SplitSessions(params)

Create new synthetic events that divide users’ paths into sessions: session_start (or session_start_cropped) and session_end (or session_end_cropped). Also create a new column that contains the session number for each event in the input eventstream. The session number takes the form {user_id}_{session_number}, where session_number counts the sessions within one user path.

Parameters:
timeout : Tuple(float, DATETIME_UNITS)

Threshold value and its unit of measure. session_start and session_end events are always placed before the first and after the last event in each user’s path. Because a user can have more than one session, the timedelta between every two consecutive events in each user’s path is calculated. If a calculated timedelta is greater than the selected timeout, new synthetic events, session_end and session_start, are created inside the user path to mark where one session ends and the next one begins.

mark_truncated : bool, default False

If True, calculates the timedelta between:

  • the first event in each user’s path and the first event in the whole eventstream.

  • the last event in each user’s path and the last event in the whole eventstream.

For users whose timedelta is less than the selected timeout, a new synthetic event, session_start_cropped or session_end_cropped, is added.

session_col : str, default “session_id”

The name of the column that stores the session identifier.
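The mark_truncated check described above can be sketched in plain Python. This is an illustration of the rule as stated in the parameter description, not the library's implementation: the function name `boundary_event_names` is hypothetical, and the sketch assumes the cropped event takes the place of the regular session_start or session_end.

```python
from datetime import datetime, timedelta


def boundary_event_names(user_first, user_last, stream_first, stream_last, timeout):
    """Pick the names of the first and last synthetic events for one user path."""
    # Path starts within `timeout` of the first event in the whole eventstream.
    start = "session_start_cropped" if user_first - stream_first < timeout else "session_start"
    # Path ends within `timeout` of the last event in the whole eventstream.
    end = "session_end_cropped" if stream_last - user_last < timeout else "session_end"
    return start, end


timeout = timedelta(minutes=30)
stream_first = datetime(2023, 1, 1, 0, 0)
stream_last = datetime(2023, 1, 2, 0, 0)

# The user path begins 10 minutes after the eventstream starts
# and ends 12 hours before the eventstream ends.
names = boundary_event_names(
    datetime(2023, 1, 1, 0, 10),
    datetime(2023, 1, 1, 12, 0),
    stream_first,
    stream_last,
    timeout,
)
print(names)  # ('session_start_cropped', 'session_end')
```

Only the start of this path is marked as cropped, because only its first event falls within the timeout of the eventstream boundary.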

Returns:
Eventstream

Eventstream with new synthetic events and session_col.

event_name               event_type               timestamp
-----------------------  -----------------------  -----------
session_start            session_start            first_event
session_end              session_end              last_event
session_start_cropped    session_start_cropped    first_event
session_end_cropped      session_end_cropped      last_event

If the delta between the timestamps of two consecutive events (raw_event_n and raw_event_n+1) is greater than the selected timeout, the user will have more than one session:

user_id  event_name     event_type     timestamp      session_col
-------  -------------  -------------  -------------  -----------
1        session_start  session_start  first_event    1_0
1        session_end    session_end    raw_event_n    1_0
1        session_start  session_start  raw_event_n+1  1_1
1        session_end    session_end    last_event     1_1
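The splitting rule shown in the table above can be reproduced with a minimal, standard-library sketch. It is an illustration under stated assumptions, not the library's code: the function `split_user_path`, the sample event names, and the "raw" type given to original events are all hypothetical choices made for this example.

```python
from datetime import datetime, timedelta


def split_user_path(user_id, events, timeout):
    """Return rows (user_id, event_name, event_type, timestamp, session_id).

    `events` is a list of (event_name, timestamp) pairs for one user,
    already sorted by timestamp.
    """
    rows = []
    session_number = 0
    for i, (name, ts) in enumerate(events):
        session_id = f"{user_id}_{session_number}"
        if i == 0:
            # The first event of the path always opens a session.
            rows.append((user_id, "session_start", "session_start", ts, session_id))
        # "raw" as the type of original events is an assumption of this sketch.
        rows.append((user_id, name, "raw", ts, session_id))
        is_last = i == len(events) - 1
        if is_last or events[i + 1][1] - ts > timeout:
            # Close the session at the path end or before a gap longer than the timeout.
            rows.append((user_id, "session_end", "session_end", ts, session_id))
            if not is_last:
                # Open the next session at the timestamp of the following event.
                session_number += 1
                rows.append((user_id, "session_start", "session_start",
                             events[i + 1][1], f"{user_id}_{session_number}"))
    return rows


path = [
    ("catalog", datetime(2023, 1, 1, 10, 0)),
    ("cart", datetime(2023, 1, 1, 10, 5)),
    ("catalog", datetime(2023, 1, 1, 12, 0)),
]
rows = split_user_path(1, path, timeout=timedelta(minutes=30))

# Keep only the synthetic events to compare with the table.
synthetic = [(name, session_id) for _, name, event_type, _, session_id in rows
             if event_type != "raw"]
for row in rows:
    print(row)
```

With a 30-minute timeout, the gap between 10:05 and 12:00 closes session 1_0 at the earlier event and opens session 1_1 at the later one, which matches the four synthetic rows in the table.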

See also

TimedeltaHist

Plot the distribution of the time deltas between two events.

Eventstream.describe

Show general eventstream statistics.

Eventstream.describe_events

Show general eventstream events statistics.

Notes

See Data processors user guide for the details.

class retentioneering.data_processors_lib.split_sessions.SplitSessionsParams(*, timeout, mark_truncated=False, session_col='session_id')

A class with parameters for the SplitSessions class.

Eventstream

SplitSessionsHelperMixin.split_sessions(timeout, session_col='session_id', mark_truncated=False)

A method of the Eventstream class that creates new synthetic events in each user’s path: session_start (or session_start_cropped) and session_end (or session_end_cropped). The created events divide users’ paths into sessions. The method also creates a new column that contains the session number for each event in the input eventstream. The session number takes the form {user_id}_{session_number}, where session_number counts the sessions within one user path. The created events and column are added to the input eventstream.

Parameters:
See the parameters description in SplitSessions.

Returns:
Eventstream

Input eventstream with new synthetic events and session_col.