Cohorts

Cohorts Class

class retentioneering.tooling.cohorts.cohorts.Cohorts(eventstream)

A class that provides methods for cohort analysis. The users are split into groups depending on the time of their first appearance in the eventstream; thus each user is associated with some cohort_group. Retention rates of the active users belonging to each cohort_group are calculated within each cohort_period.

Parameters:
eventstream : EventstreamType

See also

Eventstream.cohorts

Call Cohorts tool as an eventstream method.

EventTimestampHist

Plot the distribution of events over time.

UserLifetimeHist

Plot the distribution of user lifetimes.

Notes

See Cohorts user guide for the details.
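
As a rough illustration of the cohort logic described for this class, the sketch below builds a tiny retention matrix in plain Python. It is not the library's implementation: the toy `events` list, the helpers `month_number` and `retention_matrix`, and the fixed one-month cohort size are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical toy event log: (user_id, event timestamp).
events = [
    ("u1", datetime(2021, 12, 28)),
    ("u1", datetime(2022, 1, 15)),
    ("u2", datetime(2021, 12, 30)),
    ("u3", datetime(2022, 1, 3)),
    ("u3", datetime(2022, 2, 10)),
]


def month_number(ts):
    """Count months on a continuous scale so that differences are easy."""
    return ts.year * 12 + (ts.month - 1)


def retention_matrix(events):
    """Return {cohort_group: {cohort_period: share of active users}}."""
    # A user's cohort_group is defined by the month of the first appearance.
    first_seen = {}
    for user, ts in events:
        if user not in first_seen or ts < first_seen[user]:
            first_seen[user] = ts

    # Collect the distinct users active in each (group, period) cell.
    active = defaultdict(set)
    for user, ts in events:
        group_ts = first_seen[user]
        group = f"{group_ts.year}-{group_ts.month:02d}"
        period = month_number(ts) - month_number(group_ts)
        active[(group, period)].add(user)

    # Every user is active in period 0, so that cell gives the cohort size.
    matrix = defaultdict(dict)
    for (group, period), users in active.items():
        matrix[group][period] = len(users) / len(active[(group, 0)])
    return dict(matrix)


matrix = retention_matrix(events)
```

In this toy data the December 2021 cohort contains two users, one of whom is active again in the next period, so its retention rate for period 1 is 0.5.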

fit(cohort_start_unit, cohort_period, average=True, cut_bottom=0, cut_right=0, cut_diagonal=0)

Calculates the cohort internal values with the defined parameters. The fit method must be called before any visualization or descriptive Cohorts method is used.

Parameters:
cohort_start_unit : DATETIME_UNITS

Defines how the moment from which the cohort count begins is rounded and formatted. The minimum timestamp of the eventstream is rounded down to the selected datetime unit.

For example, assume the eventstream has the minimum timestamp “2021-12-28 09:08:34.432456”. The table below shows the result of rounding with each of the DATETIME_UNITS:

cohort_start_unit    cohort_start_moment
Y                    2021-01-01 00:00:00
M                    2021-12-01 00:00:00
W                    2021-12-27 00:00:00
D                    2021-12-28 00:00:00
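
The rounding rule can be sketched with the standard library alone. The helper name `round_down` is hypothetical, and the sketch assumes that weeks start on Monday, which matches the “W” example for this timestamp (2021-12-27 is a Monday); the library itself may compute this differently.

```python
from datetime import datetime, timedelta


def round_down(ts, unit):
    """Round a timestamp down to the start of its year, month, week or day."""
    day_start = datetime(ts.year, ts.month, ts.day)
    if unit == "Y":
        return datetime(ts.year, 1, 1)
    if unit == "M":
        return datetime(ts.year, ts.month, 1)
    if unit == "W":
        # Assumption: a week starts on Monday (weekday() == 0).
        return day_start - timedelta(days=day_start.weekday())
    if unit == "D":
        return day_start
    raise ValueError(f"unsupported unit: {unit!r}")


min_timestamp = datetime(2021, 12, 28, 9, 8, 34, 432456)
start_moments = {unit: round_down(min_timestamp, unit) for unit in "YMWD"}
```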

cohort_period : Tuple(int, DATETIME_UNITS)

The size of the cohort period and its datetime unit (one of DATETIME_UNITS). This parameter is used to calculate:

  • Start moments for each cohort, counted from the moment specified with the cohort_start_unit parameter.

  • Cohort periods for each cohort, counted from its start moment.
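
The two uses of cohort_period listed above can be sketched as follows. This is a simplified illustration, not the library's code: the helpers `period_step`, `cohort_group_start` and `period_index` are hypothetical, and only the fixed-length units “D” and “W” are handled, because “M” and “Y” would need calendar arithmetic.

```python
from datetime import datetime, timedelta

# Assumption: only fixed-length units are supported in this sketch.
UNIT_DAYS = {"D": 1, "W": 7}


def period_step(cohort_period):
    """Convert a (size, unit) tuple into a timedelta."""
    size, unit = cohort_period
    return timedelta(days=size * UNIT_DAYS[unit])


def cohort_group_start(first_ts, start_moment, cohort_period):
    """Start moment of the cohort that a user's first event falls into."""
    step = period_step(cohort_period)
    return start_moment + ((first_ts - start_moment) // step) * step


def period_index(event_ts, group_start, cohort_period):
    """Zero-based cohort period of an event, counted from the cohort start."""
    return (event_ts - group_start) // period_step(cohort_period)


start_moment = datetime(2021, 12, 27)   # e.g. the "W" rounding result
cohort_period = (2, "W")                # two-week cohorts and periods

group_start = cohort_group_start(datetime(2022, 1, 12), start_moment, cohort_period)
index = period_index(datetime(2022, 2, 1), group_start, cohort_period)
```

With these inputs the user's first event on 2022-01-12 falls into the cohort starting on 2022-01-10, and an event on 2022-02-01 lands in that cohort's period 1.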

average : bool, default True
  • If True, the average is calculated for each cohort period.

  • If False, averaged values are not calculated.

cut_bottom : int, default 0

Drop ‘n’ rows from the bottom of the cohort matrix. The average is recalculated.

cut_right : int, default 0

Drop ‘n’ columns from the right side of the cohort matrix. The average is recalculated.

cut_diagonal : int, default 0

Replace values in ‘n’ diagonals (last period-group cells) with np.nan. The average is recalculated.
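
The effect of the three cut_* parameters can be pictured on a small matrix. The sketch below is only an interpretation of the descriptions above, written in plain Python rather than with the library: `cut_matrix` and `column_average` are hypothetical helpers, the order in which the cuts are applied is an assumption, and cut_diagonal is read here as "blank the last ‘n’ observed cells of every cohort row".

```python
import math

NAN = float("nan")

# Hypothetical cohort matrix: rows are cohort groups, columns are periods.
cohort_matrix = [
    [1.0, 0.5, 0.25],
    [1.0, 0.4, NAN],
    [1.0, NAN, NAN],
]


def cut_matrix(matrix, cut_bottom=0, cut_right=0, cut_diagonal=0):
    """Return a trimmed copy of the matrix (assumed order: bottom, right, diagonal)."""
    rows = [list(row) for row in matrix]
    rows = rows[: len(rows) - cut_bottom]
    rows = [row[: len(row) - cut_right] for row in rows]
    for row in rows:
        # Positions of the cells that still hold a value in this cohort row.
        observed = [j for j, value in enumerate(row) if not math.isnan(value)]
        for j in observed[max(0, len(observed) - cut_diagonal):]:
            row[j] = NAN
    return rows


def column_average(rows):
    """Average each period column, ignoring NaN cells."""
    averages = []
    for column in zip(*rows):
        values = [v for v in column if not math.isnan(v)]
        averages.append(sum(values) / len(values) if values else NAN)
    return averages


trimmed = cut_matrix(cohort_matrix, cut_diagonal=1)
averages = column_average(trimmed)
```

Recomputing `column_average` on the trimmed matrix mirrors the "average is recalculated" remark: the blanked cells no longer contribute to the per-period averages.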

Notes

The cohort_start_unit and cohort_period parameters should be consistent. Because “Y” and “M” are non-fixed units (their length varies), they can be used only with each other, or when the unit of cohort_period is more detailed than cohort_start_unit. For more information, see the numpy timedelta documentation.

Only cohorts with at least 1 user in some period are shown.

See Cohorts user guide for the details.

heatmap(width=5.0, height=5.0)

Builds a heatmap based on the calculated cohort matrix values. Should be used after fit().

Parameters:
width : float, default 5.0

Width of the figure in inches.

height : float, default 5.0

Height of the figure in inches.

Returns:
matplotlib.axes.Axes

lineplot(plot_type='cohorts', width=7.0, height=5.0)

Creates a chart showing the dynamics of each cohort over time. Should be used after fit().

Parameters:
plot_type : ‘cohorts’, ‘average’ or ‘all’, default ‘cohorts’
  • If ‘cohorts’, shows a line plot for each cohort.

  • If ‘average’, shows a line plot only for the values averaged over all the cohorts.

  • If ‘all’, shows a line plot for each cohort and also for their average values.

width : float, default 7.0

Width of the figure in inches.

height : float, default 5.0

Height of the figure in inches.

Returns:
matplotlib.axes.Axes

property params

Returns the parameters used for the last fitting. Should be used after fit().

property values

Returns a pd.DataFrame representing the calculated cohort matrix values. Should be used after fit().

Returns:
pd.DataFrame

Eventstream

Eventstream.cohorts(cohort_start_unit, cohort_period, average=True, cut_bottom=0, cut_right=0, cut_diagonal=0, width=5.0, height=5.0, show_plot=True)

Shows a heatmap visualization of user appearance grouped by cohorts.

Parameters:
show_plot : bool, default True

If True, a cohort matrix heatmap is shown.

For descriptions of the other parameters, see Cohorts.

Returns:
Cohorts

A Cohorts class instance fitted to the given parameters.