Cohorts
Cohorts Class
- class retentioneering.tooling.cohorts.cohorts.Cohorts(eventstream)
A class that provides methods for cohort analysis. Users are split into groups by the time of their first appearance in the eventstream, so each user is associated with a cohort_group. Retention rates of the active users belonging to each cohort_group are calculated within each cohort_period.
- Parameters:
- eventstream: EventstreamType
See also
Eventstream.cohorts
Call Cohorts tool as an eventstream method.
EventTimestampHist
Plot the distribution of events over time.
UserLifetimeHist
Plot the distribution of user lifetimes.
Notes
See Cohorts user guide for the details.
- fit(cohort_start_unit, cohort_period, average=True, cut_bottom=0, cut_right=0, cut_diagonal=0)
Calculates the internal cohort values with the given parameters. The fit method must be called before any visualization or descriptive Cohorts method is used.
- Parameters:
- cohort_start_unit: DATETIME_UNITS
How the moment from which the cohort count begins is rounded and formatted. The minimum timestamp is rounded down to the selected datetime unit.
For example, assume the eventstream has the minimum timestamp “2021-12-28 09:08:34.432456”. The table below shows the result of rounding it with different DATETIME_UNITS:

cohort_start_unit    cohort_start_moment
Y                    2021-01-01 00:00:00
M                    2021-12-01 00:00:00
W                    2021-12-27 00:00:00
D                    2021-12-28 00:00:00
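The rounding in the table can be reproduced with a minimal standard-library sketch. It is not the library's implementation: the helper name round_down is hypothetical, only the four units from the table are handled, and weeks are assumed to start on Monday, which is what the “W” row implies.

```python
from datetime import datetime, timedelta


def round_down(ts: datetime, unit: str) -> datetime:
    """Round ts down to the start of its year, month, week, or day."""
    midnight = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    if unit == "Y":
        return midnight.replace(month=1, day=1)
    if unit == "M":
        return midnight.replace(day=1)
    if unit == "W":
        # Assumption: a week starts on Monday (weekday() == 0).
        return midnight - timedelta(days=midnight.weekday())
    if unit == "D":
        return midnight
    raise ValueError(f"unsupported unit: {unit!r}")


min_timestamp = datetime(2021, 12, 28, 9, 8, 34, 432456)
for unit in ("Y", "M", "W", "D"):
    print(unit, round_down(min_timestamp, unit))
```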
- cohort_period: Tuple(int, DATETIME_UNITS)
The size of the cohort period and its DATETIME_UNIT. This parameter is used to calculate:
- Start moments for each cohort, counted from the moment specified with the cohort_start_unit parameter.
- Cohort periods for each cohort, counted from its start moment.
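The two calculations can be illustrated with a small sketch. It assumes a fixed-length period of 7 days (standing in for cohort_period=(7, "D")) and a start moment of 2021-12-27. The helper names are hypothetical, and calendar units such as “M” or “Y” would need calendar arithmetic instead of timedelta.

```python
from datetime import datetime, timedelta

# Assumed inputs for illustration only.
cohort_start_moment = datetime(2021, 12, 27)  # minimum timestamp rounded down
period = timedelta(days=7)                    # stands in for cohort_period=(7, "D")


def cohort_group_start(first_seen: datetime) -> datetime:
    """Start moment of the cohort that a user's first event falls into."""
    n = (first_seen - cohort_start_moment) // period
    return cohort_start_moment + n * period


def cohort_period_index(event_time: datetime, group_start: datetime) -> int:
    """Zero-based number of the cohort period an event falls into."""
    return (event_time - group_start) // period


first_seen = datetime(2022, 1, 5, 12, 0)
group_start = cohort_group_start(first_seen)
print(group_start, cohort_period_index(datetime(2022, 1, 20), group_start))
```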
- average: bool, default True
- If True, the average is calculated for each cohort period.
- If False, averaged values aren’t calculated.
- cut_bottom: int, default 0
Drop ‘n’ rows from the bottom of the cohort matrix. The average is recalculated.
- cut_right: int, default 0
Drop ‘n’ columns from the right side of the cohort matrix. The average is recalculated.
- cut_diagonal: int, default 0
Replace values in ‘n’ diagonals (last period-group cells) with np.nan. The average is recalculated.
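The effect of the three cut parameters can be sketched on a toy cohort matrix. This is an illustration under stated assumptions, not the library's code: the matrix values are made up, math.nan stands in for np.nan, “last period-group cells” is read as the last observed cells of each cohort row, and the order in which the cuts are applied is chosen arbitrarily.

```python
import math

nan = math.nan

# Toy matrix: rows are cohorts, columns are cohort periods (made-up values).
matrix = [
    [1.0, 0.5, 0.4, 0.3],
    [1.0, 0.6, 0.5, nan],
    [1.0, 0.7, nan, nan],
    [1.0, nan, nan, nan],
]


def cut_diagonal(rows, n):
    """Replace the last n observed (non-NaN) cells of every row with NaN."""
    result = []
    for row in rows:
        observed = [i for i, v in enumerate(row) if not math.isnan(v)]
        to_blank = set(observed[-n:]) if n > 0 else set()
        result.append([nan if i in to_blank else v for i, v in enumerate(row)])
    return result


def cut_bottom(rows, n):
    """Drop the n bottom rows."""
    return rows[: len(rows) - n]


def cut_right(rows, n):
    """Drop the n rightmost columns."""
    return [row[: len(row) - n] for row in rows]


def column_average(rows):
    """Mean of each column over the cohorts that still have a value there."""
    averages = []
    for col in zip(*rows):
        values = [v for v in col if not math.isnan(v)]
        averages.append(sum(values) / len(values) if values else nan)
    return averages


# Assumed order of application: diagonal first, then bottom, then right.
trimmed = cut_right(cut_bottom(cut_diagonal(matrix, 1), 1), 1)
average = column_average(trimmed)  # recalculated on the trimmed matrix
print(trimmed)
print(average)
```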
Notes
The cohort_start_unit and cohort_period parameters should be consistent. Because “Y” and “M” are non-fixed units, they can be used only with each other, or if cohort_period_unit (the unit given in cohort_period) is more detailed than cohort_start_unit. For more information, see the numpy documentation about timedelta.
Only cohorts with at least 1 user in some period are shown.
See Cohorts user guide for the details.
- heatmap(width=5.0, height=5.0)
Builds a heatmap based on the calculated cohort matrix values. Should be used after fit().
- Parameters:
- width: float, default 5.0
Width of the figure in inches.
- height: float, default 5.0
Height of the figure in inches.
- Returns:
- matplotlib.axes.Axes
- lineplot(plot_type='cohorts', width=7.0, height=5.0)
Creates a chart showing the dynamics of each cohort over time. Should be used after fit().
- Parameters:
- plot_type: ‘cohorts’, ‘average’ or ‘all’, default ‘cohorts’
- If cohorts, shows a lineplot for each cohort.
- If average, shows a lineplot only for the values averaged over all cohorts.
- If all, shows a lineplot for each cohort and for their average values.
- width: float, default 7.0
Width of the figure in inches.
- height: float, default 5.0
Height of the figure in inches.
- Returns:
- matplotlib.axes.Axes
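The call order described above (fit first, then the plotting methods) can be sketched as a small helper. The function name build_cohort_charts and the chosen argument values are illustrative assumptions; only the method names and keyword arguments come from the signatures documented on this page. The helper accepts any object exposing those methods, so it does not import the library itself.

```python
def build_cohort_charts(cohorts):
    """Fit a Cohorts-like object, then draw the heatmap and the lineplot.

    `cohorts` is expected to expose fit(), heatmap() and lineplot() as
    documented above, for example an instance created as Cohorts(eventstream).
    """
    # fit() has to run before any visualization method is called.
    cohorts.fit(
        cohort_start_unit="M",   # illustrative choice: monthly cohorts
        cohort_period=(1, "M"),  # illustrative choice: one-month periods
        average=True,
        cut_bottom=0,
        cut_right=0,
        cut_diagonal=0,
    )
    heatmap_axes = cohorts.heatmap(width=5.0, height=5.0)
    lineplot_axes = cohorts.lineplot(plot_type="all", width=7.0, height=5.0)
    return heatmap_axes, lineplot_axes
```

With the library installed, this would typically be called as build_cohort_charts(Cohorts(eventstream)), where eventstream is an existing eventstream object.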
Eventstream
- Eventstream.cohorts(cohort_start_unit, cohort_period, average=True, cut_bottom=0, cut_right=0, cut_diagonal=0, width=5.0, height=5.0, show_plot=True)
Show a heatmap visualization of the user appearance grouped by cohorts.
- Parameters:
- show_plot: bool, default True
If True, a cohort matrix heatmap is shown.
- See the description of the other parameters.
- Returns:
- Cohorts
A Cohorts class instance fitted to the given parameters.
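Because the method returns a fitted Cohorts instance, it can serve as a one-call shortcut. The sketch below is illustrative: the helper name fit_cohorts_quietly and the argument values are assumptions, while the keyword arguments come from the signature above. It takes any object exposing a cohorts() method with that signature.

```python
def fit_cohorts_quietly(stream):
    """Return a fitted Cohorts-like object without showing the heatmap."""
    return stream.cohorts(
        cohort_start_unit="M",   # illustrative choice: monthly cohorts
        cohort_period=(1, "M"),  # illustrative choice: one-month periods
        average=True,
        show_plot=False,         # skip the heatmap, keep the fitted object
    )
```

The returned object can then be passed to heatmap() or lineplot() without calling fit() again, since it is already fitted to the given parameters.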