Stattests
=========

The Stattests class comprises simple utilities for two-group statistical
hypothesis testing.

Loading data
------------

Throughout this guide we use our demonstration :doc:`simple_shop` dataset.
It has already been converted to :doc:`Eventstream` and assigned to the
``stream`` variable. If you want to use your own dataset, upload it
following :ref:`this instruction`.

.. code-block:: python

    import numpy as np
    from retentioneering import datasets

    stream = datasets.load_simple_shop()

General usage
-------------

The primary way to use Stattests is to call the
:py:meth:`Eventstream.stattests()` method. Beforehand, we need to set the
following arguments: ``groups`` and ``func``. The former defines the two
user groups to be compared; the latter defines a metric of interest to be
compared between these two groups.

For our first example, we will split users 50/50 based on the index:

.. code-block:: python

    data = stream.to_dataframe()
    users = data['user_id'].unique()
    index_separator = int(users.shape[0] / 2)
    user_groups = users[:index_separator], users[index_separator:]

    print(user_groups[0])
    print(user_groups[1])

.. parsed-literal::

    array([219483890, 964964743, 629881394, ..., 901422808, 523047643,
           724268790])
    array([315196393, 443659932, 865093748, ..., 965024600, 831491833,
           962761227])

Optionally, we can define the names of the groups to be displayed in the
method output with the ``group_names`` argument.

Let us say we are interested in the proportion of ``cart`` events in a
user's path, so the ``func`` parameter will look like this:

.. code-block:: python

    def cart_share(df):
        return len(df[df['event'] == 'cart']) / len(df)

The ``func`` function expects its single argument to be a
``pandas.DataFrame`` that contains a single user's path. The function
output must be either a scalar number or a string. For example, if we pick
a ``some_user`` id and apply the ``cart_share`` function to the
corresponding trajectory, we get the metric value for that single user.

.. code-block:: python

    some_user = user_groups[0][378]
    cart_share(data[data['user_id'] == some_user])

.. parsed-literal::

    0.15384615384615385

Now let us run the test, which is defined by the ``test`` argument. There
is no need to specify a test hypothesis type: when applicable, the method
computes the statistics for both one-sided hypothesis tests, and
``Stattests`` reports the one that could be significant, indicating which
group's metric value could be *greater*:

.. code-block:: python

    stream.stattests(
        groups=user_groups,
        func=cart_share,
        group_names=['random_group_1', 'random_group_2'],
        test='ttest'
    )

.. parsed-literal::

    random_group_1 (mean ± SD): 0.075 ± 0.095, n = 1875
    random_group_2 (mean ± SD): 0.078 ± 0.102, n = 1876
    'random_group_1' is greater than 'random_group_2' with p-value: 0.21369
    power of the test: 8.85%

The method outputs the test p-value, along with group statistics and an
estimate of the test power (a heuristic designed for the t-test). As
expected, we see that the p-value is too high to register a statistical
difference.

Test power
~~~~~~~~~~

Changing the ``alpha`` parameter will influence the estimated power of the
test. For example, if we lower it to 0.01 (from the default 0.05), we would
expect the power to drop as well:

.. code-block:: python

    stream.stattests(
        groups=user_groups,
        func=cart_share,
        group_names=['random_group_1', 'random_group_2'],
        test='ttest',
        alpha=0.01
    )

.. parsed-literal::

    random_group_1 (mean ± SD): 0.075 ± 0.095, n = 1875
    random_group_2 (mean ± SD): 0.078 ± 0.102, n = 1876
    'random_group_1' is greater than 'random_group_2' with p-value: 0.21369
    power of the test: 2.11%
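The reported power behaves like the power of a standard two-sample t-test:
for a fixed effect size and sample size, a stricter ``alpha`` leaves less
room to detect a difference. The sketch below reproduces this relationship
with ``statsmodels``. Note that using ``TTestIndPower`` is an assumption
made for illustration, not necessarily what ``Stattests`` computes
internally, so the numbers may not match the output above exactly.

.. code-block:: python

    # Illustration only: assumes the power heuristic is comparable to a
    # textbook two-sample t-test power calculation (TTestIndPower is not
    # necessarily what Stattests uses internally).
    import numpy as np
    from statsmodels.stats.power import TTestIndPower

    # Group statistics taken from the output above
    m1, s1, n1 = 0.075, 0.095, 1875
    m2, s2, n2 = 0.078, 0.102, 1876

    # Cohen's d based on the pooled standard deviation
    pooled_sd = np.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    effect_size = abs(m1 - m2) / pooled_sd

    for alpha in (0.05, 0.01):
        power = TTestIndPower().solve_power(
            effect_size=effect_size, nobs1=n1, alpha=alpha, ratio=n2 / n1
        )
        print(f"alpha={alpha}: power = {power:.2%}")

Whatever the exact heuristic, the takeaway is the same: with such a small
observed effect, both power estimates fall far below the conventional 80%
target, so a non-significant result here is weak evidence of no difference.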
Categorical variables
~~~~~~~~~~~~~~~~~~~~~

We might be interested in testing for a difference in a categorical
variable, for instance, a variable indicating whether a user entered the
``cart`` state zero, one, two, or more than two times. In such cases, a
contingency table independence test could be suitable.

Let us check if the distribution of this variable differs between the
users who checked:

- ``product1`` exclusively,
- ``product2`` exclusively.

.. code-block:: python

    user_group_1 = set(data[data['event'] == 'product1']['user_id'])
    user_group_2 = set(data[data['event'] == 'product2']['user_id'])

    # Compute the intersection before subtracting; subtracting from
    # user_group_1 first would empty the intersection, so the shared users
    # would never be removed from user_group_2.
    common_users = user_group_1 & user_group_2
    user_group_1 -= common_users
    user_group_2 -= common_users

.. code-block:: python

    def cart_count(df):
        cart_count = len(df[df['event'] == 'cart'])
        if cart_count <= 2:
            return str(cart_count)
        return '>2'

    some_user = user_groups[0][378]
    cart_count(data[data['user_id'] == some_user])

.. parsed-literal::

    '2'

.. code-block:: python

    some_user = user_groups[0][379]
    cart_count(data[data['user_id'] == some_user])

.. parsed-literal::

    '0'

To test for a statistical difference between the distributions of the
``0``, ``1``, ``2``, and ``>2`` categories, we apply the
``chi2_contingency`` test.

.. code-block:: python

    stream.stattests(
        groups=(user_group_1, user_group_2),
        func=cart_count,
        group_names=('product_1_group', 'product_2_group'),
        test='chi2_contingency'
    )

.. parsed-literal::

    product_1_group (size): n = 580
    product_2_group (size): n = 1430
    Group difference test with p-value: 0.00000

In this case, the output contains only the ``group_names``, the group
sizes, and the resulting test p-value. We can see that the variable of
interest indeed differs between the exclusive users of the two products.
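For transparency, the same comparison can be assembled by hand with pandas
and SciPy. The sketch below assumes that the ``chi2_contingency`` test
boils down to a category-by-group contingency table passed to
``scipy.stats.chi2_contingency``; the method's internals may differ in
detail, so treat this as a cross-check rather than a reimplementation.

.. code-block:: python

    # Hand-rolled cross-check of the chi-square test above. Assumes the
    # comparison reduces to a category-by-group contingency table; the
    # internals of Stattests may differ in detail.
    import pandas as pd
    from scipy.stats import chi2_contingency

    # Compute the categorical metric for every user in either group
    both_groups = user_group_1 | user_group_2
    metric = (
        data[data['user_id'].isin(both_groups)]
        .groupby('user_id')
        .apply(cart_count)
    )

    # Rows: group membership; columns: '0', '1', '2', '>2'
    in_group_1 = metric.index.isin(user_group_1)
    table = pd.crosstab(in_group_1, metric)

    chi2, p_value, dof, _ = chi2_contingency(table)
    print(f"chi2 = {chi2:.2f}, p-value = {p_value:.5f}, dof = {dof}")

A p-value close to the method's output would confirm the reading above:
the distribution of cart-visit counts depends on which product group a
user belongs to.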