Transition matrix
=================

The transition matrix shows how frequently each transition between a pair of events occurs. It is closely related to the :doc:`transition graph`: each value of a transition matrix is essentially an edge weight of the corresponding transition graph. For example, the weight of the ``A → B`` transition is located in row ``A`` and column ``B`` of the transition matrix. See :ref:`this` and :ref:`this` section of the transition graph user guide for details.

Loading data
------------

Throughout this guide we use our demonstration :doc:`simple_shop` dataset. It has already been converted to an :doc:`Eventstream` and assigned to the ``stream`` variable. If you want to use your own dataset, load it by following :ref:`this instruction`.

.. code-block:: python

    from retentioneering import datasets

    # Load the demonstration dataset as an Eventstream.
    stream = datasets.load_simple_shop()

A basic example
---------------

Similar to the transition graph's parameters :ref:`edges_norm_type and edges_weight_col`, the transition matrix has ``norm_type`` and ``weight_col`` arguments. Their default values are ``norm_type=None`` and ``weight_col='user_id'``.

.. code-block:: python

    stream.transition_matrix(norm_type='node', weight_col='user_id')

.. figure:: /_static/user_guides/transition_matrix/transition_matrix_basic.png
    :width: 500

For example, this matrix shows that under the ``norm_type='node'`` and ``weight_col='user_id'`` configuration the weight of the edge ``cart → catalog`` is 25%. This means that 25% of the users who had a ``cart`` event also had a ``cart → catalog`` transition.

Several arguments control the appearance of a transition matrix.

- By default, we do not recommend plotting a matrix of dimension greater than 60. To override this restriction, set ``show_large_matrix=True`` explicitly.
- Matrix values are displayed if the matrix dimension is 30 or less, and hidden otherwise. To show or hide them explicitly, use the ``show_values`` flag.
- The ``precision`` argument sets the number of digits after the decimal point.
- ``heatmap_axis`` allows you to color each row or each column with a separate heatmap.

.. code-block:: python

    stream.transition_matrix(norm_type='node', weight_col='event_id', heatmap_axis=0)

.. figure:: /_static/user_guides/transition_matrix/transition_matrix_heatmap_axis.png
    :width: 500

This is a row-wise heatmap of the Markov matrix (``norm_type='node'``, ``weight_col='event_id'``). It highlights a basic property of a Markov transition matrix: each row sums to 1.

The next example demonstrates how to hide the cell values and make the image smaller:

.. code-block:: python

    stream.transition_matrix(
        norm_type='node',
        weight_col='event_id',
        show_values=False,
        figsize=(4, 4)
    )

.. figure:: /_static/user_guides/transition_matrix/transition_matrix_no_values.png
    :width: 300

Differential transition matrix
------------------------------

Similar to some other tools (e.g. :ref:`step matrix`, :ref:`funnels`, :ref:`sequences`), the transition matrix supports comparing two groups of users. If M1 and M2 are the transition matrices for the first and the second group, then the differential matrix is defined as M1 - M2.

The ``groups`` argument defines the groups of paths to compare. You can pass either a collection of two collections containing path ids, or a pair of pre-defined segment values. The latter option is often preferable since pre-defined segments allow you to compare the same groups using other Retentioneering tools too. For example, if you have a segment ``country``, you can compare the users from the US and the UK like this:

.. code-block:: python

    stream.transition_matrix(groups=['country', 'US', 'UK'])

See :doc:`the segments user guide` for more details on how to create and use segments.

If you do not want to create a segment explicitly, you can pass a pair of collections containing path ids on the fly like this:

.. code-block:: python

    group1 = [39690243, 56229892, 770891782, 189849617, 345530386]
    group2 = [950233183, 681437279, 816957536, 913156199, 614680680]

    stream.transition_matrix(
        weight_col='user_id',
        norm_type='full',
        groups=[group1, group2]
    )

.. figure:: /_static/user_guides/transition_matrix/diff_transition_matrix.png
    :width: 500

Using a separate instance
-------------------------

:py:meth:`Eventstream.transition_matrix()` returns an instance of the :py:meth:`TransitionMatrix` class. Its ``values`` attribute holds the transition matrix as a ``pandas.DataFrame``. To suppress plotting the heatmap, use the ``show_plot=False`` flag.

.. code-block:: python

    tm = stream.transition_matrix(show_plot=False)
    tm.values

============  ======  =======  ===  ============  ============
\             cart    catalog  ...  payment_done  payment_cash
============  ======  =======  ===  ============  ============
cart          1.0     571.0    ...  0.0           0.0
catalog       1709.0  4857.0   ...  0.0           0.0
...           ...     ...      ...  ...           ...
payment_done  0.0     0.0      ...  0.0           0.0
payment_cash  0.0     0.0      ...  104.0         0.0
============  ======  =======  ===  ============  ============
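
To make the definitions used in this guide more concrete, here is a minimal sketch in plain Python that does not use the Retentioneering API. It counts transitions in a few made-up paths, normalizes each row so that it sums to 1, and subtracts two such matrices to obtain a differential matrix. We assume here that row normalization is what ``norm_type='node'`` with ``weight_col='event_id'`` amounts to, as the Markov property mentioned above suggests. The helper names ``count_transitions``, ``normalize_rows``, ``diff_matrix`` and the toy paths are hypothetical and serve only as an illustration.

.. code-block:: python

    from collections import defaultdict


    def count_transitions(paths):
        """Count transitions between consecutive events in every path."""
        counts = defaultdict(lambda: defaultdict(int))
        for path in paths:
            for source, target in zip(path, path[1:]):
                counts[source][target] += 1
        return {source: dict(targets) for source, targets in counts.items()}


    def normalize_rows(counts):
        """Divide each row by its total so that the row sums to 1."""
        matrix = {}
        for source, targets in counts.items():
            total = sum(targets.values())
            matrix[source] = {t: n / total for t, n in targets.items()}
        return matrix


    def diff_matrix(m1, m2):
        """Compute M1 - M2, treating missing cells as 0."""
        result = {}
        for source in set(m1) | set(m2):
            row1, row2 = m1.get(source, {}), m2.get(source, {})
            result[source] = {
                t: row1.get(t, 0.0) - row2.get(t, 0.0)
                for t in set(row1) | set(row2)
            }
        return result


    # Hypothetical toy paths for two groups of users (assumption, not real data).
    group1_paths = [["catalog", "cart", "payment_done"], ["catalog", "cart", "catalog"]]
    group2_paths = [["catalog", "catalog", "cart"], ["catalog", "cart", "catalog"]]

    m1 = normalize_rows(count_transitions(group1_paths))
    m2 = normalize_rows(count_transitions(group2_paths))
    diff = diff_matrix(m1, m2)

    print(m1["cart"])    # row of the first group's matrix for the "cart" event
    print(diff["cart"])  # the same row of the differential matrix

A positive cell in ``diff`` means the transition is relatively more frequent in the first group, a negative cell means it is more frequent in the second one.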