.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "tutorials/online_user_actions.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_tutorials_online_user_actions.py>`
        to download the full example code.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_tutorials_online_user_actions.py:

.. _tutorials-online-user-actions:

===================
Online user actions
===================

This example reproduces a typical data science situation in an internet company. We start from a pandas DataFrame with online user actions, for instance for an online text editor: the user can create a page, edit it, or delete it. We want to construct and visualize a graph of the users highlighting collaborations on the same page/project.

.. GENERATED FROM PYTHON SOURCE LINES 10-16

.. code-block:: Python

    import igraph as ig
    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt

.. GENERATED FROM PYTHON SOURCE LINES 17-21

Let's start by preparing some toy data representing online users. Each row indicates a certain action taken by a user (e.g. a click on a button within a website). Actual user data usually come with a timestamp, but that's not essential for this example.

.. GENERATED FROM PYTHON SOURCE LINES 21-41

.. code-block:: Python

    action_dataframe = pd.DataFrame(
        [
            ["dsj3239asadsa3", "createPage", "greatProject"],
            ["2r09ej221sk2k5", "editPage", "greatProject"],
            ["dsj3239asadsa3", "editPage", "greatProject"],
            ["789dsadafj32jj", "editPage", "greatProject"],
            ["oi32ncwosap399", "editPage", "greatProject"],
            ["4r4320dkqpdokk", "createPage", "miniProject"],
            ["320eljl3lk3239", "editPage", "miniProject"],
            ["dsj3239asadsa3", "editPage", "miniProject"],
            ["3203ejew332323", "createPage", "private"],
            ["3203ejew332323", "editPage", "private"],
            ["40m11919332msa", "createPage", "private2"],
            ["40m11919332msa", "editPage", "private2"],
            ["dsj3239asadsa3", "createPage", "anotherGreatProject"],
            ["2r09ej221sk2k5", "editPage", "anotherGreatProject"],
        ],
        columns=["userid", "action", "project"],
    )

.. GENERATED FROM PYTHON SOURCE LINES 42-46

The goal of this example is to check when two users worked on the same page. We choose to use a weighted adjacency matrix for this, i.e. a table with rows and columns indexed by the users that has nonzero entries whenever two users collaborate. First, let's get the users and prepare an empty matrix:

.. GENERATED FROM PYTHON SOURCE LINES 46-53

.. code-block:: Python

    users = action_dataframe["userid"].unique()
    adjacency_matrix = pd.DataFrame(
        np.zeros((len(users), len(users)), np.int32),
        index=users,
        columns=users,
    )

.. GENERATED FROM PYTHON SOURCE LINES 54-55

Then, let's iterate over all projects one by one, and add all collaborations:

.. GENERATED FROM PYTHON SOURCE LINES 55-61

.. code-block:: Python

    for _project, project_data in action_dataframe.groupby("project"):
        project_users = project_data["userid"].values
        for i1, user1 in enumerate(project_users):
            for user2 in project_users[:i1]:
                adjacency_matrix.at[user1, user2] += 1

.. GENERATED FROM PYTHON SOURCE LINES 62-64

There are many ways to achieve the above matrix, so don't be surprised if you came up with another algorithm ;-) Now it's time to make the graph:

.. GENERATED FROM PYTHON SOURCE LINES 64-66

.. code-block:: Python

    g = ig.Graph.Weighted_Adjacency(adjacency_matrix, mode="plus")

.. GENERATED FROM PYTHON SOURCE LINES 67-69

We can take a look at the graph via plotting functions. We can first make a layout:

.. GENERATED FROM PYTHON SOURCE LINES 69-71

.. code-block:: Python

    layout = g.layout("circle")

.. GENERATED FROM PYTHON SOURCE LINES 72-73

Then we can prepare vertex sizes based on their closeness to other vertices:

.. GENERATED FROM PYTHON SOURCE LINES 73-76

.. code-block:: Python

    vertex_size = g.closeness()
    vertex_size = [10 * v**2 if not np.isnan(v) else 10 for v in vertex_size]

.. GENERATED FROM PYTHON SOURCE LINES 77-78

Finally, we can plot the graph:

.. GENERATED FROM PYTHON SOURCE LINES 78-90

.. code-block:: Python

    fig, ax = plt.subplots()
    ig.plot(
        g,
        target=ax,
        layout=layout,
        vertex_label=g.vs["name"],
        vertex_color="lightblue",
        vertex_size=vertex_size,
        edge_width=g.es["weight"],
    )
    plt.show()

.. image-sg:: /tutorials/images/sphx_glr_online_user_actions_001.png
   :alt: online user actions
   :srcset: /tutorials/images/sphx_glr_online_user_actions_001.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 91-93

Loops indicate "self-collaborations", which are not very meaningful. To filter out loops without losing the edge weights, we can use:

.. GENERATED FROM PYTHON SOURCE LINES 93-106

.. code-block:: Python

    g = g.simplify(combine_edges="first")

    fig, ax = plt.subplots()
    ig.plot(
        g,
        target=ax,
        layout=layout,
        vertex_label=g.vs["name"],
        vertex_color="lightblue",
        vertex_size=vertex_size,
        edge_width=g.es["weight"],
    )
    plt.show()

.. image-sg:: /tutorials/images/sphx_glr_online_user_actions_002.png
   :alt: online user actions
   :srcset: /tutorials/images/sphx_glr_online_user_actions_002.png
   :class: sphx-glr-single-img

.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 0.160 seconds)

.. _sphx_glr_download_tutorials_online_user_actions.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: online_user_actions.ipynb <online_user_actions.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: online_user_actions.py <online_user_actions.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: online_user_actions.zip <online_user_actions.zip>`

.. only:: html

  .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_
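
As the tutorial notes, there are many ways to build the collaboration weights. The following is a minimal sketch of one alternative, not the tutorial's own method: it uses only the Python standard library and counts each pair of distinct users once per shared project. Unlike the nested loop in the tutorial, it deduplicates users within a project, so it never produces "self-collaborations" and its weights count shared projects rather than pairs of actions. The names ``actions``, ``users_by_project`` and ``pair_weights``, as well as the short user ids, are hypothetical and chosen only for illustration.

.. code-block:: Python

    from collections import Counter
    from itertools import combinations

    # Hypothetical (userid, project) pairs; not the tutorial's data
    actions = [
        ("alice", "greatProject"),
        ("bob", "greatProject"),
        ("alice", "greatProject"),
        ("carol", "miniProject"),
        ("alice", "miniProject"),
        ("dave", "private"),
        ("dave", "private"),
    ]

    # Collect the set of distinct users that touched each project
    users_by_project = {}
    for user, project in actions:
        users_by_project.setdefault(project, set()).add(user)

    # Count every unordered pair of distinct users once per shared project
    pair_weights = Counter()
    for project_users in users_by_project.values():
        for pair in combinations(sorted(project_users), 2):
            pair_weights[pair] += 1

    for (user1, user2), weight in sorted(pair_weights.items()):
        print(user1, user2, weight)

Users who never share a project with anyone (``dave`` in this sketch) get no entry at all, so they would have to be added separately as isolated vertices if they should still appear in the graph.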