.. DO NOT EDIT. .. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY. .. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE: .. "tutorials/online_user_actions.py" .. LINE NUMBERS ARE GIVEN BELOW. .. only:: html .. note:: :class: sphx-glr-download-link-note :ref:`Go to the end ` to download the full example code. .. rst-class:: sphx-glr-example-title .. _sphx_glr_tutorials_online_user_actions.py: .. _tutorials-online-user-actions: =================== Online user actions =================== This example reproduces a typical data science situation in an internet company. We start from a pandas DataFrame with online user actions, for instance for an online text editor: the user can create a page, edit it, or delete it. We want to construct and visualize a graph of the users highlighting collaborations on the same page/project. .. GENERATED FROM PYTHON SOURCE LINES 10-16 .. code-block:: Python import igraph as ig import numpy as np import pandas as pd import matplotlib.pyplot as plt .. GENERATED FROM PYTHON SOURCE LINES 17-21 Let's start by preparing some toy data representing online users. Each row indicates a certain action taken by a user (e.g. click on a button within a website). Actual user data usually come with time stamp, but that's not essential for this example. .. GENERATED FROM PYTHON SOURCE LINES 21-40 .. code-block:: Python action_dataframe = pd.DataFrame([ ['dsj3239asadsa3', 'createPage', 'greatProject'], ['2r09ej221sk2k5', 'editPage', 'greatProject'], ['dsj3239asadsa3', 'editPage', 'greatProject'], ['789dsadafj32jj', 'editPage', 'greatProject'], ['oi32ncwosap399', 'editPage', 'greatProject'], ['4r4320dkqpdokk', 'createPage', 'miniProject'], ['320eljl3lk3239', 'editPage', 'miniProject'], ['dsj3239asadsa3', 'editPage', 'miniProject'], ['3203ejew332323', 'createPage', 'private'], ['3203ejew332323', 'editPage', 'private'], ['40m11919332msa', 'createPage', 'private2'], ['40m11919332msa', 'editPage', 'private2'], ['dsj3239asadsa3', 'createPage', 'anotherGreatProject'], ['2r09ej221sk2k5', 'editPage', 'anotherGreatProject'], ], columns=['userid', 'action', 'project'], ) .. GENERATED FROM PYTHON SOURCE LINES 41-45 The goal of this example is to check when two users worked on the same page. We choose to use a weighted adjacency matrix for this, i.e. a table with rows and columns indexes by the users that has nonzero entries whenever folks collaborate. First, let's get the users and prepare an empty matrix: .. GENERATED FROM PYTHON SOURCE LINES 45-52 .. code-block:: Python users = action_dataframe['userid'].unique() adjacency_matrix = pd.DataFrame( np.zeros((len(users), len(users)), np.int32), index=users, columns=users, ) .. GENERATED FROM PYTHON SOURCE LINES 53-54 Then, let's iterate over all projects one by one, and add all collaborations: .. GENERATED FROM PYTHON SOURCE LINES 54-60 .. code-block:: Python for _project, project_data in action_dataframe.groupby('project'): project_users = project_data['userid'].values for i1, user1 in enumerate(project_users): for user2 in project_users[:i1]: adjacency_matrix.at[user1, user2] += 1 .. GENERATED FROM PYTHON SOURCE LINES 61-63 There are many ways to achieve the above matrix, so don't be surprised if you came up with another algorithm ;-) Now it's time to make the graph: .. GENERATED FROM PYTHON SOURCE LINES 63-65 .. code-block:: Python g = ig.Graph.Weighted_Adjacency(adjacency_matrix, mode='plus') .. GENERATED FROM PYTHON SOURCE LINES 66-68 We can take a look at the graph via plotting functions. We can first make a layout: .. GENERATED FROM PYTHON SOURCE LINES 68-70 .. code-block:: Python layout = g.layout('circle') .. GENERATED FROM PYTHON SOURCE LINES 71-72 Then we can prepare vertex sizes based on their closeness to other vertices .. GENERATED FROM PYTHON SOURCE LINES 72-75 .. code-block:: Python vertex_size = g.closeness() vertex_size = [10 * v**2 if not np.isnan(v) else 10 for v in vertex_size] .. GENERATED FROM PYTHON SOURCE LINES 76-77 Finally, we can plot the graph: .. GENERATED FROM PYTHON SOURCE LINES 77-89 .. code-block:: Python fig, ax = plt.subplots() ig.plot( g, target=ax, layout=layout, vertex_label=g.vs['name'], vertex_color="lightblue", vertex_size=vertex_size, edge_width=g.es["weight"], ) plt.show() .. image-sg:: /tutorials/images/sphx_glr_online_user_actions_001.png :alt: online user actions :srcset: /tutorials/images/sphx_glr_online_user_actions_001.png :class: sphx-glr-single-img .. GENERATED FROM PYTHON SOURCE LINES 90-92 Loops indicate "self-collaborations", which are not very meaningful. To filter out loops without losing the edge weights, we can use: .. GENERATED FROM PYTHON SOURCE LINES 92-105 .. code-block:: Python g = g.simplify(combine_edges='first') fig, ax = plt.subplots() ig.plot( g, target=ax, layout=layout, vertex_label=g.vs['name'], vertex_color="lightblue", vertex_size=vertex_size, edge_width=g.es["weight"], ) plt.show() .. image-sg:: /tutorials/images/sphx_glr_online_user_actions_002.png :alt: online user actions :srcset: /tutorials/images/sphx_glr_online_user_actions_002.png :class: sphx-glr-single-img .. rst-class:: sphx-glr-timing **Total running time of the script:** (0 minutes 0.532 seconds) .. _sphx_glr_download_tutorials_online_user_actions.py: .. only:: html .. container:: sphx-glr-footer sphx-glr-footer-example .. container:: sphx-glr-download sphx-glr-download-jupyter :download:`Download Jupyter notebook: online_user_actions.ipynb ` .. container:: sphx-glr-download sphx-glr-download-python :download:`Download Python source code: online_user_actions.py ` .. container:: sphx-glr-download sphx-glr-download-zip :download:`Download zipped: online_user_actions.zip ` .. only:: html .. rst-class:: sphx-glr-signature `Gallery generated by Sphinx-Gallery `_