.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "tutorials/stochastic_variability.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_tutorials_stochastic_variability.py>`
        to download the full example code.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_tutorials_stochastic_variability.py:


.. _tutorials-stochastic-variability:

=========================================================
Stochastic Variability in Community Detection Algorithms
=========================================================

This example demonstrates the use of stochastic community detection methods to check whether a network possesses a strong community structure, and whether the partitionings we obtain are meaningul. Many community detection algorithms are randomized, and return somewhat different results after each run, depending on the random seed that was set. When there is a robust community structure, we expect these results to be similar to each other. When the community structure is weak or non-existent, the results may be noisy and highly variable. We will employ several partion similarity measures to analyse the consistency of the results, including the normalized mutual information (NMI), the variation of information (VI), and the Rand index (RI).

.. GENERATED FROM PYTHON SOURCE LINES 13-18

.. code-block:: Python

    import igraph as ig
    import matplotlib.pyplot as plt
    import itertools
    import random


.. GENERATED FROM PYTHON SOURCE LINES 19-22

.. note::
  We set a random seed to ensure that the results look exactly the same in
  the gallery. You don't need to do this when exploring randomness.

.. GENERATED FROM PYTHON SOURCE LINES 22-24

.. code-block:: Python

    random.seed(42)


.. GENERATED FROM PYTHON SOURCE LINES 25-27

We will use Zachary's karate club dataset [1]_, a classic example of a network
with a strong community structure:

.. GENERATED FROM PYTHON SOURCE LINES 27-29

.. code-block:: Python

    karate = ig.Graph.Famous("Zachary")


.. GENERATED FROM PYTHON SOURCE LINES 30-34

We will compare it to an an Erdős-Rényi :math:`G(n, m)` random network having
the same number of vertices and edges. The parameters 'n' and 'm' refer to the
vertex and edge count, respectively. Since this is a random network, it should
have no community structure.

.. GENERATED FROM PYTHON SOURCE LINES 34-36

.. code-block:: Python

    random_graph = ig.Graph.Erdos_Renyi(n=karate.vcount(), m=karate.ecount())


.. GENERATED FROM PYTHON SOURCE LINES 37-38

First, let us plot the two networks for a visual comparison:

.. GENERATED FROM PYTHON SOURCE LINES 38-69

.. code-block:: Python


    # Create subplots
    fig, axes = plt.subplots(1, 2, figsize=(12, 6), subplot_kw={"aspect": "equal"})

    # Karate club network
    ig.plot(
        karate,
        target=axes[0],
        vertex_color="lightblue",
        vertex_size=30,
        vertex_label=range(karate.vcount()),
        vertex_label_size=10,
        edge_width=1,
    )
    axes[0].set_title("Karate club network")

    # Random network
    ig.plot(
        random_graph,
        target=axes[1],
        vertex_color="lightcoral",
        vertex_size=30,
        vertex_label=range(random_graph.vcount()),
        vertex_label_size=10,
        edge_width=1,
    )
    axes[1].set_title("Erdős-Rényi random network")

    plt.show()


.. image-sg:: /tutorials/images/sphx_glr_stochastic_variability_001.png
   :alt: Karate club network, Erdős-Rényi random network
   :srcset: /tutorials/images/sphx_glr_stochastic_variability_001.png
   :class: sphx-glr-single-img


.. GENERATED FROM PYTHON SOURCE LINES 70-71

Function to compute similarity between partitions using various methods:

.. GENERATED FROM PYTHON SOURCE LINES 71-81

.. code-block:: Python

    def compute_pairwise_similarity(partitions, method):
        similarities = []

        for p1, p2 in itertools.combinations(partitions, 2):
            similarity = ig.compare_communities(p1, p2, method=method)
            similarities.append(similarity)

        return similarities


.. GENERATED FROM PYTHON SOURCE LINES 82-89

The Leiden method, accessible through :meth:`igraph.Graph.community_leiden()`,
is a modularity maximization approach for community detection.  Since exact
modularity maximization is NP-hard, the algorithm employs a greedy heuristic
that processes vertices in a random order.  This randomness leads to
variation in the detected communities across different runs, which is why
results may differ each time the method is applied. The following function
runs the Leiden algorithm multiple times:

.. GENERATED FROM PYTHON SOURCE LINES 89-100

.. code-block:: Python

    def run_experiment(graph, iterations=100):
        partitions = [
            graph.community_leiden(objective_function="modularity").membership
            for _ in range(iterations)
        ]
        nmi_scores = compute_pairwise_similarity(partitions, method="nmi")
        vi_scores = compute_pairwise_similarity(partitions, method="vi")
        ri_scores = compute_pairwise_similarity(partitions, method="rand")
        return nmi_scores, vi_scores, ri_scores


.. GENERATED FROM PYTHON SOURCE LINES 101-102

Run the experiment on both networks:

.. GENERATED FROM PYTHON SOURCE LINES 102-105

.. code-block:: Python

    nmi_karate, vi_karate, ri_karate = run_experiment(karate)
    nmi_random, vi_random, ri_random = run_experiment(random_graph)


.. GENERATED FROM PYTHON SOURCE LINES 106-108

Finally, let us plot histograms of the pairwise similarities of the obtained
partitionings to understand the result:

.. GENERATED FROM PYTHON SOURCE LINES 108-151

.. code-block:: Python

    fig, axes = plt.subplots(2, 3, figsize=(12, 6))
    measures = [
        # Normalized Mutual Information (0-1, higher = more similar)
        (nmi_karate, nmi_random, "NMI", 0, 1),
        # Variation of Information (0+, lower = more similar)
        (vi_karate, vi_random, "VI", 0, max(vi_karate + vi_random)),
        # Rand Index (0-1, higher = more similar)
        (ri_karate, ri_random, "RI", 0, 1),
    ]
    colors = ["red", "blue", "green"]

    for i, (karate_scores, random_scores, measure, lower, upper) in enumerate(measures):
        # Karate club histogram
        axes[0][i].hist(
            karate_scores,
            bins=20,
            range=(lower, upper),
            density=True,  # Probability density
            alpha=0.7,
            color=colors[i],
            edgecolor="black",
        )
        axes[0][i].set_title(f"{measure} - Karate club network")
        axes[0][i].set_xlabel(f"{measure} score")
        axes[0][i].set_ylabel("PDF")

        # Random network histogram
        axes[1][i].hist(
            random_scores,
            bins=20,
            range=(lower, upper),
            density=True,
            alpha=0.7,
            color=colors[i],
            edgecolor="black",
        )
        axes[1][i].set_title(f"{measure} - Random network")
        axes[1][i].set_xlabel(f"{measure} score")
        axes[0][i].set_ylabel("PDF")

    plt.tight_layout()
    plt.show()


.. image-sg:: /tutorials/images/sphx_glr_stochastic_variability_002.png
   :alt: NMI - Karate club network, VI - Karate club network, RI - Karate club network, NMI - Random network, VI - Random network, RI - Random network
   :srcset: /tutorials/images/sphx_glr_stochastic_variability_002.png
   :class: sphx-glr-single-img


.. GENERATED FROM PYTHON SOURCE LINES 152-169

We have compared the pairwise similarities using the NMI, VI, and RI measures
between partitonings obtained for the karate club network (strong community
structure) and a comparable random graph (which lacks communities).

The Normalized Mutual Information (NMI) and Rand Index (RI) both quantify
similarity, and take values from :math:`[0,1]`. Higher values indicate more
similar partitionings, with a value of 1 attained when the partitionings are
identical.

The Variation of Information (VI) is a distance measure. It takes values from
:math:`[0,\infty]`, with lower values indicating higher similarities. Identical
partitionings have a distance of zero.

For the karate club network, NMI and RI value are concentrated near 1, while
VI is concentrated near 0, suggesting a robust community structure. In contrast
the values obtained for the random network are much more spread out, showing
inconsistent partitionings due to the lack of a clear community structure.

.. GENERATED FROM PYTHON SOURCE LINES 171-172

.. [1] W. Zachary: "An Information Flow Model for Conflict and Fission in Small Groups". Journal of Anthropological Research 33, no. 4 (1977): 452–73. https://www.jstor.org/stable/3629752


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 0.974 seconds)


.. _sphx_glr_download_tutorials_stochastic_variability.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: stochastic_variability.ipynb <stochastic_variability.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: stochastic_variability.py <stochastic_variability.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: stochastic_variability.zip <stochastic_variability.zip>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_