gammagl.utils.homophily

class homophily(edge_index, y, batch=None, method: str = 'edge')

The homophily of a graph characterizes how likely nodes with the same label are to be near each other in the graph. Several measures of homophily fit this definition. In particular:

  • In the “Beyond Homophily in Graph Neural Networks: Current Limitations and Effective Designs” paper, homophily is the fraction of edges in a graph that connect nodes with the same class label:

    \[\frac{| \{ (v,w) : (v,w) \in \mathcal{E} \wedge y_v = y_w \} | } {| \mathcal{E}|}\]

    That measure is called the edge homophily ratio.

  • In the “Geom-GCN: Geometric Graph Convolutional Networks” paper, edge homophily is normalized across neighborhoods:

    \[\frac{1}{| \mathcal{V}|} \sum_{v \in \mathcal{V}} \frac{ | \{ (w,v) : w \in \mathcal{N}(v) \wedge y_v = y_w \} | } { | \mathcal{N}(v)| }\]

    That measure is called the node homophily ratio.

  • In the “Large-Scale Learning on Non-Homophilous Graphs: New Benchmarks and Strong Simple Methods” paper, edge homophily is modified to be insensitive to the number of classes and the size of each class:

    \[\frac{1}{C-1} \sum_{k=1}^{C} \max \left(0, h_k - \frac{| \mathcal{C}_k|} {| \mathcal{V}|} \right)\]

    where \(C\) denotes the number of classes, \(| \mathcal{C}_k|\) denotes the number of nodes of class \(k\), and \(h_k\) denotes the edge homophily ratio of nodes of class \(k\). That measure is called the class-insensitive edge homophily ratio.
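The three formulas above can be sketched in plain Python to make the counting explicit. This is an illustrative sketch, not GammaGL's implementation: the function names are hypothetical, inputs are plain lists rather than tensors, and it assumes that each column of edge_index is a directed (source, target) pair, that per-node and per-class ratios are grouped by the target node, and that a node or class with no incoming edges contributes 0.

```python
def edge_homophily(edge_index, y):
    """First formula: fraction of edges whose endpoints share a label."""
    src, dst = edge_index
    same = [y[v] == y[w] for v, w in zip(src, dst)]
    return sum(same) / len(same)


def node_homophily(edge_index, y):
    """Second formula: per-node same-label neighbor fraction, averaged over nodes."""
    src, dst = edge_index
    same = [0] * len(y)  # same-label incoming edges per node
    deg = [0] * len(y)   # incoming edges per node
    for v, w in zip(src, dst):
        deg[w] += 1
        same[w] += int(y[v] == y[w])
    # Assumption: a node without incoming edges contributes 0 to the average.
    ratios = [s / d if d else 0.0 for s, d in zip(same, deg)]
    return sum(ratios) / len(y)


def class_insensitive_edge_homophily(edge_index, y):
    """Third formula: per-class edge homophily, corrected by class share."""
    src, dst = edge_index
    num_classes = max(y) + 1
    same = [0] * num_classes   # same-label edges ending in class k
    total = [0] * num_classes  # all edges ending in class k
    for v, w in zip(src, dst):
        total[y[w]] += 1
        same[y[w]] += int(y[v] == y[w])
    score = 0.0
    for k in range(num_classes):
        h_k = same[k] / total[k] if total[k] else 0.0
        share = y.count(k) / len(y)  # |C_k| / |V|
        score += max(0.0, h_k - share)
    return score / (num_classes - 1)


edge_index = [[0, 1, 2, 3], [1, 2, 0, 4]]
y = [0, 0, 0, 0, 1]
print(edge_homophily(edge_index, y))                              # 0.75
print(node_homophily(edge_index, y))                              # 0.6
print(round(class_insensitive_edge_homophily(edge_index, y), 6))  # 0.2
```

On this small graph, three of the four edges join same-label nodes, node 3 has no incoming edge, and class 0 holds 80% of the nodes, which is where the 0.75, 0.6, and 0.2 come from.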

Parameters:
  • edge_index (tensor) – The graph connectivity.

  • y (tensor) – The labels.

  • batch (tensor, optional) – Batch vector \(\mathbf{b} \in {\{ 0, \ldots,B-1\}}^N\), which assigns each node to a specific example. (default: None)

  • method (str, optional) – The method used to calculate the homophily: one of "edge" (first formula), "node" (second formula), or "edge_insensitive" (third formula). (default: "edge")

Examples:

    >>> import tensorlayerx as tlx
    >>> from gammagl.utils import homophily
    >>> edge_index = tlx.convert_to_tensor([[0, 1, 2, 3],
    ...                                     [1, 2, 0, 4]])
    >>> y = tlx.convert_to_tensor([0, 0, 0, 0, 1])
    >>> # Edge homophily ratio
    >>> homophily(edge_index, y, method='edge')
    0.75
    >>> # Node homophily ratio
    >>> homophily(edge_index, y, method='node')
    0.6000000238418579
    >>> # Class insensitive edge homophily ratio
    >>> homophily(edge_index, y, method='edge_insensitive')
    0.19999998807907104
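
The batch parameter assigns each node to a graph, so that a homophily value can be computed per graph rather than for the whole input. The plain-Python sketch below illustrates that idea for the edge homophily ratio. It is not GammaGL's code: the function name is hypothetical, and it assumes that an edge is attributed to the graph of its target node and that a graph without edges scores 0. Whether homophily returns its per-graph result in this exact form is also an assumption.

```python
def edge_homophily_per_graph(edge_index, y, batch):
    """Edge homophily ratio computed separately for each graph in a batch."""
    src, dst = edge_index
    num_graphs = max(batch) + 1
    same = [0] * num_graphs   # same-label edges per graph
    total = [0] * num_graphs  # all edges per graph
    for v, w in zip(src, dst):
        g = batch[w]  # assumption: an edge belongs to its target node's graph
        total[g] += 1
        same[g] += int(y[v] == y[w])
    # Assumption: a graph without edges gets a ratio of 0.
    return [s / t if t else 0.0 for s, t in zip(same, total)]


edge_index = [[0, 1, 2, 3], [1, 2, 0, 4]]
y = [0, 0, 0, 0, 1]
batch = [0, 0, 0, 1, 1]  # nodes 0-2 form graph 0, nodes 3-4 form graph 1
print(edge_homophily_per_graph(edge_index, y, batch))  # [1.0, 0.0]
```

Graph 0 contains only same-label edges, while the single edge of graph 1 joins nodes with different labels.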