gammagl.utils.homophily

class homophily(edge_index, y, batch=None, method: str = 'edge')

The homophily of a graph characterizes how likely nodes with the same label are to be near each other in the graph. Many measures of homophily fit this definition. In particular:

In the “Beyond Homophily in Graph Neural Networks: Current Limitations and Effective Designs” paper, homophily is the fraction of edges in a graph that connect nodes with the same class label:

\[\frac{| \{ (v,w) : (v,w) \in \mathcal{E} \wedge y_v = y_w \} | } {| \mathcal{E}|}\]

That measure is called the edge homophily ratio.
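As a quick check of the formula above, the edge homophily ratio can be computed by hand. The following is a minimal pure-Python sketch, not GammaGL's implementation; the helper name `edge_homophily_ratio` and the edge-list representation are assumptions made for illustration.

```python
def edge_homophily_ratio(edges, labels):
    """Fraction of edges whose two endpoints carry the same label."""
    same = sum(1 for v, w in edges if labels[v] == labels[w])
    return same / len(edges)


# Same toy graph as in the Examples section: 4 edges, 5 nodes.
edges = [(0, 1), (1, 2), (2, 0), (3, 4)]
labels = [0, 0, 0, 0, 1]

# Three of the four edges join nodes of the same class.
print(edge_homophily_ratio(edges, labels))  # 0.75
```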
In the “Geom-GCN: Geometric Graph Convolutional Networks” paper, edge homophily is normalized across neighborhoods:

\[\frac{1}{| \mathcal{V}|} \sum_{v \in \mathcal{V}} \frac{ | \{ (w,v) : w \in \mathcal{N}(v) \wedge y_v = y_w \} | } { | \mathcal{N}(v)| }\]

That measure is called the node homophily ratio.
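The node homophily formula above can be sketched the same way. This is a hypothetical pure-Python helper (`node_homophily_ratio` is not part of GammaGL). It assumes each edge `(w, v)` makes `w` a neighbor of `v`, and that a node with no neighbors contributes 0 to the average, a convention chosen here because the formula leaves that case undefined.

```python
def node_homophily_ratio(edges, labels):
    """Average, over all nodes, of the share of same-label neighbors."""
    n = len(labels)
    same = [0] * n     # same-label neighbors of each node
    degree = [0] * n   # total neighbors of each node
    for w, v in edges:  # w is treated as a neighbor of v
        degree[v] += 1
        if labels[w] == labels[v]:
            same[v] += 1
    # Assumption: nodes without neighbors contribute 0.
    per_node = [s / d if d > 0 else 0.0 for s, d in zip(same, degree)]
    return sum(per_node) / n


edges = [(0, 1), (1, 2), (2, 0), (3, 4)]
labels = [0, 0, 0, 0, 1]

# Per-node shares are [1, 1, 1, 0, 0], so the average is 3/5.
print(node_homophily_ratio(edges, labels))
```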
In the “Large-Scale Learning on Non-Homophilous Graphs: New Benchmarks and Strong Simple Methods” paper, edge homophily is modified to be insensitive to the number of classes and size of each class:

\[\frac{1}{C-1} \sum_{k=1}^{C} \max \left(0, h_k - \frac{| \mathcal{C}_k|} {| \mathcal{V}|} \right)\]

where \(C\) denotes the number of classes, \(| \mathcal{C}_k|\) denotes the number of nodes of class \(k\), and \(h_k\) denotes the edge homophily ratio of nodes of class \(k\). Thus, that measure is called the class insensitive edge homophily ratio.
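A worked sketch of the class insensitive formula above may help. The helper `class_insensitive_edge_homophily` is hypothetical and is not GammaGL's implementation. It assumes labels are integers `0..C-1` with `C >= 2`, that an edge is attributed to the class of its target node when computing \(h_k\), and that a class with no edges has \(h_k = 0\).

```python
def class_insensitive_edge_homophily(edges, labels):
    """Sum of max(0, h_k - |C_k|/|V|) over classes, divided by C - 1."""
    n = len(labels)
    num_classes = max(labels) + 1
    same = [0] * num_classes   # same-label edges per class
    total = [0] * num_classes  # all edges per class
    for w, v in edges:
        k = labels[v]  # assumption: group edges by the target node's class
        total[k] += 1
        if labels[w] == k:
            same[k] += 1

    score = 0.0
    for k in range(num_classes):
        h_k = same[k] / total[k] if total[k] else 0.0
        class_share = labels.count(k) / n  # |C_k| / |V|
        score += max(0.0, h_k - class_share)
    return score / (num_classes - 1)


edges = [(0, 1), (1, 2), (2, 0), (3, 4)]
labels = [0, 0, 0, 0, 1]

# Class 0: h_0 = 1 and share 4/5, contributing about 0.2.
# Class 1: h_1 = 0 and share 1/5, contributing 0.
print(class_insensitive_edge_homophily(edges, labels))
```

With two classes the divisor \(C-1\) is 1, so the result is approximately 0.2, up to floating-point rounding.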

Parameters:

edge_index (tensor) – The graph connectivity.
y (tensor) – The labels.
batch (tensor, optional) – Batch vector \(\mathbf{b} \in {\{ 0, \ldots,B-1\}}^N\), which assigns each node to a specific example. (default: None)
method (str, optional) – The method used to calculate the homophily, one of "edge" (first formula), "node" (second formula) or "edge_insensitive" (third formula). (default: "edge")
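To illustrate how a batch vector partitions nodes into separate graphs, the sketch below computes one edge homophily ratio per graph in plain Python. It is not GammaGL's implementation: the helper name `edge_homophily_per_graph` is hypothetical, and it assumes that an edge belongs to the graph of its target node and that a graph without edges gets a ratio of 0.

```python
def edge_homophily_per_graph(edges, labels, batch):
    """Edge homophily ratio computed separately for each graph in a batch."""
    num_graphs = max(batch) + 1
    same = [0] * num_graphs
    total = [0] * num_graphs
    for v, w in edges:
        g = batch[w]  # assumption: the edge belongs to its target's graph
        total[g] += 1
        if labels[v] == labels[w]:
            same[g] += 1
    # Assumption: a graph with no edges gets ratio 0.
    return [s / t if t else 0.0 for s, t in zip(same, total)]


edges = [(0, 1), (1, 2), (2, 0), (3, 4)]
labels = [0, 0, 0, 0, 1]
batch = [0, 0, 0, 1, 1]  # nodes 0-2 form graph 0, nodes 3-4 form graph 1

ratios = edge_homophily_per_graph(edges, labels, batch)
print(ratios)  # [1.0, 0.0]
```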

Examples:

>>> import tensorlayerx as tlx
>>> from gammagl.utils import homophily
>>> edge_index = tlx.convert_to_tensor([[0, 1, 2, 3],
...                                     [1, 2, 0, 4]])
>>> y = tlx.convert_to_tensor([0, 0, 0, 0, 1])
>>> # Edge homophily ratio
>>> homophily(edge_index, y, method='edge')
0.75
>>> # Node homophily ratio
>>> homophily(edge_index, y, method='node')
0.6000000238418579
>>> # Class insensitive edge homophily ratio
>>> homophily(edge_index, y, method='edge_insensitive')
0.19999998807907104