gammagl.utils.homophily
- class homophily(edge_index, y, batch=None, method: str = 'edge')
The homophily of a graph characterizes how likely nodes with the same label are to be near each other in the graph. Several measures of homophily fit this definition. In particular:
In the “Beyond Homophily in Graph Neural Networks: Current Limitations and Effective Designs” paper, the homophily is the fraction of edges in a graph that connect nodes with the same class label:
\[\frac{| \{ (v,w) : (v,w) \in \mathcal{E} \wedge y_v = y_w \} | } {| \mathcal{E}|}\]
That measure is called the edge homophily ratio.
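As a rough illustration of the edge homophily ratio, the fraction can be computed by hand in plain Python. This is a sketch only, not GammaGL's implementation: `edge_homophily_ratio` is a hypothetical helper, and it assumes `edge_index` is given as two equal-length lists of source and target node indices with at least one edge.

```python
def edge_homophily_ratio(edge_index, y):
    """Fraction of edges whose two endpoints share a label (hypothetical helper)."""
    sources, targets = edge_index
    # Count the edges (v, w) with y_v == y_w.
    same = sum(1 for v, w in zip(sources, targets) if y[v] == y[w])
    # Divide by |E|; assumes the graph has at least one edge.
    return same / len(sources)


# Same toy graph as in the Examples entry of this page.
edge_index = [[0, 1, 2, 3], [1, 2, 0, 4]]
y = [0, 0, 0, 0, 1]
print(edge_homophily_ratio(edge_index, y))  # 3 of 4 edges match: 0.75
```

Three of the four edges join nodes labelled 0, and only the edge (3, 4) crosses classes, which gives 0.75.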
In the “Geom-GCN: Geometric Graph Convolutional Networks” paper, edge homophily is normalized across neighborhoods:
\[\frac{1}{| \mathcal{V}|} \sum_{v \in \mathcal{V}} \frac{ | \{ (w,v) : w \in \mathcal{N}(v) \wedge y_v = y_w \} | } { | \mathcal{N}(v)| }\]
That measure is called the node homophily ratio.
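The node homophily ratio can be sketched the same way. The helper below is hypothetical and rests on two assumptions that the formula leaves open: \(\mathcal{N}(v)\) is taken to be the source nodes of edges pointing at \(v\), and a node with no incoming edge contributes 0 while still counting towards \(|\mathcal{V}|\). Under these assumptions the sketch agrees with the value documented for the example graph on this page.

```python
def node_homophily_ratio(edge_index, y):
    """Mean over nodes of the same-label fraction among incoming neighbours.

    Hypothetical helper; the conventions for N(v) and isolated nodes are
    assumptions, not taken from the library source.
    """
    sources, targets = edge_index
    num_nodes = len(y)
    same = [0] * num_nodes   # incoming edges from same-label neighbours
    total = [0] * num_nodes  # all incoming edges, i.e. |N(v)|
    for w, v in zip(sources, targets):
        total[v] += 1
        if y[w] == y[v]:
            same[v] += 1
    # Assumption: nodes without incoming edges contribute 0 to the sum.
    per_node = [s / t if t else 0.0 for s, t in zip(same, total)]
    return sum(per_node) / num_nodes


edge_index = [[0, 1, 2, 3], [1, 2, 0, 4]]
y = [0, 0, 0, 0, 1]
print(node_homophily_ratio(edge_index, y))  # per-node scores 1, 1, 1, 0, 0
```

Nodes 0, 1 and 2 each have one same-label incoming neighbour, node 3 has no incoming edge, and node 4 has one neighbour of a different class, so the mean is 3/5.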
In the “Large-Scale Learning on Non-Homophilous Graphs: New Benchmarks and Strong Simple Methods” paper, edge homophily is modified to be insensitive to the number of classes and the size of each class:
\[\frac{1}{C-1} \sum_{k=1}^{C} \max \left(0, h_k - \frac{| \mathcal{C}_k|} {| \mathcal{V}|} \right)\]
where \(C\) denotes the number of classes, \(| \mathcal{C}_k|\) denotes the number of nodes of class \(k\), and \(h_k\) denotes the edge homophily ratio of nodes of class \(k\). Thus, that measure is called the class insensitive edge homophily ratio.
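A plain-Python sketch of the class insensitive measure follows. The helper name is hypothetical, and one point is an assumption: each edge is attributed to the class of its target node when computing \(h_k\). That choice reproduces a value of about 0.2 for the example graph on this page, whereas grouping by source node would not. Labels are assumed to be integers in the range 0 to \(C-1\).

```python
def class_insensitive_edge_homophily(edge_index, y):
    """Class insensitive edge homophily ratio (hypothetical helper).

    Assumption: an edge (w, v) counts towards the class of its target v.
    """
    sources, targets = edge_index
    num_nodes = len(y)
    num_classes = max(y) + 1
    same = [0] * num_classes   # same-label edges per target class
    total = [0] * num_classes  # all edges per target class
    for w, v in zip(sources, targets):
        k = y[v]
        total[k] += 1
        if y[w] == y[v]:
            same[k] += 1
    score = 0.0
    for k in range(num_classes):
        h_k = same[k] / total[k] if total[k] else 0.0  # per-class edge homophily
        share = y.count(k) / num_nodes                 # |C_k| / |V|
        score += max(0.0, h_k - share)
    # Normalise by C - 1; skipped for a single class to avoid dividing by zero.
    return score / (num_classes - 1) if num_classes > 1 else score


edge_index = [[0, 1, 2, 3], [1, 2, 0, 4]]
y = [0, 0, 0, 0, 1]
print(class_insensitive_edge_homophily(edge_index, y))  # approximately 0.2
```

Here \(h_0 = 1\) against a class share of 0.8, and \(h_1 = 0\) against a share of 0.2, so only class 0 contributes, giving roughly (1 - 0.8) / (2 - 1) = 0.2.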
- Parameters:
edge_index (tensor) – The graph connectivity.
y (tensor) – The labels.
batch (tensor, optional) – Batch vector \(\mathbf{b} \in {\{ 0, \ldots,B-1\}}^N\), which assigns each node to a specific example. (default: None)
method (str, optional) – The method used to calculate the homophily, either "edge" (first formula), "node" (second formula) or "edge_insensitive" (third formula). (default: "edge")
Examples –
>>> import tensorlayerx as tlx
>>> from gammagl.utils import homophily
>>> edge_index = tlx.convert_to_tensor([[0, 1, 2, 3],
...                                     [1, 2, 0, 4]])
>>> y = tlx.convert_to_tensor([0, 0, 0, 0, 1])
>>> # Edge homophily ratio
>>> homophily(edge_index, y, method='edge')
0.75
>>> # Node homophily ratio
>>> homophily(edge_index, y, method='node')
0.6000000238418579
>>> # Class insensitive edge homophily ratio
>>> homophily(edge_index, y, method='edge_insensitive')
0.19999998807907104