gammagl.datasets.Coauthor

class Coauthor(root: str | None = None, name: str = 'cs', transform: Callable | None = None, pre_transform: Callable | None = None, force_reload: bool = False)

The Coauthor CS and Coauthor Physics networks from the “Pitfalls of Graph Neural Network Evaluation” paper. Nodes represent authors that are connected by an edge if they co-authored a paper. Given paper keywords for each author’s papers, the task is to map authors to their respective field of study.

Parameters:
  • root (str, optional) – Root directory where the dataset should be saved. (default: None)

  • name (str, optional) – The name of the dataset ("CS" or "Physics"). (default: 'cs')

  • transform (callable, optional) – A function/transform that takes in a gammagl.data.Graph object and returns a transformed version. The data object will be transformed before every access. (default: None)

  • pre_transform (callable, optional) – A function/transform that takes in a gammagl.data.Graph object and returns a transformed version. The data object will be transformed before being saved to disk. (default: None)

  • force_reload (bool, optional) – Whether to re-process the dataset. (default: False)

Stats:

  Name     #nodes  #edges   #features  #classes
  CS       18,333  163,788  6,805      15
  Physics  34,493  495,924  8,415      5
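
The difference between transform and pre_transform is one of timing. The sketch below illustrates it with a hypothetical ToyDataset stand-in (not GammaGL code) and plain dicts in place of gammagl.data.Graph objects: pre_transform runs once before the data is stored, while transform runs on every access.

```python
from typing import Callable, Optional


class ToyDataset:
    """Hypothetical stand-in for a dataset class; not part of GammaGL."""

    def __init__(self, raw_items, transform: Optional[Callable] = None,
                 pre_transform: Optional[Callable] = None):
        self.transform = transform
        # pre_transform is applied once, before the items are "saved".
        if pre_transform is not None:
            raw_items = [pre_transform(item) for item in raw_items]
        self._stored = list(raw_items)

    def __getitem__(self, idx):
        item = dict(self._stored[idx])  # copy so the stored item stays untouched
        # transform is applied on every access.
        return self.transform(item) if self.transform is not None else item


calls = {"pre": 0, "post": 0}


def mark_preprocessed(graph):
    calls["pre"] += 1
    graph = dict(graph)
    graph["preprocessed"] = True
    return graph


def scale_features(graph):
    calls["post"] += 1
    graph["x"] = [v * 2 for v in graph["x"]]
    return graph


dataset = ToyDataset([{"x": [1, 2, 3]}], transform=scale_features,
                     pre_transform=mark_preprocessed)
first = dataset[0]
second = dataset[0]
```

After two accesses, mark_preprocessed has run once and scale_features twice. With GammaGL installed, the same kind of callables would presumably be passed as Coauthor(root, name='cs', pre_transform=..., transform=...), following the signature above.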

url = 'https://github.com/shchur/gnn-benchmark/raw/master/data/npz/'
property raw_dir: str
property processed_dir: str
property raw_file_names: str

The name of the files in the self.raw_dir folder that must be present in order to skip downloading.

property processed_file_names: str

The name of the files in the self.processed_dir folder that must be present in order to skip processing.
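
Both properties feed a simple existence check. A minimal sketch of such a check follows, assuming the file names are resolved relative to the corresponding directory; the helper files_exist and the file name used here are illustrative and not taken from GammaGL.

```python
import tempfile
from pathlib import Path
from typing import Sequence, Union


def files_exist(folder: Union[str, Path], names: Union[str, Sequence[str]]) -> bool:
    """Return True only if every expected file is present in `folder`."""
    if isinstance(names, str):  # the properties above are typed as a single str
        names = [names]
    return len(names) > 0 and all((Path(folder) / n).is_file() for n in names)


with tempfile.TemporaryDirectory() as tmp:
    raw_dir = Path(tmp) / "raw"
    raw_dir.mkdir()
    expected = "example_raw_file.npz"  # hypothetical file name
    before = files_exist(raw_dir, expected)  # nothing downloaded yet
    (raw_dir / expected).write_bytes(b"")    # simulate a finished download
    after = files_exist(raw_dir, expected)
```

Here before is False and after is True, which is the condition under which a download (or, for processed files, a processing run) could be skipped.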

download()

Downloads the dataset to the self.raw_dir folder.

process()

Processes the dataset to the self.processed_dir folder.
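
How download(), process(), and force_reload might interact can be sketched as below. This is an illustration under stated assumptions, not GammaGL's implementation: the prepare helper, the raw/ and processed/ subfolder layout, the file names, and the injected fake_fetch and fake_build callables are all hypothetical, and force_reload is assumed to re-run only the processing step, as its description suggests.

```python
import tempfile
from pathlib import Path
from typing import Callable, List


def prepare(root: Path, raw_name: str, processed_name: str,
            fetch: Callable[[Path], None],
            build: Callable[[Path, Path], None],
            force_reload: bool = False) -> List[str]:
    """Run the download and process steps only when their outputs are missing."""
    raw_path = root / "raw" / raw_name
    processed_path = root / "processed" / processed_name
    steps = []
    if not raw_path.is_file():
        raw_path.parent.mkdir(parents=True, exist_ok=True)
        fetch(raw_path)
        steps.append("download")
    # force_reload re-runs processing even if a processed file already exists.
    if force_reload or not processed_path.is_file():
        processed_path.parent.mkdir(parents=True, exist_ok=True)
        build(raw_path, processed_path)
        steps.append("process")
    return steps


def fake_fetch(dst: Path) -> None:
    dst.write_text("raw")  # stands in for a real HTTP download


def fake_build(src: Path, dst: Path) -> None:
    dst.write_text(src.read_text().upper())  # stands in for real processing


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    first = prepare(root, "raw.npz", "data.pt", fake_fetch, fake_build)
    second = prepare(root, "raw.npz", "data.pt", fake_fetch, fake_build)
    forced = prepare(root, "raw.npz", "data.pt", fake_fetch, fake_build,
                     force_reload=True)
```

The first call performs both steps, the second performs none, and the forced call repeats only the processing step. Injecting the fetch and build callables keeps the sketch runnable without network access.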