gammagl.datasets.DBLP

class DBLP(root: str | None = None, transform: Callable | None = None, pre_transform: Callable | None = None, force_reload: bool = False)

A subset of the DBLP computer science bibliography website, as collected in the “MAGNN: Metapath Aggregated Graph Neural Network for Heterogeneous Graph Embedding” paper.

DBLP is a heterogeneous graph containing four types of entities - authors (4,057 nodes), papers (14,328 nodes), terms (7,723 nodes), and conferences (20 nodes). The authors are divided into four research areas (database, data mining, artificial intelligence, information retrieval). Each author is described by a bag-of-words representation of their paper keywords.

metapaths = [
    [('author', 'paper'), ('paper', 'author')],
    [('author', 'paper'), ('paper', 'term'), ('term', 'paper'), ('paper', 'term')],
    [('author', 'paper'), ('paper', 'venue'), ('venue', 'paper'), ('paper', 'term')],
]
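Each metapath is a list of (source type, destination type) pairs, so consecutive pairs have to chain: the destination of one edge is the source of the next. The sketch below is not part of GammaGL. It copies the metapaths listed above with plain ASCII quotes and uses a hypothetical helper, node_type_sequence, to expand each one into the node types it visits.

```python
from typing import List, Tuple

EdgeType = Tuple[str, str]  # (source node type, destination node type)

# Copied from the listing above, with plain ASCII quotes.
metapaths: List[List[EdgeType]] = [
    [("author", "paper"), ("paper", "author")],
    [("author", "paper"), ("paper", "term"), ("term", "paper"), ("paper", "term")],
    [("author", "paper"), ("paper", "venue"), ("venue", "paper"), ("paper", "term")],
]


def node_type_sequence(metapath: List[EdgeType]) -> List[str]:
    """Return the node types a metapath visits, in order.

    Raises ValueError if the metapath is empty or if two consecutive
    edges do not chain (destination of one != source of the next).
    """
    if not metapath:
        raise ValueError("a metapath needs at least one edge")
    sequence = [metapath[0][0]]
    for src, dst in metapath:
        if src != sequence[-1]:
            raise ValueError(f"edge ({src!r}, {dst!r}) does not start at {sequence[-1]!r}")
        sequence.append(dst)
    return sequence


for mp in metapaths:
    print(" -> ".join(node_type_sequence(mp)))
```

Printing the sequences this way makes it easy to see which node type each metapath starts and ends on.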

Parameters:
  • root (str, optional) – Root directory where the dataset should be saved.

  • transform (callable, optional) – A function/transform that takes in a gammagl.data.HeteroGraph object and returns a transformed version. The data object will be transformed before every access. (default: None)

  • pre_transform (callable, optional) – A function/transform that takes in a gammagl.data.HeteroGraph object and returns a transformed version. The data object will be transformed before being saved to disk. (default: None)

  • force_reload (bool, optional) – Whether to re-process the dataset. (default: False)
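A minimal usage sketch for the parameters above, assuming GammaGL and one of its backends are installed. The names identity_transform and load_dblp and the path ./data/dblp are illustrative only. That indexing the dataset with [0] returns the single heterogeneous graph is an assumption based on common in-memory dataset conventions, not something this page states.

```python
from typing import Any


def identity_transform(graph: Any) -> Any:
    """Placeholder transform: receives the graph object and returns it unchanged."""
    return graph


def load_dblp(root: str = "./data/dblp") -> Any:
    """Build the DBLP dataset under `root` and return its graph."""
    # Imported here so this snippet can be loaded without GammaGL installed.
    from gammagl.datasets import DBLP

    dataset = DBLP(
        root=root,                     # directory where the dataset is saved
        transform=identity_transform,  # applied before every access
        pre_transform=None,            # would be applied once, before saving to disk
        force_reload=False,            # set to True to re-process the dataset
    )
    # Assumption: the dataset holds one graph, reachable by index 0.
    return dataset[0]
```

Calling load_dblp() downloads and processes the data on first use, so it needs network access and write permission for the chosen root directory.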

url = 'https://www.dropbox.com/s/yh4grpeks87ugr2/DBLP_processed.zip?dl=1'

property raw_file_names: List[str]

The names of the files in the self.raw_dir folder that must be present in order to skip downloading.

property processed_file_names: str

The name of the file in the self.processed_dir folder that must be present in order to skip processing.
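The two properties above describe a "skip if present" rule: a step is skipped only when every expected file already exists in its folder. The following is a rough sketch of that rule using only the standard library. It is not GammaGL's actual implementation, and files_exist is a hypothetical helper that accepts either a single name or a list of names.

```python
import os
import tempfile
from typing import List, Union


def files_exist(folder: str, names: Union[str, List[str]]) -> bool:
    """Return True only if every expected file is present in `folder`."""
    if isinstance(names, str):
        names = [names]  # a single file name is treated as a one-item list
    return len(names) > 0 and all(
        os.path.exists(os.path.join(folder, name)) for name in names
    )


# Demonstration in a throwaway directory with a made-up file name.
with tempfile.TemporaryDirectory() as folder:
    print(files_exist(folder, ["example.bin"]))  # nothing there yet
    open(os.path.join(folder, "example.bin"), "w").close()
    print(files_exist(folder, ["example.bin"]))  # file now present
```

Under this rule, a download or processing step would run only when the check returns False for the corresponding folder.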

download()

Downloads the dataset to the self.raw_dir folder.

process()

Processes the dataset to the self.processed_dir folder.