gammagl.datasets.Planetoid
- class Planetoid(root: str | None = None, name: str = 'cora', split: str = 'public', num_train_per_class: int = 20, num_val: int = 500, num_test: int = 1000, transform: Callable | None = None, pre_transform: Callable | None = None, force_reload: bool = False)
The citation network datasets “Cora”, “CiteSeer” and “PubMed” from the “Revisiting Semi-Supervised Learning with Graph Embeddings” paper. Nodes represent documents and edges represent citation links. Training, validation and test splits are given by binary masks.
- Parameters:
root (str, optional) – Root directory where the dataset should be saved.
name (str, optional) – The name of the dataset ("Cora", "CiteSeer", "PubMed").
split (str, optional) – The type of dataset split ("public", "full", "random"). If set to "public", the split will be the public fixed split from the “Revisiting Semi-Supervised Learning with Graph Embeddings” paper. If set to "full", all nodes except those in the validation and test sets will be used for training (as in the “FastGCN: Fast Learning with Graph Convolutional Networks via Importance Sampling” paper). If set to "random", train, validation, and test sets will be randomly generated, according to num_train_per_class, num_val and num_test. (default: "public")
num_train_per_class (int, optional) – The number of training samples per class in case of "random" split. (default: 20)
num_val (int, optional) – The number of validation samples in case of "random" split. (default: 500)
num_test (int, optional) – The number of test samples in case of "random" split. (default: 1000)
transform (callable, optional) – A function/transform that takes in a gammagl.data.Graph object and returns a transformed version. The data object will be transformed before every access. (default: None)
pre_transform (callable, optional) – A function/transform that takes in a gammagl.data.Graph object and returns a transformed version. The data object will be transformed before being saved to disk. (default: None)
force_reload (bool, optional) – Whether to re-process the dataset. (default: False)
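The "random" split described above can be illustrated with a minimal, standard-library sketch. It is not GammaGL's implementation: the function name random_split_masks, the seed argument, and the order in which validation and test nodes are drawn are assumptions made for illustration. Only the three size parameters come from the parameter list above.

```python
import random


def random_split_masks(labels, num_train_per_class=20, num_val=500,
                       num_test=1000, seed=0):
    """Hypothetical helper: build boolean train/val/test masks from labels."""
    rng = random.Random(seed)
    num_nodes = len(labels)
    train_mask = [False] * num_nodes
    val_mask = [False] * num_nodes
    test_mask = [False] * num_nodes

    # Group node indices by class label.
    by_class = {}
    for node, label in enumerate(labels):
        by_class.setdefault(label, []).append(node)

    # Sample num_train_per_class training nodes from each class.
    for nodes in by_class.values():
        for node in rng.sample(nodes, min(num_train_per_class, len(nodes))):
            train_mask[node] = True

    # Shuffle the remaining nodes; take validation nodes first, then test nodes.
    remaining = [n for n in range(num_nodes) if not train_mask[n]]
    rng.shuffle(remaining)
    for node in remaining[:num_val]:
        val_mask[node] = True
    for node in remaining[num_val:num_val + num_test]:
        test_mask[node] = True
    return train_mask, val_mask, test_mask


# Synthetic labels for a Cora-sized graph: 2,708 nodes spread over 7 classes.
labels = [i % 7 for i in range(2708)]
train_mask, val_mask, test_mask = random_split_masks(labels)
print(sum(train_mask), sum(val_mask), sum(test_mask))  # 140 500 1000
```

With the default arguments, a 7-class graph gets 7 × 20 = 140 training nodes, and the three masks never overlap because validation and test nodes are drawn only from nodes that are not in the training set.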
Tip
Name      #nodes  #edges  #features  #classes
Cora      2,708   10,556  1,433      7
CiteSeer  3,327   9,104   3,703      6
PubMed    19,717  88,648  500        3
- url = 'https://github.com/kimiyoung/planetoid/raw/master/data'
- property raw_file_names: List[str]
The name of the files in the self.raw_dir folder that must be present in order to skip downloading.
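The skip-download check can be sketched as follows. The file names use the ind.<name>.<suffix> layout of the upstream Planetoid data directory. That raw_file_names returns exactly this list in GammaGL is an assumption, and the helpers expected_raw_files and can_skip_download are hypothetical names introduced only for this example.

```python
import tempfile
from pathlib import Path

# Suffixes of the upstream Planetoid data files
# (assumed here to match what raw_file_names reports).
SUFFIXES = ["x", "tx", "allx", "y", "ty", "ally", "graph", "test.index"]


def expected_raw_files(name):
    """Hypothetical helper: file names expected in raw_dir for a dataset."""
    return [f"ind.{name.lower()}.{suffix}" for suffix in SUFFIXES]


def can_skip_download(raw_dir, name):
    """Hypothetical helper: True only if every expected raw file exists."""
    return all((Path(raw_dir) / f).is_file() for f in expected_raw_files(name))


# Demonstration in a throwaway directory standing in for self.raw_dir.
with tempfile.TemporaryDirectory() as raw_dir:
    before = can_skip_download(raw_dir, "Cora")  # no files present yet
    for file_name in expected_raw_files("Cora"):
        (Path(raw_dir) / file_name).touch()
    after = can_skip_download(raw_dir, "Cora")   # all files present

print(before, after)  # False True
```

A single missing file is enough for the check to fail, which matches the "must be present" wording above.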