gammagl.datasets.PolBlogs

class PolBlogs(root: str | None = None, transform: Callable | None = None, pre_transform: Callable | None = None, force_reload: bool = False)[source]

The Political Blogs dataset from the “The Political Blogosphere and the 2004 US Election: Divided they Blog” paper.

Polblogs is a graph with 1,490 vertices (representing political blogs) and 19,025 edges (links between blogs). The links are automatically extracted from a crawl of the front page of each blog. Each vertex carries a label indicating the political leaning of the blog: liberal or conservative.

Parameters:
  • root (str, optional) – Root directory where the dataset should be saved. (default: None)

  • transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before every access. (default: None)

  • pre_transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before being saved to disk. (default: None)

  • force_reload (bool, optional) – Whether to re-process the dataset. (default: False)
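The practical difference between transform and pre_transform is when each one runs. The sketch below illustrates that timing with a hypothetical stand-in class (ToyDataset, add_num_nodes and add_flag are invented for illustration and are not part of GammaGL): pre_transform is applied once before the data is stored, while transform is applied again on every access.

```python
from typing import Callable, Optional


class ToyDataset:
    """Hypothetical stand-in that mimics the documented timing only.

    This is NOT the GammaGL implementation; it keeps data in memory
    instead of writing it to disk.
    """

    def __init__(self, raw: dict,
                 transform: Optional[Callable] = None,
                 pre_transform: Optional[Callable] = None):
        self.transform = transform
        # pre_transform runs once, before the data is "saved".
        self._stored = pre_transform(raw) if pre_transform else dict(raw)

    def __getitem__(self, idx: int) -> dict:
        if idx != 0:
            raise IndexError("this toy dataset holds a single graph")
        data = dict(self._stored)  # copy so the stored data is not mutated
        # transform runs on every access.
        return self.transform(data) if self.transform else data


calls = {"pre": 0, "post": 0}  # counts how often each callable runs


def add_num_nodes(data: dict) -> dict:
    calls["pre"] += 1
    return {**data, "num_nodes": 3}


def add_flag(data: dict) -> dict:
    calls["post"] += 1
    return {**data, "accessed": True}


toy = ToyDataset({"edges": [(0, 1), (1, 2)]},
                 transform=add_flag, pre_transform=add_num_nodes)
first, second = toy[0], toy[0]  # two accesses
```

After the two accesses, add_num_nodes has run once and add_flag twice, which is why expensive, deterministic preprocessing usually belongs in pre_transform.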

STATS:

#nodes    #edges    #features    #classes
1,490     19,025    0            2

url = 'https://netset.telecom-paris.fr/datasets/polblogs.tar.gz'

property raw_file_names: List[str]

The names of the files in the self.raw_dir folder that must be present in order to skip downloading.

property processed_file_names: str

The name of the file in the self.processed_dir folder that must be present in order to skip processing.
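Both properties feed a "skip if already present" check. A minimal sketch of such a check follows, assuming the step is skipped only when every listed file exists; the helper files_exist and the placeholder file names are hypothetical and do not reflect the actual PolBlogs file names or GammaGL internals.

```python
import os
import tempfile
from typing import List


def files_exist(folder: str, names: List[str]) -> bool:
    """Return True only if every expected file is present in `folder`."""
    return len(names) > 0 and all(
        os.path.isfile(os.path.join(folder, name)) for name in names
    )


with tempfile.TemporaryDirectory() as raw_dir:
    expected = ["file_a.txt", "file_b.txt"]  # placeholder names

    # Nothing is there yet, so a download would be needed.
    before = files_exist(raw_dir, expected)

    # Simulate a completed download by creating empty files.
    for name in expected:
        open(os.path.join(raw_dir, name), "w").close()

    # All expected files are now present, so the download can be skipped.
    after = files_exist(raw_dir, expected)
```

The same pattern would apply to the processed folder, with the single processed file name wrapped in a list.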

download()[source]

Downloads the dataset to the self.raw_dir folder.

process()[source]

Processes the dataset to the self.processed_dir folder.