This function reads a snowball from Apache Parquet format and returns a list
containing nodes and edges, which can be either Arrow Datasets or tibbles.
Usage
read_snowball(
  snowball = NULL,
  edge_type = c("core", "extended", "outside"),
  return_data = FALSE,
  shorten_ids = TRUE
)
Arguments
- snowball
The directory of the Parquet files as populated by
pro_snowball().
- edge_type
Type of the returned edges. Possible values are:
  - core: only edges from or to the keypapers are selected
  - extended: only edges between the nodes are selected (this includes core edges)
  - outside: only edges where either the from or the to is not in nodes
Multiple values are allowed.
- return_data
Logical indicating whether to return an
ArrowObject representing the corpus (default) or a tibble containing the whole corpus.
- shorten_ids
If TRUE, the ids will be shortened, i.e. the part https://openalex.org/ will be removed.
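The semantics of edge_type and shorten_ids can be illustrated on toy data. The following is a minimal base R sketch, not the package's implementation: the data, the variable names (keypapers, nodes, edges) and the column names from and to are hypothetical, and it assumes that "either" in the outside rule means at least one endpoint is missing from nodes.

```r
# Toy data; every name and value below is hypothetical.
prefix <- "https://openalex.org/"
keypapers <- paste0(prefix, "W1")
nodes <- paste0(prefix, c("W1", "W2", "W3"))
edges <- data.frame(
  from = paste0(prefix, c("W1", "W2", "W2", "W3")),
  to   = paste0(prefix, c("W2", "W1", "W3", "W9")),
  stringsAsFactors = FALSE
)

# core: the edge starts or ends at a key paper
is_core <- edges$from %in% keypapers | edges$to %in% keypapers

# extended: both endpoints are in nodes (core edges are included here
# because the key paper is itself one of the nodes)
is_extended <- edges$from %in% nodes & edges$to %in% nodes

# outside: at least one endpoint is not in nodes (assumed reading)
is_outside <- !(edges$from %in% nodes) | !(edges$to %in% nodes)

# shorten_ids = TRUE: strip the OpenAlex URL prefix from the ids
short_from <- sub(prefix, "", edges$from, fixed = TRUE)
short_to   <- sub(prefix, "", edges$to, fixed = TRUE)
```

In this sketch the edge W2 -> W3 is extended but not core, and W3 -> W9 is the only outside edge, since W9 is not among the nodes.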