Get data from HoloFood database
Arguments
- type
  NULL or character scalar specifying the type of data to query. Must be one of the following options: "analysis-summaries", "animals", "genome-catalogues", "samples", "sample_metadata_markers" or "viral-catalogues". When genome or viral catalogues are fetched by their accession ID, the type can also be "genomes" or "fragments". (Default: NULL)
- accession.type
  NULL or character scalar specifying the type of accession IDs. Must be one of the following options: "animals", "genome-catalogues", "samples" or "viral-catalogues". (Default: NULL)
- accession
  NULL or character vector specifying the accession IDs of type accession.type. (Default: NULL)
- flatten
  Logical scalar specifying whether to flatten the resulting data.frame. This means that columns with multiple values are separated into multiple columns. (Default: FALSE)
- ...
  optional arguments:
  - max.hits: NULL or integer scalar specifying the maximum number of results to fetch. When NULL, all results are fetched. (Default: NULL)
  - use.cache: Logical scalar specifying whether to use cache. (Default: FALSE)
  - cache.dir: Character scalar specifying the cache directory. (Default: tempdir())
  - clear.cache: Logical scalar specifying whether to remove and clear the cache. (Default: FALSE)
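To make the effect of flatten more concrete, the following base R sketch shows one way a column holding several values per row could be spread over multiple columns. It is only an illustration under assumptions: the helper flatten_list_columns, the toy data and the ".1", ".2" column suffixes are hypothetical and are not taken from the package, whose actual flattening logic and column names may differ.

```r
# Toy data.frame with a list column that holds several values per row.
df <- data.frame(id = c("a", "b"), stringsAsFactors = FALSE)
df$tags <- list(c("x", "y"), "z")

# Hypothetical helper (not part of any package): spread each list column
# into numbered character columns, padding shorter rows with NA.
flatten_list_columns <- function(x) {
    is_list <- vapply(x, is.list, logical(1))
    out <- x[!is_list]
    for (name in names(x)[is_list]) {
        values <- lapply(x[[name]], as.character)
        width <- max(1L, lengths(values))
        for (i in seq_len(width)) {
            # Indexing past the end of a character vector yields NA
            out[[paste0(name, ".", i)]] <- vapply(
                values, function(v) v[i], character(1))
        }
    }
    out
}

flat <- flatten_list_columns(df)
names(flat)
```

In this toy case the single tags column becomes two columns, and the row with only one value is padded with NA in the second.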
Details
With getData, you can fetch data from the database. Compared to getResult, this function is more flexible since it can fetch any kind of data from the database. However, it returns the data without further wrangling, as a list or data.frame, neither of which is a format optimized for data on samples.
Search results can be filtered; for example, animals can be filtered based on available samples. See the [API browser](https://www.holofooddata.org/api/docs) for information on the available filters. Help on customizing queries is available [here](https://emg-docs.readthedocs.io/en/latest/api.html#customising-queries).
Examples
# Find genome catalogues
catalogues <- getData(type = "genome-catalogues")
head(catalogues)
#> id title
#> 1 chicken-gut-v1-0 HoloFood Chicken Gut v1
#> 2 salmon-gut-v1-0 HoloFood Salmon Gut v1
#> biome related_mag_catalogue_id
#> 1 root:Host-associated:Birds:Digestive system chicken-gut-v1-0
#> 2 root:Host-associated:Fish:Digestive system non-model-fish-gut-gut-v2-0
#> system analysis_summaries
#> 1 chicken c("HoloF....
#> 2 salmon c("HoloF....
# Find genomes based on a certain genome catalogue ID
res <- getData(
    type = "genomes", accession.type = "genome-catalogues",
    accession = catalogues[1, "id"], max.hits = 100)
head(res)
#> accession cluster_representative
#> 1 MGYG000308381 MGYG000310807
#> 2 MGYG000308382 MGYG000320406
#> 3 MGYG000308383 MGYG000319700
#> 4 MGYG000308384 MGYG000316616
#> 5 MGYG000308385 MGYG000318258
#> 6 MGYG000308386 MGYG000312616
#> taxonomy
#> 1 Bacteria > Firmicutes_A > Clostridia > Oscillospirales > Acutalibacteraceae > Acutalibacter > Acutalibacter ornithocaccae
#> 2 Bacteria > Firmicutes_A > Clostridia > Lachnospirales > Lachnospiraceae > Scybalocola > Scybalocola faecipullorum
#> 3 Bacteria > Cyanobacteria > Vampirovibrionia > Gastranaerophilales > Gastranaerophilaceae > Stercorousia > Stercorousia sp000437435
#> 4 Bacteria > Actinobacteriota > Actinomycetia > Actinomycetales > Bifidobacteriaceae > Bifidobacterium > Bifidobacterium pullorum_B
#> 5 Bacteria > Firmicutes_A > Clostridia > Lachnospirales > CAG-274 > Gallispira > Gallispira edinburgensis
#> 6 Bacteria > Proteobacteria > Gammaproteobacteria > Enterobacterales > Enterobacteriaceae > Escherichia > Escherichia coli
#> representative_url metadata1
#> 1 https://www.ebi.ac.uk/metagenomics/api/v1/genomes/MGYG000310807 781
#> 2 https://www.ebi.ac.uk/metagenomics/api/v1/genomes/MGYG000320406 782
#> 3 https://www.ebi.ac.uk/metagenomics/api/v1/genomes/MGYG000319700 783
#> 4 https://www.ebi.ac.uk/metagenomics/api/v1/genomes/MGYG000316616 784
#> 5 https://www.ebi.ac.uk/metagenomics/api/v1/genomes/MGYG000318258 785
#> 6 https://www.ebi.ac.uk/metagenomics/api/v1/genomes/MGYG000312616 786
#> metadata.Genome_type metadata.Length metadata.N_contigs metadata.N50
#> 1 MAG 1890035 189 12501
#> 2 MAG 2145586 258 9928
#> 3 MAG 2206567 44 140695
#> 4 MAG 1441532 255 6193
#> 5 MAG 2550799 74 51548
#> 6 MAG 3876255 351 15239
#> metadata.GC_content metadata.Completeness metadata.Contamination
#> 1 62.27 86.2 0.0
#> 2 44.49 67.14 0.0
#> 3 30.68 83.76 1.28
#> 4 64.96 79.39 2.35
#> 5 37.11 89.82 0.0
#> 6 51.07 87.75 1.01
#> metadata.rRNA_5S metadata.rRNA_16S metadata.rRNA_23S metadata.tRNAs
#> 1 0.0 0.0 0.0 15
#> 2 0.0 0.0 0.0 13
#> 3 0.0 0.0 0.0 20
#> 4 0.0 10.18 0.0 19
#> 5 0.0 16.96 0.0 16
#> 6 95.8 0.0 8.34 19
#> metadata.Genome_accession metadata.Sample_accession metadata.Study_accession
#> 1 ERZ15233365 SAMEA112264197 ERP122587
#> 2 ERZ15233366 SAMEA112264327 ERP122587
#> 3 ERZ15233367 SAMEA112264363 ERP122587
#> 4 ERZ15233368 SAMEA112264171 ERP122587
#> 5 ERZ15233369 SAMEA112264145 ERP122587
#> 6 ERZ15233370 SAMEA112264292 ERP122587
#> metadata.Country metadata.Continent
#> 1 Spain Europe
#> 2 Spain Europe
#> 3 Spain Europe
#> 4 Spain Europe
#> 5 Spain Europe
#> 6 Spain Europe
#> metadata.FTP_download
#> 1 ftp://ftp.ebi.ac.uk/pub/databases/metagenomics/mgnify_genomes/chicken-gut/v1.0/all_genomes/MGYG0003108/MGYG000310807/genomes1/MGYG000308381.gff.gz
#> 2 ftp://ftp.ebi.ac.uk/pub/databases/metagenomics/mgnify_genomes/chicken-gut/v1.0/all_genomes/MGYG0003204/MGYG000320406/genomes1/MGYG000308382.gff.gz
#> 3 ftp://ftp.ebi.ac.uk/pub/databases/metagenomics/mgnify_genomes/chicken-gut/v1.0/all_genomes/MGYG0003197/MGYG000319700/genomes1/MGYG000308383.gff.gz
#> 4 ftp://ftp.ebi.ac.uk/pub/databases/metagenomics/mgnify_genomes/chicken-gut/v1.0/all_genomes/MGYG0003166/MGYG000316616/genomes1/MGYG000308384.gff.gz
#> 5 ftp://ftp.ebi.ac.uk/pub/databases/metagenomics/mgnify_genomes/chicken-gut/v1.0/all_genomes/MGYG0003182/MGYG000318258/genomes1/MGYG000308385.gff.gz
#> 6 ftp://ftp.ebi.ac.uk/pub/databases/metagenomics/mgnify_genomes/chicken-gut/v1.0/all_genomes/MGYG0003126/MGYG000312616/genomes1/MGYG000308386.gff.gz
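The optional arguments listed under Arguments can be combined in a single call. The sketch below wraps such a call in a small function so that nothing is fetched until it is invoked. It assumes that getData is exported by the HoloFoodR package (the package name is not stated on this page) and that the HoloFood API is reachable at call time. The wrapper name fetch_samples_preview is hypothetical.

```r
# Hypothetical convenience wrapper; only getData() and its documented
# arguments come from this page. Assumes the HoloFoodR package is installed
# and the HoloFood API is reachable when the function is called.
fetch_samples_preview <- function(n = 10L) {
    HoloFoodR::getData(
        type = "samples",
        max.hits = n,          # fetch at most n results
        flatten = TRUE,        # split multi-value columns into several columns
        use.cache = TRUE,      # use the cache
        cache.dir = tempdir()  # documented default cache directory
    )
}

# Usage (requires network access, so it is left commented out):
# samples <- fetch_samples_preview(5L)
# head(samples)
```

Keeping max.hits small is a convenient way to inspect the structure of the returned data.frame before fetching all results.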