Get data from HoloFood database

Usage

getData(
  type = NULL,
  accession.type = NULL,
  accession = NULL,
  flatten = FALSE,
  ...
)

Arguments

type

NULL or character scalar specifying the type of data to query. Must be one of the following options: "analysis-summaries", "animals", "genome-catalogues", "samples", "sample_metadata_markers" or "viral-catalogues". When a genome or viral catalogue is fetched by its accession ID, the type can also be "genomes" or "fragments". (Default: NULL)

accession.type

NULL or character scalar specifying the type of accession IDs. Must be one of the following options: "animals", "genome-catalogues", "samples" or "viral-catalogues". (Default: NULL)

accession

NULL or character vector specifying the accession IDs of type accession.type. (Default: NULL)

flatten

Logical scalar specifying whether to flatten the resulting data.frame, that is, whether columns holding multiple values are split into separate columns. (Default: FALSE)

...

optional arguments:

  • max.hits NULL or integer scalar specifying the maximum number of results to fetch. When NULL, all results are fetched. (Default: NULL)

  • use.cache Logical scalar specifying whether to use the cache. (Default: FALSE)

  • cache.dir Character scalar specifying the cache directory. (Default: tempdir())

  • clear.cache Logical scalar specifying whether to remove and clear the cache. (Default: FALSE)
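
The optional arguments above can be combined in a single call. The following is a minimal sketch, not a definitive recipe: it assumes the HoloFoodR package is installed and the HoloFood API is reachable, and the cache location `cache_dir` is a hypothetical path chosen only for illustration.

```r
library(HoloFoodR)

# Hypothetical cache location (an assumption for this sketch);
# any writable directory should do
cache_dir <- file.path(tempdir(), "holofood_cache")
dir.create(cache_dir, showWarnings = FALSE)

# Fetch at most 10 samples and store the responses in the cache directory
samples <- getData(
    type = "samples",
    max.hits = 10,
    use.cache = TRUE,
    cache.dir = cache_dir
)
head(samples)
```

Limiting `max.hits` keeps exploratory queries short; setting it back to NULL fetches all results.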

Value

list or data.frame

Details

With getData, you can fetch data from the database. Compared to getResult, this function is more flexible because it can fetch any kind of data from the database. However, it returns the data as a list or data.frame without further wrangling, and these formats are not optimized for working with sample data.
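
Because the data is returned without further wrangling, a column may hold several values per row, as `analysis_summaries` does in the catalogue example on this page. The sketch below shows one way to inspect and collapse such a column with base R. It runs on a toy data.frame with made-up values that only mimics the shape of a result, and it does not describe how `flatten = TRUE` is implemented.

```r
# Toy data.frame with made-up values; it only mimics the shape of a result
res <- data.frame(id = c("catalogue-a", "catalogue-b"))
res$analysis_summaries <- list(c("summary-1", "summary-2"), "summary-3")

# Identify list columns, i.e. columns that can hold several values per row
is_list_col <- vapply(res, is.list, logical(1))

# Number of values stored in each row of the list column
n_values <- lengths(res$analysis_summaries)

# Collapse each row's values into one string for printing or export
res$analysis_summaries_chr <- vapply(
    res$analysis_summaries, paste, character(1), collapse = "; ")
```

Collapsing to a character column is convenient for writing the table to a file, whereas keeping the list column preserves the individual values.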

Search results can be filtered; for example, animals can be filtered based on available samples. See the [API browser](https://www.holofooddata.org/api/docs) for information on filters. Help on customizing queries is available in [the API documentation on customising queries](https://emg-docs.readthedocs.io/en/latest/api.html#customising-queries).

Examples


# Find genome catalogues
catalogues <- getData(type = "genome-catalogues")
head(catalogues)
#>                 id                   title
#> 1 chicken-gut-v1-0 HoloFood Chicken Gut v1
#> 2  salmon-gut-v1-0  HoloFood Salmon Gut v1
#>                                         biome    related_mag_catalogue_id
#> 1 root:Host-associated:Birds:Digestive system            chicken-gut-v1-0
#> 2  root:Host-associated:Fish:Digestive system non-model-fish-gut-gut-v2-0
#>    system analysis_summaries
#> 1 chicken       c("HoloF....
#> 2  salmon       c("HoloF....

# Find genomes based on a certain genome catalogue ID
res <- getData(
    type = "genomes", accession.type = "genome-catalogues",
    accession = catalogues[1, "id"], max.hits = 100)
head(res)
#>       accession cluster_representative
#> 1 MGYG000308381          MGYG000310807
#> 2 MGYG000308382          MGYG000320406
#> 3 MGYG000308383          MGYG000319700
#> 4 MGYG000308384          MGYG000316616
#> 5 MGYG000308385          MGYG000318258
#> 6 MGYG000308386          MGYG000312616
#>                                                                                                                             taxonomy
#> 1          Bacteria > Firmicutes_A > Clostridia > Oscillospirales > Acutalibacteraceae > Acutalibacter > Acutalibacter ornithocaccae
#> 2                  Bacteria > Firmicutes_A > Clostridia > Lachnospirales > Lachnospiraceae > Scybalocola > Scybalocola faecipullorum
#> 3 Bacteria > Cyanobacteria > Vampirovibrionia > Gastranaerophilales > Gastranaerophilaceae > Stercorousia > Stercorousia sp000437435
#> 4  Bacteria > Actinobacteriota > Actinomycetia > Actinomycetales > Bifidobacteriaceae > Bifidobacterium > Bifidobacterium pullorum_B
#> 5                            Bacteria > Firmicutes_A > Clostridia > Lachnospirales > CAG-274 > Gallispira > Gallispira edinburgensis
#> 6           Bacteria > Proteobacteria > Gammaproteobacteria > Enterobacterales > Enterobacteriaceae > Escherichia > Escherichia coli
#>                                                representative_url metadata1
#> 1 https://www.ebi.ac.uk/metagenomics/api/v1/genomes/MGYG000310807       781
#> 2 https://www.ebi.ac.uk/metagenomics/api/v1/genomes/MGYG000320406       782
#> 3 https://www.ebi.ac.uk/metagenomics/api/v1/genomes/MGYG000319700       783
#> 4 https://www.ebi.ac.uk/metagenomics/api/v1/genomes/MGYG000316616       784
#> 5 https://www.ebi.ac.uk/metagenomics/api/v1/genomes/MGYG000318258       785
#> 6 https://www.ebi.ac.uk/metagenomics/api/v1/genomes/MGYG000312616       786
#>   metadata.Genome_type metadata.Length metadata.N_contigs metadata.N50
#> 1                  MAG         1890035                189        12501
#> 2                  MAG         2145586                258         9928
#> 3                  MAG         2206567                 44       140695
#> 4                  MAG         1441532                255         6193
#> 5                  MAG         2550799                 74        51548
#> 6                  MAG         3876255                351        15239
#>   metadata.GC_content metadata.Completeness metadata.Contamination
#> 1               62.27                  86.2                    0.0
#> 2               44.49                 67.14                    0.0
#> 3               30.68                 83.76                   1.28
#> 4               64.96                 79.39                   2.35
#> 5               37.11                 89.82                    0.0
#> 6               51.07                 87.75                   1.01
#>   metadata.rRNA_5S metadata.rRNA_16S metadata.rRNA_23S metadata.tRNAs
#> 1              0.0               0.0               0.0             15
#> 2              0.0               0.0               0.0             13
#> 3              0.0               0.0               0.0             20
#> 4              0.0             10.18               0.0             19
#> 5              0.0             16.96               0.0             16
#> 6             95.8               0.0              8.34             19
#>   metadata.Genome_accession metadata.Sample_accession metadata.Study_accession
#> 1               ERZ15233365            SAMEA112264197                ERP122587
#> 2               ERZ15233366            SAMEA112264327                ERP122587
#> 3               ERZ15233367            SAMEA112264363                ERP122587
#> 4               ERZ15233368            SAMEA112264171                ERP122587
#> 5               ERZ15233369            SAMEA112264145                ERP122587
#> 6               ERZ15233370            SAMEA112264292                ERP122587
#>   metadata.Country metadata.Continent
#> 1            Spain             Europe
#> 2            Spain             Europe
#> 3            Spain             Europe
#> 4            Spain             Europe
#> 5            Spain             Europe
#> 6            Spain             Europe
#>                                                                                                                                metadata.FTP_download
#> 1 ftp://ftp.ebi.ac.uk/pub/databases/metagenomics/mgnify_genomes/chicken-gut/v1.0/all_genomes/MGYG0003108/MGYG000310807/genomes1/MGYG000308381.gff.gz
#> 2 ftp://ftp.ebi.ac.uk/pub/databases/metagenomics/mgnify_genomes/chicken-gut/v1.0/all_genomes/MGYG0003204/MGYG000320406/genomes1/MGYG000308382.gff.gz
#> 3 ftp://ftp.ebi.ac.uk/pub/databases/metagenomics/mgnify_genomes/chicken-gut/v1.0/all_genomes/MGYG0003197/MGYG000319700/genomes1/MGYG000308383.gff.gz
#> 4 ftp://ftp.ebi.ac.uk/pub/databases/metagenomics/mgnify_genomes/chicken-gut/v1.0/all_genomes/MGYG0003166/MGYG000316616/genomes1/MGYG000308384.gff.gz
#> 5 ftp://ftp.ebi.ac.uk/pub/databases/metagenomics/mgnify_genomes/chicken-gut/v1.0/all_genomes/MGYG0003182/MGYG000318258/genomes1/MGYG000308385.gff.gz
#> 6 ftp://ftp.ebi.ac.uk/pub/databases/metagenomics/mgnify_genomes/chicken-gut/v1.0/all_genomes/MGYG0003126/MGYG000312616/genomes1/MGYG000308386.gff.gz