We are excited to announce the launch of MGnify’s Notebook Server. It provides an online Jupyter Lab environment, with no installation needed, in which users can explore programmatic access to MGnify’s datasets using Python, or using R via the MGnifyR package.
MGnify is delighted to announce the release of our latest MAG catalogue, comprising 2729 species-level cluster representative genomes derived from cow rumen datasets.
Version 2 of the Unified Human Gastrointestinal Genome catalogue released
MGnify is excited to announce the release of version 2 of the Unified Human Gastrointestinal Genome (UHGG) catalogue. This is an updated version of the catalogue published by Almeida et al., Nature Biotechnology (2021). We have added 5,878 new genomes from two studies (PRJEB37358 and PRJNA544527), representing 129 new species.
Automated annotations are now available for publications linked to metagenomic studies on MGnify, powered by Europe PMC.
The Earth Microbiome Project is now available in MGnify. The Earth Microbiome Project (EMP) is a wide-ranging collaborative effort that attempts to characterise the taxonomic and functional diversity of microbial life on the planet. Founded in 2010, the project includes 96 different studies, comprising approximately 26,000 individual sequencing runs from a diverse range of biomes.
The MGnify protein sequence database comprises sequences predicted from assemblies generated from publicly available metagenomic datasets. The initial release in August 2017 comprised just under 50 million sequences; the current version contains in excess of 800 million. All sequences now have stable accessions.
EBI Metagenomics becomes MGnify
We are pleased to announce that EBI Metagenomics has changed its name to MGnify in preparation for a series of forthcoming updates and improvements to the resource. The change also reflects the increasingly collaborative nature of the project within EMBL and across the scientific community. MGnify will continue to be a free resource for the assembly, analysis, archiving and browsing of all types of microbiome-derived sequence data, providing insights into the phylogenetic diversity and functional potential of environmental samples. In addition to the new name, the website has been completely rewritten to take advantage of our new API, which provides access to the metadata and analysis results.
Need to compile metadata to perform trait associations using our metagenomic data? Interested in correlating species abundance with the origin of the sample to identify organisms associated with a particular environment or state? Try our latest metagenomics toolkit, “mg-toolkit”, a beta version of a tool that enables scientists to download all of the sample metadata for a given study to a single CSV file. Simply install as follows:
Want to perform a comprehensive meta-analysis of samples from publicly available metagenomics studies? Interested in discovering patterns in metagenomic data to predict disease? Our REST API allows both humans and machines to query over 100,000 publicly available metagenomic and metatranscriptomic datasets. The base URL of the API, https://www.ebi.ac.uk/metagenomics/api/, provides access to several data collections, such as studies, samples, runs, biomes and experiment-types. These can be filtered by a set of attributes, such as biome, allowing selection of samples that belong to the same microbial ecosystem. For instance, to retrieve oceanic data: https://www.ebi.ac.uk/metagenomics/api/latest/studies?lineage=root:Environmental:Aquatic:Marine:Oceanic. Retrieving data from our API is as simple as sending an HTTP request: the response is a JSON-formatted data structure containing the resource type and the associated object identifier (id), together with attributes and relationships linking to other resources. For example, https://www.ebi.ac.uk/metagenomics/api/latest/studies/PRJEB1787 retrieves a metagenomics dataset produced during experiments of the Tara Oceans Expedition.
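As a rough illustration of the request pattern described above, the following Python sketch builds the two example URLs and pulls the type, id and relationship names out of a response. The helper names (`build_url`, `fetch_json`, `summarise`) are hypothetical and not part of any MGnify client, and the top-level `"data"` member is an assumption about the response layout; check it against a real response before relying on it.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Versioned API root, taken from the example URLs in the text above.
API_ROOT = "https://www.ebi.ac.uk/metagenomics/api/latest"


def build_url(collection, accession=None, **filters):
    """Build a URL for a collection, a single record, or a filtered list."""
    url = f"{API_ROOT}/{collection}"
    if accession:
        url += f"/{accession}"
    if filters:
        # safe=":" leaves the colon-separated biome lineage unescaped.
        url += "?" + urlencode(filters, safe=":")
    return url


def fetch_json(url, timeout=30):
    """Send an HTTP GET request and decode the JSON body (needs network access)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


def summarise(document):
    """Return (type, id, relationship names) for each resource in a response.

    Assumption: resources sit under a top-level "data" member, either as a
    single object or as a list of objects.
    """
    data = document.get("data", [])
    resources = data if isinstance(data, list) else [data]
    return [
        (r.get("type"), r.get("id"), sorted(r.get("relationships", {})))
        for r in resources
    ]
```

With network access, `summarise(fetch_json(build_url("studies", "PRJEB1787")))` would request the Tara Oceans study mentioned above, and `build_url("studies", lineage="root:Environmental:Aquatic:Marine:Oceanic")` reproduces the oceanic filter URL. Only the standard library is used here; a dedicated HTTP client would work equally well.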
Analysis Pipeline v4.1 Released
As you may have seen from the EBI Metagenomics website, we have recently deployed a new version of our analysis pipeline (v4.1), which is now the default for analysis of submitted data. Our previous pipeline update (v4.0) was released approximately six months ago and involved substantial upgrades, including a move to a new method for identifying rRNAs and a complete change to the way in which taxonomic analysis was performed.
One Thousand Publicly Available Projects
This week, EBI Metagenomics hit a major milestone as we passed one thousand publicly available projects on the site. This corresponds to over 60,000 samples, comprising more than 80,000 individual runs, and represents the analysis of over 300 billion nucleotide sequences from a wide range of environmental biomes.
Interested in bulk download of our data? Did you know that we provide a Python script for the bulk download of publicly available project data? The tool iterates over all samples and runs in a project and builds an appropriate root URL, which it uses to download individual analysis result files. Different file types can be specified, allowing you to download, for example, all reads encoding 16S rRNAs, all taxonomic assignments, or all predicted protein-coding sequences, for a particular project. To find out more, click the ‘Bulk download script’ link below.
We are hiring!
The EBI Metagenomics Portal and MG-RAST are the world-leading platforms offering free-to-use analysis services for the characterisation of metagenomics sequences. The Metagenomics Exchange is a new collaboration between these platforms, aiming to promote data exchange, discovery and cross-talk between the resources and their analysis pipelines. Metagenomics analysis is challenging in terms of both the scale and the diversity of the data.
Interested in comparing the functional profile of sequencing runs within a project? Now it is possible, using our comparison tool, which provides analysis based on a slimmed-down subset of Gene Ontology (GO) terms, specially developed to describe metagenomic data.
The microbial population (or microbiome) of the human gut is involved in a wide range of important processes, such as digestion, production of vitamins and other nutrients, detoxification, protection from pathogens, and helping to shape the host immune system. Gut microbial communities represent substantial reservoirs of genetic and metabolic diversity: different people have different types of microorganisms in their gut, and community composition can change over time or with diet.
Plankton ecosystems contain a phenomenal reservoir of life: more than 10 billion organisms inhabit every litre of oceanic water, including viruses, prokaryotes, unicellular eukaryotes (protists), and metazoans. Plankton’s importance for the Earth’s climate is at least equivalent to that of the rainforest. Yet only a small fraction of the organisms that make up plankton have been classified and analysed.