Subworkflows

This page lists all available subworkflows in the Microbiome Informatics (EBI-Metagenomics) repository.

assembly_decontamination

MGnify production assembly decontamination subworkflow. Performs sequential decontamination of assembled contigs against the human genome, PhiX, and a user-specified genome (usually the sample host). It is designed to remove contigs that originate from any of these references.

combined_gene_caller

MGnify combined gene calling. This workflow runs gene prediction with Pyrodigal and FragGeneScanRS, and then combines the resulting predictions. The merged output contains all the gene predictions from Pyrodigal, along with genes predicted by FragGeneScanRS that do not overlap with any Pyrodigal gene. Optionally, it can mask (remove) genes that overlap with regions from a masking file.
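
The merge rule described above can be illustrated with a minimal Python sketch. This is not the subworkflow's code: the `Gene` type, the `overlaps` helper, and the choice to treat any shared base on the same contig as an overlap (ignoring strand) are assumptions made for illustration, and the real implementation may apply a different overlap criterion.

```python
from typing import Iterable, List, NamedTuple


class Gene(NamedTuple):
    """Hypothetical record for a predicted gene or a masked region."""
    contig: str
    start: int          # 1-based, inclusive
    end: int            # 1-based, inclusive
    strand: str = "."


def overlaps(a: Gene, b: Gene) -> bool:
    # Assumption: any shared base on the same contig counts as an overlap.
    return a.contig == b.contig and a.start <= b.end and b.start <= a.end


def merge_predictions(
    pyrodigal: List[Gene],
    fgs: List[Gene],
    mask: Iterable[Gene] = (),
) -> List[Gene]:
    # Keep every Pyrodigal gene.
    merged = list(pyrodigal)
    # Add FragGeneScanRS genes that overlap no Pyrodigal gene.
    merged += [g for g in fgs if not any(overlaps(g, p) for p in pyrodigal)]
    # Optional masking: drop genes that overlap any masked region.
    mask = list(mask)
    if mask:
        merged = [g for g in merged if not any(overlaps(g, m) for m in mask)]
    return sorted(merged, key=lambda g: (g.contig, g.start, g.end))
```

The pairwise comparison is quadratic in the number of genes, which keeps the sketch short; an interval index would be the usual choice for large assemblies.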

contigs_taxonomic_classification

Assigns per-contig taxonomic annotations to a metagenomic assembly, based on taxonomic classification of predicted proteins with DIAMOND and CAT.

decontaminate_contigs

MGnify decontamination workflow for contigs.

Decontamination algorithm: Remove contigs with query coverage ≥ min_qcov AND percentage identity ≥ min_pid
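
As a rough illustration of this rule, the Python sketch below filters contigs using alignment hits. The hit dictionaries, their field names (`contig`, `qcov`, `pid`), and both function names are hypothetical; the only part taken from the description above is the condition itself. Thresholds and hit values are assumed to be expressed in the same units.

```python
from typing import Dict, Iterable, Mapping, Set


def contigs_to_remove(
    hits: Iterable[Mapping[str, object]],
    min_qcov: float,
    min_pid: float,
) -> Set[str]:
    # A contig is flagged when at least one hit passes BOTH thresholds.
    return {
        str(hit["contig"])
        for hit in hits
        if float(hit["qcov"]) >= min_qcov and float(hit["pid"]) >= min_pid
    }


def decontaminate(
    contigs: Mapping[str, str],
    hits: Iterable[Mapping[str, object]],
    min_qcov: float,
    min_pid: float,
) -> Dict[str, str]:
    # Keep only the contigs that were not flagged, preserving input order.
    flagged = contigs_to_remove(hits, min_qcov, min_pid)
    return {name: seq for name, seq in contigs.items() if name not in flagged}
```

Because the two conditions are joined with AND, a contig with a short but near-identical match, or a long but divergent one, is retained.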

detect_rna

Extraction of specific cmsearch-identified RNA sequences from a fasta file using EASEL

fasta_domainannotation

Protein domain annotation

goslim_swf

Get GO term and GO-slim term counts out of an InterProScan .tsv output file
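
For orientation, the sketch below counts GO terms in an InterProScan TSV file with plain Python. It relies on the standard InterProScan TSV layout, in which column 1 is the protein accession and column 14 holds pipe-separated GO annotations. Counting each term once per protein is an assumption, the function name `count_go_terms` is hypothetical, and the GO-slim mapping step (which needs an ontology file) is not covered; the subworkflow itself may count differently.

```python
import re
from collections import Counter, defaultdict
from typing import Dict, Iterable, Set

# Matches GO identifiers, including entries with a source suffix
# such as "GO:0005515(InterPro)".
GO_ID = re.compile(r"GO:\d{7}")


def count_go_terms(tsv_lines: Iterable[str]) -> Counter:
    proteins_by_term: Dict[str, Set[str]] = defaultdict(set)
    for line in tsv_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 14:
            # Row has no GO column (annotations were not requested or are absent).
            continue
        protein, go_field = fields[0], fields[13]
        for go_id in GO_ID.findall(go_field):
            proteins_by_term[go_id].add(protein)
    # Assumption: each GO term is counted once per protein.
    return Counter({term: len(p) for term, p in proteins_by_term.items()})
```

The function accepts any iterable of lines, so an open file handle can be passed directly.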

mapseq_otu_krona

Taxonomy assignment and visualisation of reads using an input reference database

reads_bwamem2_decontamination

Maps short reads to a reference genome and removes the reads that match

reads_qc

Quality control and merging of FASTQ-format short reads using fastp, generating FASTA output

rrna_extraction

Extraction of specific cmsearch-identified rRNA sequences from a fasta file using EASEL