detect_rna

Description

Extraction of specific cmsearch-identified RNA sequences from a fasta file using EASEL

Installation

ebi-metagenomics/detect_rna

nf-core subworkflows -g https://www.github.com/ebi-metagenomics/nf-modules install detect_rna

Pipelines

This subworkflow is used by the following pipelines:

Components

This subworkflow uses the following components:

Input

| Name | Type | Description | Pattern |
|---|---|---|---|
| meta | map | Groovy Map containing sample information, e.g. [ id:'sample1', single_end:false ] | - |
| ch_fasta | file | The input channel containing the fasta files. Structure: [ val(meta), path(fasta) ] | *.{fasta, fasta.gz, fa, fa.gz} |
| rfam | directory | The folder containing the Rfam database for use with cmsearch/cmscan. Structure: path(cm) | - |
| claninfo | file | The input file containing the claninfo to use for cmsearchtbloutdeoverlap. Structure: path(claninfo) | *.claninfo |
| mode | value | The method to use: cmsearch or cmscan | - |
| separate_subunits | boolean | Specify true to separate hits into the different RNA subunits | - |
| chunk_flag | boolean | Specify true to use seqkit/split2 to chunk contigs into sequences of a specific length, e.g. 50M. Important: the chunk length must be specified using ext.args, e.g. --by-length 50M. See the nextflow.config of the unit test for a full example | - |
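
The chunk_flag note above says the chunk length has to be passed through ext.args. A minimal sketch of what that could look like in a nextflow.config follows. The process selector name SEQKIT_SPLIT2 is an assumption based on the usual nf-core naming of the seqkit/split2 module; check the subworkflow's own unit-test nextflow.config for the selector it actually uses.

```nextflow
// nextflow.config (illustrative sketch, not the subworkflow's shipped config)
process {
    // ASSUMPTION: 'SEQKIT_SPLIT2' is the process name of the seqkit/split2 module;
    // adjust the selector to match the name used in your pipeline.
    withName: 'SEQKIT_SPLIT2' {
        // Chunk length taken from the example in the input table above.
        ext.args = '--by-length 50M'
    }
}
```

This setting only matters when chunk_flag is true; with chunk_flag set to false the chunking step is not expected to run.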

Output

| Name | Type | Description | Pattern |
|---|---|---|---|
| versions | file | File containing software versions. Structure: [ path(versions.yml) ] | versions.yml |
| cmsearch_deoverlap_coords | | Channel containing deoverlapped cmsearch .tblout files. Structure: [ val(meta), path("*.tblout.deoverlapped") ] | - |
| easel_coords | | Channel containing fasta output from esl-sfetch. Structure: [ val(meta), path("*.fasta") ] | - |
| ssu_fasta | | Channel containing SSU fasta sequences. Structure: [ val(meta), path("sequence-categorisation/*SSU.fasta") ] | - |
| lsu_fasta | | Channel containing LSU fasta sequences. Structure: [ val(meta), path("sequence-categorisation/*LSU.fasta") ] | - |
| rrna_bacteria | | Channel containing bacterial rRNA sequences. Structure: [ val(meta), path("sequence-categorisation/rRNA_bacteria.fasta") ] | - |
| rrna_archaea | | Channel containing archaeal rRNA sequences. Structure: [ val(meta), path("sequence-categorisation/rRNA_archaea.fasta") ] | - |
| eukarya | | Channel containing eukaryotic rRNA sequences. Structure: [ val(meta), path("sequence-categorisation/rRNA_eukarya.fasta") ] | - |
| fiveS_fasta | | Channel containing 5S rRNA sequences. Structure: [ val(meta), path("sequence-categorisation/*5S.fasta") ] | - |
| five_eightS_fasta | | Channel containing 5.8S rRNA sequences. Structure: [ val(meta), path("sequence-categorisation/*5_8S.fasta") ] | - |
| ncrna_fasta | | Channel containing non-coding RNA sequences. Structure: [ val(meta), path("sequence-categorisation/*other_ncRNA.fasta") ] | - |
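
To show how the inputs and outputs listed above fit together, here is a hedged sketch of calling the subworkflow from a pipeline. Several details are assumptions rather than facts from this page: the workflow name DETECT_RNA and the include path follow the usual nf-core layout, the argument order mirrors the order of the input table, the mode is assumed to be passed as the string 'cmsearch', and params.fasta, params.rfam and params.claninfo are hypothetical parameter names.

```nextflow
// Illustrative only: names, paths and argument order are assumptions (see the note above).
include { DETECT_RNA } from './subworkflows/ebi-metagenomics/detect_rna/main'

workflow {
    // [ val(meta), path(fasta) ] tuple, as described for ch_fasta
    ch_fasta = Channel.of(
        [ [ id:'sample1', single_end:false ], file(params.fasta) ]
    )

    DETECT_RNA(
        ch_fasta,
        file(params.rfam),      // Rfam database for cmsearch/cmscan
        file(params.claninfo),  // *.claninfo file used for deoverlapping
        'cmsearch',             // mode: cmsearch or cmscan
        true,                   // separate_subunits
        false                   // chunk_flag (requires ext.args when true)
    )

    // Each output channel emits [ val(meta), path(...) ] tuples
    DETECT_RNA.out.ssu_fasta.view()
    DETECT_RNA.out.lsu_fasta.view()
}
```

With separate_subunits set to true, the per-subunit channels such as ssu_fasta and lsu_fasta are the ones expected to carry the categorised sequences.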

People

Authors

@Kate_Sakharova