detect_rna
Description
Extraction of specific cmsearch-identified RNA sequences from a fasta file using EASEL
Installation
nf-core modules -g https://www.github.com/ebi-metagenomics/nf-modules install detect_rna
Pipelines
This subworkflow is used by the following pipelines:
Components
This subworkflow uses the following components:
- seqkit/split2 (module)
- cat/cat (module)
- infernal/cmsearch (module)
- infernal/cmscan (module)
- convertcmscantocmsearch (module)
- cmsearchtbloutdeoverlap (module)
- easel/eslsfetch (module)
- extractcoords (module)
Input
| Name | Type | Description | Pattern |
|---|---|---|---|
| meta | map | Groovy Map containing sample information, e.g. `[ id:'sample1', single_end:false ]` | - |
| ch_fasta | file | The input channel containing the FASTA files. Structure: `[ val(meta), path(fasta) ]` | `*.{fasta, fasta.gz, fa, fa.gz}` |
| rfam | directory | The folder containing the Rfam database for use with cmsearch/cmscan. Structure: `path(cm)` | - |
| claninfo | file | The input file containing the claninfo to use for cmsearchtbloutdeoverlap. Structure: `path(claninfo)` | `*.claninfo` |
| mode | value | The method to use: cmsearch or cmscan | - |
| separate_subunits | boolean | Specify true to separate hits into the different RNA subunits | - |
| chunk_flag | boolean | Specify true to use seqkit/split2 to chunk contigs into sequences of a specific length, e.g. 50M. Important: you must specify the chunk length using `ext.args`, e.g. `--by-length 50M`. See the nextflow.config for the unit test for a full example | - |
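
The `chunk_flag` note above can be sketched as a configuration fragment. This is a minimal sketch, not the subworkflow's own test config: the `--by-length 50M` value comes from the description above, while the process selector `SEQKIT_SPLIT2` and the file location are assumptions based on the usual nf-core naming convention and may need adjusting for your setup.

```nextflow
// Hypothetical location: conf/modules.config (or any config file loaded by the pipeline)
process {
    // Assumption: the seqkit/split2 module is exposed under the process name SEQKIT_SPLIT2
    withName: 'SEQKIT_SPLIT2' {
        // Chunk length is passed through ext.args, as required when chunk_flag is true
        ext.args = '--by-length 50M'
    }
}
```

Without a setting of this kind, enabling `chunk_flag` leaves seqkit/split2 with no chunk length to work with.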
Output
| Name | Type | Description | Pattern |
|---|---|---|---|
| versions | file | File containing software versions. Structure: `[ path(versions.yml) ]` | `versions.yml` |
| cmsearch_deoverlap_coords | - | Channel containing deoverlapped cmsearch .tblout files. Structure: `[ val(meta), path("*.tblout.deoverlapped") ]` | - |
| easel_coords | - | Channel containing FASTA output from esl-sfetch. Structure: `[ val(meta), path("*.fasta") ]` | - |
| ssu_fasta | - | Channel containing SSU FASTA sequences. Structure: `[ val(meta), path("sequence-categorisation/*SSU.fasta") ]` | - |
| lsu_fasta | - | Channel containing LSU FASTA sequences. Structure: `[ val(meta), path("sequence-categorisation/*LSU.fasta") ]` | - |
| rrna_bacteria | - | Channel containing bacterial rRNA sequences. Structure: `[ val(meta), path("sequence-categorisation/rRNA_bacteria.fasta") ]` | - |
| rrna_archaea | - | Channel containing archaeal rRNA sequences. Structure: `[ val(meta), path("sequence-categorisation/rRNA_archaea.fasta") ]` | - |
| eukarya | - | Channel containing eukaryotic rRNA sequences. Structure: `[ val(meta), path("sequence-categorisation/rRNA_eukarya.fasta") ]` | - |
| fiveS_fasta | - | Channel containing 5S rRNA sequences. Structure: `[ val(meta), path("sequence-categorisation/*5S.fasta") ]` | - |
| five_eightS_fasta | - | Channel containing 5.8S rRNA sequences. Structure: `[ val(meta), path("sequence-categorisation/*5_8S.fasta") ]` | - |
| ncrna_fasta | - | Channel containing non-coding RNA sequences. Structure: `[ val(meta), path("sequence-categorisation/*other_ncRNA.fasta") ]` | - |
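
To illustrate how the inputs and outputs listed above fit together, here is a rough sketch of calling the subworkflow from a pipeline. Several details are assumptions rather than documented facts: the workflow name `DETECT_RNA`, the include path, the order of the arguments (taken here to follow the input table), and the `params.fasta`, `params.rfam` and `params.claninfo` parameters, which are hypothetical names used only for illustration. Check the subworkflow's `main.nf` before relying on any of them.

```nextflow
// Assumption: include path and workflow name follow the usual nf-core install layout
include { DETECT_RNA } from './subworkflows/ebi-metagenomics/detect_rna/main'

workflow {
    // Input channel with the documented structure [ val(meta), path(fasta) ]
    ch_fasta = Channel.of([ [ id:'sample1', single_end:false ], file(params.fasta) ])

    // Assumption: arguments are passed in the order of the input table
    DETECT_RNA(
        ch_fasta,
        file(params.rfam),      // folder containing the Rfam database
        file(params.claninfo),  // *.claninfo file for cmsearchtbloutdeoverlap
        'cmsearch',             // mode: cmsearch or cmscan
        true,                   // separate_subunits
        false                   // chunk_flag (needs ext.args for seqkit/split2 when true)
    )

    // Output names are taken from the output table
    DETECT_RNA.out.ssu_fasta.view()
    DETECT_RNA.out.lsu_fasta.view()
}
```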