detect_rna
Description
Extraction of specific cmsearch-identified RNA sequences from a fasta file using EASEL
Installation
nf-core subworkflows -g https://www.github.com/ebi-metagenomics/nf-modules install detect_rna
Pipelines
This subworkflow is used by the following pipelines:
Components
This subworkflow uses the following components:
- seqkit/split2 (module)
- cat/cat (module)
- infernal/cmsearch (module)
- infernal/cmscan (module)
- convertcmscantocmsearch (module)
- cmsearchtbloutdeoverlap (module)
- easel/eslsfetch (module)
- extractcoords (module)
Input
| Name | Type | Description | Pattern |
|---|---|---|---|
| meta | map | Groovy Map containing sample information, e.g. `[ id:'sample1', single_end:false ]` | - |
| ch_fasta | file | The input channel containing the fasta files. Structure: `[ val(meta), path(fasta) ]` | `*.{fasta, fasta.gz, fa, fa.gz}` |
| rfam | directory | The folder containing the Rfam database for use with cmsearch/cmscan. Structure: `path(cm)` | - |
| claninfo | file | The input file containing the claninfo to use for cmsearchtbloutdeoverlap. Structure: `path(claninfo)` | `*.claninfo` |
| mode | value | Search method to use: cmsearch or cmscan | - |
| separate_subunits | boolean | Set to true to separate hits into the different RNA subunits | - |
| chunk_flag | boolean | Set to true to use seqkit/split2 to chunk contigs into sequences of a specific length, e.g. 50M. Important: the chunk length must be specified through ext.args, e.g. `--by-length 50M`. See the unit test's nextflow.config for a full example | - |
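
The chunk_flag note says the chunk length has to be passed through ext.args. A minimal config sketch of that could look like the fragment below. The process selector `SEQKIT_SPLIT2` is an assumption based on the usual nf-core naming for the seqkit/split2 module, so check the name used in your installed copy. The `--by-length 50M` value is the example given in the description.

```nextflow
// nextflow.config (sketch, not the repository's own test config)
process {
    // Assumed process name for the seqkit/split2 module
    withName: 'SEQKIT_SPLIT2' {
        // Chunk length used when chunk_flag is true
        ext.args = '--by-length 50M'
    }
}
```

This setting only matters when chunk_flag is true. The nextflow.config shipped with the unit test remains the reference example.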
Output
| Name | Type | Description | Pattern |
|---|---|---|---|
| versions | file | File containing software versions. Structure: `[ path(versions.yml) ]` | versions.yml |
| cmsearch_deoverlap_coords | - | Channel containing deoverlapped cmsearch .tblout files. Structure: `[ val(meta), path("*.tblout.deoverlapped") ]` | - |
| easel_coords | - | Channel containing fasta output from esl-sfetch. Structure: `[ val(meta), path("*.fasta") ]` | - |
| ssu_fasta | - | Channel containing SSU fasta sequences. Structure: `[ val(meta), path("sequence-categorisation/*SSU.fasta") ]` | - |
| lsu_fasta | - | Channel containing LSU fasta sequences. Structure: `[ val(meta), path("sequence-categorisation/*LSU.fasta") ]` | - |
| rrna_bacteria | - | Channel containing bacterial rRNA sequences. Structure: `[ val(meta), path("sequence-categorisation/rRNA_bacteria.fasta") ]` | - |
| rrna_archaea | - | Channel containing archaeal rRNA sequences. Structure: `[ val(meta), path("sequence-categorisation/rRNA_archaea.fasta") ]` | - |
| eukarya | - | Channel containing eukaryotic rRNA sequences. Structure: `[ val(meta), path("sequence-categorisation/rRNA_eukarya.fasta") ]` | - |
| fiveS_fasta | - | Channel containing 5S rRNA sequences. Structure: `[ val(meta), path("sequence-categorisation/*5S.fasta") ]` | - |
| five_eightS_fasta | - | Channel containing 5.8S rRNA sequences. Structure: `[ val(meta), path("sequence-categorisation/*5_8S.fasta") ]` | - |
| ncrna_fasta | - | Channel containing non-coding RNA sequences. Structure: `[ val(meta), path("sequence-categorisation/*other_ncRNA.fasta") ]` | - |
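
As a rough illustration of how the inputs and outputs listed above fit together, the sketch below calls the subworkflow and reads one of its output channels. Several details are assumptions rather than facts from this page: the workflow name `DETECT_RNA`, the include path, the positional order of the arguments (taken from the order of the Input table), the string `'cmsearch'` as a mode value, and all file paths, which are placeholders.

```nextflow
// Hypothetical include path; adjust to where the subworkflow was installed
include { DETECT_RNA } from './subworkflows/ebi-metagenomics/detect_rna/main'

workflow {
    // Input tuple structure from the Input table: [ val(meta), path(fasta) ]
    ch_fasta = Channel.of([ [ id: 'sample1', single_end: false ], file('sample1.fasta') ])

    // Placeholder paths for the Rfam database folder and the claninfo file
    rfam     = file('/path/to/rfam')
    claninfo = file('/path/to/rfam.claninfo')

    // Assumed argument order: fasta, rfam, claninfo, mode, separate_subunits, chunk_flag
    DETECT_RNA(ch_fasta, rfam, claninfo, 'cmsearch', true, false)

    // Each sequence output emits [ val(meta), path(...) ] tuples
    DETECT_RNA.out.ssu_fasta.view { meta, fasta -> "${meta.id}: ${fasta}" }
}
```

The other outputs, such as lsu_fasta or ncrna_fasta, would be accessed the same way through `DETECT_RNA.out`. Check the subworkflow's main.nf for the actual `take:` order before relying on this call.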