【4.1】rRNA database: SILVA

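The SILVA rRNA database (http://www.arb-silva.de/) is used to check and align rRNA sequences. It covers both the small subunit (SSU, 16S/18S) and the large subunit (LSU, 23S/28S) for Bacteria, Archaea and Eukarya. It is also the official database of the ARB software package.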

  • LSU: Large subunit (23S/28S ribosomal RNAs)
  • SSU: Small subunit (16S/18S ribosomal RNAs)

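1. Download

1.1 Download for ARB

As of 2015-02-04, the latest downloadable database is Release 119. The web version has already reached release 121, but that release is not yet available for download.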

Download instructions: http://www.arb-silva.de/download/arb-files/

Download location: http://www.arb-silva.de/no_cache/download/archive/release_119/ARB_files/

I chose Ref NR 99, the non-redundant version of Ref 119.

wget -c http://www.arb-silva.de/fileadmin/silva_databases/release_119/ARB_files/SSURef_NR99_119_SILVA_14_07_14_opt.arb.tgz

md5sum SSURef_NR99_119_SILVA_14_07_14_opt.arb.tgz
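The md5sum call prints a digest, which only helps once it is compared with a reference value. Below is a minimal sketch of that comparison, assuming you have obtained the expected MD5 value from the download site. The helper name verify_md5 is hypothetical, and the demo runs on a temporary file rather than on the real archive.

```shell
#!/usr/bin/env bash
# verify_md5 is a hypothetical helper, not part of SILVA or ARB.
# Usage: verify_md5 <file> <expected_md5_hex>
# Returns 0 when the digest of <file> equals <expected_md5_hex>.
verify_md5() {
    local file="$1" expected="$2" actual
    # md5sum prints "<digest>  <filename>"; keep only the digest.
    actual="$(md5sum "$file" | awk '{print $1}')"
    [ "$actual" = "$expected" ]
}

# Self-contained demo on a throwaway file (no network access needed).
demo="$(mktemp)"
printf 'hello\n' > "$demo"
expected="$(md5sum "$demo" | awk '{print $1}')"
if verify_md5 "$demo" "$expected"; then
    echo "checksum OK"
else
    echo "checksum MISMATCH"
fi
rm -f "$demo"
```

For the real archive the call would look like verify_md5 SSURef_NR99_119_SILVA_14_07_14_opt.arb.tgz followed by the expected digest. If the two values differ, rerunning wget -c and checking again is a reasonable first step.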

1.2 Downloading only the FASTA files

Download location: http://www.arb-silva.de/no_cache/download/archive/release_119/Exports/

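Choose the files according to your needs:

  • LSU (23S/28S ribosomal RNAs) or SSU (16S/18S ribosomal RNAs);
  • redundant or non-redundant; I chose the non-redundant set, i.e. Nr99;
  • trunc or not, i.e. whether nucleotides that were not aligned are removed from the sequences (the file descriptions below spell this out);
  • with or without the full alignment.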

### *_tax_silva.fasta.gz

-----------------

Multi FASTA files of the SSU/LSU databases including the SILVA taxonomy for Bacteria, Archaea and Eukaryotes in the header.

REMARK: The sequences in the files are NOT truncated to the effective LSU or SSU genes. They contain the full entries as they have been deposited in the public repositories (ENA/GenBank/DDBJ).

Fasta header:

>accession_number.start_position.stop_position taxonomic path organism name
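The header layout above can be split into its fields with plain shell parameter expansion. This is only a sketch: the helper name parse_silva_header and the example header are made up for illustration, and it assumes the identifier and the taxonomy text are separated by the first space on the line.

```shell
#!/usr/bin/env bash
# parse_silva_header is a hypothetical helper for illustration.
# Input:  one FASTA header line, e.g. ">ACC.START.STOP taxonomy text"
# Output: accession, start, stop, taxonomy text, separated by tabs.
parse_silva_header() {
    local header="${1#>}"            # drop the leading ">"
    local id="${header%% *}"         # everything before the first space
    local taxonomy="${header#* }"    # everything after the first space
    local stop="${id##*.}"           # field after the last dot
    local acc_start="${id%.*}"       # identifier without the stop position
    local start="${acc_start##*.}"   # field after the remaining last dot
    local accession="${acc_start%.*}"
    printf '%s\t%s\t%s\t%s\n' "$accession" "$start" "$stop" "$taxonomy"
}

# Made-up header, not a real database entry.
parse_silva_header ">AB000001.1.1500 Bacteria;Example phylum;Example organism"
```

Splitting the identifier from the right keeps the sketch working even if an accession itself contained a dot. A header without any space is not handled specially.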
 
### *_tax_silva_full_align_trunc.fasta.gz

-----------------------

Multi FASTA files of the SSU/LSU databases including the SILVA taxonomy for Bacteria, Archaea and Eukaryotes in the header (including the FULL alignment).

REMARK: Sequences in these files have been truncated. This means that all nucleotides that have not been aligned were removed from the sequence.

### *_tax_silva_trunc.fasta.gz

-----------------------

Multi FASTA files of the SSU/LSU database including the SILVA taxonomy for Bacteria, Archaea and Eukaryotes in the header.

REMARK: Sequences in these files have been truncated. This means that all nucleotides that have not been aligned were removed from the sequence.
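After downloading one of these exports, a quick sanity check is to count the FASTA records without unpacking the file to disk. The sketch below assumes a gzipped multi FASTA file. The helper name count_fasta_records is hypothetical, and the demo builds a tiny file with made-up records instead of using a real export.

```shell
#!/usr/bin/env bash
# count_fasta_records is a hypothetical helper, not a SILVA or ARB tool.
# It prints the number of records (lines starting with ">") in a
# gzipped FASTA file.
count_fasta_records() {
    gzip -dc "$1" | grep -c '^>'
}

# Self-contained demo: a tiny gzipped FASTA file with made-up records.
demo="$(mktemp)"
printf '>AB000001.1.4 Bacteria;Example\nACGU\n>AB000002.1.3 Archaea;Example\nGCA\n' \
    | gzip -c > "$demo"
count_fasta_records "$demo"   # prints 2
rm -f "$demo"
```

A count of zero, or a gzip error, usually points to an incomplete download.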

Generating a SILVA database for use with mothur: http://blog.mothur.org/2014/08/08/SILVA-v119-reference-files/

References

  • Quast C, Pruesse E, Yilmaz P, Gerken J, Schweer T, Yarza P, Peplies J, Glöckner FO (2013) The SILVA ribosomal RNA gene database project: improved data processing and web-based tools. Nucl. Acids Res. 41 (D1): D590-D596