【4.6.1】Predicting microRNA binding sites: miRanda
1. Installation
#wget -c http://cbio.mskcc.org/microrna_data/miRanda-aug2010.tar.gz
wget -c http://ftp.genek.cn:8888/Share/linux_software/miRanda-aug2010.tar.gz
tar -xvf miRanda-aug2010.tar.gz
cd miRanda-3.3a/
./configure --prefix=/data4/software/microRNA/miRanda-3.3a-install
make && make install
cd /data4/software/microRNA/miRanda-3.3a-install/bin
./miranda --help
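Some later commands call miranda without its full path, which only works if the install directory is on PATH. A minimal sketch, assuming the install prefix from the ./configure line above (the variable name MIRANDA_HOME is only illustrative):

```shell
# Install prefix taken from the ./configure --prefix line above.
MIRANDA_HOME=/data4/software/microRNA/miRanda-3.3a-install
export PATH="$MIRANDA_HOME/bin:$PATH"

# Report whether the shell can now resolve the binary.
if command -v miranda >/dev/null 2>&1; then
    echo "miranda found at: $(command -v miranda)"
else
    echo "miranda not found in PATH; check the install prefix" >&2
fi
```

To make this permanent, the two assignment lines could be appended to ~/.bashrc.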
2. Testing
cd /data4/software/microRNA/miRanda-3.3a/examples
[sam@c01 examples]$ ll
total 16
-rw-r--r-- 1 sam sam 95 Mar 27 2010 bantam_stRNA.fasta
-rw-r--r-- 1 sam sam 4086 Mar 27 2010 hid_UTR.fasta
[sam@c01 examples]$ head bantam_stRNA.fasta
>gi|29565487|emb|AJ550546.1| Drosophila melanogaster microRNA miR-bantam
GTGAGATCATTTTGAAAGCTG
[sam@c01 examples]$ head hid_UTR.fasta
>gi|945100|gb|U31226.1|DMU31226 Drosophila melanogaster head involution defective protein (hid) mRNA, complete cds (3'UTR only)
TGACAAAAAATAAAAAACGAAATCCATCGTGAACAGTTTTGTGTTTTTAAATCAGTTCTAAACACGAAAA
GGGTTGATGAAAAACGCAGAAGAATCCGAAAAACTAACTAACCGAGCAAAAACTTGACTTGAGTGTTGTT
TGACAAATCAGGAAAGATAAAAAACAAATCATAAGAAAAAACTGCACGAAAAATGAAAAAGTTTCTAATA
TTCAAAATCTTGCACAAGAAATACAAAATCAATTAAAGTGAACTCTAACCAAAAGTTGTACACAAAATAA
AAAGCAAAACAAAGCAGCGAAGAACAATCACAAGAAGAGCAAAGTGCCAACAAAGTGCAGGAAGGAAGGA
AGCGGATAAGGACAAAAAGGAAGCCAGCACACACACACACACCCACACAATGGCCGTGCCCTTTTATTTG
CCCGAGGGCGGCGCCGATGACGTAGCGTCGAGTTCATCGGGAGCCTCGGGCAACTCCTCCCCCCACAACC
ACCCACTTCCCTCGAGCGCATCCTCGTCCGTCTCCTCCTCGGGCGTGTCCTCGGCCTCCGCCTCCTCGGC
CTCATCTTCGTCATCCGCATCGTCGGACGGCGCCAGCAGCGCCGCCTCGCAATCGCCGAACACCACCACC
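# Run with default parameters; results are printed to the terminal
/data4/software/microRNA/miRanda-3.3a-install/bin/miranda bantam_stRNA.fasta hid_UTR.fasta
# Run with a score threshold of 140 (-sc) and an energy threshold of -1 kcal/mol (-en), saving the output to a file
/data4/software/microRNA/miRanda-3.3a-install/bin/miranda bantam_stRNA.fasta hid_UTR.fasta -sc 140 -en -1 > test.out.txt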
[sam@c01 examples]$ cat test.out.txt | grep ">"
>gi|29565487|emb|AJ550546.1| gi|945100|gb|U31226.1|DMU31226 167.00 -24.54 2 20 3340 3360 18 83.33% 94.44%
>gi|29565487|emb|AJ550546.1| gi|945100|gb|U31226.1|DMU31226 156.00 -20.03 2 17 2505 2525 15 86.67% 93.33%
>gi|29565487|emb|AJ550546.1| gi|945100|gb|U31226.1|DMU31226 155.00 -14.57 2 16 2852 2872 14 78.57% 85.71%
>gi|29565487|emb|AJ550546.1| gi|945100|gb|U31226.1|DMU31226 152.00 -14.18 2 18 3820 3841 17 76.47% 76.47%
>>gi|29565487|emb|AJ550546.1| gi|945100|gb|U31226.1|DMU31226 630.00 -73.32 167.00 -24.54 1 21 3902 3340 2505 2852 3820
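Interpreting the results:
1) Each line beginning with ">" reports one site where the miRNA (gi|29565487|emb|AJ550546.1|) targets the gene (gi|945100|gb|U31226.1|DMU31226); here there are four different positions.
2) The fields on a ">" line are, in order: miRNA id, gene id, score, free energy, miRNA start, miRNA end, gene start, gene end, alignment length, percent identity, and percent similarity of the alignment (similarity also counts G:U wobble pairs, so it is greater than or equal to identity).
3) The ">>" line is a summary. Its fields are, in order: miRNA id, gene id, total score, total free energy, maximum score, maximum free energy, strand, miRNA length, gene length, and the target positions on the gene (one or more; four in this case).
4) In short, grab the ">>" lines to get the targeted genes; any other information can be extracted as needed. So easy.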
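The field layout described above can be turned into two small helpers. This is a minimal sketch, assuming whitespace-separated fields as in test.out.txt; the function names miranda_targets and miranda_hits are illustrative and not part of miRanda.

```shell
# List the unique target IDs (field 2) from the ">>" summary lines.
# Usage: miranda_targets test.out.txt
miranda_targets() {
    grep '^>>' "$1" | awk '{ print $2 }' | sort -u
}

# Print one tab-separated row per binding site from the single ">" lines:
# miRNA id, gene id, score, free energy, gene start, gene end.
# Usage: miranda_hits test.out.txt
miranda_hits() {
    awk 'BEGIN { OFS = "\t" }
         /^>[^>]/ {
             sub(/^>/, "", $1)      # drop the leading ">" from the miRNA id
             print $1, $2, $3, $4, $7, $8
         }' "$1"
}
```

The pattern /^>[^>]/ matches only the per-site lines, so the ">>" summary is not counted as a hit.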
3. Custom analysis
3.1 Download the mature sequences from miRBase and extract the human miRNAs as an example
cd /data4/software/microRNA
wget https://www.mirbase.org/download/mature.fa
Browse the microRNAs of the species of interest:
https://www.mirbase.org/browse/results/?organism=mmu
https://www.mirbase.org/browse/results/?organism=hsa
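# Extract human and mouse records (grep -A 1 keeps the sequence line after each matching header)
grep -A 1 'Homo sapiens' mature.fa >hsa-mir.fa
grep -A 1 'Mus musculus' mature.fa >mmu-mir.fa
cat mmu-mir.fa hsa-mir.fa > hsa-mmu-mir.fa
# Remove the "--" separator lines that grep -A inserts between non-adjacent matches
grep -v '^--$' hsa-mmu-mir.fa >hsa-mmu-mir-2.fa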
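The grep -A 1 approach relies on every sequence sitting on a single line and needs a cleanup pass for the "--" separators. A header-aware alternative is sketched here, assuming the species name appears in each FASTA header as it does in mature.fa; the function name extract_species is illustrative.

```shell
# Print every FASTA record whose header line contains the given species name.
# Usage: extract_species 'Homo sapiens' mature.fa > hsa-mir.fa
extract_species() {
    awk -v species="$1" '
        /^>/ { keep = (index($0, species) > 0) }   # decide once per header line
        keep                                       # print the header and its sequence lines
    ' "$2"
}

# Combined mouse + human file in one step (no separator cleanup needed):
# { extract_species 'Mus musculus' mature.fa
#   extract_species 'Homo sapiens' mature.fa; } > hsa-mmu-mir-2.fa
```

Because the decision is made per header, this also handles sequences wrapped over several lines.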
/data4/software/microRNA/miRanda-3.3a-install/bin/miranda /data4/software/microRNA/hsa-mmu-mir-2.fa input/154-no5UTR.fa -sc 140 -en -1 | grep ">>" > result_1.txt
$ less -SN hsa_mature10_targets.txt
1 >>hsa-let-7a-5p ENST00000000412.7 146.00 -11.97 146.00 -11.97 2 22 2756 960
2 >>hsa-let-7a-5p ENST00000001008.5 143.00 -17.75 143.00 -17.75 4 22 3732 2611
3 >>hsa-let-7a-5p ENST00000002125.8 156.00 -21.39 156.00 -21.39 6 22 2176 1248
4 >>hsa-let-7a-5p ENST00000002829.7 146.00 -16.38 146.00 -16.38 10 22 3802 3480
5 >>hsa-let-7a-5p ENST00000003084.10 299.00 -39.65 152.00 -20.76 11 22 6132 3077 5087
6 >>hsa-let-7a-5p ENST00000003100.12 143.00 -18.40 143.00 -18.40 12 22 3210 2586
7 >>hsa-let-7a-5p ENST00000003302.8 438.00 -47.15 152.00 -16.09 13 22 4669 1132 2011 4340
8 >>hsa-let-7a-5p ENST00000004531.14 163.00 -20.40 163.00 -20.40 17 22 7560 1101
9 >>hsa-let-7a-3p ENST00000002125.8 161.00 -12.52 161.00 -12.52 23 21 2176 1476
10 >>hsa-let-7a-3p ENST00000002165.10 158.00 -14.96 158.00 -14.96 24 21 2356 1993
11 >>hsa-let-7a-3p ENST00000003084.10 148.00 -6.47 148.00 -6.47 28 21 6132 5758
12 >>hsa-let-7a-3p ENST00000003302.8 149.00 -16.95 149.00 -16.95 30 21 4669 3959
13 >>hsa-let-7a-3p ENST00000003912.7 146.00 -7.39 146.00 -7.39 32 21 5481 3640
14 >>hsa-let-7a-3p ENST00000004531.14 292.00 -16.76 146.00 -13.42 34 21 7560 3954 6657
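A per-miRNA summary is often a useful first look at a listing like this. The sketch below counts distinct targets and total predicted sites for each miRNA, assuming the input holds whitespace-separated ">>" lines in the layout shown above; the function name summarise_targets is illustrative.

```shell
# For each miRNA, print: miRNA id, number of distinct targets, total number of sites.
# Usage: summarise_targets result_1.txt
summarise_targets() {
    awk 'BEGIN { OFS = "\t" }
         /^>>/ {
             mirna = substr($1, 3)            # strip the leading ">>"
             pair = mirna SUBSEP $2
             if (!(pair in seen)) { seen[pair] = 1; targets[mirna]++ }
             sites[mirna] += NF - 9           # fields 10..NF are site positions
         }
         END { for (m in targets) print m, targets[m], sites[m] }' "$1" | sort
}
```

The final sort only makes the row order stable, since awk does not guarantee an iteration order for arrays.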
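Extract the relevant results: collect the ">>" lines from all *out files, sort them by Max Score (column 5) in descending order, keep the first 49,999 rows, strip the ">>" prefix, and prepend a header row: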
cat *out |grep ">>" |sort -k5,5nr |awk 'NR<50000' |sed 's/>>//g' |cat <(echo "Seq1,Seq2,Tot Score,Tot Energy,Max Score,Max Energy,Strand,Len1,Len2,Positions" |tr "," "\t") - >MirandaOutput.tab
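The Positions column can hold several space-separated sites per row, which is awkward for downstream tools that expect one value per cell. A minimal sketch that expands MirandaOutput.tab to one row per site, assuming the first line is the header and positions start at field 10; the function name explode_positions is illustrative.

```shell
# Expand a miRanda summary table to one row per predicted site.
# Usage: explode_positions MirandaOutput.tab > MirandaSites.tab
explode_positions() {
    awk 'BEGIN { OFS = "\t" }
         NR == 1 { print "Seq1", "Seq2", "Position"; next }   # replace the header row
         { for (i = 10; i <= NF; i++) print $1, $2, $i }      # one row per position
    ' "$1"
}
```

The output file name MirandaSites.tab in the usage line is just a placeholder.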
4. Discussion
An online tool is also available.
Personal WeChat official account: I'm rather lazy and rarely update it, but you can ask questions there. If I'm slow to reply, email me: tiehan@sina.cn