【4.3.1】HLA typing tool: HLA-HD

I. Installation

The software must be requested by email: https://www.genome.med.kyoto-u.ac.jp/HLA-HD/

bowtie2 must be installed beforehand.
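
Before running install.sh, it can help to confirm that bowtie2 is actually on the PATH. The following is a minimal sketch; check_tool is a hypothetical helper name used for illustration and is not part of HLA-HD.

```shell
#!/bin/sh
# check_tool is a hypothetical helper (not shipped with HLA-HD).
# It returns 0 if the named command is found on PATH, 1 otherwise.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found: $1"
    return 0
  fi
  echo "missing: $1 (install it and add it to PATH)" >&2
  return 1
}

# Example: report on bowtie2 without aborting the shell if it is absent.
check_tool bowtie2 || true
```

The same helper can be reused for samtools or java if you plan to follow the BAM route.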

tar -xvf hlahd.1.5.0.tar.gz
# tar -zxvf hlahd.1.5.0.tar.gz
cd hlahd.1.5.0
sh install.sh

Update the dictionary. This is somewhat slow and takes a few minutes.

sh update.dictionary.sh

II. Running

ulimit -Sa
ulimit -n 1024

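ulimit -n sets the per-process limit on open file descriptors. If your machine has plenty of memory and you run many threads, this value can be set higher than 1024; otherwise multi-threaded runs will sometimes still fail.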
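
The soft limit can only be raised as far as the hard limit allows, so it may be worth checking both before a run. A minimal sketch, assuming a shell whose ulimit builtin supports -S, -H and -n (bash and dash do); raise_nofile_limit is a hypothetical helper name, not part of HLA-HD.

```shell
#!/bin/sh
# raise_nofile_limit is a hypothetical helper (not part of HLA-HD).
# It raises the soft open-file limit to at least $1 if the hard limit allows.
raise_nofile_limit() {
  rnl_want="$1"
  rnl_soft="$(ulimit -Sn)"
  rnl_hard="$(ulimit -Hn)"

  # Nothing to do if the soft limit is already high enough.
  if [ "$rnl_soft" = "unlimited" ] || [ "$rnl_soft" -ge "$rnl_want" ]; then
    return 0
  fi

  # The soft limit cannot be raised above the hard limit.
  if [ "$rnl_hard" != "unlimited" ] && [ "$rnl_hard" -lt "$rnl_want" ]; then
    echo "hard limit $rnl_hard is below requested $rnl_want" >&2
    return 1
  fi

  ulimit -Sn "$rnl_want"
}

# Example: request 1024 descriptors, the value used in the command above.
raise_nofile_limit 1024 || echo "could not raise open-file limit" >&2
```

The limit applies to the current shell and its children, so call the function in the same shell session that launches hlahd.sh.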

The official documentation says fastq.gz files must be decompressed first, but in practice compressed files also run!
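
If a compressed input does fail with your HLA-HD version, decompressing to a plain FASTQ first is the safe fallback. A minimal sketch; plain_fastq is a hypothetical helper name, and the example path in the comment is only illustrative.

```shell
#!/bin/sh
# plain_fastq is a hypothetical helper (not part of HLA-HD).
# Given a FASTQ path, it prints the path of an uncompressed file:
#   *.gz inputs are decompressed next to the original (the .gz is kept);
#   any other input is passed through unchanged.
plain_fastq() {
  case "$1" in
    *.gz)
      pf_out="${1%.gz}"
      # gzip -dc writes decompressed data to stdout and leaves the .gz intact.
      gzip -dc "$1" > "$pf_out" || return 1
      echo "$pf_out"
      ;;
    *)
      echo "$1"
      ;;
  esac
}

# Usage (illustrative path):
#   fq1="$(plain_fastq data/sample_1.fastq.gz)"
```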

Example: FASTQ input

hlahd.sh -t [thread_num] -m [minimum length of reads] \
  -c [trimming rate] \
  -f [path_to freq_data directory] \
  fastq_1 fastq_2 \
  gene_split_filt path_to_dictionary_directory \
  IDNAME[any name] output_directory 

hlahd.sh -t 2 -m 100 -c 0.95 -f freq_data/ \
  data/sample_1.fastq data/sample_2.fastq \
  HLA_gene.split.txt  dictionary/  \
  sampleID estimation
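
For more than one sample, the same command can be generated per sample. This is a minimal sketch that only prints the commands; hlahd_cmd, the sample names, and the data/<sample>_1.fastq naming layout are assumptions for illustration.

```shell
#!/bin/sh
# hlahd_cmd is a hypothetical helper: it only PRINTS the hlahd.sh command
# for one sample, using the same options as the example above.
# Assumed layout: data/<sample>_1.fastq and data/<sample>_2.fastq.
hlahd_cmd() {
  echo "hlahd.sh -t 2 -m 100 -c 0.95 -f freq_data/" \
       "data/${1}_1.fastq data/${1}_2.fastq" \
       "HLA_gene.split.txt dictionary/" \
       "${1} estimation"
}

# Print one command per (made-up) sample name.
for s in sampleA sampleB; do
  hlahd_cmd "$s"
done
```

Printing first makes it easy to inspect the commands; once they look right, the output can be piped to sh or pasted into a job script.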

If the input is a BAM file:

Using BAM files mapped to the human genome
If you already have reads mapped to the human genome, you can create FASTQ files of the MHC region plus the unmapped reads using samtools and Picard, as follows:
# Extract the MHC region (region queries need an indexed BAM: samtools index sample.hgmap.sorted.bam)
# for GRCh38.p12 (run only the line that matches your reference build)
samtools view -h -b sample.hgmap.sorted.bam chr6:28,510,120-33,480,577 > sample.mhc.bam
# for GRCh37
samtools view -h -b sample.hgmap.sorted.bam chr6:28,477,797-33,448,354 > sample.mhc.bam
# Extract unmapped reads
samtools view -b -f 4 sample.hgmap.sorted.bam > sample.unmap.bam
# Merge BAM files
samtools merge -o sample.merge.bam sample.unmap.bam sample.mhc.bam
# Create FASTQ
java -jar picard.jar SamToFastq I=sample.merge.bam F=sample.hlatmp.1.fastq F2=sample.hlatmp.2.fastq
# Change FASTQ read IDs
cat sample.hlatmp.1.fastq | awk '{if(NR%4 == 1){O=$0;gsub("/1"," 1",O);print O}else{print $0}}' > sample.hla.1.fastq
cat sample.hlatmp.2.fastq | awk '{if(NR%4 == 1){O=$0;gsub("/2"," 2",O);print O}else{print $0}}' > sample.hla.2.fastq
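
To see what the read-ID step does, it can be run on a single record. A minimal sketch; fix_read_id is a hypothetical wrapper around the same awk logic, and the FASTQ record is made up.

```shell
#!/bin/sh
# fix_read_id is a hypothetical wrapper around the awk command above.
# $1 is the mate number (1 or 2); FASTQ is read from stdin.
fix_read_id() {
  awk -v mate="$1" '{
    if (NR % 4 == 1) {            # header line of each 4-line FASTQ record
      header = $0
      gsub("/" mate, " " mate, header)
      print header
    } else {
      print $0
    }
  }'
}

# A single made-up record: the header "@read1/1" becomes "@read1 1",
# and the other three lines pass through unchanged.
printf '@read1/1\nACGT\n+\nIIII\n' | fix_read_id 1
```

As in the original command, gsub replaces every occurrence of the pattern in the header line, so read names that contain "/1" elsewhere would also be altered.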
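Other HLA typing tools:
  • HLA typing for 10X single-cell RNA-seq data: scHLAcount
  • For WES data, reports only the HLA-A, -B and -C loci: OptiType
  • Covers a comparatively large number of genes: HLA_scan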

