[4.3] Handling strings in R data frames

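While processing a batch of data in R, I found that some variables were empty or held a '.' character, so every imported column came in as chr. What to do?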

Solution:

aa=read.table('snps_result/rs10010131.tsv',header=T,comment.char="")
bb=apply(aa,2,as.numeric)
cc = as.data.frame(bb)
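
As an alternative sketch, the placeholder values can be declared as missing at import time so that the columns are parsed as numbers directly. This assumes the file is tab-separated (as the .tsv extension suggests); the toy text and column names below are hypothetical and only stand in for the real file.

```r
# Toy input standing in for the real .tsv file (hypothetical values).
txt <- "id\tDP\tQUAL
a\t10\t30.5
b\t.\t41
c\t\t."

# sep = "\t" preserves empty fields; na.strings marks "." and "" as NA,
# so DP and QUAL are parsed as numeric instead of chr.
aa <- read.table(text = txt, header = TRUE, sep = "\t",
                 na.strings = c(".", "", "NA"), comment.char = "")

str(aa)
```

With this approach the result is already a data.frame, so no conversion through a matrix is needed.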

How I got there:

The data frame contains strings, so force the conversion:

bb=apply(aa,2,as.numeric)

### After the forced conversion

bb$DP raises an error:

$ operator is invalid for atomic vectors

is.matrix(bb)   # returns TRUE
	
ggplot(bb,aes(x=DP))+geom_histogram()
Error: ggplot2 doesn't know how to deal with data of class matrix

Cause of the error: ggplot only works with a data.frame, but apply() returns a matrix.

Solution:

cc = as.data.frame(bb)
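
The difference between the matrix and the data frame can be checked on a small example. The following is a minimal sketch with made-up values; it also shows lapply() as one possible way to convert column by column without passing through a matrix, which is a swapped-in variant rather than the approach used above.

```r
# Toy data frame whose columns were all read as character (hypothetical values).
aa <- data.frame(id = c("1", "2", "3"),
                 DP = c("10", ".", "25"),
                 stringsAsFactors = FALSE)

# apply() coerces the data frame to a matrix first, so the result is a matrix.
# The "." entry becomes NA with a coercion warning, silenced here.
bb <- suppressWarnings(apply(aa, 2, as.numeric))
is.matrix(bb)      # TRUE
is.data.frame(bb)  # FALSE

# lapply() works column by column; assigning into cc[] keeps the data frame.
cc <- aa
cc[] <- suppressWarnings(lapply(aa, as.numeric))
is.data.frame(cc)  # TRUE
cc$DP              # 10 NA 25
```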

A personal example:

library('reshape2')
library('ggplot2')
setwd('F:/work/1.ngs/GenoNGS_dev/test/panel_evaluation/wellwise_20170606/6.两次测序数据的比较/amplicon_depth')

aa =read.table('compare_result.tsv',header=T)
bb= apply(aa,2,as.numeric)
cc = as.data.frame(bb)

dd= cc[,c(1,3,4,5,6)]
ee = melt(dd,id.vars='id',value.name='value',variable.name = 'bq')
for (ii in 1:nrow(ee)) {if (!is.na(ee$value[ii])) {if (ee$value[ii]>5000){ee$value[ii] =5000}}}

ff = ee[which(ee$bq=='median_depth1' |ee$bq=='median_depth2' ),]

ggplot(ff,aes(reorder(id,value),value,color=bq))+geom_point() +xlab('Amplicon ID') +ylab('Amplicon Depth')
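
The capping loop and the row filter in this example can also be written in vectorized form. This is only a sketch on made-up long-format data; the column names follow the example above, and the value 'mean_depth1' is a hypothetical extra series included to show the filter.

```r
# Toy long-format data in the shape melt() produces (hypothetical values).
ee <- data.frame(id = c(1, 2, 3, 1, 2, 3),
                 bq = c("median_depth1", "median_depth1", "median_depth1",
                        "mean_depth1", "median_depth2", "median_depth2"),
                 value = c(120, 8000, NA, 300, 5000, 6100))

# pmin() caps each element at 5000 and leaves NA values as NA.
ee$value <- pmin(ee$value, 5000)

# %in% keeps the two series of interest without which() or chained comparisons.
ff <- ee[ee$bq %in% c("median_depth1", "median_depth2"), ]
```

Capping with pmin() avoids the row-by-row loop, which matters mostly for readability on a table of this size.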