Reading a gz file

Ray Zhang

I am trying to read a txt.gz data set, but I get the following error and warning messages:

library(data.table)
dt_gz_anno <- fread(paste0("gunzip -c ", in.path, in.file, chrom, ".annovar.hg19_multianno.txt.gz"), header = TRUE)

Read 0.0% of 524475 rows
Error in fread(paste0("gunzip -c ", in.path, in.file, chrom, ".annovar.hg19_multianno.txt.gz"), :
Expected sep (',') but new line or EOF ends field 5 on line 1586 when reading data: 21 9906784 9906784 A G downstream TEKT4P2 . . . 21p11.2 Score=0.988103;Name=chr4_gl000194_random:0 . . . . . . . . . . . . . . . . .
In addition: Warning messages:
1: In fread(paste0("gunzip -c ", in.path, in.file, chrom, ".annovar.hg19_multianno.txt.gz"), :
Starting data input on line 2 and discarding line 1 because it has too few or too many items to be column names or data:
2: In fread(paste0("gunzip -c ", in.path, in.file, chrom, ".annovar.hg19_multianno.txt.gz"), :
Bumped column 3 to type character on data row 1584, field contains '0;GN=10597;HWEAF=0.00160709;HWDGF=0.996765'. Coercing previously read values in this column from logical, integer or numeric back to character which may not be lossless;
3: In fread(paste0("gunzip -c ", in.path, in.file, chrom, ".annovar.hg19_multianno.txt.gz"), :
Bumped column 5 to type character on data row 1584, field contains '2.62245e-05;IBC=0;HWE_SLP=-0.256613;NS_NREF=3537;ABE=0.665484;ABZ=15.1599;BQZ=0.262836;CYZ=-2.67689;STZ=-5.6356;NMZ=-7.71835;IOR=-0.267773;NM1=0.906873;NM0=0.476757;SVM=-0.0944579 GT

How can I modify the fread call, or is there another command I should use instead? Thanks in advance for your help!
Ray

Answers

  • allan (Site admin)

    Hi Ray,

    Could you clarify what your question has to do with the DataTables library please?

    Allan

This discussion has been closed.