Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Is there a way to selectively extract from a .zip archive those files with names matching a pattern?

For example, I want to use all the .csv files from the archive and ignore the other files.

Current approach:

zipped_file_names <- unzip('some_archive.zip') # extracts everything, captures file names
csv_nms <-  grep('csv', zipped_file_names, ignore.case=TRUE, value=TRUE)
library('data.table')
comb_tbl <- rbindlist(lapply(csv_nms,  function(x) cbind(fread(x, sep=',', header=TRUE, 
                                                               stringsAsFactors=FALSE), 
                                                         file_nm=x) ), fill=TRUE ) 

Instead of just selecting which ones to read (csv_nms), I'm looking for a way to choose which ones to extract in the first place.

I'm currently on R v3.2.2 (Windows).



1 Answer

Thanks to a comment from @user20650.

Use two calls to unzip(). The first, with list=TRUE, only lists the archive contents so you can get the $Name column; the second, with files=, extracts only the files whose names match the pattern.

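  library(data.table)
  # List the archive contents without extracting; keep only names ending in .csv
  zipped_csv_names <- grep('\\.csv$', unzip('some_archive.zip', list=TRUE)$Name, 
                           ignore.case=TRUE, value=TRUE)
  # Extract only the matching files (into the working directory by default)
  unzip('some_archive.zip', files=zipped_csv_names)
  comb_tbl <- rbindlist(lapply(zipped_csv_names,  
                               function(x) cbind(fread(x, sep=',', header=TRUE,
                                                       stringsAsFactors=FALSE),
                                                 file_nm=x)), fill=TRUE ) 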
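
If you do this often, the two-step approach can be wrapped in a small function. The sketch below is only one possible way to package it: extract_matching is a hypothetical helper name (not part of base R), and extracting into a temporary directory via exdir= is my assumption rather than something the question requires. It uses only unzip() and grep() from base R.

```r
# Hypothetical helper (not part of base R): extract only the archive members
# whose names match `pattern`, and return the paths of the extracted files.
extract_matching <- function(zip_path, pattern, exdir = tempfile("unzipped_")) {
  # list = TRUE returns a data frame describing the archive; nothing is extracted
  members <- unzip(zip_path, list = TRUE)$Name
  wanted  <- grep(pattern, members, ignore.case = TRUE, value = TRUE)
  if (length(wanted) == 0L) {
    return(character(0))  # nothing matched, so nothing is extracted
  }
  # unzip() creates exdir if needed and returns the extracted paths (invisibly)
  unzip(zip_path, files = wanted, exdir = exdir)
}

# Usage, assuming 'some_archive.zip' exists in the working directory:
if (file.exists("some_archive.zip")) {
  csv_paths <- extract_matching("some_archive.zip", "\\.csv$")
  print(csv_paths)
}
```

The returned paths can then be passed to fread() in place of zipped_csv_names, which avoids depending on the working directory.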
