fix for "cannot allocate vector of size"

Last updated on Feb 20, 2021 1 min read R

For the package author's own introduction to disk.frame, please see this link.

Instead of loading everything into RAM at once, you divide your data into chunks. To quote the author of the disk.frame package: "we go from 'R can only deal with data that fits in RAM' to 'R can deal with any data that fits on disk'." While a data.frame is held and processed in RAM, a disk.frame is stored on the hard drive and processed from there, one chunk at a time.
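To make the chunking idea concrete, here is a minimal base-R sketch. It does not use disk.frame and is not how the package is implemented; it only shows a file being processed a few rows at a time so that the full dataset never sits in RAM. The temporary file, the column name x, and the chunk size are all made up for illustration.

```r
# Base-R sketch of chunked processing: only one chunk is in RAM at a time.
# The file, the column name `x`, and the chunk size are hypothetical.
csv_path <- tempfile(fileext = ".csv")
write.csv(data.frame(x = 1:10), csv_path, row.names = FALSE)

con <- file(csv_path, open = "r")
# Consume the header line and strip the quotes write.csv() puts around names
col_names <- gsub('"', "", strsplit(readLines(con, n = 1), ",")[[1]])

chunk_size <- 4
total <- 0
repeat {
  lines <- readLines(con, n = chunk_size)  # read the next chunk of rows
  if (length(lines) == 0) break            # stop at end of file
  chunk <- read.csv(text = lines, header = FALSE, col.names = col_names)
  total <- total + sum(chunk$x)            # aggregate per chunk, then combine
}
close(con)

total  # 55, the same as sum(1:10)
```

disk.frame handles this kind of bookkeeping for you and stores the chunks on disk in its own format, which is why the code below looks much like ordinary dplyr.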

disk.frame also allows parallel processing.

library("disk.frame")
# setup_disk.frame() # sets up background workers equal to the number of CPU cores
setup_disk.frame(workers = 2) # or your own number of workers
options(future.globals.maxSize = Inf) # allow large datasets to be transferred between sessions
# attr(data.df, "path") # path to where the disk.frame is stored

# to convert data.frame to a disk.frame
data.df <- as.disk.frame(original_data_frame)

# to convert one large CSV
# takes care of splitting the large CSV into smaller chunks
diskf <- disk.frame::csv_to_disk.frame(path_to_csv_file) # you can also specify outdir = and overwrite = TRUE

# to convert multiple CSVs
multiple_CSV <- c(path_to_csv_file1, path_to_csv_file2)
diskf <- disk.frame::csv_to_disk.frame(multiple_CSV)

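# for faster performance, specify which columns to load before manipulating
result <- diskf %>%
  srckeep(c("column1", "column2")) %>%
  dplyr::filter() %>% # put your filter conditions inside filter()
  collect() # bring the result into RAM as a data.frame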


Mike Nguyen

Researcher

My research interests include marketing and social science.
