Apache Arrow

More information can be found at Ursa Labs.

This example comes from the arrow package vignettes.

The arrow package works best with big data, including datasets that are too large to load into memory at once.

Prep

library("arrow", warn.conflicts = FALSE)
library("dplyr", warn.conflicts = FALSE)

Check whether your arrow build includes S3 support.

arrow::arrow_with_s3()

If this returns TRUE, copy the data to a local directory. The data originally come from https://www1.nyc.gov/site/tlc/about/tlc-trip-record-data.page

arrow::copy_files("s3://ursa-labs-taxi-data", "nyc-taxi")
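
Because the copy can take a long time, it may help to guard it so that it only runs when it is needed. The following is a minimal sketch, not part of the arrow package: the helper name sync_taxi_data and its dest argument are hypothetical, and it assumes that an existing destination directory means the data has already been copied.

```r
library("arrow", warn.conflicts = FALSE)

# Hypothetical helper (not an arrow function): copy the bucket only when needed.
sync_taxi_data <- function(dest = "nyc-taxi") {
  # Assumption: an existing directory means the data was copied earlier.
  if (dir.exists(dest)) {
    message("Found existing directory '", dest, "'; skipping the copy.")
    return(invisible(dest))
  }
  # Stop early with a clear message if this arrow build cannot read from S3.
  if (!arrow_with_s3()) {
    stop("This arrow build does not include S3 support.")
  }
  copy_files("s3://ursa-labs-taxi-data", dest)
  invisible(dest)
}
```

Calling sync_taxi_data() with no arguments would then behave like the copy_files() call above, but skip the copy on later runs.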

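Since the data is stored in Parquet format (the default format for open_dataset()) and is split into directories by year and month, we open it with: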

ds <- open_dataset("nyc-taxi", partitioning = c("year", "month"))

You can then work with the dataset using the usual dplyr verbs.

ds
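
To show what working with a dataset looks like without downloading the taxi data, here is a small sketch that runs the same steps on a toy dataset. The column names (passenger_count, total_amount) and all values are made up for illustration, and the toy files are written to a temporary directory.

```r
library("arrow", warn.conflicts = FALSE)
library("dplyr", warn.conflicts = FALSE)

# Toy stand-in for the taxi data; all values are invented for illustration.
toy <- data.frame(
  year = c(2018L, 2018L, 2019L, 2019L),
  month = c(1L, 2L, 1L, 2L),
  passenger_count = c(1L, 2L, 1L, 3L),
  total_amount = c(10.5, 20.0, 7.25, 30.0)
)

# Write one Parquet file per year/month directory (e.g. 2019/2/...),
# using bare directory names rather than hive-style "year=2019".
toy_dir <- tempfile("toy-taxi-")
write_dataset(toy, toy_dir,
              partitioning = c("year", "month"),
              hive_style = FALSE)

# Open the directory as a dataset, naming the partition levels.
toy_ds <- open_dataset(toy_dir, partitioning = c("year", "month"))

# Build the query lazily; nothing is read into R until collect().
result <- toy_ds %>%
  filter(year == 2019, total_amount > 10) %>%
  select(month, passenger_count, total_amount) %>%
  collect()

print(result)
```

The filter on year refers to a partition column, so the rows for other years should not need to be read at all. Only the rows that survive the filter are pulled into an R data frame by collect().
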
Mike Nguyen, PhD
Visiting Professor of Data Sciences and Operations
