Create your own package in R

From .R Files to R Packages: A Game Changer for Empirical Researchers

Hello, fellow data enthusiasts! Today, we’re going to embark on a journey that will take us from a scattered mess of .R files to the organized, efficient world of R packages. Why, you ask? Well, let me tell you a story.

Once upon a time, I was just like you. I had a bunch of utility functions scattered across numerous .R files. Every time I needed a function I wrote months, weeks, or even days ago, I had to embark on a treasure hunt through my files. More often than not, I ended up re-writing the function because it was faster than searching for it. Sound familiar?

Well, I finally decided it was time to take the next step and start wrapping all my utility functions into R packages. And let me tell you, it was a game-changer. Here’s why:

  1. Efficiency: Having your own R package can significantly speed up your workflow. Instead of re-writing the same functions over and over again, you can simply call them from your package. This not only saves time but also ensures consistency in your analyses.

  2. Reproducibility: Reproducibility is a cornerstone of good research. With an R package, you can ensure that your analyses are reproducible and transparent. This is especially important when collaborating with others or when your work is subject to peer review.

  3. Learning and Development: Creating your own R package is a great way to deepen your understanding of R and programming principles. It encourages good practices such as writing clear, concise code and thorough documentation.

  4. Sharing and Collaboration: R packages are an excellent way to share your work and collaborate with others. By making your code available as a package, you make it easy for others to use and build upon your work. This can lead to collaborations and advancements that you may not have achieved on your own.

  5. Career Advancement: Having your own R package can be a significant feather in your cap professionally. It demonstrates your proficiency in R and your ability to create reusable, efficient code. This can be a great selling point in job interviews or grant applications.

  6. Customization: With your own R package, you can create functions tailored to your specific needs. For example, if you’re a marketing researcher, you might create a function that generates plots in the style required by the American Marketing Association, like this:

#' @export
amatheme <- function() {
  ggplot2::theme_bw(base_size = 14, base_family = "serif") + 
    ggplot2::theme(
      panel.grid.major = ggplot2::element_blank(),
      panel.grid.minor = ggplot2::element_blank(),
      panel.border = ggplot2::element_blank(),
      line = ggplot2::element_line(),
      text = ggplot2::element_text(),
      legend.title = ggplot2::element_text(size = ggplot2::rel(0.6), face = "bold"),
      legend.text = ggplot2::element_text(size = ggplot2::rel(0.6)),
      legend.background = ggplot2::element_rect(color = "black"),
      plot.title = ggplot2::element_text(
        size = ggplot2::rel(1.2),
        face = "bold",
        hjust = 0.5,
        margin = ggplot2::margin(b = 15)
      ),
      plot.margin = ggplot2::unit(c(1, 1, 1, 1), "cm"),
      axis.line = ggplot2::element_line(colour = "black", linewidth = 0.8),
      axis.ticks = ggplot2::element_line(),
      axis.title.x = ggplot2::element_text(size = ggplot2::rel(1.2), face = "bold"),
      axis.title.y = ggplot2::element_text(size = ggplot2::rel(1.2), face = "bold"),
      axis.text.y = ggplot2::element_text(size = ggplot2::rel(1)),
      axis.text.x = ggplot2::element_text(size = ggplot2::rel(1))
    )
}

This function creates a custom ggplot2 theme that complies with the American Marketing Association style. With this function in your R package, you can easily generate plots that are ready for your next marketing paper.

But the beauty of R packages is that they’re not just for marketing researchers. Anyone can use them. For example, you can check out my own package on GitHub. To install it, simply run:

remotes::install_github("mikenguyen13/mikenguyen")

Feel free to use it for your research, whether you’re in marketing, science, or just a data enthusiast like me.

So, are you ready to dive into the world of R packages? Great! Let’s get started.

Setting Up

First things first, we need to install a couple of packages that will make our lives easier: devtools and roxygen2. You can install them using the following commands:

install.packages("devtools")
install.packages("roxygen2")

Creating the Framework for Your First Package

Now that we have our tools, we can start building the framework for our package. We can do this using devtools:

devtools::create("myfirstpackage")

This command creates a folder with the same name as your package name and populates it with a few files. For now, we’ll focus on the DESCRIPTION file (where all the meta-data about your package goes) and the R folder (where all your R code goes).

Alternatively, you can use the usethis package, which creates all the skeleton files you need:

usethis::create_package("package_name")

This approach won’t let you create the package inside a directory that is already linked with GitHub, so you will have to set up the Git repository manually afterwards:

usethis::use_git()

Add a DESCRIPTION file

usethis::use_description()

When you open this file, you will see that it is pretty much self-explanatory.
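To make the fields concrete, here is a minimal sketch that writes a small DESCRIPTION to a temporary folder and reads it back with base R's read.dcf(). The field values (title, description text, version number) are illustrative placeholders, not what usethis will generate for your package.

```r
# Illustrative DESCRIPTION contents; every value here is a placeholder.
desc_lines <- c(
  "Package: myfirstpackage",
  "Title: Utility Functions for My Research",
  "Version: 0.0.0.9000",
  "Description: A collection of helper functions used across my projects.",
  "License: GPL (>= 3)",
  "Encoding: UTF-8"
)

# Write the file to a temporary directory so nothing in your project changes.
desc_path <- file.path(tempdir(), "DESCRIPTION")
writeLines(desc_lines, desc_path)

# read.dcf() parses the "Field: value" format into a character matrix.
desc <- read.dcf(desc_path)
colnames(desc)       # the field names
desc[, "Package"]    # the package name
```

Each line is a simple "Field: value" pair, which is why the file is easy to edit by hand.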

Add a license

There are many options for you to choose from. I typically just type usethis::uselicense (I know this is incorrect; there is no such function, but autocompletion will suggest functions that are close in spelling). A common choice is GPL-3. Recent versions of usethis provide use_gpl_license() for this; older versions used use_gpl3_license() instead.

usethis::use_gpl_license(version = 3)

Adding Your R Functions

All your R functions that you want in your R package belong in the R directory. You can create an .R file that has the same name as the function you want in it. For instance, let’s create a file called R/amatheme.R and add our function to it.

Even easier, just use these functions, which create the .R file and a matching test file for you:

usethis::use_r("your_function")
usethis::use_test("your_function")

Remember to add the #' @export tag above the function. This tag indicates that the function should be “exposed” so that users of your package can call it.

Let’s say we have a function amatheme that we want to include in our package. We would create a new .R file in the R directory of our package, let’s call it amatheme.R. In this file, we would define our function:

#' @export
amatheme <- function() {
  # Function body goes here
}

Load Your Package

To install and load your package, you can simply run:

library(usethis)
library(devtools)
# install() defaults to the package in the current working directory,
# so you don't even need to specify your package
install()
library("your_library")
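Reinstalling after every small change can get tedious. A common alternative during development is devtools::load_all(), which loads the package source from disk without installing it. The sketch below assumes your working directory is the package root; the helper is_package_root() is hypothetical and is only there to avoid running load_all() somewhere else.

```r
# Hypothetical helper: TRUE if `path` looks like the root of an R package,
# that is, if it contains a DESCRIPTION file.
is_package_root <- function(path = ".") {
  file.exists(file.path(path, "DESCRIPTION"))
}

if (is_package_root() && requireNamespace("devtools", quietly = TRUE)) {
  # Load everything in R/ into the current session without installing.
  devtools::load_all()
} else {
  message("Run this from your package directory with devtools installed.")
}
```

Once the package behaves the way you want, install() gives you the version that library() will pick up in a fresh session.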

Documenting Your Functions

Now, let’s talk about documentation. You know when you type ?amatheme in R and get that nice documentation? That’s what we’re going to do next. We can use the roxygen2 package to document our functions. First, enable Markdown support in roxygen2 comments:

usethis::use_roxygen_md()

To document our amatheme function, we would add special comments above the function definition in the amatheme.R file:

#' Title of the function
#'
#' @description A detailed description of what the function does.
#' @param param1 Description of the first parameter.
#' @param param2 Description of the second parameter.
#' @return Description of the return value.
#' @examples 
#' example1
#' example2
#' @export
amatheme <- function(param1, param2) {
  # Function body goes here
}

Once you’ve got your documentation completed, you can run devtools::document() to generate the .Rd files.
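As a filled-in version of the template above, here is a sketch with a small, hypothetical function called standardize(). It is not part of any package mentioned in this post; it only shows what the tags look like once they describe real parameters.

```r
#' Standardize a numeric vector
#'
#' @description Centers a numeric vector on its mean and scales it by its
#'   standard deviation.
#' @param x A numeric vector.
#' @param na.rm Should missing values be dropped when computing the mean and
#'   standard deviation? Defaults to `FALSE`.
#' @return A numeric vector of the same length as `x`.
#' @examples
#' standardize(c(1, 2, 3))
#' @export
standardize <- function(x, na.rm = FALSE) {
  (x - mean(x, na.rm = na.rm)) / stats::sd(x, na.rm = na.rm)
}

standardize(c(1, 2, 3))
```

After devtools::document(), typing ?standardize in a session with the package loaded would show the title, description, arguments, return value, and example from these comments.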

Package Documentation

One of the key aspects of creating an R package is documentation. Good documentation is crucial for ensuring that others can understand and use your package. This includes not only commenting your code but also creating help files for each function and a detailed package vignette.

To create a vignette (i.e., tutorial) for our package, we would use the usethis::use_vignette("introduction") function. This creates a template R Markdown file in the vignettes directory of our package. We can edit this file to provide a detailed introduction to our package.

A package vignette is a long-form guide that provides a comprehensive overview of your package. It typically includes an introduction to the package, detailed examples of how to use each function, and any other information that users might need to understand your package.

To install the vignette with the package:

devtools::install(build_vignettes = TRUE)

To see the vignette:

vignette("vignette_name")

Dependencies

When creating an R package, it’s important to carefully manage your package dependencies. These are other R packages that your package relies on. You should aim to minimize your dependencies to reduce the likelihood of conflicts and errors. When you do need to use other packages, make sure to specify them in the Imports field of your DESCRIPTION file.

For example, if our package depends on ggplot2, we would add the following line to the DESCRIPTION file:

Imports:
  ggplot2
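Rather than editing DESCRIPTION by hand, you can let usethis::use_package() add the entry for you. The sketch below also shows one way to check at runtime whether a dependency is available; check_installed() is a hypothetical helper written for this post, not a function from devtools or usethis.

```r
# Run once from the package root to add ggplot2 to Imports in DESCRIPTION:
# usethis::use_package("ggplot2")

# Hypothetical helper: report which of the given packages are installed.
check_installed <- function(pkgs) {
  vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
}

# Returns a named logical vector, one entry per package.
check_installed(c("stats", "ggplot2"))
```

Inside your package code, calling functions as ggplot2::theme_bw() (as amatheme does) keeps it obvious which dependency each call comes from.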

Add Bells and Whistles

ReadMe file

So, you’ve heard me blabber about creating your own package in R. But what if you don’t want to stop at just pushing it to your GitHub folder and hoping someone stumbles upon it during their 3 a.m. coding binge? What if you want to make it official, put a badge on it, and show it off to the world like a proud pet owner?

Let’s Start Simple

usethis::use_readme_md()

Think of your README as the flashy storefront of your package. You wouldn’t walk into a store without a sign, right? (Unless you’re an adventurer, in which case, hats off to you!)

The README should cover:

  • Purpose: Why did you make this? (And please don’t say “just because.”)

  • Installation: We’re all eager beavers, but how do we get our paws on it?

  • Usage: Once we have it, how do we use it without breaking our R session?

Add A PDF Manual

I always put the manual in the same folder as my package so that I can sync it to GitHub:

devtools::build_manual(path = getwd())
# alternatively you can do 
devtools::check(manual=TRUE)

A Logo: Because Even R Packages Need Swag

Did someone say branding? Just as you wouldn’t forget your grandmother’s face, you want people to remember your package. Create a chic logo that screams “I mean business... but in a fun, R-related way.”

# use_logo() needs the path to your image; this path is a placeholder
usethis::use_logo("path/to/logo.png")

Citation: Give Credit Where It’s Due

If someone uses your package for their groundbreaking research and ends up winning a Nobel Prize, you want a slice of that credit pie (or at least a mention in the footnotes). Always provide a way for folks to cite your work. A little fame never hurt anybody!

usethis::use_citation()

Badge of Honor: The Bling-Bling for Your Package

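# use_badge() needs a name, a link target, and an image URL;
# all three values below are placeholders
usethis::use_badge(
  badge_name = "my badge",
  href = "https://example.com",
  src = "https://example.com/badge.svg"
)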

Your package aced its tests? Has a 100% download rate by all three members of your family? Slap on a badge! It’s the jewelry of the coding world.

Lifecycle Badge: Is Your Package a Baby or a Grandmaster?

# the stage is required, e.g. "experimental" or "stable"
usethis::use_lifecycle_badge("experimental")

Let people know the maturity of your package. Is it experimental or stable? Is it questioning its place in the R universe? A lifecycle badge gives a quick glance into the evolutionary stage of your creation.

Data

If you want to include raw data files (like .csv, .tsv, .txt, etc.) in your R package, you should place them in the inst/extdata directory of your package. This is a standard location for storing raw data in R packages.

Here’s how you can use such data:

  1. Add your raw data file to the inst/extdata directory in your package. For example, you might add a file called model-coef.rds.

  2. After installing your package, you can access the data file using the system.file() function. Here’s how you can do it:

data_file_path <- system.file("extdata", "model-coef.rds", package = "myfirstpackage")
data <- readRDS(data_file_path)

In this example, system.file() generates the full system path to the data file within the installed package. Then readRDS() is used to read the data into R.

Remember, the inst directory in an R package is for “installed files” - these are files that are not R code or data, but which should be included with the package. The contents of the inst directory are copied by R into the root of the package when it is installed, which is why you don’t include inst in the path when using system.file().
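One thing to watch for: system.file() returns an empty string rather than an error when the file or the package cannot be found, so readRDS() would then fail with a less helpful message. A small wrapper can make that failure explicit. The helper extdata_path() below is hypothetical, a sketch for this post.

```r
# Hypothetical helper: return the installed path of a file in inst/extdata,
# stopping with a clear message if the file (or package) is not found.
extdata_path <- function(file, package) {
  path <- system.file("extdata", file, package = package)
  if (!nzchar(path)) {
    stop("File '", file, "' not found in extdata of package '", package, "'.")
  }
  path
}

# Usage once myfirstpackage is installed:
# coefs <- readRDS(extdata_path("model-coef.rds", "myfirstpackage"))
```

Base R offers a similar safeguard through system.file(..., mustWork = TRUE), which raises an error when nothing matches.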

Rules of Thumb and Industry Hacks

Here are a few rules of thumb and industry hacks to keep in mind when creating your R package:

  • Keep it simple: Try to keep your package focused on one task or theme. This makes it easier for others to understand what your package does and how to use it.

  • Use meaningful function names: Your function names should be descriptive and follow a consistent naming convention. This makes it easier for users to understand what each function does.

  • Test your package: Make sure to thoroughly test your package before releasing it. This includes checking that all functions work as expected and that all documentation is clear and accurate.

  • Keep learning: Creating an R package is a learning process. Don’t be afraid to make mistakes and keep improving your package over time.
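To make the testing advice concrete, here is a minimal sketch of a unit test written with testthat. The function rescale01() is a hypothetical stand-in for one of your own functions; in a real package the test would live in tests/testthat/test-rescale01.R, the file that usethis::use_test("rescale01") creates.

```r
# Hypothetical function standing in for one of your package functions.
rescale01 <- function(x) {
  (x - min(x)) / (max(x) - min(x))
}

# Only run the test if testthat is installed.
if (requireNamespace("testthat", quietly = TRUE)) {
  testthat::test_that("rescale01 maps values onto [0, 1]", {
    testthat::expect_equal(rescale01(c(0, 5, 10)), c(0, 0.5, 1))
    testthat::expect_equal(range(rescale01(c(3, 7, 9))), c(0, 1))
  })
}
```

From the package root, devtools::test() runs every test file in tests/testthat in one go.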

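# Always remember to run these after adding a new function
devtools::document()
usethis::use_test("yournewfunction")

Make sure export(yournewfunction) and import(needed_package) appear in the NAMESPACE file. With roxygen2, devtools::document() writes these entries from the #' @export and #' @import tags, so you should not need to edit NAMESPACE by hand.

List the packages your code requires under Imports: in the DESCRIPTION file.

Add an example of the function to the vignette.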

Installing and Using Your R Package

So, we’ve got our functions, we’ve got our documentation, what’s next? It’s time to install and use our package! We can use the devtools::install() function to install our R package into our R system library. Then we can load up our package with library("myfirstpackage").

Distributing Your R Package

Now that we’ve got our shiny new R package, how do we share it with the world? The easiest way is to distribute it through GitHub.

To distribute our package on GitHub, we first need to create a new repository on GitHub. Then, we can use Git to commit our package files and push them to the GitHub repository. Here’s an example of how to do this in the terminal:

# Navigate to the directory of your package
cd path/to/your/package

# Initialize a new Git repository
git init

# Add all files in the directory to the Git repository
git add .

# Commit the files
git commit -m "Initial commit"

# Add the GitHub repository as a remote
git remote add origin https://github.com/yourusername/yourrepository.git

# Push the files to the GitHub repository
# (use "main" instead of "master" if that is your default branch name)
git push -u origin master

Once you’ve pushed your package to GitHub, anyone can install it using the following command:

devtools::install_github("yourusername/myfirstpackage")

To add a touch of academic flair to your distribution, why not submit it to a journal? Options include the Journal of Open Source Software, the Journal of Open Research Software, and, specifically for R, The R Journal. Personally, I’d give a thumbs up to the Journal of Open Source Software. Not only because I can spell it correctly, but also because they cozy up to your GitHub or ORCID account right off the bat.

Feeling even fancier? Consider creating a website to showcase all your vignettes. This not only provides a platform for your work but also allows you to monitor traffic and track visitors. For this, the pkgdown package is a splendid tool. It effortlessly transforms your R package documentation into a polished website, integrating your README, vignettes, function documentation, and other essential files to craft a cohesive online presence.

Setting up:

# Install the pkgdown package if you haven't already:
# install.packages("pkgdown", dependencies = TRUE)

# Build the website. Ensure you're in your package directory:
pkgdown::build_site()

Upon execution, a docs directory will be generated, containing your website.

Upload to GitHub: Ensure you commit and push all updates, especially the docs folder, to GitHub.

Activating GitHub Pages:

  • Navigate to your GitHub repository’s settings.

  • Under the “GitHub Pages” section, set the source to the master branch (or main, if that is your default branch), specifically pointing to the /docs folder.

Accessing Your Site: Your tailored package website is readily available at:

https://<username>.github.io/<repository>/

And that’s it! You’ve successfully created, documented, installed, and distributed your very own R package. But wait, there’s more! You can also check out my own package on GitHub. To install it, simply run:

remotes::install_github("mikenguyen13/mikenguyen")

Feel free to use it for your research, whether you’re in marketing, science, or just a data enthusiast like me. Happy coding!

Mike Nguyen, PhD
Visiting Professor of Data Sciences and Operations

My research interests include marketing and social science.
