Data

Save Your Research

My 2 Cents on Sharing Your Research (Or How Not to Get Lost in the Data Jungle)

Sharing research data is like hosting a party. You want everything to be in the right place, accessible to everyone, but not too chaotic.

Experimentation at Scale

Tools that can help you create experiments at scale:
- Empirica (recommended)
- nodeGame
- oTree: Python-based
- LIONESS Lab
- Breadboard: network
- Turktool: survey for Amazon MTurk
- jsPsych: JavaScript-based
- psiTurk: survey for Amazon MTurk

Scaling Shiny

To solve the problem of scaling:
From DevOps/IT:
- Add memory, CPU
- Set up RStudio Connect for multiple machines
From the R/Shiny engineer:
- Use JavaScript for less CPU usage
- Extract computations: Shiny worker, Plumber
- Use a database

Caching for a Faster Shiny App

To get faster performance from a Shiny app, you can add this call to your script:
# %>% bindCache()
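As a rough illustration of the bindCache() call mentioned above, here is a minimal sketch of a Shiny app that caches a slow reactive and its plot. The input name `n`, the output name `hist`, and the artificial `Sys.sleep()` delay are made up for this example. It assumes shiny 1.6.0 or later (where bindCache() was introduced) and uses the native pipe `|>` from R 4.1+, which behaves the same as `%>%` here.

```r
library(shiny)

ui <- fluidPage(
  sliderInput("n", "Sample size", min = 10, max = 1000, value = 100),
  plotOutput("hist")
)

server <- function(input, output, session) {
  # Slow computation, cached on the value of input$n.
  # Revisiting a previously used value of n returns the stored result
  # instead of recomputing it.
  sample_data <- reactive({
    Sys.sleep(2)   # stand-in for an expensive step (illustrative only)
    set.seed(1)    # keep the result deterministic for a given n
    rnorm(input$n)
  }) |> bindCache(input$n)

  # The rendered plot can be cached with the same key.
  output$hist <- renderPlot({
    hist(sample_data(), main = paste("n =", input$n))
  }) |> bindCache(input$n)
}

# Build the app object; launch it interactively with runApp(app).
app <- shinyApp(ui, server)
```

The cache key should include every input the cached expression depends on; otherwise stale results can be served.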

Connect WRDS in R

Connect from R to Wharton Research Data Services (WRDS). Instructions for setting up the connection from R to WRDS are available (here).

library(RPostgres)
library(tidyverse)
# I've set up the WRDS connection beforehand.
# Please use your username and password here.
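For readers who have not set up the connection yet, a minimal sketch might look like the following. The host, port, and database name follow WRDS's published PostgreSQL instructions as far as I know; please verify them against the current WRDS documentation. The helper name `connect_wrds` and the environment variables `WRDS_USER` and `WRDS_PASSWORD` are hypothetical choices for this example, not part of any WRDS or RPostgres API.

```r
library(RPostgres)

# Hypothetical helper: opens a connection to the WRDS PostgreSQL server.
# Credentials are read from environment variables (assumed names) so they
# do not end up hard-coded in the script.
connect_wrds <- function(user = Sys.getenv("WRDS_USER"),
                         password = Sys.getenv("WRDS_PASSWORD")) {
  DBI::dbConnect(
    RPostgres::Postgres(),
    host = "wrds-pgdata.wharton.upenn.edu",  # assumed from WRDS instructions
    port = 9737,                             # assumed from WRDS instructions
    dbname = "wrds",
    sslmode = "require",
    user = user,
    password = password
  )
}

# Usage (requires a valid WRDS account, so it is left commented out):
# wrds <- connect_wrds()
# schemas <- DBI::dbGetQuery(
#   wrds,
#   "SELECT DISTINCT table_schema FROM information_schema.tables"
# )
# DBI::dbDisconnect(wrds)
```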

Patent Databases

Comprehensive patent data can be found here

United States
- NBER patent data or link
- Search link for individual patent: link
- Patent API
- USPTO - United States Patent and Trademark Office
  - Patent ranking by orgs
  - Bulk Data Storage System: repository for raw public bulk data
  - For Researcher
    - Patent Assignment Dataset: details information on patent assignments since 1970, with schema, description, and code
    - Pre-Grant Publications Data
    - Download Tables with example code

Note that organization names here differ from those in Compustat and CRSP, so they are hard to match.

Linking Financial Databases (CRSP and Compustat)

Information can be found in the CRSP/COMPUSTAT MERGED DATABASE GUIDE.

Identifiers that change over time:
- Ticker: an abbreviation used to uniquely identify publicly traded shares of a stock; it can be reassigned to another company.
- CUSIP: a company can have multiple CUSIPs due to structural changes.
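To see why a changing identifier is a problem for merging, consider the toy sketch below. The data are entirely made up (the ticker "XYZ", the company IDs, and the dates are hypothetical), and the helper `lookup_company` is an illustration rather than a function from CRSP or Compustat. It shows that a ticker only identifies a company once it is paired with a date.

```r
# Hypothetical ticker history: the same ticker is used by two different
# companies in non-overlapping periods.
ticker_history <- data.frame(
  company_id = c("A", "B"),
  ticker     = c("XYZ", "XYZ"),
  start      = as.Date(c("2000-01-01", "2010-01-01")),
  end        = as.Date(c("2009-12-31", "2020-12-31")),
  stringsAsFactors = FALSE
)

# Return the company that held `ticker` on `date`, using the validity
# window of each row instead of the ticker alone.
lookup_company <- function(ticker, date, history) {
  date <- as.Date(date)
  hit <- history[history$ticker == ticker &
                   history$start <= date &
                   date <= history$end, ]
  hit$company_id
}

lookup_company("XYZ", "2005-06-30", ticker_history)  # "A"
lookup_company("XYZ", "2015-06-30", ticker_history)  # "B"
```

Merging on the ticker alone would attach both companies to every "XYZ" observation, which is why a date-bounded match or a permanent identifier is usually preferred.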