Historical Patent Data Files

Patent classification systems are designed largely for administrative purposes, which limits their value for most research. To address this deficiency, Hall, Jaffe, and Trajtenberg (2001) developed a higher-level classification for the National Bureau of Economic Research (NBER) Patent Citation Data File by aggregating U.S. Patent Classification (USPC) classes into economically relevant technology categories. While this NBER classification scheme has proven valuable for researchers investigating U.S. patent grants, comparable information on patent applications remained unavailable. For that reason, the Office of the Chief Economist (OCE) developed a probability-matching algorithm to apply NBER classifications to patent applications as well as to in-force and expired patents. From the matched data, we construct the USPTO Historical Patent Data Files: four research datasets containing time series and micro-level data by NBER sub-category on applications, grants, and in-force patents, spanning two centuries of innovation. Our hope is that researchers will make use of these data, which for the first time enable detailed study of the complex dynamics between new filings, pendency, and abandonment, and which put recent trends in patenting activity, litigation, and technological change into context.

The USPTO Historical Patent Data Files include four datasets and a set of intermediate files:

  • The annual dataset contains counts of in-force and issued patents from 1840 to 2014 by NBER sub-category.
  • The monthly dataset contains monthly counts of applications, issued patents, and in-force patents by application status, disposal type (abandoned, issued, or pending), and NBER sub-category from 1981 to 2014.
  • The monthly_disposal dataset contains counts of applications by disposal type for each monthly application cohort by NBER sub-category from 1981 to 2014.
  • The historical_masterfile contains micro-level application, NBER sub-category, and prosecution data on 2.2 million patent applications filed from 1981 to 2014 and 8.9 million patents issued through 2014.  
  • Three intermediate files (orders, orders_class, and orders_subclass) used to generate the four datasets are also available for download.
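
As a rough illustration of how the cohort counts described above might be used, the sketch below computes the share of each disposal type within a monthly application cohort and NBER sub-category. The column names (cohort, nber_subcat, disposal_type, count) and the sample rows are hypothetical and invented for this example; they are not taken from the actual files, so check the header of monthly_disposal.csv and the accompanying documentation before adapting it.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample standing in for monthly_disposal.csv. The real
# column names and values may differ; inspect the file header first.
SAMPLE = """cohort,nber_subcat,disposal_type,count
1981-01,11,abandoned,30
1981-01,11,issued,60
1981-01,11,pending,10
1981-01,12,abandoned,5
1981-01,12,issued,15
"""


def disposal_shares(rows):
    """Return {(cohort, subcat): {disposal_type: share}} from count rows."""
    counts = defaultdict(lambda: defaultdict(int))
    for row in rows:
        key = (row["cohort"], row["nber_subcat"])
        counts[key][row["disposal_type"]] += int(row["count"])

    shares = {}
    for key, by_type in counts.items():
        total = sum(by_type.values())
        # Skip the division for empty cohorts to avoid dividing by zero.
        shares[key] = {t: n / total for t, n in by_type.items()} if total else {}
    return shares


shares = disposal_shares(csv.DictReader(io.StringIO(SAMPLE)))
for (cohort, subcat), by_type in sorted(shares.items()):
    print(cohort, subcat, {t: round(s, 2) for t, s in by_type.items()})
```

To run this against a downloaded file, replace the in-memory sample with an opened CSV file handle and adjust the column names to match the real header.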

A document describing these data is available and can be cited as: Marco, Alan C. and Carley, Michael and Jackson, Steven and Myers, Amanda F., The USPTO Historical Patent Data Files: Two Centuries of Innovation (June 1, 2015). SSRN working paper, available at http://ssrn.com/abstract=2616724

For questions, please email EconomicsData@uspto.gov

Data Files

File Name               Format   2014 (Size)

Output files:
annual                  DTA      105 KB
                        CSV      117 KB
historical_masterfile   DTA      758 MB
                        CSV      228 MB
monthly                 DTA      280 KB
                        CSV      425 KB
monthly_disposal        DTA      22.8 MB
                        CSV      37.7 MB

Intermediate files:
orders                  DTA      49.8 KB
                        CSV      169 KB
orders_class            DTA      75 MB
                        CSV      15.6 MB
orders_subclass         DTA      3.96 MB
                        CSV      5.34 MB

Direct download here.

* Note: the files marked with an asterisk have been compressed into a ZIP archive.