Patent classification systems are designed largely for administrative use, which limits their value for most research. To address this deficiency, Hall, Jaffe, and Trajtenberg (2001) developed a higher-level classification for the National Bureau of Economic Research (NBER) Patent Citation Data File by aggregating U.S. Patent Classification (USPC) classes into economically relevant technology categories. While this NBER classification scheme has proven valuable for researchers investigating U.S. patent grants, comparable information on patent applications remained unavailable. For that reason, the Office of Chief Economist (OCE) developed a probability-matching algorithm to apply NBER classifications to patent applications as well as to in-force and expired patents. From the matched data, we construct the USPTO Historical Patent Data Files: four research datasets containing time series and micro-level data by NBER sub-category on applications, grants, and in-force patents, spanning two centuries of innovation. Our hope is that researchers will make use of these data, which for the first time enable detailed study of the complex dynamics between new filings, pendency, and abandonment, and which put recent trends in patenting activity, litigation, and technological change into context.
The USPTO Historical Patent Data Files include four datasets, along with the intermediate files used to generate them:
- The annual dataset contains counts of in-force and issued patents from 1840 to 2014 by NBER sub-category.
- The monthly file contains a monthly count of applications, issued patents, and in-force patents by application status, disposal type (abandoned, issued, or pending), and NBER sub-category from 1981 to 2014.
- The monthly_disposal dataset contains counts of applications by disposal type for each monthly application cohort by NBER sub-category from 1981 to 2014.
- The historical_masterfile contains micro-level application, NBER sub-category, and prosecution data on 2.2 million patent applications filed from 1981 to 2014 and 8.9 million patents issued through 2014.
- Three intermediate files (orders, orders_class, and orders_subclass) used to generate the four datasets are also available for download.
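As an illustration of how the count files described above might be used, the following is a minimal Python sketch that sums issued-patent counts across NBER sub-categories for each year. The column names (`year`, `nber_subcat`, `issued`, `in_force`) and the sample values are assumptions made up for this example; consult the documentation cited below for the actual variable names in the annual file.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample laid out like a yearly count file.
# Column names and values are illustrative assumptions, not the real schema.
SAMPLE_ANNUAL_CSV = """year,nber_subcat,issued,in_force
2013,11,120,900
2013,12,80,610
2014,11,130,950
2014,12,90,640
"""


def issued_by_year(csv_text):
    """Sum issued-patent counts across sub-categories for each year."""
    totals = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[int(row["year"])] += int(row["issued"])
    return dict(totals)


totals = issued_by_year(SAMPLE_ANNUAL_CSV)
print(totals)  # {2013: 200, 2014: 220}
```

The same pattern would apply to the monthly files by grouping on whichever date and status columns they actually contain.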
A document describing these data is available and can be cited as: Marco, Alan C. and Carley, Michael and Jackson, Steven and Myers, Amanda F., The USPTO Historical Patent Data Files: Two Centuries of Innovation (June 1, 2015). SSRN working paper, available at http://ssrn.com/abstract=2616724
For questions, please email EconomicsData@uspto.gov.
Data Files
File Name | 2014 DTA (Size) | 2014 CSV (Size) |
---|---|---|
Output files: | | |
annual | 105 KB | 117 KB |
historical_masterfile | 758 MB | 228 MB |
monthly | 280 KB | 425 KB |
monthly_disposal | 22.8 MB | 37.7 MB |
Intermediate files: | | |
orders | 49.8 KB | 169 KB |
orders_class | 75 MB | 15.6 MB |
orders_subclass | 3.96 MB | 5.34 MB |
Direct download here.
* Note: the files marked with an asterisk have been compressed into a ZIP archive.
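For files distributed as ZIP archives, a CSV can be read directly from the archive without extracting it first. The sketch below uses only the Python standard library; the member name `example.csv`, its columns, and the UTF-8 encoding are assumptions for illustration, and the archive is built in memory so that the example runs without any download.

```python
import csv
import io
import zipfile


def read_csv_from_zip(archive, member):
    """Return the rows of one CSV member of a ZIP archive as dictionaries.

    `archive` may be a filesystem path or a binary file-like object.
    UTF-8 encoding is an assumption; adjust it if the real files differ.
    """
    with zipfile.ZipFile(archive) as zf:
        with zf.open(member) as raw:
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            return list(csv.DictReader(text))


# Build a small in-memory archive with a hypothetical member and columns.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("example.csv", "year,issued\n2014,130\n")
buffer.seek(0)

rows = read_csv_from_zip(buffer, "example.csv")
print(rows)  # [{'year': '2014', 'issued': '130'}]
```

Because this helper loads every row into a list, a file as large as historical_masterfile would be better processed by iterating over the reader row by row.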