The Patent Claims Research Dataset contains detailed information on claims from U.S. patents granted between 1976 and 2014 and U.S. patent applications published between 2001 and 2014. The dataset is derived from the Patent Application Publication Full-Text and Patent Grant Full Text files, available at https://bulkdata.uspto.gov/. The Office of the Chief Economist (OCE) applied a Python algorithm to these files to identify individual claims and the dependency relationships between them. From the parsed claims text, OCE created six data files containing individually parsed claims, claim-level statistics, and document-level statistics, including newly developed measures of patent scope.
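To give a rough sense of what identifying a dependency relationship between claims can involve, the sketch below looks for phrases such as "of claim 1" in claim text. It is an illustration only and is not OCE's algorithm, which is not reproduced here. The regular expression, the function names, and the sample claims are all assumptions made for this example, and the pattern captures only the first number after "claim" or "claims", so ranges such as "claims 1 to 3" are not fully handled.

```python
import re

# Hypothetical pattern: matches "claim 1", "claims 12", etc. and captures
# only the first number that follows. Not taken from the OCE code.
CLAIM_REF = re.compile(r"\bclaims?\s+(\d+)", re.IGNORECASE)


def referenced_claims(claim_text):
    """Return the sorted claim numbers mentioned in one claim's text."""
    return sorted({int(n) for n in CLAIM_REF.findall(claim_text)})


def classify_claims(claims):
    """Map each claim number to the earlier claims it refers to.

    `claims` is a dict of claim number -> claim text. A claim with an
    empty list is treated as independent in this simplified sketch.
    """
    result = {}
    for number, text in claims.items():
        # Keep only references to earlier claims; a claim cannot depend
        # on itself or on a later claim in this simplified model.
        result[number] = [n for n in referenced_claims(text) if n < number]
    return result


# Made-up sample claims, for illustration only.
sample = {
    1: "A widget comprising a frame.",
    2: "The widget of claim 1, wherein the frame is steel.",
    3: "The widget of claim 2, further comprising a handle.",
}
dependencies = classify_claims(sample)
```

Here `dependencies` maps claim 1 to an empty list (independent) and claims 2 and 3 to the claims they cite. Real claim language is more varied than this pattern allows, so the sketch should be read as a starting point rather than a substitute for the published data files.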
A working paper describing the motivation behind the patent scope measures and the trends in them is available and can be cited as: Marco, Alan C. and Sarnoff, Joshua D. and deGrazia, Charles, Patent Claims and Patent Scope (October 2016). USPTO Economic Working Paper 2016-04. Available at SSRN: https://ssrn.com/abstract=2844964
For questions, please email EconomicsData@uspto.gov
Documentation
Patent Claims Research Dataset Documentation
Data Files
Download full set of 2014 data files [.dta format (11.2 GB)] [.csv format (9.32 GB)]
Download individual data files:
File Name | 2014 (DTA) | 2014 (CSV) |
---|---|---|
patent_claims_fulltext | DTA 5.45 GB | CSV 4.41 GB |
patent_claims_stats | DTA 821 MB | CSV 452 MB |
patent_document_stats | DTA 119 MB | CSV 90.3 MB |
pgpub_claims_fulltext | DTA 4.21 GB | CSV 3.79 GB |
pgpub_claims_stats | DTA 570 MB | CSV 530 MB |
pgpub_document_stats | DTA 81.6 MB | CSV 75 MB |
The direct download page is here.
Note: The DTA (Stata dataset) files are saved in the Stata-13 data file format.
Note: The code used to parse the Patent Application Publication Full-Text and Patent Grant Full Text files and generate these datasets will be made available soon.
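Because several of the CSV files are multiple gigabytes, it can help to inspect them by streaming rather than loading them into memory. The following is a minimal sketch using only the Python standard library. The function names, the UTF-8 encoding, and the raised field size limit are assumptions made for this example; the column layout is documented in the dataset documentation and is not assumed here.

```python
import csv
from itertools import islice

# Assumption: full-text claim fields may exceed the csv module's default
# field size limit, so raise it as a precaution.
csv.field_size_limit(2**31 - 1)


def preview_csv(path, n_rows=5, encoding="utf-8"):
    """Return (header, first n_rows data rows) without reading the whole file."""
    with open(path, newline="", encoding=encoding) as f:
        reader = csv.reader(f)
        header = next(reader, [])          # first line is treated as the header
        rows = list(islice(reader, n_rows))
    return header, rows


def count_rows(path, encoding="utf-8"):
    """Count data rows (header excluded) by streaming through the file once."""
    with open(path, newline="", encoding=encoding) as f:
        reader = csv.reader(f)
        next(reader, None)                 # skip the header line
        return sum(1 for _ in reader)
```

For example, after downloading and extracting a file to a local path such as `patent_document_stats.csv` (the local file name is an assumption), `preview_csv("patent_document_stats.csv")` returns the column names and the first five rows. The DTA files need a Stata-aware reader instead; this sketch covers only the CSV variant.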