Data on Patent Law: Sources and Uses Explained

By Adam J. Feldman February 1, 2024


Sometimes the most useful litigation tools are the ones you assemble yourself: you can tailor them to your needs, and occasionally they are even free. This post covers three such resources for the Federal Circuit, the PTAB, and trial-level patent litigation. They can give you a sense of judicial behavior, which in turn helps set expectations for case outcomes and timelines. The three resources I will quickly run through are the Compendium of Federal Circuit Decisions compiled by the University of Iowa Law School (what I will call the “Iowa Database”), the USPTO’s datasets and case resources, and CourtListener’s RECAP Archive. Each of these resources is free, and each can significantly assist you in developing a patent rights strategy.

Federal Circuit Database

The Iowa Database covers Federal Circuit decisions since 2004 and records multiple pieces of information for each case. It contains 19,761 records and is consistently updated. The types of information one can derive from this dataset are invaluable. Anything from the likelihood that en banc review is granted (136 granted and 13,237 denied, for a rate of approximately 1%) to the number of appeals adjudicated from the PTAB (1,777) is readily available.
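As a quick check on the arithmetic, the en banc grant rate can be reproduced from the two counts quoted above. This is only a sketch of the calculation; the function name is illustrative and not part of any dataset tooling.

```python
# Counts quoted above from the Iowa Database.
granted = 136
denied = 13_237


def grant_rate(granted: int, denied: int) -> float:
    """Return the share of decided petitions that were granted."""
    total = granted + denied
    if total == 0:
        raise ValueError("no decided petitions")
    return granted / total


rate = grant_rate(granted, denied)
print(f"{rate:.2%}")  # prints 1.02%
```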

Since the Iowa Database contains information on all decisions from the Federal Circuit, some sorting is required to isolate particular types of appeals, like those relating to patents. If you have a software application that can easily create crosstabs, like Tableau (my favorite), you can organize and synthesize the information to derive useful outputs. The 19,761 records, for instance, can be sorted by dispute type. Many of the records are orders that lack values for several of the case outcome variables, but among cases that are labeled by type, 3,270 deal with patent infringement, 1,059 deal with inter partes review, 517 deal with contract claims, and so on.
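If you do not have a crosstab tool handy, the same kind of tabulation can be sketched in a few lines of Python. The rows and the column names `year` and `dispute_type` below are hypothetical stand-ins; the actual export will have its own column names and category labels.

```python
from collections import Counter

# Hypothetical rows standing in for an export of the database; the real
# column names and category labels may differ.
records = [
    {"year": 2015, "dispute_type": "Patent Infringement"},
    {"year": 2015, "dispute_type": "Inter Partes Review"},
    {"year": 2016, "dispute_type": "Patent Infringement"},
    {"year": 2016, "dispute_type": "Patent Infringement"},
    {"year": 2016, "dispute_type": ""},  # an order with no labeled type
]


def crosstab(rows, row_key, col_key):
    """Count rows per (row_key, col_key) pair, skipping unlabeled rows."""
    counts = Counter()
    for row in rows:
        r, c = row.get(row_key), row.get(col_key)
        if r in (None, "") or c in (None, ""):
            continue
        counts[(r, c)] += 1
    return counts


table = crosstab(records, "dispute_type", "year")
for (dispute_type, year), n in sorted(table.items()):
    print(f"{dispute_type:<22} {year}  {n}")
```

Skipping unlabeled rows mirrors the point above that many records are orders without a dispute type.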

The patent cases are coded for whether they relate to sections 102 or 103 of the Patent Act, along with other issues like claim construction and definiteness. Once a specific area is nailed down, let’s say patent infringement, more specific analyses can be performed. If we isolate the cases from 2015 forward, for example, we can see which judges have been the most frequent majority authors (Stoll with 99, then Prost with 93, and Lourie with 87). We can also see who authored dissents most frequently (Newman with 31 and Reyna with 12). Or perhaps we want to know the most frequent lower courts (the District of Delaware with 246, the Eastern District of Texas with 163, and the Northern District of California with 158). Maybe we even want to know who is or was most likely to dissent from an opinion authored by Judge Stoll. In that instance, Judges Dyk, Hughes, Lourie, and Newman each dissented twice.
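The filter-then-count steps described above can be sketched as follows. The records, judge labels, and field names (`type`, `author`, `dissenters`) are invented for illustration and are not the database's actual coding scheme.

```python
from collections import Counter

# Hypothetical opinion records; field names are illustrative only.
opinions = [
    {"year": 2014, "type": "patent", "author": "Judge A", "dissenters": []},
    {"year": 2016, "type": "patent", "author": "Judge A", "dissenters": ["Judge C"]},
    {"year": 2018, "type": "patent", "author": "Judge B", "dissenters": []},
    {"year": 2019, "type": "patent", "author": "Judge A", "dissenters": ["Judge C"]},
    {"year": 2020, "type": "contract", "author": "Judge B", "dissenters": ["Judge A"]},
]


def select(rows, case_type, first_year):
    """Keep rows of one dispute type decided in or after first_year."""
    return [r for r in rows if r["type"] == case_type and r["year"] >= first_year]


recent_patent = select(opinions, "patent", 2015)

# Most frequent majority authors among the selected cases.
authors = Counter(r["author"] for r in recent_patent)

# Who dissented from opinions written by one particular author.
dissents_from_a = Counter(
    judge
    for r in recent_patent
    if r["author"] == "Judge A"
    for judge in r["dissenters"]
)

print(authors.most_common())
print(dissents_from_a.most_common())
```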

USPTO Website

The USPTO also has a treasure trove of free resources for the legal data enthusiast. Some of the information is quite helpful for legal practitioners moving forward, while other data are mostly historic. Even the backward-looking data, though, can aid current decisions to the extent that they are based on litigation before active judges.

The historic element is quite fascinating. While unfortunately only updated through 2016, the Patent Litigation Docket Reports have case-level information from 81,350 district court cases filed between 1963 and 2016. A nice feature of the Docket Reports is that they track litigation timing, which can be parsed by other variables like the judge or court of interest. There are also multiple datasets that link to one another, so you can look at observations based on the attorneys on the cases, patents, case names, or documents.

We might, for instance, be interested in the magistrate judges to whom these cases were referred in order to gauge how long proceedings end up taking in their courts. Here is an output of magistrate judges with over 200 proceedings in this dataset.

Judge Roy S. Payne of the Eastern District of Texas has the lion’s share of these cases, with every other judge handling only a fraction of Judge Payne’s count. Let’s say we are interested in the time it takes these judges to move a case from opened to closed. We can use the time parameters in the dataset to run this calculation for each individual case and then generate averages by judge. Here is what the averages look like for these judges.
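The per-case duration and per-judge average described above can be sketched like this. The rows and the field names `referred_to`, `date_filed`, and `date_closed` are assumptions made for illustration; the Docket Reports files use their own column names and date formats, which would need to be mapped first.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Hypothetical case rows with ISO-formatted dates.
cases = [
    {"referred_to": "Judge A", "date_filed": "2015-01-01", "date_closed": "2015-04-11"},
    {"referred_to": "Judge A", "date_filed": "2015-01-01", "date_closed": "2015-07-20"},
    {"referred_to": "Judge B", "date_filed": "2015-01-01", "date_closed": "2016-01-01"},
    {"referred_to": "Judge B", "date_filed": "2016-03-01", "date_closed": ""},  # still open
]


def average_days_by_judge(rows):
    """Average days from filing to closing per judge, ignoring open cases."""
    durations = defaultdict(list)
    for row in rows:
        if not row["date_filed"] or not row["date_closed"]:
            continue  # cannot measure a case that has not closed
        opened = date.fromisoformat(row["date_filed"])
        closed = date.fromisoformat(row["date_closed"])
        durations[row["referred_to"]].append((closed - opened).days)
    return {judge: mean(days) for judge, days in durations.items()}


averages = average_days_by_judge(cases)
for judge, days in sorted(averages.items(), key=lambda item: item[1]):
    print(f"{judge}: {days:.0f} days")
```

Dropping open cases keeps the averages comparable, though it can understate durations for judges with many long-running matters still pending.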

Judge Payne cleared his cases the quickest of the group at just under 250 days while, at the other end of the spectrum, Judge Trumbull of the District Court for the Northern District of California averaged over 535 days per case.

Other datasets are available on the USPTO site as well, including the Patent Examination Research Dataset (PatEx), which covers “13 million publicly-viewable provisional and non-provisional patent applications to the USPTO and over 1 million Patent Cooperation Treaty (PCT) applications.”

CourtListener’s RECAP Archive

The RECAP Archive is a freely accessible tool that compiles PACER records. It is an extremely useful resource and was used to derive some of the datapoints for the USPTO measures.

RECAP is generally more of a qualitative data source that can be used to put together quantitative statistics. One of its nice parts, though, is that you can dive into case dockets and, in some instances, view documents filed in cases.

Another nice feature of the RECAP Archive is that you can filter by PACER codes. If, for instance, you were interested in patent cases, you could plug in nature of suit code 830 and find that since the beginning of 2015 there are 28,257 cases that fit under this code and 1,867,132 docket entries. If you were interested in the cases referred to Judge Roy S. Payne in the Eastern District of Texas, you could refine your search by judge and find 2,611 relevant cases since the beginning of 2015.
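A search like the one above can also be expressed as a query string, which is handy if you want to rerun it later. The sketch below only builds a URL and sends nothing. The endpoint path, the parameter names (`type`, `nature_of_suit`, `filed_after`, `referred_to`), and the date format are all assumptions on my part and should be checked against CourtListener's own API documentation before use.

```python
from urllib.parse import urlencode

# Assumed endpoint; verify against CourtListener's API documentation.
BASE_URL = "https://www.courtlistener.com/api/rest/v4/search/"


def build_recap_query(nature_of_suit, filed_after, referred_to=None):
    """Build a search URL for RECAP dockets (parameter names are assumed)."""
    params = {
        "type": "r",  # assumed code for the RECAP Archive
        "nature_of_suit": nature_of_suit,
        "filed_after": filed_after,
    }
    if referred_to:
        params["referred_to"] = referred_to
    return f"{BASE_URL}?{urlencode(params)}"


all_patent = build_recap_query("830", "2015-01-01")
payne_only = build_recap_query("830", "2015-01-01", referred_to="Roy S. Payne")
print(all_patent)
print(payne_only)
```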

A further feature of RECAP, presumably used in the creation of the USPTO dataset, is its metadata, which corresponds to the variables on the USPTO site. These variables include the judges to whom the case was assigned and referred, the citation, the dates filed and terminated, the date of the last known filing, the cause of action and nature of suit, the jury demand, and the jurisdiction type. There are also data on the parties and attorneys where available through PACER.

The upside to these data is that they allow for updating beyond the numbers currently available from the USPTO dataset, which only runs through 2016, and that they provide additional information not found in that dataset. The downside, though, is that putting them into a usable format takes either scraping and parsing skills or the time to input the data manually. If you are trying to assemble specific information rather than raw general data, though, this is a good place to begin.
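To give a sense of what the parsing step involves, here is a minimal sketch that flattens docket metadata into CSV rows. The JSON shape and field names are hypothetical, loosely modeled on the variables listed above, and the case names are invented. Real RECAP output will be structured differently.

```python
import csv
import io
import json

# Hypothetical docket metadata; not the actual RECAP response format.
raw = """
[
  {"case_name": "Example v. Sample", "assigned_to": "Judge A",
   "referred_to": "Judge B", "date_filed": "2015-02-03",
   "date_terminated": "2016-01-15", "nature_of_suit": "830 Patent",
   "jury_demand": "Plaintiff"},
  {"case_name": "Foo v. Bar", "assigned_to": "Judge A",
   "date_filed": "2017-06-30", "nature_of_suit": "830 Patent"}
]
"""

FIELDS = [
    "case_name", "assigned_to", "referred_to", "date_filed",
    "date_terminated", "nature_of_suit", "jury_demand",
]


def dockets_to_csv(json_text, fields=FIELDS):
    """Convert a JSON list of docket records to CSV text.

    Missing fields become empty cells; unexpected fields are dropped.
    """
    rows = json.loads(json_text)
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=fields, restval="", extrasaction="ignore"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


output = dockets_to_csv(raw)
print(output)
```

Leaving missing values as empty cells, rather than guessing at them, keeps open cases (no termination date) distinguishable in later analysis.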

Concluding Thoughts

Legal data help with generating predictions, following trends, and understanding changes in the legal landscape. The data described in this article are all readily available and relatively easy to use and navigate. They are great starting points for research and comparisons, and they provide context to those interested in specific cases. Another big upside is that these resources are free.

While the resources I described relate to patent law, they are just one example of the legal data freely available on the web. There are many other resources for other areas. If you already understand the value of data, raw material for assembling novel datasets abounds. Furthermore, experts in legal data analysis can help you develop the skills to use these resources and to answer complex legal questions that doctrine alone cannot resolve. For claimholders, litigators, litigation funders, and insurers, such data offer an additional benefit: they often lend themselves to probabilistic estimates, which help forecast potential outcomes and put intervals around the likelihood that a given outcome will occur.
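As one illustration of putting an interval around a likelihood, the sketch below computes a Wilson score interval for a proportion, using the en banc counts from the Iowa Database discussed earlier. The Wilson interval is my choice for the example and is not drawn from any of the three resources. It also treats each petition as an independent trial with the same underlying probability, which is a simplification.

```python
import math


def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion.

    z=1.96 corresponds to roughly 95% confidence.
    """
    if trials <= 0 or not 0 <= successes <= trials:
        raise ValueError("need 0 <= successes <= trials and trials > 0")
    p = successes / trials
    denom = 1 + z**2 / trials
    center = (p + z**2 / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z**2 / (4 * trials**2)) / denom
    return max(0.0, center - half), min(1.0, center + half)


# 136 grants out of 136 + 13,237 decided en banc petitions.
low, high = wilson_interval(136, 136 + 13_237)
print(f"{low:.3%} to {high:.3%}")
```

With more than 13,000 observations the interval is narrow; for a single judge or a rare issue with only a handful of cases, the same calculation would produce a much wider range, which is itself useful information when forecasting.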

Adam Feldman is the editor of Empirical SCOTUS, a blog that conducts data analysis of the United States Supreme Court, and the Principal of Optimized Legal, a legal data/statistical consultancy. He is also an adjunct professor of political science and public law at California State University, Northridge. You can reach Adam for specific data and analyses related to your own litigation questions in this and other areas.
