Catherine Tak Piech argues that specialists and a plethora of new technologies are now well positioned to gather and analyze the evidence needed for diagnosing and treating rare diseases.



By Catherine Tak Piech

The global rare disease community is steadily raising awareness of the plight of the fewer than 1 in 2,000 people who live with rare conditions and who are often the orphans of the health care system. Its most recent efforts have focused on the repercussions of a delayed or missed diagnosis, and on the heavy social and financial burdens that rare disease patients and their families share.

On Rare Disease Day, observed on February 29 (or February 28 in non-leap years), campaigners nationwide organize events to literally shine a light on the challenges this community faces. These include illuminating buildings, monuments and homes in the Rare Disease Day colors: blue, green, pink and purple. Participants also wear zebra stripes to signal the need for clinicians to consider a rare disease diagnosis sooner.

The numbers are there. An estimated 300 million patients worldwide have diagnoses related to 7,000 unique rare diseases. More than 70% of these diseases are genetic in origin. Now that scientific advances are increasingly making new treatments feasible, governments are taking a closer look at rare disease populations and the need for reallocating resources.

The U.S. Government Accountability Office (GAO) recently issued a 100+ page report investigating rare disease prevalence and associated costs in the United States. However, the report indicates that a complete picture of costs is obscured by limited evidence.

Evidence challenges: incomplete diagnosis and cost data

The GAO highlighted multiple data gaps in the rare disease landscape. Enumerating patients is problematic for several reasons: lack of specific diagnostic tests, delayed diagnosis or missed diagnosis, lack of specific ICD-10 codes and treatment, and simply the rarity itself.

The true costs of care are difficult to track, both because of these diagnostic difficulties and because many non-medical and indirect costs are broadly distributed across patients, their families, and society. Meanwhile, payors (both public and private) focus on the direct medical costs. According to “The National Economic Burden of Rare Disease Study Summary Report,” non-medical and indirect costs (e.g., loss of income, special accommodations and travel to appointments) could account for more than half of the nearly $1 trillion annual cost of rare diseases in the U.S.

A full accounting of rare diseases and their costs is needed to fully appreciate the challenges these populations face and to secure the resources necessary to improve the lives of those affected. The more information we have, the better we will be able to understand the value of new treatments.

Scientific progress in rare disease etiology (the study of the origin of disease), together with the rapidly evolving potential of gene and cell therapies to deliver treatments, is driving the need for better information. Employing epidemiologists, data scientists, and health economists, along with new data collection technologies, global outreach, and new platforms that can synthesize data from a variety of sources, can help meet this need.


The good news is that the specialized skills and approaches needed to develop a more complete picture of a rare disease population (including costs of care, treatment effectiveness and value) are also evolving rapidly and are being adopted by both big pharma and more agile, data-driven startups. Some examples include:

  • Deploying artificial intelligence and machine learning to scour the ever-expanding volume of scientific literature with greater precision and identify more of the relevant published data for rare disease populations.

  • Using real-world data (RWD) from insurance claims and electronic medical records (EMRs) to develop external, or synthetic, control arms.

This will give new treatment studies a valid comparator, given the small number of patients overall and their reluctance to participate in a trial that requires randomization for what is often a severe disease. Real-world datasets can also be used to estimate the incidence and prevalence numbers needed to support an orphan drug designation application, undertake a survival simulation, develop a budget impact model, model an innovative pricing approach, and more.

  • Deploying natural language processing within EMRs to identify the “bag of words”, or symptom constellations, that are associated with a rare disease.

Having the ability to work backwards from a verified diagnosis may provide the data needed to shorten the time frame for identifying a rare disease. Developers could embed an alert within an EMR to help providers link symptoms to a rare disease.

  • Leveraging global epidemiology knowledge and contacts to tap into more comprehensive electronic medical records in countries with centralized health care systems, such as the Scandinavian countries or Israel, and utilizing the resources of the National Institutes of Health-supported “Undiagnosed Diseases Program and Network” initiative and its global counterpart, the “Undiagnosed Diseases Network International”.

  • Partnering with specialized dataset owners and rare disease advocacy groups who devote significant resources to counting their constituents.

Rare disease advocates have strong incentives to better understand their community, and may benefit from data linkages that cross medical, genetic, geographic, socioeconomic, behavioral, and employment spheres. Having a more complete picture of their disease produces a powerful narrative for support.

  • Establishing data-sharing research relationships with rare disease centers of excellence or referral centers, many of which are university based.

  • Utilizing decentralized tools, including telehealth, wearables, sensors, visiting nurses, home lab tests, electronic patient-reported outcomes (ePROs) and data collection portals, to collect more comprehensive, longitudinal data and to make it easier for patients to report symptom frequencies, disease limitations, costs, and functional improvements. This will provide a more robust picture of the impact of new treatments.
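
To make the EMR alert idea from the list above more concrete, here is a minimal Python sketch of working backwards from verified diagnoses to a symptom constellation. Everything in it is an assumption for illustration: the notes are invented, the function names (`build_constellation`, `should_alert`) and the thresholds are hypothetical, and a real system would need clinical vocabularies, negation handling, validation and privacy safeguards that are not shown here.

```python
import re
from collections import Counter

# Invented free-text notes for illustration only; not real patient data.
CONFIRMED_NOTES = [  # notes from patients with a verified rare disease diagnosis
    "fatigue joint pain skin rash and recurrent fever",
    "recurrent fever with skin rash and joint pain",
    "joint pain fatigue recurrent fever",
]
BACKGROUND_NOTES = [  # notes from the general patient population
    "seasonal cough and fatigue",
    "ankle sprain with mild pain",
    "fatigue and headache after travel",
    "routine check no complaints",
    "cough fever resolved",
]


def tokenize(note):
    """Reduce a note to its set of lowercase words (a simple bag of words)."""
    return set(re.findall(r"[a-z]+", note.lower()))


def term_share(notes):
    """Fraction of notes in which each word appears at least once."""
    counts = Counter()
    for note in notes:
        counts.update(tokenize(note))
    return {term: n / len(notes) for term, n in counts.items()}


def build_constellation(confirmed, background, min_share=0.6, max_background=0.25):
    """Words common among confirmed cases but uncommon in background notes.

    The two thresholds are hypothetical and would need tuning on real data.
    """
    confirmed_share = term_share(confirmed)
    background_share = term_share(background)
    return {
        term
        for term, share in confirmed_share.items()
        if share >= min_share and background_share.get(term, 0.0) <= max_background
    }


def should_alert(note, constellation, min_hits=3):
    """Flag a new note when it contains enough words from the constellation."""
    return len(tokenize(note) & constellation) >= min_hits


CONSTELLATION = build_constellation(CONFIRMED_NOTES, BACKGROUND_NOTES)

if __name__ == "__main__":
    print(sorted(CONSTELLATION))
    print(should_alert("patient reports recurrent fever and a new skin rash", CONSTELLATION))
    print(should_alert("mild cough and fatigue", CONSTELLATION))
```

The sketch filters against background notes so that common words such as "fatigue" do not trigger alerts by themselves. In practice, the choice of thresholds trades earlier detection against alert fatigue for providers.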

New data visualization technologies, such as interactive dashboards developed by Genesis Research and other industry advancements, are providing the ability to analyze and then communicate data more effectively to researchers, regulators, providers, payors, patients, and their families.


There is a dire need for better information to support the development of treatments that address the costly, unmet needs of rare disease sufferers. While the life sciences community still faces challenges in acquiring data for diagnosis and costing, specialists (including data scientists) and a plethora of new technologies are well positioned to gather and analyze the evidence needed for diagnosing and treating rare diseases, which will lead to a brighter future for many.

BY: Catherine Tak Piech, Strategy Consultant, Genesis Research

This article was first published on 19 January 2022. Please see the original article for citations.

