MediSyn Inc

Our Solution

Synthetic electronic health record generation

Our synthetic data solution is based on our cutting-edge machine learning in healthcare research.  

  • We produce high-quality, high-dimensional longitudinal electronic health records based on your patient records. 
  • We create a customized machine learning generator for your patient database.
  • We enable targeted patient generation for specific conditions of interest.  
  • We verify the synthetic dataset through comprehensive fidelity and privacy tests.

Clinical predictive modeling

Our predictive modeling solution is based on comprehensive research in machine learning. 

  • We create clinical predictive models for our clients on their real patient data. 
  • We augment the clinical predictive model training using synthetic patient data to boost performance. 
  • We deploy clinical predictive models into practice. 

MediSyn Capabilities

Product capability

  • Support standard data formats such as the FHIR API and OMOP CDM.
  • Generate medical codes such as diagnoses and procedures.
  • Generate medication data.
  • Generate patient demographics. 
  • Generate numerical data such as lab measurements.
  • Generate longitudinal EHR with multiple visits.

Research capability

We have published research on other data modalities, including clinical notes and medical images such as X-rays.

Find out more about our solution

Email us at jimeng@medisyn.ai and our team will be happy to show you our demo.

Demo

Fidelity of our synthetic data

Data statistics similar to real patient data

Real electronic health records (EHRs) are high-dimensional, including diagnoses (ICD codes), procedures (CPT codes), and medications. Altogether, over 20K dimensions need to be modeled and synthesized. Most existing synthetic data generators cannot produce such high-dimensional data. Instead, they often require users to select a handful of variables of interest from the vast number of features in the real data, and they generate only those few variables (usually on the order of tens). In comparison, MediSyn can produce high-dimensional EHRs at their original resolution with high fidelity.

Each dot corresponds to a single medical code (ICD or CPT code). High R^2 indicates high fidelity.
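
The code-level comparison behind such a plot can be sketched as follows. This is a minimal illustration, not MediSyn's evaluation code. It assumes each patient record is a set of code strings, that prevalence means the fraction of patients with a given code, and that R^2 is measured against the identity line y = x. The function names and the toy records are hypothetical.

```python
from collections import Counter


def code_prevalence(patients):
    """Fraction of patients whose record contains each code."""
    counts = Counter()
    for codes in patients:
        counts.update(set(codes))
    n = len(patients)
    return {code: c / n for code, c in counts.items()}


def r_squared_identity(real_prev, synth_prev):
    """R^2 of synthetic prevalence against the line y = x (assumed definition)."""
    codes = sorted(set(real_prev) | set(synth_prev))
    x = [real_prev.get(c, 0.0) for c in codes]   # real prevalence per code
    y = [synth_prev.get(c, 0.0) for c in codes]  # synthetic prevalence per code
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - xi) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    # Undefined when every code has the same synthetic prevalence (ss_tot == 0).
    return 1.0 - ss_res / ss_tot


# Toy records, purely illustrative: one set of ICD/CPT-style codes per patient.
real = [{"I10", "E11.9"}, {"I10"}, {"I10", "99213"}, {"E11.9", "99213"}]
synthetic = [{"I10", "E11.9"}, {"I10", "99213"}, {"I10"}, {"E11.9", "99213"}]

score = r_squared_identity(code_prevalence(real), code_prevalence(synthetic))
```

Each key in the two prevalence dictionaries corresponds to one dot: its real prevalence is the x value and its synthetic prevalence is the y value.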

High correlation within a visit

MediSyn captures the co-occurrence patterns of medical codes within a visit. The prevalence of medical code pairs in synthetic data correlates strongly with their prevalence in real data, even though over 5 million code pairs have to be modeled.
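
A within-visit co-occurrence check of this kind could look like the sketch below. It rests on assumptions not stated on this page: a visit is a set of codes, pair prevalence is the fraction of visits containing both codes, and agreement is summarized with a Pearson correlation. All names and toy visits are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def pair_prevalence(visits):
    """Fraction of visits in which each unordered code pair co-occurs."""
    counts = Counter()
    for codes in visits:
        counts.update(combinations(sorted(set(codes)), 2))
    return {pair: c / len(visits) for pair, c in counts.items()}


def pearson(real_prev, synth_prev):
    """Pearson correlation of pair prevalence, real vs. synthetic."""
    pairs = sorted(set(real_prev) | set(synth_prev))
    x = [real_prev.get(p, 0.0) for p in pairs]
    y = [synth_prev.get(p, 0.0) for p in pairs]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Toy visits, purely illustrative.
real_visits = [{"I10", "E11.9"}, {"I10", "E11.9", "99213"}, {"I10", "99213"}, {"E11.9"}]
synthetic_visits = [{"E11.9"}, {"I10", "99213"}, {"I10", "E11.9"}, {"I10", "E11.9", "99213"}]

r = pearson(pair_prevalence(real_visits), pair_prevalence(synthetic_visits))
```

Counting pairs per visit grows quadratically with the number of codes in a visit, so at the scale of millions of pairs a sparse counter like this one is preferable to a dense matrix.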

High correlation across visits

MediSyn generates realistic longitudinal patient records of multiple visits over time. The temporal correlation of medical codes is accurately captured. Each dot in the plot indicates a pair of medical codes that occurs in consecutive visits. The x-axis is the prevalence of this pair in real data, while the y-axis corresponds to that in synthetic data. 
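
The points of such a plot can be computed as in the sketch below. It assumes a patient is an ordered list of visits, each visit a set of codes, and that the prevalence of an ordered pair (a, b) is the fraction of consecutive-visit transitions in which a appears in the earlier visit and b in the later one. That definition, the function names, and the toy data are assumptions made for illustration.

```python
from collections import Counter
from itertools import product


def transition_prevalence(patients):
    """Prevalence of ordered code pairs across consecutive visits."""
    counts = Counter()
    n_transitions = 0
    for visits in patients:
        for earlier, later in zip(visits, visits[1:]):
            n_transitions += 1
            # Count each (earlier code, later code) pair once per transition.
            counts.update(set(product(earlier, later)))
    return {pair: c / n_transitions for pair, c in counts.items()}


def scatter_points(real, synthetic):
    """One (pair, x, y) tuple per code pair: x from real data, y from synthetic."""
    real_prev = transition_prevalence(real)
    synth_prev = transition_prevalence(synthetic)
    pairs = sorted(set(real_prev) | set(synth_prev))
    return [(p, real_prev.get(p, 0.0), synth_prev.get(p, 0.0)) for p in pairs]


# Toy longitudinal records, purely illustrative.
real = [
    [{"I10"}, {"I10", "E11.9"}],
    [{"E11.9"}, {"99213"}, {"I10"}],
]
synthetic = [[{"I10"}, {"E11.9"}]]

points = scatter_points(real, synthetic)
```

Pairs that occur in only one of the two datasets get a prevalence of 0.0 on the other axis, so they show up on an axis rather than being dropped.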

Support machine learning modeling

Our synthetic data can support machine learning modeling:

  • ML models trained on our synthetic data perform almost as well as models trained on real data.
  • Synthetic data from other generators struggles to support ML models.
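
One common way to make this kind of comparison is to train the same model once on real data and once on synthetic data, then score both on held-out real data. The sketch below illustrates that protocol with a deliberately simple 1-nearest-neighbor classifier on Jaccard similarity. The classifier, the labels, and the toy records are assumptions chosen for brevity and say nothing about the models or results behind the claims above.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets of codes."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def predict_1nn(train, codes):
    """Label of the training record most similar to `codes`."""
    _, label = max(train, key=lambda rec: jaccard(rec[0], codes))
    return label


def accuracy(train, test):
    """Share of test records whose label is predicted correctly."""
    hits = sum(predict_1nn(train, codes) == label for codes, label in test)
    return hits / len(test)


# Toy records: (set of codes, binary outcome label). Purely illustrative.
real_train = [
    ({"I10", "E11.9"}, 1),
    ({"I10"}, 0),
    ({"E11.9", "N18.3"}, 1),
    ({"99213"}, 0),
]
synthetic_train = [({"E11.9"}, 1), ({"I10", "99213"}, 0)]
real_test = [({"E11.9", "I10", "N18.3"}, 1), ({"I10", "99213"}, 0)]

acc_real = accuracy(real_train, real_test)        # train on real, test on real
acc_synth = accuracy(synthetic_train, real_test)  # train on synthetic, test on real
```

The gap between `acc_real` and `acc_synth` is the quantity of interest: the smaller it is, the better the synthetic data substitutes for real data in model training.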

Privacy Preservation of real patients

MediSyn protects patient privacy

Our synthetic patient data are not mapped to any specific real patient. Furthermore, we thoroughly test all synthetic data against privacy attacks to ensure that the privacy of real patients is preserved.

Membership attacks used in our validation

A membership attack attempts to discover which real patients were in the training data. We consider two versions of membership attacks.

  • Dataset attack: Attackers model the synthetic data directly. 
  • Model attack: Attackers have direct access to the MediSyn synthetic data generator.

Attackers failed to recover patient identity

Our experimental results show that attackers are unable to identify the real patients in the training data. Their attack success probability is close to random guessing (about 0.5) in all settings.
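
To make the 0.5 baseline concrete, here is a minimal sketch of a dataset-style membership attack and how its success rate is scored. It is not MediSyn's attack implementation. It assumes the attacker flags a candidate record as a training member when it lies unusually close to some synthetic record, with closeness measured by the number of differing codes and the cutoff set at the median score. The evaluation set is balanced, so random guessing scores 0.5. All names and toy records are hypothetical, and the model attack is not covered.

```python
def nearest_distance(record, synthetic):
    """Smallest number of differing codes between `record` and any synthetic record."""
    return min(len(record ^ s) for s in synthetic)


def attack_success_rate(members, non_members, synthetic):
    """Accuracy of a distance-threshold membership attack on a balanced candidate set."""
    candidates = [(r, True) for r in members] + [(r, False) for r in non_members]
    scores = [nearest_distance(r, synthetic) for r, _ in candidates]
    threshold = sorted(scores)[len(scores) // 2]  # median score as the cutoff
    correct = sum(
        (score < threshold) == is_member  # guess "member" when unusually close
        for score, (_, is_member) in zip(scores, candidates)
    )
    return correct / len(candidates)


# Toy records, purely illustrative.
members = [{"I10", "E11.9"}, {"99213"}]                 # were in the training data
non_members = [{"I10", "N18.3"}, {"E11.9", "99213"}]    # were not
synthetic = [{"I10"}, {"99213", "E11.9", "N18.3"}]

rate = attack_success_rate(members, non_members, synthetic)
```

A rate well above 0.5 would mean that synthetic records sit measurably closer to training patients than to unseen patients, which is the leakage this kind of test is meant to detect.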

Copyright © 2023 MediSyn inc - All Rights Reserved.

