IP-Coster | WO2025163377 | TRAINING AND GENERATING SYNTHETIC DATA USING (CONTINUOUS) NORMALIZING FLOW THAT PRESERVES PRIVACY

Publication Number WO/2025/163377

Publication Date 07.08.2025

International Application No. PCT/IB2024/062750

International Filing Date 17.12.2024

Title **

[English] TRAINING AND GENERATING SYNTHETIC DATA USING (CONTINUOUS) NORMALIZING FLOW THAT PRESERVES PRIVACY

[French] ENTRAÎNEMENT ET GÉNÉRATION DE DONNÉES SYNTHÉTIQUES À L'AIDE D'UN FLUX DE NORMALISATION (CONTINU) QUI PRÉSERVE LA CONFIDENTIALITÉ

Applicants **

NEC LABORATORIES EUROPE GMBH

Inventors

ALESIANI, Francesco

CHRISTIANSEN, Henrik

PILEGGI, Giampaolo

Priority Data

24155113.4 31.01.2024 EP

Application details

International Searching Authority	EPO *
Recordal of a Change of the Applicant's Name/Address	Change of Applicant's Name and Address *
Type of Assignment	The Standard Agent's Assignment *
Applicant's Legal Status	Legal Entity *
Entry into National Phase under	Chapter I *
Patent Delivery	Send the Letters Patent by Courier *

* The data is based on automatic recognition. Please verify and amend if necessary.

** IP-Coster compiles data from publicly available sources. If this data includes your personal information, you can contact us to request its removal.

Quotation for National Phase entry

Country	Stages	Total
China	Filing, Examination, Granting	2130
EPO	Filing, Examination, Granting	9265
Japan	Filing, Examination, Granting	2242
South Korea	Filing, Examination, Granting	2134
USA	Filing, Examination, Granting	4740

Total: 20,511

Abstract

[English] A computer-implemented method for developing a differential privacy model is provided. The method includes collecting a private and personal dataset comprising private and/or personal data and training the differential privacy model via backpropagation to optimize an expected accuracy of an adversarial loss and a privacy loss. The differential privacy model is associated with a continuous normalizing flow. The method further includes outputting the trained differential privacy model. The trained differential privacy model is configured to generate new synthetic datasets that are used to train one or more downstream tasks. The method has applications including, but not limited to, use cases in medicine / healthcare such as Electronic Health Record (EHR) generation, single and bulk cell sequencing data generation, and pre-training large multimodal language models (LLMs) associated with clinical data, and can further, for example, be used to optimize machine learning tasks or to support decision making.

[French] L'invention concerne un procédé mis en œuvre par ordinateur pour développer un modèle de confidentialité différentielle. Le procédé consiste à collecter un ensemble de données privées et personnelles comprenant des données privées et/ou personnelles et à entraîner le modèle de confidentialité différentielle par rétropropagation pour optimiser une précision attendue d'une perte antagoniste et d'une perte de confidentialité. Le modèle de confidentialité différentielle est associé à un flux de normalisation continu. Le procédé consiste en outre à délivrer en sortie le modèle de confidentialité différentielle entraîné. Le modèle de confidentialité différentielle entraîné est configuré pour générer de nouveaux ensembles de données synthétiques qui sont utilisés pour apprendre une ou plusieurs tâches en aval. Le procédé comprend des applications comprenant, mais pas exclusivement, des cas d'utilisation en médecine/soins de santé tels que la génération de dossier de santé électronique (EHR), la génération de données de séquençage de cellules uniques et massives, et le pré-entraînement de grands modèles de langage multimodal (LLM) associés à des données cliniques, et peut en outre, par exemple, être utilisé pour optimiser des tâches d'apprentissage automatique ou pour soutenir une prise de décision.
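
For readers unfamiliar with the term, the following is a minimal sketch of how a continuous normalizing flow can turn noise into synthetic samples while tracking their log-density. It is not taken from the application: the linear dynamics dz/dt = a * z, the Euler integrator, and the names flow_forward, standard_normal_logpdf and sample_synthetic are all assumptions chosen for illustration, and the adversarial and privacy losses named in the abstract are not modelled here.

```python
import math
import random


def flow_forward(z0, a, t_end=1.0, steps=1000):
    """Euler-integrate the toy dynamics dz/dt = a * z from t=0 to t_end.

    A continuous normalizing flow changes the log-density along the path by
    d(log p)/dt = -trace(df/dz). For the assumed 1-D dynamics f(z) = a * z
    that trace is simply a, so the log-density drops by a * dt each step.
    """
    dt = t_end / steps
    z = z0
    delta_logp = 0.0
    for _ in range(steps):
        z += a * z * dt       # move the sample along the flow
        delta_logp -= a * dt  # accumulate the log-density change
    return z, delta_logp


def standard_normal_logpdf(z):
    """Log-density of the base distribution N(0, 1)."""
    return -0.5 * (z * z + math.log(2.0 * math.pi))


def sample_synthetic(n, a, seed=0):
    """Draw n base samples and push them through the flow.

    Returns a list of (synthetic_value, log_density) pairs.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        z0 = rng.gauss(0.0, 1.0)
        x, delta_logp = flow_forward(z0, a)
        samples.append((x, standard_normal_logpdf(z0) + delta_logp))
    return samples


if __name__ == "__main__":
    for x, logp in sample_synthetic(3, a=0.5):
        print(f"x={x:.4f}  log p(x)={logp:.4f}")
```

In a trained model the hand-picked coefficient a would be replaced by a learned network, and the training objective would have to include whatever privacy mechanism the application claims; this sketch only shows the sampling and density-tracking side.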
