Our research topics

PRIVASA’s research teams and industry partners are exploring the potential and limitations of using synthetic data in health care. Synthetic data could support research and innovation activities when real data is not available due to legal, ethical or practical constraints. Synthetic data could also be used alongside real data to speed up the early phases of development.

For most applications in health care (mainly excluding primary care), it is critical that the datasets protect the individuals’ privacy. As a result, synthetic datasets are often expected to be anonymous. Generating synthetic data involves a well-known trade-off: the synthetic datasets most similar to real data offer the least privacy protection. Conversely, synthetic datasets with little or no resemblance to real data offer the strongest privacy protection but are the least useful for analysis.

By applying different techniques to generate synthetic datasets, we are mapping the practical implications of this privacy-utility trade-off. We are also empirically testing the usability of synthetic datasets in statistical analyses and comparing their performance with that of alternative solutions, such as private queries to real data.
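As a rough illustration of the trade-off and of what a private query to real data can look like, the sketch below answers a mean query with the Laplace mechanism from differential privacy. It is not the project's method: the function names, the clipping bounds, the epsilon values and the made-up "ages" data are all assumptions chosen for the example. A smaller epsilon means stronger privacy protection and, on average, a larger error in the answer.

```python
import random
import statistics


def private_mean(values, lower, upper, epsilon, rng):
    """Laplace-mechanism estimate of the mean of values clipped to [lower, upper].

    Hypothetical helper for illustration only.
    """
    clipped = [min(max(v, lower), upper) for v in values]
    # Replacing one record changes the clipped mean by at most this much.
    sensitivity = (upper - lower) / len(clipped)
    scale = sensitivity / epsilon
    # The difference of two exponential draws is Laplace-distributed with this scale.
    noise = rng.expovariate(1 / scale) - rng.expovariate(1 / scale)
    return statistics.fmean(clipped) + noise


def mean_abs_error(values, lower, upper, epsilon, trials=2000, seed=0):
    """Average absolute error of the private mean over repeated queries."""
    rng = random.Random(seed)
    true_mean = statistics.fmean(min(max(v, lower), upper) for v in values)
    errors = [
        abs(private_mean(values, lower, upper, epsilon, rng) - true_mean)
        for _ in range(trials)
    ]
    return statistics.fmean(errors)


if __name__ == "__main__":
    data_rng = random.Random(42)
    ages = [data_rng.uniform(18, 90) for _ in range(200)]  # made-up values, not real data
    for epsilon in (0.1, 1.0, 10.0):
        # Smaller epsilon: stronger privacy, larger average error.
        print(epsilon, round(mean_abs_error(ages, 18, 90, epsilon), 3))
```

The same pattern, measuring how far an answer drifts from the real-data answer as the privacy setting tightens, can in principle be applied to statistics computed from a synthetic dataset instead of a noisy query.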

In our research, we have focused on processing numeric tabular data and medical images.


We wish to acknowledge that scientific work exceeds the limits of individual projects.

The publications are listed here based on the co-authorship of one or more PRIVASA researchers (in bold). These represent research activities planned and executed in the project or collaborative research activities within the thematic scope of the project.

Huhtanen J.T., Nyman M., Doncenco D., Hamedian M., Kawalya D., Salminen L., Sequeiros R.B., Koskinen S.K., Pudas T.K., Kajander S., Niemi P., Hirvonen J., Aronen H., Jafaritadi M. (2022). Deep learning accurately classifies elbow joint effusion in adult and pediatric radiographs. Scientific Reports 12, 11803. https://doi.org/10.1038/s41598-022-16154-x

Khan M.I., Jafaritadi M., Alhoniemi E., Kontio E., Khan S.A. (2022). Adaptive Weight Aggregation in Federated Learning for Brain Tumor Segmentation. In: Crimi A., Bakas S. (eds) Brainlesion: Glioma, Multiple Sclerosis, Stroke and Traumatic Brain Injuries. BrainLes 2021. Lecture Notes in Computer Science, vol 12963. Springer, Cham. https://doi.org/10.1007/978-3-031-09002-8_40


Differentially private synthetic tabular data generation with a generative adversarial network and privacy amplification by subsampling
Valtteri Nieminen, University of Turku (2022)

Exploring Medical Image Data Augmentation and Synthesis using conditional Generative Adversarial Networks
Dorin Doncenco, Turku University of Applied Sciences (2022)