Our research topics
PRIVASA’s research teams and industry partners are exploring the potential and limitations of using synthetic data in health care. Synthetic data could support research and innovation activities when real data is not available due to legal, ethical or practical constraints. Synthetic data could also be used alongside real data to speed up the early phases of development.
For most applications in health care (mainly excluding primary care), it is critical that the datasets protect individuals’ privacy. As a result, synthetic datasets are often expected to be anonymous. Generating synthetic data involves a well-known trade-off: the synthetic datasets that most closely resemble the real data offer the least privacy protection. Conversely, synthetic datasets with little or no resemblance to real data offer the strongest privacy protection but the least utility.
By applying different techniques to generate synthetic datasets, we are mapping the practical implications of this privacy-utility trade-off. We are also empirically testing the usability of synthetic datasets in statistical analyses and comparing their performance with alternative solutions, such as private queries to real data.
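As a rough illustration of the privacy-utility trade-off in the "private queries" setting mentioned above, the sketch below answers a mean query with the Laplace mechanism at several privacy budgets. It is not a method taken from the PRIVASA project: the function names, the simulated `ages` data, the clipping bounds and the epsilon values are all assumptions made for illustration, and the dataset size is treated as public. A smaller epsilon means stronger privacy and a noisier, less useful answer.

```python
import random
import statistics


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponential draws with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_mean(values, lower, upper, epsilon, rng):
    """Mean query with Laplace noise (illustrative, not production-grade)."""
    clipped = [min(max(v, lower), upper) for v in values]
    # Replacing one record changes the mean of n clipped values
    # by at most (upper - lower) / n. Assumes n is public.
    sensitivity = (upper - lower) / len(clipped)
    return statistics.fmean(clipped) + laplace_noise(sensitivity / epsilon, rng)


def mean_abs_error(values, lower, upper, epsilon, trials=2000, seed=0):
    """Average absolute deviation of the private answer from the clipped mean."""
    rng = random.Random(seed)
    true_mean = statistics.fmean(min(max(v, lower), upper) for v in values)
    return statistics.fmean(
        abs(private_mean(values, lower, upper, epsilon, rng) - true_mean)
        for _ in range(trials)
    )


# Hypothetical data: 500 simulated patient ages between 18 and 90.
data_rng = random.Random(42)
ages = [data_rng.randint(18, 90) for _ in range(500)]

# Smaller epsilon = stronger privacy = larger expected error.
errors_by_epsilon = {
    eps: mean_abs_error(ages, 18, 90, eps) for eps in (0.1, 1.0, 10.0)
}
for eps, err in errors_by_epsilon.items():
    print(f"epsilon={eps}: mean absolute error {err:.4f}")
```

The same pattern, measuring the error of an analysis at different privacy levels, could in principle be applied to a synthetic dataset in place of a noisy query, which is one way the two approaches can be compared.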
In our research, we have focused on processing numeric tabular data and medical images.
We wish to acknowledge that scientific work exceeds the limits of individual projects.
The publications are listed here based on the co-authorship of one or more PRIVASA researchers (in bold). These represent research activities planned and executed in the project or collaborative research activities within the thematic scope of the project.
Eisenmann M., Reinke A., Weru V., Tizabi M. D., Isensee F., Adler T. J., … Jafaritadi M., Kontio E., Khan M., … & Finzel R. (2022). Biomedical image analysis competitions: The state of current participation practice. arXiv preprint arXiv:2212.08568
Huhtanen J.T., Nyman M., Doncenco D., Hamedian M., Kawalya D., Salminen L., Sequeiros R.B., Koskinen S.K., Pudas T.K., Kajander S., Niemi P., Hirvonen J., Aronen H., Jafaritadi M. (2022). Deep learning accurately classifies elbow joint effusion in adult and pediatric radiographs. Scientific Reports 12, 11803. https://doi.org/10.1038/s41598-022-16154-x
Kaisti M., Laitala J., Wong D., Airola A. (2023). Domain randomization using synthetic electrocardiograms for training neural networks. Artificial Intelligence in Medicine. https://doi.org/10.1016/j.artmed.2023.102583
Khan M.I., Jafaritadi M., Alhoniemi E., Kontio E., Khan S.A. (2022). Adaptive Weight Aggregation in Federated Learning for Brain Tumor Segmentation. In: Crimi A., Bakas S. (eds) Brainlesion: Glioma, Multiple Sclerosis, Stroke and Traumatic Brain Injuries. BrainLes 2021. Lecture Notes in Computer Science, vol 12963. Springer, Cham. https://doi.org/10.1007/978-3-031-09002-8_40
Salmi J., Hermansson L.-L. (2022). Centralized or de-centralized data and algorithms in the Finnish health care infrastructure. 14th International Conference on eHealth.
Differentially private synthetic tabular data generation with a generative adversarial network and privacy amplification by subsampling
Valtteri Nieminen, University of Turku (2022)
Exploring Medical Image Data Augmentation and Synthesis using conditional Generative Adversarial Networks
Dorin Doncenco, Turku University of Applied Sciences (2022)
Automatic classification of cardiomegaly using deep convolutional neural network
Maral Hamedian, Turku University of Applied Sciences (2022) https://urn.fi/URN:NBN:fi:amk-2022112524037
Predicting the condition of age-related macular degeneration patients with long short-term memory [in Finnish]
Kaspar Kaasikoja, Turku University of Applied Sciences (2022)