NASK Employee Awarded Preludium 24 Grant!

04.12.2025

We are pleased to announce that MSc Eng. Jan Dubiński, an employee of NASK, has been awarded a grant in the Preludium 24 competition organized by the National Science Centre.

The research project he leads addresses a highly relevant and pressing issue – auditing data sources used to train large generative artificial intelligence models.

In today’s era of rapid AI development, questions about the transparency of model training processes and the protection of intellectual property rights are becoming increasingly common. Currently, there are no tools that allow creators, publishers, or institutions to unequivocally determine whether their data has been used in training models such as ChatGPT, DALL·E, or Stable Diffusion. Jan Dubiński’s project responds to this need by proposing scalable and practical methods for detecting traces of data usage in generative models.

The project's central hypothesis is that training data leaves subtle statistical traces in a model's behavior. The methods under development aim not only to detect whether data was used but also to estimate the extent of its use, which is crucial from the perspective of copyright law. The project also covers techniques for identifying the use of even small data collections, such as an individual artist's private portfolio, and extends the research to multimodal models that learn simultaneously from text and images.
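To make the idea of a "statistical trace" more concrete, here is a minimal illustrative sketch. It is not the project's method. It assumes that an auditor can obtain a per-sample loss from the model, and that a model tends to assign lower loss to data it was trained on than to comparable data it has never seen. The function name `permutation_p_value` and all numbers are hypothetical, and the losses are synthetic.

```python
import random


def permutation_p_value(suspect_losses, reference_losses,
                        n_permutations=2000, seed=0):
    """One-sided permutation test.

    Null hypothesis: suspect and reference losses come from the same
    distribution. A small p-value suggests the suspect data has
    systematically lower loss than the reference data.
    """
    rng = random.Random(seed)
    n_suspect = len(suspect_losses)
    n_reference = len(reference_losses)

    # Observed gap: how much lower the mean suspect loss is.
    observed = (sum(reference_losses) / n_reference
                - sum(suspect_losses) / n_suspect)

    pooled = list(suspect_losses) + list(reference_losses)
    hits = 0
    for _ in range(n_permutations):
        # Randomly reassign the "suspect" and "reference" labels.
        rng.shuffle(pooled)
        gap = (sum(pooled[n_suspect:]) / n_reference
               - sum(pooled[:n_suspect]) / n_suspect)
        if gap >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_permutations + 1)


# Synthetic stand-ins for real model losses (assumed values, not measurements).
data_rng = random.Random(42)
suspect = [data_rng.gauss(2.0, 0.5) for _ in range(200)]    # possibly seen in training
reference = [data_rng.gauss(2.3, 0.5) for _ in range(200)]  # known to be unseen

p_value = permutation_p_value(suspect, reference)
print(f"p-value: {p_value:.4f}")
```

A small p-value here would only indicate that the suspect collection behaves differently from the reference collection. Estimating how much of the collection was used, as the project intends, requires more than this sketch provides.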

The project is significant both scientifically and socially – it supports data owners, enables independent audits, and provides tools for regulators and legal professionals. In the long term, it may contribute to more ethical and transparent development of artificial intelligence. All developed methods will be released as open-source software so that creators, journalists, cultural institutions, and organizations focused on data protection can use them in practice.

We congratulate MSc Eng. Jan Dubiński on securing funding and wish him success in implementing the project!