In an EU research project, the Alexandra Institute will help develop a Germanic language model aimed at reducing Europe's dependence on English-language models like ChatGPT, which are largely driven by commercial interests in the US and therefore do not meet European standards.

Artificial intelligence is growing larger

In just a few years, English-language models like ChatGPT and Bard have grown significantly. However, these large systems carry disadvantages that the EU and other parties disfavour. They are shaped by different regulations and cultures, meaning Europeans are forced to rely on systems that do not reflect European values of human-centred, trustworthy, and democratic artificial intelligence.

"Artificial intelligence is a train that's moving, and it will affect at least 80% of the workforce. If we don't accommodate our languages, others will seize all the opportunities. We must secure our languages and further build our competencies to protect our own interests," says Torben Blach, project manager at the Alexandra Institute for the ambitious new research project TrustLLM.

Through the project, experts from the Alexandra Institute will collaborate with leading European researchers in language technology, a role that the institute, as a GTS institute, takes very seriously. The experience will contribute to the national ambition of eventually creating a Danish language model.

"We are now part of the group of key actors who have come together to develop models for the Germanic languages. Through this collaboration, we further build our competencies and get firsthand impressions of the data being collected and on which the models will be trained," says Torben Blach.

Alongside the Alexandra Institute, ten other academic institutions across the EU are participating, including the University of Copenhagen.

An open-source mindset

The project will also look at the ethical, research, and business sides of AI. Dan Saattrup Nielsen, Senior AI Specialist and PhD at the Alexandra Institute, emphasises that today's models have several limitations. The primary motivation is therefore to adopt an open-source mindset.

"We are dependent on others' data, and the models' structure is also closed, so we do not know the logic behind them. Therefore, we need to improve the models and fix the issues we experience, for example in ChatGPT. These can be biases, which our research needs to minimise during model training. We also need to minimise the number of times the models hallucinate and invent facts," says Dan Saattrup Nielsen.

The TrustLLM project runs until October 2026, with the goal of developing an open, trustworthy, and sustainable language model.