TL;DR
Portugal’s government-funded AMÁLIA LLM is operational and outperforms previous open models on European Portuguese benchmarks. However, critical questions about its openness, native-language data, and optimization goals remain unresolved, raising broader issues for Europe’s sovereign AI efforts.
Portugal’s €5.5 million investment in the AMÁLIA large language model has resulted in a functioning base version, which outperforms previous open models on European Portuguese benchmarks. Despite this progress, key structural questions about the model’s openness, native-language data, and primary objectives remain unresolved, raising broader concerns about Europe’s sovereign AI projects.
The AMÁLIA project involves approximately 60 researchers from Portugal’s leading academic institutions, including NOVA and IST. It was announced in December 2024, with the base version completed by September 30, 2025, and publicly launched on October 1, 2025. The model is currently accessible to 450,000 academic users via the FCT’s IAedu platform, with a knowledge cutoff at the end of 2023. The final version is scheduled for release in June 2026.
Technically, AMÁLIA continues the pre-training of the EuroLLM multilingual model rather than being trained from scratch. The extended pre-training covers approximately 107 billion tokens, of which about 5.8 billion come from Portuguese web archives. The architecture remains similar to EuroLLM, with minor modifications. Benchmark results show that it surpasses all previous fully open models on European Portuguese tests and beats Qwen 3-8B on most benchmarks, though it still trails Qwen on the ALBA benchmark.
Despite these achievements, critics like Duarte O.Carmo have raised fundamental questions about the model’s openness, the sufficiency of native-language data, and the strategic goals guiding its development. These questions are not yet answered publicly, highlighting a broader issue across European sovereign-LLM initiatives.
AMÁLIA
The three hard questions.
Portugal spent €5.5M to build a European Portuguese LLM. The base version is operational, and it beats Qwen 3-8B on most pt-PT benchmarks. So why are the most important questions still unanswered?
Last month, Duarte O.Carmo published the sharpest public analysis of AMÁLIA — Portugal’s state-funded European Portuguese large language model. He prefaces his critique with the necessary diplomatic apparatus before doing what almost nobody else in the European-sovereign-LLM discourse has been willing to do publicly: asking hard questions about whether the work, as released, actually does what it set out to do. This piece is a structural extension of his analysis. The AMÁLIA case study exposes three hard questions every national LLM effort needs to answer publicly — and the broader European sovereign-LLM movement has been operating without explicit answers to any of them.
Three questions every national LLM effort needs to answer publicly.
Duarte O.Carmo’s framing maps cleanly onto the structural argument. Each question lands specifically on AMÁLIA, and the broader European sovereign-LLM movement has yet to answer any of them explicitly.
The three questions depend on one another. Q3 (optimization target) determines Q2 (data volume needed), which in turn conditions Q1 (whether openness is sufficient for community contribution). The European sovereign-LLM movement collectively benefits from these questions becoming standard methodology disclosure, not exceptional critique.

107 billion tokens. 5.8 billion clearly pt-PT.
The structurally tractable question with a structurally surprising answer. For a model whose entire stated purpose is European Portuguese prioritization, the native-language share of extended pre-training is about 5.4% (5.8 billion of roughly 107 billion tokens). The implications cascade into every other question.
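The share follows directly from the two token counts quoted above. The short Python sketch below reproduces the arithmetic; the constant and function names are illustrative, and the inputs are the rounded figures from this article, so a share computed from the exact published counts may round slightly differently.

```python
# Token counts as quoted in the article (rounded, approximate figures).
TOTAL_TOKENS = 107e9   # tokens seen during extended pre-training
PT_PT_TOKENS = 5.8e9   # tokens drawn from Portuguese web archives


def native_share(native_tokens: float, total_tokens: float) -> float:
    """Return the native-language share of a token budget, in percent."""
    if total_tokens <= 0:
        raise ValueError("total_tokens must be positive")
    return 100.0 * native_tokens / total_tokens


share = native_share(PT_PT_TOKENS, TOTAL_TOKENS)
print(f"pt-PT share of extended pre-training: {share:.1f}%")
```

With these inputs the result is a little over 5.4%, which means that roughly 19 of every 20 tokens in the extended pre-training run are not clearly European Portuguese.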

The Olmo standard. AMÁLIA’s current state.
Allen Institute for AI’s Olmo project defines what “fully open” operationally requires. Olmo doesn’t lead frontier benchmarks. That’s not the point. The point is to be the structural reference for openness. AMÁLIA’s “fully open source” claim should track to the operational standard.
Four strategic positions. AMÁLIA between two and three.
More than €100M in publicly disclosed funding has gone to European sovereign-LLM work across the major initiatives. The structural question every project faces: what is the actual competitive position you’re staking? There are four options, none mutually exclusive, but each requires different commitments.

Three standards. For AMÁLIA and the movement.
The structural critique generalizes beyond AMÁLIA. Italy, France, Germany, Switzerland, the OpenEuroLLM consortium, and every subsequent national project benefit from public discourse holding national LLM efforts to operational standards on openness, data accounting, and strategic positioning.
The European sovereign-AI agenda is a serious strategic project that deserves serious public discourse. O.Carmo’s analysis is what serious public discourse looks like. Appropriately diplomatic. Structurally rigorous. Willing to ask the hard questions in public when the public investment justifies it. More of this is needed — across every European sovereign-LLM project, not just AMÁLIA.
Implications for Europe’s Sovereign AI Strategies
The development of AMÁLIA exemplifies Europe’s broader effort to create sovereign AI models, but the unresolved questions about openness, native data, and objectives reveal systemic challenges. These issues impact not only Portugal but also the credibility and future direction of European AI sovereignty, affecting policy, industry, and research collaborations across the continent.
European Sovereign LLM Efforts and Structural Challenges
Across Europe, countries like Italy, Germany, France, and Norway are investing in their own large language models, often with public funding. These projects share common structural questions: How open is ‘fully open’ in practice? How much native-language data is enough? What should be the primary goals—performance, openness, or strategic sovereignty? The case of AMÁLIA highlights that many of these efforts are still grappling with these fundamental issues, which influence their design choices and public accountability.
Portugal’s approach to building AMÁLIA as a continuation of a multilingual foundation contrasts with other models trained from scratch, reflecting different strategic priorities. The ongoing debates about data sufficiency and openness are central to Europe’s sovereign AI ambitions, yet clear answers remain elusive.
“The three questions about openness, native data, and goals are fundamental to understanding what these models can and should do.”
— Duarte O.Carmo
Unanswered Questions About AMÁLIA’s Openness and Strategy
It remains unclear how open the AMÁLIA model truly is, especially regarding access to training data, model weights, and licensing. The strategic objectives guiding its development, whether performance, sovereignty, or openness, have also not been publicly clarified. It is likewise uncertain how the model will evolve toward multimodal capabilities and whether its native-language data is sufficient.
Upcoming Milestones and Discourse on European LLMs
The final version of AMÁLIA is scheduled for release in June 2026, which will likely include further benchmarks and transparency disclosures. Over the next 12-24 months, European projects are expected to face increased scrutiny regarding openness, data strategies, and strategic goals. Public and academic debates will likely intensify, influencing policy and funding decisions across the continent.
Key Questions
What is the current status of AMÁLIA?
The base version is operational, available to 450,000 academic users via the FCT’s IAedu platform, and outperforms previous open models on European Portuguese benchmarks. The final version is expected in June 2026.
What are the main concerns about AMÁLIA?
Key concerns include the true level of openness, the sufficiency of native Portuguese data, and the strategic goals guiding its development.
How does AMÁLIA compare to other European models?
It surpasses all previous fully open models on European Portuguese benchmarks and beats Qwen 3-8B on most of them, though it still trails Qwen on the ALBA benchmark.
Why are these questions important for Europe?
They determine the credibility, strategic sovereignty, and future development paths of European AI efforts.
What should we expect next from AMÁLIA?
The final version, due in June 2026, is expected to bring further benchmarks and transparency disclosures, and to set the stage for ongoing debates about European AI sovereignty.
Source: ThorstenMeyerAI.com