Two European deep-tech companies have cracked a major bottleneck in AI deployment: they can now compress massive AI models by up to 95% while keeping them nearly as capable, making it possible to run sophisticated artificial intelligence directly on small devices without constant cloud connections. Multiverse Computing, headquartered in Spain, and Dutch chipmaker Axelera AI announced the partnership on March 18, 2026, positioning compressed AI models on Axelera's Metis and forthcoming Europa hardware platforms.

The collaboration addresses a real-world problem facing enterprises: many organizations need AI to work in places where cloud connectivity is unreliable, expensive, or simply unavailable. Think of a factory floor with spotty internet, a retail store processing video in real time, or a smart city sensor network operating during network outages. Running AI locally solves these challenges.

How Does Model Compression Actually Work, and Why Does It Matter?

Multiverse's CompactifAI technology does the heavy lifting here. The company can shrink large language models (LLMs), the AI systems behind tools like ChatGPT, by up to 95% in size while losing only 2% to 3% of their accuracy. To put that in perspective, an LLM that normally requires gigabytes of memory and specialized hardware can now fit on a smartphone, industrial controller, or edge device at a fraction of the power consumption.

This compression matters because smaller models demand less memory, require cheaper chips, and consume dramatically less energy. In environments where a constant connection to cloud infrastructure is impractical or prohibitively expensive, this efficiency shift changes the entire economics of AI deployment.

What Real-World Problems Does This Partnership Solve?

The integration of Multiverse's compressed models into Axelera's hardware platforms enables workloads that normally require "datacenter-class infrastructure" to run on compact devices.
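As a back-of-the-envelope illustration of what a 95% size reduction means in practice, consider the memory footprint of a hypothetical 7-billion-parameter model stored at 16-bit precision. The model size and precision here are assumptions chosen for the example, not figures from the announcement:

```python
# Illustrative arithmetic only: how a 95% compression changes the memory
# footprint of a hypothetical LLM. The 7B parameter count and fp16
# precision are assumed example values, not Multiverse/Axelera figures.

def model_memory_gb(n_params: float, bytes_per_param: int = 2) -> float:
    """Memory needed just to hold the weights, in gigabytes."""
    return n_params * bytes_per_param / 1e9

baseline = model_memory_gb(7e9)      # 7B-parameter model at fp16 -> 14.0 GB
compressed = baseline * (1 - 0.95)   # after a 95% size reduction -> 0.7 GB

print(f"baseline:   {baseline:.1f} GB")
print(f"compressed: {compressed:.2f} GB")
```

At roughly 14 GB, the uncompressed weights exceed the memory of most edge devices; at under 1 GB, they fit comfortably alongside an application, which is the shift the article describes.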
The companies are targeting several specific use cases:

- Industrial Operations: Factories and manufacturing plants can run computer vision systems and predictive maintenance AI without relying on cloud uploads, keeping sensitive operational data on-site and maintaining uptime even during network failures.
- Retail Environments: Stores can process customer behavior, inventory management, and security footage locally, reducing bandwidth costs and enabling real-time decision-making at the checkout and shelf level.
- Mobility and Autonomous Systems: Vehicles and robots can make split-second decisions using local AI inference, which is critical for safety-sensitive applications where the latency of cloud round-trips is unacceptable.
- Defense and Smart Cities: Government and municipal deployments can maintain data sovereignty while running advanced AI analytics across distributed sensor networks.

Enrique Lizaso, co-founder and CEO of Multiverse Computing, explained the strategic vision: "Our mission is to make state-of-the-art AI radically more efficient and accessible. By combining Multiverse's advanced compressed AI models with Axelera's high-performance edge platforms, we can bring powerful reasoning capabilities to devices where latency, privacy and energy consumption are critical."

Why Is On-Device Fine-Tuning a Game-Changer?

Beyond running inference (the process of using a trained model to make predictions), the partnership also enables on-device fine-tuning. Fine-tuning means adjusting a pre-trained model with new data specific to a particular organization or use case, without sending that sensitive data to the cloud.

Historically, fine-tuning at the edge has been technically demanding because it requires significant memory and computing power. Compression changes this equation: by reducing the base model size, organizations need less headroom to adjust and customize the model locally, making fine-tuning feasible on resource-constrained devices.
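The general idea behind affordable local fine-tuning can be sketched with a low-rank adapter (LoRA-style), in which the base weights stay frozen and only two small factor matrices are trained. This is a generic illustration of parameter-efficient tuning, not CompactifAI's actual (proprietary) method, and the layer dimensions are arbitrary example values:

```python
# Sketch of parameter-efficient fine-tuning with a low-rank adapter.
# Only the small matrices A and B are trainable; the large base weight
# W stays frozen. Dimensions are illustrative, not from the article.
import numpy as np

d_in, d_out, rank = 4096, 4096, 8

# Frozen base weight: never updated, so it needs no gradient/optimizer state.
W = np.random.randn(d_out, d_in).astype(np.float32)

# Trainable low-rank factors; A starts at zero so the initial delta is zero.
A = np.zeros((rank, d_in), dtype=np.float32)
B = (np.random.randn(d_out, rank) * 0.01).astype(np.float32)

def forward(x: np.ndarray) -> np.ndarray:
    # Effective weight is W + B @ A, applied without materializing it.
    return W @ x + B @ (A @ x)

full_params = W.size              # parameters updated by full fine-tuning
adapter_params = A.size + B.size  # parameters updated with the adapter
print(f"trainable fraction: {adapter_params / full_params:.4%}")
```

Here the adapter trains well under 1% of the layer's parameters, which is why a smaller compressed base model leaves enough headroom for customization on a memory-constrained edge device.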
This capability is especially valuable for enterprises handling regulated data. A healthcare provider can fine-tune a medical AI model on patient records without ever uploading that information off-premises. A financial institution can customize fraud detection without exposing transaction data to third parties.

How Does This Fit Into Europe's AI Strategy?

Both companies are backed by the European Innovation Council, and they have explicitly framed this partnership around European technological sovereignty. The idea of a "sovereign" AI stack has gained prominence as governments and enterprises reassess their dependence on US hyperscalers and non-European chip supply chains.

Ekaterina Zaharieva, Commissioner for Startups, Research and Innovation, underscored the broader significance: "Europe's competitiveness in the next decade will depend on our ability to combine world-class chips with trustworthy, efficient AI. Collaborations like the one between Multiverse Computing and Axelera AI, both supported significantly by the European Innovation Council and its Fund, show how European deep-tech companies, when connected to each other, work together to deliver sovereign strategic digital technologies that are developed and scaled in Europe while serving global markets."

The partnership directly addresses European policy goals around reducing geopolitical exposure and supply chain risk. By developing and manufacturing both the compressed AI models and the hardware acceleration technologies within Europe, the companies aim to reduce dependence on non-European infrastructure while empowering regional industry and public institutions.

What Are the Practical Benefits for Organizations?
The collaboration enables three concrete advantages for customers deploying AI at scale:

- Ultra-Efficient Inference: Run complex AI models on low-power edge devices with dramatically reduced energy consumption, lowering operational costs and enabling deployment in power-constrained environments like remote sensors or battery-powered devices.
- Local Fine-Tuning: Customize models with proprietary data without uploading sensitive information to the cloud, preserving privacy and ensuring compliance with regulations like GDPR or HIPAA.
- Cost-Effective Fleet Scaling: Deploy AI across large numbers of devices at a lower cost per unit, since each device requires less powerful (and therefore cheaper) hardware to run the compressed models effectively.

Fabrizio Del Maffeo, co-founder and CEO of Axelera AI, noted the expanded opportunity: "Axelera AI is committed to delivering the most powerful and efficient AI inference solutions to the world. Enabling Multiverse Computing's compressed AI models to run on our Metis and future Europa platforms will unlock new classes of applications for our customers, from industrial and retail to mobility, defense, smart cities and more."

What Happens Next?

The companies have completed the technical integration phase and are now moving into a dedicated commercialization phase. Multiverse, headquartered in Donostia-San Sebastian in Spain with offices across Europe, the US, and Canada, will work with Axelera, based in Eindhoven with staff across several European countries plus Taiwan and the US, to bring the combined solution to market.

The timing is significant. As agentic AI systems (autonomous AI agents that can reason and make decisions independently) become more prevalent, edge inference is becoming increasingly critical.
A separate report from InterDigital and ABI Research warns that agentic AI will generate continuous upstream data from smart glasses, wearables, smartphones, and IoT devices, potentially straining mobile networks unless intelligence is distributed across devices and edge infrastructure. The Multiverse-Axelera partnership directly addresses this architectural shift by making it practical to embed powerful AI reasoning at the edge.