With private business failures now above 9.2%, a level unseen in recent memory, venture capital firm Lux Capital recently advised companies that rely on AI to get their compute commitments in writing. With financial instability rippling through the AI supply chain, Lux warned, a verbal understanding is not enough.
Yet there is a very different alternative: ending reliance on external computing infrastructure altogether. Smaller AI models that run directly on a user’s own device, with no data center, no cloud provider, and no counterparty risk, are becoming capable enough to warrant consideration. And Multiverse Computing is putting its hand up.
The Spanish startup has so far kept a lower profile than some of its peers, but that is changing as demand for AI efficiency grows. After compressing models from major AI labs including OpenAI, Meta, DeepSeek, and Mistral AI, it has launched both an app that showcases its compressed models and an API portal, a gateway that lets developers access and build with those models, making them more widely available.
The CompactifAI app, which shares its name with Multiverse’s quantum-inspired compression technology, is an AI chat tool similar to ChatGPT or Mistral’s Le Chat. You ask a question, and the model answers. The difference is that Multiverse has embedded Gilda, a model small enough to run locally and offline, according to the company.
For end users, this offers a glimpse of edge AI, where data stays on their devices and no network connection is needed. There is a caveat, however: their phones must have enough RAM and storage. If they do not, and many older iPhones will not, the app falls back to cloud-based models via an API. Routing between local and cloud processing is handled automatically by a system Multiverse has named Ash Nazg, a name Tolkien fans will recognize as a reference to the One Ring inscription in “The Lord of the Rings.” But when the app routes to the cloud, it loses its main privacy advantage.
These constraints mean CompactifAI is not yet ready for mass consumer adoption, although that may never have been the main goal. According to Sensor Tower data, the app had fewer than 5,000 downloads over the past month.
The actual target audience comprises businesses. Today, Multiverse is introducing a self-service API portal that grants developers and enterprises direct access to its compressed models — bypassing the need for AWS Marketplace.
“The CompactifAI API portal offers developers immediate access to condensed models, along with the necessary transparency and governance to deploy them in production environments,” CEO Enrique Lizaso said in a statement.
Real-time usage monitoring is one of the API’s main features, and that is by design. Beyond the potential benefits of edge deployment, lower compute costs are a major reason enterprises are considering smaller models as an alternative to large language models (LLMs).
It also helps that small models are less limited than they once were. Earlier this week, Mistral updated its family of smaller models with the launch of Mistral Small 4, which it says is simultaneously optimized for general chat, coding, agentic tasks, and reasoning. The French company also launched Forge, a system that lets enterprises build custom models, including small ones for which they can choose the trade-offs their use cases can best tolerate.
Multiverse’s latest results also suggest that the gap with LLMs is narrowing. Its newest compressed model, HyperNova 60B 2602, is built on gpt-oss-120b, an OpenAI model whose weights are publicly available. The company says HyperNova now delivers faster responses at lower cost than the original it was derived from, an advantage that matters especially for agentic coding workflows, where AI autonomously carries out complex, multi-step programming tasks.
Making models small enough to run on mobile devices while keeping them useful is a major challenge. Apple Intelligence sidestepped that problem by combining an on-device model with a cloud model. Multiverse’s CompactifAI app can also route requests to gpt-oss-120b via an API, but its main purpose is to show that local models such as Gilda and its future successors offer advantages beyond cost savings.
For professionals in critical sectors, a model that can run locally without connecting to the cloud offers more privacy and resilience. But the bigger value lies in the business use cases this can enable, such as embedding AI in drones, satellites, and other settings where connectivity cannot be taken for granted.
The company already serves over 100 international clients, including the Bank of Canada, Bosch, and Iberdrola, but expanding its customer base could assist in securing additional capital. After raising a $215 million Series B last year, it is now rumored to be pursuing a fresh €500 million investment round at a valuation surpassing €1.5 billion.

