Anyway

Self-hosted AI platform for private, on-premise LLM inference

Anyway orchestrates on-premise hardware into a distributed inference cluster supporting any Hugging Face model. It deploys as a standalone app, Docker image, or Kubernetes pod and provides an OpenAI-compatible REST API endpoint, giving organizations full model control and fixed operational costs with no per-token fees.
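
Since the endpoint is OpenAI-compatible, any standard HTTP client can talk to it. The sketch below is illustrative only: the host, port, and model name are assumptions for the example, not Anyway's documented defaults.

```python
# Minimal sketch: querying a self-hosted, OpenAI-compatible endpoint.
# The URL and model name are assumptions for illustration; substitute
# the values of your own deployment.
import requests

resp = requests.post(
    "http://localhost:8000/v1/chat/completions",  # assumed local endpoint
    json={
        "model": "meta-llama/Llama-3.1-8B-Instruct",  # any model the cluster serves
        "messages": [
            {"role": "user", "content": "Summarise our on-premise deployment options."}
        ],
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```

Because the request shape matches OpenAI's API, existing client libraries and tooling work unchanged against the local endpoint; only the base URL differs, and no per-token fee applies.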

2026 Proposal · Decentralized · Large Language Model · Machine Learning
Key facts
  • Maturity: Proposal
  • Support: C4DT: Active · Lab: Unknown
  • Presentation
  • C4DT work

Today, Large Language Models (LLMs) and other large Machine Learning (ML) models take center stage. These models can now be trained for specific, customized solutions. But running them, that is, doing inference on new data, still requires access to a large datacenter.

What if a company or an organization doesn't have access to a datacenter, or if the input data is too confidential to leave the premises? We propose to run the inference on on-premise servers, keeping the data secure and under the organization's control.

Our Solution

We fully automate and optimize the distributed deployment of ML models for training and inference, dynamically leveraging on-premise servers. Our high-performance, secure solution is ideal for companies seeking local ML usage with sovereignty and scalability.

The Unique Selling Points of our solution are:
  • Simplicity – Clients can focus on their business applications while our solution transparently handles distributed deployment.
  • Efficiency – Clients can utilize existing machines, maximizing available computing power even across heterogeneous hardware (a partitioning sketch follows this list).
  • Scalability – Large models can be run locally by pooling several machines into a single cluster.
  • Privacy – Our solution enables organizations to leverage AI's power locally without relying on untrusted providers.
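
To make the efficiency point concrete, here is a minimal sketch of one way a model's layers could be spread over mixed machines. The names (`Node`, `partition_layers`) and the memory-proportional heuristic are hypothetical illustrations, not Anyway's actual placement algorithm.

```python
# Rough sketch of the "Efficiency" point: splitting a model's layers
# across heterogeneous on-premise machines in proportion to each
# machine's available memory. Names here are illustrative only.
from dataclasses import dataclass

@dataclass
class Node:
    name: str
    memory_gb: float  # memory available for model weights

def partition_layers(num_layers: int, nodes: list[Node]) -> dict[str, range]:
    """Assign contiguous layer ranges to nodes, weighted by memory."""
    total = sum(n.memory_gb for n in nodes)
    assignment: dict[str, range] = {}
    start = 0
    for i, node in enumerate(nodes):
        # The last node takes whatever remains, so every layer is placed.
        if i == len(nodes) - 1:
            count = num_layers - start
        else:
            count = round(num_layers * node.memory_gb / total)
        assignment[node.name] = range(start, start + count)
        start += count
    return assignment

if __name__ == "__main__":
    cluster = [Node("gpu-server", 48), Node("workstation", 24), Node("old-box", 8)]
    for name, layers in partition_layers(32, cluster).items():
        print(f"{name}: layers {layers.start}-{layers.stop - 1}")
```

A production scheduler would also weigh compute throughput and interconnect bandwidth, but even this memory-proportional split shows how mixed hardware can be pooled rather than left idle.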

Distributed Computing Lab

Prof. Rachid Guerraoui

The Distributed Computing Lab currently focuses on scalable implementations of cryptocurrencies, Byzantine fault tolerance and privacy in distributed machine learning, and distributed algorithms that make use of RDMA and NVRAM.
