Technical design – impressive safety and performance
- 100% hosting in Germany
ISO 27001-certified data centers, GDPR-compliant, no data transfers abroad. - Open-source technology stack
Transparent models and systems (ROCm, Open WebUI, OpenSearch) — comprehensible, expandable, future-proof. - API compatibility with OpenAI
Drop-in replacement: Continue using your existing integrations immediately by replacing the API endpoint. - High-performance hardware
AMD W7900 GPUs (48 GB VRAM) in the GPU cluster (192 GB VRAM) – ideal for LLM inference, fine-tuning, and complex workloads. - Stateless architecture
No data persistence outside the GPU – maximum data sovereignty. - Optional: Dedicated single-tenant GPU infrastructure
Your LLMs or AI&ML stack run exclusively on isolated hardware — no mixing with other customers.
Transparent contract details – AI Flat at a glance:
| Categories | Contents / Details |
|---|---|
| Price & billing |
|
| API & Models |
|
| Hosting & Data Protection |
|
| Support & SLA |
|
| Use & Liability |
|
| Optional extensions |
|
Overview of usable LLMs:
| Modell | Origin | license | Area of application (strengths) | Quantization | Context length |
|---|---|---|---|---|---|
| gpt-oss:20b | OpenAI | Apache 2.0 | Powerful all-rounder for text generation and chatbots; excellent ratio of computing power to quality. | MXFP4 | 128k Tokens |
| gpt-oss:120b | OpenAI | Apache 2.0 | Premium model for complex tasks, long documents, and precise answers; ideal for enterprise use cases. | MXFP4 | 128k Tokens |
| dots.mocr | Red Note (hilab) | MIT | Multimodal Vision-Language Model (VLM) for advanced OCR; excellent at parsing multilingual documents and directly converting structured graphics (diagrams, scientific formulas, UI) into SVG code. | FP16 / BF16 | Variable (depends on image) |
| Nomic-embed-text:v1.5 | Nomic | Apache 2.0 | Text embedding model with very high density and performance; good for semantic search and retrieval. | F16 | 2k Tokens |
| bge-m3:567m | Beijing Academy of Artificial Intelligence | MIT | Compact and high-performance embedding model; ideal for fast vector search, knowledge retrieval, and RAG. | F16 | 8k Tokens |
| embeddinggemma:300m | Gemma Terms of Use | Lightweight embedding model; very resource-efficient, good for applications with a small footprint. | BF16 | 2k Tokens |
Solution 1:
AI Flat (Flatrate)
Simply book KI Flat as a flat rate and benefit from immediate API access to powerful LLMs. Shared GPU infrastructure, stateless processing – your data is not stored or cached.
Solution 2:
Dedicated AI solution
For maximum performance: your own dedicated GPU infrastructure. 100% computing power for your models, completely isolated and optimized for demanding workloads.
Solution 3:
Consulting & Workshop
Our experts work with you to develop the optimal AI solution for your individual requirements. From strategy and workflow design to implementation—practical and solution-oriented!

Björn Langer
Let's talk about your project!
As your first technical contact, I am committed to understanding your individual requirements and developing tailor-made approaches for the successful operation of your application. Our exchange also enables us to determine an initial price indication for your individual project. I look forward to talking to you!
Contact options
Technical Foundation: Performance with NVIDIA Power
AI Flat is based on a GPU cluster featuring NVIDIA RTX 6000 Blackwell graphics cards. Each GPU is equipped with an impressive 96 GB of high-speed VRAM (GDDR7) and cutting-edge fifth-generation Tensor Cores. This massive memory capacity allows even the largest Large Language Models (LLMs) to be held entirely within the local GPU memory, ensuring ultra-low latency and maximum throughput for complex workloads.
The architecture is intentionally designed to be stateless: inputs are processed exclusively within the GPU memory and are never persisted or offloaded to external caches. Your data remains under your control at all times, protected against unwanted storage.
By leveraging CUDA, NVIDIA's world-leading software stack, Flying Circus relies on the industry standard for AI acceleration. The result is a future-proof platform that guarantees maximum compatibility and support for the latest precision formats (such as FP4), enabling the most efficient deployment of modern AI models.
AI for companies and the public sector -
sicher, flexibel und sorgenfrei
Implement powerful open source AI models in your application without taking data protection risks.The Flying Circus offers a fully managed private AI environment, hosted in ISO 27001-certified data centres in Germany - 100% GDPR-compliant and optimised for secure enterprise use.
Your advantage: An AI solution
without compromises
If you need a reliable AI solution that can be seamlessly integrated into your operations, we make it easy for you. We take over the complete management of the software and infrastructure for you so that you can concentrate on your core business. To ensure maximum security, we offer you dedicated AI server infrastructure. Our single-tenant approach ensures that your systems remain strictly separated and your data is under your control at all times - without mixing with other customers. Our customised Service Level Agreements (SLAs) guarantee you optimal application availability and adaptation to your specific requirements. You concentrate on your business - we take care of the smart AI solution that supports you.
From the idea to ready-to-use AI
Your customised solution
The use of AI must be well thought out - and this is exactly where we support you. Regardless of whether it is retrieval augmented generation (RAG), machine learning or more - we support you in the development of your own AI application and customise the method to your individual project requirements. In the planning phase, we analyse your requirements together, select the optimal model and tailor the AI hardware precisely to your needs. During the realisation phase, we implement the solution for you, enable initial tests and you use existing language models precisely for your use case - with tight feedback cycles for maximum efficiency. As soon as everything is running smoothly, we take over the hosting and secure operation of your AI during the operating phase. Access is convenient via an intuitive graphical web interface or an OpenAI-compatible API - smoothly and perfectly integrated into your systems. Simply an AI solution with maximum security and data protection from the Flying Circus.
Funded by the State of Saxony-Anhalt – for digital sovereignty
With its "Saxony-Anhalt Digital 2030" strategy, the state is pursuing the goal of establishing a strong digital economy and a modern, future-proof administration. Artificial intelligence plays a central role in this – it opens up enormous potential, but also brings challenges: the protection of personal data, the traceability of decision-making processes, and control over where and how data is processed.
Our goal is to strengthen digital self-determination in Saxony-Anhalt in a sustainable manner: We want to enable organizations to use data and technologies in a way that complies with European legislation – traceable, auditable, and compliant. At the same time, we promote the development of innovative applications based on open standards that strengthen Saxony-Anhalt as a digital location in the long term.
A data privacy-compliant solution -
secure AI in the administration
Artificial intelligence (AI) can revolutionise public administration by automating routine tasks, making decisions easier and improving citizen service. With our data protection-compliant AI approach, administrative processes can be optimised without putting sensitive data at risk.
For example, our AI solutions make it possible to check and prioritise applications, automatically classify documents and monitor deadlines. Virtual assistants (‘office agents’) can support citizens and employees with various issues, while chatbots serve as a natural language interface. Our approach is to make administrations more efficient and transparent by using data protection-compliant AI solutions to improve services for citizens. We are convinced that our dedicated AI solution can actively support the digital transformation of public administrations.