# AI Model Serving

## OpenAI-Compatible API Surface
Our service provides OpenAI-compatible implementations of commonly used API endpoints, making it easy to connect existing SDKs, clients, and applications to our self-hosted models with minimal changes.
### Supported endpoints
| Endpoint | Method | Purpose | Documentation |
|---|---|---|---|
| /models | GET | List available models | openai.com |
| /responses | POST | Unified text and structured generation endpoint | openai.com |
| /chat/completions | POST | Chat-based completions with standard or streaming responses | openai.com |
| /embeddings | POST | Generate vector embeddings for semantic search and retrieval | openai.com |
| /rerank | POST | Rank documents by relevance to a query | jina.ai |
These endpoints are designed to follow common OpenAI request and response schemas where practical, allowing reuse of existing integrations and tooling.
Feature availability may vary depending on the selected model or backend runtime.
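Because the wire format follows OpenAI's schemas, existing integrations usually only need a different base URL and API key. The sketch below uses only the Python standard library to construct (but not send) a chat completion request in that format; the endpoint and payload shapes mirror the curl examples in Getting Started.

```python
import json
import urllib.request

BASE_URL = "https://api.ai.net.de/v1"
API_KEY = "YOUR_API_KEY"  # replace with your real key

def build_chat_request(model: str, user_message: str) -> urllib.request.Request:
    """Construct (but do not send) an OpenAI-style chat completion request."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_chat_request("gpt-oss-120b", "Hello!")
print(req.full_url)  # https://api.ai.net.de/v1/chat/completions
```

Sending the request with `urllib.request.urlopen(req)` (or any OpenAI-compatible SDK pointed at the same base URL) then works without further changes.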
## Data Security & Privacy
We operate under a Zero Data Retention (ZDR) model for customer request payloads.
- Request content is processed in memory only
- Payload data is not stored after request completion
- Processing remains within our own infrastructure in Hanover, Germany
Operational metadata such as usage metrics, billing records, and service telemetry may be retained as required for platform operations.
## Base URL

Use the following API base URL: `https://api.ai.net.de/v1`
> **Note:** All API calls require a valid API key. If you need a key, request one via our support channels (see Support).
## Supported models
The service currently integrates the following models. We continuously add and update models, so the table may evolve over time.
| Class | Model ID | Description | HF Repository |
|---|---|---|---|
| Premium | gpt-oss-120b | GPT‑style large language model. | openai/gpt-oss-120b |
| Premium | Qwen3.6-27B | Coding and agent workflow model. | Qwen/Qwen3.6-27B |
| Standard | LightOnOCR-2-1B | OCR model for multilingual document text extraction. | lightonai/LightOnOCR-2-1B |
| Standard | bge-reranker-v2-m3 | Cross-encoder reranker model for relevance scoring. | BAAI/bge-reranker-v2-m3 |
| Standard | bge-m3 | Multilingual embedding model for semantic similarity search. | BAAI/bge-m3 |
## Getting Started

### curl
#### Authentication

Authenticate using your API key via the Authorization header.

```shell
export API_KEY="YOUR_API_KEY"
```
#### List Models

```shell
curl https://api.ai.net.de/v1/models \
  -H "Authorization: Bearer $API_KEY"
```
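A `/models` response in the OpenAI style is a list object; a short sketch of extracting the model IDs (the sample payload here is illustrative, not actual service output):

```python
import json

# Illustrative sample in the OpenAI-style list schema; real output may differ.
sample = json.loads("""
{
  "object": "list",
  "data": [
    {"id": "gpt-oss-120b", "object": "model"},
    {"id": "bge-m3", "object": "model"}
  ]
}
""")

model_ids = [m["id"] for m in sample["data"]]
print(model_ids)  # ['gpt-oss-120b', 'bge-m3']
```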
#### Responses

```shell
curl https://api.ai.net.de/v1/responses \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-oss-120b",
    "input": "Write a three sentence poem about datacenters."
  }'
```
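The Responses API returns generated text inside a list of output items. Assuming the OpenAI-style output shape (message items containing `output_text` content parts), a helper like this collects the text; the sample payload is illustrative:

```python
def extract_output_text(response: dict) -> str:
    """Concatenate output_text fragments from an OpenAI-style Responses payload."""
    parts = []
    for item in response.get("output", []):
        if item.get("type") != "message":
            continue
        for chunk in item.get("content", []):
            if chunk.get("type") == "output_text":
                parts.append(chunk.get("text", ""))
    return "".join(parts)

# Illustrative sample, not actual model output.
sample = {
    "output": [
        {
            "type": "message",
            "content": [{"type": "output_text", "text": "Racks hum softly."}],
        }
    ]
}
print(extract_output_text(sample))  # Racks hum softly.
```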
#### Chat Completion

```shell
curl https://api.ai.net.de/v1/chat/completions \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen3.6-27B",
    "messages": [
      { "role": "user", "content": "Write Hello, world! in Go" }
    ]
  }'
```
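In the standard chat completion response schema, the assistant's reply lives at `choices[0].message.content`. A defensive extraction sketch (the sample payload is illustrative):

```python
def first_message(response: dict) -> str:
    """Return the assistant text of the first choice, or '' if absent."""
    choices = response.get("choices", [])
    if not choices:
        return ""
    return choices[0].get("message", {}).get("content", "") or ""

# Illustrative sample, not actual model output.
sample = {
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": 'fmt.Println("Hello, world!")'},
        }
    ]
}
print(first_message(sample))
```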
#### Embeddings

```shell
curl https://api.ai.net.de/v1/embeddings \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "bge-m3",
    "input": "Private AI infrastructure"
  }'
```
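For semantic search, embedding vectors are typically compared with cosine similarity. A minimal standard-library sketch (the vectors here are toy values, not real embeddings):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Toy vectors for illustration; real bge-m3 embeddings are much longer.
v1 = [0.1, 0.3, 0.5]
v2 = [0.1, 0.3, 0.5]
v3 = [0.5, -0.2, 0.0]
print(round(cosine_similarity(v1, v2), 3))  # 1.0
```

Identical vectors score 1.0; unrelated texts score closer to 0.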
#### Rerank

```shell
curl https://api.ai.net.de/v1/rerank \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "bge-reranker-v2-m3",
    "query": "AI hosting in Germany",
    "documents": [
      "Managed cloud hosting",
      "Private AI infrastructure in Hanover",
      "Consumer chatbot tools"
    ]
  }'
```
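A Jina-style rerank response returns a `results` list pairing each document's `index` with a `relevance_score`. The sketch below reorders the original documents by score; the scores are illustrative, not actual model output:

```python
def rank_documents(documents, results):
    """Order documents by descending relevance_score from a rerank response."""
    ordered = sorted(results, key=lambda r: r["relevance_score"], reverse=True)
    return [documents[r["index"]] for r in ordered]

documents = [
    "Managed cloud hosting",
    "Private AI infrastructure in Hanover",
    "Consumer chatbot tools",
]
# Illustrative scores, not actual model output.
results = [
    {"index": 0, "relevance_score": 0.42},
    {"index": 1, "relevance_score": 0.91},
    {"index": 2, "relevance_score": 0.08},
]
print(rank_documents(documents, results)[0])  # Private AI infrastructure in Hanover
```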
### OpenAI Codex

Edit `~/.codex/config.toml`:

```toml
[profiles.netde]
model = "gpt-oss-120b"
model_provider = "netde"
web_search = "disabled"

[model_providers.netde]
name = "api.ai.net.de"
base_url = "https://api.ai.net.de/v1"
wire_api = "responses"
experimental_bearer_token = "YOUR_API_KEY"
```
Start with:

```shell
codex -p netde
```
### Opencode

Edit `~/.config/opencode/opencode.json`:

```json
{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "llama.cpp": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "api.ai.net.de",
      "options": {
        "baseURL": "https://api.ai.net.de/v1",
        "apiKey": "YOUR_API_KEY"
      },
      "models": {
        "Qwen3.6-27B": {
          "name": "Qwen3.6-27B"
        }
      }
    }
  }
}
```
## FAQ
Q: How does the service ensure data security and retention?
A: We follow a Zero Data Retention (ZDR) policy. Any data sent to the AI endpoints is processed in‑memory only and is not stored after the request completes. All processing happens within our own datacenter in Hanover, Germany, ensuring that data never leaves our controlled infrastructure. Operational metadata such as usage metrics, billing records, and service telemetry may be retained as required for platform operations.
Q: Is an API key required to use the AI model serving endpoints?
A: Yes. All requests to the API require a valid API key. If you do not have one, you can request it via our support channels. See the list of support options in Support.
Q: Can I request multiple API keys to differentiate between different applications?
A: Yes. Reach out through our support channels to request additional API keys.
Q: Is billing done on a per‑API‑key basis?
A: Yes. Usage is tracked per API key and billed accordingly.
Q: Can I request a specific model that is not listed?
A: Yes. Reach out through our support channels to request additional models. We evaluate demand and will add new models when feasible.
## Need Help?
For onboarding, API access, pricing, or custom model requests, see Support.