AI Model Serving

OpenAI-Compatible API Surface

Our service provides OpenAI-compatible implementations of commonly used API endpoints, making it easy to connect existing SDKs, clients, and applications to our self-hosted models with minimal changes.

Supported endpoints

Endpoint            Method  Purpose                                                        Documentation
/models             GET     List available models                                          openai.com
/responses          POST    Unified text and structured generation endpoint                openai.com
/chat/completions   POST    Chat-based completions with standard or streaming responses   openai.com
/embeddings         POST    Generate vector embeddings for semantic search and retrieval  openai.com
/rerank             POST    Rank documents by relevance to a query                         jina.ai

These endpoints are designed to follow common OpenAI request and response schemas where practical, allowing reuse of existing integrations and tooling.
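
For example, recent versions of the official OpenAI SDKs (Python and JavaScript) read their endpoint and key from environment variables, so they can usually be pointed at the service without code changes:

export OPENAI_BASE_URL="https://api.ai.net.de/v1"   # use the service instead of api.openai.com
export OPENAI_API_KEY="YOUR_API_KEY"                # the key issued for this service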

Feature availability may vary depending on the selected model or backend runtime.

Data Security & Privacy

We operate under a Zero Data Retention (ZDR) model for customer request payloads.

  • Request content is processed in memory only
  • Payload data is not stored after request completion
  • Processing remains within our own infrastructure in Hanover, Germany

Operational metadata such as usage metrics, billing records, and service telemetry may be retained as required for platform operations.

Base URL

Use the following API base URL: https://api.ai.net.de/v1

Note

All API calls require a valid API key. If you need a key, request one via our support channels (see Support).

Supported models

The service currently integrates the following models. We continuously add and update models, so the table may evolve over time.

Class     Model ID            Description                                                    HF Repository
Premium   gpt-oss-120b        GPT-style large language model.                                openai/gpt-oss-120b
Premium   Qwen3.6-27B         Coding and agent workflow model.                               Qwen/Qwen3.6-27B
Standard  LightOnOCR-2-1B     OCR model for multilingual document text extraction.           lightonai/LightOnOCR-2-1B
Standard  bge-reranker-v2-m3  Cross-encoder reranker model for relevance scoring.            BAAI/bge-reranker-v2-m3
Standard  bge-m3              Multilingual embedding model for semantic similarity search.   BAAI/bge-m3

Getting Started

curl

Authentication

Authenticate by sending your API key as a Bearer token in the Authorization header:

export API_KEY="YOUR_API_KEY"

List Models

curl https://api.ai.net.de/v1/models \
  -H "Authorization: Bearer $API_KEY"

Responses

curl https://api.ai.net.de/v1/responses \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-oss-120b",
    "input": "Write a three sentence poem about datacenters."
  }'
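
Assuming the backend mirrors OpenAI's Responses schema, where output is an array of message items containing output_text parts, the generated text can be extracted with jq:

curl -s https://api.ai.net.de/v1/responses \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-oss-120b",
    "input": "Write a three sentence poem about datacenters."
  }' | jq -r '.output[] | select(.type == "message") | .content[] | select(.type == "output_text") | .text'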

Chat Completion

curl https://api.ai.net.de/v1/chat/completions \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen3.6-27B",
    "messages": [
      { "role": "user", "content": "Write Hello, world! in Go" }
    ]
  }'
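
The endpoint table above lists streaming support. If the selected backend provides it, setting "stream": true returns server-sent events in the usual OpenAI format; curl's -N flag disables buffering so tokens print as they arrive:

curl -N https://api.ai.net.de/v1/chat/completions \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen3.6-27B",
    "stream": true,
    "messages": [
      { "role": "user", "content": "Write Hello, world! in Go" }
    ]
  }'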

Embeddings

curl https://api.ai.net.de/v1/embeddings \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "bge-m3",
    "input": "Private AI infrastructure"
  }'
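
The OpenAI embeddings schema also accepts an array of strings as input; if the backend follows it, several texts can be embedded in a single request:

curl https://api.ai.net.de/v1/embeddings \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "bge-m3",
    "input": ["Private AI infrastructure", "Managed cloud hosting"]
  }'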

Rerank

curl https://api.ai.net.de/v1/rerank \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "bge-reranker-v2-m3",
    "query": "AI hosting in Germany",
    "documents": [
      "Managed cloud hosting",
      "Private AI infrastructure in Hanover",
      "Consumer chatbot tools"
    ]
  }'
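
In the Jina-style rerank schema, the response carries a results array whose entries pair an index into the submitted documents with a relevance_score; a jq filter makes the ranking easy to read (field names are an assumption based on that schema):

curl -s https://api.ai.net.de/v1/rerank \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "bge-reranker-v2-m3",
    "query": "AI hosting in Germany",
    "documents": [
      "Managed cloud hosting",
      "Private AI infrastructure in Hanover",
      "Consumer chatbot tools"
    ]
  }' | jq '.results[] | {index, relevance_score}'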

OpenAI Codex

Edit ~/.codex/config.toml

[profiles.netde]
model = "gpt-oss-120b"
model_provider = "netde"
web_search = "disabled"

[model_providers.netde]
name = "api.ai.net.de"
base_url = "https://api.ai.net.de/v1"
wire_api = "responses"
experimental_bearer_token = "YOUR_API_KEY"

Start with

codex -p netde

Opencode

Edit ~/.config/opencode/opencode.json

{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "llama.cpp": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "api.ai.net.de",
      "options": {
        "baseURL": "https://api.ai.net.de/v1",
        "apiKey": "YOUR_API_KEY"
      },
      "models": {
        "Qwen3.6-27B": {
          "name": "Qwen3.6-27B"
        }
      }
    }
  }
}

FAQ

Q: How does the service ensure data security and retention?

A: We follow a Zero Data Retention (ZDR) policy. Any data sent to the AI endpoints is processed in‑memory only and is not stored after the request completes. All processing happens within our own datacenter in Hanover, Germany, ensuring that data never leaves our controlled infrastructure. Operational metadata such as usage metrics, billing records, and service telemetry may be retained as required for platform operations.

Q: Is an API key required to use the AI model serving endpoints?

A: Yes. All requests to the API require a valid API key. If you do not have one, you can request it via our support channels. See the list of support options in Support.

Q: Can I request multiple API keys to differentiate between different applications?

A: Yes. Reach out through our support channels to request additional API keys.

Q: Is billing done on a per‑API‑key basis?

A: Yes. Usage is tracked per key and billed accordingly.

Q: Can I request a specific model that is not listed?

A: Yes. Reach out through our support channels to request additional models. We evaluate demand and will add new models when feasible.


Need Help?

For onboarding, API access, pricing, or custom model requests, see Support.