AI Model Serving

OpenAI-Compatible API Surface

Our service provides OpenAI-compatible implementations of commonly used API endpoints, making it easy to connect existing SDKs, clients, and applications to our self-hosted models with minimal changes.

Supported endpoints

Endpoint            Method  Purpose                                                        Documentation
/models             GET     List available models                                          openai.com
/responses          POST    Unified text and structured generation endpoint                openai.com
/chat/completions   POST    Chat-based completions with standard or streaming responses   openai.com
/embeddings         POST    Generate vector embeddings for semantic search and retrieval  openai.com
/rerank             POST    Rank documents by relevance to a query                         jina.ai

These endpoints are designed to follow common OpenAI request and response schemas where practical, allowing reuse of existing integrations and tooling.
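
For example, recent versions of the official OpenAI SDKs (Python and JavaScript) read their endpoint and key from environment variables, so they can usually be pointed at the service without code changes:

export OPENAI_BASE_URL="https://api.ai.net.de/v1"   # use the service instead of api.openai.com
export OPENAI_API_KEY="YOUR_API_KEY"                # the key issued for this service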

Feature availability may vary depending on the selected model or backend runtime.

Data Security & Privacy

We operate under a Zero Data Retention (ZDR) model for customer request payloads.

  • Request content is processed in memory only
  • Payload data is not stored after request completion
  • Processing remains within our own infrastructure in Hanover, Germany

Operational metadata such as usage metrics, billing records, and service telemetry may be retained as required for platform operations.

Base URL

Use the following API base URL: https://api.ai.net.de/v1

Note

All API calls require a valid API key. If you need a key, request one via our support channels (see Support).

Supported models

The service currently integrates the following models. We continuously add and update models, so the table may evolve over time.

Class     Model ID            Description                                                    HF Repository
Premium   gpt-oss-120b        GPT-style large language model.                                openai/gpt-oss-120b
Premium   Qwen3.6-27B         Coding and agent workflow model.                               Qwen/Qwen3.6-27B
Standard  LightOnOCR-2-1B     OCR model for multilingual document text extraction.           lightonai/LightOnOCR-2-1B
Standard  bge-reranker-v2-m3  Cross-encoder reranker model for relevance scoring.            BAAI/bge-reranker-v2-m3
Standard  bge-m3              Multilingual embedding model for semantic similarity search.   BAAI/bge-m3

Getting Started

curl

Authentication

Authenticate by sending your API key as a Bearer token in the Authorization header:

export API_KEY="YOUR_API_KEY"

List Models

curl https://api.ai.net.de/v1/models \
  -H "Authorization: Bearer $API_KEY"

Responses

curl https://api.ai.net.de/v1/responses \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-oss-120b",
    "input": "Write a three sentence poem about datacenters."
  }'
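
Assuming the backend mirrors OpenAI's Responses schema, where output is an array of message items containing output_text parts, the generated text can be extracted with jq:

curl -s https://api.ai.net.de/v1/responses \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-oss-120b",
    "input": "Write a three sentence poem about datacenters."
  }' | jq -r '.output[] | select(.type == "message") | .content[] | select(.type == "output_text") | .text'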

Chat Completion

curl https://api.ai.net.de/v1/chat/completions \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen3.6-27B",
    "messages": [
      { "role": "user", "content": "Write Hello, world! in Go" }
    ]
  }'
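
The endpoint table above lists streaming support. If the selected backend provides it, setting "stream": true returns server-sent events in the usual OpenAI format; curl's -N flag disables buffering so tokens print as they arrive:

curl -N https://api.ai.net.de/v1/chat/completions \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen3.6-27B",
    "stream": true,
    "messages": [
      { "role": "user", "content": "Write Hello, world! in Go" }
    ]
  }'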

Embeddings

curl https://api.ai.net.de/v1/embeddings \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "bge-m3",
    "input": "Private AI infrastructure"
  }'
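
The OpenAI embeddings schema also accepts an array of strings as input; if the backend follows it, several texts can be embedded in a single request:

curl https://api.ai.net.de/v1/embeddings \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "bge-m3",
    "input": ["Private AI infrastructure", "Managed cloud hosting"]
  }'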

Rerank

curl https://api.ai.net.de/v1/rerank \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "bge-reranker-v2-m3",
    "query": "AI hosting in Germany",
    "documents": [
      "Managed cloud hosting",
      "Private AI infrastructure in Hanover",
      "Consumer chatbot tools"
    ]
  }'
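
In the Jina-style rerank schema, the response carries a results array whose entries pair an index into the submitted documents with a relevance_score; a jq filter makes the ranking easy to read (field names are an assumption based on that schema):

curl -s https://api.ai.net.de/v1/rerank \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "bge-reranker-v2-m3",
    "query": "AI hosting in Germany",
    "documents": [
      "Managed cloud hosting",
      "Private AI infrastructure in Hanover",
      "Consumer chatbot tools"
    ]
  }' | jq '.results[] | {index, relevance_score}'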

OpenAI Codex

Edit ~/.codex/config.toml

[profiles.netde]
model = "gpt-oss-120b"
model_provider = "netde"
web_search = "disabled"

[model_providers.netde]
name = "api.ai.net.de"
base_url = "https://api.ai.net.de/v1"
wire_api = "responses"
experimental_bearer_token = "YOUR_API_KEY"

Start with

codex -p netde

Opencode

Edit ~/.config/opencode/opencode.json

{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "llama.cpp": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "api.ai.net.de",
      "options": {
        "baseURL": "https://api.ai.net.de/v1",
        "apiKey": "YOUR_API_KEY"
      },
      "models": {
        "Qwen3.6-27B": {
          "name": "Qwen3.6-27B"
        }
      }
    }
  }
}

FAQ

Q: How does the service ensure data security and retention?

A: We follow a Zero Data Retention (ZDR) policy. Any data sent to the AI endpoints is processed in‑memory only and is not stored after the request completes. All processing happens within our own datacenter in Hanover, Germany, ensuring that data never leaves our controlled infrastructure. Operational metadata such as usage metrics, billing records, and service telemetry may be retained as required for platform operations.

Q: Is an API key required to use the AI model serving endpoints?

A: Yes. All requests to the API require a valid API key. If you do not have one, you can request it via our support channels. See the list of support options in Support.

Q: Can I request multiple API keys to differentiate between different applications?

A: Yes. Reach out through our support channels to request additional API keys.

Q: Is billing done on a per‑API‑key basis?

A: Yes. Usage is tracked per key and billed accordingly.

Q: Can I request a specific model that is not listed?

A: Yes. Reach out through our support channels to request additional models. We evaluate demand and will add new models when feasible.


Need Help?

For onboarding, API access, pricing, or custom model requests, see Support.