Custom API Integration

Override the default Kuzco endpoints to use your own model server or integrate with custom API backends.

Overview

Kuzco uses KuzcoAPIConfiguration to manage API endpoints for model downloads. You can override these to point to your own server hosting model files.

Configuring Custom Endpoints

Configure custom endpoints at app startup, before initializing any sessions:

MyApp.swift
import SwiftUI
import Kuzco

@main
struct MyApp: App {
    init() {
        // Configure the custom API endpoint first
        KuzcoAPIConfiguration.shared.configure(
            baseURL: "https://your-server.com/api/v1",
            modelsEndpoint: "/models"
        )
        // Then initialize with your API key
        KuzcoClient.initialize(apiKey: "kzc_your_api_key_here")
    }

    var body: some Scene {
        WindowGroup {
            ContentView()
        }
    }
}

Full Configuration Options

KuzcoAPIConfiguration.shared.configure(
    baseURL: "https://your-server.com/api/v1",
    modelsEndpoint: "/models",        // Endpoint for model list
    downloadEndpoint: "/downloads",   // Endpoint for model downloads
    headers: [                        // Custom headers
        "X-Custom-Header": "value",
        "Authorization": "Bearer token"
    ],
    timeout: 60                       // Request timeout in seconds
)
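To test a server independently of the SDK, it can help to build the same request by hand. The sketch below assumes that Kuzco appends the endpoint path to the base URL and sends the configured headers and timeout unchanged; the source does not spell this out. `makeRequest` is a hypothetical helper, not part of Kuzco.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking
#endif

/// Hypothetical helper (not part of Kuzco): builds the request that the
/// configuration above would presumably produce for a given endpoint.
func makeRequest(baseURL: String,
                 endpoint: String,
                 headers: [String: String] = [:],
                 timeout: TimeInterval = 60) -> URLRequest? {
    // Assumption: the endpoint path is appended to the base URL.
    // Normalize slashes so exactly one separates the two parts.
    let base = baseURL.hasSuffix("/") ? String(baseURL.dropLast()) : baseURL
    let path = endpoint.hasPrefix("/") ? endpoint : "/" + endpoint
    guard let url = URL(string: base + path) else { return nil }

    var request = URLRequest(url: url)
    request.httpMethod = "GET"
    request.timeoutInterval = timeout
    for (name, value) in headers {
        request.setValue(value, forHTTPHeaderField: name)
    }
    return request
}

let request = makeRequest(
    baseURL: "https://your-server.com/api/v1",
    endpoint: "/models",
    headers: ["Authorization": "Bearer token"]
)
print(request?.url?.absoluteString ?? "invalid URL")
```

Passing the resulting request to `URLSession` shows what your server returns before the SDK is involved.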

Required Response Format

Your models endpoint must return data in one of these formats:

Sectioned Format (Recommended)

Models organized by category:

{
  "sections": [
    {
      "title": "Text Models",
      "models": [
        {
          "id": "qwen3_4b",
          "name": "Qwen3 4B",
          "description": "Balanced text generation model",
          "size": 2500000000,
          "contextLength": 32768,
          "type": "text",
          "downloadURL": "https://your-server.com/models/qwen3_4b.mlmodelc.zip"
        },
        {
          "id": "qwen3_8b",
          "name": "Qwen3 8B",
          "description": "High-quality text generation",
          "size": 5000000000,
          "contextLength": 32768,
          "type": "text",
          "downloadURL": "https://your-server.com/models/qwen3_8b.mlmodelc.zip"
        }
      ]
    },
    {
      "title": "Vision Models",
      "models": [
        {
          "id": "qwen3VL",
          "name": "Qwen3 VL",
          "description": "Vision-language model",
          "size": 4000000000,
          "contextLength": 8192,
          "type": "vision",
          "downloadURL": "https://your-server.com/models/qwen3vl.mlmodelc.zip"
        }
      ]
    }
  ]
}

Flat Array Format

Simple flat list of models:

{
  "models": [
    {
      "id": "qwen3_4b",
      "name": "Qwen3 4B",
      "description": "Balanced text generation model",
      "size": 2500000000,
      "contextLength": 32768,
      "type": "text",
      "downloadURL": "https://your-server.com/models/qwen3_4b.mlmodelc.zip"
    },
    {
      "id": "qwen3VL",
      "name": "Qwen3 VL",
      "description": "Vision-language model",
      "size": 4000000000,
      "contextLength": 8192,
      "type": "vision",
      "downloadURL": "https://your-server.com/models/qwen3vl.mlmodelc.zip"
    }
  ]
}
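When building a server, it can be useful to check that a response parses in either shape. The following is a minimal sketch using `Codable`. `ModelEntry`, `ModelSection`, `ModelCatalog`, and `decodeCatalog` are hypothetical names for illustration and are not Kuzco's own types. Sizes are decoded as `Int64` here so that multi-gigabyte values are safe on every platform.

```swift
import Foundation

// Hypothetical mirror of one model object; not Kuzco's actual type.
struct ModelEntry: Decodable {
    let id: String
    let name: String
    let description: String
    let size: Int64
    let contextLength: Int
    let type: String
    let downloadURL: String
}

// One category in the sectioned format.
struct ModelSection: Decodable {
    let title: String
    let models: [ModelEntry]
}

/// Accepts either the sectioned or the flat response shape and
/// exposes the models as a single list.
struct ModelCatalog: Decodable {
    let models: [ModelEntry]

    private enum CodingKeys: String, CodingKey { case sections, models }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        if let sections = try container.decodeIfPresent([ModelSection].self, forKey: .sections) {
            // Sectioned format: flatten all categories in order.
            models = sections.flatMap { $0.models }
        } else {
            // Flat format: a top-level "models" array is required.
            models = try container.decode([ModelEntry].self, forKey: .models)
        }
    }
}

func decodeCatalog(_ json: String) throws -> ModelCatalog {
    try JSONDecoder().decode(ModelCatalog.self, from: Data(json.utf8))
}

let flatJSON = """
{"models":[{"id":"qwen3_4b","name":"Qwen3 4B","description":"Balanced text generation model","size":2500000000,"contextLength":32768,"type":"text","downloadURL":"https://your-server.com/models/qwen3_4b.mlmodelc.zip"}]}
"""

do {
    let catalog = try decodeCatalog(flatJSON)
    print(catalog.models.map { $0.id })
} catch {
    print("Response does not match either format: \(error)")
}
```

A decoding error from this sketch usually points at a missing or mistyped field, which is easier to diagnose here than inside the app.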

LLMModel JSON Structure

Each model object must include these fields:

Field          Type    Required  Description
id             String  Yes       Unique identifier matching KuzcoModel enum case
name           String  Yes       Display name
description    String  Yes       Brief description of capabilities
size           Int     Yes       File size in bytes
contextLength  Int     Yes       Maximum context window size in tokens
type           String  Yes       "text", "vision", or "image"
downloadURL    String  Yes       Direct URL to download the model file
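The field rules above can be checked before a model list is published. The sketch below is a hypothetical pre-flight validator, not part of Kuzco. It assumes that sizes and context lengths must be positive and that `downloadURL` must be an absolute URL, which the table implies but does not state outright.

```swift
import Foundation

/// Hypothetical pre-flight check (not part of Kuzco): returns a list of
/// problems found in one model object, or an empty array if it looks valid.
func validateModel(_ object: [String: Any]) -> [String] {
    var problems: [String] = []

    // Every string field from the table must be present and non-empty.
    for field in ["id", "name", "description", "type", "downloadURL"] {
        if (object[field] as? String)?.isEmpty ?? true {
            problems.append("\(field) must be a non-empty string")
        }
    }

    // Assumption: byte sizes and context lengths are positive integers.
    for field in ["size", "contextLength"] {
        guard let value = object[field] as? Int, value > 0 else {
            problems.append("\(field) must be a positive integer")
            continue
        }
    }

    // Only the three documented type values are accepted.
    if let type = object["type"] as? String, !type.isEmpty,
       !["text", "vision", "image"].contains(type) {
        problems.append("type must be \"text\", \"vision\", or \"image\"")
    }

    // Assumption: a direct download URL is absolute, so it has a scheme.
    if let link = object["downloadURL"] as? String, !link.isEmpty,
       URL(string: link)?.scheme == nil {
        problems.append("downloadURL must be an absolute URL")
    }

    return problems
}

// Example entry with an unsupported type value.
let candidate: [String: Any] = [
    "id": "qwen3_4b",
    "name": "Qwen3 4B",
    "description": "Balanced text generation model",
    "size": 2_500_000_000,
    "contextLength": 32768,
    "type": "audio",
    "downloadURL": "https://your-server.com/models/qwen3_4b.mlmodelc.zip"
]

for problem in validateModel(candidate) {
    print(problem)
}
```

This does not verify that `id` matches a `KuzcoModel` enum case; that check needs the SDK itself.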

Authentication Headers

Add custom authentication headers to all API requests:

// API key authentication
KuzcoAPIConfiguration.shared.configure(
    baseURL: "https://your-server.com/api",
    headers: [
        "X-API-Key": "your-server-api-key"
    ]
)

// Alternatively: Bearer token authentication
KuzcoAPIConfiguration.shared.configure(
    baseURL: "https://your-server.com/api",
    headers: [
        "Authorization": "Bearer your-jwt-token"
    ]
)

// Update headers dynamically (e.g., after token refresh)
KuzcoAPIConfiguration.shared.updateHeaders([
    "Authorization": "Bearer new-token"
])

Self-Hosted Model Server

Example Node.js server for hosting models:

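// server.js
const express = require('express');
const app = express();

// Models served by this endpoint (flat array format)
const models = [
  {
    id: "qwen3_4b",
    name: "Qwen3 4B",
    description: "Balanced text model",
    size: 2500000000,
    contextLength: 32768,
    type: "text",
    downloadURL: "https://your-cdn.com/models/qwen3_4b.mlmodelc.zip"
  }
];

// Placeholder key check: compares against an environment variable.
// MODEL_SERVER_API_KEY is an example name; replace with your own key lookup.
function isValidKey(key) {
  return key === process.env.MODEL_SERVER_API_KEY;
}

app.get('/api/v1/models', (req, res) => {
  // Verify API key
  const apiKey = req.headers['x-api-key'];
  if (!apiKey || !isValidKey(apiKey)) {
    return res.status(401).json({ error: 'Invalid API key' });
  }
  res.json({ models });
});

app.listen(3000);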

Verifying Configuration

Test your custom configuration:

// Check current configuration
let config = KuzcoAPIConfiguration.shared
print("Base URL: \(config.baseURL)")
print("Models Endpoint: \(config.modelsEndpoint)")

// Test connectivity
Task {
    do {
        let models = try await KuzcoModelManager.shared.fetchAvailableModels()
        print("Found \(models.count) models from custom API")
    } catch {
        print("API Error: \(error)")
    }
}

Reset to Default

Reset configuration to use default Kuzco servers:

// Reset to default configuration
KuzcoAPIConfiguration.shared.reset()
// Or configure with nil to use defaults
KuzcoAPIConfiguration.shared.configure(baseURL: nil)

Important Notes

  • Configure the API before initializing KuzcoClient
  • Model IDs must match the KuzcoModel enum cases exactly
  • Model files must be in Core ML format (.mlmodelc or .mlpackage)
  • Ensure your server supports range requests for resumable downloads
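The range-request note can be checked with a small probe. The sketch below is a hypothetical helper pair, not part of Kuzco. It builds a request for the first bytes of a file and interprets the response according to standard HTTP semantics, where a server that honors the range answers `206 Partial Content` with a `Content-Range` header. How Kuzco itself resumes downloads is not described in this page.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking
#endif

/// Hypothetical helper: builds a request for the first `byteCount` bytes of a file.
func makeRangeProbe(url: URL, byteCount: Int = 1024) -> URLRequest {
    var request = URLRequest(url: url)
    request.httpMethod = "GET"
    // Byte ranges are inclusive, so 0-1023 asks for 1024 bytes.
    request.setValue("bytes=0-\(byteCount - 1)", forHTTPHeaderField: "Range")
    return request
}

/// Returns true when the response indicates the range was honored:
/// status 206 plus a Content-Range header (header names are case-insensitive).
func honorsRange(statusCode: Int, headers: [String: String]) -> Bool {
    let hasContentRange = headers.keys.contains { $0.lowercased() == "content-range" }
    return statusCode == 206 && hasContentRange
}

let probe = makeRangeProbe(
    url: URL(string: "https://your-server.com/models/qwen3_4b.mlmodelc.zip")!
)
print(probe.value(forHTTPHeaderField: "Range") ?? "no Range header")
```

A `200 OK` response with the full body means the server ignored the `Range` header, so interrupted downloads would likely restart from the beginning.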