Configuration
Tune generation behavior with KuzcoConfiguration by adjusting parameters such as temperature, maximum tokens, and sampling strategy.
Basic Configuration
Create a configuration and pass it when initializing a session:
    import Kuzco

    let config = KuzcoConfiguration(
        temperature: 0.7,
        maxTokens: 1024
    )

    let session = try await KuzcoSession(model: .qwen3_4b, configuration: config)

Configuration Properties
| Property | Type | Default | Description |
|---|---|---|---|
| temperature | Float | 0.7 | Controls randomness. Lower = more focused, higher = more creative (0.0-2.0) |
| maxTokens | Int | 2048 | Maximum tokens to generate in response |
| topK | Int | 40 | Number of top tokens to consider for sampling |
| topP | Float | 0.9 | Nucleus sampling threshold (0.0-1.0) |
| repeatPenalty | Float | 1.1 | Penalty for repeating tokens. Higher = less repetition |
| contextLength | Int? | nil | Override context window size (model default if nil) |
| stopSequences | [String] | [] | Sequences that stop generation when encountered |
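
The table gives ranges for temperature (0.0-2.0) and topP (0.0-1.0), but the source does not say whether Kuzco validates them. If values come from user input, it may be worth clamping them first. The sketch below is standalone Swift: `SamplingSettings` and `clamped()` are hypothetical names used for illustration, not part of Kuzco.

```swift
// Hypothetical stand-in for illustration only; this is NOT Kuzco's type.
struct SamplingSettings {
    var temperature: Float = 0.7
    var topP: Float = 0.9

    /// Returns a copy with each value clamped to the range listed in the table.
    func clamped() -> SamplingSettings {
        var copy = self
        copy.temperature = min(max(temperature, 0.0), 2.0)  // documented range 0.0-2.0
        copy.topP = min(max(topP, 0.0), 1.0)                // documented range 0.0-1.0
        return copy
    }
}

// Out-of-range input, for example from a settings screen.
let userSettings = SamplingSettings(temperature: 3.5, topP: 1.4).clamped()
print(userSettings.temperature, userSettings.topP)
```

The clamped values could then be passed to `KuzcoConfiguration(temperature:maxTokens:)` and its other parameters as shown in the examples on this page.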
Full Configuration Example
    let config = KuzcoConfiguration(
        temperature: 0.8,
        maxTokens: 4096,
        topK: 50,
        topP: 0.95,
        repeatPenalty: 1.2,
        contextLength: 8192,
        stopSequences: ["\n\n", "User:", "END"]
    )

    let session = try await KuzcoSession(model: .qwen3_8b, configuration: config)

Configuration Presets
Use built-in presets for common use cases:
.default
temperature: 0.7, maxTokens: 2048
Balanced settings for general-purpose chat and completion tasks.

.creative
temperature: 1.0, topP: 0.95
Higher randomness for creative writing, brainstorming, and storytelling.

.precise
temperature: 0.3, topK: 20
Lower randomness for factual responses, Q&A, and technical queries.

.coding
temperature: 0.2, repeatPenalty: 1.0
Optimized for code generation: high consistency and a low repetition penalty.

.lowMemory
contextLength: 2048, maxTokens: 512
Reduced memory footprint for constrained environments.

.performance
maxTokens: 256, topK: 10
Optimized for fast responses with limited output length.
    // Using presets
    let creativeSession = try await KuzcoSession(
        model: .qwen3_4b,
        configuration: .creative
    )

    let codingSession = try await KuzcoSession(
        model: .phi4_mini,
        configuration: .coding
    )

    let lowMemorySession = try await KuzcoSession(
        model: .deepseekR1_1_5b,
        configuration: .lowMemory
    )

Custom Presets
Extend presets with custom modifications:
    // Start from a preset and modify
    var config = KuzcoConfiguration.creative
    config.maxTokens = 4096
    config.stopSequences = ["THE END"]

    let session = try await KuzcoSession(model: .qwen3_4b, configuration: config)

Understanding Temperature
Low Temperature (0.0 - 0.3)
More deterministic and focused. Best for factual queries, code, and when you need consistent outputs.
Medium Temperature (0.4 - 0.7)
Balanced creativity and coherence. Good default for general-purpose chat.
High Temperature (0.8 - 1.5)
More random and creative. Best for brainstorming, creative writing, and exploring diverse ideas.
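
To see why lower values are more focused, it helps to look at how temperature is conventionally applied: the model's raw scores (logits) are divided by the temperature before being turned into probabilities. The following is a minimal standalone sketch of that general technique with made-up logits. It is not Kuzco's internal code, and the function name `softmax(_:temperature:)` is chosen here for illustration.

```swift
import Foundation

/// Converts raw logits to probabilities after dividing by `temperature`.
/// Assumes temperature > 0. Samplers usually special-case 0 as greedy decoding;
/// how Kuzco treats 0 is not stated in the source.
func softmax(_ logits: [Double], temperature: Double) -> [Double] {
    precondition(temperature > 0, "temperature must be positive")
    let scaled = logits.map { $0 / temperature }
    let maxLogit = scaled.max() ?? 0              // subtract the max for numerical stability
    let exps = scaled.map { exp($0 - maxLogit) }
    let total = exps.reduce(0, +)
    return exps.map { $0 / total }
}

let logits = [2.0, 1.0, 0.5]                      // made-up scores for three candidate tokens
let focused = softmax(logits, temperature: 0.2)   // low temperature
let diverse = softmax(logits, temperature: 1.2)   // high temperature

// The top token receives a larger share of the probability at low temperature.
print(focused[0], diverse[0])
```

With these inputs, nearly all probability mass lands on the first token at 0.2, while at 1.2 the other two tokens keep a substantial share, which is what makes high-temperature output more varied.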
    // Factual response
    let factual = KuzcoConfiguration(temperature: 0.1)

    // Creative story
    let story = KuzcoConfiguration(temperature: 1.2)

    // Balanced chat
    let chat = KuzcoConfiguration(temperature: 0.7)

Understanding Top-K and Top-P
These parameters control token sampling diversity:
Top-K Sampling
Limits selection to the K most likely tokens. Lower K = more focused, higher K = more diverse.
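
As a rough illustration of the idea, the standalone sketch below keeps the K most likely tokens from a made-up distribution and renormalizes them. The function `topKFilter` and the sample probabilities are hypothetical; Kuzco's actual sampler is not shown in the source and may differ.

```swift
/// Keeps the `k` highest-probability tokens and renormalizes them to sum to 1.
func topKFilter(_ probs: [String: Double], k: Int) -> [String: Double] {
    let kept = probs.sorted { $0.value > $1.value }.prefix(max(k, 0))
    let total = kept.reduce(0.0) { $0 + $1.value }
    guard total > 0 else { return [:] }
    return Dictionary(uniqueKeysWithValues: kept.map { ($0.key, $0.value / total) })
}

// Made-up next-token distribution.
let candidates = ["the": 0.5, "a": 0.3, "cat": 0.15, "zebra": 0.05]

// With k = 2 only "the" and "a" remain eligible for sampling.
let top2 = topKFilter(candidates, k: 2)
print(top2)
```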
topK: 10 (focused) → topK: 100 (diverse)

Top-P (Nucleus) Sampling
Selects from the smallest set of tokens whose cumulative probability exceeds P. Adapts dynamically to context.
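
The "adapts dynamically" part is easiest to see in code: the number of surviving tokens depends on how the probability mass is spread, not on a fixed count. Below is a minimal standalone sketch of nucleus filtering over a made-up distribution. `topPFilter` is a hypothetical name, and the sketch treats the threshold as inclusive (`>=`); implementations differ on that boundary, and Kuzco's behavior is not stated in the source.

```swift
/// Keeps the smallest set of highest-probability tokens whose cumulative
/// probability reaches `p`, then renormalizes the survivors.
func topPFilter(_ probs: [String: Double], p: Double) -> [String: Double] {
    var kept: [(String, Double)] = []
    var cumulative = 0.0
    for (token, prob) in probs.sorted(by: { $0.value > $1.value }) {
        kept.append((token, prob))
        cumulative += prob
        if cumulative >= p { break }   // assumption: inclusive threshold
    }
    guard cumulative > 0 else { return [:] }
    return Dictionary(uniqueKeysWithValues: kept.map { ($0.0, $0.1 / cumulative) })
}

// Made-up next-token distribution.
let nextTokenProbs = ["the": 0.5, "a": 0.3, "cat": 0.15, "zebra": 0.05]

let narrow = topPFilter(nextTokenProbs, p: 0.5)   // only the single most likely token survives
let wide = topPFilter(nextTokenProbs, p: 0.9)     // three tokens are needed to reach 0.9
print(narrow.count, wide.count)
```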
topP: 0.5 (focused) → topP: 0.95 (diverse)

Stop Sequences
Stop sequences end generation as soon as any one of them appears in the output. Pass them as an array of strings.
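
Conceptually, the effect resembles cutting the text at the earliest match of any listed sequence. The standalone sketch below illustrates this on a finished string. The function name `truncate(_:atAnyOf:)` is hypothetical, and it assumes the stop sequence itself is dropped from the result, which the source does not confirm for Kuzco. A real generator checks for matches while tokens stream rather than after the fact.

```swift
import Foundation

/// Returns `text` cut off just before the earliest occurrence of any stop sequence.
/// Assumption: the matched stop sequence is excluded from the result.
func truncate(_ text: String, atAnyOf stopSequences: [String]) -> String {
    // Find where each stop sequence first appears, then take the earliest position.
    guard let cut = stopSequences.compactMap({ text.range(of: $0)?.lowerBound }).min() else {
        return text   // no stop sequence present, keep everything
    }
    return String(text[..<cut])
}

let raw = "Roses are red\n\nHuman: write another one\n---"
let trimmed = truncate(raw, atAnyOf: ["\n\nHuman:", "---"])
print(trimmed)
```

With Kuzco itself, stop sequences are passed in through the configuration: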
    let config = KuzcoConfiguration(
        stopSequences: [
            "\n\nHuman:",  // Stop at conversation turn
            "---",         // Stop at separator
            "THE END",     // Stop at story ending
            "```"          // Stop at code block end
        ]
    )

    let session = try await KuzcoSession(model: .qwen3_4b, configuration: config)

    // Generation will stop when any stop sequence is encountered
    let response = try await session.oneShot("Write a short poem")

Dynamic Configuration
Update configuration during a session:
    let session = try await KuzcoSession(model: .qwen3_4b)

    // Start with default settings
    let response1 = try await session.oneShot("What is 2+2?")

    // Switch to creative mode for the next prompt
    session.updateConfiguration(.creative)
    let response2 = try await session.oneShot("Write a haiku about coding")

    // Or use custom configuration
    session.updateConfiguration(KuzcoConfiguration(temperature: 0.1))
    let response3 = try await session.oneShot("List the planets in order")