Supported Tokenizers
ABV uses tokenizers to infer usage when token counts aren’t provided. The following tokenizers are currently supported:

| Model Pattern | Tokenizer | Package | Notes |
|---|---|---|---|
| gpt-4o* | o200k_base | Built-in | Used for GPT-4o models |
| gpt* | cl100k_base | Built-in | Used for GPT-4, GPT-3.5, and other GPT models |
| claude* | claude | @anthropic-ai/tokenizer | Not fully accurate for Claude models per Anthropic |
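The table above implies a pattern-to-tokenizer lookup. The sketch below illustrates that lookup using glob-style matching for the `*` notation in the table; the function name and list structure are illustrative, not ABV internals:

```python
import fnmatch

# Tokenizer lookup mirroring the supported-tokenizers table.
# Order matters: the more specific pattern (gpt-4o*) must be
# checked before the general gpt* pattern.
TOKENIZER_BY_PATTERN = [
    ("gpt-4o*", "o200k_base"),
    ("gpt*", "cl100k_base"),
    ("claude*", "claude"),
]

def tokenizer_for(model: str):
    """Return the tokenizer name for a model, or None if unsupported."""
    for pattern, tokenizer in TOKENIZER_BY_PATTERN:
        if fnmatch.fnmatch(model, pattern):
            return tokenizer
    return None
```

For example, `tokenizer_for("gpt-4o-mini")` resolves to `o200k_base`, while an unlisted model such as a Gemini variant resolves to `None` and usage cannot be inferred.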
Predefined Models
ABV maintains a list of predefined popular models with tokenizers and pricing for:

- OpenAI - GPT-4, GPT-3.5, GPT-4o, and other models
- Anthropic - Claude 4 family (Opus, Sonnet, Haiku)
- Google - Gemini models
Custom Model Definitions
You can add your own model definitions for:

- Custom or proprietary models
- Fine-tuned models
- Models not yet supported by ABV
- Overriding default pricing
Creating Models via the UI
- Navigate to Settings → Model Definitions in the ABV UI
- Click Add Model
- Configure the model settings (see configuration details below)
- Save the model definition
Creating Models via the API
Use the Models API to programmatically manage model definitions.

Model Configuration
Model Matching
Models are matched to generations using regular expressions:

| Generation Attribute | Model Attribute | Description |
|---|---|---|
| model | match_pattern | Regex pattern to match model names |
User-defined models take priority over ABV-maintained models. This allows you to override default pricing or add support for new models.
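The matching order can be sketched as follows. The function name, dictionary shape, and use of `re.search` are assumptions for illustration; only the priority rule (user-defined first) comes from the text above:

```python
import re

def resolve_model(model_name, user_models, abv_models):
    """Match a generation's model name against model definitions.

    User-defined models are checked first, so they can override
    ABV-maintained definitions.
    """
    for definition in list(user_models) + list(abv_models):
        if re.search(definition["match_pattern"], model_name):
            return definition
    return None

# Illustrative definitions: a user-defined fine-tune overrides
# the built-in gpt-4o entry for matching model names.
user_models = [{"match_pattern": r"^ft:gpt-4o", "name": "my-finetune"}]
abv_models = [{"match_pattern": r"^gpt-4o", "name": "gpt-4o"}]
```

Here `resolve_model("ft:gpt-4o:acme", ...)` returns the user-defined fine-tune, while `"gpt-4o-mini"` falls through to the ABV-maintained definition.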
Pricing Configuration
Define pricing per usage type. Usage types must match exactly with the keys in the `usage_details` object:

- Usage types are case-sensitive
- Must match exactly: `"input"` ≠ `"Input"`
- Supports arbitrary usage types for custom metrics
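The case-sensitive matching rule can be seen in a small cost calculation. The function and price figures below are illustrative, not ABV's internal implementation:

```python
def compute_cost(prices, usage_details):
    """Multiply each usage type's count by its configured price.

    Keys must match exactly (case-sensitive); usage types without
    a configured price contribute no cost.
    """
    return sum(
        count * prices[usage_type]
        for usage_type, count in usage_details.items()
        if usage_type in prices
    )

prices = {"input": 0.000005, "output": 0.000015}  # illustrative per-token prices
usage = {"input": 1000, "output": 200, "Input": 50}  # "Input" will NOT match "input"
```

With these values, `compute_cost(prices, usage)` yields `0.008`; the 50 tokens under the mis-cased key `"Input"` are silently ignored, which is the kind of discrepancy the troubleshooting checklist below flags.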
Tokenization Configuration
For models using the `openai` tokenizer, specify the tokenization config:
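A hedged sketch of such a config, built around the `tokensPerMessage` and `tokensPerName` fields referenced in the troubleshooting section; the exact field names and shape are assumptions:

```json
{
  "tokenizerModel": "gpt-4o",
  "tokensPerMessage": 3,
  "tokensPerName": 1
}
```

The per-message and per-name values account for the chat-format overhead tokens that wrap each message, beyond the tokens of the message content itself.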
Complete Model Definition Example
Here’s a complete example of a custom model definition:
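The field names, endpoint path, and credentials below are assumptions for illustration rather than the exact ABV schema; a minimal sketch for a hypothetical fine-tuned model:

```json
{
  "modelName": "my-finetuned-gpt-4o",
  "matchPattern": "^ft:gpt-4o:acme.*",
  "unit": "TOKENS",
  "prices": {
    "input": 0.000005,
    "output": 0.000015
  },
  "tokenizer": "openai",
  "tokenizerConfig": {
    "tokensPerMessage": 3,
    "tokensPerName": 1
  }
}
```

Submitted via the Models API (host, path, and auth scheme are placeholders):

```shell
curl -X POST "https://abv.example.com/api/public/models" \
  -H "Content-Type: application/json" \
  -u "$ABV_PUBLIC_KEY:$ABV_SECRET_KEY" \
  -d @model-definition.json
```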
Reasoning Models
Reasoning models (like OpenAI’s o1 family) require special handling.

Why Inference Doesn’t Work

Reasoning models take multiple internal steps to arrive at a response:

- Model generates reasoning tokens (internal thought process)
- Model generates completion tokens (final response)
- API bills you for: reasoning tokens + completion tokens
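The undercount can be made concrete with illustrative numbers, using the OpenAI-style usage shape where `completion_tokens` already includes the hidden reasoning tokens:

```python
# OpenAI-style usage for a reasoning model (illustrative numbers).
# completion_tokens includes the hidden reasoning tokens.
usage = {
    "prompt_tokens": 50,
    "completion_tokens": 700,
    "completion_tokens_details": {"reasoning_tokens": 500},
}

# Tokens that actually appear in the visible response text:
visible_output = (
    usage["completion_tokens"]
    - usage["completion_tokens_details"]["reasoning_tokens"]
)

# Tokenizing only the visible text would count 200 tokens, while the
# API bills for all 700 — which is why inference can't work here.
```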
Solution: Always Ingest Usage
For reasoning models, you must ingest usage details from the API response.

Historical Data and Model Changes
Retroactive Application
ABV does not retroactively apply model definition changes to historical generations. This includes:

- Price updates
- New tokenizers
- Changed model patterns
Batch Reprocessing
If you need to apply new model definitions to existing generations:

- Contact ABV support
- Request a batch job to reprocess generations
- Specify the time range and model changes
Batch reprocessing is typically used when:
- Correcting a pricing error
- Adding missing model definitions
- Migrating to new tokenizers
OpenAI Usage Schema Compatibility
ABV supports the OpenAI usage schema for compatibility with OpenAI SDKs and tools.

Automatic Mapping

When you use OpenAI field names, ABV automatically maps them:

| OpenAI Field | ABV Field |
|---|---|
| prompt_tokens | input |
| completion_tokens | output |
| total_tokens | total |
| prompt_tokens_details.* | input_* |
| completion_tokens_details.* | output_* |
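The mapping table can be expressed as a small function. ABV performs this translation automatically; the helper below is only a sketch of the rules in the table, and its name is illustrative:

```python
def map_openai_usage(openai_usage):
    """Translate OpenAI usage field names to ABV usage types."""
    mapping = {
        "prompt_tokens": "input",
        "completion_tokens": "output",
        "total_tokens": "total",
    }
    abv = {}
    for key, value in openai_usage.items():
        if key in mapping:
            abv[mapping[key]] = value
        elif key == "prompt_tokens_details":
            for sub, count in value.items():
                abv[f"input_{sub}"] = count
        elif key == "completion_tokens_details":
            for sub, count in value.items():
                abv[f"output_{sub}"] = count
    return abv
```

For instance, a usage object containing `completion_tokens_details.reasoning_tokens` maps to the ABV usage type `output_reasoning_tokens`.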
Requesting Official Model Support
Don’t want to maintain custom model definitions? Request official support:

- Visit the ABV Model Support Request Form
- Provide model details and documentation
- ABV will add official support in a future release

Officially supported models receive:

- Automatic tokenization
- Pricing updates
- Maintenance and updates
Troubleshooting
Usage and Cost Missing for Historical Data
Problem: After adding a new model definition, historical generations still show no usage or cost.

Solution: Model definitions are not applied retroactively. Options:

- Request a batch reprocessing job from support
- Accept that historical data uses the old definitions
- For future data, ingest usage directly instead of relying on inference
Incorrect Costs
Problem: Costs don’t match your LLM provider’s billing.

Checklist:

- Check pricing: Verify the model definition pricing matches your provider’s current rates
- Check usage types: Ensure usage type keys match exactly (case-sensitive)
- Check model matching: Verify the regex pattern correctly matches your model name
- Consider timing: Prices are applied at ingestion time, not retroactively
Model Not Matching
Problem: Generations aren’t matching your custom model definition.

Debug steps:

- Check that the `model` field in your generation matches the `match_pattern` regex
- Test the regex pattern in a regex tester
- Verify the user-defined model isn’t being shadowed by another matching definition
- Check for typos in the model name
Tokenization Errors
Problem: Tokenization is failing or producing unexpected results.

Solutions:

- For Claude models: Ingest token counts from the API instead of inferring them
- For OpenAI models: Verify `tokensPerMessage` and `tokensPerName` are set correctly
- For custom tokenizers: Ensure the tokenizer is supported (see the supported tokenizers table)
API Reference
Models API Endpoints
List all models using the Models API.

Next Steps
- Implementation Guide - Return to implementation examples
- Daily Metrics API - Export usage and cost data