
How Log Levels Work

ABV supports four log levels to categorize observations by severity:
  • DEBUG: Verbose internal details (tool calls, intermediate steps, debugging info)
  • DEFAULT: Standard operations (successful LLM calls, normal workflow steps)
  • WARNING: Degraded performance or unexpected behavior (slow responses, fallbacks, retries)
  • ERROR: Failures (API errors, timeouts, invalid outputs, exceptions)

Understand Log Level Hierarchy

Log levels follow a severity hierarchy from least to most critical:
DEBUG < DEFAULT < WARNING < ERROR
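The hierarchy matters for filtering: a minimum level of WARNING admits both WARNING and ERROR observations. A minimal sketch of how such a threshold comparison could work (illustrative only, not the SDK's internal implementation):

# Illustrative ranking of ABV log levels; not the SDK's internal representation
LEVEL_RANK = {"DEBUG": 0, "DEFAULT": 1, "WARNING": 2, "ERROR": 3}

def meets_threshold(level: str, min_level: str) -> bool:
    """Return True if `level` is at or above `min_level` in severity."""
    return LEVEL_RANK[level] >= LEVEL_RANK[min_level]

assert meets_threshold("ERROR", "WARNING")      # ERROR passes a WARNING threshold
assert not meets_threshold("DEBUG", "DEFAULT")  # DEBUG is filtered out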
When to use each level:

DEBUG (verbose internal details)
  • Internal tool executions
  • Intermediate processing steps
  • Variable values for debugging
  • Cache hits/misses
DEFAULT (normal operations)
  • Successful LLM API calls
  • Standard workflow completions
  • User interactions without issues
  • Expected behavior
WARNING (concerning but non-fatal)
  • Slow LLM responses (>5 seconds)
  • Fallback to alternative model/prompt
  • Deprecated feature usage
  • Rate limit warnings (approaching threshold)
  • Validation warnings (non-blocking)
ERROR (failures)
  • LLM API errors (401, 500, timeouts)
  • Exceptions and crashes
  • Invalid outputs (failed parsing, guardrail violations)
  • Data processing failures
  • Critical validation failures

Set Log Levels on Observations

Assign log levels when creating spans or generations, or update them dynamically based on runtime conditions.

Python (set on creation):
from abvdev import ABV

abv = ABV(api_key="sk-abv-...")

# Create span with WARNING level
with abv.start_as_current_observation(
    as_type='span',
    name="risky-operation",
    level="WARNING",
    status_message="Operation may fail with invalid input"
) as span:
    result = process_risky_data(data)
Python (update dynamically):
with abv.start_as_current_observation(
    as_type='generation',
    name="llm-call",
    level="DEFAULT"
) as generation:
    try:
        response = openai_client.chat.completions.create(
            model="gpt-4",
            messages=[{"role": "user", "content": query}]
        )
        generation.update(output=response.choices[0].message.content)

    except Exception as e:
        # Update to ERROR on failure
        generation.update(
            level="ERROR",
            status_message=f"LLM call failed: {str(e)}"
        )
        raise
JavaScript/TypeScript:
import { startObservation } from '@abvdev/tracing';

const span = startObservation('manual-observation', {
  input: { query: 'What is the capital of France?' },
  level: 'WARNING',
  statusMessage: 'This operation is experimental'
});

// Update level dynamically
span.update({
  level: 'ERROR',
  statusMessage: 'Operation failed'
});

span.end();

Add Status Messages for Context

Include a status message alongside the log level (status_message in Python, statusMessage in JS/TS) to provide human-readable context about why an observation has a particular severity.

Good status messages:
  • "LLM timeout after 30 seconds" (ERROR)
  • "Fallback to gpt-3.5-turbo due to rate limit" (WARNING)
  • "Cache miss, fetching from API" (DEBUG)
  • "Retry 2/3 after transient error" (WARNING)
Bad status messages:
  • "Error" (too generic)
  • "Something went wrong" (not actionable)
  • "" (empty, provides no context)
Example:
with abv.start_as_current_observation(
    as_type='span',
    name="guardrail-check",
    level="DEFAULT"
) as span:
    result = check_for_biased_language(output)

    if result.violations:
        span.update(
            level="WARNING",
            status_message=f"Detected {len(result.violations)} potential bias issues"
        )
    else:
        span.update(
            level="DEFAULT",
            status_message="No guardrail violations detected"
        )

Filter Traces by Log Level in Dashboard

In the ABV Dashboard, filter traces or observations by log level to focus on specific severity levels.

Use cases:
  • Production debugging: Filter to ERROR only to see all failures
  • Performance optimization: Filter to WARNING to find slow or degraded operations
  • Development: Show DEBUG to see full execution details
Dashboard filters:
  • View single trace → Filter observations by level
  • Trace list view → Filter entire traces containing ERROR observations
  • Search queries: level = "ERROR" or level IN ["WARNING", "ERROR"]
This helps you quickly identify issues without scrolling through hundreds of DEBUG observations.

Set Minimum Log Level for Sampling

Configure your SDK to send only observations at or above a certain log level. This reduces ingestion costs while preserving critical error data.

Example: Only log warnings and errors in production
import os

# Set minimum log level based on environment
min_level = "WARNING" if os.getenv("ENV") == "production" else "DEBUG"

abv = ABV(
    api_key="sk-abv-...",
    min_log_level=min_level  # Only send WARNING and ERROR in production
)
Result:
  • Development: All DEBUG, DEFAULT, WARNING, ERROR observations logged
  • Production: Only WARNING and ERROR observations logged (70% cost reduction)
Note: Check your SDK documentation for exact parameter names (min_log_level, log_level, etc.).
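If your SDK version does not expose a minimum-level parameter, one possible fallback is a client-side gate that skips creating low-severity observations entirely. A sketch, reusing the LEVEL_RANK ordering shown earlier; the should_emit helper and the cache lookup are hypothetical, not part of the ABV SDK:

import os

LEVEL_RANK = {"DEBUG": 0, "DEFAULT": 1, "WARNING": 2, "ERROR": 3}
MIN_LEVEL = "WARNING" if os.getenv("ENV") == "production" else "DEBUG"

def should_emit(level: str) -> bool:
    # Hypothetical client-side gate; check before creating an observation
    return LEVEL_RANK[level] >= LEVEL_RANK[MIN_LEVEL]

if should_emit("DEBUG"):
    # The DEBUG span is only created outside production
    with abv.start_as_current_observation(
        as_type='span', name="cache-lookup", level="DEBUG"
    ) as span:
        result = cache.get(key)  # placeholder work being traced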

Why Use Log Levels?

Production traces can contain hundreds of observations. Log levels let you filter to what matters.

Filter examples:
  • level = "ERROR": See only failed observations
  • level >= "WARNING": See concerning behavior leading to failure
  • level = "ERROR" AND environment = "production": Production failures only
Benefits:
  • Find failures in seconds instead of scrolling through hundreds of successful operations
  • Eliminate noise from DEBUG logs in production
  • Focus on actionable errors
Ingesting every DEBUG observation from production is expensive and unnecessary. Filtering to WARNING/ERROR levels can reduce ingestion costs by 70-90% while preserving critical error data.

Example: A production app with 15 observations per trace (3 DEBUG, 10 DEFAULT, 2 ERROR/WARNING) can cut log volume by about 87% by filtering out DEBUG and DEFAULT levels (13 of 15 observations dropped).

Implementation:
import os

# Environment-based log level
if os.getenv("ENV") == "production":
    min_level = "WARNING"  # Only warnings and errors
elif os.getenv("ENV") == "staging":
    min_level = "DEFAULT"  # Standard operations and above
else:
    min_level = "DEBUG"  # Everything in development

abv = ABV(api_key="sk-abv-...", min_log_level=min_level)
Best practice: Use DEBUG as the minimum level in development and WARNING in production. ERROR observations are always critical and pass either threshold.
Generic alerts ("LLM API returned 500") fire constantly and get ignored. Log level-based alerts are precise and actionable.

Alert strategies (strategy 1 is sketched after this list):

1. Error rate threshold
  • Alert when ERROR observations exceed 1% of total traces
  • Catches systematic failures (API outage, bad prompt deployment)
2. Absolute error count
  • Alert when >100 ERROR observations in 5 minutes
  • Catches spikes in failures
3. Warning accumulation
  • Alert when WARNING observations exceed 10% of traces
  • Catches degraded performance (slow responses, frequent retries)
4. Zero errors expected
  • Alert on any ERROR in critical workflows (payment processing, compliance tasks)
  • Catches every failure immediately
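For strategy 1, a periodic job can compare the share of ERROR observations against the threshold. A minimal sketch, assuming observations are loaded from your own export or analytics store; the data shape and the send_alert notifier are assumptions, not ABV APIs:

from collections import Counter

ERROR_RATE_THRESHOLD = 0.01  # alert above 1% errors

def check_error_rate(observations) -> None:
    # `observations` is assumed to be an iterable of dicts with a "level" key
    counts = Counter(obs["level"] for obs in observations)
    total = sum(counts.values())
    if total == 0:
        return
    error_rate = counts.get("ERROR", 0) / total
    if error_rate > ERROR_RATE_THRESHOLD:
        # send_alert is a hypothetical notifier (PagerDuty, Slack, etc.)
        send_alert(f"ERROR rate {error_rate:.1%} exceeds {ERROR_RATE_THRESHOLD:.0%}")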
Example: PagerDuty integration
# Monitor ERROR observations via an ABV webhook (FastAPI; payload shape assumed)
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class Observation(BaseModel):  # assumed webhook schema; adjust to the real one
    name: str
    level: str
    trace_url: str
    status_message: str

class ABVEvent(BaseModel):
    observation: Observation

@app.post("/abv-webhook")
def handle_abv_event(event: ABVEvent):
    if event.observation.level == "ERROR":
        # `pagerduty` is your configured PagerDuty client (placeholder)
        pagerduty.trigger(
            summary=f"LLM error in production: {event.observation.name}",
            severity="error",
            source="abv",
            custom_details={
                "trace_url": event.observation.trace_url,
                "error_message": event.observation.status_message
            }
        )
Benefits:
  • Proactive issue detection (catch errors before users complain)
  • Reduced alert fatigue (only actionable alerts)
  • Faster incident response (trace URL in alert for instant debugging)
Multi-step LLM workflows (RAG, agents, chains) generate dozens of observations. Log levels let you control verbosity dynamically.

Use case: Agent workflow with tool calls

Full trace (DEBUG enabled):
[DEFAULT] User query received: "What's the weather in Paris?"
[DEBUG] Parsing query intent
[DEBUG] Intent identified: weather_lookup
[DEBUG] Searching for weather tool
[DEBUG] Tool found: get_weather(location)
[DEFAULT] Calling tool: get_weather(location="Paris")
[DEBUG] API request to weather service
[DEBUG] Weather API response: 200 OK
[DEFAULT] Tool result: 72°F, sunny
[DEBUG] Formatting response
[DEFAULT] LLM generating final response
[DEFAULT] Response generated: "It's 72°F and sunny in Paris."
Production trace (WARNING+ only):
[WARNING] Weather API slow response (3.2s)
Error case (ERROR level):
[ERROR] Weather API timeout after 5 seconds
[WARNING] Fallback to cached weather data (2 hours old)
[DEFAULT] Response generated with cached data
Benefits:
  • Development: See every step for debugging
  • Production: Only see issues (warnings, errors)
  • Selective detail: Enable DEBUG for specific users or traces when investigating issues (see the sketch below)
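One way to implement selective detail is to choose the level per request, for example from a debug allowlist. A sketch; the allowlist and helper names are hypothetical, not ABV features:

DEBUG_USERS = {"qa-team@example.com"}  # hypothetical allowlist under investigation

def step_level(user_email: str) -> str:
    """Verbose tracing for allowlisted users, quiet DEFAULT for everyone else."""
    return "DEBUG" if user_email in DEBUG_USERS else "DEFAULT"

with abv.start_as_current_observation(
    as_type='span',
    name="parse-intent",
    level=step_level(user.email),  # DEBUG only for users being investigated
) as span:
    intent = parse_intent(query)  # placeholder processing step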
Not all issues are outright failures. Slow responses, fallbacks, and retries indicate degraded performance that should be monitored.

Warning-worthy scenarios:

1. Latency degradation
import time

# Inside an active ABV span context (see the examples above)
start = time.time()
response = llm.generate(query)
latency = time.time() - start

if latency > 5.0:
    span.update(
        level="WARNING",
        status_message=f"Slow LLM response: {latency:.2f}s"
    )
2. Fallback behavior
try:
    result = primary_model.generate(query)
except RateLimitError:
    span.update(
        level="WARNING",
        status_message="Rate limit hit, falling back to secondary model"
    )
    result = fallback_model.generate(query)
3. Retry patterns
for attempt in range(3):
    try:
        result = llm.generate(query)
        break
    except TransientError as e:
        if attempt < 2:
            span.update(
                level="WARNING",
                status_message=f"Retry {attempt + 1}/3 after error: {str(e)}"
            )
        else:
            span.update(
                level="ERROR",
                status_message="All retries exhausted"
            )
            raise
4. Guardrail violations
if guardrail_check.violations:
    span.update(
        level="WARNING",
        status_message=f"Guardrail triggered: {guardrail_check.violations}"
    )
    # Still return response but log the warning
Dashboard analysis:
  • Query for WARNING observations over time
  • Identify trends: Are latency warnings increasing?
  • Correlate warnings with deployments or traffic spikes
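To answer the trend question offline, a simple script can bucket exported WARNING observations by hour. A sketch, assuming each exported observation is a dict with ISO-8601 "timestamp" and "level" fields (the export shape is an assumption):

from collections import Counter
from datetime import datetime

def warnings_per_hour(observations) -> dict:
    # Count WARNING observations per hour; rising counts suggest a latency trend
    buckets = Counter()
    for obs in observations:
        if obs["level"] == "WARNING":
            hour = datetime.fromisoformat(obs["timestamp"]).strftime("%Y-%m-%d %H:00")
            buckets[hour] += 1
    return dict(sorted(buckets.items()))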

Implementation Guide

Set log levels when using the @observe() decorator to automatically trace functions.

Setup:
pip install abvdev
Update log level dynamically:
from abvdev import ABV, observe

abv = ABV(api_key="sk-abv-...", host="https://app.abv.dev")

@observe()
def process_document(document):
    # Start at DEFAULT level
    try:
        result = complex_processing(document)

        # Update to WARNING if processing takes too long
        if result.processing_time > 10:
            abv.update_current_span(
                level="WARNING",
                status_message=f"Slow processing: {result.processing_time}s"
            )

        return result

    except Exception as e:
        # Update to ERROR on failure
        abv.update_current_span(
            level="ERROR",
            status_message=f"Processing failed: {str(e)}"
        )
        raise

process_document(my_document)
Set level on creation:
@observe(level="DEBUG", status_message="Debugging this function")
def debug_function():
    # This function's span starts at DEBUG level
    pass
Set log levels when creating spans or generations manually with context managers.

Set on creation:
from abvdev import ABV

abv = ABV(api_key="sk-abv-...")

# Create span with WARNING level
with abv.start_as_current_observation(
    as_type='span',
    name="experimental-feature",
    level="WARNING",
    status_message="This feature is experimental and may fail"
) as span:
    result = try_experimental_logic()
Update during execution:
with abv.start_as_current_observation(
    as_type='generation',
    name="llm-call",
    model="gpt-4",
    level="DEFAULT"
) as generation:
    try:
        response = openai_client.chat.completions.create(
            model="gpt-4",
            messages=[{"role": "user", "content": query}]
        )

        generation.update(output=response.choices[0].message.content)

    except TimeoutError:
        generation.update(
            level="ERROR",
            status_message="LLM timeout after 30 seconds"
        )
        raise
    except RateLimitError:
        generation.update(
            level="WARNING",
            status_message="Rate limit hit, retrying with exponential backoff"
        )
        # Retry logic here
Update without direct span reference:
with abv.start_as_current_observation(as_type='span', name="workflow"):
    # Some processing
    validation_result = validate_input(data)

    if not validation_result.is_valid:
        # Update the current span
        abv.update_current_span(
            level="WARNING",
            status_message=f"Validation warnings: {validation_result.warnings}"
        )
Set and update log levels in JavaScript/TypeScript using the @abvdev/tracing package.

Setup:
npm install @abvdev/tracing @abvdev/otel @opentelemetry/sdk-node
Update during execution:
import './instrumentation';
import { startActiveObservation, updateActiveObservation } from '@abvdev/tracing';

async function main() {
  const query = 'What is the capital of France?';

  await startActiveObservation('process-request', async (span) => {
    span.update({
      input: { query }
    });

    try {
      const result = await processQuery(query);

      if (result.latency > 5000) {
        // Update to WARNING if slow
        updateActiveObservation('span', {
          level: 'WARNING',
          statusMessage: `Slow response: ${result.latency}ms`
        });
      }

    } catch (error) {
      // Update to ERROR on failure
      updateActiveObservation('span', {
        level: 'ERROR',
        statusMessage: `Processing failed: ${error.message}`
      });
      throw error;
    }
  });
}

main();
Wrap existing functions with automatic tracing and log level updates.

Example:
import './instrumentation';
import { observe, updateActiveObservation } from '@abvdev/tracing';

// Original function
async function fetchData(source: string) {
  try {
    const response = await fetch(source);

    if (response.status !== 200) {
      updateActiveObservation('span', {
        level: 'WARNING',
        statusMessage: `Non-200 status: ${response.status}`
      });
    }

    return await response.json();

  } catch (error) {
    updateActiveObservation('span', {
      level: 'ERROR',
      statusMessage: `Fetch failed: ${error.message}`
    });
    throw error;
  }
}

// Wrap with observe
const tracedFetchData = observe(fetchData, {
  name: 'fetch-data-operation'
});

// Use traced version
async function main() {
  const result = await tracedFetchData('https://api.example.com/data');
}

main();
Create spans manually and set log levels explicitly.

Example:
import './instrumentation';
import { startObservation } from '@abvdev/tracing';

const span = startObservation('manual-operation', {
  input: { query: 'Process this data' },
  level: 'WARNING',
  statusMessage: 'This operation is in beta'
});

try {
  const result = processData(input);

  // Update to DEFAULT on success
  span.update({
    level: 'DEFAULT',
    statusMessage: 'Operation completed successfully',
    output: result
  });

} catch (error) {
  // Update to ERROR on failure
  span.update({
    level: 'ERROR',
    statusMessage: `Operation failed: ${error.message}`
  });
} finally {
  span.end();
}

Next Steps

Sampling

Control trace volume and cost with rule-based or rate-based sampling strategies

Metadata

Attach structured context to traces for precise filtering and analysis

Tags

Add flexible labels to categorize and filter traces quickly

Environments

Separate development, staging, and production traces for clean comparisons