API Integration & Relay

Comprehensive API relay system with multi-tenant distribution, unified formats, and enterprise-grade key management

CoAI.Dev's API Relay system is a core enterprise feature that gives developers powerful, flexible AI integration tools. It provides multi-tenant relay distribution and advanced key management, and converts 10+ different AI service formats into the standardized OpenAI API format, dramatically simplifying integration.

Overview

The API relay system offers:

  • 🏢 Multi-Tenant Distribution: Independent resource and configuration management per tenant
  • 🔑 Advanced Key Management: Flexible API key strategies with granular permissions
  • 🔄 Unified API Format: Convert 10+ AI service formats to OpenAI standard
  • 🛡️ Security Controls: Model restrictions, quota limits, IP whitelisting, and expiration
  • 📊 Usage Analytics: Comprehensive tracking and billing integration
  • ⚡ High Performance: Built-in load balancing and retry mechanisms
  • 🎯 Cost Optimization: Intelligent routing and resource allocation

Enterprise Integration

The API relay system enables developers to focus on innovation and application development without worrying about the complexity of integrating multiple AI services, significantly accelerating development and deployment.

Key Features

Multi-Tenant Architecture

Complete Tenant Separation

Each tenant operates in a completely isolated environment:

Isolation Features:

  • Data Isolation: Complete separation of user data and configurations
  • Resource Isolation: Independent quotas and usage tracking
  • Security Isolation: Separate authentication and authorization
  • Configuration Isolation: Independent settings and customizations
  • Billing Isolation: Separate cost tracking and invoicing

Multi-Tenant Benefits:

{
  "tenant_isolation": {
    "data_separation": "complete",
    "resource_quotas": "independent",
    "authentication": "tenant_specific",
    "customization": "per_tenant",
    "billing": "separate_tracking"
  }
}

Use Cases:

  • SaaS providers serving multiple customers
  • Enterprise departments with separate budgets
  • Reseller and white-label deployments
  • Development, staging, and production environments
  • Regional or business unit separation
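
To make the isolation model concrete, here is a minimal TypeScript sketch of per-tenant quota enforcement at the relay entry point. The Tenant shape, the in-memory registry, and authorizeRequest are illustrative assumptions, not part of the actual CoAI.Dev codebase; a real deployment would back them with isolated, persistent storage.

// Hypothetical sketch: per-tenant quota and model enforcement at the relay entry point.
interface Tenant {
  id: string;
  quotaLimit: number;      // independent quota per tenant
  quotaUsed: number;       // independent usage tracking per tenant
  allowedModels: string[]; // tenant-specific configuration
}

// Illustrative in-memory registry; real deployments would use isolated storage per tenant.
const tenants = new Map<string, Tenant>([
  ['key_tenant_a', { id: 'tenant_a', quotaLimit: 10000, quotaUsed: 0, allowedModels: ['gpt-4'] }],
  ['key_tenant_b', { id: 'tenant_b', quotaLimit: 50000, quotaUsed: 0, allowedModels: ['gpt-3.5-turbo'] }],
]);

function authorizeRequest(apiKey: string, model: string, estimatedTokens: number): Tenant {
  const tenant = tenants.get(apiKey);
  if (!tenant) throw new Error('Unknown API key');
  if (!tenant.allowedModels.includes(model)) throw new Error(`Model ${model} not enabled for ${tenant.id}`);
  if (tenant.quotaUsed + estimatedTokens > tenant.quotaLimit) throw new Error(`Quota exceeded for ${tenant.id}`);
  tenant.quotaUsed += estimatedTokens; // usage is tracked per tenant, never shared
  return tenant;
}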

Advanced Key Management

Comprehensive API Key Control System:

Create and Configure API Keys

Generate API keys with granular permissions:

# Create new API key
POST /api/admin/keys
{
  "name": "Production API Key",
  "user_id": "user_12345",
  "permissions": {
    "models": ["gpt-4", "gpt-3.5-turbo", "claude-3"],
    "quota_limit": 10000,
    "rate_limits": {
      "requests_per_minute": 100,
      "tokens_per_hour": 50000
    },
    "ip_whitelist": ["192.168.1.0/24", "10.0.0.100"],
    "expiry_date": "2024-12-31T23:59:59Z"
  }
}

Model Restrictions

Control which AI models can be accessed:

  • Selective Access: Choose specific models for each key
  • Model Categories: Group models by type or provider
  • Dynamic Updates: Modify model access without key regeneration
  • Version Control: Control access to specific model versions
  • Cost Management: Restrict access to high-cost models

Model Access Configuration:

{
  "model_restrictions": {
    "allowed_models": [
      "gpt-4",
      "gpt-3.5-turbo",
      "claude-3-sonnet"
    ],
    "blocked_models": [
      "gpt-4-32k",
      "dall-e-3"
    ],
    "model_aliases": {
      "gpt-4-production": "gpt-4",
      "fast-model": "gpt-3.5-turbo"
    }
  }
}
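
As an illustration of how a configuration like the one above could be applied on each request, the following sketch resolves aliases first and then enforces the block and allow lists. The resolveModel function and ModelRestrictions shape are hypothetical names, not the relay's internal API.

interface ModelRestrictions {
  allowed_models: string[];
  blocked_models: string[];
  model_aliases: Record<string, string>;
}

// Resolve an incoming model name against aliases, then enforce block/allow lists.
function resolveModel(requested: string, r: ModelRestrictions): string {
  const actual = r.model_aliases[requested] ?? requested; // e.g. "fast-model" -> "gpt-3.5-turbo"
  if (r.blocked_models.includes(actual)) throw new Error(`Model ${actual} is blocked for this key`);
  if (r.allowed_models.length > 0 && !r.allowed_models.includes(actual)) {
    throw new Error(`Model ${actual} is not in the allow list`);
  }
  return actual;
}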

Quota and Rate Limiting

Implement comprehensive usage controls:

Quota Types:

  • Token Limits: Maximum tokens per period
  • Request Limits: Maximum API calls per period
  • Cost Limits: Maximum spending per period
  • Time-based Limits: Daily, weekly, monthly quotas
  • Burst Limits: Short-term spike allowances

Rate Limiting Configuration:

{
  "rate_limits": {
    "requests_per_minute": 100,
    "requests_per_hour": 5000,
    "tokens_per_hour": 100000,
    "concurrent_requests": 10,
    "burst_allowance": {
      "multiplier": 2,
      "duration": "5_minutes"
    }
  }
}
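
One way the per-minute limit and burst allowance could work together is a sliding-window check like the sketch below: short spikes may exceed the base rate up to the multiplier, while the sustained rate over the burst window stays bounded. All names here are illustrative assumptions, not the actual limiter implementation.

// Illustrative sliding-window limiter: short bursts may exceed the base rate,
// but the sustained rate over the burst window stays at requests_per_minute.
interface RateLimit {
  requestsPerMinute: number;
  burstMultiplier: number;    // e.g. 2
  burstWindowMinutes: number; // e.g. 5 ("5_minutes")
}

const history = new Map<string, number[]>(); // apiKey -> request timestamps (ms)

function allowRequest(apiKey: string, limit: RateLimit, now = Date.now()): boolean {
  const timestamps = (history.get(apiKey) ?? []).filter(
    t => t > now - limit.burstWindowMinutes * 60_000
  );
  const lastMinute = timestamps.filter(t => t > now - 60_000).length;
  const burstCeiling = limit.requestsPerMinute * limit.burstMultiplier;
  const sustainedCeiling = limit.requestsPerMinute * limit.burstWindowMinutes;
  if (lastMinute >= burstCeiling || timestamps.length >= sustainedCeiling) return false;
  timestamps.push(now);
  history.set(apiKey, timestamps);
  return true;
}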

IP Whitelisting and Security

Enhance security with network-level controls:

Security Features:

  • IP Whitelisting: Restrict access to specific IP addresses or ranges
  • CIDR Support: Flexible network range definitions
  • Geographic Restrictions: Country or region-based access control
  • Time-based Access: Scheduled access windows
  • Multi-factor Authentication: Additional security layers

IP Whitelist Configuration:

{
  "ip_restrictions": {
    "allowed_ips": [
      "192.168.1.100",
      "10.0.0.0/16",
      "203.0.113.0/24"
    ],
    "blocked_ips": [
      "suspicious.ip.range"
    ],
    "geographic_restrictions": {
      "allowed_countries": ["US", "CA", "GB"],
      "blocked_countries": ["CN", "RU"]
    }
  }
}
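
For reference, a CIDR check of the kind implied by allowed_ips can be implemented with a few lines of IPv4 bit arithmetic. The helpers below are a hypothetical sketch, not the relay's actual code.

// Illustrative IPv4 whitelist check supporting single addresses and CIDR ranges.
function ipToInt(ip: string): number {
  return ip.split('.').reduce((acc, octet) => (acc << 8) + parseInt(octet, 10), 0) >>> 0;
}

function matchesRule(ip: string, rule: string): boolean {
  if (!rule.includes('/')) return ip === rule;               // exact address, e.g. "192.168.1.100"
  const [network, bits] = rule.split('/');
  const prefix = parseInt(bits, 10);
  const mask = prefix === 0 ? 0 : (~0 << (32 - prefix)) >>> 0;
  return (ipToInt(ip) & mask) === (ipToInt(network) & mask);  // CIDR range, e.g. "10.0.0.0/16"
}

function isWhitelisted(ip: string, allowedIps: string[]): boolean {
  return allowedIps.some(rule => matchesRule(ip, rule));
}

// Example: isWhitelisted('10.0.42.7', ['192.168.1.100', '10.0.0.0/16']) === true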

Unified API Format

OpenAI Compatibility Layer

Standardized Integration Experience:

// Standard OpenAI format for all providers
interface ChatMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

interface UnifiedAPIRequest {
  model: string;
  messages: ChatMessage[];
  temperature?: number;
  max_tokens?: number;
  stream?: boolean;
  // ... other OpenAI-compatible parameters
}

// Supported AI Providers (automatically converted)
const supportedProviders = [
  'OpenAI',
  'Anthropic Claude',
  'Google Gemini',
  'Azure OpenAI',
  'AWS Bedrock',
  'Cohere',
  'Hugging Face',
  'LocalAI',
  'Ollama',
  'Custom Models'
];

Provider Format Conversion

Intelligent Request Translation

Automatic conversion from OpenAI format to provider-specific formats:

Anthropic Claude Conversion:

{
  "openai_request": {
    "model": "gpt-4",
    "messages": [
      {"role": "user", "content": "Hello, world!"}
    ],
    "temperature": 0.7
  },
  "claude_request": {
    "model": "claude-3-sonnet-20240229",
    "max_tokens": 1000,
    "messages": [
      {"role": "user", "content": "Hello, world!"}
    ],
    "temperature": 0.7
  }
}

Google Gemini Conversion:

{
  "openai_request": {
    "model": "gpt-4",
    "messages": [
      {"role": "system", "content": "You are helpful."},
      {"role": "user", "content": "What is AI?"}
    ]
  },
  "gemini_request": {
    "model": "gemini-pro",
    "contents": [
      {
        "parts": [{"text": "You are helpful.\n\nWhat is AI?"}]
      }
    ]
  }
}

Parameter Mapping:

  • Temperature and randomness controls
  • Token limits and response lengths
  • Streaming and callback configurations
  • Context and memory management
  • Custom parameters and extensions
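
Conceptually, each conversion is a pure mapping function over the request payload. The sketch below shows roughly how the Anthropic example above could be produced from an OpenAI-style request; the field names follow the public Anthropic Messages API, but toClaude itself is an illustrative assumption rather than the relay's real converter.

interface OpenAIMessage { role: 'system' | 'user' | 'assistant'; content: string; }
interface OpenAIRequest { model: string; messages: OpenAIMessage[]; temperature?: number; max_tokens?: number; }

// Anthropic's Messages API takes system text separately and requires max_tokens.
interface ClaudeRequest {
  model: string;
  max_tokens: number;
  system?: string;
  messages: { role: 'user' | 'assistant'; content: string }[];
  temperature?: number;
}

function toClaude(req: OpenAIRequest, targetModel = 'claude-3-sonnet-20240229'): ClaudeRequest {
  const systemText = req.messages.filter(m => m.role === 'system').map(m => m.content).join('\n');
  const nonSystem = req.messages.filter(m => m.role !== 'system');
  return {
    model: targetModel,                 // mapped model name
    max_tokens: req.max_tokens ?? 1000, // Claude requires an explicit token limit
    ...(systemText ? { system: systemText } : {}),
    messages: nonSystem.map(m => ({ role: m.role as 'user' | 'assistant', content: m.content })),
    ...(req.temperature !== undefined ? { temperature: req.temperature } : {}),
  };
}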

Configuration and Management

API Relay Settings

Enable/Disable Relay System

Control the relay system availability:

Admin Configuration:

  1. Navigate to Admin Panel → System Settings → Operations
  2. Toggle Relay API checkbox
  3. Click Save to apply changes

Configuration Options:

{
  "relay_settings": {
    "enabled": true,
    "default_provider": "openai",
    "fallback_providers": ["anthropic", "gemini"],
    "rate_limiting": true,
    "usage_tracking": true,
    "error_logging": true
  }
}

Subscription Integration

Configure how subscriptions work with the relay API:

Subscription Options:

  • Include Relay: Subscription quotas cover relay API usage
  • Exclude Relay: Relay API requires separate elastic billing (credits)
  • Hybrid Model: Partial coverage with overage billing

Configuration Steps:

  1. Go to System Settings → Operations
  2. Configure Subscription Quota Covers Relay API option
  3. Set billing preferences for relay usage
  4. Apply settings and test configuration

Provider Management

Configure and manage AI service providers:

Provider Configuration:

{
  "providers": [
    {
      "name": "openai",
      "enabled": true,
      "priority": 1,
      "models": ["gpt-4", "gpt-3.5-turbo"],
      "rate_limits": {
        "requests_per_minute": 3500,
        "tokens_per_minute": 90000
      },
      "retry_config": {
        "max_retries": 3,
        "backoff_multiplier": 2
      }
    },
    {
      "name": "anthropic",
      "enabled": true,
      "priority": 2,
      "models": ["claude-3-sonnet", "claude-3-haiku"],
      "fallback_for": ["openai"]
    }
  ]
}
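
Putting the priority, fallback_for, and retry_config fields together, the failover logic can be sketched as below. relayWithFallback and its parameters are hypothetical names; the relay's actual internals may differ.

interface ProviderConfig {
  name: string;
  priority: number;
  maxRetries: number;        // "max_retries"
  backoffMultiplier: number; // "backoff_multiplier"
}

const sleep = (ms: number) => new Promise<void>(res => setTimeout(res, ms));

// Try providers in priority order; retry each with exponential backoff before failing over.
async function relayWithFallback<T>(
  payload: unknown,
  providers: ProviderConfig[],
  callProvider: (provider: string, payload: unknown) => Promise<T>,
): Promise<T> {
  const ordered = [...providers].sort((a, b) => a.priority - b.priority);
  let lastError: unknown;
  for (const provider of ordered) {
    for (let attempt = 0; attempt <= provider.maxRetries; attempt++) {
      try {
        return await callProvider(provider.name, payload);
      } catch (err) {
        lastError = err;
        await sleep(1000 * provider.backoffMultiplier ** attempt); // 1s, 2s, 4s for multiplier 2
      }
    }
  }
  throw lastError;
}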

Monitoring and Analytics

Set up comprehensive monitoring:

Analytics Features:

  • Real-time usage statistics
  • Provider performance metrics
  • Error rate monitoring
  • Cost analysis and optimization
  • User behavior analytics

Monitoring Dashboard:

{
  "monitoring": {
    "real_time_metrics": true,
    "historical_data": "90_days",
    "alerts": {
      "high_error_rate": "5%",
      "quota_threshold": "90%",
      "response_time": "5_seconds"
    },
    "reporting": {
      "daily_summary": true,
      "weekly_analysis": true,
      "monthly_billing": true
    }
  }
}

Usage Examples

Basic API Integration

Standard OpenAI-Compatible Request:

curl -X POST 'https://your-domain.com/api/v1/chat/completions' \
  -H 'Authorization: Bearer your-api-key' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "gpt-4",
    "messages": [
      {
        "role": "user",
        "content": "Explain quantum computing in simple terms"
      }
    ],
    "temperature": 0.7,
    "max_tokens": 500
  }'

Response:

{
  "id": "chatcmpl-8abc123",
  "object": "chat.completion",
  "created": 1703123456,
  "model": "gpt-4",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Quantum computing is a revolutionary approach to computation that..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 87,
    "total_tokens": 99
  }
}
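
Because the relay speaks the OpenAI wire format, any OpenAI-compatible client can target it by pointing the base URL at your deployment. A plain TypeScript fetch version of the request above might look like this (the domain and key are placeholders, as in the curl example):

// Minimal TypeScript client for the relay endpoint shown above.
async function chat(prompt: string): Promise<string> {
  const response = await fetch('https://your-domain.com/api/v1/chat/completions', {
    method: 'POST',
    headers: {
      'Authorization': 'Bearer your-api-key',
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      model: 'gpt-4',
      messages: [{ role: 'user', content: prompt }],
      temperature: 0.7,
      max_tokens: 500,
    }),
  });
  if (!response.ok) throw new Error(`Relay error: ${response.status}`);
  const data = await response.json();
  return data.choices[0].message.content; // same shape as the OpenAI response above
}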

Advanced Features

Multi-Model Request with Fallback:

{
  "model": "gpt-4",
  "messages": [...],
  "provider_config": {
    "primary": "openai",
    "fallback": ["anthropic", "gemini"],
    "retry_strategy": "exponential_backoff"
  }
}

Custom Model Mapping:

{
  "model": "best-model",
  "model_mapping": {
    "best-model": {
      "provider": "openai",
      "actual_model": "gpt-4",
      "parameters": {
        "temperature": 0.3
      }
    }
  }
}

Security and Compliance

Security Best Practices

API Key Security:

  • Regular key rotation policies
  • Principle of least privilege
  • Secure key storage and transmission
  • Audit logging for all key operations
  • Automated security scanning

Network Security:

  • TLS 1.3 encryption for all communications
  • IP whitelisting and geographic restrictions
  • DDoS protection and rate limiting
  • Web Application Firewall (WAF) integration
  • Regular security assessments

Compliance Features

Data Protection:

{
  "compliance": {
    "data_encryption": "AES-256",
    "data_residency": "configurable",
    "retention_policies": "customizable",
    "audit_logging": "comprehensive",
    "access_controls": "role_based",
    "privacy_controls": "gdpr_compliant"
  }
}

Regulatory Support:

  • GDPR compliance with data protection controls
  • HIPAA support for healthcare applications
  • SOC 2 Type II certification readiness
  • PCI DSS compliance for payment data
  • Custom compliance frameworks

Performance and Optimization

Load Balancing

Intelligent Distribution:

  • Round-robin and weighted distribution
  • Health check-based routing
  • Geographic load balancing
  • Provider performance optimization
  • Automatic failover mechanisms
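
A weighted, health-aware selection step, one of the strategies listed above, can be sketched as follows. The Upstream shape and pickUpstream function are illustrative assumptions, not the actual scheduler.

// Illustrative weighted selection among healthy providers.
interface Upstream { name: string; weight: number; healthy: boolean; }

function pickUpstream(upstreams: Upstream[]): Upstream {
  const candidates = upstreams.filter(u => u.healthy); // health check-based routing
  if (candidates.length === 0) throw new Error('No healthy upstreams available');
  const total = candidates.reduce((sum, u) => sum + u.weight, 0);
  let roll = Math.random() * total;
  for (const u of candidates) {
    roll -= u.weight;
    if (roll <= 0) return u; // heavier weights are chosen proportionally more often
  }
  return candidates[candidates.length - 1];
}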

Caching Strategies

Response Caching:

{
  "caching": {
    "enabled": true,
    "ttl": "1_hour",
    "cache_key_strategy": "request_hash",
    "providers": ["redis", "memory"],
    "invalidation": "smart"
  }
}
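
The request_hash key strategy with a TTL could be prototyped as below, using Node's built-in crypto module and an in-memory Map standing in for Redis; this is a simplified sketch of the idea, not the production cache.

import { createHash } from 'node:crypto';

// Illustrative response cache keyed by a hash of the serialized request body.
const cache = new Map<string, { value: string; expiresAt: number }>();
const TTL_MS = 60 * 60 * 1000; // "1_hour"

function cacheKey(requestBody: object): string {
  return createHash('sha256').update(JSON.stringify(requestBody)).digest('hex');
}

function getCached(requestBody: object): string | undefined {
  const entry = cache.get(cacheKey(requestBody));
  if (!entry || entry.expiresAt < Date.now()) return undefined;
  return entry.value;
}

function setCached(requestBody: object, value: string): void {
  cache.set(cacheKey(requestBody), { value, expiresAt: Date.now() + TTL_MS });
}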

Monitoring and Alerts

Performance Metrics:

  • Response time percentiles (P50, P95, P99)
  • Request volume and throughput
  • Error rates by provider and model
  • Cost per request and optimization opportunities
  • User satisfaction and experience metrics

The API Integration & Relay system provides enterprise-grade capabilities for seamless AI service integration, enabling developers to build sophisticated applications while maintaining security, performance, and cost control. Continue with Call Records & Logging to track API usage, or explore Custom Models for private model integration.