Documentation

Everything you need to run SIP4AI and integrate with the API. Configure the CLI, set up AI providers, and control voice calls programmatically.

Installation

Install SIP4AI with a single command. The installer automatically detects your platform and downloads the appropriate binary.

Terminal
# Install SIP4AI
curl -sSL https://sip4ai.com/install.sh | bash

Supported Platforms

macOS
Intel and Apple Silicon (arm64)
Linux
x86_64 and arm64

Verify Installation

After installation, run sip4ai --version to confirm the binary is installed correctly.

CLI Usage

The sip4ai binary is a standalone application that bridges SIP telephone networks with AI voice providers.

Terminal
# Show help
./sip4ai --help

# Run the client
./sip4ai

Operating Modes

inbound (default)
Listen for incoming SIP calls and connect them to AI providers
outbound
Make outbound calls to target numbers with task-based objectives

Environment Variables

Configure the SIP client using environment variables. SIP credentials are required for telephony features; at least one AI provider key is required for voice AI.

SIP Configuration

Variable Description
SIP_USERNAME SIP account username (required)
SIP_PASSWORD SIP account password (required)
SIP_DOMAIN SIP domain, e.g., pbx.example.com (required)
SIP_SERVER SIP server address, e.g., pbx.example.com:5060 (required)
SIP_AUTH_ID Authorization ID if different from username
SIP_OUTBOUND_PROXY Outbound proxy address (e.g., proxy.example.com:5096)
SIP_TRANSPORT Transport protocol: udp (default), tcp, or tls

AI Provider API Keys

Variable Provider
OPENAI_API_KEY OpenAI Realtime API
ELEVEN_LABS_API_KEY ElevenLabs Conversational AI
GEMINI_API_KEY Google Gemini (also accepts GOOGLE_API_KEY)
DEEPGRAM_API_KEY Deepgram Voice Agent API
CARTESIA_API_KEY Cartesia Sonic (also accepts CARTESIA_KEY)

Other Variables

Variable Description
CONFIG_FILE Path to config file (default: config.json)
DEBUG_FORCE_PORT Force a specific RTP port for debugging
AUDIO_FILE Play an audio file instead of AI (fallback mode)
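
A wrapper script can check the environment before launching the binary. The following is a minimal sketch, not part of SIP4AI: the helper names missing_sip_vars and has_provider_key are hypothetical, and the variable names are copied from the tables above. How the binary itself reacts to a missing variable is not covered here.

```python
import os

# Variable names taken from the tables above.
REQUIRED_SIP_VARS = ["SIP_USERNAME", "SIP_PASSWORD", "SIP_DOMAIN", "SIP_SERVER"]
PROVIDER_KEY_VARS = [
    "OPENAI_API_KEY",
    "ELEVEN_LABS_API_KEY",
    "GEMINI_API_KEY",
    "GOOGLE_API_KEY",
    "DEEPGRAM_API_KEY",
    "CARTESIA_API_KEY",
    "CARTESIA_KEY",
]


def missing_sip_vars(env):
    """Return the required SIP variables that are unset or empty in env."""
    return [name for name in REQUIRED_SIP_VARS if not env.get(name)]


def has_provider_key(env):
    """Return True if at least one AI provider key is set in env."""
    return any(env.get(name) for name in PROVIDER_KEY_VARS)


if __name__ == "__main__":
    missing = missing_sip_vars(os.environ)
    if missing:
        print("Missing SIP variables:", ", ".join(missing))
    if not has_provider_key(os.environ):
        print("No AI provider API key found")
```

Both helpers take a plain mapping, so they can be exercised with a dictionary instead of the real environment.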

Configuration File

The config.json file controls the behavior of the SIP client, including the AI provider settings, operating mode, and function definitions.

config.json — Full example
{
  "mode": "inbound",              // "inbound" or "outbound"
  "api_port": 8080,               // HTTP API port (0 to disable)
  "provider": "openai",           // AI provider to use

  "openai": {
    "instructions": "You are a helpful assistant...",
    "voice": "alloy",             // alloy, echo, fable, onyx, nova, shimmer
    "greeting": "Hello!"          // For outbound calls
  },

  "elevenlabs": {
    "agent_id": "your-agent-id",
    "first_message": "Hello!",
    "system_prompt": "You are..."
  },

  "gemini": {
    "model": "models/gemini-2.0-flash-exp",
    "voice": "Puck",              // Aoede, Charon, Fenrir, Kore, Puck
    "greeting": "Hello!"
  },

  "deepgram": {
    "model": "aura-2-thalia-en",
    "listen_model": "nova-3",
    "think_provider": "open_ai",
    "think_model": "gpt-4o-mini"
  },

  "cartesia": {
    "voice_id": "your-voice-id",
    "stt_provider": "deepgram",
    "llm_provider": "openai",
    "llm_model": "gpt-4o-mini"
  },

  "outbound": {
    "target_number": "+1234567890",
    "caller_id": "+1987654321",
    "task_description": "Call to confirm appointment...",
    "result_webhook": "https://example.com/webhook",
    "hangup_on_task_complete": true
  },

  "functions": [
    {
      "name": "get_customer_info",
      "description": "Look up customer by phone",
      "webhook": "https://api.example.com/customer",
      "method": "POST",
      "parameters": {
        "type": "object",
        "properties": {
          "phone": { "type": "string" }
        }
      }
    }
  ]
}

Top-level Options

mode
Operating mode: inbound (default) or outbound
api_port
HTTP API port. Set to 0 to disable the API server.
provider
AI provider: openai, elevenlabs, gemini, deepgram, cartesia
functions
Array of function definitions for AI to call during conversations
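
Standard JSON has no comment syntax, so the // annotations in the full example are best read as documentation. Generating the file from code sidesteps the question of whether they are accepted. The sketch below builds a comment-free config from the top-level options listed above; build_config is a hypothetical helper, and SIP4AI may apply stricter validation than this.

```python
import json

VALID_MODES = {"inbound", "outbound"}
VALID_PROVIDERS = {"openai", "elevenlabs", "gemini", "deepgram", "cartesia"}


def build_config(provider, mode="inbound", api_port=8080, provider_settings=None):
    """Build a config dict from the documented top-level options."""
    if mode not in VALID_MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    if provider not in VALID_PROVIDERS:
        raise ValueError(f"unknown provider: {provider!r}")
    return {
        "mode": mode,
        "api_port": api_port,
        "provider": provider,
        # Provider-specific settings live under a key named after the provider.
        provider: dict(provider_settings or {}),
        "functions": [],
    }


config = build_config(
    "openai",
    provider_settings={"instructions": "You are a helpful assistant...", "voice": "alloy"},
)
config_text = json.dumps(config, indent=2)
print(config_text)  # write this text to config.json
```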

Examples

Common usage patterns for running the SIP4AI client.

Run in API-only mode (no SIP)

./sip4ai

Run with SIP and OpenAI

SIP_USERNAME=user \
SIP_PASSWORD=pass \
SIP_DOMAIN=pbx.example.com \
SIP_SERVER=pbx.example.com \
OPENAI_API_KEY=sk-... \
./sip4ai

Run with custom config file

CONFIG_FILE=my_config.json ./sip4ai

Outbound call mode

CONFIG_FILE=outbound_config.json \
SIP_USERNAME=user \
SIP_PASSWORD=pass \
SIP_DOMAIN=pbx.example.com \
SIP_SERVER=pbx.example.com \
OPENAI_API_KEY=sk-... \
./sip4ai

API Reference

Introduction

The SIP4AI HTTP API provides programmatic access to manage calls, inject prompts, and monitor system status. The API server runs inside the same process as the SIP client and listens on port 8080 by default.

Base URL

http://localhost:8080

Configure the port with the api_port option in config.json (0 disables the API server) or the API_PORT environment variable.

Available Endpoints

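Method Endpoint Description
GET /health Health check endpoint
GET /status Detailed system status
POST /api/calls Initiate an outbound call
GET /api/calls List calls
GET /api/calls/{call_id} Retrieve a call
POST /api/calls/{call_id}/inject Inject prompt into active call
DELETE /api/calls/{call_id} End a call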

Authentication

The SIP4AI HTTP API does not require authentication by default. It is designed to run on localhost and be accessed by local applications. For production deployments, place the API behind a reverse proxy with authentication.

Security Notice

Do not expose the API directly to the internet. Use a reverse proxy (nginx, Caddy) with authentication for remote access.

Errors

The API uses standard HTTP status codes to indicate success or failure. Error responses include a JSON body with details about the error.

HTTP Status Codes

200 OK
Request succeeded
400 Bad Request
Invalid request parameters or body
404 Not Found
Resource (e.g., call ID) not found
500 Internal Server Error
Server error; contact support if it persists
404 Response
{
  "error": "call not found",
  "code": "NOT_FOUND"
}
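
A client can fold the status code and error body into a single message. The sketch below assumes every error body carries the "error" and "code" fields shown in the 404 example, which this document does not confirm for other status codes; describe_error is a hypothetical helper name.

```python
import json


def describe_error(status_code, body_text):
    """Turn an error response into a one-line message.

    Assumes the JSON error body has the "error" and "code" fields shown in
    the 404 example; other status codes may use a different shape.
    """
    try:
        body = json.loads(body_text)
    except json.JSONDecodeError:
        body = {}
    if not isinstance(body, dict):
        body = {}
    message = body.get("error", "unknown error")
    code = body.get("code", "UNKNOWN")
    return f"HTTP {status_code} [{code}]: {message}"


print(describe_error(404, '{"error": "call not found", "code": "NOT_FOUND"}'))
```

Falling back to placeholder values keeps the helper usable when a proxy or crash returns a non-JSON body.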
GET /health

Health check

Returns a simple health status. Use this endpoint for load balancer health checks or uptime monitoring.

Response

status string
Always ok when the server is running
Request
curl http://localhost:8080/health
import requests

response = requests.get("http://localhost:8080/health")
print(response.json())
const response = await fetch('http://localhost:8080/health');
const data = await response.json();
console.log(data);
200 Response
{
  "status": "ok"
}
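
For startup scripts, it can help to wait until the health endpoint answers before sending other requests. This is a minimal polling sketch using only the standard library. The function names, attempt count, and delay are illustrative choices, not SIP4AI defaults.

```python
import json
import time
import urllib.request


def fetch_health(base_url="http://localhost:8080"):
    """GET /health and return the decoded JSON body."""
    with urllib.request.urlopen(f"{base_url}/health", timeout=2) as resp:
        return json.load(resp)


def wait_until_healthy(fetch=fetch_health, attempts=10, delay=0.5):
    """Poll until /health reports status "ok"; return True on success."""
    for attempt in range(attempts):
        try:
            if fetch().get("status") == "ok":
                return True
        except (OSError, ValueError):
            # Connection refused, timeout, or a non-JSON body: retry.
            pass
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

Calling wait_until_healthy() with no arguments polls http://localhost:8080/health. Passing a different fetch callable makes the loop testable without a running server.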
GET /status

Get system status

Returns detailed system status including SIP registration state and active call information.

Response

sip_registered boolean
Whether SIP client is registered with the server
active_calls integer
Number of currently active calls
calls array
List of active call objects with id, status, and duration
uptime string
Human-readable uptime string
Request
curl http://localhost:8080/status
import requests

response = requests.get("http://localhost:8080/status")
print(response.json())
const response = await fetch('http://localhost:8080/status');
const data = await response.json();
console.log(data);
200 Response
{
  "sip_registered": true,
  "active_calls": 1,
  "calls": [
    {
      "id": "abc-123-def",
      "status": "in_progress",
      "duration": 45
    }
  ],
  "uptime": "2h15m30s"
}
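
A monitoring script might condense this payload into one log line. The sketch below works on the sample response above. It assumes the uptime string uses only h, m, and s units as in the sample, since the exact format is not specified here, and both helper names are hypothetical.

```python
import re


def parse_uptime(text):
    """Convert an uptime string such as "2h15m30s" to whole seconds.

    Assumes only h, m, and s units appear, as in the sample response.
    """
    units = {"h": 3600, "m": 60, "s": 1}
    parts = re.findall(r"(\d+)([hms])", text)
    if not parts or "".join(n + u for n, u in parts) != text:
        raise ValueError(f"unrecognised uptime format: {text!r}")
    return sum(int(n) * units[u] for n, u in parts)


def summarize_status(status):
    """Return a one-line summary of a /status payload."""
    state = "registered" if status["sip_registered"] else "not registered"
    ids = ", ".join(call["id"] for call in status["calls"]) or "none"
    return (
        f"SIP {state}; {status['active_calls']} active call(s) [{ids}]; "
        f"up {parse_uptime(status['uptime'])}s"
    )


sample = {
    "sip_registered": True,
    "active_calls": 1,
    "calls": [{"id": "abc-123-def", "status": "in_progress", "duration": 45}],
    "uptime": "2h15m30s",
}
print(summarize_status(sample))
```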

The Call object

A Call represents a voice conversation between the AI and a phone number. Calls can be inbound (received) or outbound (initiated via API).

Attributes

id string
Unique identifier for the call (UUID format)
status string
Current status: initiating, ringing, in_progress, completed, failed
direction string
Either inbound or outbound
target string
Phone number in E.164 format (e.g., +61400000000)
provider string
AI provider used: openai, elevenlabs, gemini, deepgram, cartesia
duration integer
Call duration in seconds (null if not completed)
transcript array
Full conversation transcript with speaker labels
created_at timestamp
ISO 8601 timestamp of when the call was created
The Call object
{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "in_progress",
  "direction": "outbound",
  "target": "+61400000000",
  "provider": "openai",
  "duration": null,
  "transcript": [],
  "created_at": "2024-01-15T10:30:00Z"
}
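
In client code, the Call object maps naturally onto a small typed record. The following sketch mirrors the attributes listed above. The class itself, the is_finished property, and the choice of completed and failed as terminal statuses are assumptions for illustration, not part of the API.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional

# Statuses after which a call no longer changes (an assumption based on the
# status names listed above).
TERMINAL_STATUSES = {"completed", "failed"}


@dataclass
class Call:
    id: str
    status: str
    direction: str
    target: str
    provider: str
    created_at: datetime
    duration: Optional[int] = None
    transcript: list = field(default_factory=list)

    @classmethod
    def from_dict(cls, data):
        """Build a Call from the JSON object shown above."""
        return cls(
            id=data["id"],
            status=data["status"],
            direction=data["direction"],
            target=data["target"],
            provider=data["provider"],
            # fromisoformat() in older Python versions rejects a trailing "Z".
            created_at=datetime.fromisoformat(data["created_at"].replace("Z", "+00:00")),
            duration=data.get("duration"),
            transcript=list(data.get("transcript") or []),
        )

    @property
    def is_finished(self):
        """True once the call has reached a terminal status."""
        return self.status in TERMINAL_STATUSES


sample_call = Call.from_dict({
    "id": "550e8400-e29b-41d4-a716-446655440000",
    "status": "in_progress",
    "direction": "outbound",
    "target": "+61400000000",
    "provider": "openai",
    "duration": None,
    "transcript": [],
    "created_at": "2024-01-15T10:30:00Z",
})
print(sample_call.status, sample_call.is_finished)
```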
POST /api/calls

Create a call

Initiate an outbound call to a phone number. The AI will attempt to reach the target and complete the specified task.

Request Body

target required
Phone number to call in E.164 format
task required
The objective for the AI to accomplish during the call
instructions optional
Additional behavioral instructions for the AI
provider optional
AI provider to use. Defaults to config setting.
caller_id optional
Outbound caller ID to display (must be verified)
max_duration_seconds optional
Maximum call duration in seconds. Default: 300 (5 minutes)
event_webhook optional
URL to receive call events via POST
functions optional
Array of function definitions the AI can call
Request
curl -X POST https://your-instance.com/api/calls \
  -H "Authorization: Bearer sk_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "target": "+61400000000",
    "task": "Confirm the appointment for Tuesday at 2pm",
    "instructions": "Be friendly and professional",
    "max_duration_seconds": 300,
    "event_webhook": "https://your-app.com/webhook"
  }'
import requests

response = requests.post(
    "https://your-instance.com/api/calls",
    headers={
        "Authorization": "Bearer sk_live_...",
        "Content-Type": "application/json"
    },
    json={
        "target": "+61400000000",
        "task": "Confirm the appointment for Tuesday at 2pm",
        "instructions": "Be friendly and professional",
        "max_duration_seconds": 300,
        "event_webhook": "https://your-app.com/webhook"
    }
)

print(response.json())
const response = await fetch('https://your-instance.com/api/calls', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer sk_live_...',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    target: '+61400000000',
    task: 'Confirm the appointment for Tuesday at 2pm',
    instructions: 'Be friendly and professional',
    max_duration_seconds: 300,
    event_webhook: 'https://your-app.com/webhook'
  })
});

const data = await response.json();
console.log(data);
201 Response
{
  "success": true,
  "call_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "initiating",
  "created_at": "2024-01-15T10:30:00Z"
}
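
Validating the request body on the client side catches malformed numbers before a call is attempted. This sketch checks the target against the common E.164 shape (a plus sign followed by up to 15 digits) and rejects field names not listed under Request Body. build_call_request is a hypothetical helper, and the server may enforce different or additional rules.

```python
import re

# "+" followed by 2 to 15 digits, the first of which is not zero.
E164_PATTERN = re.compile(r"\+[1-9]\d{1,14}")

OPTIONAL_FIELDS = {
    "instructions", "provider", "caller_id",
    "max_duration_seconds", "event_webhook", "functions",
}


def build_call_request(target, task, **optional):
    """Validate inputs and return the JSON body for POST /api/calls."""
    if not E164_PATTERN.fullmatch(target):
        raise ValueError(f"target is not in E.164 format: {target!r}")
    if not task.strip():
        raise ValueError("task must not be empty")
    unknown = set(optional) - OPTIONAL_FIELDS
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    return {"target": target, "task": task, **optional}


body = build_call_request(
    "+61400000000",
    "Confirm the appointment for Tuesday at 2pm",
    max_duration_seconds=300,
)
print(body)
```

The returned dictionary can be passed as the json argument in the Python request shown above.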
GET /api/calls/{call_id}

Retrieve a call

Retrieve details of a specific call including status, duration, and transcript.

Path Parameters

call_id required
The unique identifier of the call to retrieve
Request
curl https://your-instance.com/api/calls/550e8400... \
  -H "Authorization: Bearer sk_live_..."
import requests

call_id = "550e8400..."
response = requests.get(
    f"https://your-instance.com/api/calls/{call_id}",
    headers={"Authorization": "Bearer sk_live_..."}
)

print(response.json())
const callId = '550e8400...';
const response = await fetch(`https://your-instance.com/api/calls/${callId}`, {
  headers: {
    'Authorization': 'Bearer sk_live_...'
  }
});

const data = await response.json();
console.log(data);
200 Response
{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "completed",
  "direction": "outbound",
  "target": "+61400000000",
  "provider": "openai",
  "duration": 127,
  "transcript": [
    {
      "speaker": "ai",
      "text": "Hello, this is calling about your Tuesday appointment...",
      "timestamp": 0.5
    },
    {
      "speaker": "user",
      "text": "Yes, that's correct. 2pm works for me.",
      "timestamp": 5.2
    }
  ],
  "created_at": "2024-01-15T10:30:00Z",
  "ended_at": "2024-01-15T10:32:07Z"
}
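
Once a call has been retrieved, the transcript and timestamps can be turned into a readable log. The sketch below operates on the response shown above. The helper names are hypothetical, and the timestamp field is assumed to be seconds from the start of the call, which the sample suggests but does not state.

```python
from datetime import datetime


def _parse_iso(ts):
    """Parse an ISO 8601 timestamp that ends in "Z"."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def elapsed_seconds(call):
    """Seconds between created_at and ended_at."""
    delta = _parse_iso(call["ended_at"]) - _parse_iso(call["created_at"])
    return int(delta.total_seconds())


def format_transcript(call):
    """Return one formatted line per transcript entry."""
    return [
        f"[{entry['timestamp']:.1f}s] {entry['speaker']}: {entry['text']}"
        for entry in call.get("transcript", [])
    ]


sample = {
    "duration": 127,
    "transcript": [
        {"speaker": "ai", "text": "Hello, this is calling about your Tuesday appointment...", "timestamp": 0.5},
        {"speaker": "user", "text": "Yes, that's correct. 2pm works for me.", "timestamp": 5.2},
    ],
    "created_at": "2024-01-15T10:30:00Z",
    "ended_at": "2024-01-15T10:32:07Z",
}
for line in format_transcript(sample):
    print(line)
print(elapsed_seconds(sample))
```

In the sample, the gap between created_at and ended_at equals the reported duration of 127 seconds.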
GET /api/calls

List all calls

Returns a paginated list of calls, sorted by creation date (newest first).

Query Parameters

limit optional
Number of results per page (1-100). Default: 20
offset optional
Number of results to skip. Default: 0
status optional
Filter by call status
Request
curl "https://your-instance.com/api/calls?limit=20&status=completed" \
  -H "Authorization: Bearer sk_live_..."
import requests

response = requests.get(
    "https://your-instance.com/api/calls",
    headers={"Authorization": "Bearer sk_live_..."},
    params={"limit": 20, "status": "completed"}
)

print(response.json())
const params = new URLSearchParams({
  limit: '20',
  status: 'completed'
});

const response = await fetch(`https://your-instance.com/api/calls?${params}`, {
  headers: {
    'Authorization': 'Bearer sk_live_...'
  }
});

const data = await response.json();
console.log(data);
POST /api/calls/{call_id}/inject

Inject context

Inject additional context or instructions into an active call. The AI will incorporate this information into the ongoing conversation.

Request Body

message required
Context or instruction to inject into the conversation
role optional
Message role: system (default) or user
Request
curl -X POST https://your-instance.com/api/calls/550e8400.../inject \
  -H "Authorization: Bearer sk_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "message": "The customer has a VIP status. Offer them a 20% discount.",
    "role": "system"
  }'
import requests

call_id = "550e8400..."
response = requests.post(
    f"https://your-instance.com/api/calls/{call_id}/inject",
    headers={
        "Authorization": "Bearer sk_live_...",
        "Content-Type": "application/json"
    },
    json={
        "message": "The customer has a VIP status. Offer them a 20% discount.",
        "role": "system"
    }
)

print(response.json())
const callId = '550e8400...';
const response = await fetch(`https://your-instance.com/api/calls/${callId}/inject`, {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer sk_live_...',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    message: 'The customer has a VIP status. Offer them a 20% discount.',
    role: 'system'
  })
});

const data = await response.json();
console.log(data);
DELETE /api/calls/{call_id}

End a call

End an active call. The AI says a brief goodbye before hanging up, so the response reports the status as ending.

Request
curl -X DELETE https://your-instance.com/api/calls/550e8400... \
  -H "Authorization: Bearer sk_live_..."
import requests

call_id = "550e8400..."
response = requests.delete(
    f"https://your-instance.com/api/calls/{call_id}",
    headers={"Authorization": "Bearer sk_live_..."}
)

print(response.json())
const callId = '550e8400...';
const response = await fetch(`https://your-instance.com/api/calls/${callId}`, {
  method: 'DELETE',
  headers: {
    'Authorization': 'Bearer sk_live_...'
  }
});

const data = await response.json();
console.log(data);
200 Response
{
  "success": true,
  "call_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "ending"
}

Functions

Functions allow the AI to call your webhooks during a conversation. This enables real-time actions like booking appointments, checking order status, or transferring calls.

Real-time execution

Functions execute immediately when the AI decides to call them. Hold audio plays while waiting for your response.

Webhook security

All function calls include a signature header for verification. Validate requests to ensure they're from SIP4AI.

Define functions

Define functions when creating a call. Each function needs a name, description, parameters schema, and webhook URL.

Function definition example
{
  "functions": [
    {
      "name": "book_appointment",
      "description": "Book an appointment for the caller",
      "webhook": "https://your-app.com/api/book",
      "hold_audio_url": "https://your-app.com/audio/hold.wav",
      "parameters": {
        "type": "object",
        "properties": {
          "date": {
            "type": "string",
            "description": "The appointment date (YYYY-MM-DD)"
          },
          "time": {
            "type": "string",
            "description": "The appointment time (HH:MM)"
          }
        },
        "required": ["date", "time"]
      }
    },
    {
      "name": "transfer_call",
      "description": "Transfer the call to a human agent",
      "type": "transfer",
      "target": "+61400000001"
    }
  ]
}
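
Function definitions can be checked before they are sent. The sketch below encodes the rules suggested by the example: webhook functions carry a name, description, parameters schema, and webhook URL, while a function with type transfer carries a target instead. That split, and the helper name validate_function, are inferences from the example rather than documented validation rules.

```python
def validate_function(definition):
    """Return a list of problems found in one function definition."""
    problems = []
    for key in ("name", "description"):
        if not definition.get(key):
            problems.append(f"missing {key}")
    if definition.get("type") == "transfer":
        if not definition.get("target"):
            problems.append("transfer function needs a target")
        return problems
    if not definition.get("webhook"):
        problems.append("missing webhook")
    params = definition.get("parameters")
    if not isinstance(params, dict):
        problems.append("missing parameters")
    else:
        declared = set(params.get("properties", {}))
        undeclared = [n for n in params.get("required", []) if n not in declared]
        if undeclared:
            problems.append(f"required names not in properties: {undeclared}")
    return problems


book = {
    "name": "book_appointment",
    "description": "Book an appointment for the caller",
    "webhook": "https://your-app.com/api/book",
    "parameters": {
        "type": "object",
        "properties": {"date": {"type": "string"}, "time": {"type": "string"}},
        "required": ["date", "time"],
    },
}
transfer = {
    "name": "transfer_call",
    "description": "Transfer the call to a human agent",
    "type": "transfer",
    "target": "+61400000001",
}
print(validate_function(book), validate_function(transfer))
```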

Webhook events

SIP4AI sends webhook events to your specified URL for important call lifecycle events.

Event Description
call.initiated Call has been initiated, dialing target
call.ringing Target phone is ringing
call.answered Call was answered, AI conversation starting
call.completed Call ended normally
call.failed Call failed (busy, no answer, error)
function.called AI invoked a function during the call
transcript.update New transcript segment available

Payload format

All webhook payloads follow a consistent format with event type, call ID, timestamp, and event-specific data.

call.completed webhook payload
{
  "event": "call.completed",
  "call_id": "550e8400-e29b-41d4-a716-446655440000",
  "timestamp": "2024-01-15T10:32:07Z",
  "data": {
    "duration": 127,
    "end_reason": "completed",
    "transcript": [
      {
        "speaker": "ai",
        "text": "Hello, this is calling about your Tuesday appointment...",
        "timestamp": 0.5
      }
    ],
    "functions_called": [
      {
        "name": "book_appointment",
        "arguments": { "date": "2024-01-16", "time": "14:00" },
        "result": { "success": true }
      }
    ]
  }
}
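
A webhook receiver typically branches on the event field. The sketch below handles the call.completed payload shown above and falls back to a generic line for the other event types, whose data fields are not documented here. summarize_event is a hypothetical helper; wiring it into an HTTP server is left out.

```python
# Event names copied from the table above.
KNOWN_EVENTS = {
    "call.initiated", "call.ringing", "call.answered", "call.completed",
    "call.failed", "function.called", "transcript.update",
}


def summarize_event(payload):
    """Return a one-line description of a webhook payload."""
    event = payload["event"]
    if event not in KNOWN_EVENTS:
        raise ValueError(f"unknown event type: {event!r}")
    call_id = payload["call_id"]
    data = payload.get("data") or {}
    if event == "call.completed":
        names = ", ".join(f["name"] for f in data.get("functions_called", [])) or "none"
        return (
            f"{call_id}: completed after {data['duration']}s "
            f"({data['end_reason']}), functions called: {names}"
        )
    # Other events: only the common envelope fields are relied on.
    return f"{call_id}: {event} at {payload['timestamp']}"


sample_event = {
    "event": "call.completed",
    "call_id": "550e8400-e29b-41d4-a716-446655440000",
    "timestamp": "2024-01-15T10:32:07Z",
    "data": {
        "duration": 127,
        "end_reason": "completed",
        "transcript": [],
        "functions_called": [
            {"name": "book_appointment",
             "arguments": {"date": "2024-01-16", "time": "14:00"},
             "result": {"success": True}}
        ],
    },
}
print(summarize_event(sample_event))
```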

OpenAI Realtime

GPT-4o-powered realtime voice conversations with advanced reasoning capabilities.

Configuration Options

voice
Voice to use: shimmer, alloy, echo, fable, onyx, nova
turn_detection
VAD configuration with threshold and silence duration
instructions
System prompt for the AI assistant
OpenAI configuration
{
  "provider": "openai",
  "openai": {
    "instructions": "You are a helpful assistant...",
    "voice": "shimmer",
    "turn_detection": {
      "type": "server_vad",
      "threshold": 0.7,
      "silence_duration_ms": 1000
    }
  }
}

ElevenLabs

Native 8 kHz audio support, matching standard telephone audio, for low latency. Configure your agent in the ElevenLabs dashboard.

ElevenLabs configuration
{
  "provider": "elevenlabs",
  "elevenlabs": {
    "agent_id": "your-agent-id",
    "first_message": "Hello! How can I help you today?",
    "system_prompt": "You are a helpful assistant..."
  }
}

Gemini Live

Google's Gemini 2.0 Flash with five distinct voices and built-in function calling.

Gemini configuration
{
  "provider": "gemini",
  "gemini": {
    "model": "models/gemini-2.0-flash-exp",
    "voice": "Kore",
    "first_message": "Hello!",
    "system_prompt": "You are a helpful assistant..."
  }
}