Documentation
Everything you need to run SIP4AI and integrate with the API. Configure the CLI, set up AI providers, and control voice calls programmatically.
Installation
Install SIP4AI with a single command. The installer automatically detects your platform and downloads the appropriate binary.
# Install SIP4AI
curl -sSL https://sip4ai.com/install.sh | bash
Supported Platforms
macOS
Linux
Verify Installation
After installation, run sip4ai --version to confirm the binary is installed correctly.
CLI Usage
The sip4ai binary is a standalone application that bridges SIP telephone networks with AI voice providers.
# Show help
./sip4ai --help
# Run the client
./sip4ai
Operating Modes
inbound (default)
outbound
Environment Variables
Configure the SIP client using environment variables. SIP credentials are required for telephony features; at least one AI provider key is required for voice AI.
SIP Configuration
| Variable | Required | Description |
|---|---|---|
| SIP_USERNAME | Yes | SIP account username |
| SIP_PASSWORD | Yes | SIP account password |
| SIP_DOMAIN | Yes | SIP domain (e.g., pbx.example.com) |
| SIP_SERVER | Yes | SIP server address (e.g., pbx.example.com:5060) |
| SIP_AUTH_ID | No | Authorization ID if different from username |
| SIP_OUTBOUND_PROXY | No | Outbound proxy address (e.g., proxy.example.com:5096) |
| SIP_TRANSPORT | No | Transport protocol: udp (default), tcp, or tls |
AI Provider API Keys
| Variable | Provider |
|---|---|
| OPENAI_API_KEY | OpenAI Realtime API |
| ELEVEN_LABS_API_KEY | ElevenLabs Conversational AI |
| GEMINI_API_KEY | Google Gemini (also accepts GOOGLE_API_KEY) |
| DEEPGRAM_API_KEY | Deepgram Voice Agent API |
| CARTESIA_API_KEY | Cartesia Sonic (also accepts CARTESIA_KEY) |
Other Variables
| Variable | Description |
|---|---|
| CONFIG_FILE | Path to config file (default: config.json) |
| DEBUG_FORCE_PORT | Force a specific RTP port for debugging |
| AUDIO_FILE | Play an audio file instead of AI (fallback mode) |
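A launcher script can check these variables before starting the binary. The sketch below is an illustration only and not part of SIP4AI: the helper names missing_sip_vars and configured_providers are hypothetical, while the variable names are taken from the tables above.

```python
import os

# Required SIP variables, as marked in the SIP Configuration table.
REQUIRED_SIP_VARS = ("SIP_USERNAME", "SIP_PASSWORD", "SIP_DOMAIN", "SIP_SERVER")

# Provider key variables from the AI Provider API Keys table,
# including the documented alternative names.
PROVIDER_KEY_VARS = {
    "openai": ("OPENAI_API_KEY",),
    "elevenlabs": ("ELEVEN_LABS_API_KEY",),
    "gemini": ("GEMINI_API_KEY", "GOOGLE_API_KEY"),
    "deepgram": ("DEEPGRAM_API_KEY",),
    "cartesia": ("CARTESIA_API_KEY", "CARTESIA_KEY"),
}


def missing_sip_vars(env):
    """Return the required SIP variables that are unset or empty."""
    return [name for name in REQUIRED_SIP_VARS if not env.get(name)]


def configured_providers(env):
    """Return the providers that have at least one key variable set."""
    return [
        provider
        for provider, names in PROVIDER_KEY_VARS.items()
        if any(env.get(name) for name in names)
    ]


if __name__ == "__main__":
    print("Missing SIP variables:", missing_sip_vars(os.environ))
    print("Providers with keys:", configured_providers(os.environ))
```

Passing the environment as an argument keeps both helpers easy to test with a plain dictionary.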
Configuration File
The config.json file controls the behavior of the SIP client, including the AI provider settings, operating mode, and function definitions.
{
"mode": "inbound", // "inbound" or "outbound"
"api_port": 8080, // HTTP API port (0 to disable)
"provider": "openai", // AI provider to use
"openai": {
"instructions": "You are a helpful assistant...",
"voice": "alloy", // alloy, echo, fable, onyx, nova, shimmer
"greeting": "Hello!" // For outbound calls
},
"elevenlabs": {
"agent_id": "your-agent-id",
"first_message": "Hello!",
"system_prompt": "You are..."
},
"gemini": {
"model": "models/gemini-2.0-flash-exp",
"voice": "Puck", // Aoede, Charon, Fenrir, Kore, Puck
"greeting": "Hello!"
},
"deepgram": {
"model": "aura-2-thalia-en",
"listen_model": "nova-3",
"think_provider": "open_ai",
"think_model": "gpt-4o-mini"
},
"cartesia": {
"voice_id": "your-voice-id",
"stt_provider": "deepgram",
"llm_provider": "openai",
"llm_model": "gpt-4o-mini"
},
"outbound": {
"target_number": "+1234567890",
"caller_id": "+1987654321",
"task_description": "Call to confirm appointment...",
"result_webhook": "https://example.com/webhook",
"hangup_on_task_complete": true
},
"functions": [
{
"name": "get_customer_info",
"description": "Look up customer by phone",
"webhook": "https://api.example.com/customer",
"method": "POST",
"parameters": {
"type": "object",
"properties": {
"phone": { "type": "string" }
}
}
}
]
}
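The // comments in the sample above are annotations. Standard JSON has no comment syntax, and this page does not say whether the SIP4AI loader tolerates them, so one cautious option is to generate the file from code. A minimal sketch, in which the helper name build_config and its defaults are illustrative choices rather than anything SIP4AI provides:

```python
import json


def build_config(provider="openai", mode="inbound", api_port=8080, **sections):
    """Build a config dict using the top-level keys from the sample above."""
    config = {"mode": mode, "api_port": api_port, "provider": provider}
    # Provider sections ("openai", "gemini", ...), "outbound", and "functions"
    # are passed through unchanged.
    config.update(sections)
    return config


config = build_config(
    openai={"instructions": "You are a helpful assistant...", "voice": "alloy"},
    functions=[],
)

# json.dumps emits strict JSON, so the output contains no comments.
text = json.dumps(config, indent=2)
print(text)
```

Writing text to the path named by CONFIG_FILE then gives the client a file that any JSON parser accepts.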
Top-level Options
mode: inbound (default) or outbound
api_port: HTTP API port; set to 0 to disable the API server
provider: openai, elevenlabs, gemini, deepgram, or cartesia
functions: list of function definitions
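The allowed values above can be checked before the client starts. The following is a sketch under assumptions: check_top_level is a hypothetical helper, the port range is the usual TCP range rather than a documented limit, and treating a missing provider as a problem is a guess about what the client expects.

```python
VALID_MODES = ("inbound", "outbound")
VALID_PROVIDERS = ("openai", "elevenlabs", "gemini", "deepgram", "cartesia")


def check_top_level(config):
    """Return a list of problems found in the top-level options."""
    problems = []

    mode = config.get("mode", "inbound")  # inbound is the documented default
    if mode not in VALID_MODES:
        problems.append(f"mode must be one of {VALID_MODES}, got {mode!r}")

    provider = config.get("provider")  # assumption: a provider must be named
    if provider not in VALID_PROVIDERS:
        problems.append(f"provider must be one of {VALID_PROVIDERS}, got {provider!r}")

    port = config.get("api_port", 8080)
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(port, bool) or not isinstance(port, int) or not 0 <= port <= 65535:
        problems.append(f"api_port must be an integer from 0 to 65535, got {port!r}")

    return problems


print(check_top_level({"mode": "inbound", "provider": "openai", "api_port": 0}))
```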
Examples
Common usage patterns for running the SIP4AI client.
Run in API-only mode (no SIP)
./sip4ai
Run with SIP and OpenAI
SIP_USERNAME=user \
SIP_PASSWORD=pass \
SIP_DOMAIN=pbx.example.com \
SIP_SERVER=pbx.example.com \
OPENAI_API_KEY=sk-... \
./sip4ai
Run with custom config file
CONFIG_FILE=my_config.json ./sip4ai
Outbound call mode
CONFIG_FILE=outbound_config.json \
SIP_USERNAME=user \
SIP_PASSWORD=pass \
SIP_DOMAIN=pbx.example.com \
SIP_SERVER=pbx.example.com \
OPENAI_API_KEY=sk-... \
./sip4ai
API Reference
Introduction
The SIP4AI HTTP API provides programmatic access to manage calls, inject prompts, and monitor system status. The API server runs inside the SIP client process and listens on port 8080 by default.
Base URL
http://localhost:8080
Configure the port using the API_PORT environment variable. config.json also exposes an api_port option.
Available Endpoints
| Method | Endpoint | Description |
|---|---|---|
| GET | /health | Health check endpoint |
| GET | /status | Detailed system status |
| POST | /call | Initiate an outbound call |
| POST | /calls/:id/inject | Inject prompt into active call |
Authentication
The SIP4AI HTTP API does not require authentication by default. It is designed to run on localhost and be accessed by local applications. For production deployments, place the API behind a reverse proxy with authentication.
Security Notice
Do not expose the API directly to the internet. Use a reverse proxy (nginx, Caddy) with authentication for remote access.
Errors
The API uses standard HTTP status codes to indicate success or failure. Error responses include a JSON body with details about the error.
HTTP Status Codes
| Code | Meaning |
|---|---|
| 200 | OK |
| 400 | Bad Request |
| 404 | Not Found |
| 500 | Internal Error |
{
"error": "call not found",
"code": "NOT_FOUND"
}
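A client can turn the error body above into an exception. This is a minimal sketch: the ApiError class, the parse_error helper, and the "UNKNOWN" fallback code are all illustrative names, and only the error and code keys come from the sample response.

```python
import json


class ApiError(Exception):
    """Error built from the JSON error body shown above."""

    def __init__(self, status, message, code):
        super().__init__(f"HTTP {status}: {message} ({code})")
        self.status = status
        self.message = message
        self.code = code


def parse_error(status, body_text):
    """Turn an HTTP status and response body into an ApiError.

    Falls back to the raw body text when the body is not a JSON object.
    """
    try:
        body = json.loads(body_text)
    except ValueError:
        body = {}
    if not isinstance(body, dict):
        body = {}
    # "UNKNOWN" is a placeholder chosen for this sketch, not an API value.
    return ApiError(status, body.get("error", body_text), body.get("code", "UNKNOWN"))


err = parse_error(404, '{"error": "call not found", "code": "NOT_FOUND"}')
print(err)
```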
GET /health
Health check
Returns a simple health status. Use this endpoint for load balancer health checks or uptime monitoring.
Response
status (string): ok when the server is running
curl http://localhost:8080/health
import requests
response = requests.get("http://localhost:8080/health")
print(response.json())
const response = await fetch('http://localhost:8080/health');
const data = await response.json();
console.log(data);
{
"status": "ok"
}
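For uptime monitoring or start-up scripts, the health check is usually wrapped in a retry loop. The sketch below assumes nothing about SIP4AI beyond the response shown above; wait_until_healthy is a hypothetical helper, and the check callable is injected so the loop can be exercised without a running server.

```python
import time


def wait_until_healthy(check, attempts=10, delay=0.5, sleep=time.sleep):
    """Call check() until it returns {"status": "ok"} or attempts run out.

    check is any callable that returns the parsed /health body. It may
    raise while the server is still starting; such errors are retried.
    """
    for attempt in range(attempts):
        try:
            if check().get("status") == "ok":
                return True
        except Exception:
            pass  # server not reachable yet; try again
        if attempt < attempts - 1:
            sleep(delay)
    return False


# Example with the requests library used elsewhere on this page:
#   import requests
#   wait_until_healthy(
#       lambda: requests.get("http://localhost:8080/health", timeout=2).json()
#   )
```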
GET /status
Get system status
Returns detailed system status including SIP registration state and active call information.
Response
| Field | Type |
|---|---|
| sip_registered | boolean |
| active_calls | integer |
| calls | array |
| uptime | string |
curl http://localhost:8080/status
import requests
response = requests.get("http://localhost:8080/status")
print(response.json())
const response = await fetch('http://localhost:8080/status');
const data = await response.json();
console.log(data);
{
"sip_registered": true,
"active_calls": 1,
"calls": [
{
"id": "abc-123-def",
"status": "in_progress",
"duration": 45
}
],
"uptime": "2h15m30s"
}
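The uptime field in the sample is a compact duration string. A monitoring script may want it in seconds; the sketch below assumes the format is always whole hours, minutes, and seconds as in "2h15m30s", which this page does not guarantee. The helper names uptime_seconds and summarize_status are hypothetical.

```python
import re

_UPTIME_PART = re.compile(r"(\d+)([hms])")
_SECONDS = {"h": 3600, "m": 60, "s": 1}


def uptime_seconds(uptime):
    """Convert an uptime string such as "2h15m30s" to seconds."""
    parts = _UPTIME_PART.findall(uptime)
    # Reject anything that is not made up entirely of <digits><unit> parts.
    if not parts or "".join(n + u for n, u in parts) != uptime:
        raise ValueError(f"unrecognised uptime format: {uptime!r}")
    return sum(int(n) * _SECONDS[u] for n, u in parts)


def summarize_status(status):
    """Build a one-line summary from a /status response body."""
    return (
        f"registered={status['sip_registered']} "
        f"active={status['active_calls']} "
        f"uptime={uptime_seconds(status['uptime'])}s"
    )


sample = {"sip_registered": True, "active_calls": 1, "calls": [], "uptime": "2h15m30s"}
print(summarize_status(sample))
```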
The Call object
A Call represents a voice conversation between the AI and a phone number. Calls can be inbound (received) or outbound (initiated via API).
Attributes
| Attribute | Type | Values |
|---|---|---|
| id | string | |
| status | string | initiating, ringing, in_progress, completed, failed |
| direction | string | inbound or outbound |
| target | string | |
| provider | string | openai, elevenlabs, gemini, deepgram, cartesia |
| duration | integer | |
| transcript | array | |
| created_at | timestamp | |
{
"id": "550e8400-e29b-41d4-a716-446655440000",
"status": "in_progress",
"direction": "outbound",
"target": "+61400000000",
"provider": "openai",
"duration": null,
"transcript": [],
"created_at": "2024-01-15T10:30:00Z"
}
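On the client side, the Call object maps naturally onto a small data class. This is a sketch, not an official SDK type: the Call class and its is_finished property are illustrative, and treating completed and failed as the only terminal statuses is an assumption based on their names.

```python
from dataclasses import dataclass, field, fields
from typing import Optional

# Assumption: these two statuses mean the call will not change again.
TERMINAL_STATUSES = {"completed", "failed"}


@dataclass
class Call:
    id: str
    status: str
    direction: str
    target: str
    provider: str
    created_at: str
    duration: Optional[int] = None  # null in the sample while in progress
    transcript: list = field(default_factory=list)

    @classmethod
    def from_dict(cls, data):
        """Build a Call from a response body, ignoring unknown keys."""
        known = {f.name for f in fields(cls)}
        return cls(**{k: v for k, v in data.items() if k in known})

    @property
    def is_finished(self):
        return self.status in TERMINAL_STATUSES


call = Call.from_dict({
    "id": "550e8400-e29b-41d4-a716-446655440000",
    "status": "in_progress",
    "direction": "outbound",
    "target": "+61400000000",
    "provider": "openai",
    "duration": None,
    "transcript": [],
    "created_at": "2024-01-15T10:30:00Z",
})
print(call.status, call.is_finished)
```

Ignoring unknown keys lets the same class accept responses that carry extra fields, such as the ended_at value shown in the retrieve example.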
POST /api/calls
Create a call
Initiate an outbound call to a phone number. The AI will attempt to reach the target and complete the specified task.
Request Body
| Parameter | Required |
|---|---|
| target | required |
| task | required |
| instructions | optional |
| provider | optional |
| caller_id | optional |
| max_duration_seconds | optional |
| event_webhook | optional |
| functions | optional |
curl -X POST https://your-instance.com/api/calls \
-H "Authorization: Bearer sk_live_..." \
-H "Content-Type: application/json" \
-d '{
"target": "+61400000000",
"task": "Confirm the appointment for Tuesday at 2pm",
"instructions": "Be friendly and professional",
"max_duration_seconds": 300,
"event_webhook": "https://your-app.com/webhook"
}'
import requests
response = requests.post(
"https://your-instance.com/api/calls",
headers={
"Authorization": "Bearer sk_live_...",
"Content-Type": "application/json"
},
json={
"target": "+61400000000",
"task": "Confirm the appointment for Tuesday at 2pm",
"instructions": "Be friendly and professional",
"max_duration_seconds": 300,
"event_webhook": "https://your-app.com/webhook"
}
)
print(response.json())
const response = await fetch('https://your-instance.com/api/calls', {
method: 'POST',
headers: {
'Authorization': 'Bearer sk_live_...',
'Content-Type': 'application/json'
},
body: JSON.stringify({
target: '+61400000000',
task: 'Confirm the appointment for Tuesday at 2pm',
instructions: 'Be friendly and professional',
max_duration_seconds: 300,
event_webhook: 'https://your-app.com/webhook'
})
});
const data = await response.json();
console.log(data);
{
"success": true,
"call_id": "550e8400-e29b-41d4-a716-446655440000",
"status": "initiating",
"created_at": "2024-01-15T10:30:00Z"
}
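Because the request body mixes required and optional fields, it can help to build it through one function that rejects misspelled keys before anything is sent. A minimal sketch: build_call_request is a hypothetical helper, and the field names are the ones listed under Request Body.

```python
REQUIRED_FIELDS = ("target", "task")
OPTIONAL_FIELDS = (
    "instructions",
    "provider",
    "caller_id",
    "max_duration_seconds",
    "event_webhook",
    "functions",
)


def build_call_request(target, task, **optional):
    """Return a request body for creating a call.

    Raises ValueError for empty required fields or unknown optional keys.
    """
    if not target or not task:
        raise ValueError("target and task are required")
    unknown = sorted(set(optional) - set(OPTIONAL_FIELDS))
    if unknown:
        raise ValueError(f"unknown fields: {', '.join(unknown)}")
    body = {"target": target, "task": task}
    # Leave out options that were passed as None.
    body.update({k: v for k, v in optional.items() if v is not None})
    return body


body = build_call_request(
    "+61400000000",
    "Confirm the appointment for Tuesday at 2pm",
    max_duration_seconds=300,
)
print(body)
```

The resulting dictionary can be passed as the json argument of requests.post in the Python example above.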
GET /api/calls/{call_id}
Retrieve a call
Retrieve details of a specific call including status, duration, and transcript.
Path Parameters
call_id (required)
curl https://your-instance.com/api/calls/550e8400... \
-H "Authorization: Bearer sk_live_..."
import requests
call_id = "550e8400..."
response = requests.get(
f"https://your-instance.com/api/calls/{call_id}",
headers={"Authorization": "Bearer sk_live_..."}
)
print(response.json())
const callId = '550e8400...';
const response = await fetch(`https://your-instance.com/api/calls/${callId}`, {
headers: {
'Authorization': 'Bearer sk_live_...'
}
});
const data = await response.json();
console.log(data);
{
"id": "550e8400-e29b-41d4-a716-446655440000",
"status": "completed",
"direction": "outbound",
"target": "+61400000000",
"provider": "openai",
"duration": 127,
"transcript": [
{
"speaker": "ai",
"text": "Hello, this is calling about your Tuesday appointment...",
"timestamp": 0.5
},
{
"speaker": "user",
"text": "Yes, that's correct. 2pm works for me.",
"timestamp": 5.2
}
],
"created_at": "2024-01-15T10:30:00Z",
"ended_at": "2024-01-15T10:32:07Z"
}
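The transcript array in the response can be rendered as readable text for logs or review. The sketch below uses only the speaker, text, and timestamp keys from the sample; format_transcript is a hypothetical helper, and reading timestamp as seconds from the start of the call is an assumption.

```python
def format_transcript(transcript):
    """Render transcript entries as one line per utterance."""
    lines = []
    for entry in transcript:
        # Assumption: timestamp is seconds since the call started.
        lines.append(
            f"[{entry['timestamp']:6.1f}s] {entry['speaker']}: {entry['text']}"
        )
    return "\n".join(lines)


sample = [
    {"speaker": "ai", "text": "Hello", "timestamp": 0.5},
    {"speaker": "user", "text": "Yes, that's correct.", "timestamp": 5.2},
]
print(format_transcript(sample))
```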
GET /api/calls
List all calls
Returns a paginated list of calls, sorted by creation date (newest first).
Query Parameters
limit (optional)
offset (optional)
status (optional)
curl "https://your-instance.com/api/calls?limit=20&status=completed" \
-H "Authorization: Bearer sk_live_..."
import requests
response = requests.get(
"https://your-instance.com/api/calls",
headers={"Authorization": "Bearer sk_live_..."},
params={"limit": 20, "status": "completed"}
)
print(response.json())
const params = new URLSearchParams({
limit: '20',
status: 'completed'
});
const response = await fetch(`https://your-instance.com/api/calls?${params}`, {
headers: {
'Authorization': 'Bearer sk_live_...'
}
});
const data = await response.json();
console.log(data);
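The limit and offset parameters suggest a simple paging loop. This page does not show the list response body, so the sketch below makes a loud assumption: fetch_page is a caller-supplied function that takes the query parameters and returns one page as a plain list of calls. Both iter_calls and fetch_page are hypothetical names.

```python
def iter_calls(fetch_page, limit=20, status=None):
    """Yield calls page by page until a short page signals the end.

    fetch_page(params) must return a list of calls for the given
    limit/offset/status query parameters (assumed response shape).
    """
    offset = 0
    while True:
        params = {"limit": limit, "offset": offset}
        if status is not None:
            params["status"] = status
        page = fetch_page(params)
        yield from page
        if len(page) < limit:
            return  # fewer items than requested: no more pages
        offset += limit


# Stand-in data source so the loop can be tried without a server.
fake_calls = [{"id": str(n)} for n in range(45)]


def fake_fetch(params):
    start = params["offset"]
    return fake_calls[start:start + params["limit"]]


print(len(list(iter_calls(fake_fetch, limit=20))))
```

If the real response wraps the list in an object, fetch_page is the place to unwrap it.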
POST /api/calls/{call_id}/inject
Inject context
Inject additional context or instructions into an active call. The AI will incorporate this information into the ongoing conversation.
Request Body
message (required)
role (optional): system (default) or user
curl -X POST https://your-instance.com/api/calls/550e8400.../inject \
-H "Authorization: Bearer sk_live_..." \
-H "Content-Type: application/json" \
-d '{
"message": "The customer has a VIP status. Offer them a 20% discount.",
"role": "system"
}'
import requests
call_id = "550e8400..."
response = requests.post(
f"https://your-instance.com/api/calls/{call_id}/inject",
headers={
"Authorization": "Bearer sk_live_...",
"Content-Type": "application/json"
},
json={
"message": "The customer has a VIP status. Offer them a 20% discount.",
"role": "system"
}
)
print(response.json())
const callId = '550e8400...';
const response = await fetch(`https://your-instance.com/api/calls/${callId}/inject`, {
method: 'POST',
headers: {
'Authorization': 'Bearer sk_live_...',
'Content-Type': 'application/json'
},
body: JSON.stringify({
message: 'The customer has a VIP status. Offer them a 20% discount.',
role: 'system'
})
});
const data = await response.json();
console.log(data);
DELETE /api/calls/{call_id}
End a call
Terminate an active call. The AI says a brief goodbye before hanging up.
curl -X DELETE https://your-instance.com/api/calls/550e8400... \
-H "Authorization: Bearer sk_live_..."
import requests
call_id = "550e8400..."
response = requests.delete(
f"https://your-instance.com/api/calls/{call_id}",
headers={"Authorization": "Bearer sk_live_..."}
)
print(response.json())
const callId = '550e8400...';
const response = await fetch(`https://your-instance.com/api/calls/${callId}`, {
method: 'DELETE',
headers: {
'Authorization': 'Bearer sk_live_...'
}
});
const data = await response.json();
console.log(data);
{
"success": true,
"call_id": "550e8400-e29b-41d4-a716-446655440000",
"status": "ending"
}
Functions
Functions allow the AI to call your webhooks during a conversation. This enables real-time actions like booking appointments, checking order status, or transferring calls.
Real-time execution
Functions execute immediately when the AI decides to call them. Hold audio plays while waiting for your response.
Webhook security
All function calls include a signature header for verification. Validate requests to ensure they're from SIP4AI.
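This page does not name the signature header or the signing scheme, so the following is only a sketch of one common approach, not SIP4AI's documented method. It assumes the signature is a hex-encoded HMAC-SHA256 of the raw request body computed with a shared secret; verify_signature is a hypothetical helper, and the real scheme should be confirmed before relying on it.

```python
import hashlib
import hmac


def verify_signature(secret: bytes, raw_body: bytes, signature_hex: str) -> bool:
    """Check a hex HMAC-SHA256 signature over the raw request body.

    Assumed scheme for illustration; the actual header name and
    algorithm are not specified on this page.
    """
    expected = hmac.new(secret, raw_body, hashlib.sha256).hexdigest()
    # compare_digest runs in constant time to avoid timing leaks.
    return hmac.compare_digest(expected.encode(), signature_hex.encode())


secret = b"example-secret"
body = b'{"date": "2024-01-16", "time": "14:00"}'
good = hmac.new(secret, body, hashlib.sha256).hexdigest()
print(verify_signature(secret, body, good))
```

Whatever the real scheme is, verification should run on the raw bytes of the request, before any JSON parsing or re-serialization.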
Define functions
Define functions when creating a call. Each function needs a name, description, parameters schema, and webhook URL.
{
"functions": [
{
"name": "book_appointment",
"description": "Book an appointment for the caller",
"webhook": "https://your-app.com/api/book",
"hold_audio_url": "https://your-app.com/audio/hold.wav",
"parameters": {
"type": "object",
"properties": {
"date": {
"type": "string",
"description": "The appointment date (YYYY-MM-DD)"
},
"time": {
"type": "string",
"description": "The appointment time (HH:MM)"
}
},
"required": ["date", "time"]
}
},
{
"name": "transfer_call",
"description": "Transfer the call to a human agent",
"type": "transfer",
"target": "+61400000001"
}
]
}
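A definition can be checked for missing keys before it is sent. The sketch below is based only on the two samples above: it assumes a webhook function needs name, description, parameters, and webhook, and that a function with type set to transfer needs a target instead. missing_function_keys is a hypothetical helper.

```python
def missing_function_keys(function_def):
    """Return the keys a function definition appears to be missing."""
    if function_def.get("type") == "transfer":
        # Inferred from the transfer_call sample, not from a stated rule.
        needed = ("name", "description", "target")
    else:
        needed = ("name", "description", "parameters", "webhook")
    return [key for key in needed if key not in function_def]


book = {
    "name": "book_appointment",
    "description": "Book an appointment for the caller",
    "webhook": "https://your-app.com/api/book",
    "parameters": {"type": "object", "properties": {}},
}
transfer = {
    "name": "transfer_call",
    "description": "Transfer the call to a human agent",
    "type": "transfer",
}
print(missing_function_keys(book), missing_function_keys(transfer))
```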
Webhook events
SIP4AI sends webhook events to your specified URL for important call lifecycle events.
| Event | Description |
|---|---|
call.initiated
|
Call has been initiated, dialing target |
call.ringing
|
Target phone is ringing |
call.answered
|
Call was answered, AI conversation starting |
call.completed
|
Call ended normally |
call.failed
|
Call failed (busy, no answer, error) |
function.called
|
AI invoked a function during the call |
transcript.update
|
New transcript segment available |
Payload format
All webhook payloads follow a consistent format with event type, call ID, timestamp, and event-specific data.
{
"event": "call.completed",
"call_id": "550e8400-e29b-41d4-a716-446655440000",
"timestamp": "2024-01-15T10:32:07Z",
"data": {
"duration": 127,
"end_reason": "completed",
"transcript": [
{
"speaker": "ai",
"text": "Hello, this is calling about your Tuesday appointment...",
"timestamp": 0.5
}
],
"functions_called": [
{
"name": "book_appointment",
"arguments": { "date": "2024-01-16", "time": "14:00" },
"result": { "success": true }
}
]
}
}
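Since every payload carries an event field, a receiver can route on it. The following is a minimal sketch of such a dispatcher: the on decorator, the HANDLERS registry, and the handler itself are illustrative, and only the event, call_id, and data keys come from the payload above.

```python
HANDLERS = {}


def on(event):
    """Register a handler function for one event type."""
    def register(fn):
        HANDLERS[event] = fn
        return fn
    return register


@on("call.completed")
def handle_completed(payload):
    data = payload["data"]
    return (
        f"call {payload['call_id']} ended after "
        f"{data['duration']}s ({data['end_reason']})"
    )


def dispatch(payload):
    """Route a webhook payload to its handler; ignore unhandled events."""
    handler = HANDLERS.get(payload.get("event"))
    return handler(payload) if handler else None


print(dispatch({
    "event": "call.completed",
    "call_id": "550e8400-e29b-41d4-a716-446655440000",
    "timestamp": "2024-01-15T10:32:07Z",
    "data": {"duration": 127, "end_reason": "completed"},
}))
```

Returning None for unregistered events lets the receiver acknowledge events it does not care about without raising.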
OpenAI Realtime
GPT-4o powered realtime voice conversations with advanced reasoning capabilities.
Configuration Options
voice
turn_detection
instructions
{
"provider": "openai",
"openai": {
"instructions": "You are a helpful assistant...",
"voice": "shimmer",
"turn_detection": {
"type": "server_vad",
"threshold": 0.7,
"silence_duration_ms": 1000
}
}
}
ElevenLabs
Native 8kHz support for the lowest possible latency. Configure your agent in the ElevenLabs dashboard.
{
"provider": "elevenlabs",
"elevenlabs": {
"agent_id": "your-agent-id",
"first_message": "Hello! How can I help you today?",
"system_prompt": "You are a helpful assistant..."
}
}
Gemini Live
Google's Gemini 2.0 Flash with five distinct voices and built-in function calling.
{
"provider": "gemini",
"gemini": {
"model": "models/gemini-2.0-flash-exp",
"voice": "Kore",
"first_message": "Hello!",
"system_prompt": "You are a helpful assistant..."
}
}