Architecture Overview

Pinecall's architecture is designed to provide a seamless voice AI experience through multiple input channels and specialized components working together. Understanding this architecture will help you better integrate and leverage the platform's capabilities.

Core Components:

Input Channels - Support for phone networks (PSTN/VoIP) and web-based audio calls
Telephony Gateway - Handles call routing, signaling, and voice transmission
Speech Processing - Converts speech to text and text to speech
Orchestration Layer - Coordinates the flow of data between components
LLM Interface & Agent Logic - Powers the intelligence and reasoning of AI agents
APIs & WebSockets - Provides programmatic access to all platform features

System Architecture Diagram

Component Breakdown

Input Channels

Pinecall supports multiple input channels to provide flexibility in how users interact with the platform:

Phone Networks (PSTN, VoIP) - Traditional telephone calls through public switched telephone networks or Voice over IP protocols.
Web Calls - Browser-based audio communication using WebRTC technology, enabling voice interactions directly from web applications without requiring a phone.

Telephony Gateway

The telephony gateway connects to traditional telephone networks and VoIP services, managing inbound and outbound calls. It handles call setup, teardown, DTMF processing, and voice data transmission. This layer abstracts the complexities of telecom protocols and ensures high-quality voice connections across both traditional telephony and web-based audio channels.

Speech Processing

This component converts between audio and text in real-time. It uses advanced speech-to-text (STT) engines to transcribe caller speech and text-to-speech (TTS) engines to generate natural-sounding voice responses. Pinecall employs state-of-the-art neural TTS models for human-like pronunciation, intonation, and emotion. The speech processing layer handles audio from both phone calls and browser-based web calls with consistent quality.

Orchestration Layer

The orchestration layer coordinates the flow of information between all system components. It handles session management, routes messages, manages state, and ensures timely processing of all events. This central component maintains system coherence and enables complex multi-turn conversations by directing data between speech processing, LLM interface, and integration points.

LLM Interface & Agent Logic

This layer interfaces with large language models (LLMs) like GPT-4 and manages the conversational logic of AI agents. It processes transcribed speech, maintains conversation context, generates appropriate responses, and handles specialized functionality like information retrieval and API calls. The agent logic is channel-agnostic, providing consistent intelligence whether the user is calling from a phone or connecting through a web browser.

APIs & WebSockets

Pinecall exposes RESTful APIs for management operations and WebSocket interfaces for real-time updates. These interfaces allow developers to create, configure, and monitor voice AI systems, as well as integrate with external applications and data sources. The API layer provides unified access to both phone and web-based calling functionality, allowing seamless integration with your existing systems.

Data Flow

Inbound Call Flow:

Call comes in through telephony gateway (phone) or web browser (WebRTC)
Audio stream is established
Speech-to-text converts caller's voice to text
Text is sent to the orchestration layer
Orchestration layer passes the text to the LLM interface & agent logic
Agent logic processes the text using LLMs and generates a response
Response is sent back to the orchestration layer
Text-to-speech converts response to audio
Audio is played back to the caller through the appropriate channel
Process repeats for the duration of the conversation

Integration Points

Pinecall's architecture provides several integration points for developers:

REST APIs

Create and manage agents, phone numbers, web call endpoints, and calls. Configure webhooks, retrieve analytics, and more.

SDKs

Language-specific libraries for easier integration with your applications, including web client SDKs for browser-based calling.

Webhooks

Receive notifications about call events and agent actions across both phone and web channels.

WebSockets

Real-time updates and monitoring of ongoing calls and agent activities for both phone and web-based interactions.

Next Steps

Now that you understand Pinecall's architecture, you might want to explore:

Key Concepts - Learn about the fundamental concepts of the platform
Installation - Get started with the Pinecall SDK
Web Integration - Implement browser-based voice AI
API Reference - Explore the available API endpoints