Architecture Overview

Pinecall's architecture is designed to provide a seamless voice AI experience through multiple input channels and specialized components working together. Understanding this architecture will help you better integrate and leverage the platform's capabilities.

Core Components:

  • Input Channels - Support for phone networks (PSTN/VoIP) and web-based audio calls
  • Telephony Gateway - Handles call routing, signaling, and voice transmission
  • Speech Processing - Converts speech to text and text to speech
  • Orchestration Layer - Coordinates the flow of data between components
  • LLM Interface & Agent Logic - Powers the intelligence and reasoning of AI agents
  • APIs & WebSockets - Provides programmatic access to all platform features

System Architecture Diagram

System Architecture Diagram

Component Breakdown

Input Channels

Pinecall supports multiple input channels to provide flexibility in how users interact with the platform:

  • Phone Networks (PSTN, VoIP) - Traditional telephone calls through public switched telephone networks or Voice over IP protocols.
  • Web Calls - Browser-based audio communication using WebRTC technology, enabling voice interactions directly from web applications without requiring a phone.

Telephony Gateway

The telephony gateway connects to traditional telephone networks and VoIP services, managing inbound and outbound calls. It handles call setup, teardown, DTMF processing, and voice data transmission. This layer abstracts the complexities of telecom protocols and ensures high-quality voice connections across both traditional telephony and web-based audio channels.

Speech Processing

This component converts between audio and text in real-time. It uses advanced speech-to-text (STT) engines to transcribe caller speech and text-to-speech (TTS) engines to generate natural-sounding voice responses. Pinecall employs state-of-the-art neural TTS models for human-like pronunciation, intonation, and emotion. The speech processing layer handles audio from both phone calls and browser-based web calls with consistent quality.

Orchestration Layer

The orchestration layer coordinates the flow of information between all system components. It handles session management, routes messages, manages state, and ensures timely processing of all events. This central component maintains system coherence and enables complex multi-turn conversations by directing data between speech processing, LLM interface, and integration points.

LLM Interface & Agent Logic

This layer interfaces with large language models (LLMs) like GPT-4 and manages the conversational logic of AI agents. It processes transcribed speech, maintains conversation context, generates appropriate responses, and handles specialized functionality like information retrieval and API calls. The agent logic is channel-agnostic, providing consistent intelligence whether the user is calling from a phone or connecting through a web browser.

APIs & WebSockets

Pinecall exposes RESTful APIs for management operations and WebSocket interfaces for real-time updates. These interfaces allow developers to create, configure, and monitor voice AI systems, as well as integrate with external applications and data sources. The API layer provides unified access to both phone and web-based calling functionality, allowing seamless integration with your existing systems.

Data Flow

Inbound Call Flow:

  1. Call comes in through telephony gateway (phone) or web browser (WebRTC)
  2. Audio stream is established
  3. Speech-to-text converts caller's voice to text
  4. Text is sent to the orchestration layer
  5. Orchestration layer passes the text to the LLM interface & agent logic
  6. Agent logic processes the text using LLMs and generates a response
  7. Response is sent back to the orchestration layer
  8. Text-to-speech converts response to audio
  9. Audio is played back to the caller through the appropriate channel
  10. Process repeats for the duration of the conversation

Integration Points

Pinecall's architecture provides several integration points for developers:

REST APIs

Create and manage agents, phone numbers, web call endpoints, and calls. Configure webhooks, retrieve analytics, and more.

SDKs

Language-specific libraries for easier integration with your applications, including web client SDKs for browser-based calling.

Webhooks

Receive notifications about call events and agent actions across both phone and web channels.

WebSockets

Real-time updates and monitoring of ongoing calls and agent activities for both phone and web-based interactions.

Next Steps

Now that you understand Pinecall's architecture, you might want to explore: