ADR-001: Firebase Functions for API + Cloud Run for Voice/Chat

Status

Accepted

Date

2025-01-15

Context

The Humanlike platform needs two distinct backend services:

  1. Dashboard/admin APIs -- CRUD operations for business management, staff, calendar OAuth, Stripe billing, and static site generation. These are short-lived HTTP request/response cycles initiated by the web dashboard.

  2. Voice and chat APIs -- real-time audio streaming (Twilio WebSocket media streams), OpenAI Realtime API WebSocket connections, and streaming chat completions. These are long-lived connections that can last minutes per call.

Firebase Cloud Functions do not natively support WebSocket upgrades. Their timeout is also capped: 540 seconds for 1st gen and event-driven 2nd gen functions, and 3600 seconds for HTTP-triggered 2nd gen functions. Voice calls require persistent bidirectional WebSocket connections between Twilio and the AI provider, with audio format conversion happening in real time.
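
As a rough illustration of the real-time audio conversion mentioned above: Twilio media streams deliver audio as base64-encoded 8 kHz G.711 mu-law frames. The sketch below decodes one such payload to 16-bit PCM. It assumes PCM16 is the conversion target, which this ADR does not state, and the function names are hypothetical rather than taken from services/api/.

```javascript
// Sketch only: G.711 mu-law -> 16-bit linear PCM. Helper names are hypothetical.
const MULAW_BIAS = 0x84;

// Decode a single mu-law byte to a signed 16-bit sample.
function muLawByteToPcm16(byte) {
  const u = ~byte & 0xff;            // mu-law bytes are transmitted bit-inverted
  const sign = u & 0x80;             // top bit: sign
  const exponent = (u >> 4) & 0x07;  // next three bits: segment number
  const mantissa = u & 0x0f;         // low four bits: step within the segment
  const magnitude = (((mantissa << 3) + MULAW_BIAS) << exponent) - MULAW_BIAS;
  return sign ? -magnitude : magnitude;
}

// Decode the base64 `media.payload` field of a Twilio "media" message.
function decodeMediaPayload(base64Payload) {
  const bytes = Buffer.from(base64Payload, "base64");
  const pcm = new Int16Array(bytes.length);
  for (let i = 0; i < bytes.length; i++) {
    pcm[i] = muLawByteToPcm16(bytes[i]);
  }
  return pcm;
}
```

Doing this per frame inside the WebSocket message handler is what ties the conversion to a long-lived connection: each decoded frame is forwarded as soon as it arrives instead of being batched per request.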

Additionally, the voice/chat service needs to be written in plain JavaScript (no build step) for rapid iteration on prompt engineering and tool dispatch, while the dashboard APIs benefit from TypeScript and Express router structure shared with the rest of the monorepo.

Decision

Split the backend into two deployment targets:

  • Firebase Cloud Functions (functions/) -- TypeScript Express routers for dashboard APIs: adminApi, calendarApi, siteApi, staffApi, stripeApi. Deployed automatically on push to main when functions/** or shared/** change.

  • Cloud Run (services/api/) -- plain JavaScript HTTP + WebSocket server for voice and chat: Twilio TwiML webhook, Twilio media stream WebSocket handler, OpenAI ephemeral token creation, tool execution, chat completions proxy, portal resolution. Deployed as a Docker container on push when services/api/** changes.
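
To make the hand-off between the TwiML webhook and the media stream handler concrete, here is a minimal sketch of the TwiML such a webhook could return. `<Connect><Stream>` is Twilio's verb for opening a bidirectional media stream. The host name, the `/media-stream` path, and the function names are placeholders, not values from services/api/.

```javascript
// Sketch only: builds the TwiML response that tells Twilio to open a
// WebSocket media stream to this service. Names and paths are hypothetical.
function escapeXml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

function buildConnectStreamTwiml(host, path = "/media-stream") {
  const url = `wss://${host}${path}`;
  return (
    '<?xml version="1.0" encoding="UTF-8"?>' +
    "<Response><Connect>" +
    `<Stream url="${escapeXml(url)}" />` +
    "</Connect></Response>"
  );
}
```

The webhook itself is an ordinary short HTTP response. Only the WebSocket that Twilio opens afterwards is long-lived, and that connection is the part that needs Cloud Run.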

Both services share Firebase Auth for authentication and Firestore for data, using the same @humanlike/shared types and path builders (Cloud Functions via TypeScript import, Cloud Run via runtime reference).
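
A shared path builder might look like the sketch below. The collection names and function names are hypothetical, since the actual @humanlike/shared exports are not shown in this ADR. The point is only that both services derive Firestore document paths from one set of functions rather than from hand-written strings.

```javascript
// Sketch only: hypothetical path builders; real @humanlike/shared names may differ.
function assertSegment(name, value) {
  if (typeof value !== "string" || value.length === 0 || value.includes("/")) {
    throw new TypeError(`${name} must be a non-empty string without "/"`);
  }
  return value;
}

const paths = {
  // Hypothetical top-level collection for businesses.
  business: (businessId) =>
    `businesses/${assertSegment("businessId", businessId)}`,
  // Hypothetical subcollection for staff under a business.
  staffMember: (businessId, staffId) =>
    `${paths.business(businessId)}/staff/${assertSegment("staffId", staffId)}`,
};
```

For example, `paths.staffMember("b1", "s9")` yields `businesses/b1/staff/s9`, and an ID containing a slash is rejected before it can point at a different document.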

Alternatives Considered

| Alternative | Why Rejected |
| --- | --- |
| Everything in Firebase Cloud Functions | Cloud Functions do not support WebSocket upgrades needed for Twilio media streams and OpenAI Realtime. The 540s timeout is too short for long calls. |
| Everything in Cloud Run | Would lose Firebase Functions conveniences (automatic scaling, tight Firestore triggers integration, Express router pattern). Dashboard APIs do not need long-lived connections. |
| Single Cloud Run service with TypeScript | The voice/chat service benefits from plain JS for fast iteration without a build step. Mixing TS and JS in one service adds complexity. |
| AWS Lambda + API Gateway WebSocket | Would require migrating away from Firebase ecosystem (Auth, Firestore, Hosting) and introducing cross-cloud complexity. |
| Self-hosted server (GCE/EC2) | Adds operational burden for scaling, patching, and availability. Cloud Run provides managed scaling with pay-per-use. |

Consequences

Positive:

  • Voice/chat service supports long-lived WebSocket connections, bounded by Cloud Run's configurable request timeout (up to 60 minutes) rather than by a function timeout
  • Each service can scale independently based on its traffic pattern
  • Plain JS in services/api/ enables rapid prompt and tool iteration without build steps
  • Dashboard APIs get TypeScript safety and Express router structure
  • Path-based CI deploys only what changed
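
The path-based deploy rule from the Decision section can be sketched as a small function. Only the path prefixes come from this ADR. The target labels are hypothetical, and the actual CI configuration is not shown here.

```javascript
// Sketch only: maps changed file paths to deployment targets.
// Prefixes follow the ADR; target labels are hypothetical.
const DEPLOY_RULES = [
  { target: "firebase-functions", prefixes: ["functions/", "shared/"] },
  { target: "cloud-run-api", prefixes: ["services/api/"] },
];

function targetsToDeploy(changedPaths) {
  return DEPLOY_RULES
    .filter((rule) =>
      changedPaths.some((path) =>
        rule.prefixes.some((prefix) => path.startsWith(prefix))
      )
    )
    .map((rule) => rule.target);
}
```

Under the rules as stated, a change under shared/ alone would redeploy only the Cloud Functions, which is worth keeping in mind when shared types change.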

Negative:

  • Two deployment targets to monitor and debug
  • Shared logic (Firebase Auth verification, Firestore access) is duplicated across both services rather than shared as a library
  • Cloud Run requires a Dockerfile and container registry management

Neutral:

  • Both services authenticate via Firebase Auth tokens, so the auth model is consistent
  • Firestore is the shared data layer, keeping both services in sync without inter-service calls
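
Because both services accept the same Firebase Auth ID tokens, the request-side check can have the same shape in each. The following is a minimal sketch, assuming tokens arrive in an `Authorization: Bearer` header (this ADR does not specify the transport). The verifier is injected so the example runs without credentials; in a real service it would typically be the Firebase Admin SDK's `getAuth().verifyIdToken`. The function names are hypothetical.

```javascript
// Sketch only: shared shape for checking a Firebase ID token on a request.
// `verifyIdToken` is injected; function names here are hypothetical.
function extractBearerToken(authorizationHeader) {
  const match = /^Bearer\s+(\S+)$/i.exec(authorizationHeader || "");
  return match ? match[1] : null;
}

async function authenticate(headers, verifyIdToken) {
  const token = extractBearerToken(headers.authorization);
  if (!token) {
    return { ok: false, status: 401, error: "missing bearer token" };
  }
  try {
    // A decoded Firebase ID token exposes the user's ID as `uid`.
    const decoded = await verifyIdToken(token);
    return { ok: true, uid: decoded.uid };
  } catch (err) {
    return { ok: false, status: 401, error: "invalid token" };
  }
}
```

Keeping the header parsing and the verification call in one small function is also the natural candidate for extraction if the duplicated auth logic noted under Negative is later moved into a shared library.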