What you’ll learn:
This page is the authoritative architecture reference for ZappWay, generated from a full codebase scan (commit 14982f37). It covers every major subsystem: the monorepo structure, database schema, queue/worker topology, ingestion loaders, Qdrant vector layer, embedding pipeline, adaptive multi-shot RAG, chat engine, auth, sync orchestrator, all 13 channel integrations, file storage, LLM model router, and observability.
🔢 Table of Contents
- Overall System Architecture
- Database Layer
- Queue & Worker Architecture
- Ingestion Engine & Loaders
- Vector Database Layer — Qdrant
- Embedding Pipeline
- RAG & Retrieval Pipeline
- Chat & Conversation Engine
- Multi-Tenant Isolation & Auth
- Sync Orchestrator
- External Integrations & Channels
- File Storage & Media Pipeline
- LLM Orchestration & Model Router
- Observability & Monitoring
- Gaps & Recommendations
1. Overall System Architecture
ZappWay is a pnpm monorepo structured into 3 applications and 2 package trees. The apps/zappway app hosts the Dashboard, Landing, Blog, and Docs. All core business logic lives in packages/zappway/lib (~160 files). A dedicated apps/workers-services app runs all background workers and the WhatsApp Bridge API in isolation, consuming jobs from Redis-backed BullMQ-Pro queues.
Application Map
| App | Path | Framework | Purpose |
|---|---|---|---|
| Dashboard | apps/zappway/dashboard | Next.js | Main SaaS UI |
| Blog | apps/zappway/blog | Next.js | Marketing blog |
| Docs | apps/zappway/docs | MDX-based | Product documentation |
| Landing Page | apps/zappway/lp | Next.js | Landing page |
| Workers Services | apps/workers-services | Next.js | Background workers + WA Bridge |
| ZappFlux | apps/zappflux | Next.js | Secondary product (shared DB) |
Package Map
| Package | Path | Purpose |
|---|---|---|
| @zappway/prisma | packages/zappway/prisma | Prisma schema, client, browser exports |
| @zappway/lib | packages/zappway/lib | Core business logic (~160 files) |
| @zappway/integrations | packages/zappway/integrations | 13 channel adapters |
| @zappway/ui | packages/zappway/ui | Shared React components |
2. Database Layer
Source: packages/zappway/prisma/schema.prisma (1548 lines) · 33 models · 23 enums
PostgreSQL via Prisma with fullTextSearch and fullTextIndex preview features enabled. The Organization model is the root tenant boundary — every other entity is scoped to it directly or transitively.
Full Model Inventory
| # | Model | Tenant Key | Purpose |
|---|---|---|---|
| 1 | User | — | Platform user account |
| 2 | Organization | id (root) | Tenant container |
| 3 | Membership | organizationId | User–Org relationship (RBAC) |
| 4 | ApiKey | organizationId | Programmatic access |
| 5 | Usage | organizationId | Billing usage tracking |
| 6 | Agent | organizationId | AI Employee definition |
| 7 | Datastore | organizationId | Knowledge base container |
| 8 | AppDatasource | organizationId | Individual data source |
| 9 | Conversation | organizationId | Chat session |
| 10 | Message | via Conversation | Individual message |
| 11 | Tool | via Agent | Agent tool definition |
| 12 | Form | via Agent | Dynamic form |
| 13 | FormSubmission | via Form | Form response |
| 14 | Contact | organizationId | CRM contact |
| 15 | Subscription | organizationId | Billing plan |
| 16 | Product | — | Stripe product |
| 17 | Price | — | Stripe price |
| 18 | Account | userId | OAuth provider link |
| 19 | Session | organizationId | Auth session |
| 20 | VerificationToken | — | Email verification |
| 21 | ServiceProvider | organizationId | Integration credentials |
| 22 | ActionApproval | via Conversation | Tool approval request |
| 23 | Attachment | via Message | Message attachment |
| 24 | ExternalIntegration | via Agent | Webhook/external config |
| 25 | LLMTaskOutput | organizationId | Async LLM task result |
| 26 | MailInbox | organizationId | Email inbox config |
| 27 | ConversationCrmTag | organizationId | CRM tagging |
| 28 | Lead | organizationId | Lead capture |
| 29 | XPBNPEval | — | A/B test evaluation |
| 30 | AgentsOnDatastores | — | M:N agent–datastore join |
| 31 | Coupon | — | Billing coupon |
| 32 | UserApiKey | userId | User-level API key |
| 33 | CrmLog | organizationId | CRM activity log |
Key Enums
| Enum | Sample Values |
|---|---|
| DatasourceType | web_page, web_site, text, file, google_drive_file, notion, youtube_video, youtube_bulk, qa, api |
| DatasourceStatus | unsynched, pending, running, synched, error, usage_limit_reached, stalled |
| ConversationChannel | dashboard, website, whatsapp, telegram, slack, crisp, form, mail, api, zapier, zendesk, messenger, instagram |
| AgentModelName | 40+ model enum values; see Section 13 |
| SubscriptionPlan | level_0, level_1, level_2, level_3, level_4 |
| MembershipRole | OWNER, ADMIN, USER |
| GlobalRole | SUPERADMIN, CUSTOMER |
| ToolType | datastore, http, form, mark_as_resolved, request_human, lead_capture, payment |
3. Queue & Worker Architecture
Source: packages/zappway/lib/types/index.ts, apps/workers-services/workers/
Three dedicated BullMQ-Pro queues separate concerns. The load-datasource queue is the primary active queue; the other two are partially implemented.
Worker Configuration — Datasource Loader
| Config | Value | Env Override |
|---|---|---|
| Queue Name | load-datasource | — |
| Concurrency | 3 | DS_WORKER_CONCURRENCY |
| Lock Duration | 8 min | DS_BULL_LOCK_DURATION_MS |
| Stalled Interval | 90s | DS_BULL_STALLED_INTERVAL_MS |
| Max Stalled Count | 2 | DS_BULL_MAX_STALLED_COUNT |
| Dedupe Lock TTL | 10 min | DS_JOB_LOCK_TTL_MS |
| Remove on Complete | Last 1000 | — |
| Remove on Fail | Last 5000 | — |
Dedupe Lock: Redis SET NX with key ld:lock:{datasourceId} prevents the same datasource from being processed in parallel.
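The dedupe lock can be illustrated with a small sketch. It uses an in-memory stand-in rather than a Redis client so that it runs on its own; InMemoryLockStore, tryAcquireDatasourceLock, and the use of the job ID as the lock owner are assumptions made for illustration, not names taken from the codebase. Only the key format and the 10-minute TTL come from this page.

```typescript
// In-memory stand-in for the one Redis behaviour the lock needs.
// Against a real Redis the equivalent command is: SET <key> <value> NX PX <ttlMs>.
interface LockEntry {
  owner: string;
  expiresAt: number;
}

class InMemoryLockStore {
  private entries: Record<string, LockEntry> = {};

  // Succeeds only when the key is absent or expired, like SET ... NX with a TTL.
  setIfAbsent(key: string, owner: string, ttlMs: number, now: number): boolean {
    const existing = this.entries[key];
    if (existing !== undefined && existing.expiresAt > now) {
      return false;
    }
    this.entries[key] = { owner: owner, expiresAt: now + ttlMs };
    return true;
  }

  // Releases the lock only if the caller still owns it.
  releaseIfOwner(key: string, owner: string): boolean {
    const existing = this.entries[key];
    if (existing === undefined || existing.owner !== owner) {
      return false;
    }
    delete this.entries[key];
    return true;
  }
}

// Default from the configuration table (overridable via DS_JOB_LOCK_TTL_MS).
const DEFAULT_LOCK_TTL_MS = 10 * 60 * 1000;

function lockKey(datasourceId: string): string {
  return "ld:lock:" + datasourceId;
}

// Hypothetical helper: true means this job may process the datasource.
function tryAcquireDatasourceLock(
  store: InMemoryLockStore,
  datasourceId: string,
  jobId: string,
  now: number
): boolean {
  return store.setIfAbsent(lockKey(datasourceId), jobId, DEFAULT_LOCK_TTL_MS, now);
}
```

The TTL acts as a safety net: if a worker dies without releasing the lock, the key expires and the datasource becomes processable again.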
Graceful Shutdown: Handles SIGTERM and SIGINT — closes the worker, quits Redis cleanly.
Worker Inventory
| Worker | Queue | Status |
|---|---|---|
| datasource-loader.ts | load-datasource | ✅ Active |
| check-and-sync-cron.ts | cron-based | ✅ Active |
| generate-rag.ts | generate-rag-document | ✅ Active |
| index-media-memory.ts | index-media-memory | ✅ Active |
| check-stalled.ts | internal | ✅ Active |
| trial-expiry-check.ts | cron-based | ✅ Active |
| usage-reconciler.ts | cron-based | ✅ Active |
| scheduler.ts | orchestrator | ✅ Active |
4. Ingestion Engine & Loaders
Source: packages/zappway/lib/datastores/datasources/, packages/zappway/lib/loaders/
All loaders extend DatasourceLoaderBase. The entry point taskLoadDatasource() selects the correct loader at runtime based on DatasourceType. Output is a normalized AppDocument[] array fed into the chunking engine and then into Qdrant.
Loader Mapping
| DatasourceType | Loader | Source |
|---|---|---|
| web_page | WebPageLoader | URL fetch |
| web_site | WebSiteLoader | Sitemap / crawl |
| file | FileLoader | S3 upload |
| text | TextLoader | Raw text input |
| qa | QALoader | Q&A pairs |
| youtube_video | YouTubeLoader | YouTube API |
| youtube_bulk | YouTubeLoader | Multi-video |
| google_drive_file | GoogleDriveLoader | Drive API |
| google_drive_folder | GoogleDriveLoader | Drive API |
| notion | NotionLoader | Notion API |
| notion_page | NotionLoader | Notion API |
| api | APILoader | HTTP endpoint |
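The mapping above lends itself to a lookup table keyed by DatasourceType. The sketch below shows one way such a dispatch could look; the loader classes are stubs, and LOADER_BY_TYPE, selectLoader, and describe() are hypothetical names. How taskLoadDatasource() actually selects a loader is not shown on this page.

```typescript
type DatasourceType =
  | "web_page" | "web_site" | "file" | "text" | "qa"
  | "youtube_video" | "youtube_bulk"
  | "google_drive_file" | "google_drive_folder"
  | "notion" | "notion_page" | "api";

// Stub base class; the real DatasourceLoaderBase has a richer interface.
abstract class DatasourceLoaderBase {
  constructor(public readonly datasourceId: string) {}
  abstract describe(): string;
}

class WebPageLoader extends DatasourceLoaderBase { describe() { return "URL fetch"; } }
class WebSiteLoader extends DatasourceLoaderBase { describe() { return "Sitemap / crawl"; } }
class FileLoader extends DatasourceLoaderBase { describe() { return "S3 upload"; } }
class TextLoader extends DatasourceLoaderBase { describe() { return "Raw text input"; } }
class QALoader extends DatasourceLoaderBase { describe() { return "Q&A pairs"; } }
class YouTubeLoader extends DatasourceLoaderBase { describe() { return "YouTube API"; } }
class GoogleDriveLoader extends DatasourceLoaderBase { describe() { return "Drive API"; } }
class NotionLoader extends DatasourceLoaderBase { describe() { return "Notion API"; } }
class APILoader extends DatasourceLoaderBase { describe() { return "HTTP endpoint"; } }

type LoaderCtor = new (datasourceId: string) => DatasourceLoaderBase;

// Record<DatasourceType, ...> makes the compiler reject a missing type.
const LOADER_BY_TYPE: Record<DatasourceType, LoaderCtor> = {
  web_page: WebPageLoader,
  web_site: WebSiteLoader,
  file: FileLoader,
  text: TextLoader,
  qa: QALoader,
  youtube_video: YouTubeLoader,
  youtube_bulk: YouTubeLoader,
  google_drive_file: GoogleDriveLoader,
  google_drive_folder: GoogleDriveLoader,
  notion: NotionLoader,
  notion_page: NotionLoader,
  api: APILoader,
};

// Hypothetical selection helper mirroring the runtime dispatch described above.
function selectLoader(type: DatasourceType, datasourceId: string): DatasourceLoaderBase {
  const Loader = LOADER_BY_TYPE[type];
  return new Loader(datasourceId);
}
```

Typing the table as an exhaustive record means that adding a new DatasourceType without a loader fails at compile time rather than at runtime.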
WebSite Loader — Pipeline Detail
Source: packages/zappway/lib/loaders/web-site.ts (515 lines)
The WebSiteLoader is the most complex loader. Its pipeline runs in 6 stages, in order:
- Discovery: parses sitemap XML or crawls via findDomainPages().
- Normalization: deduplicates URLs, strips UTM params, and lowercases the hostname.
- Blacklist filtering: applies the black_listed_urls config.
- HTTP probing: HEAD, then GET, with semver path repair on 404s.
- Child management: upserts web_page child datasources and deletes orphans.
- Enqueueing: emits child jobs with priority scores (home = 5, sitemap = 8, others = 10).
Concurrency is capped at 6 via mapWithConcurrency. The plan limit is applied via accountConfig[plan].limits.maxWebsiteURL (default: 25).
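The normalization, blacklist, and enqueueing stages can be sketched as pure functions. This is a simplified illustration under several assumptions: "home" is taken to mean the root path, the blacklist is matched against exact normalized URLs, and normalizeUrl, priorityFor, and planChildJobs are hypothetical names. Discovery, HTTP probing, and child upserts are left out.

```typescript
// Collapses equivalent URLs to one key: lowercase hostname, no utm_* params.
function normalizeUrl(raw: string): string {
  const url = new URL(raw);
  url.hostname = url.hostname.toLowerCase();
  const trackingKeys: string[] = [];
  url.searchParams.forEach(function (_value, key) {
    if (key.toLowerCase().indexOf("utm_") === 0) trackingKeys.push(key);
  });
  for (const key of trackingKeys) url.searchParams.delete(key);
  return url.toString();
}

// Priority scores from the pipeline description: home = 5, sitemap = 8, others = 10.
function priorityFor(normalizedUrl: string, fromSitemap: boolean): number {
  if (new URL(normalizedUrl).pathname === "/") return 5; // assumption: home is the root path
  return fromSitemap ? 8 : 10;
}

interface DiscoveredPage {
  url: string;
  fromSitemap: boolean;
}

interface ChildJob {
  url: string;
  priority: number;
}

// Dedupes, drops blacklisted URLs, and stops at the plan limit.
function planChildJobs(discovered: DiscoveredPage[], blacklist: string[], maxUrls: number): ChildJob[] {
  const seen: Record<string, boolean> = {};
  const jobs: ChildJob[] = [];
  for (const page of discovered) {
    const url = normalizeUrl(page.url);
    if (seen[url] || blacklist.indexOf(url) !== -1) continue;
    seen[url] = true;
    jobs.push({ url: url, priority: priorityFor(url, page.fromSitemap) });
    if (jobs.length >= maxUrls) break;
  }
  return jobs;
}
```

In BullMQ a lower priority number is processed earlier, so with these scores the home page would be picked up before sitemap entries and other pages.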
5. Vector Database Layer — Qdrant
Source: packages/zappway/lib/datastores/qdrant.ts (731 lines)
Each Datastore maps to exactly one Qdrant collection named zw_{datastoreId}, providing strict per-tenant vector isolation. QdrantManager includes an auto-migration routine that detects dimension or distance metric mismatches and recreates the collection transparently — enabling zero-downtime embedding model upgrades.
Collection Configuration
| Parameter | Value |
|---|---|
| Vector Size | 3072 |
| Distance Metric | Cosine |
| Collection Name | zw_{datastoreId} |
| HNSW m | 16 |
| HNSW ef_construct | 100 |
| Quantization | Scalar Int8 — 99th percentile |
| On-disk Payload | true |
| Write Consistency | majority |
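As a rough illustration, the table above could translate into a Qdrant create-collection body like the one below. The field names follow Qdrant's REST API; collectionName, buildCollectionConfig, and needsRecreate are hypothetical helpers, and the write-consistency setting is omitted because this page does not say which Qdrant parameter it maps to.

```typescript
const VECTOR_SIZE = 3072;

function collectionName(datastoreId: string): string {
  return "zw_" + datastoreId;
}

// Request body for creating a collection, using values from the table above.
function buildCollectionConfig() {
  return {
    vectors: { size: VECTOR_SIZE, distance: "Cosine" },
    hnsw_config: { m: 16, ef_construct: 100 },
    // Scalar Int8 quantization clipped at the 99th percentile.
    quantization_config: { scalar: { type: "int8", quantile: 0.99 } },
    on_disk_payload: true
  };
}

interface VectorParams {
  size: number;
  distance: string;
}

// Simplified version of the mismatch check an auto-migration routine needs:
// a different dimension or metric means the collection must be recreated.
function needsRecreate(existing: VectorParams, desired: VectorParams): boolean {
  return existing.size !== desired.size || existing.distance !== desired.distance;
}
```

Recreating a collection discards its points, so a migration along these lines presumably has to be followed by re-ingesting the affected datasources.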
Payload Schema (per point)
| Field | Type | Purpose |
|---|---|---|
| datastore_id | string | Tenant-scoped datastore |
| datasource_id | string | Source document |
| datasource_hash | string | Content hash for dedup |
| chunk_hash | string | Individual chunk hash |
| chunk_offset | integer | Position in document |
| text | string | Original text content |
| tags | string[] | User-defined tags |
| custom_id | string | External reference |
| page_number | integer | PDF page number |
| total_pages | integer | Total PDF pages |
6. Embedding Pipeline
Source: packages/zappway/lib/datastores/gemini-embeddings.ts, packages/zappway/lib/multimodal-memory/
All embeddings use Gemini Embedding 2 Preview, producing 3072-dimensional vectors that match Qdrant’s VECTOR_SIZE. The pipeline uses asymmetric taskType values per call site — RETRIEVAL_DOCUMENT during ingestion and RETRIEVAL_QUERY at search time — which is critical for retrieval quality with Gemini’s asymmetric embedding model.
Configuration
| Parameter | Value |
|---|---|
| Model | gemini-embedding-2-preview |
| Output Dimensions | 3072 |
| Batch Size | 100 (API limit per batchEmbedContents) |
| Auth | x-goog-api-key — env GOOGLE_AI_API_KEY |
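The batching and asymmetric taskType rules can be sketched as a function that only builds request bodies, without calling the API. The request field names (model, content.parts, taskType, outputDimensionality) are assumed to follow the Gemini batchEmbedContents REST shape, and buildEmbedBatches is a hypothetical helper, not a function from gemini-embeddings.ts.

```typescript
type EmbedTaskType = "RETRIEVAL_DOCUMENT" | "RETRIEVAL_QUERY";

const EMBEDDING_MODEL = "models/gemini-embedding-2-preview";
const OUTPUT_DIMENSIONS = 3072;
const BATCH_SIZE = 100; // API limit per batchEmbedContents call

interface EmbedRequest {
  model: string;
  content: { parts: { text: string }[] };
  taskType: EmbedTaskType;
  outputDimensionality: number;
}

interface EmbedBatch {
  requests: EmbedRequest[];
}

// Splits texts into batches of at most 100 and tags each request with the task type.
function buildEmbedBatches(texts: string[], taskType: EmbedTaskType): EmbedBatch[] {
  const batches: EmbedBatch[] = [];
  for (let start = 0; start < texts.length; start += BATCH_SIZE) {
    const slice = texts.slice(start, start + BATCH_SIZE);
    batches.push({
      requests: slice.map(function (text): EmbedRequest {
        return {
          model: EMBEDDING_MODEL,
          content: { parts: [{ text: text }] },
          taskType: taskType,
          outputDimensionality: OUTPUT_DIMENSIONS
        };
      })
    });
  }
  return batches;
}
```

Ingestion would call this with RETRIEVAL_DOCUMENT and the search path with RETRIEVAL_QUERY; mixing the two is the kind of mistake that degrades recall without producing an error.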
Multimodal Support
embedMultimodal() accepts GeminiPart[][] where each part can be { text } or { inlineData: { mimeType, data } } (base64-encoded images, video frames, audio, PDF pages). This embeds non-text content into the same 3072-dimensional space as text, enabling true multimodal semantic search.
The multimodal-memory/ module provides dedicated indexing (indexer.ts — 24.5KB), media-specific chunkers (media-chunkers.ts), collection management, search, and async queue triggers.
7. RAG & Retrieval Pipeline
Source: packages/zappway/lib/chat-v4/rag.ts (493 lines)
ZappWay implements Multi-Shot Adaptive RAG with up to 6 progressive retrieval attempts across 3 quality tiers. Each attempt relaxes similarity thresholds to maximize recall while evaluateRagQuality() enables early exit when results are strong enough. A circuit breaker prevents cascading failures, and ragMemo (in-memory cache) deduplicates identical attempts within a session.
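The circuit breaker mentioned above is described in the Vocabulary as opening after 3 consecutive failures for 60s. A minimal sketch of that behaviour follows; RagCircuitBreaker and its method names are hypothetical, and the real implementation in rag.ts may track state differently.

```typescript
class RagCircuitBreaker {
  private consecutiveFailures = 0;
  private openUntil = 0;
  private maxFailures: number;
  private cooldownMs: number;

  constructor(maxFailures: number = 3, cooldownMs: number = 60000) {
    this.maxFailures = maxFailures;
    this.cooldownMs = cooldownMs;
  }

  // False while the breaker is open, meaning retrieval should be skipped.
  canAttempt(now: number): boolean {
    return now >= this.openUntil;
  }

  // Any success resets the failure streak.
  recordSuccess(): void {
    this.consecutiveFailures = 0;
  }

  // The third consecutive failure opens the breaker for the cooldown period.
  recordFailure(now: number): void {
    this.consecutiveFailures += 1;
    if (this.consecutiveFailures >= this.maxFailures) {
      this.openUntil = now + this.cooldownMs;
      this.consecutiveFailures = 0;
    }
  }
}
```

Passing the clock in as a parameter keeps the class deterministic and easy to test; a production version would more likely read the current time itself.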
Adaptive Thresholds
| Attempt | Label | Deep Mode | Normal | TopK |
|---|---|---|---|---|
| 1 | strict | 0.55 | 0.48 | 30 |
| 2 | balanced-high | 0.48 | 0.42 | 45 |
| 3 | balanced | 0.42 | 0.38 | 60 |
| 4 | balanced-low | 0.37 | 0.32 | 75 |
| 5 | relaxed | 0.33 | 0.28 | 90 |
| 6 | permissive | 0.30 | 0.25 | 100 |
Deep Mode is triggered by shouldFavorDeepRag(query) and applies higher initial thresholds. minUcount is 1 for single-datastore and 2 for multi-datastore queries. Timeouts scale from 8s (Tier 1) to 12s (Tier 3), capped by remaining chat budget. If remainingMs() < 120s, max attempts reduce to 4.
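The threshold table and the budget rule can be combined into a small planning function plus a retry loop with early exit. This is a sketch only: planAttempts and multiShotRetrieve are hypothetical names, the search and quality callbacks stand in for the Qdrant query and evaluateRagQuality(), and they are synchronous here to keep the example self-contained, whereas the real calls are presumably asynchronous. Tiers, timeouts, minUcount, and ragMemo are left out.

```typescript
interface AttemptConfig {
  attempt: number;
  label: string;
  threshold: number;
  topK: number;
}

// Rows copied from the Adaptive Thresholds table.
const ATTEMPT_TABLE = [
  { label: "strict", deep: 0.55, normal: 0.48, topK: 30 },
  { label: "balanced-high", deep: 0.48, normal: 0.42, topK: 45 },
  { label: "balanced", deep: 0.42, normal: 0.38, topK: 60 },
  { label: "balanced-low", deep: 0.37, normal: 0.32, topK: 75 },
  { label: "relaxed", deep: 0.33, normal: 0.28, topK: 90 },
  { label: "permissive", deep: 0.30, normal: 0.25, topK: 100 }
];

// With less than 120s of chat budget left, only the first 4 attempts are planned.
function planAttempts(deepMode: boolean, remainingMs: number): AttemptConfig[] {
  const maxAttempts = remainingMs < 120000 ? 4 : ATTEMPT_TABLE.length;
  return ATTEMPT_TABLE.slice(0, maxAttempts).map(function (row, index): AttemptConfig {
    return {
      attempt: index + 1,
      label: row.label,
      threshold: deepMode ? row.deep : row.normal,
      topK: row.topK
    };
  });
}

interface ScoredChunk {
  text: string;
  score: number;
}

// Runs attempts in order, exits early on good results, else returns the largest hit set seen.
function multiShotRetrieve(
  plan: AttemptConfig[],
  search: (config: AttemptConfig) => ScoredChunk[],
  isGoodEnough: (hits: ScoredChunk[]) => boolean
): { hits: ScoredChunk[]; attemptsUsed: number } {
  let best: ScoredChunk[] = [];
  let attemptsUsed = 0;
  for (const config of plan) {
    attemptsUsed = config.attempt;
    const hits = search(config).filter(function (hit) {
      return hit.score >= config.threshold;
    });
    if (hits.length > best.length) best = hits;
    if (isGoodEnough(hits)) return { hits: hits, attemptsUsed: attemptsUsed };
  }
  return { hits: best, attemptsUsed: attemptsUsed };
}
```

Each later attempt trades precision for recall: the threshold drops while topK grows, so a query that finds nothing at 0.48 still has a chance at 0.25.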
8. Chat & Conversation Engine
Source: packages/zappway/lib/chat-v4/chat.ts (1147 lines), packages/zappway/lib/agent/tools/
The chat function orchestrates system prompt assembly, message history truncation, multi-shot RAG, runtime tool building, LLM execution with streaming, and the full model fallback chain — all within a shared time budget enforced at every checkpoint.
SSE Event Types
| Event | Purpose |
|---|---|
| answer | Streamed text delta |
| tool_call | Tool invocation notification |
| endpoint_response | HTTP tool response data |
| step | Processing step indicator |
| metadata | Model usage, sources, context window |
| error | Error notification |
| done | Stream completion signal |
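For readers unfamiliar with Server-Sent Events, the sketch below shows how the event names above could be framed on the wire. The event names come from the table; the payload shapes and the formatSseEvent helper are assumptions, since this page does not document what each event carries.

```typescript
type ChatEventName =
  | "answer"
  | "tool_call"
  | "endpoint_response"
  | "step"
  | "metadata"
  | "error"
  | "done";

// Serializes one event in the SSE wire format: an "event:" line, a "data:" line,
// and a blank line that terminates the frame.
function formatSseEvent(event: ChatEventName, data: Record<string, unknown>): string {
  return "event: " + event + "\n" + "data: " + JSON.stringify(data) + "\n\n";
}

// Example stream for a short answer (payload fields are illustrative only).
const exampleStream =
  formatSseEvent("step", { name: "retrieval" }) +
  formatSseEvent("answer", { delta: "Hello" }) +
  formatSseEvent("done", {});
```

Because JSON.stringify emits a single line, each payload fits in one data: line and no multi-line handling is needed.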
Runtime Tools
| Tool File | Tool Type | Purpose |
|---|---|---|
| datastore.ts | datastore | Dynamic RAG within tool loop |
| form.ts | form | Present and collect dynamic forms |
| http.ts | http | Call external APIs |
| request-human.ts | request_human | Transfer to human agent |
| lead-capture.ts | lead_capture | Capture visitor info |
| payment.ts | payment | Initiate payment flow |
| mark-as-resolved.ts | mark_as_resolved | Close support ticket |
| ui.ts | UI rendering | Custom UI components |
9. Multi-Tenant Isolation & Auth
Source: packages/zappway/lib/auth.ts (1661 lines)
NextAuth v5 with a custom Prisma adapter. Four sign-in methods are supported. On first sign-in, the platform auto-provisions the full tenant stack atomically.
Tenant Isolation by Layer
| Layer | Mechanism |
|---|---|
| Database | organizationId on every tenant model |
| API | Session middleware injects organization from session |
| Vector DB | One Qdrant collection per datastore (zw_{datastoreId}) |
| File Storage | S3 prefix organizations/{orgId}/ |
| Queues | organizationId carried in every job payload |
RBAC Model
| Role | Scope | Access |
|---|---|---|
| SUPERADMIN | Global | Full system |
| CUSTOMER | Global | Standard user |
| OWNER | Organization | Full org management |
| ADMIN | Organization | Manage agents and datastores |
| USER | Organization | Read-only |
Session Configuration
| Setting | Production | Development |
|---|---|---|
| Strategy | database | database |
| Cookie Name | __Secure-next-auth.session-token | next-auth.session-token |
| Cookie Domain | .{rootDomain} | undefined |
| Secure | true | false |
Locale resolution order: NEXT_LOCALE cookie → i18next cookie → Accept-Language header → default en.
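The fallback chain above can be expressed as a small resolver. The sketch assumes the second cookie is literally named i18next, that candidates are checked against a list of supported locales, and that Accept-Language entries are tried in header order without q-weight sorting; resolveLocale itself is a hypothetical helper.

```typescript
function resolveLocale(
  cookies: Record<string, string | undefined>,
  acceptLanguage: string | undefined,
  supported: string[]
): string {
  // 1 and 2: cookies, in priority order.
  const cookieCandidates = [cookies["NEXT_LOCALE"], cookies["i18next"]];
  for (const candidate of cookieCandidates) {
    if (candidate && supported.indexOf(candidate) !== -1) return candidate;
  }

  // 3: Accept-Language header, trying the full tag and then its base language.
  if (acceptLanguage) {
    for (const entry of acceptLanguage.split(",")) {
      const tag = (entry.split(";")[0] || "").trim().toLowerCase();
      if (supported.indexOf(tag) !== -1) return tag;
      const base = tag.split("-")[0] || "";
      if (supported.indexOf(base) !== -1) return base;
    }
  }

  // 4: default.
  return "en";
}
```

For example, with supported locales en, pt, and fr, a request with no cookies and the header de-DE,pt-BR;q=0.8 resolves to pt.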
10. Sync Orchestrator
Source: apps/workers-services/workers/check-and-sync-cron.ts
A cron-driven worker scans all datasources with status = synched, filters by lastSyncAt + syncInterval, and fans out individual load-datasource jobs. A check-stalled worker handles recovery of datasources stuck in running status beyond a configurable threshold.
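The two selection rules described above, due for re-sync and stuck in running, can be sketched as filters over datasource records. The field names lastSyncAt and syncInterval appear in the text, but their units, the runningSince field, the treatment of a missing lastSyncAt, and both function names are assumptions for illustration.

```typescript
interface SyncCandidate {
  id: string;
  status: string;
  lastSyncAt: number | null; // epoch ms; null means never synced (assumed to count as due)
  syncIntervalMs: number;
  runningSince: number | null; // hypothetical field used for stall detection
}

// Datasources in "synched" status whose interval has elapsed.
function selectDueDatasources(all: SyncCandidate[], now: number): SyncCandidate[] {
  return all.filter(function (ds) {
    if (ds.status !== "synched") return false;
    return ds.lastSyncAt === null || ds.lastSyncAt + ds.syncIntervalMs <= now;
  });
}

// Datasources that have been "running" for longer than the configured threshold.
function selectStalledDatasources(all: SyncCandidate[], now: number, thresholdMs: number): SyncCandidate[] {
  return all.filter(function (ds) {
    return ds.status === "running" && ds.runningSince !== null && ds.runningSince + thresholdMs <= now;
  });
}
```

Each datasource returned by the first filter would become one load-datasource job; the dedupe lock then guards against a second job for the same datasource starting in parallel.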
11. External Integrations & Channels
Source: packages/zappway/integrations/ (13 channel adapters)
All channels normalize inbound messages to the same internal Conversation + Message model and route through the unified chat-v4 engine.
Channel Capabilities
| Channel | Direction | Streaming | Form Support |
|---|---|---|---|
| dashboard | Bidirectional | SSE | Yes |
| website | Bidirectional | SSE | Yes |
| whatsapp | Bidirectional | No (webhook) | Limited |
| telegram | Bidirectional | No (webhook) | Limited |
| slack | Bidirectional | No (webhook) | No |
| crisp | Bidirectional | No (webhook) | No |
| mail | Async | No | No |
| api | Request/Response | Optional | No |
| zapier | Webhook | No | No |
| form | Form submit | No | Yes |
| zendesk | Bidirectional | No (webhook) | No |
| messenger | Bidirectional | No (webhook) | No |
| instagram | Bidirectional | No (webhook) | No |
ServiceProvider Auth
| Type | Auth Mechanism |
|---|---|
| website | API Key |
| whatsapp | Cloud API token |
| telegram | Bot token |
| slack | OAuth |
| crisp | API key + website ID |
| zendesk | API token |
| messenger / instagram | Page access token |
| notion | Integration token |
| google_drive | OAuth refresh token |
| zapier | Webhook URL |
12. File Storage & Media Pipeline
Source: packages/zappway/lib/aws.ts
Supports S3-compatible storage (AWS S3, Cloudflare R2, MinIO) via AWS SDK v3. Files are uploaded via presigned URLs, stored under organizations/{orgId}/, and pulled by FileLoader for extraction and ingestion.
S3 Environment Variables
| Variable | Purpose |
|---|---|
| APP_AWS_REGION | AWS region |
| APP_AWS_ACCESS_KEY | Access key ID |
| APP_AWS_SECRET_KEY | Secret access key |
| APP_AWS_S3_ENDPOINT | Custom endpoint (R2 / MinIO) |
| APP_AWS_S3_FORCE_PATH_STYLE | Path-style for R2 / MinIO |
| APP_AWS_S3_BUCKET | Target bucket |
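The variables above map naturally onto the options accepted by the AWS SDK v3 S3Client (region, endpoint, forcePathStyle, credentials). The sketch below builds that options object from an environment map without importing the SDK; buildS3Config and organizationObjectKey are hypothetical helpers, the string "true" as the flag value is an assumption, and the key layout below the organizations/{orgId}/ prefix is not specified on this page.

```typescript
type EnvMap = Record<string, string | undefined>;

interface S3Settings {
  bucket: string;
  clientConfig: {
    region: string;
    endpoint: string | undefined;
    forcePathStyle: boolean;
    credentials: { accessKeyId: string; secretAccessKey: string };
  };
}

// Reads the APP_AWS_* variables and fails fast when a required one is missing.
function buildS3Config(env: EnvMap): S3Settings {
  const region = env["APP_AWS_REGION"];
  const accessKeyId = env["APP_AWS_ACCESS_KEY"];
  const secretAccessKey = env["APP_AWS_SECRET_KEY"];
  const bucket = env["APP_AWS_S3_BUCKET"];
  if (!region || !accessKeyId || !secretAccessKey || !bucket) {
    throw new Error("Missing required APP_AWS_* variable");
  }
  return {
    bucket: bucket,
    clientConfig: {
      region: region,
      // Only set for S3-compatible providers such as R2 or MinIO.
      endpoint: env["APP_AWS_S3_ENDPOINT"] || undefined,
      forcePathStyle: env["APP_AWS_S3_FORCE_PATH_STYLE"] === "true",
      credentials: { accessKeyId: accessKeyId, secretAccessKey: secretAccessKey }
    }
  };
}

// Tenant-scoped object key; everything after the org prefix is illustrative.
function organizationObjectKey(orgId: string, fileName: string): string {
  return "organizations/" + orgId + "/" + fileName;
}
```

The clientConfig object would be passed to new S3Client(...), and the bucket name to each command.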
13. LLM Orchestration & Model Router
Source: packages/zappway/lib/config.ts (943 lines), packages/zappway/lib/chat-model/model.ts (745 lines)
40+ models across 9 providers via a unified OpenAI-SDK-compatible interface. Provider routing is automatic based on each model’s baseUrl in ModelConfig. OpenAI, Google Gemini, and OpenRouter are all accessed through the same OpenAI SDK client with different base URLs and API keys.
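Routing by base URL can be sketched as a lookup that returns the options an OpenAI SDK client would be constructed with. The two base URLs are the public OpenAI-compatible endpoints of the Gemini API and OpenRouter; the ModelConfig shape, the claude_3_5_haiku key, the provider model IDs, and the OPENAI_API_KEY and OPENROUTER_API_KEY variable names are assumptions, not values read from config.ts.

```typescript
const GEMINI_OPENAI_BASE_URL = "https://generativelanguage.googleapis.com/v1beta/openai/";
const OPENROUTER_BASE_URL = "https://openrouter.ai/api/v1";

interface ModelConfig {
  modelId: string;
  baseUrl?: string; // absent means the OpenAI default endpoint
}

// Three illustrative entries; the real table holds 40+ models.
const MODEL_CONFIG: Record<string, ModelConfig> = {
  gpt_4o_mini: { modelId: "gpt-4o-mini" },
  gemini_flash_2_0: { modelId: "gemini-2.0-flash", baseUrl: GEMINI_OPENAI_BASE_URL },
  claude_3_5_haiku: { modelId: "anthropic/claude-3.5-haiku", baseUrl: OPENROUTER_BASE_URL }
};

interface ClientOptions {
  model: string;
  apiKey: string;
  baseURL: string | undefined;
}

// Picks the API key that matches the model's base URL.
function resolveClientOptions(modelName: string, env: Record<string, string | undefined>): ClientOptions {
  const config = MODEL_CONFIG[modelName];
  if (config === undefined) throw new Error("Unknown model: " + modelName);
  let keyName = "OPENAI_API_KEY";
  if (config.baseUrl === GEMINI_OPENAI_BASE_URL) keyName = "GOOGLE_AI_API_KEY";
  if (config.baseUrl === OPENROUTER_BASE_URL) keyName = "OPENROUTER_API_KEY";
  const apiKey = env[keyName];
  if (!apiKey) throw new Error("Missing " + keyName);
  return { model: config.modelId, apiKey: apiKey, baseURL: config.baseUrl };
}
```

The returned apiKey and baseURL match the constructor options of the OpenAI SDK client, which is what lets one client class serve all three routes.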
Model Inventory
🧠 OpenAI — Direct
| Model | Max Tokens | Cost |
|---|---|---|
| GPT-4o / GPT-4o Mini | 128K | 15 / 4 |
| GPT-5 / Mini / Nano | 400K | 15 / 5 / 4 |
| GPT-5.1 / 5.1 Chat | 400K | 15 / 12 |
| GPT-5.2 / 5.2 Chat / 5.2 Chat Latest | 400K | 20 / 15 / 15 |
| GPT-5.3 Chat Latest / 5.3 Codex | 400K / 128K | 15 |
| GPT-5.4 / 5.4 Mini / 5.4 Nano | 400K | 15 / 6 / 5 |
🟡 Google — Direct via Gemini API
| Model | Max Tokens | Cost |
|---|---|---|
| Gemini Flash 2.0 / 2.5 | 1M | 5 / 10 |
| Gemini Pro 2.5 | 1M | 12 |
| Gemini 3 Pro / 3 Flash | 1M | 15 / 10 |
| Gemini 3 Pro Image | 64K | 15 |
| Gemini 3.1 Flash Lite / Flash Image / Pro | 1M / 64K / 1M | 5 / 8 / 15 |
🔶 Anthropic — via OpenRouter
| Model | Max Tokens | Cost |
|---|---|---|
| Claude 3.5 Haiku / 3.5 Sonnet | 200K | 12 / 15 |
| Claude 4 Sonnet | 200K | 15 |
| Claude 4.5 Haiku / 4.5 Sonnet | 200K | 12 / 15 |
| Claude Opus 4.5 / Opus 4.6 / Sonnet 4.6 | 200K | 20 / 20 / 15 |
Other Providers — via OpenRouter
| Model | Provider | Cost |
|---|---|---|
| DeepSeek V3.2 / R1 | DeepSeek | 5 / 8 |
| Llama-4 Maverick / Scout | Meta | 10 / 5 |
| Mistral Large 2512 / 8x22B / 8x7B / Small Creative | Mistral | 5 / 5 / 2 / 2 |
| Sonar Pro | Perplexity | 15 |
| Grok 4 / 4.1 Fast | xAI | 15 / 5 |
| Nova Premier / Nova 2 Lite | Amazon | 12 / 6 |
| LFM2 8B / LFM 2 24B | Liquid | 1 / 1 |
| MiniMax M2.7 | MiniMax | 5 |
| MiMo-V2-Omni / MiMo-V2-Pro | Xiaomi | 6 / 10 |
gpt_4o_mini, gpt_5_mini, gpt_5_nano, gpt_5_4_mini, gpt_5_4_nano, gemini_flash_2_0, lfm2_8b_a1b
Important: GPT-5 family models do not support manual temperature adjustment; the temperature is chosen automatically.
14. Observability & Monitoring
Source: packages/zappway/lib/logger.ts, apps/workers-services/sentry.*.config.ts
| Component | Technology | Status |
|---|---|---|
| Core Logger | Pino | ✅ Active |
| Pretty Print | pino-pretty | ✅ Active |
| Axiom Cloud | @axiomhq/pino | ❌ Commented out |
| Chat Logger | Pino child logChat | ✅ Active |
| Error Tracking | Sentry (client + server + edge) | ✅ Active |
Worker Health: each worker writes a WorkerHealth payload to Redis (health:worker:{name}, 30s TTL) every 10s. Fields: status, startedAt, lastActivityAt, jobsProcessed, jobsFailed, queueLength, system.memoryMB, system.uptimeMs. Exposed via /api/workers/health.
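The heartbeat can be sketched as a typed payload plus a function that prepares the Redis write. The field names, key format, TTL, and interval come from the description above; the field types, the status value used in the usage example, and buildHeartbeatWrite are assumptions.

```typescript
interface WorkerHealth {
  status: string; // allowed values are not listed on this page
  startedAt: number; // epoch ms (assumed)
  lastActivityAt: number;
  jobsProcessed: number;
  jobsFailed: number;
  queueLength: number;
  system: { memoryMB: number; uptimeMs: number };
}

const HEALTH_TTL_SECONDS = 30;
const HEARTBEAT_INTERVAL_MS = 10000;

function healthKey(workerName: string): string {
  return "health:worker:" + workerName;
}

// Prepares the arguments for a Redis write of the form: SET <key> <value> EX <ttlSeconds>.
function buildHeartbeatWrite(
  workerName: string,
  health: WorkerHealth
): { key: string; value: string; ttlSeconds: number } {
  return {
    key: healthKey(workerName),
    value: JSON.stringify(health),
    ttlSeconds: HEALTH_TTL_SECONDS
  };
}
```

With a 10s interval and a 30s TTL, a worker can miss two consecutive heartbeats before its key expires, so a missing key is a reasonably strong signal that the worker is down.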
Workers also stream real-time logs to Redis Pub/Sub channel logs:datasource for live dashboard monitoring.
15. Gaps & Recommendations
Identified Gaps
| # | Area | Gap | Severity |
|---|---|---|---|
| 1 | Observability | Axiom integration commented out — no cloud logging in production | 🟡 Medium |
| 2 | Rate Limiting | No visible rate limiting on chat API endpoints | 🔴 High |
| 3 | Queue DLQ | No Dead Letter Queue for permanently failed jobs | 🟡 Medium |
| 4 | Embedding Fallback | Single embedding provider (Gemini) — no fallback if API is down | 🟡 Medium |
| 5 | Vector DB Backup | No documented Qdrant backup/snapshot strategy | 🟡 Medium |
| 6 | Test Coverage | No test files found in scanned directories | 🔴 High |
Recommendations
- Enable cloud logging: Uncomment Axiom or use Datadog / Grafana Loki for production log retention.
- Add rate limiting: Per-organization token-bucket on chat endpoints.
- Implement DLQ: BullMQ supports deadLetterQueue option; enable for all 3 queues.
- Embedding fallback: text-embedding-3-large from OpenAI (3072-dim compatible) as secondary.
- Qdrant snapshots: Cron-based snapshot to S3 for disaster recovery.
- Test coverage: Integration tests for the RAG pipeline and ingestion workers as a starting priority.
Architectural Strengths
| Strength | Detail |
|---|---|
| Multi-shot RAG | 6-attempt adaptive retrieval with quality evaluation and circuit breaker |
| Auto-migration | Qdrant collection recreated transparently on embedding model change |
| Graceful Shutdown | All workers handle SIGTERM / SIGINT cleanly |
| Dedup Locks | Redis SET NX prevents parallel processing of the same datasource |
| Budget Management | Time-budget enforced at every checkpoint in the chat pipeline |
| 40+ Models | Unified OpenAI-SDK interface across 9 providers |
| True Multimodal | End-to-end multimodal embed, chunk, and search for images, audio, and video |
| Full Tenant Isolation | One Qdrant collection per datastore and a separate S3 prefix per organization |
Vocabulary
| Term | Description |
|---|---|
| Datastore | Container of Datasources — maps to a Qdrant collection (zw_{datastoreId}) |
| Datasource | Individual data source — web_page, file, youtube_video, etc. |
| AppDocument | Normalized document format output by all Loaders |
| Multi-Shot RAG | Adaptive retrieval with up to 6 progressive attempts across 3 quality tiers |
| Circuit Breaker | After 3 consecutive failures, suspends RAG for 60s to prevent cascading errors |
| Deep Mode | High-threshold RAG mode triggered by shouldFavorDeepRag(query) |
| BullMQ-Pro | Redis-backed job queue with advanced scheduling and flow features |
| Qdrant | Vector database used for multimodal semantic search |
| Gemini Embedding 2 | Multimodal embedding model producing 3072-dimensional vectors |
| taskType | Asymmetric embedding mode: RETRIEVAL_DOCUMENT for ingestion, RETRIEVAL_QUERY for search |
| AgentsOnDatastores | M:N junction table linking Agents to Datastores |
| HNSW | Hierarchical Navigable Small World — approximate nearest-neighbor index in Qdrant |
Last updated: March 2026

