Platform Architecture - ZappWay - Connect custom data to large language models

What you’ll learn: This page is the authoritative architecture reference for ZappWay, generated from a full codebase scan (commit 14982f37). It covers every major subsystem: the monorepo structure, database schema, queue/worker topology, ingestion loaders, Qdrant vector layer, embedding pipeline, adaptive multi-shot RAG, chat engine, auth, sync orchestrator, all 13 channel integrations, file storage, LLM model router, and observability.

1. Overall System Architecture

ZappWay is a pnpm monorepo structured into 3 applications and 2 package trees. The apps/zappway app hosts the Dashboard, Landing, Blog, and Docs. All core business logic lives in packages/zappway/lib (~160 files). A dedicated apps/workers-services app runs all background workers and the WhatsApp Bridge API in isolation, consuming jobs from Redis-backed BullMQ-Pro queues.

Application Map

App	Path	Framework	Purpose
Dashboard	`apps/zappway/dashboard`	Next.js	Main SaaS UI
Blog	`apps/zappway/blog`	Next.js	Marketing blog
Docs	`apps/zappway/docs`	MDX-based	Product documentation
Landing Page	`apps/zappway/lp`	Next.js	Landing page
Workers Services	`apps/workers-services`	Next.js	Background workers + WA Bridge
ZappFlux	`apps/zappflux`	Next.js	Secondary product (shared DB)

Package Map

Package	Path	Purpose
`@zappway/prisma`	`packages/zappway/prisma`	Prisma schema, client, browser exports
`@zappway/lib`	`packages/zappway/lib`	Core business logic (~160 files)
`@zappway/integrations`	`packages/zappway/integrations`	13 channel adapters
`@zappway/ui`	`packages/zappway/ui`	Shared React components

2. Database Layer

Source: packages/zappway/prisma/schema.prisma (1548 lines) · 33 models · 23 enums PostgreSQL via Prisma with fullTextSearch and fullTextIndex preview features enabled. The Organization model is the root tenant boundary — every other entity is scoped to it directly or transitively.

Full Model Inventory

#	Model	Tenant Key	Purpose
1	`User`	—	Platform user account
2	`Organization`	`id` (root)	Tenant container
3	`Membership`	`organizationId`	User–Org relationship (RBAC)
4	`ApiKey`	`organizationId`	Programmatic access
5	`Usage`	`organizationId`	Billing usage tracking
6	`Agent`	`organizationId`	AI Employee definition
7	`Datastore`	`organizationId`	Knowledge base container
8	`AppDatasource`	`organizationId`	Individual data source
9	`Conversation`	`organizationId`	Chat session
10	`Message`	via Conversation	Individual message
11	`Tool`	via Agent	Agent tool definition
12	`Form`	via Agent	Dynamic form
13	`FormSubmission`	via Form	Form response
14	`Contact`	`organizationId`	CRM contact
15	`Subscription`	`organizationId`	Billing plan
16	`Product`	—	Stripe product
17	`Price`	—	Stripe price
18	`Account`	`userId`	OAuth provider link
19	`Session`	`organizationId`	Auth session
20	`VerificationToken`	—	Email verification
21	`ServiceProvider`	`organizationId`	Integration credentials
22	`ActionApproval`	via Conversation	Tool approval request
23	`Attachment`	via Message	Message attachment
24	`ExternalIntegration`	via Agent	Webhook/external config
25	`LLMTaskOutput`	`organizationId`	Async LLM task result
26	`MailInbox`	`organizationId`	Email inbox config
27	`ConversationCrmTag`	`organizationId`	CRM tagging
28	`Lead`	`organizationId`	Lead capture
29	`XPBNPEval`	—	A/B test evaluation
30	`AgentsOnDatastores`	—	M:N agent–datastore join
31	`Coupon`	—	Billing coupon
32	`UserApiKey`	`userId`	User-level API key
33	`CrmLog`	`organizationId`	CRM activity log

Key Enums

Enum	Sample Values
`DatasourceType`	`web_page`, `web_site`, `text`, `file`, `google_drive_file`, `notion`, `youtube_video`, `youtube_bulk`, `qa`, `api`
`DatasourceStatus`	`unsynched`, `pending`, `running`, `synched`, `error`, `usage_limit_reached`, `stalled`
`ConversationChannel`	`dashboard`, `website`, `whatsapp`, `telegram`, `slack`, `crisp`, `form`, `mail`, `api`, `zapier`, `zendesk`, `messenger`, `instagram`
`AgentModelName`	50+ model enum values — see Section 13
`SubscriptionPlan`	`level_0`, `level_1`, `level_2`, `level_3`, `level_4`
`MembershipRole`	`OWNER`, `ADMIN`, `USER`
`GlobalRole`	`SUPERADMIN`, `CUSTOMER`
`ToolType`	`datastore`, `http`, `form`, `mark_as_resolved`, `request_human`, `lead_capture`, `payment`

3. Queue & Worker Architecture

Source: packages/zappway/lib/types/index.ts, apps/workers-services/workers/ Three dedicated BullMQ-Pro queues separate concerns. The load-datasource queue is the primary active queue; the other two are partially implemented.

Worker Configuration — Datasource Loader

Config	Value	Env Override
Queue Name	`load-datasource`	—
Concurrency	3	`DS_WORKER_CONCURRENCY`
Lock Duration	8 min	`DS_BULL_LOCK_DURATION_MS`
Stalled Interval	90s	`DS_BULL_STALLED_INTERVAL_MS`
Max Stalled Count	2	`DS_BULL_MAX_STALLED_COUNT`
Dedupe Lock TTL	10 min	`DS_JOB_LOCK_TTL_MS`
Remove on Complete	Last 1000	—
Remove on Fail	Last 5000	—

Dedupe Mechanism: Redis SET NX with key ld:lock:{datasourceId} prevents the same datasource from running in parallel. Graceful Shutdown: Handles SIGTERM and SIGINT — closes the worker, quits Redis cleanly.

Worker Inventory

Worker	Queue	Status
`datasource-loader.ts`	`load-datasource`	✅ Active
`check-and-sync-cron.ts`	cron-based	✅ Active
`generate-rag.ts`	`generate-rag-document`	✅ Active
`index-media-memory.ts`	`index-media-memory`	✅ Active
`check-stalled.ts`	internal	✅ Active
`trial-expiry-check.ts`	cron-based	✅ Active
`usage-reconciler.ts`	cron-based	✅ Active
`scheduler.ts`	orchestrator	✅ Active

4. Ingestion Engine & Loaders

Source: packages/zappway/lib/datastores/datasources/, packages/zappway/lib/loaders/ All loaders extend DatasourceLoaderBase. The entry point taskLoadDatasource() selects the correct loader at runtime based on DatasourceType. Output is a normalized AppDocument[] array fed into the chunking engine and then into Qdrant.

Loader Mapping

DatasourceType	Loader	Source
`web_page`	`WebPageLoader`	URL fetch
`web_site`	`WebSiteLoader`	Sitemap / crawl
`file`	`FileLoader`	S3 upload
`text`	`TextLoader`	Raw text input
`qa`	`QALoader`	Q&A pairs
`youtube_video`	`YouTubeLoader`	YouTube API
`youtube_bulk`	`YouTubeLoader`	Multi-video
`google_drive_file`	`GoogleDriveLoader`	Drive API
`google_drive_folder`	`GoogleDriveLoader`	Drive API
`notion`	`NotionLoader`	Notion API
`notion_page`	`NotionLoader`	Notion API
`api`	`APILoader`	HTTP endpoint

WebSite Loader — Pipeline Detail

Source: packages/zappway/lib/loaders/web-site.ts (515 lines) The WebSiteLoader is the most complex loader. Its pipeline runs in 6 stages: (1) Discovery — parses sitemap XML or crawls via findDomainPages(); (2) Normalization — URL dedup, strip UTM params, lowercase hostname; (3) Blacklist filtering — applies black_listed_urls config; (4) HTTP probing — HEAD → GET with semver path repair on 404s; (5) Child management — upserts web_page child datasources, deletes orphans; (6) Enqueueing — emits child jobs with priority scores (home = 5, sitemap = 8, others = 10). Concurrency is capped at 6 via mapWithConcurrency. Plan limit applied via accountConfig[plan].limits.maxWebsiteURL (default: 25).

5. Vector Database Layer — Qdrant

Source: packages/zappway/lib/datastores/qdrant.ts (731 lines) Each Datastore maps to exactly one Qdrant collection named zw_{datastoreId}, providing strict per-tenant vector isolation. QdrantManager includes an auto-migration routine that detects dimension or distance metric mismatches and recreates the collection transparently — enabling zero-downtime embedding model upgrades.

Collection Configuration

Parameter	Value
Vector Size	3072
Distance Metric	Cosine
Collection Name	`zw_{datastoreId}`
HNSW `m`	16
HNSW `ef_construct`	100
Quantization	Scalar Int8 — 99th percentile
On-disk Payload	`true`
Write Consistency	`majority`

Payload Schema (per point)

Field	Type	Purpose
`datastore_id`	string	Tenant-scoped datastore
`datasource_id`	string	Source document
`datasource_hash`	string	Content hash for dedup
`chunk_hash`	string	Individual chunk hash
`chunk_offset`	integer	Position in document
`text`	string	Original text content
`tags`	string[]	User-defined tags
`custom_id`	string	External reference
`page_number`	integer	PDF page number
`total_pages`	integer	Total PDF pages

6. Embedding Pipeline

Source: packages/zappway/lib/datastores/gemini-embeddings.ts, packages/zappway/lib/multimodal-memory/ All embeddings use Gemini Embedding 2 Preview, producing 3072-dimensional vectors that match Qdrant’s VECTOR_SIZE. The pipeline uses asymmetric taskType values per call site — RETRIEVAL_DOCUMENT during ingestion and RETRIEVAL_QUERY at search time — which is critical for retrieval quality with Gemini’s asymmetric embedding model.

Configuration

Parameter	Value
Model	`gemini-embedding-2-preview`
Output Dimensions	3072
Batch Size	100 (API limit per `batchEmbedContents`)
Auth	`x-goog-api-key` — env `GOOGLE_AI_API_KEY`

Multimodal Support

embedMultimodal() accepts GeminiPart[][] where each part can be { text } or { inlineData: { mimeType, data } } (base64-encoded images, video frames, audio, PDF pages). This embeds non-text content into the same 3072-dimensional space as text, enabling true multimodal semantic search. The multimodal-memory/ module provides dedicated indexing (indexer.ts — 24.5KB), media-specific chunkers (media-chunkers.ts), collection management, search, and async queue triggers.

7. RAG & Retrieval Pipeline

Source: packages/zappway/lib/chat-v4/rag.ts (493 lines) ZappWay implements Multi-Shot Adaptive RAG with up to 6 progressive retrieval attempts across 3 quality tiers. Each attempt relaxes similarity thresholds to maximize recall while evaluateRagQuality() enables early exit when results are strong enough. A circuit breaker prevents cascading failures, and ragMemo (in-memory cache) deduplicates identical attempts within a session.

Adaptive Thresholds

Attempt	Label	Deep Mode	Normal	TopK
1	`strict`	0.55	0.48	30
2	`balanced-high`	0.48	0.42	45
3	`balanced`	0.42	0.38	60
4	`balanced-low`	0.37	0.32	75
5	`relaxed`	0.33	0.28	90
6	`permissive`	0.30	0.25	100

Deep Mode is triggered by shouldFavorDeepRag(query) and applies higher initial thresholds. minUcount is 1 for single-datastore and 2 for multi-datastore queries. Timeouts scale from 8s (Tier 1) to 12s (Tier 3), capped by remaining chat budget. If remainingMs() < 120s, max attempts reduce to 4.

8. Chat & Conversation Engine

Source: packages/zappway/lib/chat-v4/chat.ts (1147 lines), packages/zappway/lib/agent/tools/ The chat function orchestrates system prompt assembly, message history truncation, multi-shot RAG, runtime tool building, LLM execution with streaming, and the full model fallback chain — all within a shared time budget enforced at every checkpoint.

SSE Event Types

Event	Purpose
`answer`	Streamed text delta
`tool_call`	Tool invocation notification
`endpoint_response`	HTTP tool response data
`step`	Processing step indicator
`metadata`	Model usage, sources, context window
`error`	Error notification
`done`	Stream completion signal

Runtime Tools

Tool File	Tool Type	Purpose
`datastore.ts`	`datastore`	Dynamic RAG within tool loop
`form.ts`	`form`	Present and collect dynamic forms
`http.ts`	`http`	Call external APIs
`request-human.ts`	`request_human`	Transfer to human agent
`lead-capture.ts`	`lead_capture`	Capture visitor info
`payment.ts`	`payment`	Initiate payment flow
`mark-as-resolved.ts`	`mark_as_resolved`	Close support ticket
`ui.ts`	UI rendering	Custom UI components

9. Multi-Tenant Isolation & Auth

Source: packages/zappway/lib/auth.ts (1661 lines) NextAuth v5 with a custom Prisma adapter. Four sign-in methods are supported. On first sign-in, the platform auto-provisions the full tenant stack atomically.

Tenant Isolation by Layer

Layer	Mechanism
Database	`organizationId` on every tenant model
API	Session middleware injects `organization` from session
Vector DB	One Qdrant collection per datastore (`zw_{datastoreId}`)
File Storage	S3 prefix `organizations/{orgId}/`
Queues	`organizationId` carried in every job payload

RBAC Model

Role	Scope	Access
`SUPERADMIN`	Global	Full system
`CUSTOMER`	Global	Standard user
`OWNER`	Organization	Full org management
`ADMIN`	Organization	Manage agents and datastores
`USER`	Organization	Read-only

Session Configuration

Setting	Production	Development
Strategy	`database`	`database`
Cookie Name	`__Secure-next-auth.session-token`	`next-auth.session-token`
Cookie Domain	`.{rootDomain}`	`undefined`
Secure	`true`	`false`

Locale Support: 37 locales, including RTL (Arabic, Hebrew, Persian, Urdu). Resolution order: URL path → NEXT_LOCALE cookie → i18next cookie → Accept-Language header → default en.

10. Sync Orchestrator

Source: apps/workers-services/workers/check-and-sync-cron.ts A cron-driven worker scans all datasources with status = synched, filters by lastSyncAt + syncInterval, and fans out individual load-datasource jobs. A check-stalled worker handles recovery of datasources stuck in running status beyond a configurable threshold.

11. External Integrations & Channels

Source: packages/zappway/integrations/ — 13 channel adapters All channels normalize inbound messages to the same internal Conversation + Message model and route through the unified chat-v4 engine.

Channel Capabilities

Channel	Direction	Streaming	Form Support
`dashboard`	Bidirectional	SSE	Yes
`website`	Bidirectional	SSE	Yes
`whatsapp`	Bidirectional	No (webhook)	Limited
`telegram`	Bidirectional	No (webhook)	Limited
`slack`	Bidirectional	No (webhook)	No
`crisp`	Bidirectional	No (webhook)	No
`mail`	Async	No	No
`api`	Request/Response	Optional	No
`zapier`	Webhook	No	No
`form`	Form submit	No	Yes
`zendesk`	Bidirectional	No (webhook)	No
`messenger`	Bidirectional	No (webhook)	No
`instagram`	Bidirectional	No (webhook)	No

ServiceProvider Auth

Type	Auth Mechanism
`website`	API Key
`whatsapp`	Cloud API token
`telegram`	Bot token
`slack`	OAuth
`crisp`	API key + website ID
`zendesk`	API token
`messenger` / `instagram`	Page access token
`notion`	Integration token
`google_drive`	OAuth refresh token
`zapier`	Webhook URL

12. File Storage & Media Pipeline

Source: packages/zappway/lib/aws.ts Supports S3-compatible storage (AWS S3, Cloudflare R2, MinIO) via AWS SDK v3. Files are uploaded via presigned URLs, stored under organizations/{orgId}/, and pulled by FileLoader for extraction and ingestion.

S3 Environment Variables

Variable	Purpose
`APP_AWS_REGION`	AWS region
`APP_AWS_ACCESS_KEY`	Access key ID
`APP_AWS_SECRET_KEY`	Secret access key
`APP_AWS_S3_ENDPOINT`	Custom endpoint (R2 / MinIO)
`APP_AWS_S3_FORCE_PATH_STYLE`	Path-style for R2 / MinIO
`APP_AWS_S3_BUCKET`	Target bucket

13. LLM Orchestration & Model Router

Source: packages/zappway/lib/config.ts (943 lines), packages/zappway/lib/chat-model/model.ts (745 lines) 50+ models across 12 providers via a unified OpenAI-SDK-compatible interface. Provider routing is automatic based on each model’s baseUrl in ModelConfig. OpenAI, Google Gemini, and OpenRouter are all accessed through the same OpenAI SDK client with different base URLs and API keys.

Model Inventory

🧠 OpenAI — Direct

Model	Max Tokens	Cost
GPT-4o / GPT-4o Mini	128K	15 / 4
GPT-5 / Mini / Nano	400K	15 / 5 / 4
GPT-5.1 / 5.1 Chat	400K	15 / 12
GPT-5.2 / 5.2 Chat / 5.2 Chat Latest	400K	20 / 15 / 15
GPT-5.3 Chat Latest / 5.3 Codex	400K / 128K	15
GPT-5.4 / 5.4 Mini / 5.4 Nano	400K	15 / 6 / 5

🟡 Google — Direct via Gemini API

Model	Max Tokens	Cost
Gemini Flash 2.0 / 2.5	1M	5 / 10
Gemini Pro 2.5	1M	12
Gemini 3 Pro / 3 Flash	1M	15 / 10
Gemini 3 Pro Image	64K	15
Gemini 3.1 Flash Lite / Flash Image / Pro	1M / 64K / 1M	5 / 8 / 15

🔶 Anthropic — via OpenRouter

Model	Max Tokens	Cost
Claude 3.5 Haiku / 3.5 Sonnet	200K	12 / 15
Claude 4 Sonnet	200K	15
Claude 4.5 Haiku / 4.5 Sonnet	200K	12 / 15
Claude Opus 4.5 / Opus 4.6 / Sonnet 4.6	200K	20 / 20 / 15

Other Providers — via OpenRouter

Model	Provider	Cost
DeepSeek V3.2 / R1	DeepSeek	5 / 8
Llama-4 Maverick / Scout	Meta	10 / 5
Mistral Large 2512 / 8x22B / 8x7B / Small Creative	Mistral	5 / 5 / 2 / 2
Sonar Pro	Perplexity	15
Grok 4 / 4.1 Fast	xAI	15 / 5
Nova Premier / Nova 2 Lite	Amazon	12 / 6
LFM2 8B / LFM 2 24B	Liquid	1 / 1
MiniMax M2.7	MiniMax	5
MiMo-V2-Omni / MiMo-V2-Pro	Xiaomi	6 / 10

Free Tier Models: gpt_4o_mini, gpt_5_mini, gpt_5_nano, gpt_5_4_mini, gpt_5_4_nano, gemini_flash_2_0

Important: GPT-5 family models do not support manual temperature adjustment — they use automatic temperature recognition.

14. Observability & Monitoring

Source: packages/zappway/lib/logger.ts, apps/workers-services/sentry.*.config.ts

Component	Technology	Status
Core Logger	Pino	✅ Active
Pretty Print	pino-pretty	✅ Active
Axiom Cloud	@axiomhq/pino	❌ Commented out
Chat Logger	Pino child `logChat`	✅ Active
Error Tracking	Sentry (client + server + edge)	✅ Active

Workers publish a WorkerHealth payload to Redis (health:worker:{name}, 30s TTL) every 10s. Fields: status, startedAt, lastActivityAt, jobsProcessed, jobsFailed, queueLength, system.memoryMB, system.uptimeMs. Exposed via /api/workers/health. Workers also stream real-time logs to Redis Pub/Sub channel logs:datasource for live dashboard monitoring.

15. Gaps & Recommendations

Identified Gaps

#	Area	Gap	Severity
1	Observability	Axiom integration commented out — no cloud logging in production	🟡 Medium
2	Rate Limiting	No visible rate limiting on chat API endpoints	🔴 High
3	Queue DLQ	No Dead Letter Queue for permanently failed jobs	🟡 Medium
4	Embedding Fallback	Single embedding provider (Gemini) — no fallback if API is down	🟡 Medium
5	Vector DB Backup	No documented Qdrant backup/snapshot strategy	🟡 Medium
6	Test Coverage	No test files found in scanned directories	🔴 High

Recommendations

Enable cloud logging — Uncomment Axiom or use Datadog / Grafana Loki for production log retention.
Add rate limiting — Per-organization token-bucket on chat endpoints.
Implement DLQ — BullMQ supports deadLetterQueue option; enable for all 3 queues.
Embedding fallback — text-embedding-3-large from OpenAI (3072-dim compatible) as secondary.
Qdrant snapshots — Cron-based snapshot to S3 for disaster recovery.
Test coverage — Integration tests for the RAG pipeline and ingestion workers as a starting priority.

Architectural Strengths

Strength	Detail
Multi-shot RAG	6-attempt adaptive retrieval with quality evaluation and circuit breaker
Auto-migration	Qdrant collection recreated transparently on embedding model change
Graceful Shutdown	All workers handle `SIGTERM` / `SIGINT` cleanly
Dedup Locks	Redis `SET NX` prevents parallel processing of the same datasource
Budget Management	Time-budget enforced at every checkpoint in the chat pipeline
50+ Models	Unified OpenAI-SDK interface across 12 providers
True Multimodal	End-to-end multimodal embed, chunk, and search for images, audio, and video
Full Tenant Isolation	Separate Qdrant collections and S3 prefixes per organization

Vocabulary

Term	Description
Datastore	Container of Datasources — maps to a Qdrant collection (`zw_{datastoreId}`)
Datasource	Individual data source — `web_page`, `file`, `youtube_video`, etc.
AppDocument	Normalized document format output by all Loaders
Multi-Shot RAG	Adaptive retrieval with up to 6 progressive attempts across 3 quality tiers
Circuit Breaker	Stops RAG after 3 consecutive failures for 60s to prevent cascading errors
Deep Mode	High-threshold RAG mode triggered by `shouldFavorDeepRag(query)`
BullMQ-Pro	Redis-backed job queue with advanced scheduling and flow features
Qdrant	Vector database used for multimodal semantic search
Gemini Embedding 2	Multimodal embedding model producing 3072-dimensional vectors
taskType	Asymmetric embedding mode: `RETRIEVAL_DOCUMENT` for ingestion, `RETRIEVAL_QUERY` for search
AgentsOnDatastores	M:N junction table linking Agents to Datastores
HNSW	Hierarchical Navigable Small World — approximate nearest-neighbor index in Qdrant

· Last updated: March 2026

​🔢 Table of Contents

​1. Overall System Architecture

​Application Map

​Package Map

​2. Database Layer

​Full Model Inventory

​Key Enums

​3. Queue & Worker Architecture

​Worker Configuration — Datasource Loader

​Worker Inventory

​4. Ingestion Engine & Loaders

​Loader Mapping

​WebSite Loader — Pipeline Detail

​5. Vector Database Layer — Qdrant

​Collection Configuration

​Payload Schema (per point)

​6. Embedding Pipeline

​Configuration

​Multimodal Support

​7. RAG & Retrieval Pipeline

​Adaptive Thresholds

​8. Chat & Conversation Engine

​SSE Event Types

​Runtime Tools

​9. Multi-Tenant Isolation & Auth

​Tenant Isolation by Layer

​RBAC Model

​Session Configuration

​10. Sync Orchestrator

​11. External Integrations & Channels

​Channel Capabilities

​ServiceProvider Auth

​12. File Storage & Media Pipeline

​S3 Environment Variables

​13. LLM Orchestration & Model Router

​Model Inventory

​🧠 OpenAI — Direct

​🟡 Google — Direct via Gemini API

​🔶 Anthropic — via OpenRouter

​Other Providers — via OpenRouter

​14. Observability & Monitoring

​15. Gaps & Recommendations

​Identified Gaps

​Recommendations

​Architectural Strengths

​Vocabulary