What you’ll learn: This page is the authoritative architecture reference for ZappWay, generated from a full codebase scan (commit 14982f37). It covers every major subsystem: the monorepo structure, database schema, queue/worker topology, ingestion loaders, Qdrant vector layer, embedding pipeline, adaptive multi-shot RAG, chat engine, auth, sync orchestrator, all 13 channel integrations, file storage, LLM model router, and observability.

🔢 Table of Contents

  1. Overall System Architecture
  2. Database Layer
  3. Queue & Worker Architecture
  4. Ingestion Engine & Loaders
  5. Vector Database Layer — Qdrant
  6. Embedding Pipeline
  7. RAG & Retrieval Pipeline
  8. Chat & Conversation Engine
  9. Multi-Tenant Isolation & Auth
  10. Sync Orchestrator
  11. External Integrations & Channels
  12. File Storage & Media Pipeline
  13. LLM Orchestration & Model Router
  14. Observability & Monitoring
  15. Gaps & Recommendations

1. Overall System Architecture

ZappWay is a pnpm monorepo structured into 3 applications and 2 package trees. The apps/zappway app hosts the Dashboard, Landing, Blog, and Docs. All core business logic lives in packages/zappway/lib (~160 files). A dedicated apps/workers-services app runs all background workers and the WhatsApp Bridge API in isolation, consuming jobs from Redis-backed BullMQ-Pro queues.

Application Map

| App | Path | Framework | Purpose |
| --- | --- | --- | --- |
| Dashboard | apps/zappway/dashboard | Next.js | Main SaaS UI |
| Blog | apps/zappway/blog | Next.js | Marketing blog |
| Docs | apps/zappway/docs | MDX-based | Product documentation |
| Landing Page | apps/zappway/lp | Next.js | Landing page |
| Workers Services | apps/workers-services | Next.js | Background workers + WA Bridge |
| ZappFlux | apps/zappflux | Next.js | Secondary product (shared DB) |

Package Map

| Package | Path | Purpose |
| --- | --- | --- |
| @zappway/prisma | packages/zappway/prisma | Prisma schema, client, browser exports |
| @zappway/lib | packages/zappway/lib | Core business logic (~160 files) |
| @zappway/integrations | packages/zappway/integrations | 13 channel adapters |
| @zappway/ui | packages/zappway/ui | Shared React components |

2. Database Layer

Source: packages/zappway/prisma/schema.prisma (1548 lines) · 33 models · 23 enums

PostgreSQL via Prisma with the fullTextSearch and fullTextIndex preview features enabled. The Organization model is the root tenant boundary: every other entity is scoped to it directly or transitively.

Full Model Inventory

| # | Model | Tenant Key | Purpose |
| --- | --- | --- | --- |
| 1 | User | | Platform user account |
| 2 | Organization | id (root) | Tenant container |
| 3 | Membership | organizationId | User–Org relationship (RBAC) |
| 4 | ApiKey | organizationId | Programmatic access |
| 5 | Usage | organizationId | Billing usage tracking |
| 6 | Agent | organizationId | AI Employee definition |
| 7 | Datastore | organizationId | Knowledge base container |
| 8 | AppDatasource | organizationId | Individual data source |
| 9 | Conversation | organizationId | Chat session |
| 10 | Message | via Conversation | Individual message |
| 11 | Tool | via Agent | Agent tool definition |
| 12 | Form | via Agent | Dynamic form |
| 13 | FormSubmission | via Form | Form response |
| 14 | Contact | organizationId | CRM contact |
| 15 | Subscription | organizationId | Billing plan |
| 16 | Product | | Stripe product |
| 17 | Price | | Stripe price |
| 18 | Account | userId | OAuth provider link |
| 19 | Session | organizationId | Auth session |
| 20 | VerificationToken | | Email verification |
| 21 | ServiceProvider | organizationId | Integration credentials |
| 22 | ActionApproval | via Conversation | Tool approval request |
| 23 | Attachment | via Message | Message attachment |
| 24 | ExternalIntegration | via Agent | Webhook/external config |
| 25 | LLMTaskOutput | organizationId | Async LLM task result |
| 26 | MailInbox | organizationId | Email inbox config |
| 27 | ConversationCrmTag | organizationId | CRM tagging |
| 28 | Lead | organizationId | Lead capture |
| 29 | XPBNPEval | | A/B test evaluation |
| 30 | AgentsOnDatastores | | M:N agent–datastore join |
| 31 | Coupon | | Billing coupon |
| 32 | UserApiKey | userId | User-level API key |
| 33 | CrmLog | organizationId | CRM activity log |

Key Enums

| Enum | Sample Values |
| --- | --- |
| DatasourceType | web_page, web_site, text, file, google_drive_file, notion, youtube_video, youtube_bulk, qa, api |
| DatasourceStatus | unsynched, pending, running, synched, error, usage_limit_reached, stalled |
| ConversationChannel | dashboard, website, whatsapp, telegram, slack, crisp, form, mail, api, zapier, zendesk, messenger, instagram |
| AgentModelName | 40+ model enum values (see Section 13) |
| SubscriptionPlan | level_0, level_1, level_2, level_3, level_4 |
| MembershipRole | OWNER, ADMIN, USER |
| GlobalRole | SUPERADMIN, CUSTOMER |
| ToolType | datastore, http, form, mark_as_resolved, request_human, lead_capture, payment |

3. Queue & Worker Architecture

Source: packages/zappway/lib/types/index.ts, apps/workers-services/workers/

Three dedicated BullMQ-Pro queues separate concerns. The load-datasource queue is the primary active queue; the other two are partially implemented.

Worker Configuration — Datasource Loader

| Config | Value | Env Override |
| --- | --- | --- |
| Queue Name | load-datasource | |
| Concurrency | 3 | DS_WORKER_CONCURRENCY |
| Lock Duration | 8 min | DS_BULL_LOCK_DURATION_MS |
| Stalled Interval | 90s | DS_BULL_STALLED_INTERVAL_MS |
| Max Stalled Count | 2 | DS_BULL_MAX_STALLED_COUNT |
| Dedupe Lock TTL | 10 min | DS_JOB_LOCK_TTL_MS |
| Remove on Complete | Last 1000 | |
| Remove on Fail | Last 5000 | |

Dedupe Mechanism: Redis SET NX with key ld:lock:{datasourceId} prevents the same datasource from running in parallel.

Graceful Shutdown: Handles SIGTERM and SIGINT by closing the worker and quitting Redis cleanly.
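The dedupe lock can be illustrated with a small sketch. It uses an in-memory stand-in for Redis `SET key value NX PX ttl` so that it runs without a Redis server; `InMemoryLockStore`, `lockKey`, and `DEFAULT_LOCK_TTL_MS` are hypothetical names, and only the key format and the 10 min TTL come from this page.

```typescript
// Hypothetical in-memory stand-in for Redis `SET key value NX PX ttl`.
// The real worker talks to Redis; all names here are illustrative.
const DEFAULT_LOCK_TTL_MS = 10 * 60 * 1000; // 10 min, per DS_JOB_LOCK_TTL_MS

const lockKey = (datasourceId: string): string => `ld:lock:${datasourceId}`;

class InMemoryLockStore {
  private entries = new Map<string, { owner: string; expiresAt: number }>();

  // Mirrors SET NX PX: succeeds only if the key is absent or has expired.
  tryAcquire(key: string, owner: string, ttlMs: number, now: number): boolean {
    const current = this.entries.get(key);
    if (current !== undefined && current.expiresAt > now) return false;
    this.entries.set(key, { owner, expiresAt: now + ttlMs });
    return true;
  }

  // Releases the lock only if the caller still owns it.
  release(key: string, owner: string): boolean {
    const current = this.entries.get(key);
    if (current === undefined || current.owner !== owner) return false;
    this.entries.delete(key);
    return true;
  }
}
```

A job that fails to acquire the lock would typically be skipped or delayed rather than run; the TTL acts as a safety net if a worker dies without releasing.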

Worker Inventory

| Worker | Queue | Status |
| --- | --- | --- |
| datasource-loader.ts | load-datasource | ✅ Active |
| check-and-sync-cron.ts | cron-based | ✅ Active |
| generate-rag.ts | generate-rag-document | ✅ Active |
| index-media-memory.ts | index-media-memory | ✅ Active |
| check-stalled.ts | internal | ✅ Active |
| trial-expiry-check.ts | cron-based | ✅ Active |
| usage-reconciler.ts | cron-based | ✅ Active |
| scheduler.ts | orchestrator | ✅ Active |

4. Ingestion Engine & Loaders

Source: packages/zappway/lib/datastores/datasources/, packages/zappway/lib/loaders/

All loaders extend DatasourceLoaderBase. The entry point taskLoadDatasource() selects the correct loader at runtime based on DatasourceType. Output is a normalized AppDocument[] array fed into the chunking engine and then into Qdrant.

Loader Mapping

| DatasourceType | Loader | Source |
| --- | --- | --- |
| web_page | WebPageLoader | URL fetch |
| web_site | WebSiteLoader | Sitemap / crawl |
| file | FileLoader | S3 upload |
| text | TextLoader | Raw text input |
| qa | QALoader | Q&A pairs |
| youtube_video | YouTubeLoader | YouTube API |
| youtube_bulk | YouTubeLoader | Multi-video |
| google_drive_file | GoogleDriveLoader | Drive API |
| google_drive_folder | GoogleDriveLoader | Drive API |
| notion | NotionLoader | Notion API |
| notion_page | NotionLoader | Notion API |
| api | APILoader | HTTP endpoint |
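The mapping above can be pictured as a lookup keyed by DatasourceType. The sketch below only resolves loader class names; the registry shape and `selectLoaderName()` are assumptions for illustration and do not reproduce how taskLoadDatasource() is actually written.

```typescript
// Loader names are taken from the mapping table; the registry itself
// and selectLoaderName() are hypothetical.
type DatasourceType =
  | "web_page" | "web_site" | "file" | "text" | "qa"
  | "youtube_video" | "youtube_bulk"
  | "google_drive_file" | "google_drive_folder"
  | "notion" | "notion_page" | "api";

const LOADER_BY_TYPE = new Map<DatasourceType, string>([
  ["web_page", "WebPageLoader"],
  ["web_site", "WebSiteLoader"],
  ["file", "FileLoader"],
  ["text", "TextLoader"],
  ["qa", "QALoader"],
  ["youtube_video", "YouTubeLoader"],
  ["youtube_bulk", "YouTubeLoader"],
  ["google_drive_file", "GoogleDriveLoader"],
  ["google_drive_folder", "GoogleDriveLoader"],
  ["notion", "NotionLoader"],
  ["notion_page", "NotionLoader"],
  ["api", "APILoader"],
]);

// Fails loudly for unknown types instead of silently skipping the job.
function selectLoaderName(type: string): string {
  const name = LOADER_BY_TYPE.get(type as DatasourceType);
  if (name === undefined) {
    throw new Error(`No loader registered for datasource type "${type}"`);
  }
  return name;
}
```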

WebSite Loader — Pipeline Detail

Source: packages/zappway/lib/loaders/web-site.ts (515 lines)

The WebSiteLoader is the most complex loader. Its pipeline runs in 6 stages:

  1. Discovery: parses sitemap XML or crawls via findDomainPages().
  2. Normalization: URL dedup, strip UTM params, lowercase hostname.
  3. Blacklist filtering: applies the black_listed_urls config.
  4. HTTP probing: HEAD → GET, with semver path repair on 404s.
  5. Child management: upserts web_page child datasources and deletes orphans.
  6. Enqueueing: emits child jobs with priority scores (home = 5, sitemap = 8, others = 10).

Concurrency is capped at 6 via mapWithConcurrency. The plan limit is applied via accountConfig[plan].limits.maxWebsiteURL (default: 25).
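The normalization and enqueueing stages lend themselves to a short sketch. The helpers below are illustrative and are not the code in web-site.ts: `normalizeUrl`, `dedupeUrls`, and `jobPriority` are hypothetical names, and treating a pathname of "/" as the home page is an assumption.

```typescript
// Illustrative helpers only; not the code in web-site.ts.

// Strips utm_* query parameters. The WHATWG URL parser already
// lowercases the hostname when the URL is parsed.
function normalizeUrl(raw: string): string {
  const url = new URL(raw);
  const utmKeys: string[] = [];
  url.searchParams.forEach((_value, key) => {
    if (key.toLowerCase().startsWith("utm_")) utmKeys.push(key);
  });
  utmKeys.forEach((key) => url.searchParams.delete(key));
  return url.toString();
}

// Deduplicates after normalization, keeping first-seen order.
function dedupeUrls(urls: string[]): string[] {
  return Array.from(new Set(urls.map(normalizeUrl)));
}

// Priority values from the text: home = 5, sitemap = 8, others = 10.
// Assumption: "home" means a pathname of "/".
function jobPriority(url: string, fromSitemap: boolean): number {
  if (new URL(url).pathname === "/") return 5;
  return fromSitemap ? 8 : 10;
}
```

In BullMQ a lower priority number is processed first, so with these values the home page is picked up before sitemap URLs, and sitemap URLs before crawled ones.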

5. Vector Database Layer — Qdrant

Source: packages/zappway/lib/datastores/qdrant.ts (731 lines)

Each Datastore maps to exactly one Qdrant collection named zw_{datastoreId}, providing strict per-tenant vector isolation. QdrantManager includes an auto-migration routine that detects dimension or distance metric mismatches and recreates the collection transparently, enabling zero-downtime embedding model upgrades.

Collection Configuration

| Parameter | Value |
| --- | --- |
| Vector Size | 3072 |
| Distance Metric | Cosine |
| Collection Name | zw_{datastoreId} |
| HNSW m | 16 |
| HNSW ef_construct | 100 |
| Quantization | Scalar Int8, 99th percentile |
| On-disk Payload | true |
| Write Consistency | majority |
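The mismatch check behind auto-migration might look like the sketch below. It covers only the decision, not the Qdrant calls; `VectorParams`, `collectionName`, and `needsRecreate` are hypothetical names, and the values in `DESIRED_PARAMS` come from the configuration table.

```typescript
// Hypothetical decision helper; the real QdrantManager is not shown here.
interface VectorParams {
  size: number;
  distance: "Cosine" | "Dot" | "Euclid" | "Manhattan";
}

const DESIRED_PARAMS: VectorParams = { size: 3072, distance: "Cosine" };

const collectionName = (datastoreId: string): string => `zw_${datastoreId}`;

// Returns true when an existing collection no longer matches the desired
// vector size or distance metric. A missing collection (null) needs to be
// created, not migrated, so it returns false.
function needsRecreate(
  existing: VectorParams | null,
  desired: VectorParams = DESIRED_PARAMS,
): boolean {
  if (existing === null) return false;
  return existing.size !== desired.size || existing.distance !== desired.distance;
}
```

Recreating a collection discards its points, so a check like this would normally be paired with a re-ingestion of the affected datasources.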

Payload Schema (per point)

| Field | Type | Purpose |
| --- | --- | --- |
| datastore_id | string | Tenant-scoped datastore |
| datasource_id | string | Source document |
| datasource_hash | string | Content hash for dedup |
| chunk_hash | string | Individual chunk hash |
| chunk_offset | integer | Position in document |
| text | string | Original text content |
| tags | string[] | User-defined tags |
| custom_id | string | External reference |
| page_number | integer | PDF page number |
| total_pages | integer | Total PDF pages |

6. Embedding Pipeline

Source: packages/zappway/lib/datastores/gemini-embeddings.ts, packages/zappway/lib/multimodal-memory/

All embeddings use Gemini Embedding 2 Preview, producing 3072-dimensional vectors that match Qdrant’s VECTOR_SIZE. The pipeline uses a different taskType per call site (RETRIEVAL_DOCUMENT during ingestion, RETRIEVAL_QUERY at search time), which is critical for retrieval quality with Gemini’s asymmetric embedding model.

Configuration

| Parameter | Value |
| --- | --- |
| Model | gemini-embedding-2-preview |
| Output Dimensions | 3072 |
| Batch Size | 100 (API limit per batchEmbedContents) |
| Auth | x-goog-api-key header, from env GOOGLE_AI_API_KEY |
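To make the batching and the asymmetric taskType concrete, here is a sketch that splits texts into batches of at most 100 requests. The request shape is modeled on the Gemini batchEmbedContents REST body and should be verified against the current API reference; the `models/` prefix, `buildEmbedBatches`, and the constant names are assumptions.

```typescript
// Sketch only: builds request payloads, performs no network calls.
type TaskType = "RETRIEVAL_DOCUMENT" | "RETRIEVAL_QUERY";

interface EmbedContentRequest {
  model: string;
  taskType: TaskType;
  outputDimensionality: number;
  content: { parts: { text: string }[] };
}

const EMBEDDING_MODEL = "models/gemini-embedding-2-preview"; // prefix assumed
const OUTPUT_DIMENSIONS = 3072;
const BATCH_SIZE = 100; // API limit per batchEmbedContents call

// Splits texts into batches of at most BATCH_SIZE requests each.
function buildEmbedBatches(texts: string[], taskType: TaskType): EmbedContentRequest[][] {
  const batches: EmbedContentRequest[][] = [];
  for (let i = 0; i < texts.length; i += BATCH_SIZE) {
    const batch = texts.slice(i, i + BATCH_SIZE).map((text) => ({
      model: EMBEDDING_MODEL,
      taskType,
      outputDimensionality: OUTPUT_DIMENSIONS,
      content: { parts: [{ text }] },
    }));
    batches.push(batch);
  }
  return batches;
}
```

Ingestion would call this with RETRIEVAL_DOCUMENT and search with RETRIEVAL_QUERY; mixing the two up is easy to miss because the vectors still have the right shape.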

Multimodal Support

embedMultimodal() accepts GeminiPart[][] where each part can be { text } or { inlineData: { mimeType, data } } (base64-encoded images, video frames, audio, PDF pages). This embeds non-text content into the same 3072-dimensional space as text, enabling true multimodal semantic search. The multimodal-memory/ module provides dedicated indexing (indexer.ts — 24.5KB), media-specific chunkers (media-chunkers.ts), collection management, search, and async queue triggers.

7. RAG & Retrieval Pipeline

Source: packages/zappway/lib/chat-v4/rag.ts (493 lines)

ZappWay implements Multi-Shot Adaptive RAG with up to 6 progressive retrieval attempts across 3 quality tiers. Each attempt relaxes similarity thresholds to maximize recall while evaluateRagQuality() enables early exit when results are strong enough. A circuit breaker prevents cascading failures, and ragMemo (in-memory cache) deduplicates identical attempts within a session.
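A circuit breaker of the kind mentioned above can be sketched in a few lines. The Vocabulary section gives the parameters (open after 3 consecutive failures, stay open for 60s); the class and method names below are hypothetical and the real implementation in rag.ts may differ.

```typescript
// Minimal circuit breaker sketch; names are illustrative.
class CircuitBreaker {
  private failures = 0;
  private openedAt: number | null = null;
  private readonly maxFailures: number;
  private readonly cooldownMs: number;

  constructor(maxFailures: number = 3, cooldownMs: number = 60000) {
    this.maxFailures = maxFailures;
    this.cooldownMs = cooldownMs;
  }

  // False while the breaker is open; resets once the cooldown has elapsed.
  canAttempt(now: number): boolean {
    if (this.openedAt === null) return true;
    if (now - this.openedAt >= this.cooldownMs) {
      this.openedAt = null;
      this.failures = 0;
      return true;
    }
    return false;
  }

  // Any success clears the consecutive-failure count.
  recordSuccess(): void {
    this.failures = 0;
  }

  // Opens the breaker once maxFailures consecutive failures are recorded.
  recordFailure(now: number): void {
    this.failures += 1;
    if (this.failures >= this.maxFailures) this.openedAt = now;
  }
}
```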

Adaptive Thresholds

| Attempt | Label | Deep Mode | Normal | TopK |
| --- | --- | --- | --- | --- |
| 1 | strict | 0.55 | 0.48 | 30 |
| 2 | balanced-high | 0.48 | 0.42 | 45 |
| 3 | balanced | 0.42 | 0.38 | 60 |
| 4 | balanced-low | 0.37 | 0.32 | 75 |
| 5 | relaxed | 0.33 | 0.28 | 90 |
| 6 | permissive | 0.30 | 0.25 | 100 |

Deep Mode is triggered by shouldFavorDeepRag(query) and applies higher initial thresholds. minUcount is 1 for single-datastore and 2 for multi-datastore queries. Timeouts scale from 8s (Tier 1) to 12s (Tier 3), capped by remaining chat budget. If remainingMs() < 120s, max attempts reduce to 4.
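The threshold schedule can be expressed as data plus a small planner. The numbers are copied from the table; `ATTEMPTS` and `planAttempts()` are hypothetical names, and early exit via evaluateRagQuality() is left out of the sketch.

```typescript
// Threshold schedule from the table above; planAttempts() is illustrative.
interface Attempt {
  label: string;
  deep: number;
  normal: number;
  topK: number;
}

const ATTEMPTS: Attempt[] = [
  { label: "strict", deep: 0.55, normal: 0.48, topK: 30 },
  { label: "balanced-high", deep: 0.48, normal: 0.42, topK: 45 },
  { label: "balanced", deep: 0.42, normal: 0.38, topK: 60 },
  { label: "balanced-low", deep: 0.37, normal: 0.32, topK: 75 },
  { label: "relaxed", deep: 0.33, normal: 0.28, topK: 90 },
  { label: "permissive", deep: 0.3, normal: 0.25, topK: 100 },
];

interface PlannedAttempt {
  label: string;
  threshold: number;
  topK: number;
}

// Picks the threshold column for the mode and caps the plan at 4 attempts
// when less than 120s of chat budget remains.
function planAttempts(deepMode: boolean, remainingMs: number): PlannedAttempt[] {
  const maxAttempts = remainingMs < 120000 ? 4 : ATTEMPTS.length;
  return ATTEMPTS.slice(0, maxAttempts).map((a) => ({
    label: a.label,
    threshold: deepMode ? a.deep : a.normal,
    topK: a.topK,
  }));
}
```

A caller would walk the returned plan in order and stop at the first attempt whose results pass the quality check.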

8. Chat & Conversation Engine

Source: packages/zappway/lib/chat-v4/chat.ts (1147 lines), packages/zappway/lib/agent/tools/

The chat function orchestrates system prompt assembly, message history truncation, multi-shot RAG, runtime tool building, LLM execution with streaming, and the full model fallback chain, all within a shared time budget enforced at every checkpoint.

SSE Event Types

| Event | Purpose |
| --- | --- |
| answer | Streamed text delta |
| tool_call | Tool invocation notification |
| endpoint_response | HTTP tool response data |
| step | Processing step indicator |
| metadata | Model usage, sources, context window |
| error | Error notification |
| done | Stream completion signal |
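On the wire, each of these events is a standard Server-Sent Events frame: an `event:` line, a `data:` line, and a blank line. The sketch below formats such frames; the event names come from the table, while `formatSse()` and the JSON payload shapes are assumptions.

```typescript
// Event names from the table above; payload shapes are not documented here
// and are left as `unknown`.
type SseEventName =
  | "answer" | "tool_call" | "endpoint_response"
  | "step" | "metadata" | "error" | "done";

// Serializes one SSE frame. JSON.stringify escapes newlines, so the
// payload always fits on a single data line.
function formatSse(event: SseEventName, data: unknown): string {
  const payload = JSON.stringify(data === undefined ? null : data);
  return `event: ${event}\ndata: ${payload}\n\n`;
}
```

For example, `formatSse("answer", { delta: "Hi" })` produces a frame that a browser EventSource listener registered for `answer` would receive with `{"delta":"Hi"}` as its data.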

Runtime Tools

| Tool File | Tool Type | Purpose |
| --- | --- | --- |
| datastore.ts | datastore | Dynamic RAG within tool loop |
| form.ts | form | Present and collect dynamic forms |
| http.ts | http | Call external APIs |
| request-human.ts | request_human | Transfer to human agent |
| lead-capture.ts | lead_capture | Capture visitor info |
| payment.ts | payment | Initiate payment flow |
| mark-as-resolved.ts | mark_as_resolved | Close support ticket |
| ui.ts | UI rendering | Custom UI components |

9. Multi-Tenant Isolation & Auth

Source: packages/zappway/lib/auth.ts (1661 lines)

NextAuth v5 with a custom Prisma adapter. Four sign-in methods are supported. On first sign-in, the platform auto-provisions the full tenant stack atomically.

Tenant Isolation by Layer

| Layer | Mechanism |
| --- | --- |
| Database | organizationId on every tenant model |
| API | Session middleware injects organization from session |
| Vector DB | One Qdrant collection per datastore (zw_{datastoreId}) |
| File Storage | S3 prefix organizations/{orgId}/ |
| Queues | organizationId carried in every job payload |

RBAC Model

| Role | Scope | Access |
| --- | --- | --- |
| SUPERADMIN | Global | Full system |
| CUSTOMER | Global | Standard user |
| OWNER | Organization | Full org management |
| ADMIN | Organization | Manage agents and datastores |
| USER | Organization | Read-only |

Session Configuration

| Setting | Production | Development |
| --- | --- | --- |
| Strategy | database | database |
| Cookie Name | __Secure-next-auth.session-token | next-auth.session-token |
| Cookie Domain | .{rootDomain} | undefined |
| Secure | true | false |

Locale Support: 37 locales, including RTL (Arabic, Hebrew, Persian, Urdu). Resolution order: URL path → NEXT_LOCALE cookie → i18next cookie → Accept-Language header → default en.
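The locale resolution order can be sketched as a chain of fallbacks. This is a simplified illustration: `resolveLocale()` and `LocaleInputs` are hypothetical, the Accept-Language header is read in listed order without weighing q-values, and the first path segment is assumed to carry the locale.

```typescript
// Simplified sketch of the documented resolution order.
interface LocaleInputs {
  pathname: string;
  cookies: Record<string, string | undefined>;
  acceptLanguage?: string;
}

const DEFAULT_LOCALE = "en";

function resolveLocale(input: LocaleInputs, supported: string[]): string {
  // Returns the candidate only if it is a supported locale.
  const pick = (candidate: string | undefined): string | undefined =>
    candidate !== undefined && supported.indexOf(candidate) !== -1 ? candidate : undefined;

  // Tries each header tag in listed order, then its primary subtag.
  const fromHeader = (input.acceptLanguage ?? "")
    .split(",")
    .map((part) => (part.split(";")[0] ?? "").trim())
    .map((tag) => pick(tag) ?? pick(tag.split("-")[0]))
    .find((locale) => locale !== undefined);

  return (
    pick(input.pathname.split("/")[1]) ??
    pick(input.cookies["NEXT_LOCALE"]) ??
    pick(input.cookies["i18next"]) ??
    fromHeader ??
    DEFAULT_LOCALE
  );
}
```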

10. Sync Orchestrator

Source: apps/workers-services/workers/check-and-sync-cron.ts

A cron-driven worker scans all datasources with status = synched, filters by lastSyncAt + syncInterval, and fans out individual load-datasource jobs. A check-stalled worker handles recovery of datasources stuck in running status beyond a configurable threshold.
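The due-for-sync filter reduces to one comparison per datasource. In the sketch below, `SyncCandidate`, `syncIntervalMs`, and `selectDueDatasources()` are hypothetical names, and treating a missing lastSyncAt as due is an assumption.

```typescript
// Illustrative filter; field names are assumed, not taken from the schema.
interface SyncCandidate {
  id: string;
  status: string;
  lastSyncAt: Date | null;
  syncIntervalMs: number;
}

// Returns the ids of synched datasources whose interval has elapsed.
function selectDueDatasources(candidates: SyncCandidate[], now: Date): string[] {
  return candidates
    .filter((d) => d.status === "synched")
    .filter(
      (d) =>
        d.lastSyncAt === null || // assumption: never synced counts as due
        d.lastSyncAt.getTime() + d.syncIntervalMs <= now.getTime(),
    )
    .map((d) => d.id);
}
```

Each returned id would become one load-datasource job, where the dedupe lock keeps a slow previous run from overlapping with the new one.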

11. External Integrations & Channels

Source: packages/zappway/integrations/ (13 channel adapters)

All channels normalize inbound messages to the same internal Conversation + Message model and route through the unified chat-v4 engine.

Channel Capabilities

| Channel | Direction | Streaming | Form Support |
| --- | --- | --- | --- |
| dashboard | Bidirectional | SSE | Yes |
| website | Bidirectional | SSE | Yes |
| whatsapp | Bidirectional | No (webhook) | Limited |
| telegram | Bidirectional | No (webhook) | Limited |
| slack | Bidirectional | No (webhook) | No |
| crisp | Bidirectional | No (webhook) | No |
| mail | Async | No | No |
| api | Request/Response | Optional | No |
| zapier | Webhook | No | No |
| form | Form submit | No | Yes |
| zendesk | Bidirectional | No (webhook) | No |
| messenger | Bidirectional | No (webhook) | No |
| instagram | Bidirectional | No (webhook) | No |

ServiceProvider Auth

| Type | Auth Mechanism |
| --- | --- |
| website | API Key |
| whatsapp | Cloud API token |
| telegram | Bot token |
| slack | OAuth |
| crisp | API key + website ID |
| zendesk | API token |
| messenger / instagram | Page access token |
| notion | Integration token |
| google_drive | OAuth refresh token |
| zapier | Webhook URL |

12. File Storage & Media Pipeline

Source: packages/zappway/lib/aws.ts

Supports S3-compatible storage (AWS S3, Cloudflare R2, MinIO) via AWS SDK v3. Files are uploaded via presigned URLs, stored under organizations/{orgId}/, and pulled by FileLoader for extraction and ingestion.

S3 Environment Variables

| Variable | Purpose |
| --- | --- |
| APP_AWS_REGION | AWS region |
| APP_AWS_ACCESS_KEY | Access key ID |
| APP_AWS_SECRET_KEY | Secret access key |
| APP_AWS_S3_ENDPOINT | Custom endpoint (R2 / MinIO) |
| APP_AWS_S3_FORCE_PATH_STYLE | Path-style for R2 / MinIO |
| APP_AWS_S3_BUCKET | Target bucket |
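As a sketch of how these variables could be turned into client settings, the helper below builds a plain object whose field names mirror the AWS SDK v3 S3 client options (region, endpoint, forcePathStyle, credentials). It does not import the SDK; `s3SettingsFromEnv()`, `objectKey()`, and parsing the path-style flag as the string "true" are assumptions.

```typescript
// Sketch only: builds settings from an env-like object, no SDK import.
type Env = Record<string, string | undefined>;

interface S3Settings {
  region: string;
  bucket: string;
  endpoint: string | undefined;
  forcePathStyle: boolean;
  credentials: { accessKeyId: string; secretAccessKey: string };
}

// Reads a required variable or fails with a clear message.
function requireEnv(env: Env, name: string): string {
  const value = env[name];
  if (value === undefined || value === "") {
    throw new Error(`Missing required environment variable ${name}`);
  }
  return value;
}

function s3SettingsFromEnv(env: Env): S3Settings {
  return {
    region: requireEnv(env, "APP_AWS_REGION"),
    bucket: requireEnv(env, "APP_AWS_S3_BUCKET"),
    endpoint: env["APP_AWS_S3_ENDPOINT"], // only set for R2 / MinIO
    forcePathStyle: env["APP_AWS_S3_FORCE_PATH_STYLE"] === "true", // assumed format
    credentials: {
      accessKeyId: requireEnv(env, "APP_AWS_ACCESS_KEY"),
      secretAccessKey: requireEnv(env, "APP_AWS_SECRET_KEY"),
    },
  };
}

// Builds a tenant-scoped object key under organizations/{orgId}/.
const objectKey = (orgId: string, fileName: string): string =>
  `organizations/${orgId}/${fileName}`;
```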

13. LLM Orchestration & Model Router

Source: packages/zappway/lib/config.ts (943 lines), packages/zappway/lib/chat-model/model.ts (745 lines)

40+ models across 9 providers via a unified OpenAI-SDK-compatible interface. Provider routing is automatic based on each model’s baseUrl in ModelConfig. OpenAI, Google Gemini, and OpenRouter are all accessed through the same OpenAI SDK client with different base URLs and API keys.
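Routing by base URL might be sketched as follows. The hostnames are the public OpenAI, Gemini (OpenAI-compatible), and OpenRouter API hosts; the `ModelConfig` shape, `resolveClientOptions()`, and the OPENAI_API_KEY and OPENROUTER_API_KEY variable names are assumptions (only GOOGLE_AI_API_KEY appears on this page). The returned `baseURL` and `apiKey` match the option names the OpenAI SDK client constructor accepts.

```typescript
// Hypothetical routing helper; not the code in model.ts.
interface ModelConfig {
  name: string;
  baseUrl: string;
}

type Provider = "openai" | "google" | "openrouter";

const PROVIDER_BY_HOST = new Map<string, Provider>([
  ["api.openai.com", "openai"],
  ["generativelanguage.googleapis.com", "google"],
  ["openrouter.ai", "openrouter"],
]);

// Env var names are assumed except GOOGLE_AI_API_KEY.
const API_KEY_ENV: Record<Provider, string> = {
  openai: "OPENAI_API_KEY",
  google: "GOOGLE_AI_API_KEY",
  openrouter: "OPENROUTER_API_KEY",
};

function resolveClientOptions(
  config: ModelConfig,
  env: Record<string, string | undefined>,
): { provider: Provider; baseURL: string; apiKey: string } {
  const host = new URL(config.baseUrl).hostname;
  const provider = PROVIDER_BY_HOST.get(host);
  if (provider === undefined) throw new Error(`Unknown provider host: ${host}`);
  const apiKey = env[API_KEY_ENV[provider]];
  if (apiKey === undefined) throw new Error(`Missing ${API_KEY_ENV[provider]}`);
  return { provider, baseURL: config.baseUrl, apiKey };
}
```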

Model Inventory

🧠 OpenAI — Direct

| Model | Max Tokens | Cost |
| --- | --- | --- |
| GPT-4o / GPT-4o Mini | 128K | 15 / 4 |
| GPT-5 / Mini / Nano | 400K | 15 / 5 / 4 |
| GPT-5.1 / 5.1 Chat | 400K | 15 / 12 |
| GPT-5.2 / 5.2 Chat / 5.2 Chat Latest | 400K | 20 / 15 / 15 |
| GPT-5.3 Chat Latest / 5.3 Codex | 400K / 128K | 15 |
| GPT-5.4 / 5.4 Mini / 5.4 Nano | 400K | 15 / 6 / 5 |

🟡 Google — Direct via Gemini API

| Model | Max Tokens | Cost |
| --- | --- | --- |
| Gemini Flash 2.0 / 2.5 | 1M | 5 / 10 |
| Gemini Pro 2.5 | 1M | 12 |
| Gemini 3 Pro / 3 Flash | 1M | 15 / 10 |
| Gemini 3 Pro Image | 64K | 15 |
| Gemini 3.1 Flash Lite / Flash Image / Pro | 1M / 64K / 1M | 5 / 8 / 15 |

🔶 Anthropic — via OpenRouter

| Model | Max Tokens | Cost |
| --- | --- | --- |
| Claude 3.5 Haiku / 3.5 Sonnet | 200K | 12 / 15 |
| Claude 4 Sonnet | 200K | 15 |
| Claude 4.5 Haiku / 4.5 Sonnet | 200K | 12 / 15 |
| Claude Opus 4.5 / Opus 4.6 / Sonnet 4.6 | 200K | 20 / 20 / 15 |

Other Providers — via OpenRouter

| Model | Provider | Cost |
| --- | --- | --- |
| DeepSeek V3.2 / R1 | DeepSeek | 5 / 8 |
| Llama-4 Maverick / Scout | Meta | 10 / 5 |
| Mistral Large 2512 / 8x22B / 8x7B / Small Creative | Mistral | 5 / 5 / 2 / 2 |
| Sonar Pro | Perplexity | 15 |
| Grok 4 / 4.1 Fast | xAI | 15 / 5 |
| Nova Premier / Nova 2 Lite | Amazon | 12 / 6 |
| LFM2 8B / LFM 2 24B | Liquid | 1 / 1 |
| MiniMax M2.7 | MiniMax | 5 |
| MiMo-V2-Omni / MiMo-V2-Pro | Xiaomi | 6 / 10 |

Free Tier Models: gpt_4o_mini, gpt_5_mini, gpt_5_nano, gpt_5_4_mini, gpt_5_4_nano, gemini_flash_2_0, lfm2_8b_a1b

Important: GPT-5 family models do not accept a manual temperature setting; the temperature is selected automatically.
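One way to honor the temperature restriction is to drop the parameter before building the request. The sketch below assumes model identifiers follow the snake_case pattern of the free-tier list (for example gpt_5_mini) and that the GPT-5 family can be recognized by the `gpt_5` prefix; `samplingParams()` and `isFreeTierModel()` are hypothetical helpers.

```typescript
// Free-tier identifiers as listed above.
const FREE_TIER_MODELS: string[] = [
  "gpt_4o_mini", "gpt_5_mini", "gpt_5_nano", "gpt_5_4_mini",
  "gpt_5_4_nano", "gemini_flash_2_0", "lfm2_8b_a1b",
];

const isFreeTierModel = (model: string): boolean =>
  FREE_TIER_MODELS.indexOf(model) !== -1;

// Assumption: every GPT-5 family identifier starts with "gpt_5".
const isGpt5Family = (model: string): boolean => model.startsWith("gpt_5");

// Omits temperature for GPT-5 family models, passes it through otherwise.
function samplingParams(model: string, temperature?: number): { temperature?: number } {
  if (isGpt5Family(model) || temperature === undefined) return {};
  return { temperature };
}
```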

14. Observability & Monitoring

Source: packages/zappway/lib/logger.ts, apps/workers-services/sentry.*.config.ts

| Component | Technology | Status |
| --- | --- | --- |
| Core Logger | Pino | ✅ Active |
| Pretty Print | pino-pretty | ✅ Active |
| Axiom Cloud | @axiomhq/pino | ❌ Commented out |
| Chat Logger | Pino child logChat | ✅ Active |
| Error Tracking | Sentry (client + server + edge) | ✅ Active |

Workers publish a WorkerHealth payload to Redis (health:worker:{name}, 30s TTL) every 10s. Fields: status, startedAt, lastActivityAt, jobsProcessed, jobsFailed, queueLength, system.memoryMB, system.uptimeMs. Exposed via /api/workers/health. Workers also stream real-time logs to Redis Pub/Sub channel logs:datasource for live dashboard monitoring.
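The health payload and its timing can be written down as a type plus two helpers. Field names come from the list above; the field types (epoch milliseconds for timestamps, a free-form status string) and the helper names are assumptions.

```typescript
// Field names from the text; types are assumed.
interface WorkerHealth {
  status: string;
  startedAt: number; // epoch ms (assumed)
  lastActivityAt: number; // epoch ms (assumed)
  jobsProcessed: number;
  jobsFailed: number;
  queueLength: number;
  system: { memoryMB: number; uptimeMs: number };
}

const HEALTH_TTL_SECONDS = 30;
const HEALTH_PUBLISH_INTERVAL_MS = 10000;

const healthKey = (workerName: string): string => `health:worker:${workerName}`;

// With a 30s TTL and a 10s publish interval, the key outlives two missed
// publishes; after that Redis expires it and the worker reads as down.
function isHealthKeyAlive(lastPublishedAt: number, now: number): boolean {
  return now - lastPublishedAt < HEALTH_TTL_SECONDS * 1000;
}
```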

15. Gaps & Recommendations

Identified Gaps

| # | Area | Gap | Severity |
| --- | --- | --- | --- |
| 1 | Observability | Axiom integration commented out; no cloud logging in production | 🟡 Medium |
| 2 | Rate Limiting | No visible rate limiting on chat API endpoints | 🔴 High |
| 3 | Queue DLQ | No Dead Letter Queue for permanently failed jobs | 🟡 Medium |
| 4 | Embedding Fallback | Single embedding provider (Gemini); no fallback if the API is down | 🟡 Medium |
| 5 | Vector DB Backup | No documented Qdrant backup/snapshot strategy | 🟡 Medium |
| 6 | Test Coverage | No test files found in scanned directories | 🔴 High |

Recommendations

  1. Enable cloud logging — Uncomment Axiom or use Datadog / Grafana Loki for production log retention.
  2. Add rate limiting — Per-organization token-bucket on chat endpoints.
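  3. Implement DLQ: BullMQ does not appear to expose a built-in dead-letter option, so move jobs that exhaust their retry attempts to a dedicated dead-letter queue (for example from the worker's failed event handler) for all 3 queues.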
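  4. Embedding fallback: text-embedding-3-large from OpenAI (also 3072 dimensions) as a secondary provider. Vectors from different models are not comparable even at the same dimensionality, so a fallback needs its own collection or a full re-embed.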
  5. Qdrant snapshots — Cron-based snapshot to S3 for disaster recovery.
  6. Test coverage — Integration tests for the RAG pipeline and ingestion workers as a starting priority.
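For recommendation 2, a per-organization token bucket could look like the sketch below. It keeps state in memory for illustration; a production version would need shared storage such as Redis so that all instances see the same buckets. `TokenBucket`, `OrgRateLimiter`, and the capacity and refill values are all hypothetical.

```typescript
// In-memory token bucket sketch; not production-ready across instances.
class TokenBucket {
  private tokens: number;
  private lastRefill: number;
  private readonly capacity: number;
  private readonly refillPerSecond: number;

  constructor(capacity: number, refillPerSecond: number, now: number) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.tokens = capacity;
    this.lastRefill = now;
  }

  // Refills based on elapsed time, then tries to spend `cost` tokens.
  tryConsume(now: number, cost: number = 1): boolean {
    const elapsedSeconds = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSeconds * this.refillPerSecond);
    this.lastRefill = now;
    if (this.tokens < cost) return false;
    this.tokens -= cost;
    return true;
  }
}

class OrgRateLimiter {
  private buckets = new Map<string, TokenBucket>();
  private readonly capacity: number;
  private readonly refillPerSecond: number;

  constructor(capacity: number, refillPerSecond: number) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
  }

  // One bucket per organization, created on first use.
  allow(organizationId: string, now: number): boolean {
    let bucket = this.buckets.get(organizationId);
    if (bucket === undefined) {
      bucket = new TokenBucket(this.capacity, this.refillPerSecond, now);
      this.buckets.set(organizationId, bucket);
    }
    return bucket.tryConsume(now);
  }
}
```

Keying the limiter by organizationId rather than by user or IP keeps it aligned with the tenant boundary used everywhere else in the system.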

Architectural Strengths

| Strength | Detail |
| --- | --- |
| Multi-shot RAG | 6-attempt adaptive retrieval with quality evaluation and circuit breaker |
| Auto-migration | Qdrant collection recreated transparently on embedding model change |
| Graceful Shutdown | All workers handle SIGTERM / SIGINT cleanly |
| Dedup Locks | Redis SET NX prevents parallel processing of the same datasource |
| Budget Management | Time-budget enforced at every checkpoint in the chat pipeline |
| 40+ Models | Unified OpenAI-SDK interface across 9 providers |
| True Multimodal | End-to-end multimodal embed, chunk, and search for images, audio, and video |
| Full Tenant Isolation | Separate Qdrant collection per datastore and separate S3 prefix per organization |

Vocabulary

| Term | Description |
| --- | --- |
| Datastore | Container of Datasources; maps to a Qdrant collection (zw_{datastoreId}) |
| Datasource | Individual data source (web_page, file, youtube_video, etc.) |
| AppDocument | Normalized document format output by all Loaders |
| Multi-Shot RAG | Adaptive retrieval with up to 6 progressive attempts across 3 quality tiers |
| Circuit Breaker | Stops RAG after 3 consecutive failures for 60s to prevent cascading errors |
| Deep Mode | High-threshold RAG mode triggered by shouldFavorDeepRag(query) |
| BullMQ-Pro | Redis-backed job queue with advanced scheduling and flow features |
| Qdrant | Vector database used for multimodal semantic search |
| Gemini Embedding 2 | Multimodal embedding model producing 3072-dimensional vectors |
| taskType | Asymmetric embedding mode: RETRIEVAL_DOCUMENT for ingestion, RETRIEVAL_QUERY for search |
| AgentsOnDatastores | M:N junction table linking Agents to Datastores |
| HNSW | Hierarchical Navigable Small World, the approximate nearest-neighbor index used in Qdrant |

· Last updated: March 2026