> ## Documentation Index
> Fetch the complete documentation index at: https://docs.zappway.ai/llms.txt
> Use this file to discover all available pages before exploring further.

# Platform Architecture

> Complete architecture reference for the ZappWay platform — monorepo structure, ingestion pipeline, multimodal RAG, vector storage, LLM routing, chat engine, sync orchestration, and all integrations.

> **What you'll learn:**
> This page is the authoritative architecture reference for ZappWay, generated from a full codebase scan (commit `14982f37`). It covers every major subsystem: the monorepo structure, database schema, queue/worker topology, ingestion loaders, Qdrant vector layer, embedding pipeline, adaptive multi-shot RAG, chat engine, auth, sync orchestrator, all 13 channel integrations, file storage, LLM model router, and observability.

***

## 🔢 Table of Contents

1. [Overall System Architecture](#1-overall-system-architecture)
2. [Database Layer](#2-database-layer)
3. [Queue & Worker Architecture](#3-queue--worker-architecture)
4. [Ingestion Engine & Loaders](#4-ingestion-engine--loaders)
5. [Vector Database Layer — Qdrant](#5-vector-database-layer--qdrant)
6. [Embedding Pipeline](#6-embedding-pipeline)
7. [RAG & Retrieval Pipeline](#7-rag--retrieval-pipeline)
8. [Chat & Conversation Engine](#8-chat--conversation-engine)
9. [Multi-Tenant Isolation & Auth](#9-multi-tenant-isolation--auth)
10. [Sync Orchestrator](#10-sync-orchestrator)
11. [External Integrations & Channels](#11-external-integrations--channels)
12. [File Storage & Media Pipeline](#12-file-storage--media-pipeline)
13. [LLM Orchestration & Model Router](#13-llm-orchestration--model-router)
14. [Observability & Monitoring](#14-observability--monitoring)
15. [Gaps & Recommendations](#15-gaps--recommendations)

***

## 1. Overall System Architecture

ZappWay is a **pnpm monorepo** structured into 3 applications and 2 package trees. The `apps/zappway` app hosts the Dashboard, Landing, Blog, and Docs. All core business logic lives in `packages/zappway/lib` (\~160 files). A dedicated `apps/workers-services` app runs all background workers and the WhatsApp Bridge API in isolation, consuming jobs from Redis-backed BullMQ-Pro queues.

```mermaid theme={null}
graph TB
    subgraph "Apps Layer"
        ZW["apps/zappway — Dashboard + Blog + Docs + LP"]
        WS["apps/workers-services — BullMQ Workers + WA Bridge"]
        ZF["apps/zappflux — Secondary SaaS Product"]
    end

    subgraph "Packages Layer"
        PKG_ZW["packages/zappway — Core Business Logic"]
        PKG_ZF["packages/zappflux — ZappFlux Logic"]
    end

    subgraph "Infrastructure"
        PG["PostgreSQL — Prisma ORM"]
        QD["Qdrant — Vector DB"]
        RD["Redis — BullMQ-Pro Queues"]
        S3["S3 / R2 — File Storage"]
    end

    subgraph "External Services"
        LLM["LLM Providers — OpenAI + Google + OpenRouter"]
        SENTRY["Sentry — Error Tracking"]
        SMTP["SMTP — Email Verification"]
    end

    ZW --> PKG_ZW
    WS --> PKG_ZW
    ZF --> PKG_ZF
    PKG_ZW --> PG
    PKG_ZW --> QD
    PKG_ZW --> RD
    PKG_ZW --> S3
    PKG_ZW --> LLM
    WS --> SENTRY
    ZW --> SENTRY
    PKG_ZW --> SMTP
```

### Application Map

| App              | Path                     | Framework | Purpose                        |
| ---------------- | ------------------------ | --------- | ------------------------------ |
| Dashboard        | `apps/zappway/dashboard` | Next.js   | Main SaaS UI                   |
| Blog             | `apps/zappway/blog`      | Next.js   | Marketing blog                 |
| Docs             | `apps/zappway/docs`      | MDX-based | Product documentation          |
| Landing Page     | `apps/zappway/lp`        | Next.js   | Landing page                   |
| Workers Services | `apps/workers-services`  | Next.js   | Background workers + WA Bridge |
| ZappFlux         | `apps/zappflux`          | Next.js   | Secondary product (shared DB)  |

### Package Map

| Package                 | Path                            | Purpose                                |
| ----------------------- | ------------------------------- | -------------------------------------- |
| `@zappway/prisma`       | `packages/zappway/prisma`       | Prisma schema, client, browser exports |
| `@zappway/lib`          | `packages/zappway/lib`          | Core business logic (\~160 files)      |
| `@zappway/integrations` | `packages/zappway/integrations` | 13 channel adapters                    |
| `@zappway/ui`           | `packages/zappway/ui`           | Shared React components                |

***

## 2. Database Layer

**Source**: `packages/zappway/prisma/schema.prisma` (1548 lines) · 33 models · 23 enums

PostgreSQL via Prisma with `fullTextSearch` and `fullTextIndex` preview features enabled. The `Organization` model is the root tenant boundary — every other entity is scoped to it directly or transitively.

```mermaid theme={null}
erDiagram
    Organization ||--o{ Membership : "has members"
    Organization ||--o{ Agent : "owns agents"
    Organization ||--o{ Datastore : "owns datastores"
    Organization ||--o{ Subscription : "billing"
    Organization ||--o{ ApiKey : "API access"
    Organization ||--o{ Usage : "tracks usage"
    User ||--o{ Membership : "belongs to orgs"
    User ||--o{ Subscription : "has subscriptions"
    User ||--o{ Account : "OAuth accounts"
    Agent ||--o{ Conversation : "has conversations"
    Agent ||--o{ Tool : "has tools"
    Agent ||--o{ Form : "has forms"
    Agent ||--o{ Contact : "manages contacts"
    Datastore ||--o{ AppDatasource : "contains sources"
    Datastore }o--o{ Agent : "via AgentsOnDatastores"
    Conversation ||--o{ Message : "has messages"
    Conversation ||--o{ ActionApproval : "pending approvals"
    Form ||--o{ FormSubmission : "collects submissions"
```

### Full Model Inventory

| #  | Model                 | Tenant Key       | Purpose                      |
| -- | --------------------- | ---------------- | ---------------------------- |
| 1  | `User`                | —                | Platform user account        |
| 2  | `Organization`        | `id` (root)      | Tenant container             |
| 3  | `Membership`          | `organizationId` | User–Org relationship (RBAC) |
| 4  | `ApiKey`              | `organizationId` | Programmatic access          |
| 5  | `Usage`               | `organizationId` | Billing usage tracking       |
| 6  | `Agent`               | `organizationId` | AI Employee definition       |
| 7  | `Datastore`           | `organizationId` | Knowledge base container     |
| 8  | `AppDatasource`       | `organizationId` | Individual data source       |
| 9  | `Conversation`        | `organizationId` | Chat session                 |
| 10 | `Message`             | via Conversation | Individual message           |
| 11 | `Tool`                | via Agent        | Agent tool definition        |
| 12 | `Form`                | via Agent        | Dynamic form                 |
| 13 | `FormSubmission`      | via Form         | Form response                |
| 14 | `Contact`             | `organizationId` | CRM contact                  |
| 15 | `Subscription`        | `organizationId` | Billing plan                 |
| 16 | `Product`             | —                | Stripe product               |
| 17 | `Price`               | —                | Stripe price                 |
| 18 | `Account`             | `userId`         | OAuth provider link          |
| 19 | `Session`             | `organizationId` | Auth session                 |
| 20 | `VerificationToken`   | —                | Email verification           |
| 21 | `ServiceProvider`     | `organizationId` | Integration credentials      |
| 22 | `ActionApproval`      | via Conversation | Tool approval request        |
| 23 | `Attachment`          | via Message      | Message attachment           |
| 24 | `ExternalIntegration` | via Agent        | Webhook/external config      |
| 25 | `LLMTaskOutput`       | `organizationId` | Async LLM task result        |
| 26 | `MailInbox`           | `organizationId` | Email inbox config           |
| 27 | `ConversationCrmTag`  | `organizationId` | CRM tagging                  |
| 28 | `Lead`                | `organizationId` | Lead capture                 |
| 29 | `XPBNPEval`           | —                | A/B test evaluation          |
| 30 | `AgentsOnDatastores`  | —                | M:N agent–datastore join     |
| 31 | `Coupon`              | —                | Billing coupon               |
| 32 | `UserApiKey`          | `userId`         | User-level API key           |
| 33 | `CrmLog`              | `organizationId` | CRM activity log             |

### Key Enums

| Enum                  | Sample Values                                                                                                                          |
| --------------------- | -------------------------------------------------------------------------------------------------------------------------------------- |
| `DatasourceType`      | `web_page`, `web_site`, `text`, `file`, `google_drive_file`, `notion`, `youtube_video`, `youtube_bulk`, `qa`, `api`                    |
| `DatasourceStatus`    | `unsynched`, `pending`, `running`, `synched`, `error`, `usage_limit_reached`, `stalled`                                                |
| `ConversationChannel` | `dashboard`, `website`, `whatsapp`, `telegram`, `slack`, `crisp`, `form`, `mail`, `api`, `zapier`, `zendesk`, `messenger`, `instagram` |
| `AgentModelName`      | 50+ model enum values — see [Section 13](#13-llm-orchestration--model-router)                                                          |
| `SubscriptionPlan`    | `level_0`, `level_1`, `level_2`, `level_3`, `level_4`                                                                                  |
| `MembershipRole`      | `OWNER`, `ADMIN`, `USER`                                                                                                               |
| `GlobalRole`          | `SUPERADMIN`, `CUSTOMER`                                                                                                               |
| `ToolType`            | `datastore`, `http`, `form`, `mark_as_resolved`, `request_human`, `lead_capture`, `payment`                                            |

***

## 3. Queue & Worker Architecture

**Source**: `packages/zappway/lib/types/index.ts`, `apps/workers-services/workers/`

Three dedicated BullMQ-Pro queues separate concerns. The `load-datasource` queue is the primary active queue; the other two are partially implemented.

```mermaid theme={null}
graph LR
    subgraph "Producers"
        API["API Routes — Dashboard Actions"]
        SYNC["Sync Orchestrator — Cron Jobs"]
    end

    subgraph "Redis — BullMQ-Pro"
        Q1["load-datasource — Concurrency: 3"]
        Q2["generate-rag-document"]
        Q3["index-media-memory"]
    end

    subgraph "Workers — apps/workers-services"
        W1["datasource-loader"]
        W2["rag-document"]
        W3["media-memory"]
    end

    API --> Q1
    SYNC --> Q1
    Q1 --> W1
    Q2 --> W2
    Q3 --> W3
    W1 -->|"success"| DB1["PostgreSQL — status: synched"]
    W1 -->|"failure"| DB2["PostgreSQL — status: error"]
```

### Worker Configuration — Datasource Loader

| Config             | Value             | Env Override                  |
| ------------------ | ----------------- | ----------------------------- |
| Queue Name         | `load-datasource` | —                             |
| Concurrency        | 3                 | `DS_WORKER_CONCURRENCY`       |
| Lock Duration      | 8 min             | `DS_BULL_LOCK_DURATION_MS`    |
| Stalled Interval   | 90s               | `DS_BULL_STALLED_INTERVAL_MS` |
| Max Stalled Count  | 2                 | `DS_BULL_MAX_STALLED_COUNT`   |
| Dedupe Lock TTL    | 10 min            | `DS_JOB_LOCK_TTL_MS`          |
| Remove on Complete | Last 1000         | —                             |
| Remove on Fail     | Last 5000         | —                             |

**Dedupe Mechanism**: Redis `SET NX` with key `ld:lock:{datasourceId}` prevents the same datasource from running in parallel.

**Graceful Shutdown**: Handles `SIGTERM` and `SIGINT` — closes the worker, quits Redis cleanly.

### Worker Inventory

| Worker                   | Queue                   | Status   |
| ------------------------ | ----------------------- | -------- |
| `datasource-loader.ts`   | `load-datasource`       | ✅ Active |
| `check-and-sync-cron.ts` | cron-based              | ✅ Active |
| `generate-rag.ts`        | `generate-rag-document` | ✅ Active |
| `index-media-memory.ts`  | `index-media-memory`    | ✅ Active |
| `check-stalled.ts`       | internal                | ✅ Active |
| `trial-expiry-check.ts`  | cron-based              | ✅ Active |
| `usage-reconciler.ts`    | cron-based              | ✅ Active |
| `scheduler.ts`           | orchestrator            | ✅ Active |

***

## 4. Ingestion Engine & Loaders

**Source**: `packages/zappway/lib/datastores/datasources/`, `packages/zappway/lib/loaders/`

All loaders extend `DatasourceLoaderBase`. The entry point `taskLoadDatasource()` selects the correct loader at runtime based on `DatasourceType`. Output is a normalized `AppDocument[]` array fed into the chunking engine and then into Qdrant.

```mermaid theme={null}
graph TB
    subgraph "Entry Point"
        TASK["taskLoadDatasource()"]
    end

    subgraph "Loader Registry"
        BASE["DatasourceLoaderBase — Abstract"]
    end

    subgraph "Loaders"
        WP["WebPageLoader — Single URL"]
        WS2["WebSiteLoader — Sitemap + crawl"]
        FILE["FileLoader — PDF, DOCX, TXT"]
        QA["QALoader — Q and A pairs"]
        TEXT["TextLoader — Raw text"]
        YT["YouTubeLoader — Transcription"]
        GD["GoogleDriveLoader — Drive files"]
        NOT["NotionLoader — Pages and DBs"]
        APIL["APILoader — HTTP endpoint"]
    end

    TASK --> BASE
    BASE --> WP
    BASE --> WS2
    BASE --> FILE
    BASE --> QA
    BASE --> TEXT
    BASE --> YT
    BASE --> GD
    BASE --> NOT
    BASE --> APIL

    WP -->|"Chunks"| CHUNK["Chunking Engine — defaultChunkSize: 1024"]
    CHUNK -->|"Vectors"| QDR["Qdrant — zw_{datastoreId}"]
```

### Loader Mapping

| DatasourceType        | Loader              | Source          |
| --------------------- | ------------------- | --------------- |
| `web_page`            | `WebPageLoader`     | URL fetch       |
| `web_site`            | `WebSiteLoader`     | Sitemap / crawl |
| `file`                | `FileLoader`        | S3 upload       |
| `text`                | `TextLoader`        | Raw text input  |
| `qa`                  | `QALoader`          | Q\&A pairs      |
| `youtube_video`       | `YouTubeLoader`     | YouTube API     |
| `youtube_bulk`        | `YouTubeLoader`     | Multi-video     |
| `google_drive_file`   | `GoogleDriveLoader` | Drive API       |
| `google_drive_folder` | `GoogleDriveLoader` | Drive API       |
| `notion`              | `NotionLoader`      | Notion API      |
| `notion_page`         | `NotionLoader`      | Notion API      |
| `api`                 | `APILoader`         | HTTP endpoint   |

### WebSite Loader — Pipeline Detail

**Source**: `packages/zappway/lib/loaders/web-site.ts` (515 lines)

The `WebSiteLoader` is the most complex loader. Its pipeline runs in 6 stages: (1) **Discovery** — parses sitemap XML or crawls via `findDomainPages()`; (2) **Normalization** — URL dedup, strip UTM params, lowercase hostname; (3) **Blacklist filtering** — applies `black_listed_urls` config; (4) **HTTP probing** — HEAD → GET with semver path repair on 404s; (5) **Child management** — upserts `web_page` child datasources, deletes orphans; (6) **Enqueueing** — emits child jobs with priority scores (home = 5, sitemap = 8, others = 10). Concurrency is capped at 6 via `mapWithConcurrency`. Plan limit applied via `accountConfig[plan].limits.maxWebsiteURL` (default: 25).

***

## 5. Vector Database Layer — Qdrant

**Source**: `packages/zappway/lib/datastores/qdrant.ts` (731 lines)

Each Datastore maps to exactly one Qdrant collection named `zw_{datastoreId}`, providing strict per-tenant vector isolation. `QdrantManager` includes an **auto-migration** routine that detects dimension or distance metric mismatches and recreates the collection transparently — enabling zero-downtime embedding model upgrades.

```mermaid theme={null}
graph TB
    subgraph "Qdrant Cluster"
        subgraph "Collection: zw_datastore_001"
            P1["Point — vector float 3072 — payload: datastore_id, datasource_id, text, tags"]
        end
        subgraph "Collection: zw_datastore_002"
            P2["Point — vector float 3072 — ..."]
        end
    end

    subgraph "QdrantManager"
        SEARCH["search() — similarity + topK"]
        UPSERT["upsertDocuments() — batch upsert"]
        DELETE["deleteDocuments() — by datasource_id"]
        MIGRATE["autoMigrate() — dimension and distance check"]
    end

    SEARCH --> P1
    UPSERT --> P1
    DELETE --> P1
    MIGRATE -->|"recreates on mismatch"| P1
```

### Collection Configuration

| Parameter           | Value                         |
| ------------------- | ----------------------------- |
| Vector Size         | 3072                          |
| Distance Metric     | Cosine                        |
| Collection Name     | `zw_{datastoreId}`            |
| HNSW `m`            | 16                            |
| HNSW `ef_construct` | 100                           |
| Quantization        | Scalar Int8 — 99th percentile |
| On-disk Payload     | `true`                        |
| Write Consistency   | `majority`                    |

### Payload Schema (per point)

| Field             | Type      | Purpose                 |
| ----------------- | --------- | ----------------------- |
| `datastore_id`    | string    | Tenant-scoped datastore |
| `datasource_id`   | string    | Source document         |
| `datasource_hash` | string    | Content hash for dedup  |
| `chunk_hash`      | string    | Individual chunk hash   |
| `chunk_offset`    | integer   | Position in document    |
| `text`            | string    | Original text content   |
| `tags`            | string\[] | User-defined tags       |
| `custom_id`       | string    | External reference      |
| `page_number`     | integer   | PDF page number         |
| `total_pages`     | integer   | Total PDF pages         |

***

## 6. Embedding Pipeline

**Source**: `packages/zappway/lib/datastores/gemini-embeddings.ts`, `packages/zappway/lib/multimodal-memory/`

All embeddings use **Gemini Embedding 2 Preview**, producing 3072-dimensional vectors that match Qdrant's `VECTOR_SIZE`. The pipeline uses asymmetric `taskType` values per call site — `RETRIEVAL_DOCUMENT` during ingestion and `RETRIEVAL_QUERY` at search time — which is critical for retrieval quality with Gemini's asymmetric embedding model.

```mermaid theme={null}
graph LR
    subgraph "Ingestion"
        ING["Loader / Chunker"] -->|"RETRIEVAL_DOCUMENT"| EMB_DOC["createGeminiEmbeddings"]
    end

    subgraph "Search"
        QUERY["Chat / RAG"] -->|"RETRIEVAL_QUERY"| EMB_Q["createGeminiEmbeddings"]
    end

    EMB_DOC --> GEMINI["Gemini API — batchEmbedContents — batch: 100"]
    EMB_Q --> GEMINI
    GEMINI --> VEC["float 3072 vectors"]
```

### Configuration

| Parameter         | Value                                      |
| ----------------- | ------------------------------------------ |
| Model             | `gemini-embedding-2-preview`               |
| Output Dimensions | 3072                                       |
| Batch Size        | 100 (API limit per `batchEmbedContents`)   |
| Auth              | `x-goog-api-key` — env `GOOGLE_AI_API_KEY` |

### Multimodal Support

`embedMultimodal()` accepts `GeminiPart[][]` where each part can be `{ text }` or `{ inlineData: { mimeType, data } }` (base64-encoded images, video frames, audio, PDF pages). This embeds non-text content into the same 3072-dimensional space as text, enabling true multimodal semantic search.

The `multimodal-memory/` module provides dedicated indexing (`indexer.ts` — 24.5KB), media-specific chunkers (`media-chunkers.ts`), collection management, search, and async queue triggers.

***

## 7. RAG & Retrieval Pipeline

**Source**: `packages/zappway/lib/chat-v4/rag.ts` (493 lines)

ZappWay implements **Multi-Shot Adaptive RAG** with up to 6 progressive retrieval attempts across 3 quality tiers. Each attempt relaxes similarity thresholds to maximize recall while `evaluateRagQuality()` enables early exit when results are strong enough. A circuit breaker prevents cascading failures, and `ragMemo` (in-memory cache) deduplicates identical attempts within a session.

```mermaid theme={null}
graph TB
    subgraph "Entry"
        CHAT["chat()"] --> RAG["executeRagMultiShot()"]
    end

    subgraph "Budget Check"
        RAG --> BUDGET["validateRagBudget()"]
        BUDGET -->|"insufficient"| LIGHT["Lightweight Fallback — topK: 3 / threshold: 0.65"]
        BUDGET -->|"ok"| CB["Circuit Breaker — 3 failures / 60s cooldown"]
    end

    subgraph "Multi-Shot Loop — up to 6 attempts"
        CB -->|"closed"| T1["Tier 1 — Strict — 0.55 / 0.48 — topK 30/45"]
        T1 -->|"not strong"| T2["Tier 2 — Balanced — 0.42 / 0.37 — topK 60/75"]
        T2 -->|"not strong"| T3["Tier 3 — Recall — 0.33 / 0.30 — topK 90/100"]
    end

    subgraph "Quality Evaluation"
        T1 --> EVAL["evaluateRagQuality()"]
        T2 --> EVAL
        T3 --> EVAL
        EVAL -->|"strong + minUcount"| EXIT["Early Exit"]
        EVAL -->|"best partial"| BEST["Return Best Result"]
    end
```

### Adaptive Thresholds

| Attempt | Label           | Deep Mode | Normal | TopK |
| ------- | --------------- | --------- | ------ | ---- |
| 1       | `strict`        | 0.55      | 0.48   | 30   |
| 2       | `balanced-high` | 0.48      | 0.42   | 45   |
| 3       | `balanced`      | 0.42      | 0.38   | 60   |
| 4       | `balanced-low`  | 0.37      | 0.32   | 75   |
| 5       | `relaxed`       | 0.33      | 0.28   | 90   |
| 6       | `permissive`    | 0.30      | 0.25   | 100  |

**Deep Mode** is triggered by `shouldFavorDeepRag(query)` and applies higher initial thresholds. `minUcount` is 1 for single-datastore and 2 for multi-datastore queries. Timeouts scale from 8s (Tier 1) to 12s (Tier 3), capped by remaining chat budget. If `remainingMs() < 120s`, max attempts reduce to 4.

***

## 8. Chat & Conversation Engine

**Source**: `packages/zappway/lib/chat-v4/chat.ts` (1147 lines), `packages/zappway/lib/agent/tools/`

The chat function orchestrates system prompt assembly, message history truncation, multi-shot RAG, runtime tool building, LLM execution with streaming, and the full model fallback chain — all within a shared time budget enforced at every checkpoint.

```mermaid theme={null}
graph TB
    subgraph "Input"
        REQ["Chat Request — query, conversationId, channel, modelName"]
    end

    subgraph "Preparation"
        REQ --> BUDGET["Budget — CHAT_TIMEOUT_CONFIG"]
        BUDGET --> SYSTEM["System Prompt — persona + instructions"]
        SYSTEM --> HISTORY["Message History — truncation + formatting"]
    end

    subgraph "RAG Phase"
        HISTORY --> RAG["executeRagMultiShot()"]
        RAG --> INJECT["Inject KB context into system prompt"]
    end

    subgraph "Tool Assembly"
        INJECT --> TOOLS["Runtime Tools"]
        TOOLS --> T_DS["datastore — KB search"]
        TOOLS --> T_FORM["form — Dynamic forms"]
        TOOLS --> T_HTTP["http — API calls"]
        TOOLS --> T_HUMAN["request-human — Handoff"]
        TOOLS --> T_LEAD["lead-capture — Lead form"]
        TOOLS --> T_PAY["payment — Payment flow"]
        TOOLS --> T_MARK["mark-resolved — Close ticket"]
    end

    subgraph "Execution"
        TOOLS --> RUN["runWithTools() — OpenAI-compat tool loop"]
        RUN -->|"streaming"| SSE["SSE Stream — answer, tool_call, metadata"]
        RUN -->|"on failure"| FALLBACK["fallbackCandidates() — Model degradation"]
    end

    FALLBACK --> RESP["ChatResponse — answer, sources, usage"]
    SSE --> RESP
```

### SSE Event Types

| Event               | Purpose                              |
| ------------------- | ------------------------------------ |
| `answer`            | Streamed text delta                  |
| `tool_call`         | Tool invocation notification         |
| `endpoint_response` | HTTP tool response data              |
| `step`              | Processing step indicator            |
| `metadata`          | Model usage, sources, context window |
| `error`             | Error notification                   |
| `done`              | Stream completion signal             |

### Runtime Tools

| Tool File             | Tool Type          | Purpose                           |
| --------------------- | ------------------ | --------------------------------- |
| `datastore.ts`        | `datastore`        | Dynamic RAG within tool loop      |
| `form.ts`             | `form`             | Present and collect dynamic forms |
| `http.ts`             | `http`             | Call external APIs                |
| `request-human.ts`    | `request_human`    | Transfer to human agent           |
| `lead-capture.ts`     | `lead_capture`     | Capture visitor info              |
| `payment.ts`          | `payment`          | Initiate payment flow             |
| `mark-as-resolved.ts` | `mark_as_resolved` | Close support ticket              |
| `ui.ts`               | UI rendering       | Custom UI components              |

***

## 9. Multi-Tenant Isolation & Auth

**Source**: `packages/zappway/lib/auth.ts` (1661 lines)

NextAuth v5 with a custom Prisma adapter. Four sign-in methods are supported. On first sign-in, the platform auto-provisions the full tenant stack atomically.

```mermaid theme={null}
graph TB
    subgraph "Auth Providers"
        EMAIL["EmailProvider — Magic Link"]
        GITHUB["GitHubProvider — OAuth"]
        GOOGLE["GoogleProvider — OAuth"]
        APPLE["AppleProvider — OAuth ES256"]
    end

    subgraph "NextAuth v5"
        CONFIG["authConfig() — Dynamic per-request"]
        ADAPTER["CustomPrismaAdapter"]
    end

    subgraph "Auto-Provisioning on First Sign-in"
        CREATE["User record"]
        ORG["Organization — auto-generated name"]
        MEMBER["Membership — Role: OWNER"]
        APIKEY["ApiKey — UUID v4"]
        USAGE["Usage tracker"]
        TRIAL["Trial Subscription — level 4 — default: 7 days"]
    end

    EMAIL --> CONFIG
    GITHUB --> CONFIG
    GOOGLE --> CONFIG
    APPLE --> CONFIG
    CONFIG --> ADAPTER
    ADAPTER --> CREATE
    CREATE --> ORG
    ORG --> MEMBER
    ORG --> APIKEY
    ORG --> USAGE
    ORG --> TRIAL
```

### Tenant Isolation by Layer

| Layer        | Mechanism                                                |
| ------------ | -------------------------------------------------------- |
| Database     | `organizationId` on every tenant model                   |
| API          | Session middleware injects `organization` from session   |
| Vector DB    | One Qdrant collection per datastore (`zw_{datastoreId}`) |
| File Storage | S3 prefix `organizations/{orgId}/`                       |
| Queues       | `organizationId` carried in every job payload            |

### RBAC Model

| Role         | Scope        | Access                       |
| ------------ | ------------ | ---------------------------- |
| `SUPERADMIN` | Global       | Full system                  |
| `CUSTOMER`   | Global       | Standard user                |
| `OWNER`      | Organization | Full org management          |
| `ADMIN`      | Organization | Manage agents and datastores |
| `USER`       | Organization | Read-only                    |

### Session Configuration

| Setting       | Production                         | Development               |
| ------------- | ---------------------------------- | ------------------------- |
| Strategy      | `database`                         | `database`                |
| Cookie Name   | `__Secure-next-auth.session-token` | `next-auth.session-token` |
| Cookie Domain | `.{rootDomain}`                    | `undefined`               |
| Secure        | `true`                             | `false`                   |

**Locale Support**: 37 locales, including RTL (Arabic, Hebrew, Persian, Urdu). Resolution order: URL path → `NEXT_LOCALE` cookie → `i18next` cookie → `Accept-Language` header → default `en`.

***

## 10. Sync Orchestrator

**Source**: `apps/workers-services/workers/check-and-sync-cron.ts`

A cron-driven worker scans all datasources with `status = synched`, filters by `lastSyncAt + syncInterval`, and fans out individual `load-datasource` jobs. A `check-stalled` worker handles recovery of datasources stuck in `running` status beyond a configurable threshold.

```mermaid theme={null}
graph TB
    subgraph "Trigger"
        CRON["Cron Job — Scheduled interval"]
    end

    subgraph "Orchestration"
        CRON --> SCAN["Scan — WHERE status = synched"]
        SCAN --> FILTER["Filter by lastSyncAt + syncInterval"]
        FILTER --> ENQUEUE["Fan-out to load-datasource queue"]
    end

    subgraph "Per-Datasource"
        ENQUEUE --> W["Datasource Loader Worker"]
        W -->|"success"| OK["status: synched — lastSyncAt updated"]
        W -->|"failure"| ERR["status: error"]
    end

    subgraph "Stalled Recovery"
        STALL["check-stalled worker"] -->|"scans running beyond threshold"| RETRY["Retry or mark stalled"]
    end
```

***

## 11. External Integrations & Channels

**Source**: `packages/zappway/integrations/` — 13 channel adapters

All channels normalize inbound messages to the same internal `Conversation` + `Message` model and route through the unified `chat-v4` engine.

```mermaid theme={null}
graph LR
    subgraph "Channel Adapters"
        WEB["Website Widget"]
        WA["WhatsApp — Cloud API"]
        TG["Telegram — Bot API"]
        SL["Slack Bot"]
        CR["Crisp Plugin"]
        ZD["Zendesk"]
        MSG["Messenger — Facebook"]
        IG["Instagram — Meta"]
        MAIL["Email — Mail Inboxes"]
        ZAP["Zapier — Webhooks"]
        API["REST API"]
        FORM["Form — Embeddable"]
    end

    CHAT["chat-v4 Engine"]

    WEB --> CHAT
    WA --> CHAT
    TG --> CHAT
    SL --> CHAT
    CR --> CHAT
    ZD --> CHAT
    MSG --> CHAT
    IG --> CHAT
    MAIL --> CHAT
    ZAP --> CHAT
    API --> CHAT
    FORM --> CHAT
```

### Channel Capabilities

| Channel     | Direction        | Streaming    | Form Support |
| ----------- | ---------------- | ------------ | ------------ |
| `dashboard` | Bidirectional    | SSE          | Yes          |
| `website`   | Bidirectional    | SSE          | Yes          |
| `whatsapp`  | Bidirectional    | No (webhook) | Limited      |
| `telegram`  | Bidirectional    | No (webhook) | Limited      |
| `slack`     | Bidirectional    | No (webhook) | No           |
| `crisp`     | Bidirectional    | No (webhook) | No           |
| `mail`      | Async            | No           | No           |
| `api`       | Request/Response | Optional     | No           |
| `zapier`    | Webhook          | No           | No           |
| `form`      | Form submit      | No           | Yes          |
| `zendesk`   | Bidirectional    | No (webhook) | No           |
| `messenger` | Bidirectional    | No (webhook) | No           |
| `instagram` | Bidirectional    | No (webhook) | No           |

### ServiceProvider Auth

| Type                      | Auth Mechanism       |
| ------------------------- | -------------------- |
| `website`                 | API Key              |
| `whatsapp`                | Cloud API token      |
| `telegram`                | Bot token            |
| `slack`                   | OAuth                |
| `crisp`                   | API key + website ID |
| `zendesk`                 | API token            |
| `messenger` / `instagram` | Page access token    |
| `notion`                  | Integration token    |
| `google_drive`            | OAuth refresh token  |
| `zapier`                  | Webhook URL          |

***

## 12. File Storage & Media Pipeline

**Source**: `packages/zappway/lib/aws.ts`

Supports S3-compatible storage (AWS S3, Cloudflare R2, MinIO) via AWS SDK v3. Files are uploaded via presigned URLs, stored under `organizations/{orgId}/`, and pulled by `FileLoader` for extraction and ingestion.

```mermaid theme={null}
graph LR
    subgraph "Upload"
        USER["User Upload"] -->|"presigned URL"| S3["S3 / R2 Bucket"]
    end

    subgraph "Processing"
        S3 -->|"file URL"| LOADER["FileLoader"]
        LOADER -->|"text extraction"| CHUNK["Chunking — defaultChunkSize: 1024"]
        CHUNK -->|"embed"| QDRANT["Qdrant — zw_{datastoreId}"]
    end

    subgraph "Cleanup"
        DELETE["deleteFolderFromS3Bucket()"] -->|"ListObjects + DeleteObjects"| S3
    end
```

### S3 Environment Variables

| Variable                      | Purpose                      |
| ----------------------------- | ---------------------------- |
| `APP_AWS_REGION`              | AWS region                   |
| `APP_AWS_ACCESS_KEY`          | Access key ID                |
| `APP_AWS_SECRET_KEY`          | Secret access key            |
| `APP_AWS_S3_ENDPOINT`         | Custom endpoint (R2 / MinIO) |
| `APP_AWS_S3_FORCE_PATH_STYLE` | Path-style for R2 / MinIO    |
| `APP_AWS_S3_BUCKET`           | Target bucket                |

***

## 13. LLM Orchestration & Model Router

**Source**: `packages/zappway/lib/config.ts` (943 lines), `packages/zappway/lib/chat-model/model.ts` (745 lines)

50+ models across 12 providers via a unified OpenAI-SDK-compatible interface. Provider routing is automatic based on each model's `baseUrl` in `ModelConfig`. OpenAI, Google Gemini, and OpenRouter are all accessed through the same OpenAI SDK client with different base URLs and API keys.

```mermaid theme={null}
graph TB
    subgraph "Model Router"
        SELECT["Agent.modelName — AgentModelName enum"]
        SELECT --> DETECT["Provider Detection — ModelConfig.baseUrl"]
    end

    subgraph "Providers"
        DETECT -->|"api.openai.com"| OAI["OpenAI Direct — GPT-4o, GPT-5.x"]
        DETECT -->|"generativelanguage.googleapis.com"| GOOGLE["Google Direct — Gemini 2.0 to 3.1"]
        DETECT -->|"openrouter.ai"| OR["OpenRouter — Claude, Llama, DeepSeek, Grok, etc."]
    end

    subgraph "OpenAI SDK — Unified Interface"
        OAI --> SDK["OpenAI SDK"]
        GOOGLE --> SDK
        OR --> SDK
    end

    SDK --> STREAM["Streaming SSE"]
    SDK --> NONSTREAM["Non-Streaming"]
```

### Model Inventory

#### 🧠 OpenAI — Direct

| Model                                | Max Tokens  | Cost         |
| ------------------------------------ | ----------- | ------------ |
| GPT-4o / GPT-4o Mini                 | 128K        | 15 / 4       |
| GPT-5 / Mini / Nano                  | 400K        | 15 / 5 / 4   |
| GPT-5.1 / 5.1 Chat                   | 400K        | 15 / 12      |
| GPT-5.2 / 5.2 Chat / 5.2 Chat Latest | 400K        | 20 / 15 / 15 |
| GPT-5.3 Chat Latest / 5.3 Codex      | 400K / 128K | 15           |
| GPT-5.4 / 5.4 Mini / 5.4 Nano        | 400K        | 15 / 6 / 5   |

#### 🟡 Google — Direct via Gemini API

| Model                                     | Max Tokens    | Cost       |
| ----------------------------------------- | ------------- | ---------- |
| Gemini Flash 2.0 / 2.5                    | 1M            | 5 / 10     |
| Gemini Pro 2.5                            | 1M            | 12         |
| Gemini 3 Pro / 3 Flash                    | 1M            | 15 / 10    |
| Gemini 3 Pro Image                        | 64K           | 15         |
| Gemini 3.1 Flash Lite / Flash Image / Pro | 1M / 64K / 1M | 5 / 8 / 15 |

#### 🔶 Anthropic — via OpenRouter

| Model                                   | Max Tokens | Cost         |
| --------------------------------------- | ---------- | ------------ |
| Claude 3.5 Haiku / 3.5 Sonnet           | 200K       | 12 / 15      |
| Claude 4 Sonnet                         | 200K       | 15           |
| Claude 4.5 Haiku / 4.5 Sonnet           | 200K       | 12 / 15      |
| Claude Opus 4.5 / Opus 4.6 / Sonnet 4.6 | 200K       | 20 / 20 / 15 |

#### Other Providers — via OpenRouter

| Model                                              | Provider   | Cost          |
| -------------------------------------------------- | ---------- | ------------- |
| DeepSeek V3.2 / R1                                 | DeepSeek   | 5 / 8         |
| Llama-4 Maverick / Scout                           | Meta       | 10 / 5        |
| Mistral Large 2512 / 8x22B / 8x7B / Small Creative | Mistral    | 5 / 5 / 2 / 2 |
| Sonar Pro                                          | Perplexity | 15            |
| Grok 4 / 4.1 Fast                                  | xAI        | 15 / 5        |
| Nova Premier / Nova 2 Lite                         | Amazon     | 12 / 6        |
| LFM2 8B / LFM 2 24B                                | Liquid     | 1 / 1         |
| MiniMax M2.7                                       | MiniMax    | 5             |
| MiMo-V2-Omni / MiMo-V2-Pro                         | Xiaomi     | 6 / 10        |

**Free Tier Models**: `gpt_4o_mini`, `gpt_5_mini`, `gpt_5_nano`, `gpt_5_4_mini`, `gpt_5_4_nano`, `gemini_flash_2_0`

> **Important:**
> GPT-5 family models do not support manual temperature adjustment — they use automatic temperature recognition.

***

## 14. Observability & Monitoring

**Source**: `packages/zappway/lib/logger.ts`, `apps/workers-services/sentry.*.config.ts`

| Component      | Technology                      | Status          |
| -------------- | ------------------------------- | --------------- |
| Core Logger    | Pino                            | ✅ Active        |
| Pretty Print   | pino-pretty                     | ✅ Active        |
| Axiom Cloud    | @axiomhq/pino                   | ❌ Commented out |
| Chat Logger    | Pino child `logChat`            | ✅ Active        |
| Error Tracking | Sentry (client + server + edge) | ✅ Active        |

Workers publish a `WorkerHealth` payload to Redis (`health:worker:{name}`, 30s TTL) every 10s. Fields: `status`, `startedAt`, `lastActivityAt`, `jobsProcessed`, `jobsFailed`, `queueLength`, `system.memoryMB`, `system.uptimeMs`. Exposed via `/api/workers/health`.

Workers also stream real-time logs to Redis Pub/Sub channel `logs:datasource` for live dashboard monitoring.

***

## 15. Gaps & Recommendations

### Identified Gaps

| # | Area               | Gap                                                              | Severity  |
| - | ------------------ | ---------------------------------------------------------------- | --------- |
| 1 | Observability      | Axiom integration commented out — no cloud logging in production | 🟡 Medium |
| 2 | Rate Limiting      | No visible rate limiting on chat API endpoints                   | 🔴 High   |
| 3 | Queue DLQ          | No Dead Letter Queue for permanently failed jobs                 | 🟡 Medium |
| 4 | Embedding Fallback | Single embedding provider (Gemini) — no fallback if API is down  | 🟡 Medium |
| 5 | Vector DB Backup   | No documented Qdrant backup/snapshot strategy                    | 🟡 Medium |
| 6 | Test Coverage      | No test files found in scanned directories                       | 🔴 High   |

### Recommendations

1. **Enable cloud logging** — Uncomment Axiom or use Datadog / Grafana Loki for production log retention.
2. **Add rate limiting** — Per-organization token-bucket on chat endpoints.
3. **Implement DLQ** — BullMQ supports `deadLetterQueue` option; enable for all 3 queues.
4. **Embedding fallback** — `text-embedding-3-large` from OpenAI (3072-dim compatible) as secondary.
5. **Qdrant snapshots** — Cron-based snapshot to S3 for disaster recovery.
6. **Test coverage** — Integration tests for the RAG pipeline and ingestion workers as a starting priority.

### Architectural Strengths

| Strength              | Detail                                                                      |
| --------------------- | --------------------------------------------------------------------------- |
| Multi-shot RAG        | 6-attempt adaptive retrieval with quality evaluation and circuit breaker    |
| Auto-migration        | Qdrant collection recreated transparently on embedding model change         |
| Graceful Shutdown     | All workers handle `SIGTERM` / `SIGINT` cleanly                             |
| Dedup Locks           | Redis `SET NX` prevents parallel processing of the same datasource          |
| Budget Management     | Time-budget enforced at every checkpoint in the chat pipeline               |
| 50+ Models            | Unified OpenAI-SDK interface across 12 providers                            |
| True Multimodal       | End-to-end multimodal embed, chunk, and search for images, audio, and video |
| Full Tenant Isolation | Separate Qdrant collections and S3 prefixes per organization                |

***

## Vocabulary

| Term                   | Description                                                                                 |
| ---------------------- | ------------------------------------------------------------------------------------------- |
| **Datastore**          | Container of Datasources — maps to a Qdrant collection (`zw_{datastoreId}`)                 |
| **Datasource**         | Individual data source — `web_page`, `file`, `youtube_video`, etc.                          |
| **AppDocument**        | Normalized document format output by all Loaders                                            |
| **Multi-Shot RAG**     | Adaptive retrieval with up to 6 progressive attempts across 3 quality tiers                 |
| **Circuit Breaker**    | Stops RAG after 3 consecutive failures for 60s to prevent cascading errors                  |
| **Deep Mode**          | High-threshold RAG mode triggered by `shouldFavorDeepRag(query)`                            |
| **BullMQ-Pro**         | Redis-backed job queue with advanced scheduling and flow features                           |
| **Qdrant**             | Vector database used for multimodal semantic search                                         |
| **Gemini Embedding 2** | Multimodal embedding model producing 3072-dimensional vectors                               |
| **taskType**           | Asymmetric embedding mode: `RETRIEVAL_DOCUMENT` for ingestion, `RETRIEVAL_QUERY` for search |
| **AgentsOnDatastores** | M:N junction table linking Agents to Datastores                                             |
| **HNSW**               | Hierarchical Navigable Small World — approximate nearest-neighbor index in Qdrant           |

***

· **Last updated**: March 2026
