# Datastores

> Create and manage knowledge bases that power your AI Employees with custom data. Import documents, websites, and structured data to enhance AI responses with accurate, context-aware information.

> **Important:**
> Datastores are the foundation of intelligent AI responses. By connecting relevant data sources to your AI Employees, you enable them to provide accurate, context-aware answers based on your organization's specific information.

***

## 🔢 Table of Contents

1. [What is a Datastore?](#1-what-is-a-datastore)
2. [Creating Your First Datastore](#2-creating-your-first-datastore)
3. [Adding Datasources](#3-adding-datasources)
4. [Supported Data Types](#4-supported-data-types)
5. [Processing and Indexing](#5-processing-and-indexing)
6. [Connecting to AI Employees](#6-connecting-to-ai-employees)
7. [Managing Datastores](#7-managing-datastores)
8. [Best Practices](#8-best-practices)
9. [Troubleshooting](#9-troubleshooting)

***

## 1. What is a Datastore?

### Overview

A **Datastore** is a collection of data sources that provides knowledge to your AI Employees. Think of it as a knowledge base that your AI can reference when answering questions or performing tasks.

**Key Concepts:**

* **Datastore**: A container that groups related datasources together
* **Datasource**: An individual piece of content (a document, website, text file, etc.)
* **Chunks**: Small segments of processed data optimized for AI retrieval
* **Embeddings**: Vector representations of your data used for semantic search

### Why Use Datastores?

**Benefits:**

* **Accuracy**: AI responses based on your actual data, not general knowledge
* **Context**: Provide domain-specific information unique to your business
* **Control**: Manage exactly what information the AI can access
* **Updates**: Keep knowledge current by updating datasources
* **Organization**: Group related information logically by department, topic, or purpose

**Common Use Cases:**

* Product documentation and manuals
* Company policies and procedures
* FAQ databases
* Technical specifications
* Customer support knowledge bases
* Training materials
* Legal documents
* Marketing content

***

## 2. Creating Your First Datastore

### Quick Start

**Steps:**

1. Navigate to **Datastores** in the sidebar (or visit `/datastores`)
2. Click the **"Create Datastore"** button
3. Fill in the creation form:
   * **Name** (required): Descriptive name for the datastore
   * **Description** (optional): Purpose and contents overview
4. Click **"Create"**
5. Your new datastore appears in the list

### Naming Best Practices

**Good Examples:**

* "Customer Support Documentation"
* "Product Catalog 2025"
* "HR Policies and Procedures"
* "Technical API Reference"
* "Sales Training Materials"

**Avoid:**

* Generic names like "Datastore 1" or "Test"
* Unclear abbreviations
* Names without context

### Datastore Settings

**Configuration Options:**

After creation, you can configure:

* **Name**: Update the datastore name
* **Description**: Add or modify description
* **Visibility**: Control who can access the datastore (if applicable)
* **Connected AI Employees**: View which agents use this datastore

***

## 3. Adding Datasources

### What is a Datasource?

A **Datasource** is an individual piece of content within a datastore. Each datastore can contain multiple datasources of different types.

### How to Add Datasources

**Location:** Inside any datastore detail page

**Methods:**

#### Method 1: File Upload

1. Click **"Add Datasource"** or **"Upload Files"**
2. Select **"File Upload"** option
3. Choose files from your computer:
   * Single file upload
   * Multiple files (batch upload)
   * Drag and drop support
4. Files are uploaded and processed automatically

**Supported File Types:**

* PDF documents
* Word documents (.docx, .doc)
* Text files (.txt)
* Markdown files (.md)
* CSV files
* Excel spreadsheets (.xlsx, .xls)
* PowerPoint presentations (.pptx, .ppt)

#### Method 2: Website/URL

1. Click **"Add Datasource"**
2. Select **"Website"** option
3. Enter the URL of the webpage
4. Configure crawling options:
   * **Single page**: Import only the specified URL
   * **Crawl site**: Follow links and import multiple pages
   * **Max depth**: How many levels deep to crawl
   * **URL patterns**: Include/exclude specific paths
5. Click **"Add"** to start import

**URL Examples:**

* Documentation sites: `https://docs.example.com`
* Blog posts: `https://blog.example.com/article`
* Product pages: `https://shop.example.com/products`
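
The crawl options above can be read as a filter applied to every link the crawler discovers. The sketch below is illustrative only: `should_crawl` and its parameters are hypothetical names, glob-style patterns are an assumption, and ZappWay's actual matching rules may differ.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlparse

def should_crawl(url, start_url, depth, max_depth, include=("*",), exclude=()):
    """Return True if a discovered link falls inside the configured crawl scope."""
    start, target = urlparse(start_url), urlparse(url)
    if target.netloc != start.netloc:
        return False  # stay on the site the crawl started from
    if depth > max_depth:
        return False  # link is deeper than the "Max depth" setting allows
    path = target.path or "/"
    if any(fnmatchcase(path, pattern) for pattern in exclude):
        return False  # exclude patterns win over include patterns
    return any(fnmatchcase(path, pattern) for pattern in include)

start = "https://docs.example.com"
print(should_crawl("https://docs.example.com/guide/intro", start, 1, 2, include=("/guide/*",)))
print(should_crawl("https://blog.example.com/article", start, 1, 2))
```

Thinking of the settings this way makes it easier to predict which pages a crawl will import before you start it.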

#### Method 3: Text Input

1. Click **"Add Datasource"**
2. Select **"Text"** option
3. Paste or type content directly
4. Give it a descriptive name
5. Click **"Save"**

**Use Cases:**

* Quick FAQ entries
* Policy snippets
* Short reference materials
* Temporary information

#### Method 4: Integrations (Coming Soon)

Future support for:

* Google Drive folders
* Notion databases
* Confluence spaces
* GitHub repositories
* SharePoint sites

### Batch Operations

**Upload Multiple Files:**

1. Select multiple files in the upload dialog
2. All files are queued for processing
3. Track progress for each file individually
4. Failed uploads can be retried

**Import Multiple URLs:**

1. Enable **"Bulk URL Import"**
2. Paste multiple URLs (one per line)
3. Configure shared settings for all URLs
4. Start batch import
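
Before pasting a long list into a bulk import, it can help to clean it up locally. This is a minimal sketch, assuming that only absolute `http`/`https` URLs are accepted; `parse_url_list` is a hypothetical helper, not part of ZappWay.

```python
from urllib.parse import urlparse

def parse_url_list(text):
    """Split pasted text into (valid, rejected) URLs, dropping blanks and duplicates."""
    valid, rejected, seen = [], [], set()
    for line in text.splitlines():
        url = line.strip()
        if not url or url in seen:
            continue  # skip empty lines and repeated entries
        seen.add(url)
        parts = urlparse(url)
        if parts.scheme in ("http", "https") and parts.netloc:
            valid.append(url)
        else:
            rejected.append(url)  # e.g. missing scheme such as "blog.example.com/article"
    return valid, rejected

pasted = """https://docs.example.com
https://docs.example.com
blog.example.com/article
https://shop.example.com/products
"""
valid, rejected = parse_url_list(pasted)
print(valid)
print(rejected)
```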

***

## 4. Supported Data Types

### Documents

**File Formats:**

| Type              | Extensions        | Notes                                 |
| ----------------- | ----------------- | ------------------------------------- |
| **PDF**           | .pdf              | Text extraction, OCR for scanned docs |
| **Word**          | .docx, .doc       | Full formatting preserved             |
| **Text**          | .txt, .md         | Plain text and Markdown               |
| **Spreadsheets**  | .xlsx, .xls, .csv | Table data extracted                  |
| **Presentations** | .pptx, .ppt       | Slide content and notes               |

**Processing Features:**

* **Text Extraction**: Automatic extraction from all formats
* **OCR**: Optical character recognition for image-based PDFs
* **Table Parsing**: Structured data from spreadsheets
* **Metadata Extraction**: Author, creation date, titles

### Web Content

**Supported:**

* Public websites (HTML pages)
* Blog posts and articles
* Documentation sites
* Product pages
* Knowledge base articles

**Features:**

* **Smart Crawling**: Follow internal links automatically
* **Content Cleaning**: Remove navigation, ads, footers
* **JavaScript Rendering**: Support for dynamic content
* **Sitemap Support**: Import via sitemap.xml

**Limitations:**

* Cannot access password-protected sites
* Rate limiting may apply for large crawls
* Some dynamic content may not render perfectly

### Structured Data

**CSV/Excel Processing:**

* Each row can become a separate datasource
* Column headers used for context
* Numeric data preserved
* Support for large datasets (up to 100,000 rows)
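
To picture how a row can become its own piece of content with column headers as context, consider the sketch below. It is an illustration under assumptions, not ZappWay's actual conversion: `rows_to_documents` is a hypothetical name and the `header: value` layout is one plausible choice.

```python
import csv
import io

def rows_to_documents(csv_text):
    """Turn each CSV row into a text snippet of 'header: value' lines."""
    reader = csv.DictReader(io.StringIO(csv_text))  # first row supplies the headers
    documents = []
    for row in reader:
        documents.append("\n".join(f"{column}: {value}" for column, value in row.items()))
    return documents

sample = "sku,name,price\nA1,Widget,9.99\nB2,Gadget,19.50\n"
for document in rows_to_documents(sample):
    print(document)
    print("---")
```

Descriptive column headers matter here: a header such as `price` gives the AI far more context than `col3`.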

**JSON Data:**

* Import structured JSON files
* Nested objects supported
* Array elements handled intelligently
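
One common way to make nested JSON searchable is to flatten it into path/value pairs. The sketch below shows that idea; it is an assumption about how such data could be prepared, and `flatten` is a hypothetical helper rather than a documented ZappWay behavior.

```python
import json

def flatten(value, prefix=""):
    """Flatten nested dicts and lists into a {path: leaf_value} mapping."""
    if isinstance(value, dict):
        items = [(f"{prefix}.{key}" if prefix else str(key), child) for key, child in value.items()]
    elif isinstance(value, list):
        items = [(f"{prefix}[{index}]", child) for index, child in enumerate(value)]
    else:
        return {prefix: value}  # leaf: string, number, boolean, or null
    flat = {}
    for path, child in items:
        flat.update(flatten(child, path))
    return flat

data = json.loads('{"product": {"name": "Widget", "tags": ["new", "sale"]}}')
for path, leaf in flatten(data).items():
    print(f"{path} = {leaf}")
```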

### Raw Text

**Use Cases:**

* FAQ entries
* Policy statements
* Product descriptions
* Quick reference materials

**Formatting:**

* Markdown supported for rich formatting
* Plain text for simple content
* HTML can be pasted directly

***

## 5. Processing and Indexing

### How Processing Works

When you add a datasource, ZappWay automatically:

1. **Extracts**: Pulls text content from files/URLs
2. **Cleans**: Removes irrelevant elements (ads, navigation)
3. **Chunks**: Splits content into optimal-sized segments
4. **Embeds**: Creates vector representations for semantic search
5. **Indexes**: Stores in a searchable database

### Processing Status

**Status Indicators:**

| Status         | Meaning                         | What to Do                                 |
| -------------- | ------------------------------- | ------------------------------------------ |
| **Uploading**  | File transfer in progress       | Wait                                       |
| **Processing** | Extracting and chunking content | Wait (1-5 minutes; longer for large files) |
| **Indexing**   | Creating embeddings             | Wait                                       |
| **Ready**      | Available for AI use            | No action needed                           |
| **Failed**     | Processing error occurred       | Retry or check the file                    |

**Tracking Progress:**

* Real-time status updates on datasource cards
* Progress percentage for large files
* Estimated time remaining
* Error messages for failed processing
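
If you track status from a script, the status values above suggest a simple polling loop that stops at **Ready** or **Failed**. This is a sketch only: `get_status` is a callable you would supply yourself (this page does not describe a status API), and the 30-minute default timeout is an assumption.

```python
import time

TERMINAL_STATUSES = {"Ready", "Failed"}

def wait_until_done(get_status, poll_seconds=5.0, timeout_seconds=1800.0, sleep=time.sleep):
    """Poll get_status() until it returns a terminal status or the timeout elapses."""
    waited = 0.0
    while True:
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        if waited >= timeout_seconds:
            raise TimeoutError(f"still {status!r} after {waited:.0f} seconds")
        sleep(poll_seconds)  # injectable so the loop can be tested without waiting
        waited += poll_seconds

# Simulated status sequence standing in for a real datasource.
statuses = iter(["Uploading", "Processing", "Indexing", "Ready"])
print(wait_until_done(lambda: next(statuses), sleep=lambda seconds: None))
```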

### Chunking Strategy

**What is Chunking?**

Large documents are split into smaller pieces (chunks) to optimize AI retrieval. Each chunk is typically 500-1000 tokens.

**Why Chunking Matters:**

* **Relevance**: AI can find the exact section needed
* **Performance**: Faster search and retrieval
* **Context**: Each chunk maintains coherent context

**Automatic Optimization:**

ZappWay automatically determines the best chunk size based on:

* Document type (PDF, text, web page)
* Content structure (headings, paragraphs)
* Information density
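
To make the idea concrete, here is a deliberately simple chunker. It is not ZappWay's algorithm: it counts whitespace-separated words as a rough stand-in for tokens, ignores headings and paragraphs, and the overlap between chunks is an added assumption that is common in retrieval systems.

```python
def chunk_words(text, max_words=200, overlap=20):
    """Split text into word-based chunks; consecutive chunks share `overlap` words."""
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must be non-negative and smaller than max_words")
    words = text.split()
    step = max_words - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last window already reached the end of the text
    return chunks

sample = " ".join(str(number) for number in range(10))
print(chunk_words(sample, max_words=4, overlap=1))
```

The shared words at chunk boundaries reduce the chance that a sentence is cut in a way that hides its meaning from retrieval.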

### Embeddings and Search

**What are Embeddings?**

Embeddings are numerical representations of text that capture semantic meaning. They enable the AI to find relevant information even when exact keywords do not match.

**Example:**

Query: "How do I reset my password?"\
Match: Chunk containing "Password recovery process"\
(Even though "reset" is not in the chunk text)
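
Semantic matching is usually implemented by comparing embedding vectors, for example with cosine similarity. The sketch below uses made-up three-dimensional vectors purely for illustration; real embeddings come from a model and have hundreds or thousands of dimensions, and the second chunk title is invented for contrast.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Toy vectors standing in for real embeddings (illustrative values only).
query_vector = [0.9, 0.1, 0.0]  # "How do I reset my password?"
chunk_vectors = {
    "Password recovery process": [0.8, 0.2, 0.1],
    "Shipping rates and delivery times": [0.0, 0.1, 0.9],
}

best = max(chunk_vectors, key=lambda name: cosine_similarity(query_vector, chunk_vectors[name]))
print(best)  # Password recovery process
```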

**Search Types:**

* **Semantic Search**: Meaning-based matching (default)
* **Keyword Search**: Exact term matching
* **Hybrid Search**: Combines both, which often gives the best results
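
One simple way to combine the two signals is a weighted blend of a semantic score and a keyword score. This is a sketch of that general technique, not a description of how ZappWay scores results; the function names and the `alpha` weight are assumptions.

```python
def keyword_score(query, text):
    """Fraction of distinct query terms that appear verbatim in the text."""
    terms = set(query.lower().split())
    words = set(text.lower().split())
    return len(terms & words) / len(terms) if terms else 0.0

def hybrid_score(semantic, keyword, alpha=0.5):
    """Weighted blend of two scores in [0, 1]; alpha=1.0 means semantic only."""
    return alpha * semantic + (1 - alpha) * keyword

keyword = keyword_score("reset password", "Password recovery process")
print(keyword)  # 0.5
print(hybrid_score(semantic=0.9, keyword=keyword, alpha=0.5))
```

Here the chunk matches only one of the two query terms exactly, but a high semantic score still lifts it in the combined ranking.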

***

## 6. Connecting to AI Employees

### Why Connect Datastores?

Connecting a datastore to an AI Employee allows that agent to reference the datastore's knowledge when responding to queries. Without a connection, the AI cannot access the data.

### Connection Methods

#### Method 1: During AI Employee Creation

When creating a new AI Employee:

1. In the creation form, find **"Knowledge"** section
2. Click the knowledge selector
3. Search and select datastores
4. Multiple datastores can be connected
5. Save the AI Employee

#### Method 2: From AI Employee Settings

For existing AI Employees:

1. Navigate to **AI Employees** → Select employee
2. Go to **Settings** tab
3. Find **"Knowledge"** section
4. Click **"Edit"** or knowledge selector
5. Add/remove datastores
6. Click **"Save"**

#### Method 3: From Integration Setup

When setting up integrations (WhatsApp, Messenger, etc.):

1. Open the integration's configuration
2. Select the AI Employee
3. The knowledge selector appears automatically
4. Choose the relevant datastores
5. Complete the integration setup

### Multiple Datastore Strategy

**Best Practices:**

**Scenario 1: Specialized AI Employees**

* **Support AI**: Connect only support-related datastores
* **Sales AI**: Connect product catalogs and pricing
* **HR AI**: Connect policies and employee handbooks

**Scenario 2: Comprehensive Knowledge**

* Connect multiple related datastores to one AI
* Example: Support AI with product docs + FAQs + troubleshooting guides

**Performance Considerations:**

* More datastores mean more data to search, which can slow responses
* Keep connections relevant to avoid confusion
* Typical limit: 5-10 datastores per AI Employee

### Viewing Connections

**From Datastore Page:**

Each datastore card shows:

* **Connected AI Employees**: Count of agents using this datastore
* **Click to view**: List of specific AI Employees

**From AI Employee Page:**

Settings tab displays:

* **Connected Datastores**: Full list with names
* **Quick remove**: Unlink datastores easily

***

## 7. Managing Datastores

### Datastore List View

**Main Page Display:**

Each datastore card shows:

* **Name**: Datastore title
* **Description**: Purpose overview
* **Datasource Count**: Number of sources inside
* **Connected AI Employees**: How many agents use it
* **Last Updated**: Most recent modification
* **Actions**: Edit, Delete, View Details

### Viewing Datastore Contents

**Detail Page:**

Click on any datastore to see:

* **Overview**: Name, description, stats
* **Datasources List**: All contained datasources
  * File name/URL
  * Type (PDF, website, text)
  * Status (Ready, Processing, Failed)
  * Size/Length
  * Upload date
  * Actions (View, Delete)
* **Activity Log**: Recent changes and updates
* **Connected AI Employees**: Full list

### Editing Datastores

**What Can Be Edited:**

* **Name**: Update datastore name
* **Description**: Modify description
* **Add Datasources**: Upload new files or URLs
* **Remove Datasources**: Delete outdated sources
* **Reprocess**: Trigger re-indexing if needed

**Steps:**

1. Click datastore name to open detail page
2. Click **"Edit"** button (top-right)
3. Modify fields as needed
4. Click **"Save Changes"**

### Updating Content

**When to Update:**

* Product information changes
* New documentation released
* Policy updates
* Correcting outdated information

**How to Update:**

**Option 1: Replace Datasource**

1. Delete old datasource
2. Upload new version
3. Wait for processing

**Option 2: Add New Version**

1. Upload new file with version number
2. Keep old version for reference (optional)
3. Remove old version later

**Option 3: Reprocess**

For web datasources, you can pick up changes on the site without re-adding the URL:

1. Click **"Reprocess"** on the web datasource
2. The content is re-crawled and re-indexed

### Deleting Datastores

**Warning:** Deleting a datastore removes all contained datasources and disconnects it from all AI Employees. This action cannot be undone.

**Steps:**

1. Navigate to datastore detail page
2. Click the **"Delete"** button (usually in the menu)
3. Confirm deletion in modal
4. Datastore and all datasources are permanently removed

**Before Deleting:**

* Check which AI Employees are connected
* Consider archiving instead if data might be needed later
* Export important datasources if needed

### Archiving (If Available)

Some plans support archiving:

* Temporarily disable a datastore without deleting
* Disconnects from AI Employees automatically
* Can be restored later
* Preserves all datasources and configuration

***

## 8. Best Practices

### Organizing Datastores

**Strategy 1: By Department**

Create separate datastores for each team:

* "Customer Support Knowledge Base"
* "Sales Resources"
* "HR Policies"
* "Engineering Documentation"

**Benefits:**

* Clear ownership
* Easy access control
* Focused knowledge per team

**Strategy 2: By Topic**

Organize by subject matter:

* "Product Information"
* "Technical Specifications"
* "Company Policies"
* "Training Materials"

**Benefits:**

* Cross-functional access
* Logical grouping
* Easy to find information

**Strategy 3: By AI Employee**

One datastore per AI Employee:

* "Support Bot Knowledge"
* "Sales Bot Resources"
* "Onboarding Assistant Data"

**Benefits:**

* Direct 1:1 mapping
* Simplified management
* Clear scope per agent

### Content Quality

**Document Preparation:**

Before uploading:

1. **Clean Up**: Remove irrelevant sections
2. **Structure**: Use clear headings and hierarchy
3. **Format**: Ensure text is selectable (not images)
4. **Accuracy**: Verify information is current
5. **Completeness**: Include all necessary context

**Writing for AI:**

* **Clear Language**: Avoid ambiguous terms
* **Complete Sentences**: Full thoughts, not fragments
* **Context**: Include background information
* **Examples**: Provide concrete examples where possible
* **Consistency**: Use consistent terminology

### Maintenance Schedule

**Daily:**

* Monitor processing status for new uploads
* Check for failed datasources

**Weekly:**

* Review AI Employee response quality
* Identify knowledge gaps
* Add missing information

**Monthly:**

* Audit all datastores for outdated content
* Update changed information
* Remove deprecated datasources
* Check for duplicate content

**Quarterly:**

* Major content review and refresh
* Reorganize if needed
* Archive unused datastores
* Performance optimization

### Performance Optimization

**Keep Datastores Focused:**

✅ **Good:**

* "Customer Support - Product A"
* "Customer Support - Product B"

❌ **Avoid:**

* "Everything About Our Company"

**Optimal Size:**

* **Small Datastore**: 10-50 datasources
* **Medium Datastore**: 50-200 datasources
* **Large Datastore**: 200-1000 datasources

**Signs of Too Much Data:**

* Slow AI response times
* Irrelevant information in responses
* Difficulty finding specific information

**Solutions:**

* Split into multiple focused datastores
* Remove redundant content
* Archive old versions

### Security and Privacy

**Sensitive Information:**

**Best Practices:**

1. **Audit Before Upload**: Review for confidential data
2. **Redact**: Remove personal information, credentials
3. **Access Control**: Use appropriate visibility settings
4. **Regular Reviews**: Check for exposed sensitive data

**What to Avoid Uploading:**

* Personally identifiable information (PII)
* Passwords or API keys
* Financial records (unless encrypted)
* Confidential business strategies
* Private customer data
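
A quick automated pass can catch obvious leaks before upload. The sketch below masks email addresses and simple `key=value` secrets; the patterns are illustrative assumptions, will miss many cases, and do not replace a manual review.

```python
import re

# Simplified patterns for illustration; real PII and secret detection needs more care.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
SECRET = re.compile(r"(?i)\b(password|api[_-]?key|token)\b(\s*[:=]\s*)\S+")

def redact(text):
    """Mask email addresses and values assigned to password/api_key/token."""
    text = EMAIL.sub("[REDACTED EMAIL]", text)
    return SECRET.sub(r"\1\2[REDACTED]", text)

print(redact("Contact jane.doe@example.com, api_key=abc123"))
# Contact [REDACTED EMAIL], api_key=[REDACTED]
```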

**Compliance Considerations:**

* GDPR: Ensure lawful processing of personal data
* HIPAA: Do not upload protected health information
* PCI DSS: Never upload payment card data
* Industry-specific: Follow your sector's regulations

***

## 9. Troubleshooting

### Common Issues

#### Datasource Failed to Process

**Problem:** Status shows "Failed" after upload

**Possible Causes:**

* Corrupted file
* Unsupported format
* File too large
* Password-protected document
* Network timeout

**Solutions:**

1. **Check File:**
   * Open file on your computer
   * Verify it is not corrupted
   * Ensure file is not password-protected

2. **File Size:**
   * Check file size (limit: typically 50 MB per file)
   * For large files, split into smaller parts
   * Compress PDFs if possible

3. **Format:**
   * Verify file extension matches actual format
   * Try converting to a different format
   * Use PDF for best compatibility

4. **Retry:**
   * Click **"Retry"** button
   * Or delete and re-upload
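
Several of these checks can be run locally before uploading. The sketch below assumes the file types listed in this guide and treats the "typically 50 MB" limit as 50 MiB; both helpers are hypothetical, and your plan's actual limits may differ.

```python
from pathlib import Path

SUPPORTED_EXTENSIONS = {".pdf", ".docx", ".doc", ".txt", ".md", ".csv",
                        ".xlsx", ".xls", ".pptx", ".ppt"}
MAX_BYTES = 50 * 1024 * 1024  # assumption: "50 MB" interpreted as 50 MiB

def upload_problems(filename, size_bytes):
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems = []
    extension = Path(filename).suffix.lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension: {extension or '(none)'}")
    if size_bytes == 0:
        problems.append("file is empty")
    if size_bytes > MAX_BYTES:
        problems.append("file exceeds the size limit")
    return problems

def looks_like_pdf(first_bytes):
    """PDF files begin with the marker %PDF-; a mismatch suggests a wrong extension."""
    return first_bytes.startswith(b"%PDF-")

print(upload_problems("manual.PDF", 1024))
print(upload_problems("archive.zip", 60 * 1024 * 1024))
```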

#### Website Crawl Failed

**Problem:** URL datasource shows error after crawling

**Possible Causes:**

* Website blocks crawlers
* Authentication required
* JavaScript-heavy site
* Server timeout
* Invalid URL

**Solutions:**

1. **Check URL:**
   * Verify URL is accessible in browser
   * Ensure no typos
   * Use direct page URL, not redirect

2. **Authentication:**
   * Only public pages can be crawled; pages that require a login cannot be imported
   * Contact the site owner for crawler access

3. **Alternative Methods:**
   * Copy the content manually into a text datasource
   * Print the page to PDF and upload the file instead
   * Try a different page from the same site

#### AI Not Using Datastore Knowledge

**Problem:** AI responses do not reference datastore content

**Checks:**

1. **Connection:**
   * Verify datastore is connected to AI Employee
   * Check connection from both sides

2. **Processing Status:**
   * Ensure all datasources show "Ready"
   * Wait if still "Processing"

3. **Content Relevance:**
   * Confirm question relates to datastore content
   * Try more specific queries
   * Check if information actually exists in datasources

4. **AI Configuration:**
   * Verify datastore tool is enabled
   * Check AI Employee instructions do not prevent tool use

**Testing:**

Ask direct questions about specific information you know is in the datastore:

Good test: "What is the return policy?" (if return policy is in datastore)\
Bad test: "Tell me everything you know" (too vague)

#### Duplicate or Conflicting Information

**Problem:** AI gives inconsistent answers

**Cause:** Multiple datasources contain different versions of the same information

**Solutions:**

1. **Identify Duplicates:**
   * Review datasources for overlapping content
   * Check upload dates to find old versions

2. **Clean Up:**
   * Delete outdated datasources
   * Keep only the most recent version
   * Update descriptions to note version/date

3. **Version Control:**
   * Include version numbers in datasource names
   * Example: "Product Manual v2.0 (Jan 2025)"
   * Remove old versions promptly
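
If you keep local copies of your source files, exact duplicates can be found by hashing normalized text. This sketch only detects identical content after case and whitespace normalization; it will not flag two versions that disagree, which still need a manual review. `find_duplicates` is a hypothetical helper and the sample names are invented.

```python
import hashlib
from collections import defaultdict

def find_duplicates(sources):
    """Group datasource names whose text is identical after normalization."""
    groups = defaultdict(list)
    for name, text in sources.items():
        normalized = " ".join(text.lower().split())  # ignore case and extra whitespace
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        groups[digest].append(name)
    return [names for names in groups.values() if len(names) > 1]

sources = {
    "Product Manual v1.0": "Returns accepted within 30 days.",
    "Product Manual (copy)": "returns  accepted within 30 days.",
    "Shipping FAQ": "Shipping takes 5 days.",
}
print(find_duplicates(sources))
```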

#### Slow Processing

**Problem:** Datasources stuck in "Processing" for an extended time

**Expected Times:**

* Small text file: 10-30 seconds
* PDF (10-50 pages): 1-3 minutes
* Large PDF (100+ pages): 5-15 minutes
* Website (single page): 30-60 seconds
* Website (crawl 50 pages): 5-10 minutes

**If Unusually Slow:**

1. **Wait**: Large files legitimately take time
2. **Refresh**: Reload page to check updated status
3. **Check Status**: Look for error messages
4. **Support**: Contact support if a datasource is stuck for more than 30 minutes

***

## 📊 Usage Limits

### Limits by Plan

**Typical Limits:**

| Plan           | Datastores | Datasources per Datastore | Storage |
| -------------- | ---------- | ------------------------- | ------- |
| **Free**       | 1          | 10                        | 10 MB   |
| **Growth**     | 5          | 50                        | 100 MB  |
| **Pro**        | 20         | 200                       | 1 GB    |
| **Enterprise** | 100        | 1000                      | 10 GB   |
| **Ultimate**   | Unlimited  | Unlimited                 | 1 TB    |

**Note:** Limits may vary. Check your plan details in Settings → Billing.

### Checking Usage

**Location:** Datastores page header

**Information Displayed:**

* Current datastore count vs. limit
* Total storage used vs. limit
* Warning if approaching limits

**Example Alert:**

```
⚠️ Storage Limit Warning (9.2 GB / 10 GB)
You're approaching your storage limit. Consider upgrading or removing unused datasources.
[Upgrade Plan]
```

### What Happens at Limit?

**Datastores Limit Reached:**

* Cannot create new datastores
* Can still add datasources to existing datastores
* Upgrade prompt displayed

**Storage Limit Reached:**

* Cannot upload new files
* Can still add text datasources
* Must delete datasources or upgrade

**Datasource Limit Reached:**

* Cannot add more datasources to that datastore
* Can create new datastores (if under datastore limit)
* Can delete old datasources to free slots
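
The rules above can be summarized as three independent checks. The sketch below models them with hypothetical names and uses the Free plan figures from the table as sample values; how ZappWay enforces limits internally is not documented here.

```python
from dataclasses import dataclass

@dataclass
class Usage:
    datastores: int
    datastore_limit: float      # use float("inf") for an unlimited plan
    storage_used_mb: float
    storage_limit_mb: float

def can_create_datastore(usage):
    """New datastores are blocked once the datastore limit is reached."""
    return usage.datastores < usage.datastore_limit

def can_upload_file(usage, file_mb):
    """File uploads are blocked when they would exceed the storage limit."""
    return usage.storage_used_mb + file_mb <= usage.storage_limit_mb

def can_add_datasource(current_count, per_datastore_limit):
    """Each datastore has its own datasource cap, independent of the others."""
    return current_count < per_datastore_limit

free_plan = Usage(datastores=1, datastore_limit=1, storage_used_mb=4.0, storage_limit_mb=10.0)
print(can_create_datastore(free_plan))   # False
print(can_upload_file(free_plan, 5.0))   # True
print(can_add_datasource(10, 10))        # False
```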

***

## 🔗 Integration with AI Employees

### Datastore Tool

When you connect a datastore to an AI Employee, the **Datastore Tool** is automatically enabled. This tool allows the AI to search and retrieve information from connected datastores.

### How the AI Uses Datastores

**Process:**

1. **User asks question** → AI Employee receives message
2. **AI determines** if question requires datastore knowledge
3. **Tool is called** → Datastore tool searches connected datastores
4. **Relevant chunks returned** → Top matching content pieces
5. **AI synthesizes** → Combines search results with its knowledge
6. **Response generated** → Answer based on your data

**Example Flow:**

```
User: "What is the return policy?"
  ↓
AI: [Searches datastore for "return policy"]
  ↓
Datastore: [Returns relevant policy text]
  ↓
AI: "According to our policy, you can return items within 30 days..."
```
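
The synthesis step is commonly implemented by placing the retrieved chunks in the prompt ahead of the question. The sketch below shows that general pattern under assumptions: `build_prompt`, the chunk dictionary shape, and the file name are hypothetical, and ZappWay's internal prompt format is not documented here.

```python
def build_prompt(question, chunks):
    """Assemble retrieved chunks and the user question into one prompt string."""
    if chunks:
        context = "\n\n".join(
            f"[{number}] {chunk['source']}: {chunk['text']}"
            for number, chunk in enumerate(chunks, start=1)
        )
    else:
        context = "(no relevant datastore content found)"
    instructions = "Answer using the context below and cite sources by number."
    return f"{instructions}\n\nContext:\n{context}\n\nQuestion: {question}"

retrieved = [{"source": "Returns Policy.pdf", "text": "Items can be returned within 30 days."}]
print(build_prompt("What is the return policy?", retrieved))
```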

### Customizing Datastore Behavior

**In AI Employee Settings:**

You can configure:

* **Search Threshold**: How closely content must match query
* **Max Results**: How many chunks to return
* **Priority**: Which datastores to search first
* **Fallback**: What to do if no relevant content found
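
The first two settings and the fallback interact in a straightforward way, sketched below with hypothetical names and default values. Scores are assumed to lie between 0 and 1, with higher meaning a closer match.

```python
def select_chunks(scored_chunks, threshold=0.75, max_results=3):
    """Keep chunks scoring at or above the threshold, best first, capped at max_results."""
    kept = [item for item in scored_chunks if item[0] >= threshold]
    kept.sort(key=lambda item: item[0], reverse=True)
    return kept[:max_results]

scored = [(0.90, "Return policy"), (0.50, "Careers page"),
          (0.80, "Refund timeline"), (0.76, "Exchange rules")]
selected = select_chunks(scored, threshold=0.75, max_results=2)
print(selected)

# An empty selection is the case where the fallback behavior applies.
if not select_chunks(scored, threshold=0.95):
    print("No relevant content found; use the configured fallback.")
```

Raising the threshold trades recall for precision: fewer chunks pass, but those that do are closer matches.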

**Advanced Prompting:**

Include instructions in the AI Employee's system prompt:

Example:

```
When answering questions, always check the datastore first. 
Only use your general knowledge if the datastore does not 
contain relevant information. Always cite the source document 
when referencing datastore content.
```

***

## 📞 Support & Resources

### Getting Help

**In-App Support:**

* Help button in dashboard
* Live chat (available on Pro+ plans)
* Documentation center

**Common Resources:**

* [ZappWay Documentation](https://docs.zappway.ai)
* [AI Employee Guide](https://docs.zappway.ai/en/ai-employees)
* [Integration Tutorials](https://docs.zappway.ai/en/integrations)

### Feedback

**Report Issues:**

* Use feedback button in dashboard
* Email: [support@zappway.ai](mailto:support@zappway.ai)
* Include:
  * Datastore ID
  * Datasource name
  * Error message (if any)
  * Steps to reproduce

**Feature Requests:**

* Submit via in-app feedback
* Community forum
* Feature voting board

***

## ✅ Quick Reference

### Essential Actions

| Task              | Location             | Action                           |
| ----------------- | -------------------- | -------------------------------- |
| Create Datastore  | Datastores page      | "Create Datastore"               |
| Add File          | Datastore detail     | "Add Datasource" → "File Upload" |
| Add URL           | Datastore detail     | "Add Datasource" → "Website"     |
| Connect to AI     | AI Employee settings | Knowledge selector               |
| View Contents     | Datastore card       | Click name                       |
| Delete Datasource | Datasource card      | Delete button                    |
| Reprocess         | Datasource actions   | "Reprocess"                      |

### Processing Status Guide

```
🔵 Uploading    → File transfer in progress
🟡 Processing   → Extracting content
🟠 Indexing     → Creating embeddings
🟢 Ready        → Available for use
🔴 Failed       → Error occurred
```

### Best File Formats

**Priority Order:**

1. **PDF** - Best for documents
2. **.docx** - Good for text documents
3. **.txt/.md** - Simple text content
4. **.csv** - Structured data
5. **URL** - Web content

***

**Last Updated:** March 2026\
**Platform:** ZappWay Datastores