Important: Datastores are the foundation of intelligent AI responses. By connecting relevant data sources to your AI Employees, you enable them to provide accurate, context-aware answers based on your organization’s specific information.

🔢 Table of Contents

  1. What is a Datastore?
  2. Creating Your First Datastore
  3. Adding Datasources
  4. Supported Data Types
  5. Processing and Indexing
  6. Connecting to AI Employees
  7. Managing Datastores
  8. Best Practices
  9. Troubleshooting

1. What is a Datastore?

Overview

A Datastore is a collection of data sources that provides knowledge to your AI Employees. Think of it as a knowledge base that your AI can reference when answering questions or performing tasks.

Key Concepts:
  • Datastore: A container that groups related datasources together
  • Datasource: Individual pieces of content (documents, websites, text files, etc.)
  • Chunks: Small segments of processed data optimized for AI retrieval
  • Embeddings: Vector representations of your data used for semantic search

Why Use Datastores?

Benefits:
  • Accuracy: AI responses based on your actual data, not general knowledge
  • Context: Provide domain-specific information unique to your business
  • Control: Manage exactly what information the AI can access
  • Updates: Keep knowledge current by updating datasources
  • Organization: Group related information logically by department, topic, or purpose
Common Use Cases:
  • Product documentation and manuals
  • Company policies and procedures
  • FAQ databases
  • Technical specifications
  • Customer support knowledge bases
  • Training materials
  • Legal documents
  • Marketing content

2. Creating Your First Datastore

Quick Start

Steps:
  1. Navigate to Datastores in the sidebar (or visit /datastores)
  2. Click “Create Datastore” button
  3. Fill in the creation form:
    • Name (required): Descriptive name for the datastore
    • Description (optional): Purpose and contents overview
  4. Click “Create”
  5. Your new datastore appears in the list

Naming Best Practices

Good Examples:
  • “Customer Support Documentation”
  • “Product Catalog 2025”
  • “HR Policies and Procedures”
  • “Technical API Reference”
  • “Sales Training Materials”
Avoid:
  • Generic names like “Datastore 1” or “Test”
  • Unclear abbreviations
  • Names without context

Datastore Settings

Configuration Options: After creation, you can configure:
  • Name: Update the datastore name
  • Description: Add or modify description
  • Visibility: Control who can access (if applicable)
  • Connected AI Employees: View which agents use this datastore

3. Adding Datasources

What is a Datasource?

A Datasource is an individual piece of content within a datastore. Each datastore can contain multiple datasources of different types.

How to Add Datasources

Location: Inside any datastore detail page

Methods:

Method 1: File Upload

  1. Click “Add Datasource” or “Upload Files”
  2. Select “File Upload” option
  3. Choose files from your computer:
    • Single file upload
    • Multiple files (batch upload)
    • Drag and drop support
  4. Files are uploaded and processed automatically
Supported File Types:
  • PDF documents
  • Word documents (.docx, .doc)
  • Text files (.txt)
  • Markdown files (.md)
  • CSV files
  • Excel spreadsheets (.xlsx, .xls)
  • PowerPoint presentations (.pptx, .ppt)

Method 2: Website/URL

  1. Click “Add Datasource”
  2. Select “Website” option
  3. Enter the URL of the webpage
  4. Configure crawling options:
    • Single page: Import only the specified URL
    • Crawl site: Follow links and import multiple pages
    • Max depth: How many levels deep to crawl
    • URL patterns: Include/exclude specific paths
  5. Click “Add” to start import
URL Examples:
  • Documentation sites: https://docs.example.com
  • Blog posts: https://blog.example.com/article
  • Product pages: https://shop.example.com/products
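The include/exclude URL patterns from step 4 can be pictured as a small path filter. This is an illustrative sketch only, not ZappWay's actual crawler; the shell-style wildcard syntax (Python's fnmatch) is an assumption for the example.

```python
# Sketch of include/exclude URL-pattern filtering as a crawler might
# apply it. Pattern syntax (shell wildcards) is assumed for illustration.
from fnmatch import fnmatch
from urllib.parse import urlparse

def should_crawl(url, include=("*",), exclude=()):
    """Return True if the URL path matches an include pattern
    and no exclude pattern."""
    path = urlparse(url).path or "/"
    if any(fnmatch(path, pat) for pat in exclude):
        return False
    return any(fnmatch(path, pat) for pat in include)

# Keep documentation pages, skip the blog archive:
print(should_crawl("https://docs.example.com/guides/setup",
                   include=("/guides/*",)))                 # True
print(should_crawl("https://docs.example.com/blog/post",
                   exclude=("/blog/*",)))                   # False
```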

Method 3: Text Input

  1. Click “Add Datasource”
  2. Select “Text” option
  3. Paste or type content directly
  4. Give it a descriptive name
  5. Click “Save”
Use Cases:
  • Quick FAQ entries
  • Policy snippets
  • Short reference materials
  • Temporary information

Method 4: Integrations (Coming Soon)

Future support for:
  • Google Drive folders
  • Notion databases
  • Confluence spaces
  • GitHub repositories
  • SharePoint sites

Batch Operations

Upload Multiple Files:
  1. Select multiple files in the upload dialog
  2. All files are queued for processing
  3. Track progress for each file individually
  4. Failed uploads can be retried
Import Multiple URLs:
  1. Enable “Bulk URL Import”
  2. Paste multiple URLs (one per line)
  3. Configure shared settings for all URLs
  4. Start batch import
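The "one URL per line" input from step 2 is easy to sanity-check before starting a batch import. A minimal sketch (scheme and host checks only, not what ZappWay validates internally):

```python
# Split pasted "one URL per line" text into valid and invalid lists.
from urllib.parse import urlparse

def parse_url_list(text):
    valid, invalid = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        parts = urlparse(line)
        if parts.scheme in ("http", "https") and parts.netloc:
            valid.append(line)
        else:
            invalid.append(line)
    return valid, invalid

valid, invalid = parse_url_list("""
https://docs.example.com
not-a-url
https://blog.example.com/article
""")
print(valid)    # ['https://docs.example.com', 'https://blog.example.com/article']
print(invalid)  # ['not-a-url']
```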

4. Supported Data Types

Documents

File Formats:
Type          | Extensions        | Notes
PDF           | .pdf              | Text extraction, OCR for scanned docs
Word          | .docx, .doc       | Full formatting preserved
Text          | .txt, .md         | Plain text and Markdown
Spreadsheets  | .xlsx, .xls, .csv | Table data extracted
Presentations | .pptx, .ppt       | Slide content and notes
Processing Features:
  • Text Extraction: Automatic extraction from all formats
  • OCR: Optical character recognition for image-based PDFs
  • Table Parsing: Structured data from spreadsheets
  • Metadata Extraction: Author, creation date, titles

Web Content

Supported:
  • Public websites (HTML pages)
  • Blog posts and articles
  • Documentation sites
  • Product pages
  • Knowledge base articles
Features:
  • Smart Crawling: Follow internal links automatically
  • Content Cleaning: Remove navigation, ads, footers
  • JavaScript Rendering: Support for dynamic content
  • Sitemap Support: Import via sitemap.xml
Limitations:
  • Cannot access password-protected sites
  • Rate limiting may apply for large crawls
  • Some dynamic content may not render perfectly

Structured Data

CSV/Excel Processing:
  • Each row can become a separate datasource
  • Column headers used for context
  • Numeric data preserved
  • Support for large datasets (up to 100,000 rows)
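The idea that each row becomes a separate datasource, with column headers giving values context, can be sketched in a few lines. The field names below are illustrative, not from any real catalog:

```python
# Turn each CSV row into a small, self-describing text snippet:
# "header: value" pairs keep the column context attached to the data.
import csv, io

def rows_to_snippets(csv_text):
    reader = csv.DictReader(io.StringIO(csv_text))
    return ["; ".join(f"{k}: {v}" for k, v in row.items())
            for row in reader]

data = "sku,name,price\nA100,Widget,9.99\nB200,Gadget,24.50\n"
for snippet in rows_to_snippets(data):
    print(snippet)
# sku: A100; name: Widget; price: 9.99
# sku: B200; name: Gadget; price: 24.50
```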
JSON Data:
  • Import structured JSON files
  • Nested objects supported
  • Array elements handled intelligently

Raw Text

Use Cases:
  • FAQ entries
  • Policy statements
  • Product descriptions
  • Quick reference materials
Formatting:
  • Markdown supported for rich formatting
  • Plain text for simple content
  • HTML can be pasted directly

5. Processing and Indexing

How Processing Works

When you add a datasource, ZappWay automatically:
  1. Extracts: Pulls text content from files/URLs
  2. Cleans: Removes irrelevant elements (ads, navigation)
  3. Chunks: Splits content into optimal-sized segments
  4. Embeds: Creates vector representations for semantic search
  5. Indexes: Stores in a searchable database
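The five steps above can be sketched as a toy end-to-end pipeline. The extraction, cleaning, and "embedding" here are deliberately naive stand-ins for ZappWay's internal machinery, shown only to make the flow concrete:

```python
# Toy version of the extract -> clean -> chunk -> embed -> index flow.
def extract(raw):            # 1. pull text content (already text here)
    return raw
def clean(text):             # 2. collapse stray whitespace
    return " ".join(text.split())
def chunk(text, size=20):    # 3. fixed-size word chunks
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]
def embed(piece):            # 4. toy "vector": bag of lowercased words
    return set(piece.lower().split())
def index(chunks):           # 5. store each chunk next to its vector
    return [(c, embed(c)) for c in chunks]

store = index(chunk(clean(extract("Returns are accepted  within 30 days."))))
print(len(store))  # 1
```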

Processing Status

Status Indicators:
Status     | Meaning                         | What to Do
Uploading  | File transfer in progress       | Wait
Processing | Extracting and chunking content | Wait (may take 1-5 minutes)
Indexing   | Creating embeddings             | Wait
Ready      | Available for AI use            | Nothing, it works!
Failed     | Processing error occurred       | Retry or check file
Tracking Progress:
  • Real-time status updates on datasource cards
  • Progress percentage for large files
  • Estimated time remaining
  • Error messages for failed processing

Chunking Strategy

What is Chunking? Large documents are split into smaller pieces (chunks) to optimize AI retrieval. Each chunk is typically 500-1000 tokens.

Why Chunking Matters:
  • Relevance: AI can find the exact section needed
  • Performance: Faster search and retrieval
  • Context: Each chunk maintains coherent context
Automatic Optimization: ZappWay automatically determines the best chunk size based on:
  • Document type (PDF, text, web page)
  • Content structure (headings, paragraphs)
  • Information density
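A minimal sketch of the chunking described above: split text into roughly fixed-size pieces with a small overlap so each chunk keeps some context from its neighbor. Sizes here are in words for simplicity; ZappWay's real chunker works in tokens (roughly 500-1000 per chunk) and accounts for structure like headings.

```python
# Fixed-size chunking with overlap (word-based, for illustration).
def chunk_words(text, size=100, overlap=20):
    words = text.split()
    step = size - overlap
    return [" ".join(words[i:i + size])
            for i in range(0, max(len(words) - overlap, 1), step)]

doc = " ".join(f"w{i}" for i in range(250))
chunks = chunk_words(doc, size=100, overlap=20)
print(len(chunks))  # 3 chunks: words 0-99, 80-179, 160-249
```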
What are Embeddings? Embeddings are numerical representations of text that capture semantic meaning. They enable the AI to find relevant information even when exact keywords do not match.

Example:
Query: “How do I reset my password?”
Match: Chunk containing “Password recovery process”
(Even though “reset” is not in the chunk text)
Search Types:
  • Semantic Search: Meaning-based matching (default)
  • Keyword Search: Exact term matching
  • Hybrid Search: Combination of both (best results)
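To make the reset/recovery example concrete, here is a toy illustration of why semantic matching beats exact keywords. A tiny hand-written synonym map stands in for real embeddings; with keyword matching alone, "reset" would never match "recovery".

```python
# Toy "semantic" scorer: a synonym map bridges related words that
# exact keyword matching would miss. Real systems use vector math.
SYNONYMS = {"reset": {"reset", "recover", "recovery"},
            "password": {"password", "credentials"}}

def semantic_score(query, chunk):
    chunk_words = set(chunk.lower().split())
    hits = 0
    for q in query.lower().split():
        related = SYNONYMS.get(q, {q})  # fall back to the word itself
        if related & chunk_words:
            hits += 1
    return hits

chunk = "password recovery process"
print(semantic_score("reset password", chunk))  # 2 (both query words match)
```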

6. Connecting to AI Employees

Why Connect Datastores?

Connecting a datastore to an AI Employee allows that agent to reference the datastore’s knowledge when responding to queries. Without a connection, the AI cannot access the data.

Connection Methods

Method 1: During AI Employee Creation

When creating a new AI Employee:
  1. In the creation form, find “Knowledge” section
  2. Click the knowledge selector
  3. Search and select datastores
  4. Multiple datastores can be connected
  5. Save the AI Employee

Method 2: From AI Employee Settings

For existing AI Employees:
  1. Navigate to AI Employees → Select employee
  2. Go to Settings tab
  3. Find “Knowledge” section
  4. Click “Edit” or knowledge selector
  5. Add/remove datastores
  6. Click “Save”

Method 3: From Integration Setup

When setting up integrations (WhatsApp, Messenger, etc.):
  1. During integration configuration
  2. Select AI Employee
  3. Knowledge selector appears automatically
  4. Choose relevant datastores
  5. Complete integration setup

Multiple Datastore Strategy

Best Practices:

Scenario 1: Specialized AI Employees
  • Support AI: Connect only support-related datastores
  • Sales AI: Connect product catalogs and pricing
  • HR AI: Connect policies and employee handbooks
Scenario 2: Comprehensive Knowledge
  • Connect multiple related datastores to one AI
  • Example: Support AI with product docs + FAQs + troubleshooting guides
Performance Considerations:
  • More datastores = more data to search
  • Keep connections relevant to avoid confusion
  • Typical limit: 5-10 datastores per AI Employee

Viewing Connections

From Datastore Page: Each datastore card shows:
  • Connected AI Employees: Count of agents using this datastore
  • Click to view: List of specific AI Employees
From AI Employee Page: Settings tab displays:
  • Connected Datastores: Full list with names
  • Quick remove: Unlink datastores easily

7. Managing Datastores

Datastore List View

Main Page Display: Each datastore card shows:
  • Name: Datastore title
  • Description: Purpose overview
  • Datasource Count: Number of sources inside
  • Connected AI Employees: How many agents use it
  • Last Updated: Most recent modification
  • Actions: Edit, Delete, View Details

Viewing Datastore Contents

Detail Page: Click on any datastore to see:
  • Overview: Name, description, stats
  • Datasources List: All contained datasources
    • File name/URL
    • Type (PDF, website, text)
    • Status (Ready, Processing, Failed)
    • Size/Length
    • Upload date
    • Actions (View, Delete)
  • Activity Log: Recent changes and updates
  • Connected AI Employees: Full list

Editing Datastores

What Can Be Edited:
  • Name: Update datastore name
  • Description: Modify description
  • Add Datasources: Upload new files or URLs
  • Remove Datasources: Delete outdated sources
  • Reprocess: Trigger re-indexing if needed
Steps:
  1. Click datastore name to open detail page
  2. Click “Edit” button (top-right)
  3. Modify fields as needed
  4. Click “Save Changes”

Updating Content

When to Update:
  • Product information changes
  • New documentation released
  • Policy updates
  • Correcting outdated information
How to Update:

Option 1: Replace Datasource
  1. Delete old datasource
  2. Upload new version
  3. Wait for processing
Option 2: Add New Version
  1. Upload new file with version number
  2. Keep old version for reference (optional)
  3. Remove old version later
Option 3: Reprocess
Some changes to websites can be captured by:
  1. Clicking “Reprocess” on a web datasource
  2. Content is re-crawled and updated

Deleting Datastores

Warning: Deleting a datastore removes all contained datasources and disconnects it from all AI Employees. This action cannot be undone.

Steps:
  1. Navigate to datastore detail page
  2. Click “Delete” button (usually in menu)
  3. Confirm deletion in modal
  4. Datastore and all datasources are permanently removed
Before Deleting:
  • Check which AI Employees are connected
  • Consider archiving instead if data might be needed later
  • Export important datasources if needed

Archiving (If Available)

Some plans support archiving:
  • Temporarily disable a datastore without deleting
  • Disconnects from AI Employees automatically
  • Can be restored later
  • Preserves all datasources and configuration

8. Best Practices

Organizing Datastores

Strategy 1: By Department
Create separate datastores for each team:
  • “Customer Support Knowledge Base”
  • “Sales Resources”
  • “HR Policies”
  • “Engineering Documentation”
Benefits:
  • Clear ownership
  • Easy access control
  • Focused knowledge per team
Strategy 2: By Topic
Organize by subject matter:
  • “Product Information”
  • “Technical Specifications”
  • “Company Policies”
  • “Training Materials”
Benefits:
  • Cross-functional access
  • Logical grouping
  • Easy to find information
Strategy 3: By AI Employee
One datastore per AI Employee:
  • “Support Bot Knowledge”
  • “Sales Bot Resources”
  • “Onboarding Assistant Data”
Benefits:
  • Direct 1:1 mapping
  • Simplified management
  • Clear scope per agent

Content Quality

Document Preparation: Before uploading:
  1. Clean Up: Remove irrelevant sections
  2. Structure: Use clear headings and hierarchy
  3. Format: Ensure text is selectable (not images)
  4. Accuracy: Verify information is current
  5. Completeness: Include all necessary context
Writing for AI:
  • Clear Language: Avoid ambiguous terms
  • Complete Sentences: Full thoughts, not fragments
  • Context: Include background information
  • Examples: Provide concrete examples where possible
  • Consistency: Use consistent terminology

Maintenance Schedule

Daily:
  • Monitor processing status for new uploads
  • Check for failed datasources
Weekly:
  • Review AI Employee response quality
  • Identify knowledge gaps
  • Add missing information
Monthly:
  • Audit all datastores for outdated content
  • Update changed information
  • Remove deprecated datasources
  • Check for duplicate content
Quarterly:
  • Major content review and refresh
  • Reorganize if needed
  • Archive unused datastores
  • Performance optimization

Performance Optimization

Keep Datastores Focused:

Good:
  • “Customer Support - Product A”
  • “Customer Support - Product B”
Avoid:
  • “Everything About Our Company”
Optimal Size:
  • Small Datastore: 10-50 datasources
  • Medium Datastore: 50-200 datasources
  • Large Datastore: 200-1000 datasources
Signs of Too Much Data:
  • Slow AI response times
  • Irrelevant information in responses
  • Difficulty finding specific information
Solutions:
  • Split into multiple focused datastores
  • Remove redundant content
  • Archive old versions

Security and Privacy

Best Practices for Handling Sensitive Information:
  1. Audit Before Upload: Review for confidential data
  2. Redact: Remove personal information, credentials
  3. Access Control: Use appropriate visibility settings
  4. Regular Reviews: Check for exposed sensitive data
What to Avoid Uploading:
  • Personally identifiable information (PII)
  • Passwords or API keys
  • Financial records (unless encrypted)
  • Confidential business strategies
  • Private customer data
Compliance Considerations:
  • GDPR: Ensure lawful processing of personal data
  • HIPAA: Do not upload protected health information
  • PCI DSS: Never upload payment card data
  • Industry-specific: Follow your sector’s regulations

9. Troubleshooting

Common Issues

Datasource Failed to Process

Problem: Status shows “Failed” after upload
Possible Causes:
  • Corrupted file
  • Unsupported format
  • File too large
  • Password-protected document
  • Network timeout
Solutions:
  1. Check File:
    • Open file on your computer
    • Verify it is not corrupted
    • Ensure file is not password-protected
  2. File Size:
    • Check file size (limit: typically 50MB per file)
    • For large files, split into smaller parts
    • Compress PDFs if possible
  3. Format:
    • Verify file extension matches actual format
    • Try converting to a different format
    • Use PDF for best compatibility
  4. Retry:
    • Click “Retry” button
    • Or delete and re-upload
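Two of the checks above (the typical 50 MB cap and "verify file extension matches actual format") can be automated before uploading. This is a local preflight sketch, not a ZappWay feature; the magic-byte signatures listed are standard for their formats.

```python
# Preflight checks before upload: size cap and extension-vs-content.
import os

SIGNATURES = {
    ".pdf": b"%PDF",
    ".docx": b"PK\x03\x04",  # docx/xlsx/pptx are ZIP containers
    ".xlsx": b"PK\x03\x04",
    ".pptx": b"PK\x03\x04",
}
MAX_BYTES = 50 * 1024 * 1024  # typical 50 MB per-file limit

def preflight(filename, data):
    """Return a list of problems found in the file's name and bytes."""
    problems = []
    if len(data) > MAX_BYTES:
        problems.append("file exceeds 50 MB")
    ext = os.path.splitext(filename)[1].lower()
    magic = SIGNATURES.get(ext)
    if magic and not data.startswith(magic):
        problems.append(f"content does not look like {ext}")
    return problems

print(preflight("report.pdf", b"%PDF-1.7 ..."))  # []
print(preflight("report.pdf", b"<html>..."))     # ['content does not look like .pdf']
```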

Website Crawl Failed

Problem: URL datasource shows error after crawling
Possible Causes:
  • Website blocks crawlers
  • Authentication required
  • JavaScript-heavy site
  • Server timeout
  • Invalid URL
Solutions:
  1. Check URL:
    • Verify URL is accessible in browser
    • Ensure no typos
    • Use direct page URL, not redirect
  2. Authentication:
    • Public pages only (no login required)
    • Contact site owner for crawler access
  3. Alternative Methods:
    • Copy content manually as text datasource
    • Use PDF print instead
    • Try different page from same site

AI Not Using Datastore Knowledge

Problem: AI responses do not reference datastore content
Checks:
  1. Connection:
    • Verify datastore is connected to AI Employee
    • Check connection from both sides
  2. Processing Status:
    • Ensure all datasources show “Ready”
    • Wait if still “Processing”
  3. Content Relevance:
    • Confirm question relates to datastore content
    • Try more specific queries
    • Check if information actually exists in datasources
  4. AI Configuration:
    • Verify datastore tool is enabled
    • Check AI Employee instructions do not prevent tool use
Testing: Ask direct questions about specific information you know is in the datastore.
Good test: “What is the return policy?” (if return policy is in datastore)
Bad test: “Tell me everything you know” (too vague)

Duplicate or Conflicting Information

Problem: AI gives inconsistent answers
Cause: Multiple datasources contain different versions of the same information
Solutions:
  1. Identify Duplicates:
    • Review datasources for overlapping content
    • Check upload dates to find old versions
  2. Clean Up:
    • Delete outdated datasources
    • Keep only the most recent version
    • Update descriptions to note version/date
  3. Version Control:
    • Include version numbers in datasource names
    • Example: “Product Manual v2.0 (Jan 2025)”
    • Remove old versions promptly

Slow Processing

Problem: Datasources stuck in “Processing” for an extended time
Expected Times:
  • Small text file: 10-30 seconds
  • PDF (10-50 pages): 1-3 minutes
  • Large PDF (100+ pages): 5-15 minutes
  • Website (single page): 30-60 seconds
  • Website (crawl 50 pages): 5-10 minutes
If Unusually Slow:
  1. Wait: Large files legitimately take time
  2. Refresh: Reload page to check updated status
  3. Check Status: Look for error messages
  4. Support: Contact support if stuck > 30 minutes

📊 Usage Limits

Limits by Plan

Typical Limits:
Plan       | Datastores | Datasources per Datastore | Storage
Free       | 1          | 10                        | 10 MB
Growth     | 5          | 50                        | 100 MB
Pro        | 20         | 200                       | 1 GB
Enterprise | 100        | 1000                      | 10 GB
Ultimate   | Unlimited  | Unlimited                 | 1 TB
Note: Limits may vary. Check your plan details in Settings → Billing.

Checking Usage

Location: Datastores page header
Information Displayed:
  • Current datastore count vs. limit
  • Total storage used vs. limit
  • Warning if approaching limits
Example Alert:
⚠️ Storage Limit Warning (9.2 GB / 10 GB)
You're approaching your storage limit. Consider upgrading or removing unused datasources.
[Upgrade Plan]

What Happens at Limit?

Datastores Limit Reached:
  • Cannot create new datastores
  • Can still add datasources to existing datastores
  • Upgrade prompt displayed
Storage Limit Reached:
  • Cannot upload new files
  • Can still add text datasources
  • Must delete datasources or upgrade
Datasource Limit Reached:
  • Cannot add more datasources to that datastore
  • Can create new datastores (if under datastore limit)
  • Can delete old datasources to free slots

🔗 Integration with AI Employees

Datastore Tool

When you connect a datastore to an AI Employee, the Datastore Tool is automatically enabled. This tool allows the AI to search and retrieve information from connected datastores.

How the AI Uses Datastores

Process:
  1. User asks question → AI Employee receives message
  2. AI determines if question requires datastore knowledge
  3. Tool is called → Datastore tool searches connected datastores
  4. Relevant chunks returned → Top matching content pieces
  5. AI synthesizes → Combines search results with its knowledge
  6. Response generated → Answer based on your data
Example Flow:
User: "What is the return policy?"

AI: [Searches datastore for "return policy"]

Datastore: [Returns relevant policy text]

AI: "According to our policy, you can return items within 30 days..."
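The six-step flow above can be reduced to a toy retrieval loop: score each chunk against the query by word overlap, take the best match, and hand it to the model. Real semantic search uses embeddings rather than overlap, and the answer step stands in for the LLM; this only illustrates the shape of the process.

```python
# Toy retrieval: rank chunks by query-word overlap, return the top hit.
import re

def tokens(text):
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def search(chunks, query, top_k=1):
    q = tokens(query)
    return sorted(chunks, key=lambda c: -len(q & tokens(c)))[:top_k]

chunks = [
    "Return policy: items may be returned within 30 days of purchase.",
    "Shipping takes 3 to 5 business days.",
]
best = search(chunks, "What is the return policy?")[0]
print(best)  # the "Return policy" chunk wins on overlap
```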

Customizing Datastore Behavior

In AI Employee Settings: You can configure:
  • Search Threshold: How closely content must match query
  • Max Results: How many chunks to return
  • Priority: Which datastores to search first
  • Fallback: What to do if no relevant content found
Advanced Prompting: Include instructions in the AI Employee’s system prompt.

Example:
When answering questions, always check the datastore first. 
Only use your general knowledge if the datastore does not 
contain relevant information. Always cite the source document 
when referencing datastore content.

📞 Support & Resources

Getting Help

In-App Support:
  • Help button in dashboard
  • Live chat (available on Pro+ plans)
  • Documentation center

Feedback

Report Issues:
  • Use feedback button in dashboard
  • Email: [email protected]
  • Include:
    • Datastore ID
    • Datasource name
    • Error message (if any)
    • Steps to reproduce
Feature Requests:
  • Submit via in-app feedback
  • Community forum
  • Feature voting board

✅ Quick Reference

Essential Actions

Task              | Location             | Action
Create Datastore  | Datastores page      | “Create Datastore”
Add File          | Datastore detail     | “Add Datasource” → “Upload”
Add URL           | Datastore detail     | “Add Datasource” → “Website”
Connect to AI     | AI Employee settings | Knowledge selector
View Contents     | Datastore card       | Click name
Delete Datasource | Datasource card      | Delete button
Reprocess         | Datasource actions   | “Reprocess”

Processing Status Guide

🔵 Uploading    → File transfer in progress
🟡 Processing   → Extracting content
🟠 Indexing     → Creating embeddings
🟢 Ready        → Available for use
🔴 Failed       → Error occurred

Best File Formats

Priority Order:
  1. PDF - Best for documents
  2. .docx - Good for text documents
  3. .txt/.md - Simple text content
  4. .csv - Structured data
  5. URL - Web content

Last Updated: January 2025
Version: 1.0
Platform: ZappWay Datastores