Important: Datastores are the foundation of intelligent AI responses. By connecting relevant data sources to your AI Employees, you enable them to provide accurate, context-aware answers based on your organization’s specific information.
🔢 Table of Contents
- What is a Datastore?
- Creating Your First Datastore
- Adding Datasources
- Supported Data Types
- Processing and Indexing
- Connecting to AI Employees
- Managing Datastores
- Best Practices
- Troubleshooting
1. What is a Datastore?
Overview
A Datastore is a collection of data sources that provides knowledge to your AI Employees. Think of it as a knowledge base that your AI can reference when answering questions or performing tasks.
Key Concepts:
- Datastore: A container that groups related datasources together
- Datasource: Individual pieces of content (documents, websites, text files, etc.)
- Chunks: Small segments of processed data optimized for AI retrieval
- Embeddings: Vector representations of your data used for semantic search
Why Use Datastores?
Benefits:
- Accuracy: AI responses based on your actual data, not general knowledge
- Context: Provide domain-specific information unique to your business
- Control: Manage exactly what information the AI can access
- Updates: Keep knowledge current by updating datasources
- Organization: Group related information logically by department, topic, or purpose
Common Use Cases:
- Product documentation and manuals
- Company policies and procedures
- FAQ databases
- Technical specifications
- Customer support knowledge bases
- Training materials
- Legal documents
- Marketing content
2. Creating Your First Datastore
Quick Start
Steps:
- Navigate to Datastores in the sidebar (or visit /datastores)
- Click “Create Datastore” button
- Fill in the creation form:
- Name (required): Descriptive name for the datastore
- Description (optional): Purpose and contents overview
- Click “Create”
- Your new datastore appears in the list
Naming Best Practices
Good Examples:
- “Customer Support Documentation”
- “Product Catalog 2025”
- “HR Policies and Procedures”
- “Technical API Reference”
- “Sales Training Materials”
Avoid:
- Generic names like “Datastore 1” or “Test”
- Unclear abbreviations
- Names without context
Datastore Settings
Configuration Options: After creation, you can configure:
- Name: Update the datastore name
- Description: Add or modify description
- Visibility: Control who can access (if applicable)
- Connected AI Employees: View which agents use this datastore
3. Adding Datasources
What is a Datasource?
A Datasource is an individual piece of content within a datastore. Each datastore can contain multiple datasources of different types.
How to Add Datasources
Location: Inside any datastore detail page
Methods:
Method 1: File Upload
- Click “Add Datasource” or “Upload Files”
- Select “File Upload” option
- Choose files from your computer:
- Single file upload
- Multiple files (batch upload)
- Drag and drop support
- Files are uploaded and processed automatically
Supported formats:
- PDF documents
- Word documents (.docx, .doc)
- Text files (.txt)
- Markdown files (.md)
- CSV files
- Excel spreadsheets (.xlsx, .xls)
- PowerPoint presentations (.pptx, .ppt)
Method 2: Website/URL
- Click “Add Datasource”
- Select “Website” option
- Enter the URL of the webpage
- Configure crawling options:
- Single page: Import only the specified URL
- Crawl site: Follow links and import multiple pages
- Max depth: How many levels deep to crawl
- URL patterns: Include/exclude specific paths
- Click “Add” to start import
Example URLs:
- Documentation sites: https://docs.example.com
- Blog posts: https://blog.example.com/article
- Product pages: https://shop.example.com/products
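As a rough sketch of how the crawling options above might combine, the hypothetical helper below applies max depth plus include/exclude URL patterns to each discovered link (the function name and parameters are illustrative, not ZappWay's actual API):

```python
from urllib.parse import urlparse

def should_crawl(url: str, depth: int, max_depth: int,
                 include: list[str], exclude: list[str]) -> bool:
    """Decide whether a discovered link should be imported."""
    if depth > max_depth:                      # max depth: stop past this level
        return False
    path = urlparse(url).path
    if any(pat in path for pat in exclude):    # exclude patterns always win
        return False
    # If include patterns are given, at least one must match the path.
    return not include or any(pat in path for pat in include)
```

For example, with max depth 2, include `/guides`, and exclude `/archive`, a link to `https://docs.example.com/guides/setup` found at depth 1 would be kept.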
Method 3: Text Input
- Click “Add Datasource”
- Select “Text” option
- Paste or type content directly
- Give it a descriptive name
- Click “Save”
Best for:
- Quick FAQ entries
- Policy snippets
- Short reference materials
- Temporary information
Method 4: Integrations (Coming Soon)
Future support for:
- Google Drive folders
- Notion databases
- Confluence spaces
- GitHub repositories
- SharePoint sites
Batch Operations
Upload Multiple Files:
- Select multiple files in the upload dialog
- All files are queued for processing
- Track progress for each file individually
- Failed uploads can be retried
Batch URL Import:
- Enable “Bulk URL Import”
- Paste multiple URLs (one per line)
- Configure shared settings for all URLs
- Start batch import
4. Supported Data Types
Documents
File Formats:
| Type | Extensions | Notes |
|---|---|---|
| PDF | .pdf | Text extraction, OCR for scanned docs |
| Word | .docx, .doc | Full formatting preserved |
| Text | .txt, .md | Plain text and Markdown |
| Spreadsheets | .xlsx, .xls, .csv | Table data extracted |
| Presentations | .pptx, .ppt | Slide content and notes |
Processing Features:
- Text Extraction: Automatic extraction from all formats
- OCR: Optical character recognition for image-based PDFs
- Table Parsing: Structured data from spreadsheets
- Metadata Extraction: Author, creation date, titles
Web Content
Supported:
- Public websites (HTML pages)
- Blog posts and articles
- Documentation sites
- Product pages
- Knowledge base articles
Crawling Features:
- Smart Crawling: Follow internal links automatically
- Content Cleaning: Remove navigation, ads, footers
- JavaScript Rendering: Support for dynamic content
- Sitemap Support: Import via sitemap.xml
Limitations:
- Cannot access password-protected sites
- Rate limiting may apply for large crawls
- Some dynamic content may not render perfectly
Structured Data
CSV/Excel Processing:
- Each row can become a separate datasource
- Column headers used for context
- Numeric data preserved
- Support for large datasets (up to 100,000 rows)
JSON Support:
- Import structured JSON files
- Nested objects supported
- Array elements handled intelligently
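The row-per-datasource behavior can be pictured with a short sketch: each CSV row becomes a small named text record, with column headers supplying context. `rows_to_datasources` is a hypothetical helper, not part of the platform:

```python
import csv
import io

def rows_to_datasources(csv_text: str) -> list[dict]:
    """Turn each CSV row into a named text record, using headers for context."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        {"name": f"row-{i}",
         "text": "; ".join(f"{col}: {val}" for col, val in row.items())}
        for i, row in enumerate(reader, start=1)
    ]
```

A row `A1,9.99` under headers `sku,price` would become the text `sku: A1; price: 9.99`, which is far easier for semantic search to match than a bare cell value.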
Raw Text
Use Cases:
- FAQ entries
- Policy statements
- Product descriptions
- Quick reference materials
Formatting:
- Markdown supported for rich formatting
- Plain text for simple content
- HTML can be pasted directly
5. Processing and Indexing
How Processing Works
When you add a datasource, ZappWay automatically:
- Extracts: Pulls text content from files/URLs
- Cleans: Removes irrelevant elements (ads, navigation)
- Chunks: Splits content into optimal-sized segments
- Embeds: Creates vector representations for semantic search
- Indexes: Stores in a searchable database
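The five steps above can be sketched as plain function composition. Everything here is a simplified stand-in (real extraction, cleaning, and embedding are far more involved):

```python
def extract(raw: str) -> str:
    # Stand-in: real extraction would parse PDFs, DOCX, or HTML.
    return raw

def clean(text: str) -> str:
    # Minimal cleanup: collapse runs of whitespace.
    return " ".join(text.split())

def chunk(text: str, size: int = 50) -> list[str]:
    # Fixed-size word windows; production systems chunk by tokens and structure.
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def embed(segment: str) -> list[float]:
    # Stand-in vector; a real system calls an embedding model here.
    return [float(len(segment))]

def ingest(raw: str) -> list[dict]:
    """Extract -> clean -> chunk -> embed; indexing would store these records."""
    return [{"chunk": c, "embedding": embed(c)} for c in chunk(clean(extract(raw)))]
```

The point of the sketch is the ordering: chunking happens after cleaning, and every chunk gets its own embedding before anything is indexed.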
Processing Status
Status Indicators:
| Status | Meaning | What to Do |
|---|---|---|
| Uploading | File transfer in progress | Wait |
| Processing | Extracting and chunking content | Wait (may take 1-5 minutes) |
| Indexing | Creating embeddings | Wait |
| Ready | Available for AI use | Nothing, it works! |
| Failed | Processing error occurred | Retry or check file |
Monitoring:
- Real-time status updates on datasource cards
- Progress percentage for large files
- Estimated time remaining
- Error messages for failed processing
Chunking Strategy
What is Chunking? Large documents are split into smaller pieces (chunks) to optimize AI retrieval. Each chunk is typically 500-1000 tokens.
Why Chunking Matters:
- Relevance: AI can find the exact section needed
- Performance: Faster search and retrieval
- Context: Each chunk maintains coherent context
Chunk size adapts to:
- Document type (PDF, text, web page)
- Content structure (headings, paragraphs)
- Information density
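A minimal paragraph-aware chunker illustrates the idea, assuming a crude one-word-per-token estimate; real chunkers use proper tokenizers and respect headings as well:

```python
def chunk_text(text: str, max_tokens: int = 800) -> list[str]:
    """Pack whole paragraphs into chunks of roughly max_tokens each.

    Token estimate is crude (1 token ~ 1 word); a production chunker
    would count real tokens and also split on document structure.
    """
    chunks: list[str] = []
    current: list[str] = []
    count = 0
    for para in text.split("\n\n"):
        words = len(para.split())
        if current and count + words > max_tokens:
            chunks.append("\n\n".join(current))   # flush before overflowing
            current, count = [], 0
        current.append(para)
        count += words
    if current:
        chunks.append("\n\n".join(current))
    return chunks
```

Keeping paragraph boundaries intact is what preserves the "coherent context" per chunk mentioned above.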
Embeddings and Search
What are Embeddings? Embeddings are numerical representations of text that capture semantic meaning. They enable the AI to find relevant information even when exact keywords do not match.
Example:
- Query: “How do I reset my password?”
- Match: Chunk containing “Password recovery process” (even though “reset” is not in the chunk text)
Search Types:
- Semantic Search: Meaning-based matching (default)
- Keyword Search: Exact term matching
- Hybrid Search: Combination of both (best results)
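Semantic matching ultimately reduces to comparing embedding vectors. The sketch below ranks chunks by cosine similarity over precomputed toy vectors; in practice the vectors would come from an embedding model, and the names here are illustrative:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors (assumes non-zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def top_chunks(query_vec: list[float],
               chunk_vecs: dict[str, list[float]], k: int = 3) -> list[str]:
    """Return the k chunk names whose vectors sit closest to the query vector."""
    ranked = sorted(chunk_vecs.items(),
                    key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [name for name, _ in ranked[:k]]
```

Because ranking happens in vector space, a query about "reset my password" can land on a chunk about "password recovery" even with zero keyword overlap.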
6. Connecting to AI Employees
Why Connect Datastores?
Connecting a datastore to an AI Employee allows that agent to reference the datastore’s knowledge when responding to queries. Without a connection, the AI cannot access the data.
Connection Methods
Method 1: During AI Employee Creation
When creating a new AI Employee:
- In the creation form, find “Knowledge” section
- Click the knowledge selector
- Search and select datastores
- Multiple datastores can be connected
- Save the AI Employee
Method 2: From AI Employee Settings
For existing AI Employees:
- Navigate to AI Employees → Select employee
- Go to Settings tab
- Find “Knowledge” section
- Click “Edit” or knowledge selector
- Add/remove datastores
- Click “Save”
Method 3: From Integration Setup
When setting up integrations (WhatsApp, Messenger, etc.):
- During integration configuration
- Select AI Employee
- Knowledge selector appears automatically
- Choose relevant datastores
- Complete integration setup
Multiple Datastore Strategy
Best Practices:
Scenario 1: Specialized AI Employees
- Support AI: Connect only support-related datastores
- Sales AI: Connect product catalogs and pricing
- HR AI: Connect policies and employee handbooks
Scenario 2: Combined Knowledge
- Connect multiple related datastores to one AI
- Example: Support AI with product docs + FAQs + troubleshooting guides
Performance Considerations:
- More datastores = more data to search
- Keep connections relevant to avoid confusion
- Typical limit: 5-10 datastores per AI Employee
Viewing Connections
From Datastore Page: Each datastore card shows:
- Connected AI Employees: Count of agents using this datastore
- Click to view: List of specific AI Employees
From AI Employee Page:
- Connected Datastores: Full list with names
- Quick remove: Unlink datastores easily
7. Managing Datastores
Datastore List View
Main Page Display: Each datastore card shows:
- Name: Datastore title
- Description: Purpose overview
- Datasource Count: Number of sources inside
- Connected AI Employees: How many agents use it
- Last Updated: Most recent modification
- Actions: Edit, Delete, View Details
Viewing Datastore Contents
Detail Page: Click on any datastore to see:
- Overview: Name, description, stats
- Datasources List: All contained datasources, each showing:
- File name/URL
- Type (PDF, website, text)
- Status (Ready, Processing, Failed)
- Size/Length
- Upload date
- Actions (View, Delete)
- Activity Log: Recent changes and updates
- Connected AI Employees: Full list
Editing Datastores
What Can Be Edited:
- Name: Update datastore name
- Description: Modify description
- Add Datasources: Upload new files or URLs
- Remove Datasources: Delete outdated sources
- Reprocess: Trigger re-indexing if needed
How to Edit:
- Click datastore name to open detail page
- Click “Edit” button (top-right)
- Modify fields as needed
- Click “Save Changes”
Updating Content
When to Update:
- Product information changes
- New documentation released
- Policy updates
- Correcting outdated information
Update Method 1: Replace
- Delete old datasource
- Upload new version
- Wait for processing
Update Method 2: Versioning
- Upload new file with version number
- Keep old version for reference (optional)
- Remove old version later
Update Method 3: Re-crawl (websites)
- Click “Reprocess” on a web datasource
- Content is re-crawled and updated
Deleting Datastores
Warning: Deleting a datastore removes all contained datasources and disconnects it from all AI Employees. This action cannot be undone.
Steps:
- Navigate to datastore detail page
- Click “Delete” button (usually in menu)
- Confirm deletion in modal
- Datastore and all datasources are permanently removed
Before Deleting:
- Check which AI Employees are connected
- Consider archiving instead if data might be needed later
- Export important datasources if needed
Archiving (If Available)
Some plans support archiving:
- Temporarily disable a datastore without deleting
- Disconnects from AI Employees automatically
- Can be restored later
- Preserves all datasources and configuration
8. Best Practices
Organizing Datastores
Strategy 1: By Department
Create separate datastores for each team:
- “Customer Support Knowledge Base”
- “Sales Resources”
- “HR Policies”
- “Engineering Documentation”
Benefits:
- Clear ownership
- Easy access control
- Focused knowledge per team
Strategy 2: By Topic
- “Product Information”
- “Technical Specifications”
- “Company Policies”
- “Training Materials”
Benefits:
- Cross-functional access
- Logical grouping
- Easy to find information
Strategy 3: By AI Employee
- “Support Bot Knowledge”
- “Sales Bot Resources”
- “Onboarding Assistant Data”
Benefits:
- Direct 1:1 mapping
- Simplified management
- Clear scope per agent
Content Quality
Document Preparation: Before uploading:
- Clean Up: Remove irrelevant sections
- Structure: Use clear headings and hierarchy
- Format: Ensure text is selectable (not images)
- Accuracy: Verify information is current
- Completeness: Include all necessary context
Writing Style:
- Clear Language: Avoid ambiguous terms
- Complete Sentences: Full thoughts, not fragments
- Context: Include background information
- Examples: Provide concrete examples where possible
- Consistency: Use consistent terminology
Maintenance Schedule
Daily:
- Monitor processing status for new uploads
- Check for failed datasources
Weekly:
- Review AI Employee response quality
- Identify knowledge gaps
- Add missing information
Monthly:
- Audit all datastores for outdated content
- Update changed information
- Remove deprecated datasources
- Check for duplicate content
Quarterly:
- Major content review and refresh
- Reorganize if needed
- Archive unused datastores
- Performance optimization
Performance Optimization
Keep Datastores Focused:
✅ Good:
- “Customer Support - Product A”
- “Customer Support - Product B”
❌ Avoid:
- “Everything About Our Company”
Size Guidelines:
- Small Datastore: 10-50 datasources
- Medium Datastore: 50-200 datasources
- Large Datastore: 200-1000 datasources
Signs a Datastore Is Too Large:
- Slow AI response times
- Irrelevant information in responses
- Difficulty finding specific information
Solutions:
- Split into multiple focused datastores
- Remove redundant content
- Archive old versions
Security and Privacy
Sensitive Information: Best Practices:
- Audit Before Upload: Review for confidential data
- Redact: Remove personal information, credentials
- Access Control: Use appropriate visibility settings
- Regular Reviews: Check for exposed sensitive data
Never Upload:
- Personally identifiable information (PII)
- Passwords or API keys
- Financial records (unless encrypted)
- Confidential business strategies
- Private customer data
Compliance:
- GDPR: Ensure lawful processing of personal data
- HIPAA: Do not upload protected health information
- PCI DSS: Never upload payment card data
- Industry-specific: Follow your sector’s regulations
9. Troubleshooting
Common Issues
Datasource Failed to Process
Problem: Status shows “Failed” after upload
Possible Causes:
- Corrupted file
- Unsupported format
- File too large
- Password-protected document
- Network timeout
Solutions:
1. Check File:
   - Open file on your computer
   - Verify it is not corrupted
   - Ensure file is not password-protected
2. File Size:
   - Check file size (limit: typically 50MB per file)
   - For large files, split into smaller parts
   - Compress PDFs if possible
3. Format:
   - Verify file extension matches actual format
   - Try converting to a different format
   - Use PDF for best compatibility
4. Retry:
   - Click “Retry” button
   - Or delete and re-upload
Website Crawl Failed
Problem: URL datasource shows error after crawling
Possible Causes:
- Website blocks crawlers
- Authentication required
- JavaScript-heavy site
- Server timeout
- Invalid URL
Solutions:
1. Check URL:
   - Verify URL is accessible in a browser
   - Ensure no typos
   - Use the direct page URL, not a redirect
2. Authentication:
   - Public pages only (no login required)
   - Contact the site owner for crawler access
3. Alternative Methods:
   - Copy content manually as a text datasource
   - Use PDF print instead
   - Try a different page from the same site
AI Not Using Datastore Knowledge
Problem: AI responses do not reference datastore content
Checks:
1. Connection:
   - Verify the datastore is connected to the AI Employee
   - Check the connection from both sides
2. Processing Status:
   - Ensure all datasources show “Ready”
   - Wait if still “Processing”
3. Content Relevance:
   - Confirm the question relates to datastore content
   - Try more specific queries
   - Check if the information actually exists in the datasources
4. AI Configuration:
   - Verify the datastore tool is enabled
   - Check that AI Employee instructions do not prevent tool use
Test with specific questions the datastore should answer. Bad test: “Tell me everything you know” (too vague)
Duplicate or Conflicting Information
Problem: AI gives inconsistent answers
Cause: Multiple datasources contain different versions of the same information
Solutions:
1. Identify Duplicates:
   - Review datasources for overlapping content
   - Check upload dates to find old versions
2. Clean Up:
   - Delete outdated datasources
   - Keep only the most recent version
   - Update descriptions to note version/date
3. Version Control:
   - Include version numbers in datasource names
   - Example: “Product Manual v2.0 (Jan 2025)”
   - Remove old versions promptly
Slow Processing
Problem: Datasources stuck in “Processing” for extended time
Expected Times:
- Small text file: 10-30 seconds
- PDF (10-50 pages): 1-3 minutes
- Large PDF (100+ pages): 5-15 minutes
- Website (single page): 30-60 seconds
- Website (crawl 50 pages): 5-10 minutes
What to Do:
- Wait: Large files legitimately take time
- Refresh: Reload page to check updated status
- Check Status: Look for error messages
- Support: Contact support if stuck > 30 minutes
📊 Usage Limits
Limits by Plan
Typical Limits:
| Plan | Datastores | Datasources per Datastore | Storage |
|---|---|---|---|
| Free | 1 | 10 | 10 MB |
| Growth | 5 | 50 | 100 MB |
| Pro | 20 | 200 | 1 GB |
| Enterprise | 100 | 1000 | 10 GB |
| Ultimate | Unlimited | Unlimited | 1 TB |
Checking Usage
Location: Datastores page header
Information Displayed:
- Current datastore count vs. limit
- Total storage used vs. limit
- Warning if approaching limits
What Happens at Limit?
Datastores Limit Reached:
- Cannot create new datastores
- Can still add datasources to existing datastores
- Upgrade prompt displayed
Storage Limit Reached:
- Cannot upload new files
- Can still add text datasources
- Must delete datasources or upgrade
Datasources per Datastore Limit Reached:
- Cannot add more datasources to that datastore
- Can create new datastores (if under datastore limit)
- Can delete old datasources to free slots
🔗 Integration with AI Employees
Datastore Tool
When you connect a datastore to an AI Employee, the Datastore Tool is automatically enabled. This tool allows the AI to search and retrieve information from connected datastores.
How the AI Uses Datastores
Process:
- User asks question → AI Employee receives message
- AI determines if question requires datastore knowledge
- Tool is called → Datastore tool searches connected datastores
- Relevant chunks returned → Top matching content pieces
- AI synthesizes → Combines search results with its knowledge
- Response generated → Answer based on your data
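The process above can be sketched as a retrieve-then-synthesize loop; `search` and `llm` below are stand-ins for the datastore tool and the language model, not real platform APIs:

```python
def answer(question: str, search, llm) -> str:
    """Hypothetical retrieve-then-synthesize flow.

    `search` plays the datastore tool (returns top matching chunks);
    `llm` plays the language model that writes the final reply.
    """
    chunks = search(question)                      # search connected datastores
    context = "\n---\n".join(chunks)               # top matching content pieces
    prompt = (f"Answer using this context:\n{context}\n\n"
              f"Question: {question}")
    return llm(prompt)                             # synthesize the response
```

The key design point is that the model never sees whole datastores, only the top-ranked chunks packed into its prompt alongside the user's question.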
Customizing Datastore Behavior
In AI Employee Settings: You can configure:
- Search Threshold: How closely content must match query
- Max Results: How many chunks to return
- Priority: Which datastores to search first
- Fallback: What to do if no relevant content found
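How a search threshold and max-results cap might interact can be sketched as a filter over scored results (parameter names and the 0.75 default are illustrative, not ZappWay's actual settings):

```python
def filter_results(scored: list[tuple[str, float]],
                   threshold: float = 0.75, max_results: int = 5) -> list[str]:
    """Keep chunks scoring at or above threshold, best first, capped at max_results."""
    kept = [(chunk, score) for chunk, score in scored if score >= threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return [chunk for chunk, _ in kept[:max_results]]
```

Raising the threshold trims marginal matches (fewer, more precise results); raising max results widens the context the AI sees at the cost of noise.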
📞 Support & Resources
Getting Help
In-App Support:
- Help button in dashboard
- Live chat (available on Pro+ plans)
- Documentation center
Feedback
Report Issues:
- Use feedback button in dashboard
- Email: [email protected]
- Include:
- Datastore ID
- Datasource name
- Error message (if any)
- Steps to reproduce
Feature Requests:
- Submit via in-app feedback
- Community forum
- Feature voting board
✅ Quick Reference
Essential Actions
| Task | Location | Action |
|---|---|---|
| Create Datastore | Datastores page | “Create Datastore” |
| Add File | Datastore detail | “Add Datasource” → “Upload” |
| Add URL | Datastore detail | “Add Datasource” → “Website” |
| Connect to AI | AI Employee settings | Knowledge selector |
| View Contents | Datastore card | Click name |
| Delete Datasource | Datasource card | Delete button |
| Reprocess | Datasource actions | “Reprocess” |
Best File Formats
Priority Order:
- PDF - Best for documents
- .docx - Good for text documents
- .txt/.md - Simple text content
- .csv - Structured data
- URL - Web content
Last Updated: January 2025
Version: 1.0
Platform: ZappWay Datastores

