Note: A Datasource is a single piece of content (file, URL, or text) that exists within a Datastore. Think of datastores as folders and datasources as the individual files inside them.

🔢 Table of Contents

  1. Understanding Datasources
  2. Types of Datasources
  3. Adding Datasources
  4. Managing Datasources
  5. Viewing and Editing
  6. Best Practices

1. Understanding Datasources

What is a Datasource?

A Datasource is an individual unit of content that contributes to your knowledge base. Each datasource contains information that your AI Employees can reference when responding to queries.

Relationship:

Organization
└─ Datastore (e.g., “Customer Support Docs”)
   ├─ Datasource 1 (e.g., “Product Manual.pdf”)
   ├─ Datasource 2 (e.g., “FAQ Page”)
   └─ Datasource 3 (e.g., “Return Policy”)

Datasource vs. Datastore

Datasource | Datastore
Individual piece of content | Collection of datasources
Single file, URL, or text entry | Container for multiple datasources
Example: “Setup Guide.pdf” | Example: “Product Documentation”

2. Types of Datasources

File-Based Datasources

Upload documents from your computer.
Supported Formats:
  • PDF (.pdf)
  • Word (.docx, .doc)
  • Text (.txt)
  • Markdown (.md)
  • Spreadsheets (.xlsx, .xls, .csv)
  • Presentations (.pptx, .ppt)
Characteristics:
  • Stored permanently in ZappWay
  • Processed once upon upload
  • Can be downloaded later
  • Versioning through re-upload
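
Before a batch upload, it can help to check filenames against the supported formats listed above. The following is a minimal sketch in Python, not ZappWay's own validation: the function names `is_supported` and `split_by_support` are hypothetical, and the only facts taken from this page are the extensions themselves.

```python
from pathlib import Path

# Extensions copied from the "Supported Formats" list on this page.
SUPPORTED_EXTENSIONS = {
    ".pdf", ".docx", ".doc", ".txt", ".md",
    ".xlsx", ".xls", ".csv", ".pptx", ".ppt",
}


def is_supported(filename: str) -> bool:
    """Return True if the file's final extension is in the supported list."""
    # Path.suffix returns only the last extension, so "v2.1_Jan2025.pdf" -> ".pdf".
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


def split_by_support(filenames):
    """Partition filenames into (supported, unsupported) lists, keeping order."""
    supported = [name for name in filenames if is_supported(name)]
    unsupported = [name for name in filenames if not is_supported(name)]
    return supported, unsupported
```

For example, `split_by_support(["a.csv", "b.exe"])` returns `(["a.csv"], ["b.exe"])`. The check looks only at the extension; it says nothing about whether the file's text is selectable or within the size limit.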

URL-Based Datasources

Import content from websites.
Types:
  • Single web page
  • Blog article
  • Documentation page
  • Product page
  • Knowledge base article
Characteristics:
  • Content snapshot taken at import time
  • Can be refreshed/reprocessed
  • Does not auto-update (manual refresh needed)
  • Public pages only (no authentication)

Text-Based Datasources

Manually entered content.
Use Cases:
  • Quick FAQ answers
  • Short policies
  • Custom instructions
  • Temporary information
  • Snippets and notes
Characteristics:
  • Fastest to create
  • Easy to edit inline
  • No file upload needed
  • Ideal for short content

3. Adding Datasources

From Datastore Detail Page

Steps:
  1. Navigate to desired datastore
  2. Click “Add Datasource” button
  3. Choose datasource type:
    • File Upload
    • Website URL
    • Text Entry
  4. Follow type-specific steps below

Adding Files

Single File Upload:
  1. Select “File Upload”
  2. Click “Choose File” or drag-and-drop
  3. Select file from your computer
  4. File uploads automatically
  5. Processing begins immediately
Multiple File Upload:
  1. Select “File Upload”
  2. Click “Choose Files” or drag multiple files
  3. All files are queued
  4. Each processes independently
  5. Track individual progress
Supported Operations:
  • Drag and Drop: Drag files directly onto upload area
  • Batch Upload: Select multiple files at once
  • Progress Tracking: Real-time upload and processing status

Adding Websites

Single URL:
  1. Select “Website”
  2. Enter full URL (including https://)
  3. Choose import mode:
    • Single Page: Import only this URL
    • Crawl Site: Follow links automatically
  4. Configure options (if crawling):
    • Max pages to crawl
    • URL patterns to include/exclude
  5. Click “Add”
Crawl Options Explained:
  • Max Depth: How many clicks deep to follow links
    • Depth 0: Only the page you specified
    • Depth 1: Specified page + linked pages
    • Depth 2: Above + pages linked from those pages
  • URL Patterns: Control which pages to import
    • Include: /docs/* (only documentation pages)
    • Exclude: /blog/* (skip blog posts)
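
To make the depth and pattern options above concrete, here is a small Python sketch that simulates a crawl over a made-up link graph. It is an illustration under assumptions, not ZappWay's crawler: the `LINKS` graph, the `allowed` and `crawl` functions, and the choice to use shell-style wildcard matching (`fnmatch`) for patterns are all hypothetical. It also assumes the starting page is always imported and that filtered-out pages are not followed further.

```python
from collections import deque
from fnmatch import fnmatch

# Hypothetical site: each path maps to the paths it links to.
LINKS = {
    "/docs/": ["/docs/setup", "/blog/news"],
    "/docs/setup": ["/docs/setup/advanced"],
    "/blog/news": [],
    "/docs/setup/advanced": [],
}


def allowed(path, include=None, exclude=None):
    """Apply exclude patterns first, then include patterns (if any)."""
    if exclude and any(fnmatch(path, pattern) for pattern in exclude):
        return False
    if include:
        return any(fnmatch(path, pattern) for pattern in include)
    return True


def crawl(start, max_depth, include=None, exclude=None):
    """Breadth-first walk that stops following links at max_depth."""
    seen = {start}
    queue = deque([(start, 0)])
    imported = []
    while queue:
        path, depth = queue.popleft()
        imported.append(path)
        if depth == max_depth:
            continue  # do not follow links from pages at the depth limit
        for target in LINKS.get(path, []):
            if target not in seen and allowed(target, include, exclude):
                seen.add(target)
                queue.append((target, depth + 1))
    return imported
```

With this graph, `crawl("/docs/", 0)` imports only `/docs/`, `crawl("/docs/", 1)` adds `/docs/setup` and `/blog/news`, and `crawl("/docs/", 2, exclude=["/blog/*"])` imports the three `/docs/` pages while skipping the blog post.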

Adding Text

Manual Entry:
  1. Select “Text Entry”
  2. Enter a descriptive name
  3. Type or paste content in text area
  4. Optionally use Markdown formatting
  5. Click “Save”
Formatting Options:
  • Plain Text: No formatting
  • Markdown: Headings, lists, bold, italic
  • HTML: Rich formatting (advanced)
Best Practices:
  • Keep text entries focused (single topic)
  • Use clear headings for structure
  • Include relevant keywords
  • Provide complete context

4. Managing Datasources

Datasource List View

Information Displayed:
Each datasource card shows:
  • Type Icon: File, URL, or text indicator
  • Name: Datasource title or filename
  • Status: Processing state
  • Size/Length: File size or character count
  • Date Added: Upload/creation timestamp
  • Actions Menu: View, edit, delete options

Datasource Status

Status Types:
Status | Icon | Meaning
Ready | 🟢 | Processed and available
Processing | 🟡 | Currently being indexed
Failed | 🔴 | Error during processing
Uploading | 🔵 | File transfer in progress

Viewing Datasource Details

Click on any datasource to see:
  • Full Content Preview: Read the processed text
  • Metadata:
    • Original filename (if applicable)
    • Source URL (if applicable)
    • Upload date and time
    • File size
    • Character/word count
  • Processing Info:
    • Number of chunks created
    • Processing time
    • Status history
  • Usage Stats:
    • Number of times referenced by AI
    • Connected datastores
    • Last accessed timestamp

Editing Datasources

What Can Be Edited:
  • Name: Update the datasource name
  • Text Content: Modify text-based datasources
  • URL: Update website address (triggers reprocess)
What Cannot Be Edited:
  • File content (must re-upload)
  • Processing settings
  • Creation date
Steps to Edit:
  1. Click datasource name or “Edit” button
  2. Modify allowed fields
  3. Click “Save Changes”
  4. Reprocessing occurs if content changed

Reprocessing Datasources

When to Reprocess:
  • Website content has been updated
  • Need to refresh URL-based datasource
  • Processing failed initially
  • Want to apply new chunking strategy
How to Reprocess:
  1. Open datasource details
  2. Click “Reprocess” button
  3. Confirm action
  4. Wait for processing to complete
Note: Original content is fetched again for URL datasources.

Deleting Datasources

Warning: Deletion is permanent and cannot be undone. The datasource is removed from all connected AI Employees immediately.
Steps:
  1. Locate datasource in list
  2. Click “Delete” button (trash icon)
  3. Confirm deletion in modal
  4. Datasource is removed permanently
Bulk Delete:
  1. Select multiple datasources (checkboxes)
  2. Click “Delete Selected”
  3. Confirm batch deletion
  4. All selected datasources are removed

5. Viewing and Editing

Content Preview

Available for All Types:
  • Processed Text: View extracted content
  • Original Format: Download source file (if applicable)
  • Chunks View: See how content was segmented
Preview Features:
  • Search: Find keywords within datasource
  • Highlight: Keyword highlighting in content
  • Copy: Copy sections of text
  • Download: Download original file

Chunk Viewer

What Are Chunks?
Chunks are the segments your datasource is divided into for AI retrieval. Each chunk is optimized for semantic search.
Viewing Chunks:
  1. Open datasource details
  2. Click “View Chunks” tab
  3. Browse all generated chunks
  4. See chunk boundaries and overlap
Chunk Information:
  • Chunk ID (for debugging)
  • Content preview
  • Token count
  • Position in original document
Use Cases:
  • Verify content was chunked properly
  • Troubleshoot missing information
  • Understand how AI will see content

Metadata Viewer

Available Metadata:
  • File Properties:
    • Original filename
    • File type/extension
    • File size
    • Upload timestamp
  • Source Information:
    • URL (if applicable)
    • Domain
    • Crawl date
    • Page title
  • Processing Data:
    • Processing duration
    • Number of chunks
    • Embedding model used
    • Index date
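
To picture the chunk boundaries and overlap shown in the Chunk Viewer, here is a minimal Python sketch of fixed-size chunking with overlap. It is an assumption-laden illustration, not ZappWay's chunking strategy: the function `chunk_words`, its default sizes, and the use of whitespace-separated words as a stand-in for tokens are all hypothetical.

```python
def chunk_words(text, chunk_size=50, overlap=10):
    """Split text into word-based chunks that share `overlap` words."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")

    words = text.split()
    step = chunk_size - overlap  # how far each chunk start advances
    chunks = []
    for chunk_id, start in enumerate(range(0, len(words), step)):
        piece = words[start:start + chunk_size]
        chunks.append({
            "chunk_id": chunk_id,        # analogous to the Chunk ID field
            "start_word": start,         # position in the original document
            "word_count": len(piece),    # words stand in for tokens here
            "content": " ".join(piece),
        })
        if start + chunk_size >= len(words):
            break  # this chunk already reaches the end of the text
    return chunks
```

With a ten-word text, `chunk_words(text, chunk_size=4, overlap=1)` yields three chunks starting at words 0, 3, and 6, so the last word of each chunk is repeated as the first word of the next. That repetition is what overlap means: content near a boundary appears in both neighboring chunks.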

6. Best Practices

File Preparation

Before Uploading:
  • Quality Check:
    • Ensure text is selectable (not images)
    • Check for OCR errors in scanned docs
    • Verify content is complete
  • Formatting:
    • Use clear headings
    • Include table of contents (for long docs)
    • Structure with logical sections
  • Naming:
    • Use descriptive filenames
    • Include version numbers
    • Add dates if time-sensitive
Example Good Names:
  • Product_Manual_v2.1_Jan2025.pdf
  • Return_Policy_Updated.docx
  • API_Reference_Latest.pdf

URL Best Practices

Choosing URLs:
✅ Good URLs:
  • Direct documentation pages
  • Specific article/blog post URLs
  • Static content pages
  • Well-structured sites
❌ Avoid:
  • URLs requiring login
  • JavaScript-heavy applications
  • Frequently changing pages (news sites)
  • Dynamic search result pages
Crawling Tips:
  • Start with small crawls (test with 10-20 pages)
  • Use URL patterns to filter content
  • Avoid crawling entire large sites
  • Re-crawl periodically for updates

Content Organization

Strategy 1: By Document Type
Group similar datasources:
  • All PDFs together
  • All web pages together
  • All text entries together
Strategy 2: By Topic
Organize by subject:
  • Product A documentation
  • Product B documentation
  • General policies
Strategy 3: By Freshness
Prioritize by update frequency:
  • Frequently updated (refresh monthly)
  • Stable content (refresh yearly)
  • Archive (reference only)

Maintenance

Regular Tasks:
  • Weekly:
    • Check for failed datasources
    • Reprocess important URLs
    • Remove outdated content
  • Monthly:
    • Audit all datasources
    • Update version-controlled docs
    • Verify critical information is current
  • Quarterly:
    • Major content refresh
    • Delete unused datasources
    • Reorganize if needed
Version Control:
  • Include version numbers in names
  • Keep only current version (delete old)
  • Document changes in descriptions
  • Use consistent naming conventions

✅ Quick Reference

Adding Datasources

Type | Best For | Speed
File Upload | Documents, manuals | Fast (1-5 min)
Website URL | Web content, blogs | Medium (2-10 min)
Text Entry | Quick facts, FAQs | Instant

File Size Limits

  • Maximum per file: 50 MB (typical)
  • Recommended: Under 20 MB for faster processing
  • For large files: Split into multiple smaller files

Common Formats

Best Compatibility:
  • PDF (text-based)
  • .docx
  • .txt / .md
  • .csv / .xlsx

Last Updated: January 2025
Version: 1.0
Platform: ZappWay Datasources