What you'll need

An Appalix account on the Pro plan or above
A .zip archive containing your text files (max 50 MB of uncompressed text content)
A configured bot with at least one source slot available

1. Prepare your ZIP file

Package the files you want indexed into a single .zip archive. Subdirectory structure is fine — Appalix walks the entire archive recursively.

Keep total uncompressed size under 50 MB. The compressed ZIP itself can be much smaller — only the expanded text content counts toward the limit.
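As a pre-flight check, the expanded size can be estimated locally before uploading. The following is a minimal sketch, not Appalix's own logic: the function names are hypothetical, 50 MB is assumed to mean 50 * 1024 * 1024 bytes, and the extension list is copied from the table of supported formats in this article.

```python
import io
import zipfile

# Assumptions: 50 MB is read as 50 * 1024 * 1024 bytes, and the extension
# list is copied from the table of supported formats in this article.
LIMIT_BYTES = 50 * 1024 * 1024
TEXT_EXTENSIONS = (".txt", ".md", ".csv", ".json", ".xml", ".html", ".htm")


def uncompressed_text_bytes(zip_source) -> int:
    """Sum the declared uncompressed sizes of the supported text files."""
    with zipfile.ZipFile(zip_source) as archive:
        return sum(
            info.file_size
            for info in archive.infolist()
            if not info.is_dir()
            and info.filename.lower().endswith(TEXT_EXTENSIONS)
        )


def fits_limit(zip_source) -> bool:
    """True when the expanded text content stays under the assumed limit."""
    return uncompressed_text_bytes(zip_source) < LIMIT_BYTES


# Demo on an in-memory archive, so the sketch runs without files on disk.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as demo:
    demo.writestr("docs/faq.md", "# FAQ\n" * 1000)  # 6,000 bytes of text
    demo.writestr("logo.png", b"\x89PNG")           # ignored: not a text type
print(uncompressed_text_bytes(buffer), fits_limit(buffer))
```

To check a real archive, pass its path instead of the in-memory buffer, for example `fits_limit("knowledge.zip")`.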

Tip — what to put inside

Export docs from Confluence or Notion as HTML / Markdown and ZIP them
ZIP a folder of CSVs (product catalogue, FAQ pairs, pricing tables)
Bundle static help-centre pages exported from your CMS
Combine multiple JSON knowledge files into one archive
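If the export folder mixes text files with images or other assets, the archive can be built from a script that keeps only the formats Appalix reads. This is a sketch under stated assumptions: `build_zip` is a hypothetical helper, and the extension set is copied from the table of supported formats in this article.

```python
import tempfile
import zipfile
from pathlib import Path

# Assumption: extension set copied from this article's table of formats.
TEXT_EXTENSIONS = {".txt", ".md", ".csv", ".json", ".xml", ".html", ".htm"}


def build_zip(source_dir: Path, zip_path: Path) -> list[str]:
    """Zip every supported text file under source_dir, keeping relative paths."""
    added = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(source_dir.rglob("*")):
            if path.is_file() and path.suffix.lower() in TEXT_EXTENSIONS:
                arcname = path.relative_to(source_dir).as_posix()
                archive.write(path, arcname)
                added.append(arcname)
    return added


# Demo in a temporary directory, so nothing is left behind on disk.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "export"
    (root / "faq").mkdir(parents=True)
    (root / "faq" / "billing.md").write_text("# Billing\n", encoding="utf-8")
    (root / "logo.png").write_bytes(b"\x89PNG")  # left out of the archive
    print(build_zip(root, Path(tmp) / "knowledge.zip"))
```

Relative paths are kept inside the archive, so the subfolder layout of the export survives packaging.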

2. Add a new source

In the Appalix dashboard, go to Sources and click Add source. Select the PDF / Word / ZIP tile — this is the same tile used for PDFs, Word docs, and PowerPoints.

3. Upload the ZIP and submit

Click Choose file, select your .zip, and wait for the Done indicator. Enter a name for the source, then click Add & index source.

The file uploads directly to secure cloud storage, bypassing Vercel's 4.5 MB serverless request-body limit, so even large archives upload reliably.

4. Verify the source is ready

Return to the Sources list. Once ingestion finishes, the source will show a green Ready badge and a chunk count. Each readable file inside the ZIP becomes one or more chunks your bot can retrieve.

What Appalix reads from your ZIP

Appalix only extracts files with these extensions. Everything else is silently skipped — no errors, no partial reads.

Extension	Format	How it's indexed
`.txt`	Plain text	Read as-is. Great for FAQs, policies, and notes.
`.md`	Markdown	Read as plain text. Headers, lists, and code blocks are preserved as text.
`.csv`	CSV spreadsheet	Each row becomes searchable text. Column headers are included.
`.json`	JSON data	The full JSON string is indexed. Ideal for structured knowledge dumps.
`.xml`	XML	Raw XML text is indexed. Tag names and values are both searchable.
`.html / .htm`	HTML	Full HTML source is indexed, including tag content and attributes.
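To picture what "each row becomes searchable text" can mean for a CSV, the sketch below renders every row as header and value pairs. The exact text Appalix produces is not documented here, so the `header: value` layout, the function name, and the sample rows are all illustrative assumptions.

```python
import csv
import io


def csv_rows_as_text(csv_text: str) -> list[str]:
    """Render each CSV row as 'header: value' pairs joined by '; '.

    Hypothetical layout: it only illustrates keeping column headers
    next to each value so that every row is searchable on its own.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        "; ".join(f"{column}: {value}" for column, value in row.items())
        for row in reader
    ]


# Sample data invented for the demo.
sample = "question,answer\nCan I nest ZIPs?,No\nAre empty files indexed?,No\n"
for line in csv_rows_as_text(sample):
    print(line)
```

Repeating the headers on every row means a retrieved chunk still makes sense without the first line of the file.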

What Appalix skips

Files with any other extension are ignored entirely. They are never executed, stored, or sent to the AI model. This applies to:

Executables & scripts

.exe, .dll, .bat, .sh, .py, .js, .php

Never run. Skipped silently for security.

Images & media

.jpg, .png, .gif, .mp4, .mp3

Binary files with no plain-text content to index.

PDFs & Office documents

.pdf, .docx, .xlsx, .pptx

Upload these directly as their own source type for full parsing.

Archives inside archives

Nested .zip, .tar, .gz

Only the contents of the outer archive are processed, at any folder depth. Nested archives are skipped, not unpacked.
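The skip rules above can be turned into a quick local report that says what will happen to each entry before you upload. This is a hedged sketch: `classify` and its messages are hypothetical, the extension sets are copied from this article, and the advice to unpack nested archives first is a suggestion rather than documented Appalix behaviour.

```python
from pathlib import PurePosixPath

# Assumption: all three sets are copied from the lists in this article.
TEXT_EXTENSIONS = {".txt", ".md", ".csv", ".json", ".xml", ".html", ".htm"}
UPLOAD_SEPARATELY = {".pdf", ".docx", ".xlsx", ".pptx"}
NESTED_ARCHIVES = {".zip", ".tar", ".gz"}


def classify(name: str) -> str:
    """Describe what happens to one archive entry, based on its extension."""
    suffix = PurePosixPath(name).suffix.lower()
    if suffix in TEXT_EXTENSIONS:
        return "indexed"
    if suffix in UPLOAD_SEPARATELY:
        return "skipped: upload as its own source"
    if suffix in NESTED_ARCHIVES:
        return "skipped: nested archive, unpack it before zipping"
    return "skipped: unsupported type"


for entry in ["docs/faq.md", "manual.pdf", "old/backup.tar.gz", "run.sh"]:
    print(f"{entry}: {classify(entry)}")
```

Only the final extension is checked, so `backup.tar.gz` is treated as a `.gz` archive.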

Security & safety

Executables are never run — supported files are decoded as plain text strings only, and nothing in the archive is executed.
Zip bomb protection — if the total uncompressed text content exceeds 50 MB, ingestion stops immediately with an error.
No binary processing — only whitelisted text extensions are read. Unknown types are skipped without error.
Isolated processing — ingestion runs in a sandboxed API service, separate from your bot's runtime environment.
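The zip bomb guard described above can be sketched as a reader that counts bytes as they are actually decompressed, rather than trusting the sizes declared in the archive headers. This is an illustration under assumptions, not Appalix's implementation: the names `read_text_entries` and `ArchiveTooLarge`, the UTF-8 decoding, and the 64 KB chunk size are all hypothetical choices.

```python
import io
import zipfile

# Assumption: extension list copied from this article's table of formats.
TEXT_EXTENSIONS = (".txt", ".md", ".csv", ".json", ".xml", ".html", ".htm")


class ArchiveTooLarge(Exception):
    """Raised as soon as the expanded text passes the limit."""


def read_text_entries(zip_source, limit_bytes=50 * 1024 * 1024,
                      chunk_size=64 * 1024) -> dict[str, str]:
    """Decode supported entries as text, stopping once the limit is passed."""
    texts = {}
    total = 0
    with zipfile.ZipFile(zip_source) as archive:
        for info in archive.infolist():
            name = info.filename
            if info.is_dir() or not name.lower().endswith(TEXT_EXTENSIONS):
                continue  # skipped silently: never read, never executed
            data = bytearray()
            with archive.open(info) as member:
                while chunk := member.read(chunk_size):
                    total += len(chunk)  # count real decompressed bytes
                    if total > limit_bytes:
                        raise ArchiveTooLarge(f"limit passed while reading {name}")
                    data.extend(chunk)
            texts[name] = data.decode("utf-8", errors="replace")
    return texts


# Demo on an in-memory archive with a deliberately tiny limit.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as demo:
    demo.writestr("notes.txt", "a" * 1000)
try:
    read_text_entries(buffer, limit_bytes=100)
except ArchiveTooLarge as error:
    print("ingestion stopped:", error)
```

Counting bytes while streaming means a highly compressed entry is cut off after at most one extra chunk, instead of being fully expanded in memory first.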

Frequently asked questions

Can I include PDFs or Word docs inside the ZIP?

Not yet. Those formats require a separate parsing pipeline, so upload PDF, Word (.docx), or Excel (.xlsx) files directly as their own sources. Inside a ZIP, only plain-text formats are indexed.

Does folder structure inside the ZIP matter?

No. Appalix flattens the archive and processes every matching file regardless of which subfolder it lives in. The folder path is shown as a section header in the indexed content so you can trace where each chunk came from.

What if some files inside are empty?

Empty files are skipped automatically — only files with non-whitespace content are indexed.
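The two answers above (folder paths shown as section headers, empty files skipped) can be pictured with a short sketch. The `## path` header format and the function name `render_sections` are assumptions made for illustration. The real header layout is not specified in this article.

```python
def render_sections(files: dict[str, str]) -> str:
    """Join non-empty files into one text, each under a header with its path.

    Hypothetical format: '## <path>' followed by the file's text.
    """
    sections = []
    for path in sorted(files):
        text = files[path]
        if not text.strip():  # empty or whitespace-only files are skipped
            continue
        sections.append(f"## {path}\n\n{text.strip()}")
    return "\n\n".join(sections)


# Sample content invented for the demo.
print(render_sections({
    "policies/refunds.txt": "Refunds are handled by the support team.",
    "policies/draft.txt": "   \n",
    "faq/billing.md": "# Billing\nInvoices are sent by email.",
}))
```

Because the full path is kept in the header, two files with the same name in different subfolders stay distinguishable after flattening.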

Can I re-upload a ZIP to update the knowledge base?

Yes. Delete the old source and add a new one with the updated ZIP, or use the Resync button on the source row if you replace the file at the same storage path.

Is there a limit on the number of files inside the ZIP?

There is no limit on the number of files, only the 50 MB limit on total uncompressed text content. A ZIP with 500 tiny .txt files will work fine as long as the combined text stays under 50 MB.

How to Upload a ZIP File as a Knowledge Base Source