DocProc
PDF Toolkit

18-pipeline document processing suite with a RAG-powered AI assistant, 248 automated tests, and zero cloud dependency. Privacy-first by design.

Python · Flask · ChromaDB · OpenCV · ONNX · SQLite

  • 18 processing pipelines
  • 248 automated tests
  • 0 cloud uploads required
  • 1 standalone .exe package

01 / Problem

Document processing is fragmented — and it leaks your data

Compressing a PDF means uploading it to one tool. Merging PDFs means a different tool. Removing a background from an image means a third tool. Each requires uploading files — often sensitive ones — to a third-party server you don't control and can't audit.

Businesses handling contracts, medical records, financial documents, or proprietary engineering drawings can't afford that exposure. They need processing power without the privacy trade-off.

Beyond privacy, the fragmentation itself is a productivity problem. Switching between tools, re-uploading files, and reconfiguring settings all add up. A single coherent suite with a natural language interface can save hours per week.

02 / Approach

18 pipelines, one interface, zero cloud

We built a unified document processing suite where every operation runs locally. No API keys for processing. No uploads. No account required. Everything executes on the user's machine, and the results stay there.

Users do not have to describe operations to the AI chat assistant in technical terms. A user can say "compress this PDF and make it under 2 MB without losing too much quality" and the system generates and executes the correct processing plan automatically. The RAG engine answers questions about uploaded documents using hybrid BM25 + TF-IDF retrieval.
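To make the hybrid retrieval idea concrete, here is a minimal standard-library sketch of how BM25 and TF-IDF scores could be blended to rank document chunks. It is an illustration under assumptions, not DocProc's actual code: the function names, the `alpha` blend weight, the max-normalization step, and the BM25 constants `k1` and `b` are all hypothetical choices.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Okapi BM25 score of each tokenized doc against the tokenized query."""
    n = len(docs)
    avg_len = sum(len(d) for d in docs) / n
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def tfidf_cosine_scores(query, docs):
    """Cosine similarity between TF-IDF vectors of the query and each doc."""
    n = len(docs)
    df = Counter(term for d in docs for term in set(d))

    def vectorize(tokens):
        # Smoothed IDF so unseen query terms still get a finite weight.
        return {t: c * (math.log((1 + n) / (1 + df[t])) + 1)
                for t, c in Counter(tokens).items()}

    def cosine(a, b_vec):
        dot = sum(w * b_vec.get(t, 0.0) for t, w in a.items())
        na = math.sqrt(sum(w * w for w in a.values()))
        nb = math.sqrt(sum(w * w for w in b_vec.values()))
        return dot / (na * nb) if na and nb else 0.0

    q = vectorize(query)
    return [cosine(q, vectorize(d)) for d in docs]


def normalize(scores):
    """Scale scores into [0, 1] by the maximum so the two signals are comparable."""
    top = max(scores)
    return [s / top if top > 0 else 0.0 for s in scores]


def hybrid_search(query, chunks, alpha=0.5, top_k=3):
    """Rank chunks by alpha * BM25 + (1 - alpha) * TF-IDF cosine (hypothetical blend)."""
    if not chunks:
        return []
    docs = [tokenize(c) for c in chunks]
    q = tokenize(query)
    bm25 = normalize(bm25_scores(q, docs))
    tfidf = normalize(tfidf_cosine_scores(q, docs))
    combined = [alpha * x + (1 - alpha) * y for x, y in zip(bm25, tfidf)]
    ranked = sorted(range(len(chunks)), key=lambda i: combined[i], reverse=True)
    return [(chunks[i], combined[i]) for i in ranked[:top_k]]
```

Calling `hybrid_search("invoice payment terms", chunks)` returns the best-matching chunks with their blended scores. Normalizing each signal before blending keeps one scorer from dominating purely because of its scale; a real system might prefer rank fusion or tuned weights instead.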

PDF Operations

  • Compress
  • Merge
  • Split
  • Rotate
  • Watermark
  • Password protect
  • Unlock

Image Operations

  • Resize
  • Convert format
  • AI background removal
  • Bulk export

AI Assistant

  • Natural language task description
  • Auto-generated execution plans
  • RAG document Q&A
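As a rough illustration of what an auto-generated execution plan might look like, the sketch below turns a plain-language request into an ordered list of steps. The source does not say how DocProc builds its plans, so the keyword table, the `Step` structure, and the `max_bytes` parameter are hypothetical stand-ins for whatever the assistant actually does.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Step:
    """One planned operation and its parameters (hypothetical structure)."""
    operation: str
    params: dict = field(default_factory=dict)


# Hypothetical mapping from request keywords to pipeline names.
KEYWORDS = {
    "compress": "compress",
    "merge": "merge",
    "split": "split",
    "rotate": "rotate",
    "watermark": "watermark",
}

# Matches size targets such as "under 2MB" or "under 500 kb".
SIZE_PATTERN = re.compile(r"under\s+(\d+(?:\.\d+)?)\s*(kb|mb)", re.IGNORECASE)


def plan_from_request(request):
    """Build an ordered plan from a plain-language request."""
    text = request.lower()
    # Order steps by where each keyword first appears in the request.
    hits = sorted((text.find(k), op) for k, op in KEYWORDS.items() if k in text)
    steps = [Step(op) for _, op in hits]

    # Attach a size target to any compress step if the request names one.
    size = SIZE_PATTERN.search(text)
    if size:
        factor = 1024 if size.group(2).lower() == "kb" else 1024 * 1024
        max_bytes = int(float(size.group(1)) * factor)
        for step in steps:
            if step.operation == "compress":
                step.params["max_bytes"] = max_bytes
    return steps
```

For the request "compress this PDF and make it under 2MB without losing too much quality", this yields a single compress step with a byte limit. Keeping the plan as plain data means it can be shown to the user, validated, and tested separately from the code that executes it.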

Workflow

  • Recipe system
  • Multi-step chaining
  • Batch processing
  • Export as .exe

03 / Engineering Quality

248 tests. Clean architecture. Ships as a single .exe.

The test suite covers every pipeline, the AI chat execution flow, the RAG retrieval engine, the recipe chaining system, and the API layer. All 248 automated tests run before every release.

The recipe system allows users to chain operations: "compress this PDF, then merge it with these three others, then add a watermark." Each step is deterministic and individually tested. The final .exe package bundles the entire suite — Python runtime, all dependencies, ML models — into a single executable that non-technical users can run without installing anything.
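The chaining described above can be sketched as a list of named steps, each a pure function from a list of documents to a list of documents. This is a minimal sketch under assumptions: `Doc`, `RecipeStep`, `run_recipe`, and the placeholder operations are hypothetical names, and the operations only record what was applied rather than touching real PDFs.

```python
from dataclasses import dataclass, replace
from typing import Callable


@dataclass(frozen=True)
class Doc:
    """Stand-in for a document: a name plus the operations applied so far."""
    name: str
    applied: tuple = ()


@dataclass(frozen=True)
class RecipeStep:
    """A named, deterministic operation over a list of documents."""
    name: str
    run: Callable[[list], list]


def compress(docs):
    # Placeholder: record the operation instead of rewriting a real file.
    return [replace(d, applied=d.applied + ("compress",)) for d in docs]


def merge_with(*others):
    """Return a step function that merges the incoming docs with `others`."""
    def merge(docs):
        combined = list(docs) + list(others)
        return [Doc("merged.pdf", ("merge:%d" % len(combined),))]
    return merge


def watermark(docs):
    return [replace(d, applied=d.applied + ("watermark",)) for d in docs]


def run_recipe(steps, docs):
    """Run each step on the previous step's output and keep a per-step log."""
    log = []
    for step in steps:
        docs = step.run(docs)
        log.append((step.name, list(docs)))
    return docs, log


# Compress, then merge with three other documents, then watermark.
RECIPE = [
    RecipeStep("compress", compress),
    RecipeStep("merge", merge_with(Doc("a.pdf"), Doc("b.pdf"), Doc("c.pdf"))),
    RecipeStep("watermark", watermark),
]
```

Running `run_recipe(RECIPE, [Doc("report.pdf")])` returns the final documents plus a log of each intermediate result. Because every step is a plain function with no hidden state, each one can be tested on its own and the same recipe can be reused across batches.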

248 automated tests

The tests cover all 18 pipelines, the AI chat layer, and the RAG retrieval engine.

Recipe chaining

Multi-step workflows defined as reusable recipes. Chain compress → merge → watermark in a single run.

Local ML inference

AI background removal and document analysis run via ONNX models locally. No API calls for core operations.

04 / What Was Shipped

A complete, production-quality document processing suite

  • 18 document processing pipelines covering PDF and image operations
  • 248 automated tests with full coverage across all pipelines and AI layers
  • Natural language AI chat: describe what you need and the system executes it
  • RAG document Q&A engine with hybrid BM25 + TF-IDF retrieval
  • Privacy-first: all processing runs locally, nothing leaves the machine
  • Recipe system for composable multi-step processing workflows
  • Standalone .exe packaging for non-technical end users