Plugin
document-analysis
Document-to-Markdown ingestion front-end for the document-analysis pipeline. Wraps Microsoft MarkItDown (PDF, Word, PowerPoint, Excel, HTML, CSV/JSON/XML, EPUB) and adds batch folder conversion that mirrors the input tree, YAML provenance frontmatter, and a manifest of SHA-256 hashes plus settings that makes re-runs idempotent (files are re-OCRed when --use-llm is added). Each file gets its own timeout and error isolation, so one failing file does not abort the batch. Optional LLM image descriptions and embedded-image OCR use the team OpenRouter key. Runs in the Cowork sandbox; outputs default to the document-analysis repo's sources-md/ tree. Includes a documented but not-yet-wired adoption reference for layout-faithful Azure Document Intelligence.
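The idempotent re-run check can be illustrated with a minimal sketch. It is not the plugin's actual code: the function names, the manifest layout (a dict keyed by source path), and the settings keys are all assumptions made for illustration. The idea is that a file is skipped only when both its content hash and the hash of the active settings match the recorded entry, which is why adding a flag such as --use-llm would cause reprocessing.

```python
import hashlib
import json
from pathlib import Path


def file_sha256(path: Path) -> str:
    """Hash file contents in 1 MiB chunks so large documents are not loaded whole."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def settings_fingerprint(settings: dict) -> str:
    """Hash a canonical JSON dump so key order does not affect the fingerprint."""
    canonical = json.dumps(settings, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def manifest_entry(path: Path, settings: dict) -> dict:
    """Build the record stored per source file (hypothetical layout)."""
    return {
        "sha256": file_sha256(path),
        "settings": settings_fingerprint(settings),
    }


def needs_conversion(path: Path, settings: dict, manifest: dict) -> bool:
    """Return True unless the manifest holds an identical entry for this file."""
    return manifest.get(str(path)) != manifest_entry(path, settings)


def record_conversion(path: Path, settings: dict, manifest: dict) -> None:
    """Store the entry after a successful conversion."""
    manifest[str(path)] = manifest_entry(path, settings)
```

In this sketch the manifest is an in-memory dict; persisting it (for example as JSON next to the outputs) is left out. Hashing the settings separately from the content means that either a changed source file or a changed option invalidates the entry.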