Pdf to Text .NET CLI Converter (github.com)

0 points 3 hours ago

🤖 AI Summary

A new open-source .NET 9 CLI tool (TacTicA.Llama.PdfToText.Net) converts PDFs to plain text locally by rendering each page to PNG and sending the images to Ollama’s multimodal models (e.g., qwen2.5vl). Because it runs against a locally hosted Ollama instance, there are no token costs and document data can stay on-prem. That makes it attractive for privacy-sensitive workflows, dataset creation, or research that needs rich image-to-text descriptions (diagrams, figures, tables) rather than plain OCR alone. The tool is cross-platform (Windows/Linux/macOS), distributed as a dotnet global tool under the MIT license, and requires .NET 9 or later plus a local Ollama installation. Key components: PDFtoImage for page rendering, SixLabors.ImageSharp/SkiaSharp for image handling and optional resizing, an OllamaClient for API calls, and System.CommandLine for CLI parsing. CLI flags let you pick the model, page ranges, output directory or stdout, image width, and whether to keep intermediate images. The tool consolidates per-page transcriptions into one output file and handles common errors (missing files, connectivity failures, invalid page ranges). It’s useful for building local document-understanding pipelines, though throughput depends on image processing and model latency; parallelization is noted as a future optimization.
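The per-page flow described in the summary (select pages, send each page image to a local Ollama model, merge the results) can be sketched roughly as follows. This is an illustrative Python sketch, not the tool's .NET code: the `"1-3,5"` range syntax, the prompt text, the page separator format, and all function names are assumptions. The endpoint and the `model`, `prompt`, `images`, and `stream` fields follow Ollama's `/api/generate` REST API on its default port.

```python
import base64
import json
import urllib.request

# Ollama's default local endpoint; adjust if your instance listens elsewhere.
OLLAMA_URL = "http://localhost:11434/api/generate"


def parse_page_range(spec, page_count):
    """Expand a spec such as "1-3,5" into sorted 1-based page numbers.

    The range syntax is assumed for illustration, not taken from the tool.
    Raises ValueError for malformed or out-of-bounds ranges.
    """
    pages = set()
    for part in spec.split(","):
        start_text, sep, end_text = part.strip().partition("-")
        start = int(start_text)  # int() raises ValueError on bad input
        end = int(end_text) if sep else start
        if start < 1 or end < start or end > page_count:
            raise ValueError(f"invalid page range: {part!r}")
        pages.update(range(start, end + 1))
    return sorted(pages)


def build_payload(png_bytes, model="qwen2.5vl",
                  prompt="Transcribe all text on this page."):
    """Build a non-streaming request body with the page as a base64 image."""
    return {
        "model": model,
        "prompt": prompt,  # hypothetical prompt, not the tool's
        "images": [base64.b64encode(png_bytes).decode("ascii")],
        "stream": False,
    }


def transcribe_page(png_bytes, model="qwen2.5vl"):
    """POST one rendered page to the local Ollama instance and return its text."""
    body = json.dumps(build_payload(png_bytes, model)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]


def consolidate(transcriptions):
    """Join a {page_number: text} mapping into one document in page order."""
    return "\n\n".join(
        f"--- Page {page} ---\n{text.strip()}"
        for page, text in sorted(transcriptions.items())
    )
```

Rendering PDF pages to PNG is left out here; the tool itself uses PDFtoImage for that step. Because each page is an independent request, the loop over pages is also where the parallelization mentioned as a future optimization would apply.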
