OCR on Patent Figures with DeepSeek-OCR

12 approaches to extracting text and reference numbers from patent figure sheets, tested against 8 sheets from US11423567B2 (a facial recognition depth mapping system). The sheets include flowcharts, dense instrument screenshots, and architectural diagrams with tiny scattered reference numbers.

The figures

Patent figures are hard on OCR for several reasons: text at multiple orientations (some sheets are rotated 90 degrees), tiny reference numbers like "41" or "7025" scattered among the drawings, dense data screens with white text on dark backgrounds, structural elements (boxes, arrows, lines) that look like text to a machine, and "Figure X" labels that are often printed sideways.

Sheet 01 from US11423567B2. The whole thing is rotated 90 degrees, with labels like "BP", "DR", "1", and "D" scattered around the drawing.

DeepSeek-OCR is a 3.3B parameter vision model that runs locally. It has a grounding mode that returns bounding boxes alongside text: the prompt <|grounding|>OCR this image. produces output like <|ref|>camera 110</ref><|det|>[[412, 8, 455, 63
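Grounded output of this shape can be turned into structured records with a small parser. The following is a minimal sketch, not the article's own code: the `Detection` type, the `parse_grounding` and `reference_numbers` helpers, and the sample string are all hypothetical, and it assumes each entry ends with a complete four-integer box. The closing-tag spelling is matched loosely because the sample above shows `</ref>`; entries whose box is cut off are simply skipped.

```python
import re
from typing import List, NamedTuple, Tuple


class Detection(NamedTuple):
    """One grounded text span: the recognized text and its box (x1, y1, x2, y2)."""
    text: str
    box: Tuple[int, int, int, int]


# Assumed entry shape: <|ref|>TEXT<close><|det|>[[x1, y1, x2, y2]]
# where <close> is either "<|/ref|>" or the "</ref>" spelling seen above.
_ENTRY = re.compile(
    r"<\|ref\|>(?P<text>.*?)(?:<\|/ref\|>|</ref>)"
    r"<\|det\|>\[\[(?P<box>\s*\d+\s*(?:,\s*\d+\s*){3})\]\]"
)


def parse_grounding(raw: str) -> List[Detection]:
    """Extract every complete (text, box) pair; incomplete entries do not match."""
    detections = []
    for match in _ENTRY.finditer(raw):
        x1, y1, x2, y2 = (int(v) for v in match.group("box").split(","))
        detections.append(Detection(match.group("text").strip(), (x1, y1, x2, y2)))
    return detections


def reference_numbers(detections: List[Detection]) -> List[Detection]:
    """Keep spans that look like patent reference numbers, e.g. "41" or "7025"."""
    return [d for d in detections if re.fullmatch(r"\d{1,4}[A-Za-z]?", d.text)]


# Hypothetical sample, made up for illustration (not real model output).
sample = (
    "<|ref|>BP<|/ref|><|det|>[[100, 200, 130, 220]]<|/det|>\n"
    "<|ref|>7025</ref><|det|>[[300, 40, 340, 60]]<|/det|>\n"
    "<|ref|>cut off<|/ref|><|det|>[[5, 6, 7"
)

parsed = parse_grounding(sample)
numbers = reference_numbers(parsed)
```

Keeping the box with each span makes it possible to filter by size or position later, which matters for sheets where small reference numbers sit next to structural lines.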
