Mistral Drops OCR 4 With 170-Language Document Extraction

Mistral AI launches OCR 4, bringing structured document extraction with bounding boxes and confidence scores across 170 languages.

Mistral AI just shipped OCR 4, and it's a serious upgrade for document intelligence. The new model handles structured document extraction across 170 languages, a massive multilingual footprint.

The headline features: bounding boxes that pinpoint exactly where text lives on a page, block classification that categorizes different document sections, and inline confidence scores that tell you how sure the model is about each extraction. All of this ships alongside the extracted text itself.

It's a clear play for enterprise document processing pipelines, where knowing *where* something was found and *how reliable* the extraction is matters just as much as the text itself.
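To make that concrete, here is a minimal sketch of how a pipeline might use position and confidence together to route shaky extractions to a human reviewer. It does not call Mistral's API, and the `Block` fields, the 0-to-1 confidence scale, and the 0.9 threshold are all illustrative assumptions rather than OCR 4's actual response schema.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Block:
    """Hypothetical extracted block; NOT Mistral's actual response schema."""
    text: str
    block_type: str                           # assumed labels, e.g. "heading", "paragraph"
    bbox: tuple[float, float, float, float]   # assumed (x0, y0, x1, y1) page coordinates
    confidence: float                         # assumed to lie in [0.0, 1.0]


def split_by_confidence(blocks: list[Block], threshold: float = 0.9):
    """Split blocks into (accepted, needs_review) around an assumed threshold."""
    accepted = [b for b in blocks if b.confidence >= threshold]
    needs_review = [b for b in blocks if b.confidence < threshold]
    return accepted, needs_review


# Made-up sample data standing in for one page of extraction output.
blocks = [
    Block("Invoice #1042", "heading", (50.0, 40.0, 300.0, 70.0), 0.98),
    Block("Total: 1,250.00 EUR", "paragraph", (50.0, 500.0, 280.0, 520.0), 0.72),
]

accepted, needs_review = split_by_confidence(blocks)

# The bounding box tells a reviewer where on the page to look.
for b in needs_review:
    print(f"Review {b.block_type} at {b.bbox}: {b.text!r} ({b.confidence:.2f})")
```

In a real pipeline the threshold would likely be tuned per block type, since a misread total usually costs more than a misread footer.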

With 170 languages supported, Mistral is positioning OCR 4 as a global-scale solution. The combination of spatial awareness, classification, and confidence scoring puts pressure on competing document AI offerings to match this feature set.