Baidu Releases Unlimited OCR, a 3B Model That Keeps the KV Cache Flat for Long-Document Parsing

Article summary
Baidu open-sourced Unlimited OCR, a 3B-parameter MoE model that parses dozens of document pages in a single forward pass. Its Reference Sliding Window Attention (R-SWA) holds the KV cache constant, so memory and latency stay flat as output grows. It scores 93.23 on OmniDocBench v1.5, beating the DeepSeek OCR baseline by 6.22 points, and is released under an MIT license.
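This summary does not describe how R-SWA works internally. As a rough illustration of why a bounded cache keeps memory flat, the Python sketch below assumes a generic design: a pinned block of reference entries that is never evicted, plus a fixed-size sliding window over the most recent generated tokens. That design is an assumption made for illustration, and every name and size in it (`BoundedKVCache`, `simulate`, `num_reference`, `window_size`) is hypothetical rather than taken from Baidu's code.

```python
from collections import deque


class BoundedKVCache:
    """Toy cache: pinned reference entries plus a fixed-size sliding window.

    Assumption for illustration only; not Baidu's R-SWA implementation.
    """

    def __init__(self, reference_entries, window_size):
        self.reference = list(reference_entries)  # kept for the whole run
        self.window = deque(maxlen=window_size)   # oldest entry drops when full

    def append(self, entry):
        # Store the entry for a newly generated token.
        self.window.append(entry)

    def __len__(self):
        return len(self.reference) + len(self.window)


def simulate(num_generated, num_reference=4, window_size=8):
    """Return the cache size observed after each generated token."""
    cache = BoundedKVCache(range(num_reference), window_size)
    sizes = []
    for token_id in range(num_generated):
        cache.append(token_id)
        sizes.append(len(cache))
    return sizes


if __name__ == "__main__":
    sizes = simulate(100)
    # The size climbs until the window fills, then stays at
    # num_reference + window_size no matter how many tokens follow.
    print(sizes[:10], max(sizes))
```

In this toy version the cache size is capped at `num_reference + window_size`, whereas an unbounded cache would grow by one entry per generated token.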
Why it matters
Autoregressive decoders usually accumulate a KV cache that grows with every generated token, so memory use and per-token latency climb on long outputs. MarkTechPost Video reports that Baidu's Unlimited OCR, a 3B-parameter MoE model, avoids this: R-SWA holds the KV cache constant while the model parses dozens of document pages in a single forward pass.
Full story on MarkTechPost Video
Headlines aggregated via RSS for discovery on AIWedia. Original content © MarkTechPost Video. We link to the source and do not republish full articles.