
【#Tech24H】DeepSeek recently began beta testing its image recognition mode and has opened it to users on a wide scale. The mode is built around “visual primitive thinking”, a framework that emphasizes precise spatial reasoning and complex scene parsing rather than plain OCR (optical character recognition) or basic object identification. In hands-on tests, users who enable the mode can upload images directly and let DeepSeek “see” the world, with capabilities that go well beyond simple text extraction. In one example, a user uploaded a museum photo of an unidentified artifact with “deep thinking” enabled; the model described the artifact's texture and material in detail and also accurately inferred its stylistic period. It can also correctly interpret popular memes and reaction images. [ By Zhang Liyan | Tang Ruohan ]

