AI stinks at reading watches – Brit Commerce

These days, artificial intelligence can generate photorealistic images, write novels, do your homework, and even predict protein structures. However, new research reveals that it often fails at a very basic task: telling time.

Researchers at the University of Edinburgh tested the ability of seven well-known multimodal large language models (the type of AI that can interpret and generate various kinds of media) to answer time-related questions based on different images of clocks or calendars. Their study, slated for April and currently hosted on the arXiv preprint server, shows that the models struggle with these basic tasks.

“The ability to interpret and reason about time from visual inputs is critical for many real-world applications, from event scheduling to autonomous systems,” the researchers wrote in the study. “Despite advances in multimodal large language models (MLLMs), most work has focused on object detection, image captioning, or scene understanding, leaving temporal inference underexplored.”

The team tested OpenAI’s GPT-4o and GPT-o1; Google DeepMind’s Gemini 2.0; Anthropic’s Claude 3.5 Sonnet; Meta’s Llama 3.2-11B-Vision-Instruct; Alibaba’s Qwen2-VL-7B-Instruct; and ModelBest’s MiniCPM-V-2.6. They fed the models different images of analog clocks (timekeepers with Roman numerals, different dial colors, and even some missing a second hand), as well as 10 years of calendar images.

For the clock images, the researchers asked the LLMs, “What time is shown on the clock in the given image?” For the calendar images, the researchers asked simple questions such as “What day of the week is New Year’s Day?” and harder queries including “What is the 153rd day of the year?”

“Analog clock reading and calendar comprehension involve intricate cognitive steps: they demand fine-grained visual recognition (e.g., clock-hand position, day-cell layout) and non-trivial numerical reasoning (e.g., calculating day offsets),” the researchers explained.
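For reference, the day-offset arithmetic behind the calendar questions is exact and takes only a few lines of standard-library Python. This is a general illustration of the calculation, not code from the study, and the year 2025 is an arbitrary example:

```python
from datetime import date, timedelta

def day_of_year_to_date(year: int, day_of_year: int) -> date:
    """Convert an ordinal day of the year (Jan 1 = day 1) to a calendar date."""
    return date(year, 1, 1) + timedelta(days=day_of_year - 1)

# "What is the 153rd day of the year?" for 2025 (a non-leap year):
print(day_of_year_to_date(2025, 153))  # → 2025-06-02

# "What day of the week is New Year's Day?" for 2025:
print(date(2025, 1, 1).strftime("%A"))  # → Wednesday
```

Leap years shift every answer after February 28 by one day, which is part of what makes these questions a genuine reasoning test rather than simple pattern matching.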

In general, the AI systems did not perform well. They read the time on analog clocks correctly less than 25% of the time. They struggled with clocks featuring Roman numerals and stylized hands as much as they did with clocks lacking a second hand altogether, indicating that the problem may stem from detecting the hands and interpreting their angles on the clock face, according to the researchers.

Google’s Gemini 2.0 scored highest on the team’s clock task, while GPT-o1 was correct on the calendar task 80% of the time, a much better result than its competitors. But even then, the most successful MLLM on the calendar task still made errors roughly 20% of the time.

“Most people can tell time and use calendars from an early age. Our findings highlight a significant gap in the ability of AI to carry out what are quite basic skills for people,” said Rohit Saxena, co-author of the study and a doctoral student at the University of Edinburgh’s School of Informatics, in a university statement. “These shortfalls must be addressed if AI systems are to succeed in time-sensitive, real-world applications, such as scheduling, automation, and assistive technologies.”

So, while AI might complete your tasks, don’t count on it to keep track of the deadlines.
