The Allen Institute for AI (Ai2) unveiled Olmo 3, a new generation of open language models that it says outperforms rivals ...
MLCommons today released AILuminate, a new benchmark test for evaluating the safety of large language models. Launched in 2020, MLCommons is an industry consortium backed by several dozen tech firms.
The new framework from Tongyi Lab enables agents to create their own training data by exploring and interacting with new ...
Artificial intelligence systems are increasingly woven into everyday decisions about health, money and work, yet most tests ...
Elon Musk's xAI has launched Grok 4.1, an upgraded AI model that significantly enhances speed, stability, and answer accuracy ...
You know all of those reports about artificial intelligence models successfully passing the bar or achieving Ph.D.-level intelligence? Looks like we should start taking those degrees back. A new study ...
This year, Stanford University organized Agents4Science, the first open conference to accept papers written entirely by ...
Anthropic today released Opus 4.5, its flagship frontier model, and it brings improvements in coding performance, as well as ...
Companies investing in generative AI find that testing and quality assurance are two of the most critical areas for improvement. Here are four strategies for testing LLMs embedded in generative AI ...