Cursor Study Finds Reward Hacking Inflates Coding-Agent Benchmark Scores on SWE-bench Pro
A Cursor study shows coding agents retrieve known fixes instead of deriving them, inflating SWE-bench Pro scores through runtime contamination.
Technology, AI models, startups, and research — aggregated from OpenAI, Anthropic, TechCrunch, arXiv, AWS, and 40+ more sources.
Bruce Clay, a founding figure who shaped the SEO marketing industry, has died. His concepts continue to influence professionals to this day.
Structuring Prompts for 2M Token Contexts: Maintaining Retrieval Accuracy at Scale The expansion of Large Language Model (LLM) context windows to 2 million tokens changes how we think about…
Corgi became embroiled in controversy when Papermark accused it of stealing its software. Corgi says it did not, raising new questions about vibe coding.
One thing I keep thinking about as AI becomes more personal: What happens to the relationship, project history, preferences, and context when the user switches platforms? A lot of AI tools are…

This week, most of the largest U.S. startup funding rounds centered around the sector one would suspect: artificial intelligence. Beyond that, the next-biggest area for startup funding was biotech.

The FBI and CISA have updated their March warning about Russian intelligence phishing Signal accounts, and the operators have added a step: they now coax targets into handing over their Signal Backup…
Perplexity's Computer for Counsel extends Perplexity Computer to legal teams. It routes 20+ models across Midpage, MCP connectors, and Microsoft 365, with cited outputs lawyers can verify.
OpenAI's GPT-5.6 family adds tiered models with max and ultra reasoning. Here is what early-career engineers should know.
General Atlantic has tapped tennis legend Novak Djokovic to serve as a global strategic advisor.

A simple technical failure cost an agency months of leads, revealing why communication and QA matter as much as PPC performance.

No one has supported the search industry for as long, or with as many of his own resources, as Bruce Clay.
General Intuition is using video game clips with embedded action labels to speed up AI training for robotics.
“We don’t believe this kind of government access process should become the long-term default,” says OpenAI. “It keeps the best tools from users, developers, enterprises, cyber defenders, and global…
The hire marks OpenAI's latest push into India, expanding offices, partnerships and hiring.

A newly discovered cyber attack campaign has been observed delivering a previously undocumented malware family called SharkLoader that acts as a loader for deploying Cobalt Strike Beacon on…

This spam update took two full days to roll out.
As enterprises scale autonomous AI agents into production, enabling safe innovation requires robust architectural guardrails. AI agents connect across tools and datasets, so it’s essential to…
Nvidia has dominated the AI chip market for years, but the era of total dependence might be ending. OpenAI just shared its plans to spice things up with Jalapeño, its custom inference chip built with…

Land your next job in SEO or PPC. These brands and agencies are hiring to fill open search marketing positions right now.
Less than 24 hours after news broke that OpenAI would stagger its next model release at the request of the Trump administration, that model, GPT-5.6, is here. On Friday, the company unveiled the…
GitHub joined the United Nations Development Programme in Ghana to explore how open source governance can support one of West Africa's most ambitious digital reform efforts.
AI models have progressed to the point where their capabilities have real political consequences. Dealing with those consequences will require collective action.

A Chinese-speaking advanced persistent threat (APT) actor has been linked to a new custom backdoor called TinyRCT as part of cyber attacks aimed at government entities and critical infrastructure in…
Tools, prompts, and guides to go with the news.