AI Daily Report: Foundation Models · AI Infrastructure (May 26, 2026)

Tuesday, May 26, 2026 · 10 curated articles

Editor's Picks

The headlines from today, May 26, 2026, signal a definitive shift in the AI trajectory: we have officially entered the era of 'World-Acting' models. The news that Andrej Karpathy has joined Anthropic, as detailed in 'Karpathy Joins Anthropic Amid Pre-training Focus and Musk’s Massive GPU Expansion,' is not merely a high-profile hire; it is a tactical pivot. As the industry grapples with the diminishing returns of post-training refinement, the focus is returning to the 'violent aesthetic' of massive-scale pre-training and synthetic data. With Elon Musk amassing a 220,000 H100 GPU fleet, we are seeing the emergence of a new Compute Sovereignty. For developers, the message is clear: the advantage isn't in building yet another wrapper; it's in understanding the infrastructure that powers these behemoths.

At the same time, the recent Gemini 3.5 rollout and Google DeepMind’s established Genie world-model line, both covered in 'LWiAI #246: Google Recaps Gemini 3.5,' together with Ant Group’s 'LingBot-VA: Ant Group’s Causal World Model for Robotics,' suggest that the industry's focus on text-based reasoning is giving way to spatial and causal intelligence. We are moving from models that 'know' things to models that 'simulate' reality in order to act within it. Geohot’s critique in today’s Hacker News recap, that LLMs are distribution fitters rather than world models, names exactly the gap these new autoregressive video-action architectures aim to close. LingBot-VA’s 92.0% average success rate on RoboTwin 2.0 tasks is evidence that the 'perception-action loop' is starting to close.

For the engineering community, the bottleneck is shifting from logic to physics and memory. While the giants fight over $900 billion valuations, the real innovation for the 'rest of us' lies in efficiency. Model Best’s release of 'BitCPM-CANN', which cuts memory use roughly sixfold via 1.58-bit ternary quantization, is perhaps the most disruptive sleeper hit of the day. It shows that the future lies not only in Musk’s massive GPU clusters but also in our ability to run 'smart enough' models on edge hardware. As a developer, your goal for the rest of 2026 should be mastering the intersection of distributed vector data, like CockroachDB’s C-SPANN architecture, and these high-efficiency local models. The 'AGI' we were promised won't just live in a data center; it will be a distributed, embodied, and physics-aware intelligence that lives in the devices we carry and the robots that share our homes.

Foundation Models

Foundation models continue to evolve rapidly with significant updates from industry leaders and specialized optimization breakthroughs. Google has introduced Gemini 3.5 to enhance its multimodal capabilities, while Anthropic’s soaring valuation reflects the massive capital flowing into frontier AI development. Meanwhile, hardware-specific advancements like Model Best’s BitCPM-CANN demonstrate how ternary quantization can achieve sixfold memory efficiency on Ascend chips, balancing cutting-edge performance with high-efficiency deployment for large-scale language models.

LWiAI #246 Recaps Gemini 3.5, Musk's OpenAI Lawsuit Loss, and Anthropic's $900B Valuation

Google unveils AI model Gemini 3.5 and AI agent Gemini Spark, Omni turns images, audio, and text into video

Elon Musk losing his OpenAI lawsuit on statute-of-limitations grounds

Google has introduced its latest AI model, Gemini 3.5, alongside an autonomous agent named Gemini Spark and the multimodal video generation tool Gemini Omni. These announcements at Google I/O 2026 also featured Antigravity 2.0 and the Genie world model, which simulates realistic street environments using Street View data. In the legal and business sector, a court dismissed Elon Musk’s lawsuit against OpenAI due to statute-of-limitations issues, while Anthropic secured a $30 billion funding round at a staggering $900 billion valuation. Additionally, OpenAI achieved a significant mathematical milestone by solving an 80-year-old Erdős geometry problem. The coding landscape continues to evolve as Cursor Composer 2.5 and xAI’s Grok Build compete for dominance in the developer market. Finally, the AI chipmaker Cerebras saw its stock surge by 90% following the year's largest initial public offering to date.

Source: Last Week in AI

Model Best Unveils BitCPM-CANN: 8B Ternary Models with 6x Memory Efficiency on Ascend

BitCPM-CANN is a series of ternary large models released by Model Best in collaboration with Tsinghua University and the OpenBMB open-source community.

Compared to BF16 full-precision models, BitCPM-CANN needs roughly one-sixth of the VRAM; an 8B-parameter full-precision model requires about 16GB, while the BitCPM-CANN ternary version needs less than 3GB.

Model Best (Mianbi Intelligent) has trained and released BitCPM-CANN, a series of ternary (1.58-bit) large language models ranging from 0.5B to 8B parameters that retain up to 97.2% of the performance of their full-precision counterparts. The 1.58-bit quantization cuts VRAM requirements to roughly one-sixth, allowing an 8B-parameter model to run in less than 3GB of memory. This is the first time a ternary model has been trained end-to-end on Huawei Ascend hardware, establishing a completely domestic stack encompassing frameworks, chips, and methodology. By significantly lowering the hardware barrier, BitCPM-CANN lets large models run efficiently on mobile devices and edge hardware. The project has been open-sourced, giving the developer community a foundation for low-bit training on domestic computing platforms. The work targets memory cost, a critical bottleneck in the global AI race, particularly for on-device intelligence.

Source: 爱范儿
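
The memory figures above can be sanity-checked with a few lines of arithmetic. The sketch below also shows one common way to produce ternary weights, the absmean rounding used in BitNet b1.58-style models; the summary does not say which quantizer BitCPM-CANN uses, so `ternary_quantize` is an assumption for illustration, and the function names are hypothetical.

```python
import math

def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Weight storage only, in decimal GB (activations and KV cache excluded)."""
    return n_params * bits_per_weight / 8 / 1e9

def ternary_quantize(weights):
    """Absmean ternary rounding (assumed; BitNet b1.58 style, not confirmed for BitCPM-CANN).

    Each weight is divided by the mean absolute value, rounded, and clipped to {-1, 0, 1}.
    """
    scale = sum(abs(w) for w in weights) / len(weights) or 1.0  # avoid dividing by zero
    codes = [max(-1, min(1, round(w / scale))) for w in weights]
    return codes, scale

def dequantize(codes, scale):
    """Recover approximate weights from ternary codes and the shared scale."""
    return [c * scale for c in codes]

# A three-valued weight carries log2(3) bits of information, hence "1.58-bit".
BITS_TERNARY = math.log2(3)

bf16_gb = weight_memory_gb(8e9, 16)             # 8B params at 16 bits each
ideal_gb = weight_memory_gb(8e9, BITS_TERNARY)  # information-theoretic floor
packed_gb = weight_memory_gb(8e9, 2)            # simple 2-bits-per-weight packing

codes, scale = ternary_quantize([0.9, -0.05, -1.2, 0.4])
print(bf16_gb, round(ideal_gb, 2), packed_gb, codes)
```

The 16GB BF16 figure falls straight out of the arithmetic, and a plain 2-bit packing of 8B ternary weights lands at 2GB, which is consistent with the reported "less than 3GB" once any tensors kept at higher precision are added back.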

AI Infrastructure

AI Infrastructure encompasses the foundational systems and hardware required to develop and scale modern machine learning workloads. This category highlights critical advancements in high-performance computing, distributed databases, and specialized hardware accelerators. Recent developments, such as CockroachDB’s C-SPANN architecture, showcase the shift toward scalable, distributed vector indexing solutions designed to support large-scale AI applications. These innovations are essential for managing the growing complexity and data demands of the evolving artificial intelligence landscape.

CockroachDB Introduces C-SPANN: A Scalable Distributed Vector Indexing Architecture

The team’s response was to build something new, called C-SPANN, that satisfied every constraint by treating the index as ordinary table data

Vector indexes solve this by giving up on exact answers. They find approximate nearest neighbors

CockroachDB developed a new vector indexing architecture called C-SPANN to support semantic search within its distributed database without relying on a central coordinator. The engineering team rejected existing algorithms after identifying specific architectural requirements, such as the need for real-time updates, sharding support, and an intolerance for large in-memory caches or hot spots. By treating vector indexes as ordinary table data rather than a separate system, the implementation maintains consistency and scalability across distributed nodes. Traditional B-tree indexes are ineffective for vectors because high-dimensional embeddings lack a natural sequence or inherent ordering. Consequently, the system utilizes approximate nearest neighbor algorithms to balance search accuracy with high-speed performance. This approach allows users to query billions of vectors efficiently, enabling applications like semantic search and image retrieval to run at scale in production environments.

Source: ByteByteGo Newsletter
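
To make "approximate nearest neighbors" concrete, here is a minimal sketch of a generic partition-based search: vectors are grouped under centroids, and a query scans only the few partitions whose centroids are closest. This is not CockroachDB's algorithm; the summary does not describe C-SPANN's internals. Storing the index as plain `(partition_id, vector_id)` rows is only a loose nod to the "index as ordinary table data" idea, and all names here are hypothetical.

```python
import math
import random

def build_index(vectors, n_partitions, seed=0):
    """Pick random vectors as centroids, then record one (partition_id, vector_id) row per vector."""
    rng = random.Random(seed)
    centroids = rng.sample(vectors, n_partitions)
    rows = []
    for vid, v in enumerate(vectors):
        pid = min(range(n_partitions), key=lambda i: math.dist(v, centroids[i]))
        rows.append((pid, vid))
    return centroids, rows

def search(query, vectors, centroids, rows, k=3, n_probe=2):
    """Scan only the n_probe partitions nearest to the query; the answer is approximate."""
    ranked = sorted(range(len(centroids)), key=lambda i: math.dist(query, centroids[i]))
    probe = set(ranked[:n_probe])
    candidates = [vid for pid, vid in rows if pid in probe]
    return sorted(candidates, key=lambda vid: math.dist(query, vectors[vid]))[:k]

def exact_search(query, vectors, k=3):
    """Brute-force baseline: compare the query against every vector."""
    return sorted(range(len(vectors)), key=lambda vid: math.dist(query, vectors[vid]))[:k]

rng = random.Random(42)
vectors = [[rng.random() for _ in range(8)] for _ in range(200)]
centroids, rows = build_index(vectors, n_partitions=10)
query = [rng.random() for _ in range(8)]

approx = search(query, vectors, centroids, rows, k=3, n_probe=2)
exact = exact_search(query, vectors, k=3)
print("approximate:", approx, "exact:", exact)
```

Raising `n_probe` trades speed for accuracy: probing every partition reproduces the exact result, while probing one or two touches only a fraction of the data and may miss a true neighbor.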

Research

This section highlights groundbreaking academic achievements and theoretical advancements in the global technology landscape. We cover significant milestones such as Ant Group’s LingBot-VA being accepted at RSS 2026, showcasing the evolution of causal world models for robotics. These papers bridge the gap between abstract simulation and real-world execution, offering deep insights into how reasoning and action synergy are shaping the future of autonomous systems and industrial automation.

LingBot-VA: Ant Group’s Causal World Model for Robotics Accepted at RSS 2026

LingBot-VA achieved average success rates of 92.0% and 91.1% under Easy and Hard settings respectively; it reached 98.5% on the LIBERO benchmark.

The overall success rate increased by more than 20 percentage points compared to the industry baseline π0.5, demonstrating good data efficiency and generalization capability.

The research paper "Causal World Modeling for Robot Control" by Ant LingBot and HKUST achieved acceptance at the RSS 2026 conference, introducing the first open-source autoregressive video-action world model, LingBot-VA. The model utilizes a Mixture-of-Transformers (MoT) architecture to integrate video prediction and action generation within a unified autoregressive diffusion framework. In benchmark testing, LingBot-VA attained a 92.0% success rate on RoboTwin 2.0 tasks and outperformed the π0.5 baseline by over 20 percentage points in real-world evaluations. This framework enables robots to predict environmental changes and generate subsequent action commands simultaneously, mimicking human-like perception-action loops. By requiring only 50 real demonstration data points for adaptation, the system demonstrates high data efficiency for complex physical tasks. Ant LingBot has released the model weights, training code, and inference scripts on platforms including Hugging Face and GitHub to support the embodied intelligence community.

Source: 量子位

AI Business

The AI business landscape is witnessing high-stakes executive movements and unprecedented infrastructure investments. Andrej Karpathy's transition to Anthropic signals a renewed emphasis on core pre-training excellence as major players compete for top-tier technical leadership. Meanwhile, massive GPU cluster expansions, such as Elon Musk's 220,000-card deployment, underscore the escalating capital intensity required to lead the next frontier of large-scale model development and enterprise implementation.

Karpathy Joins Anthropic Amid Pre-training Focus and Musk’s Massive GPU Expansion

Is even Karpathy, a top student of Fei-Fei Li and co-founder of OpenAI, struggling to keep up with current AI developments?

SpaceX has transformed into a computing power dealer, with Musk becoming a 'Space Jensen Huang'.

Andrej Karpathy, a co-founder of OpenAI and former Tesla AI lead, has joined Anthropic at a pivotal moment when pre-training is re-emerging as a critical differentiator in model development. This strategic shift highlights Karpathy’s expertise in synthetic data and large-scale model training as labs move toward the 2026 development cycle. Elon Musk’s massive deployment of 220,000 H100 GPUs through SpaceX and xAI is creating a new compute-as-a-service paradigm, positioning him as a major infrastructure player. Meanwhile, DeepMind leadership is reportedly investing in Anthropic, signaling a potential realignment of forces against OpenAI within the industry. The move suggests that despite the recent focus on post-training techniques, the underlying foundation of massive-scale pre-training remains the essential challenge labs must solve to reach AGI. This talent migration reflects broader tensions between commercial visions and technical purists seeking research independence within the Silicon Valley ecosystem.

Source: 人民公园说AI

Emerging Tech

Explore the forefront of innovation through deep dives into transformative technologies like embodied AI and the evolving landscape of cybersecurity threats. This category highlights breakthrough robotics designed for home environments alongside daily recaps of trending discussions from the global developer community. By tracking these rapid advancements, we provide essential insights into the next generation of digital tools and intelligent systems that are currently reshaping our professional and personal lives.

Xu Huazhe Launches Poke Robot: Redefining Embodied AI for Home Environments

Starting from March 2026, Xu Huazhe has a new identity: founder of Poke Robot. For the previous two-plus years, Xu was the co-founder and chief scientist of Xinghaitu.

He said that embodied intelligence is not robotics, not autonomous driving, and not 'prehistoric deep learning'.

Xu Huazhe, the former Co-founder and Chief Scientist of Xinghaitu, officially launched his new venture, Poke Robot, in March 2026 with a primary focus on developing general-purpose home robotics. The startup aims to move beyond traditional robotics and autonomous driving paradigms, shifting the strategic focus toward general intelligence rather than what Xu describes as "prehistoric deep learning." Xu argues that reinforcement learning remains a significantly undervalued component in the current embodied AI landscape and emphasizes the importance of simple, consistent technical architectures. While China is expected to be a major hub for embodied intelligence innovation, the founder highlights the strategic necessity of pursuing the global market to capture the largest opportunities. The industry is projected to enter a phase of heavy resource competition within the next 18 to 24 months as large corporations increase their presence in the field.

Source: 晚点聊 LateTalk

2026-05-26 Hacker News Top Stories Recap

By mapping Minsky register machines to Jira automation, the article proves that Jira is Turing complete.

A trial of the 100:80:100 four-day work week in 15 Australian companies saw no drop in productivity and significant stress reduction.

Pope Leo XIV’s Magnifica Humanitas encyclical advocates for AI to be governed by morality and law while rejecting transhumanist ideologies in favor of human dignity and social justice. Google’s aggressive push into AI-driven conversational search has prompted a surge in interest for privacy-focused alternatives like Kagi, DuckDuckGo, and Startpage that offer ad-free and controllable AI experiences. In the software development space, geohot argues that treating AI agents as primary programmers is a mistake because LLMs function on distribution fitting rather than world models. Technical research further demonstrates that Jira’s automation system is Turing complete by mapping it to Minsky register machines, though it remains constrained by cloud execution limits. Additionally, a trial of a four-day work week across 15 Australian companies reported maintained productivity levels alongside significant stress reduction for employees through meeting streamlining and automation.

Source: SuperTechFans
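
For readers unfamiliar with the Jira result: a Minsky register machine has only two instructions, increment and decrement-or-branch-on-zero, yet it is Turing complete, so any system that can emulate it is too. The interpreter below is a minimal sketch of such a machine; the tuple encoding and the `max_steps` budget are choices made here for illustration (the budget loosely mirrors the cloud execution limits mentioned above), and nothing in it reflects how the article maps instructions onto Jira automation rules.

```python
def run(program, registers, max_steps=10_000):
    """Interpret a Minsky register machine.

    Instructions (hypothetical encoding):
      ("inc", reg, next_pc)
      ("dec", reg, next_pc_if_nonzero, next_pc_if_zero)
    The machine halts when the program counter leaves the program.
    """
    regs = dict(registers)
    pc = 0
    steps = 0
    while 0 <= pc < len(program):
        if steps >= max_steps:
            raise RuntimeError("step budget exhausted")
        steps += 1
        ins = program[pc]
        if ins[0] == "inc":
            _, reg, next_pc = ins
            regs[reg] = regs.get(reg, 0) + 1
            pc = next_pc
        elif ins[0] == "dec":
            _, reg, next_pc, zero_pc = ins
            if regs.get(reg, 0) == 0:
                pc = zero_pc          # register empty: branch without decrementing
            else:
                regs[reg] -= 1
                pc = next_pc
        else:
            raise ValueError(f"unknown instruction: {ins!r}")
    return regs

# Addition: move every unit from register b into register a.
ADD = [
    ("dec", "b", 1, 2),  # 0: if b == 0 jump to 2 (halt), else b -= 1 and go to 1
    ("inc", "a", 0),     # 1: a += 1, then loop back to 0
]

print(run(ADD, {"a": 3, "b": 4}))
```

The step budget is what separates the theoretical claim from practice: the machine model is unbounded, but any hosted automation system will cut a long-running program off.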

The Evolution of Chinese-language Phishing-as-a-Service (PhaaS)

GTIG has observed a fundamental move away from static password harvesting towards real-time interception and tokenization.

These services not only lower the barrier to entry for Chinese cyber criminals, but reveal broader patterns on the evolution of social engineering

Google Threat Intelligence Group has identified a significant shift in Chinese-language phishing services from static password harvesting toward real-time interception of one-time passcodes and tokenization. These services utilize live administration panels to bypass multifactor authentication by interacting with victims instantly during the login process to capture credentials. Threat actors are increasingly focusing on exploiting digital wallet provisioning to transform stolen payment data into tokenized assets, moving beyond simple account access to direct financial control. To evade traditional carrier security filters, these operations leverage encrypted delivery channels such as RCS and iMessage for phishing message distribution. Unlike Russian-speaking groups, Chinese PhaaS providers often operate openly on Telegram and primarily target non-Chinese entities and the general public. Google has recently taken legal action against one such provider while continuing to implement technical safeguards against these sophisticated criminal ecosystems.

Source: Google Cloud Blog

Programming

Explore the latest breakthroughs in software engineering and modern web performance optimization. This section features technical deep dives into enhancing digital user experiences, such as utilizing ffmpeg for high-performance video scrubbing and optimizing complex mobile 3D interactions. Stay informed on the essential tools and methodologies developers employ to maintain high visual fidelity while achieving lightning-fast load times across today's increasingly demanding cross-platform applications.

Optimizing 3D Product Previews via High-Performance Video Scrubbing and ffmpeg

High-precision glTF models exported by artists are at least 150MB. Who answers for the blank first screen that overseas users stare at while a file that large loads?

Ultimately, I rejected the WebGL solution and instead adopted a deliberately minimal, high-performance video scrubbing solution.

High-precision glTF models for 360-degree product previews often exceed 150MB, causing significant loading delays and performance issues on mobile devices. By adopting a video scrubbing approach instead of WebGL rendering, developers can achieve high-fidelity visuals with a drastically smaller file footprint. This transition from real-time 3D models to offline-rendered videos significantly reduces GPU load and prevents overheating on lower-end devices. However, standard video encoding relies on inter-frame compression, which introduces latency and stuttering when users manually scrub through frames via touch events. Optimizing the video structure using specific ffmpeg parameters ensures smooth frame-by-frame navigation by addressing I-frame distribution. This method effectively balances visual fidelity with technical performance, resulting in a substantial reduction in asset size from 15MB to approximately 800KB. It provides a superior alternative for cross-border e-commerce platforms requiring high-quality product interactions.

Source: 掘金本周最热
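
The summary does not list the ffmpeg parameters the author used, so the sketch below shows one plausible way to address I-frame distribution: setting the keyframe interval (`-g`) to 1 so every frame is an I-frame and any frame can be decoded without its neighbors. The file names, the CRF value, and both function names are assumptions for illustration. The second helper maps a touch-drag offset to a frame-aligned timestamp, the value a player would seek to while scrubbing.

```python
def all_intra_ffmpeg_args(src, dst, crf=28):
    """Build an ffmpeg command that encodes every frame as a keyframe (assumed settings)."""
    return [
        "ffmpeg", "-y",
        "-i", src,
        "-an",                      # drop audio; a product turntable does not need it
        "-c:v", "libx264",
        "-g", "1",                  # keyframe interval of 1: every frame is an I-frame
        "-crf", str(crf),           # quality/size trade-off; 28 is an arbitrary choice here
        "-pix_fmt", "yuv420p",      # widest playback compatibility
        "-movflags", "+faststart",  # move the index to the front for progressive download
        dst,
    ]

def scrub_time(drag_px, track_px, duration_s, fps):
    """Map a horizontal drag offset to a timestamp snapped to a frame boundary."""
    frac = min(max(drag_px / track_px, 0.0), 1.0)      # clamp to the track
    last_frame = max(round(duration_s * fps) - 1, 0)
    frame = round(frac * last_frame)
    return frame / fps

args = all_intra_ffmpeg_args("turntable.mp4", "scrub.mp4")
print(" ".join(args))
print(scrub_time(300, 300, 3.0, 30))  # dragging to the far end seeks to the last frame
```

An interval of 1 is the extreme end of the trade-off: seeking becomes cheap, but the file grows because no frame borrows data from another. A slightly longer interval may give most of the smoothness at a smaller size.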

Developer Tools

Developer tools are the backbone of modern software engineering, enabling creators to build, test, and deploy applications with precision. This section explores essential utilities like version control systems and integrated development environments that streamline workflows and enhance collaboration. From mastering Git integration in VS Code to optimizing your local environment, we provide insights into the software that empowers developers to write cleaner code and manage complex projects more effectively in today’s fast-paced technological landscape.

Beginner's Guide: Integrating Git and GitHub with VS Code

Using GitHub in VS Code reduces context switching, streamlines your workflow, and boosts your productivity.

The first step to using Git with VS Code is initializing a folder to reflect your repository on GitHub.

Visual Studio Code provides built-in functionality that integrates directly with GitHub to reduce context switching and streamline developer workflows. Users can initialize a local folder as a Git repository by selecting the Source Control icon and clicking the Initialize Repository button within the editor interface. Once initialized, the UI displays branch names in the bottom-left corner and allows for branch renaming via the Command Palette using the Git: Rename Branch command. Files identified with a “U” label indicate they are currently untracked, while staging them changes the label to “A” to signify they are ready for a commit. This native integration enables developers to manage source code, stage changes, and push updates without ever leaving the VS Code environment, requiring only Git and VS Code to be installed on the local machine.

Source: The GitHub Blog
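
The "U" and "A" labels correspond to states Git itself reports. As a rough sketch, the helper below translates lines of `git status --porcelain` output (two status characters, a space, then the path) into the badge a file would carry in the Source Control view. The "U" and "A" cases come from the summary above; the "M" and "D" cases and the function name are assumptions added for illustration, and VS Code's real logic covers more states.

```python
# Command-line equivalents of the editor actions described above:
#   git init                    -> "Initialize Repository"
#   git branch -m <new-name>    -> "Git: Rename Branch"
#   git add <file>              -> staging a file (badge changes from U to A)

def vscode_badge(porcelain_line: str) -> str:
    """Map one `git status --porcelain` line to an approximate VS Code badge."""
    index_state, worktree_state = porcelain_line[0], porcelain_line[1]
    if index_state == "?" and worktree_state == "?":
        return "U"   # untracked
    if index_state == "A":
        return "A"   # newly added to the index (staged)
    if "M" in (index_state, worktree_state):
        return "M"   # modified (assumed mapping)
    if "D" in (index_state, worktree_state):
        return "D"   # deleted (assumed mapping)
    return "?"       # state not covered by this sketch

for line in ["?? README.md", "A  README.md", " M app.py"]:
    print(line[3:], "->", vscode_badge(line))
```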

This report is auto-generated by WindFlash AI based on public AI news from the past 48 hours.