Newsy Today
news of today
Tag: ML & Data Engineering

Tech

Stripe Engineers Deploy Minions, Autonomous Agents Producing Thousands of Pull Requests Weekly

by Chief Editor March 20, 2026

Stripe’s ‘Minions’ Signal a New Era of AI-Powered Coding

Engineers at Stripe have quietly launched a revolution in software development: autonomous coding agents dubbed “Minions.” These aren’t the yellow, banana-loving creatures, but sophisticated AI systems capable of generating production-ready pull requests with minimal human intervention. The implications for developer productivity and the future of coding are significant.

From Concept to 1,300 Pull Requests a Week

The Minions project began as an internal fork of Goose, a coding agent developed by Block. Stripe customized Goose for its specific LLM infrastructure and refined it to meet the demands of a large-scale payment processing system. The results are impressive. Currently, Minions generate over 1,300 pull requests per week, a figure that has climbed from 1,000 during initial trials. Crucially, all changes are reviewed by human engineers, ensuring quality and security.

This isn’t about replacing developers; it’s about augmenting their capabilities. The Minions handle tasks like configuration adjustments, dependency upgrades, and minor refactoring – the often-tedious but essential work that can consume a significant portion of a developer’s time.

One-Shot Agents: A Different Approach to AI Coding

What sets Minions apart from popular AI coding assistants like GitHub Copilot or Cursor? Minions operate on a “one-shot” basis, completing end-to-end tasks from a single instruction. Tasks can originate from various sources – Slack threads, bug reports, or feature requests – and are then orchestrated using “blueprints.” These blueprints combine deterministic code with flexible agent loops, allowing the system to adapt to different requirements.

This contrasts with interactive tools that require constant human guidance. Minions are designed to take a task description and deliver a complete, tested, and documented solution, ready for review.
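The blueprint pattern described above – deterministic scaffolding wrapped around a flexible agent loop, with a validation gate before anything reaches human review – can be sketched in a few lines. All names here (`Blueprint`, `run`, the toy agent) are illustrative assumptions, not Stripe’s actual internals:

```python
from dataclasses import dataclass, field

@dataclass
class Blueprint:
    """Deterministic scaffold around a flexible agent loop (illustrative)."""
    name: str
    max_attempts: int = 3
    log: list = field(default_factory=list)

    def run(self, task, agent, validate):
        # Deterministic step: normalize the incoming task description.
        prompt = f"[{self.name}] {task.strip()}"
        for attempt in range(1, self.max_attempts + 1):
            patch = agent(prompt)              # flexible agent loop
            self.log.append((attempt, patch))
            if validate(patch):                # deterministic gate (tests, lint)
                return patch                   # ready for human review
        raise RuntimeError("blueprint exhausted attempts; escalate to a human")

# Toy "agent" that pretends to bump a dependency version.
toy_agent = lambda prompt: "bump lodash 4.17.20 -> 4.17.21"
bp = Blueprint("dependency-upgrade")
patch = bp.run("upgrade lodash", toy_agent, validate=lambda p: "->" in p)
print(patch)  # bump lodash 4.17.20 -> 4.17.21
```

The point of the structure is that the non-deterministic part (the agent call) is fenced in by deterministic steps on both sides, which is what makes one-shot operation safe at scale.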

Handling Complexity at Scale: $1 Trillion in Payments

The stakes are high. The code managed by Minions supports over $1 trillion in annual payment volume at Stripe. This means reliability and correctness are paramount. The system operates within a complex web of dependencies, navigating financial regulations and compliance obligations. Stripe reinforces reliability through robust CI/CD pipelines, automated tests, and static analysis.

Did you know? Stripe’s Minions are not just theoretical; they are actively managing critical infrastructure for a global payments leader.

The Rise of Agent-Driven Software Development

Stripe’s Minions are part of a broader trend toward agent-driven software development. LLM-based agents are becoming increasingly integrated with development environments, version control systems, and CI/CD pipelines. This integration promises to dramatically increase developer productivity while maintaining strict quality controls.

The key to success, according to Stripe engineers, lies in carefully defining tasks and utilizing blueprints to guide the agents. Blueprints act as a framework, weaving together agent skills with deterministic code to ensure both efficiency and adaptability.

Future Trends: What’s Next for AI Coding Agents?

The success of Minions suggests several potential future trends:

  • Increased Task Complexity: As agents become more sophisticated, they will be able to handle increasingly complex tasks, potentially automating entire features or modules.
  • Self-Improving Agents: Agents may learn from their successes and failures, continuously improving their performance and reducing the need for human intervention.
  • Domain-Specific Agents: We can expect to see the development of specialized agents tailored to specific industries or programming languages.
  • Enhanced Blueprinting Tools: Tools for creating and managing blueprints will become more user-friendly and powerful, allowing developers to easily define and orchestrate complex tasks.

FAQ

Q: Will AI coding agents replace developers?
A: No, the current focus is on augmenting developer productivity, not replacing developers entirely. Human review remains a critical part of the process.

Q: What are “blueprints” in the context of Stripe’s Minions?
A: Blueprints are workflows defined in code that specify how tasks are divided into subtasks and handled by either deterministic routines or the agent.

Q: How does Stripe ensure the reliability of code generated by Minions?
A: Stripe uses CI/CD pipelines, automated tests, and static analysis to ensure generated changes meet engineering standards before human review.

Q: What types of tasks are Minions best suited for?
A: Minions perform best on well-defined tasks such as configuration adjustments, dependency upgrades, and minor refactoring.

Pro Tip: Explore the Stripe developer blog for more in-depth technical details about the Minions project: https://stripe.dev/blog/minions-stripes-one-shot-end-to-end-coding-agents

What are your thoughts on the future of AI-powered coding? Share your insights in the comments below!

Tech

QCon London 2026: Ontology-Driven Observability: Building the E2E Knowledge Graph at Netflix Scale

by Chief Editor March 18, 2026

The Future of Observability: Netflix Pioneers the “Knowledge Graph” Approach

Netflix is pushing the boundaries of observability, moving beyond traditional monitoring to a system built on interconnected knowledge. Engineers Prasanna Vijayanathan and Renzo Sanchez-Silva recently presented their approach at QCon London 2026, detailing how a knowledge graph is transforming how the streaming giant understands and responds to issues across its vast infrastructure.

From Siloed Data to a Unified View: The Challenge of E2E Observability

Traditional observability often struggles with fragmented data. Metrics, events, logs, and traces exist in silos, making it difficult to correlate information and pinpoint root causes. This is the core challenge of End-to-End (E2E) observability – the ability to monitor a complex system from the user interface down to the underlying infrastructure. Netflix’s approach directly addresses these issues.

The MELT Layer: A Foundation for Unified Observability

Central to Netflix’s strategy is the MELT Layer (Metrics, Events, Logs, Traces). This unified layer aims to improve incident resolution time by consolidating observability data. It’s a crucial step towards breaking down silos and providing a more holistic view of system health.

Ontology: Encoding Knowledge for Machine Understanding

But simply collecting data isn’t enough. Netflix leverages the power of Ontology – a formal specification of types, properties, and relationships – to encode knowledge about its systems. This isn’t just about the data itself, but about understanding the connections between data points. The fundamental unit of this knowledge is the Triple: (Subject | Predicate | Object), representing a single fact within the knowledge graph.

For example, a triple might state: “api-gateway | rdf:type | ops:Application,” defining the api-gateway as an application. Another could be: “INC-5377 | ops:affects | api-gateway,” indicating that incident INC-5377 impacts the api-gateway.
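The two example triples above can be represented and queried with nothing more than plain tuples. This is a minimal sketch using the article’s own predicates (`rdf:type`, `ops:affects`) – a stand-in for illustration, not Netflix’s actual graph store:

```python
# A tiny in-memory triple store: each fact is a (subject, predicate, object) tuple.
triples = {
    ("api-gateway", "rdf:type", "ops:Application"),
    ("INC-5377", "ops:affects", "api-gateway"),
    ("INC-5377", "rdf:type", "ops:Incident"),
}

def objects(subject, predicate):
    """Answer the pattern query (subject, predicate, ?): what does this edge point at?"""
    return {o for s, p, o in triples if s == subject and p == predicate}

def incidents_affecting(app):
    """Walk the graph backwards: which subjects have an ops:affects edge to app?"""
    return {s for s, p, o in triples if p == "ops:affects" and o == app}

print(incidents_affecting("api-gateway"))  # {'INC-5377'}
print(objects("api-gateway", "rdf:type"))  # {'ops:Application'}
```

At Netflix scale the store and query engine are far more sophisticated, but the unit of knowledge stays exactly this simple: one fact, one triple.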

12 Operational Namespaces: Connecting the Netflix Universe

To manage the complexity of its infrastructure, Netflix utilizes 12 Operational Namespaces – including Slack, Alerts, Metrics, Logs, and Incidents – to categorize and connect all elements. The ontology captures, structures, and preserves this information in a machine-readable format, transforming operational chaos into a structured understanding.

The Knowledge Flywheel: Continuous Learning and Adaptation

Netflix’s system isn’t static. The Knowledge Flywheel embodies a continuous learning loop. It operates through three states – Observe, Enrich, and Infer – constantly adapting and improving its understanding of the system. This flywheel is integrated with a development process utilizing Claude, where the AI proposes code changes (pull requests) that are then reviewed and merged by human engineers.

This integration of AI and human expertise is a key element, allowing for automated improvements while maintaining control and oversight.

Future Trends: Automation and Self-Healing Infrastructure

Netflix’s vision extends beyond simply understanding incidents. They aim to automate root cause analysis, provide auto-remediation, and ultimately create a self-healing infrastructure. This represents a significant leap forward in operational efficiency and reliability.

The Rise of AI-Powered Observability

The integration of AI, as demonstrated by the use of Claude, is a major trend. Expect to see more AI-powered tools that can automatically analyze observability data, identify anomalies, and even suggest solutions. This will free up engineers to focus on more strategic tasks.

Knowledge Graphs as the New Standard

Netflix’s knowledge graph approach is likely to become a standard practice. By representing infrastructure as interconnected entities, organizations can gain a deeper understanding of their systems and improve their ability to respond to incidents.

Shift Towards Proactive Observability

The goal is to move beyond reactive monitoring to proactive observability – predicting and preventing issues before they impact users. This requires sophisticated analytics and machine learning algorithms that can identify patterns and anomalies.

FAQ

What is an ontology in the context of observability?
An ontology is a formal specification of types, properties, and relationships, used to encode knowledge about a system and its components.

What is the MELT layer?
The MELT layer (Metrics, Events, Logs, Traces) is a unified observability layer designed to consolidate data and improve incident resolution time.

What is a Triple?
A Triple is a tuple (Subject | Predicate | Object) that defines one fact in a knowledge graph.

How does Netflix use AI in its observability system?
Netflix uses AI, specifically Claude, to propose code changes and automate parts of the observability workflow.

What are the 12 Operational Namespaces?
These are categories used by Netflix to organize and connect all elements of its infrastructure, including Slack, Alerts, Metrics, Logs, and Incidents.

Did you know? The concept of a knowledge graph isn’t new, but its application to large-scale observability, as demonstrated by Netflix, is a significant advancement.

Pro Tip: Start small when implementing observability solutions. Focus on identifying key metrics and events, and gradually expand your coverage as you gain experience.

Want to learn more about modern data engineering practices? Explore our other articles on data architecture and observability tools.

Tech

QCon London 2026: Behind Booking.com’s AI Evolution: The Unpolished Story

by Chief Editor March 17, 2026

Booking.com’s AI Journey: Lessons for the Future of Data-Driven Platforms

Booking.com’s evolution from Perl scripts and MySQL databases to a sophisticated AI platform, as detailed at QCon London 2026 by Senior Principal Engineer Jabez Eliezer Manuel, offers valuable insights into the challenges and triumphs of scaling AI within a large organization. The presentation, “Behind Booking.com’s AI Evolution: The Unpolished Story,” highlighted a 20-year journey marked by pragmatic experimentation and a willingness to adapt.

The Power of Data-Driven DNA

In 2005, Booking.com began extensive A/B testing, running over 1,000 experiments concurrently and accumulating 150,000 total experiments. Despite a less than 25% success rate, the company prioritized rapid learning over immediate results, fostering a “Data-Driven DNA” that continues to shape its approach to innovation. This early commitment to experimentation laid the groundwork for future AI initiatives.

From Hadoop to a Unified Platform: A Migration Story

Booking.com initially leveraged Apache Hadoop for distributed storage and processing, building two on-premise clusters with approximately 60,000 cores and 200 PB of storage by 2011. However, limitations such as noisy neighbors, lack of GPU support, and capacity issues eventually led to a seven-year migration away from Hadoop. The migration strategy involved mapping the entire ecosystem, analyzing usage to reduce scope, applying the PageRank algorithm, migrating in waves, and finally phasing out Hadoop. A unified command center proved crucial to this complex undertaking.

The Evolution of the Machine Learning Stack

The company’s machine learning stack has undergone significant transformation, evolving from Perl and MySQL in 2005 to agentic systems in 2025. Key technologies along the way included Apache Oozie with Python, Apache Spark with MLlib, and H2O.ai. 2015 marked a turning point with the resolution of challenges in real-time predictions and feature engineering. As of 2024, the platform handles over 400 billion predictions daily with a latency of less than 20 milliseconds, powered by more than 480 machine learning models.

Domain-Specific AI Platforms

Booking.com has developed four distinct domain-specific machine learning platforms:

  • GenAI: Used for trip planning, smart filters, and review summaries.
  • Content Intelligence: Focused on image and review analysis, and text generation for detailed hotel content.
  • Recommendations: Delivering personalized content to customers.
  • Ranking: A complex platform optimizing for choice and value, exposure and growth, and efficiency and revenue.

The initial ranking formula, a simple function of bookings, views, and a random number, proved surprisingly resilient to machine learning replacements due to infrastructure limitations. The company adopted an interleaving technique for A/B testing, allowing for more variants with less traffic, followed by validation with traditional A/B testing.
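The interleaving idea mentioned above – merging results from competing rankers into one list shown to each user, so several variants can be compared on the same traffic – can be sketched in a few lines. This is a simplified alternating merge; production systems typically use randomized team-draft variants with click attribution, and this is not Booking.com’s actual implementation:

```python
def interleave(ranking_a, ranking_b):
    """Merge two ranked lists by alternating picks, skipping duplicates.
    Clicks on each ranker's contributed items would then decide the winner."""
    merged, seen = [], set()
    queues = [list(ranking_a), list(ranking_b)]
    turn = 0
    while any(queues):
        q = queues[turn % 2]
        while q and q[0] in seen:   # drop items the other ranker already placed
            q.pop(0)
        if q:
            item = q.pop(0)
            merged.append(item)
            seen.add(item)
        turn += 1
    return merged

print(interleave(["h1", "h2", "h3"], ["h2", "h4", "h1"]))
# ['h1', 'h2', 'h3', 'h4']
```

Because every user sees a mixture of both rankings, far less traffic is needed per variant than in a classic 50/50 A/B split – which is exactly why interleaving results are then confirmed with a traditional A/B test.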

Future Trends: What Lies Ahead?

Booking.com’s journey highlights several key trends likely to shape the future of AI-powered platforms:

  • Unified Orchestration Layers: The convergence of domain-specific AI platforms into a unified orchestration layer, as demonstrated by Booking.com, will become increasingly common. This allows for greater synergy and efficiency.
  • Pragmatic AI Adoption: The emphasis on learning from failures and iterating quickly, rather than striving for perfection, will be crucial for successful AI implementation.
  • Infrastructure as a Limiting Factor: Infrastructure limitations can significantly impact the effectiveness of even the most sophisticated algorithms. Investing in scalable and robust infrastructure is paramount.
  • The Importance of Data Management: Effective data management, including strategies for handling large datasets and ensuring data quality, remains a foundational element of any successful AI initiative.

FAQ

Q: What was the biggest challenge Booking.com faced during its AI evolution?
A: Migrating away from Hadoop proved to be a significant undertaking, requiring a seven-year phased approach.

Q: What is the current latency of Booking.com’s machine learning inference platform?
A: Less than 20 milliseconds.

Q: What is “interleaving” in the context of A/B testing?
A: A technique where results from competing ranking variants are interwoven into a single experiment, allowing more variants to be tested with less traffic.

Q: What technologies did Booking.com use in its machine learning stack?
A: Perl, MySQL, Apache Oozie, Python, Apache Spark, MLlib, H2O.ai, deep learning, and GenAI.

Did you know? Booking.com’s initial A/B testing experiments had a less than 25% success rate, but the focus was on learning, not immediate results.

Pro Tip: Don’t be afraid to experiment and fail fast. A culture of learning from mistakes is essential for successful AI adoption.

Want to learn more about the latest trends in AI and machine learning? Explore our other articles or subscribe to our newsletter for regular updates.

Tech

AWS Launches Strands Labs for Experimental AI Agent Projects

by Chief Editor March 12, 2026

AWS Unveils Strands Labs: A Playground for the Future of AI Agents

Amazon Web Services (AWS) has launched Strands Labs, a new GitHub organization dedicated to experimental AI agent development. This move signals a significant investment in the rapidly evolving field of agentic AI, offering developers a sandbox to explore cutting-edge approaches beyond the constraints of production-ready software.

Robots Take Center Stage: Bridging the Physical and Digital Worlds

A core focus of Strands Labs is robotics. The Strands Robots project aims to connect AI agents directly with physical hardware. This isn’t about remote control; it’s about agents that can perceive their environment, interpret instructions, and take action autonomously. Demonstrations showcase an agent controlling an SO-101 robotic arm using the NVIDIA GR00T model, a vision-language-action (VLA) model.

The integration with LeRobot further simplifies the process of interacting with robotics hardware and datasets. This combination allows developers to build agents capable of processing visual data, understanding commands, and performing physical tasks – a crucial step towards more versatile and adaptable robots.

Simulation as a Stepping Stone: The Power of Strands Robots Sim

Recognizing the challenges of working directly with physical robots, Strands Labs also offers Strands Robots Sim. This project provides a simulation environment where developers can test and refine their agents without the risks and costs associated with real-world hardware. The simulator supports environments from the Libero robotics benchmark and integrates VLA policies, allowing for iterative experimentation and debugging.

Pro Tip: Simulation environments are invaluable for rapid prototyping and testing different agent behaviors before deploying them to physical robots. This significantly reduces development time and potential damage to hardware.

AI Functions: A New Paradigm for Software Development

Beyond robotics, Strands Labs is exploring innovative approaches to software development itself. The AI Functions project introduces a novel concept: defining function behavior using natural language descriptions and validation conditions. The @ai_function decorator then triggers the Strands agent loop to generate code that meets the specified criteria.

This “specification-driven programming” approach represents a potential shift in how software is created, allowing developers to focus on *what* they want a function to do, rather than *how* to implement it. The system automatically retries if validation fails, ensuring the generated code meets the defined requirements. The framework can generate code that performs tasks like parsing files and data transformations, returning standard Python objects.
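The specification-driven flow described above – a natural-language spec, a validation condition, and automatic retries until the generated code passes – can be sketched as a decorator. Everything here is a hypothetical stand-in: the real `@ai_function` decorator lives in the Strands Labs repository, and `codegen` below is a toy substitute for the Strands agent loop:

```python
import functools

def ai_function(validate, max_retries=3, codegen=None):
    """Hypothetical sketch: the docstring is the spec, `validate` the
    acceptance check, `codegen` a stand-in for the agent that writes code."""
    def wrap(spec_fn):
        @functools.wraps(spec_fn)
        def impl(*args, **kwargs):
            for _ in range(max_retries):
                candidate = codegen(spec_fn.__doc__)   # agent generates an implementation
                result = candidate(*args, **kwargs)
                if validate(result):                   # retry until the spec is met
                    return result
            raise ValueError("no candidate satisfied the validation condition")
        return impl
    return wrap

# Toy "agent": returns a concrete implementation for the spec below.
toy_codegen = lambda spec: (lambda csv_line: [f.strip() for f in csv_line.split(",")])

@ai_function(validate=lambda out: isinstance(out, list), codegen=toy_codegen)
def parse_fields(csv_line):
    """Split a comma-separated line into a list of trimmed fields."""

print(parse_fields(" a, b ,c "))  # ['a', 'b', 'c']
```

Note that the function body is empty by design: the developer supplies only the *what* (docstring and validator), and the agent supplies the *how*.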

Community Response and Future Implications

The launch of Strands Labs has generated excitement within the AI development community. Clare Liguori, Senior Principal Engineer at AWS, described Strands Labs as “a playground for the next generation of ideas for AI agent development.” Others have highlighted the potential of AI Functions to revolutionize software development workflows.

Did you know? The Strands Agents SDK, upon which Strands Labs builds, has already been downloaded over 14 million times since its open-source release in May 2025, demonstrating strong developer interest in agentic AI.

FAQ

What is Strands Labs? Strands Labs is a new GitHub organization from AWS dedicated to experimental AI agent development.

What are the key projects in Strands Labs? The initial projects are Robots, Robots Sim, and AI Functions.

What is the NVIDIA GR00T model? GR00T is a vision-language-action (VLA) model used to control robots based on visual input and language instructions.

What is specification-driven programming? It’s an approach where developers define the desired behavior of a function using natural language and validation rules, and an AI agent generates the code to implement it.

Explore the projects and contribute to the future of agentic AI at Strands Labs on GitHub.

Business

Next Moca Releases Agent Definition Language as an Open Source Specification

by Chief Editor February 9, 2026

The Rise of Agent Definition Languages: A New Standard for AI’s Future

The artificial intelligence landscape is rapidly evolving beyond simple chatbots and one-off prompts. We’re entering the era of AI agents – autonomous entities capable of reasoning, utilizing tools, accessing knowledge, and orchestrating complex workflows. But with this advancement comes a critical challenge: a lack of standardization. Every platform and team defines “agents” differently, leading to fragmentation and hindering scalability. Now, a new open-source standard, the Agent Definition Language (ADL), aims to solve this problem.

What is ADL and Why Does it Matter?

Developed by Next Moca and released under the Apache 2.0 license, ADL is essentially a blueprint for AI agents. It provides a vendor-neutral, declarative format for defining everything an agent *is* and *can do*. This includes its identity, purpose, the language model it uses, the tools it has access to, its permissions, how it accesses information (through Retrieval Augmented Generation or RAG), and even governance metadata like ownership and version history.

Think of it like this: OpenAPI defines APIs, allowing different systems to communicate seamlessly. ADL aims to do the same for AI agents. As Kiran Kashalkar, founder of Next Moca, puts it, ADL is “Think OpenAPI (Swagger) for agents.”
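To make the analogy concrete, here is what a minimal ADL-style definition might look like. The field names below are illustrative guesses covering the aspects listed above (identity, purpose, model, tools, permissions, RAG, governance); the authoritative schema lives in the ADL repository:

```json
{
  "adl_version": "0.1",
  "agent": {
    "name": "refund-support-agent",
    "purpose": "Answer refund questions and escalate disputed charges",
    "model": "gpt-4o",
    "tools": ["lookup_order", "create_refund_ticket"],
    "permissions": { "can_issue_refunds": false },
    "rag": { "knowledge_base": "refund-policy-docs" },
    "governance": { "owner": "payments-team", "version": "1.2.0" }
  }
}
```

Everything a reviewer needs – what the agent may touch, which model it runs on, who owns it – sits in one declarative, machine-readable document.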

Addressing the Fragmentation Problem

Currently, agent definitions are often scattered across various formats – YAML files, code embedded configurations, proprietary JSON fields – making it difficult to understand an agent’s capabilities and boundaries. This lack of clarity poses significant challenges for security reviews, compliance, and reuse. ADL consolidates these definitions into a single, machine-readable format, enhancing inspectability and governance.

Pro Tip: A standardized definition layer like ADL allows for consistent validation in CI/CD pipelines, ensuring agents meet predefined standards before deployment.
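A CI/CD validation step of the kind the tip describes could be as simple as the sketch below. The required fields are hypothetical; a real pipeline would validate against the published ADL schema rather than this hand-rolled check:

```python
REQUIRED = {"name", "purpose", "model", "tools", "permissions"}

def validate_adl(definition: dict) -> list[str]:
    """Return a list of validation errors for a hypothetical ADL document.
    An empty list means the definition passes the gate."""
    agent = definition.get("agent")
    if not isinstance(agent, dict):
        return ["missing top-level 'agent' object"]
    errors = [f"missing required field: {f}" for f in sorted(REQUIRED - agent.keys())]
    if not isinstance(agent.get("tools", []), list):
        errors.append("'tools' must be a list")
    return errors

print(validate_adl({"agent": {"name": "demo", "model": "gpt-4o"}}))
```

Running a check like this on every pull request is what turns the definition layer into an enforceable governance boundary rather than just documentation.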

How ADL Works: A Declarative Approach

ADL is a declarative language, meaning it focuses on *what* an agent should do, not *how* it should do it. It doesn’t define runtime behavior or agent-to-agent communication protocols. Instead, it provides a clear specification of the agent’s characteristics, allowing different platforms and frameworks to interpret and execute it.

This framework-agnostic approach is crucial for portability. Developers can define an agent once using ADL and then deploy it across various platforms without modification. This reduces vendor lock-in and promotes interoperability.

Beyond Definition: The Future of Agent Management

The release of ADL is just the beginning. The open-source nature of the project encourages community contributions and the development of an ecosystem of tools around the standard. This could include:

  • Editors: User-friendly interfaces for creating and managing ADL definitions.
  • Validators: Tools for ensuring ADL definitions are valid and conform to the specification.
  • Registries: Centralized repositories for storing and sharing ADL definitions.
  • Testing Tools: Automated tests for verifying agent behavior based on its ADL definition.

This ecosystem will streamline the entire agent lifecycle, from development and deployment to monitoring and maintenance.

ADL and Existing Technologies

ADL isn’t intended to replace existing technologies like A2A (agent-to-agent communication), MCP, OpenAPI, or workflow engines. Instead, it complements them. ADL defines the agent itself, while these other technologies handle communication, execution, and orchestration.

Did you know? ADL focuses on the “what” of an agent, while other technologies focus on the “how.”

Real-World Applications

The potential applications of ADL are vast. Consider these examples:

  • Customer Support: Defining agents that can handle specific customer inquiries, access knowledge bases, and escalate complex issues.
  • Fraud Detection: Creating agents that can analyze transactions, identify suspicious patterns, and flag potential fraud.
  • HR Automation: Developing agents that can automate tasks like onboarding, benefits administration, and employee inquiries.

In each of these scenarios, ADL provides a standardized way to define the agent’s capabilities, permissions, and governance policies.

Frequently Asked Questions (FAQ)

Q: Is ADL a runtime environment?
A: No, ADL is a definition language. It doesn’t execute code or manage agent workflows. It simply defines what an agent is and what it can do.

Q: Is ADL tied to a specific programming language?
A: No, ADL is model-agnostic and platform-agnostic. It’s based on JSON, a widely supported data format.

Q: How can I contribute to the ADL project?
A: The ADL repository on GitHub ([https://github.com/nextmoca/adl](https://github.com/nextmoca/adl)) provides contribution guidelines and a public roadmap.

Q: What are the benefits of using ADL?
A: Portability, auditability, vendor neutrality, and improved governance are key benefits.

The open-sourcing of ADL marks a significant step towards a more standardized and scalable future for AI agents. By providing a common language for defining these powerful entities, ADL empowers developers, enhances security, and unlocks new possibilities for innovation.

Explore the ADL project on GitHub: https://github.com/nextmoca/adl

Tech

Google Supercharges Gemini 3 Flash with Agentic Vision

by Chief Editor February 6, 2026

AI Just Got a New Pair of Eyes: How Agentic Vision Will Change Everything

For years, artificial intelligence has struggled with a surprisingly human task: truly seeing. AI models could identify objects in images, but lacked the ability to investigate, to zoom in on details, or to reason about what they were looking at. That’s changing with the introduction of Agentic Vision in Google’s Gemini 3 Flash, a capability that’s poised to redefine how AI interacts with the visual world.

From Static Glance to Active Investigation

Traditionally, AI models like Gemini processed images with a single, static look. Miss a crucial detail – a serial number, a subtle sign – and the AI was forced to guess. Agentic Vision flips this script. It transforms image understanding into an active process, treating vision as an investigation. Instead of simply receiving an image, Gemini 3 Flash now plans how to examine it.

This process relies on a “think -> act -> observe” loop. First, the model analyzes the user’s request and the image. Then, it generates and executes Python code to manipulate the image – cropping, zooming, annotating – and extract more information. Finally, the transformed image is added to the model’s context, allowing it to refine its understanding before providing an answer.
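The “think -> act -> observe” loop can be sketched abstractly. Here a plain 2D list stands in for pixel data, and `plan`/`answer` are illustrative stubs, not the Gemini API; the real system generates and executes arbitrary Python (crops, zooms, annotations) rather than the single `crop` tool shown:

```python
def crop(image, top, left, height, width):
    """'Act': tool code that transforms the image (here, a simple crop/zoom)."""
    return [row[left:left + width] for row in image[top:top + height]]

def agentic_vision_loop(image, request, plan, answer, max_steps=3):
    context = [image]                         # observations accumulate here
    for _ in range(max_steps):
        action = plan(request, context)       # think: decide the next inspection
        if action is None:                    # model is satisfied; stop looking
            break
        context.append(crop(context[-1], *action))  # act, then observe the result
    return answer(request, context)           # answer from the enriched context

# Toy run: zoom into the bottom-right quadrant of a 4x4 "image" once.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
plan = lambda req, ctx: (2, 2, 2, 2) if len(ctx) == 1 else None
answer = lambda req, ctx: ctx[-1]
print(agentic_vision_loop(image, "read the corner", plan, answer))
# [[10, 11], [14, 15]]
```

The essential move is that each transformed view is appended back into the model’s context, so later reasoning steps see what earlier actions uncovered.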

The Power of Code Execution: Solving the “Hard Problems”

The key to Agentic Vision’s success lies in its ability to execute code. This allows for incredibly precise inspection of images. For example, Gemini can now reliably count the digits on a hand, a task that has historically stumped AI systems. It achieves this by drawing bounding boxes and labels directly onto the image, a “visual scratchpad” that grounds its reasoning in pixel-perfect understanding.

Beyond object counting, code execution also enables visual arithmetic and data visualization. Complex, image-based math problems can be offloaded to Python and Matplotlib, reducing the likelihood of AI “hallucinations” – those confidently incorrect answers that plague many current systems. Google reports a 5-10% accuracy improvement on vision tasks across most benchmarks as a result of this approach.

Beyond Gemini: The Future of Agentic Vision

Google’s vision for Agentic Vision extends far beyond the current capabilities of Gemini 3 Flash. The roadmap includes making the process more implicit, so the AI automatically zooms and rotates images without explicit instructions. Adding tools like web search and reverse image search will further enhance the model’s ability to gather evidence and contextualize its understanding.

The implications are significant, particularly for robotics. As one Redditor noted, Agentic Vision could unlock visual reasoning for AI in physical robots, giving them a much richer understanding of their surroundings and enabling more sophisticated agentic capabilities. While ChatGPT has experimented with similar code execution features, it still struggles with tasks like counting fingers.

Agentic Vision is currently accessible through the Gemini API in Google AI Studio and Vertex AI, and is rolling out in the Gemini app’s Thinking mode.

Pro Tip

Experiment with the “Code Execution” setting in the AI Studio Playground to see Agentic Vision in action. Try posing complex image-based questions to Gemini 3 Flash and observe how it uses code to arrive at its answers.

FAQ

What is Agentic Vision?
Agentic Vision is a new capability in Gemini 3 Flash that allows the AI to actively investigate images by planning steps, manipulating the image, and using code to verify details.

How does Agentic Vision improve accuracy?
It improves accuracy by enabling fine-grained inspection of details and reducing hallucinations through code execution and visual arithmetic.

Is Agentic Vision available now?
Yes, it’s accessible through the Gemini API in Google AI Studio and Vertex AI, and is rolling out in the Gemini app.

Will Agentic Vision be available in other Gemini models?
Google plans to extend support to other models in the Gemini family beyond Flash.

What are the potential applications of Agentic Vision?
Potential applications include robotics, image analysis, and any task requiring detailed visual understanding.

Did you know? Agentic Vision allows Gemini 3 Flash to not just *see* an image, but to actively *investigate* it, leading to more accurate and reliable results.

Want to learn more about the latest advancements in AI? Explore our other articles or subscribe to our newsletter for regular updates.

Tech

Google’s Universal Commerce Protocol (UCP) Powers Agentic Shopping

by Chief Editor January 25, 2026
written by Chief Editor

Google’s UCP: The Dawn of Agentic Commerce and What It Means for Your Business

Google recently unveiled the Universal Commerce Protocol (UCP), and it’s more than just another tech announcement. It’s a foundational shift in how online shopping will work, particularly as AI-powered shopping assistants – or “agents” – become increasingly prevalent. This open-source standard aims to streamline the entire buying process, from product discovery to final payment, and it has the potential to reshape the competitive landscape for businesses of all sizes.

The ‘N by N’ Problem Solved: Why UCP Matters

For years, online retailers have grappled with the “N by N” integration problem. Every new shopping platform, every new sales channel, required a separate, often complex, integration. This was costly, time-consuming, and a major barrier to entry for smaller businesses. UCP tackles this head-on by creating a standardized “common language” for commerce. Think of it as a universal translator for shopping, allowing AI agents to seamlessly interact with any business that adopts the protocol.

This isn’t just about convenience; it’s about speed. According to Statista, global e-commerce sales were projected to reach $6.3 trillion in 2024. Consumers expect instant gratification, and UCP is designed to deliver that by eliminating friction in the checkout process.

How UCP Works: A Deep Dive into the Technology

UCP works in conjunction with the Agent Payments Protocol (AP2) and Agent-to-Agent (A2A) communication, creating a secure and flexible ecosystem. Businesses can connect via APIs, or through existing infrastructure like Shopify and Merchant Center. Crucially, UCP separates payment instruments from handlers, meaning it can work with a wide range of payment providers – Google Wallet, PayPal, credit cards, and more – without requiring constant updates.
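The separation of payment instruments from handlers is the key design choice here. A minimal sketch of the pattern (hypothetical names, not the actual UCP SDK): checkout code dispatches on the instrument's kind and never hard-codes a provider, so adding a new payment provider means registering a new handler, not rewriting the flow.

```python
# Illustrative sketch of separating payment *instruments* (what the buyer
# presents) from *handlers* (who processes them). Names are hypothetical.
from dataclasses import dataclass
from typing import Protocol

@dataclass
class PaymentInstrument:
    kind: str    # e.g. "card", "wallet"
    token: str   # opaque provider token; raw credentials never touch checkout

class PaymentHandler(Protocol):
    def charge(self, instrument: PaymentInstrument, amount_cents: int) -> str: ...

class WalletHandler:
    def charge(self, instrument: PaymentInstrument, amount_cents: int) -> str:
        # A real handler would call out to the provider here.
        return f"wallet-charged:{amount_cents}"

HANDLERS: dict[str, PaymentHandler] = {"wallet": WalletHandler()}

def checkout(instrument: PaymentInstrument, amount_cents: int) -> str:
    # Checkout logic dispatches by kind; new providers plug in via HANDLERS.
    return HANDLERS[instrument.kind].charge(instrument, amount_cents)

print(checkout(PaymentInstrument("wallet", "tok_123"), 4999))  # wallet-charged:4999
```

This is why, as described above, new payment providers can be added "without requiring constant updates" to the merchant's integration.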

Pro Tip: Don’t get bogged down in the technical details. The key takeaway is that UCP simplifies integration, allowing businesses to focus on what they do best: creating great products and providing excellent customer service.

The Big Players Are Onboard: Shopify, Etsy, and More

Google isn’t going it alone. The development of UCP has been a collaborative effort, with major players like Shopify, Etsy, Wayfair, Target, and Walmart all contributing. This widespread support is a strong indicator that UCP is poised to become the industry standard. Over 20 global partners have already endorsed the protocol, signaling a broad commitment to its success.

The ‘Default Economy’ Debate: Will Smaller Brands Be Left Behind?

The launch of UCP hasn’t been without its critics. Andy Reid, Chief Innovation Officer, raised a valid concern on LinkedIn: could UCP lead to a “default economy” where only one brand is surfaced as the optimal choice by AI agents? This raises the specter of larger brands dominating search results, potentially squeezing out smaller competitors.

However, James Massey, AI lead at Google, countered that UCP actually *benefits* smaller brands. By becoming “discoverable” through the protocol, smaller businesses can gain visibility without relying on expensive advertising. If their product is the most relevant, the agent can surface it, regardless of brand recognition. Massey emphasized the importance of “data quality” – ensuring accurate product information and compelling descriptions – as the key to success.

Did you know? High-quality product data is becoming increasingly important for SEO and discoverability, even *without* AI agents. Investing in accurate and detailed product descriptions can pay dividends across multiple channels.

Beyond the Checkout Button: The Future of Agentic Commerce

UCP isn’t just about simplifying the checkout process. It’s about enabling a new era of agentic commerce, where AI assistants can handle everything from product discovery to personalized recommendations to automated reordering. Imagine an agent proactively suggesting a replacement for a product you’re running low on, and completing the purchase with a single voice command.

This future is closer than you think. Google’s reference implementation already allows purchases via AI Mode in Search and Gemini, using Google Wallet or other compatible payment methods. Developers can leverage Python-based SDKs to rapidly integrate UCP into their applications, unlocking a wealth of new possibilities.

Real-World Implications: What Businesses Need to Do Now

While UCP is still in its early stages, businesses should start preparing now. Here’s what you need to focus on:

  • Optimize Your Product Data: Ensure your product information is accurate, complete, and compelling.
  • Explore UCP Integration: If you use platforms like Shopify, investigate how to integrate with UCP.
  • Monitor the Landscape: Stay informed about the latest developments in agentic commerce and UCP.

FAQ: Universal Commerce Protocol Explained

  • What is UCP? UCP is an open-source standard designed to streamline commerce on AI-powered platforms.
  • Who developed UCP? Google developed UCP in collaboration with major retailers like Shopify, Etsy, and Walmart.
  • How will UCP benefit my business? UCP simplifies integration, reduces costs, and increases discoverability for your products.
  • Is UCP secure? Yes, UCP integrates with the Agent Payments Protocol (AP2) for secure payments.
  • Where can I learn more about UCP? Visit the Google Developers Blog and the UCP GitHub repository.

The Universal Commerce Protocol represents a significant step towards a more seamless and efficient online shopping experience. By embracing this new standard, businesses can position themselves for success in the age of AI-powered commerce.

Want to learn more about the future of e-commerce? Explore our other articles on AI and retail or subscribe to our newsletter for the latest insights.

Tech

Google Releases Gemma Scope 2 to Deepen Understanding of LLM Behavior

by Chief Editor January 12, 2026
written by Chief Editor

The Dawn of AI Transparency: How ‘Microscopes’ Like Gemma Scope 2 Are Reshaping AI Safety

For years, artificial intelligence has operated as something of a “black box.” We see the outputs – the generated text, the image creations, the predictive analyses – but understanding how an AI arrives at those conclusions has remained a significant challenge. That’s changing, rapidly, with the emergence of tools like Google’s Gemma Scope 2. This isn’t just about academic curiosity; it’s about building trust, mitigating risks, and unlocking the full potential of increasingly powerful AI systems.

Peeking Inside the AI Mind: What is Gemma Scope 2?

Gemma Scope 2 is essentially a suite of analytical tools designed to dissect the inner workings of Google’s Gemma 3 large language models (LLMs). Think of it as a high-powered microscope for AI. It leverages techniques like sparse autoencoders (SAEs) and transcoders to allow researchers to inspect the internal representations within the model. This means they can examine what the AI is “thinking” at each step and how those internal states influence its behavior. The primary goal? To identify and address potential safety issues like unintended biases, susceptibility to “jailbreaks” (where users trick the AI into harmful responses), and the generation of false information (hallucinations).

The original Gemma Scope focused on the Gemma 2 family of models. Gemma Scope 2 significantly expands on this, applying its analytical power to the more advanced Gemma 3, including its sophisticated skip-transcoders and cross-layer transcoders. These advancements are crucial for understanding the complex, multi-layered computations happening within these models.

Pro Tip: Sparse autoencoders and transcoders are key to this process. SAEs decompose and reconstruct LLM inputs, while transcoders approximate the output of specific layers, revealing which parts of the model are activated by particular inputs.
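The core mechanic of an SAE can be shown in a few lines. This is a toy forward pass, not the trained tools Gemma Scope ships: a model activation is projected into a much wider feature space, a ReLU keeps only a sparse subset of features active, and a decoder reconstructs the original activation from them. All dimensions and weights below are illustrative.

```python
# Toy sparse autoencoder (SAE) forward pass in NumPy. Real SAEs are
# trained so each feature tracks an interpretable concept; here the
# weights are random and only the mechanics are shown.
import numpy as np

rng = np.random.default_rng(0)
d_model, d_features = 8, 32              # overcomplete: many more features than dims
W_enc = rng.normal(size=(d_model, d_features))
W_dec = rng.normal(size=(d_features, d_model))
b_enc = np.zeros(d_features)

def sae_forward(activation: np.ndarray):
    # Encode: project into the wide feature space; ReLU induces sparsity.
    features = np.maximum(activation @ W_enc + b_enc, 0.0)
    # Decode: reconstruct the original activation from the sparse features.
    reconstruction = features @ W_dec
    return features, reconstruction

x = rng.normal(size=d_model)             # stand-in for one internal activation
feats, recon = sae_forward(x)
print(f"{(feats > 0).mean():.2f} of features active")
```

Researchers then ask which features fire on which inputs; a transcoder works similarly but is trained to approximate a whole layer's output rather than reconstruct its input.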

Why AI Interpretability Matters Now More Than Ever

As AI models become more capable, the need for interpretability grows exponentially. Consider the increasing use of AI in critical applications like healthcare diagnostics, financial risk assessment, and even autonomous vehicles. A lack of understanding about why an AI made a particular decision is simply unacceptable in these contexts. Interpretability isn’t just about safety; it’s about accountability and building public confidence.

Recent data from a Gartner report shows that while generative AI is at the peak of inflated expectations, a major barrier to wider adoption is a lack of trust and understanding of how these systems work. Tools like Gemma Scope 2 are directly addressing this concern.

Beyond Security: The Broader Implications of AI Microscopes

While security is a primary driver for developing these “AI microscopes,” the potential applications extend far beyond simply preventing malicious use. Researchers can use these tools to:

  • Improve Model Performance: Identify areas where the model is struggling and refine its training data or architecture.
  • Understand Emergent Behaviors: LLMs sometimes exhibit unexpected capabilities. Interpretability tools can help us understand how these behaviors arise.
  • Develop More Robust AI: Build AI systems that are less susceptible to adversarial attacks and more reliable in real-world scenarios.
  • Inform Fine-Tuning: As redditor Mescalian pointed out, these tools can help optimize AI capabilities through targeted adjustments to model weights.

It’s not just Google leading the charge. Anthropic and OpenAI have also released their own interpretability tools, demonstrating a growing industry-wide recognition of the importance of AI transparency.

The Future of AI: Towards Explainable and Controllable Systems

The development of Gemma Scope 2 and similar tools signals a significant shift in the AI landscape. We’re moving away from opaque “black box” models towards more explainable and controllable systems. This trend is likely to accelerate in the coming years, driven by several factors:

  • Increased Regulatory Pressure: Governments around the world are beginning to develop regulations for AI, many of which will require a degree of transparency and accountability.
  • Growing Demand for Trustworthy AI: Businesses and consumers are increasingly demanding AI systems they can trust.
  • Advancements in Interpretability Techniques: Researchers are continually developing new and more sophisticated methods for understanding AI behavior.

We can anticipate a future where AI interpretability is not an optional feature, but a fundamental requirement for deploying AI systems in any critical application. The open-sourcing of Gemma Scope 2’s weights on Hugging Face is a particularly encouraging sign, fostering collaboration and accelerating innovation in this crucial field.

FAQ: AI Interpretability Explained

  • What is AI interpretability? It’s the ability to understand how an AI model arrives at its decisions.
  • Why is it important? It builds trust, ensures accountability, and helps mitigate risks.
  • What are sparse autoencoders and transcoders? They are techniques used to analyze the internal workings of LLMs.
  • Is AI interpretability a solved problem? No, it’s an ongoing area of research and development.

Did you know? The computational demands of analyzing increasingly complex models like Gemma 3 required Google to develop specialized sparse kernels to maintain efficiency.

Want to learn more about the latest advancements in AI safety and interpretability? Explore our other articles on responsible AI development and the ethical implications of artificial intelligence. Share your thoughts in the comments below – what are your biggest concerns about AI, and what role do you think interpretability will play in addressing them?

Tech

QCon AI NY 2025 – Becoming AI-Native Without Losing Our Minds To Architectural Amnesia

by Chief Editor December 25, 2025
written by Chief Editor

The Looming “Agentic Debt”: Why AI’s Rise Demands Architectural Discipline

The relentless march of AI isn’t just about flashy new features and productivity gains. A critical warning, delivered at QCon AI NY 2025 by Tracy Bannon, suggests we’re sleepwalking into a new era of technical debt – “agentic debt” – if we don’t apply established software architecture principles to these increasingly autonomous systems. The core message? AI amplifies existing weaknesses, it doesn’t create entirely new ones.

Beyond Bots and Assistants: Understanding the Spectrum of AI Autonomy

Bannon’s talk highlighted a crucial distinction often lost in the AI hype: not all “AI” is created equal. She categorized AI systems into three broad types: bots (scripted responders), assistants (human-collaborative), and agents (goal-driven, autonomous actors). This isn’t merely semantic. Each category carries a vastly different risk profile. A simple chatbot responding to FAQs poses minimal risk, while an AI agent managing a supply chain or controlling critical infrastructure demands rigorous architectural oversight.

Consider a real-world example: a marketing team deploying an AI agent to automatically adjust ad spend based on performance. Without proper identity management and access controls, that agent could potentially drain the entire marketing budget into a single, poorly performing campaign – a scenario easily preventable with sound architectural practices.

The Autonomy Paradox: Faster Innovation, Greater Risk

The speed at which AI agents are being adopted is breathtaking. Forrester predicts a significant rise in technical debt severity in the near term, directly linked to this AI-driven complexity. But Bannon argues that the problem isn’t the AI itself, but our tendency to prioritize speed over foundational architectural principles. We’re chasing “visible activity metrics” – like lines of code deployed or features launched – while neglecting the “work that keeps systems healthy”: design, refactoring, validation, and threat modeling.

Pro Tip: Before deploying any AI agent, ask yourself: “What happens when it makes a mistake?” If you can’t answer that question quickly and confidently, you’re likely building agentic debt.

Agentic Debt: The Familiar Faces of Failure

Agentic debt manifests in ways that will sound eerily familiar to seasoned software engineers. Bannon identified key areas of concern: identity and permissions sprawl (who *is* this agent?), insufficient segmentation and containment (can it access things it shouldn’t?), missing lineage and observability (can we trace its actions?), and weak validation and safety checks (how do we know it’s doing the right thing?).

A recent report by Gartner found that 40% of organizations struggle with AI observability, meaning they lack the tools and processes to understand *why* their AI systems are making certain decisions. This lack of transparency is a breeding ground for agentic debt.

Identity as the Cornerstone of Agentic Security

Bannon emphasized identity as the foundational control for agentic systems. Every agent, she argued, must have a unique, revocable identity. Organizations need to be able to quickly answer three critical questions: what can the agent access, what actions has it taken, and how can it be stopped? She proposed a minimal identity pattern centered around an agent registry – a centralized repository of information about each agent operating within the system.
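The agent-registry pattern is simple enough to sketch. The following is an illustrative minimal implementation (names are hypothetical, not from Bannon's talk or any product): each agent gets a unique, revocable identity with an explicit least-privilege allow-list, every authorization decision is logged, and revocation acts as an immediate kill switch, directly answering the three questions above.

```python
# Minimal agent-registry sketch: unique revocable identity, explicit
# permissions (least privilege), an audit trail, and a kill switch.
import uuid
from dataclasses import dataclass, field

@dataclass
class AgentRecord:
    agent_id: str
    owner: str
    permissions: frozenset   # explicit allow-list of permitted actions
    revoked: bool = False
    audit_log: list = field(default_factory=list)

class AgentRegistry:
    def __init__(self):
        self._agents = {}

    def register(self, owner: str, permissions: set) -> str:
        agent_id = str(uuid.uuid4())
        self._agents[agent_id] = AgentRecord(agent_id, owner, frozenset(permissions))
        return agent_id

    def authorize(self, agent_id: str, action: str) -> bool:
        rec = self._agents[agent_id]
        allowed = (not rec.revoked) and (action in rec.permissions)
        rec.audit_log.append(f"{action}: {'allowed' if allowed else 'denied'}")
        return allowed  # "what has it done?" lives in audit_log

    def revoke(self, agent_id: str) -> None:
        # The kill switch: every future authorization fails immediately.
        self._agents[agent_id].revoked = True

registry = AgentRegistry()
aid = registry.register("marketing-team", {"adjust_ad_spend"})
print(registry.authorize(aid, "adjust_ad_spend"))  # True
registry.revoke(aid)
print(registry.authorize(aid, "adjust_ad_spend"))  # False
```

Note how the runaway-ad-spend scenario described earlier becomes a denied, logged event rather than a drained budget once the agent's permissions are scoped and revocable.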

Did you know? The concept of least privilege – granting agents only the minimum necessary permissions – is even *more* critical in agentic systems, as their autonomous nature means they can potentially exploit broader access if compromised.

Decision-Making Discipline: Why, Not Just How

Bannon urged teams to shift their focus from *how* to implement AI agents to *why* they’re doing so. Every decision to increase autonomy should be a conscious tradeoff, explicitly acknowledging the potential downsides. She framed decisions as optimizations – improvements in one dimension always come at the expense of another (e.g., speed vs. quality, value vs. effort).

For example, an AI agent designed to automate customer support might improve response times (speed) but potentially at the cost of personalized service (quality). Understanding this tradeoff is crucial for responsible AI deployment.

The Architect’s Role: Preventing Architectural Amnesia

The call to action from Bannon’s talk was clear: architects and senior engineers must take ownership of AI agent integration. This means preventing “architectural amnesia” by designing governed agents, making risk and debt visible, and pursuing higher levels of autonomy only when demonstrably valuable. The good news? The core principles of software architecture remain valid. The challenge isn’t learning entirely new disciplines, but applying existing knowledge to a new context.

FAQ: Addressing Common Concerns

  • What is “agentic debt”? It’s the technical debt accumulated when AI agents are deployed without sufficient architectural discipline, leading to issues like identity sprawl and lack of observability.
  • Is AI inherently risky? No, but it amplifies existing risks in software systems.
  • What’s the first step to mitigating agentic debt? Focus on establishing a strong identity management system for all AI agents.
  • Do I need to rewrite all my existing code? Not necessarily, but you should carefully assess the architectural implications of integrating AI agents into existing workflows.

Want to learn more about building robust and secure AI systems? Explore additional resources from QCon AI and InfoQ. Recorded videos from the conference will be available starting January 15, 2026.

What are your biggest concerns about the rise of AI agents? Share your thoughts in the comments below!

Tech

Cactus v1: Cross-Platform LLM Inference on Mobile with Zero Latency and Full Privacy

by Chief Editor December 24, 2025
written by Chief Editor

The Rise of On-Device AI: Your Phone is About to Get a Lot Smarter

For years, artificial intelligence has largely lived in the cloud – requiring a constant internet connection and raising privacy concerns. But a quiet revolution is underway. Thanks to startups like Cactus, backed by Y Combinator, AI is rapidly becoming localized, running directly on your smartphone, wearable, or even a Raspberry Pi. This shift isn’t just about speed; it’s about fundamentally changing how we interact with technology.

Why On-Device AI Matters: Beyond Faster Responses

The benefits of running AI models locally are substantial. Eliminating the need to send data to remote servers drastically reduces latency. Cactus, for example, boasts sub-50ms time-to-first-token for on-device inference – meaning near-instant responses. But the advantages extend far beyond speed. Privacy is paramount. With data processing happening directly on your device, sensitive information never leaves your control. This is a game-changer for applications dealing with personal health data, financial information, or confidential communications.

Consider a real-world example: a doctor using a voice-to-text app powered by on-device AI to dictate patient notes. Previously, this data would have been transmitted to a cloud server, potentially raising HIPAA compliance issues. Now, the transcription happens securely on the device, ensuring patient confidentiality. This trend aligns with growing consumer demand for data privacy, as evidenced by a recent Pew Research Center study showing 79% of Americans are concerned about how their data is being used.

Cactus and the Democratization of Local AI

Cactus isn’t alone in this space, but it’s quickly gaining traction by offering a cross-platform solution. Unlike Apple’s Foundation Models framework or Google’s AI Edge, which are tied to specific operating systems and support a limited set of models, Cactus supports a wide range of models – including popular options like Qwen, Gemma, Llama, and Mistral. This open approach is crucial for fostering innovation and preventing vendor lock-in.

The recently released v1 SDK is a significant step forward. It’s been rebuilt from the ground up to improve performance on lower-end hardware and offers optional cloud fallback for tasks that demand more processing power. This hybrid approach – local processing with cloud assistance when needed – provides the best of both worlds: speed, privacy, and reliability. The SDK’s support for languages like React Native, Flutter, and Kotlin Multiplatform makes it accessible to a broad range of developers.

Pro Tip: Quantization – reducing the precision of the numbers used in AI models – is key to running them efficiently on resource-constrained devices. Cactus supports quantization levels down to 2-bit, significantly reducing model size and improving performance.
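The idea behind quantization fits in a few lines. This is a toy symmetric round-to-nearest scheme, not Cactus's actual implementation: float weights are mapped onto a small signed-integer grid via a single scale factor, and dequantizing recovers an approximation whose error shrinks as the bit width grows.

```python
# Toy symmetric quantization: map float weights onto a signed integer
# grid of 2^(bits-1)-1 levels per sign, storing one float scale factor.
import numpy as np

def quantize(weights: np.ndarray, bits: int = 8):
    qmax = 2 ** (bits - 1) - 1                 # e.g. 127 at 8-bit, 1 at 2-bit
    scale = np.abs(weights).max() / qmax       # one scale for the whole tensor
    q = np.clip(np.round(weights / scale), -qmax, qmax).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

w = np.array([0.31, -0.74, 0.05, 0.92], dtype=np.float32)
q, s = quantize(w, bits=8)
w_hat = dequantize(q, s)
print(np.abs(w - w_hat).max())  # small reconstruction error at 8 bits
```

At 8 bits each weight occupies one byte instead of four; at 2 bits the grid has only three levels, which is why aggressive quantization trades some accuracy for the large size and speed gains the benchmarks below reflect.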

The Future of On-Device AI: What to Expect

The current wave of on-device AI is just the beginning. Several key trends are poised to accelerate its growth:

  • More Powerful Mobile Processors: Chip manufacturers like Qualcomm and Apple are increasingly integrating dedicated Neural Processing Units (NPUs) into their mobile processors, specifically designed for AI workloads. Benchmarks published by Cactus demonstrate the impact: an iPhone 15 Pro achieves 136 tokens per second with the LFM2-VL-450m model, showcasing the power of NPUs.
  • Edge Computing Expansion: The principles of on-device AI are extending beyond smartphones to edge devices like smart cameras, industrial sensors, and autonomous vehicles. This will enable real-time decision-making without relying on cloud connectivity.
  • Generative AI Everywhere: Expect to see generative AI features – text generation, image creation, code completion – become seamlessly integrated into everyday apps, all powered locally on your device.
  • Personalized AI Experiences: On-device AI allows for truly personalized experiences. Models can be fine-tuned to your specific preferences and data, creating AI assistants that are uniquely tailored to your needs.
  • Advanced Tool Calling and Multimodal AI: Cactus v1 already supports tool calling and voice transcription, and the roadmap includes voice synthesis. The future will see more sophisticated multimodal AI – models that can process and understand multiple types of data (text, images, audio, video) simultaneously.

Benchmarks and Model Sizes: A Quick Reference

Here’s a snapshot of model sizes and performance (based on Cactus’ benchmarks using INT8 quantization):

Model           | Size (MB) | Supported Features                          | Tokens/Second (Mac M4 Pro)
gemma-3-270m-it | 172       | Completion                                  | 150
Qwen3-0.6B      | 394       | Completion, Tool Calling, Embedding, Speech | 160
Gemma-3-1b-it   | 642       | Completion                                  | 165
Qwen3-1.7B      | 1,161     | Completion, Tool Calling, Embedding, Speech | 173

FAQ: On-Device AI Explained

  • What is on-device AI? It’s running AI models directly on your device (phone, laptop, etc.) instead of relying on a cloud server.
  • Is on-device AI secure? Yes, it’s generally more secure as your data doesn’t leave your device.
  • Will on-device AI replace cloud-based AI? Not entirely. A hybrid approach – local processing with cloud fallback – is likely to be the dominant model.
  • What are the limitations of on-device AI? Processing power and memory constraints can limit the complexity of models that can be run locally.

Cactus is available for cloning from GitHub and offers free access for students, educators, non-profits, and small businesses. Explore the possibilities and start building the future of localized AI today!

Want to learn more about the latest advancements in AI? Subscribe to our newsletter for exclusive insights and updates.
