QCon London 2026: Behind Booking.com’s AI Evolution: The Unpolished Story

by Chief Editor

Booking.com’s AI Journey: Lessons for the Future of Data-Driven Platforms

Booking.com’s evolution from Perl scripts and MySQL databases to a sophisticated AI platform, as detailed at QCon London 2026 by Senior Principal Engineer Jabez Eliezer Manuel, offers valuable insights into the challenges and triumphs of scaling AI within a large organization. The presentation, “Behind Booking.com’s AI Evolution: The Unpolished Story,” highlighted a 20-year journey marked by pragmatic experimentation and a willingness to adapt.

The Power of Data-Driven DNA

Booking.com began extensive A/B testing in 2005 and has since scaled to running over 1,000 experiments concurrently, accumulating 150,000 experiments in total. Although fewer than 25% of experiments succeeded, the company prioritized rapid learning over immediate results, fostering a “Data-Driven DNA” that continues to shape its approach to innovation. This early commitment to experimentation laid the groundwork for future AI initiatives.

From Hadoop to a Unified Platform: A Migration Story

Booking.com initially leveraged Apache Hadoop for distributed storage and processing, building two on-premise clusters with approximately 60,000 cores and 200 PB of storage by 2011. However, limitations such as noisy neighbors, lack of GPU support, and capacity issues eventually led to a seven-year migration away from Hadoop. The migration strategy involved mapping the entire ecosystem, analyzing usage to reduce scope, applying the PageRank algorithm, migrating in waves, and finally phasing out Hadoop. A unified command center proved crucial to this complex undertaking.
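
The article does not say how PageRank was applied during the migration. One plausible reading, offered purely as a sketch, is that it scored nodes in the mapped dependency graph by how much of the ecosystem depends on them, while the waves followed dependency order. The graph, the names, and the wave rule below are all hypothetical and are not taken from the talk.

```python
from collections import defaultdict

def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over a directed graph given as (src, dst) pairs."""
    nodes = sorted({n for edge in edges for n in edge})
    out_links = defaultdict(list)
    for src, dst in edges:
        out_links[src].append(dst)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is spread evenly over all nodes.
        dangling = sum(rank[n] for n in nodes if not out_links[n])
        base = (1 - damping) / len(nodes) + damping * dangling / len(nodes)
        new_rank = {n: base for n in nodes}
        for src in nodes:
            for dst in out_links[src]:
                new_rank[dst] += damping * rank[src] / len(out_links[src])
        rank = new_rank
    return rank

def migration_waves(edges):
    """Group nodes into waves so that a node moves only after everything it reads from."""
    nodes = {n for edge in edges for n in edge}
    upstream = defaultdict(set)
    for consumer, producer in edges:
        upstream[consumer].add(producer)
    waves, done = [], set()
    while len(done) < len(nodes):
        wave = sorted(n for n in nodes - done if upstream[n] <= done)
        if not wave:
            raise ValueError("dependency cycle; cannot order waves")
        waves.append(wave)
        done.update(wave)
    return waves

# Hypothetical edges: (consumer, producer) means the consumer reads from the producer.
EDGES = [
    ("daily_report", "bookings_table"),
    ("ranking_features", "bookings_table"),
    ("ranking_features", "views_table"),
    ("ranking_model", "ranking_features"),
]

ranks = pagerank(EDGES)
waves = migration_waves(EDGES)
for wave_number, wave in enumerate(waves, start=1):
    print(wave_number, [(name, round(ranks[name], 3)) for name in wave])
```

Because edges point from consumer to producer, the highest scores land on the nodes that most of the graph depends on, which is one way to decide where migration effort matters most.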

The Evolution of the Machine Learning Stack

The company’s machine learning stack has undergone significant transformation, evolving from Perl and MySQL in 2005 to agentic systems in 2025. Key technologies along the way included Apache Oozie with Python, Apache Spark with MLlib, and H2O.ai. The year 2015 marked a turning point, when challenges in real-time predictions and feature engineering were resolved. As of 2024, the platform serves over 400 billion predictions daily (roughly 4.6 million per second on average) with a latency of less than 20 milliseconds, powered by more than 480 machine learning models.

Domain-Specific AI Platforms

Booking.com has developed four distinct domain-specific machine learning platforms:

  • GenAI: Used for trip planning, smart filters, and review summaries.
  • Content Intelligence: Focused on image and review analysis, and text generation for detailed hotel content.
  • Recommendations: Delivering personalized content to customers.
  • Ranking: A complex platform optimizing for choice and value, exposure and growth, and efficiency and revenue.

The initial ranking formula, a simple function of bookings, views, and a random number, proved surprisingly resilient to machine learning replacements due to infrastructure limitations. The company adopted an interleaving technique for A/B testing, allowing for more variants with less traffic, followed by validation with traditional A/B testing.
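
The article does not describe which interleaving scheme Booking.com uses. As an illustration only, the sketch below shows team-draft interleaving, one well-known variant: two rankers take turns contributing results to a single list shown to the user, and each click is credited to the ranker that contributed the clicked item. The function names and hotel identifiers are hypothetical.

```python
import random

def team_draft_interleave(ranking_a, ranking_b, length, rng=random):
    """Merge two rankings into one list, recording which ranker supplied each item."""
    rankings = {"A": ranking_a, "B": ranking_b}
    picks = {"A": 0, "B": 0}
    interleaved, team_of = [], {}
    while len(interleaved) < length:
        # The team with fewer picks goes next; ties are broken by a coin flip.
        if picks["A"] != picks["B"]:
            team = min(picks, key=picks.get)
        else:
            team = rng.choice(["A", "B"])
        candidate = next((i for i in rankings[team] if i not in team_of), None)
        if candidate is None:
            # This team has run out of unseen items; let the other team pick instead.
            team = "B" if team == "A" else "A"
            candidate = next((i for i in rankings[team] if i not in team_of), None)
            if candidate is None:
                break
        interleaved.append(candidate)
        team_of[candidate] = team
        picks[team] += 1
    return interleaved, team_of

def credit_clicks(clicked_items, team_of):
    """Count clicks per ranker; the ranker with more credited clicks wins the impression."""
    wins = {"A": 0, "B": 0}
    for item in clicked_items:
        if item in team_of:
            wins[team_of[item]] += 1
    return wins

# Hypothetical rankings of the same four hotels from two ranking variants.
ranking_a = ["h1", "h2", "h3", "h4"]
ranking_b = ["h3", "h1", "h4", "h2"]

interleaved, team_of = team_draft_interleave(ranking_a, ranking_b, 4, random.Random(7))
wins = credit_clicks([interleaved[0]], team_of)
print(interleaved, team_of, wins)
```

Since every user sees results from both variants at once, preferences show up within a single session rather than across separate traffic buckets, which is consistent with the article's point about needing less traffic. The follow-up validation with a traditional A/B test is not sketched here.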

Future Trends: What Lies Ahead?

Booking.com’s journey highlights several key trends likely to shape the future of AI-powered platforms:

  • Unified Orchestration Layers: The convergence of domain-specific AI platforms into a unified orchestration layer, as demonstrated by Booking.com, will become increasingly common. This allows for greater synergy and efficiency.
  • Pragmatic AI Adoption: The emphasis on learning from failures and iterating quickly, rather than striving for perfection, will be crucial for successful AI implementation.
  • Infrastructure as a Limiting Factor: Infrastructure limitations can significantly impact the effectiveness of even the most sophisticated algorithms. Investing in scalable and robust infrastructure is paramount.
  • The Importance of Data Management: Effective data management, including strategies for handling large datasets and ensuring data quality, remains a foundational element of any successful AI initiative.

FAQ

Q: What was the biggest challenge Booking.com faced during its AI evolution?
A: Migrating away from Hadoop proved to be a significant undertaking, requiring a seven-year phased approach.

Q: What is the current latency of Booking.com’s machine learning inference platform?
A: Less than 20 milliseconds.

Q: What is “interleaving” in the context of A/B testing?
A: A technique where 50% of experiments are interwoven into a single experiment, allowing for more variants with less traffic.

Q: What technologies did Booking.com use in its machine learning stack?
A: Perl, MySQL, Apache Oozie, Python, Apache Spark, MLlib, H2O.ai, deep learning, and GenAI.

Did you know? Fewer than 25% of Booking.com’s A/B testing experiments succeeded, but the focus was on learning, not immediate results.

Pro Tip: Don’t be afraid to experiment and fail fast. A culture of learning from mistakes is essential for successful AI adoption.

