How Datadog Cut the Size of Its Agent Go Binaries by 77%

by Chief Editor

The Shrinking Codebase: How Leading Projects Are Winning the Binary Size War

For years, software bloat has been a silent performance killer. Larger binaries mean slower downloads, increased network costs, and greater resource consumption, issues that are particularly acute in modern deployments like serverless functions and edge computing. Now, a concerted effort to slim down Go applications is gaining momentum, led by companies like Datadog and reaching projects across the ecosystem, including Kubernetes.

The Datadog Agent’s Transformation

The Datadog Agent, a crucial component for monitoring and observability, recently underwent a significant transformation. Over five years, its size ballooned from 428 MiB to 1.22 GiB. This growth, driven by new features, integrations, and third-party dependencies, created tangible problems for both Datadog and its users. Increased network costs, higher resource usage, and a negative perception of the Agent were all consequences. Datadog engineers, led by Pierre Gimalac, tackled this issue head-on, achieving a remarkable 77% reduction in binary size within six months – without removing any features.

The Culprits Behind Go Binary Bloat

The investigation revealed several key contributors to the bloat. Hidden dependencies, often pulled in transitively through other packages, were a major factor. Disabled linker optimizations and subtle behaviors within the Go compiler and linker also played a significant role. Go’s dependency model, while powerful, makes it easy for a small change to pull hundreds of new packages into a build.

Practical Strategies for Code Slimming

Datadog’s engineers employed two primary strategies to combat this bloat. First, they leveraged build tags (//go:build feature_x) to exclude optional code during compilation. This allows for creating leaner binaries tailored to specific environments. Second, they restructured code into separate packages, isolating non-essential components and minimizing the size of core packages. A single function moved to its own package, for example, eliminated approximately 570 packages and 36 MB of generated code in builds that didn’t require it.
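As a rough illustration of the build-tag strategy, here is a minimal sketch of the "stub" half of a tagged feature. The tag name feature_x comes from the example above; the file name and the function featureXStatus are hypothetical and are not taken from the Datadog Agent.

```go
//go:build !feature_x

// File: feature_x_stub.go (hypothetical name).
// Compiled only when the feature_x tag is absent, e.g. a plain `go build`.
// A sibling file guarded by `//go:build feature_x` would define the same
// function with the real implementation and its heavier imports.
package main

import "fmt"

// featureXStatus is the no-op stand-in. Because this file imports nothing
// heavy, builds without the tag never pull in the feature's dependencies.
func featureXStatus() string {
	return "feature_x: not compiled in"
}

func main() {
	fmt.Println(featureXStatus())
}
```

Building with `go build -tags feature_x` would skip this file and select the tagged sibling instead, so each build carries only one of the two implementations.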

Fortunately, the Go ecosystem provides tools to aid in this process. go list helps identify all packages used in a build. goda visualizes dependency graphs, revealing hidden import chains. And go-size-analyzer pinpoints which dependencies contribute the most to binary size.
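To give a feel for this kind of audit, the sketch below groups a list of import paths, such as the output of `go list -deps`, by a coarse "root" and counts the packages under each. The helper names rootOf and countByRoot and the grouping rule are illustrative assumptions, not part of any of the tools named above.

```go
package main

import (
	"fmt"
	"os"
	"sort"
	"strings"
)

// rootOf returns a coarse grouping key for an import path: up to the first
// three path elements for host-qualified paths (github.com/org/repo), or
// "std" when the first element has no dot (standard library and its
// vendored packages).
func rootOf(importPath string) string {
	parts := strings.Split(importPath, "/")
	if !strings.Contains(parts[0], ".") {
		return "std"
	}
	if len(parts) > 3 {
		parts = parts[:3]
	}
	return strings.Join(parts, "/")
}

// countByRoot tallies packages per root from newline-separated import paths.
func countByRoot(paths string) map[string]int {
	counts := make(map[string]int)
	for _, line := range strings.Split(paths, "\n") {
		line = strings.TrimSpace(line)
		if line == "" {
			continue
		}
		counts[rootOf(line)]++
	}
	return counts
}

func main() {
	// Built-in sample; pass a file containing `go list -deps` output instead.
	input := "fmt\nos\ngithub.com/example/lib/a\ngithub.com/example/lib/b\n"
	if len(os.Args) > 1 {
		data, err := os.ReadFile(os.Args[1])
		if err != nil {
			fmt.Fprintln(os.Stderr, "read error:", err)
			os.Exit(1)
		}
		input = string(data)
	}
	counts := countByRoot(input)
	roots := make([]string, 0, len(counts))
	for r := range counts {
		roots = append(roots, r)
	}
	// Largest groups first, ties broken alphabetically.
	sort.Slice(roots, func(i, j int) bool {
		if counts[roots[i]] != counts[roots[j]] {
			return counts[roots[i]] > counts[roots[j]]
		}
		return roots[i] < roots[j]
	})
	for _, r := range roots {
		fmt.Printf("%5d  %s\n", counts[r], r)
	}
}
```

One way to use it would be to save the output of `go list -deps ./...` to a file and pass that file as the first argument. Package counts are only a proxy for size, so a tool like go-size-analyzer is still needed to see actual bytes.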

Beyond Dependencies: Reflection and Plugins

Dependency optimization wasn’t the only avenue for improvement. The team discovered that certain uses of reflection could silently disable crucial linker optimizations, such as dead-code elimination. When a method is looked up by a name that is only known at run time (for example through reflect’s MethodByName), the linker can no longer prove which methods are unused and has to keep them. By minimizing reflection and even submitting pull requests to projects like Kubernetes, uber-go/dig, and google/go-cmp to address reflection-related issues, they achieved further size reductions.
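The contrast can be sketched in a few lines. The example below shows a reflective method lookup by a run-time name next to an explicit dispatch table that does the same job without reflection. The Handler type and both helper functions are hypothetical and are not taken from any of the projects mentioned.

```go
package main

import (
	"fmt"
	"reflect"
)

// Handler is a stand-in type with two exported methods.
type Handler struct{}

func (Handler) Start() string { return "started" }
func (Handler) Stop() string  { return "stopped" }

// callReflect looks the method up by a name that is only known at run time.
// A reachable call like this is the pattern that prevents the linker from
// pruning unused exported methods. Note that keeping it in this file means
// this particular binary is affected too; it is here only for comparison.
func callReflect(h Handler, name string) (string, bool) {
	m := reflect.ValueOf(h).MethodByName(name)
	if !m.IsValid() {
		return "", false
	}
	out := m.Call(nil)
	return out[0].String(), true
}

// callTable performs the same dispatch through an explicit table of method
// expressions, so every method that can be called is visible statically.
func callTable(h Handler, name string) (string, bool) {
	table := map[string]func(Handler) string{
		"Start": Handler.Start,
		"Stop":  Handler.Stop,
	}
	f, ok := table[name]
	if !ok {
		return "", false
	}
	return f(h), true
}

func main() {
	h := Handler{}
	r, _ := callReflect(h, "Start")
	t, _ := callTable(h, "Start")
	fmt.Println(r, t)
}
```

The table is more verbose, but it makes the set of callable methods explicit, which is what allows unused ones elsewhere in a program to be dropped.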

Similarly, Go plugins, while offering dynamic loading capabilities, also disable dead-code elimination. Simply importing the plugin package forces the linker to treat the binary as dynamically linked, significantly increasing its size. Eliminating plugin usage yielded an additional 20% reduction in some builds.
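A first step in hunting down plugin usage is simply finding where the package is imported. The sketch below uses the standard go/parser package to check whether a single source file directly imports a given package; the helper name importsPackage and the sample source are illustrative assumptions. It only sees direct imports, so a transitive import through a dependency would still need a dependency listing to surface.

```go
package main

import (
	"fmt"
	"go/parser"
	"go/token"
	"os"
	"strconv"
)

// importsPackage reports whether the Go source text in src directly imports
// the package path target. Only the import declarations are parsed.
func importsPackage(src, target string) (bool, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, "input.go", src, parser.ImportsOnly)
	if err != nil {
		return false, err
	}
	for _, spec := range f.Imports {
		// Import paths are stored as quoted string literals.
		path, err := strconv.Unquote(spec.Path.Value)
		if err != nil {
			return false, err
		}
		if path == target {
			return true, nil
		}
	}
	return false, nil
}

func main() {
	// Hypothetical source file that pulls in the plugin package.
	const sample = `package loader

import (
	"fmt"
	"plugin"
)
`
	found, err := importsPackage(sample, "plugin")
	if err != nil {
		fmt.Fprintln(os.Stderr, "parse error:", err)
		os.Exit(1)
	}
	fmt.Println("imports plugin:", found)
}
```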

The Ripple Effect: Impact on the Go Ecosystem

Datadog’s work isn’t confined to its own codebase. The insights gained during this optimization effort have led to improvements in the Go compiler and linker, benefiting other large Go projects. Kubernetes, in particular, is poised to leverage these advancements to reduce its own binary sizes.

Future Trends in Go Binary Optimization

The focus on binary size reduction is likely to intensify as deployments become increasingly distributed and resource-constrained. Several trends are emerging:

  • More Aggressive Linker Optimizations: Continued improvements to the Go linker will likely unlock further opportunities for dead-code elimination and other size-reducing optimizations.
  • Enhanced Dependency Management: Tools for managing and analyzing Go dependencies will become more sophisticated, making it easier to identify and eliminate unnecessary imports.
  • Build-Time Configuration: The use of build tags and other mechanisms for tailoring binaries to specific environments will become more prevalent.
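  • Alternative Compilation Strategies: Exploring alternative toolchains or build modes could offer additional size and performance benefits. Go is already compiled ahead of time (AOT), so any gains here would have to come from how the code is compiled and linked rather than from a switch to AOT compilation.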

FAQ

Q: What is binary bloat?
A: Binary bloat refers to the unnecessary increase in the size of executable files, often due to unused code, dependencies, or inefficient compilation practices.

Q: Why is reducing binary size important?
A: Smaller binaries lead to faster downloads, reduced network costs, lower resource consumption, and improved performance, especially in resource-constrained environments.

Q: What are build tags in Go?
A: Build tags (//go:build feature_x) allow you to conditionally compile code based on specific criteria, enabling you to create leaner binaries for different environments.

Q: Does optimizing binary size require removing features?
A: Not necessarily. Datadog demonstrated a 77% reduction in binary size without removing any features, by trimming unnecessary dependencies and restoring the linker optimizations that reflection and plugin usage had disabled.

Did you know? Go’s transitive dependencies can quickly inflate binary sizes. Regularly auditing your imports is crucial for maintaining a lean codebase.

Pro Tip: Use go-size-analyzer to quickly identify the largest dependencies in your Go project and prioritize optimization efforts.

Want to learn more about optimizing your Go applications? Explore the Datadog engineering blog for a deep dive into their optimization journey.
