
Overview & Philosophy

Bazel is an open-source build and test tool created by Google. It handles projects of any size, supports multiple languages in a single repository, and produces correct, reproducible outputs through strict hermeticity guarantees. Google uses it (internally called Blaze) to build millions of lines of code across thousands of engineers daily.

The Four Pillars

  • Hermeticity — actions run in a sandbox with only declared inputs visible. The same source always produces the same output, regardless of the machine.
  • Reproducibility — build outputs are content-addressed. Identical inputs always produce byte-identical outputs, making remote caching trivially correct.
  • Incrementality — Bazel tracks a fine-grained dependency graph (the action graph). Only actions whose inputs changed are re-executed, so editing one file in a multi-million-line repo re-runs only that file's dependent actions.
  • Scalability — parallelism is implicit. Actions without data dependencies run concurrently. Remote execution distributes work across a build farm.
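
The payoff of incrementality and caching shows up in everyday commands — a second, unchanged build is a "null build" (target label is hypothetical):

```shell
bazel build //src:server      # first build: compiles everything needed
bazel build //src:server      # null build: every action is up to date, finishes almost instantly
touch src/main/java/com/example/Server.java
bazel build //src:server      # re-runs only the actions downstream of the changed file
```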

Bazel vs Other Build Tools

Feature            | Bazel                      | Maven / Gradle        | Make / CMake             | Pants / Buck2
Hermetic sandbox   | Yes (strict)               | No                    | No                       | Yes
Remote caching     | Built-in                   | Plugin / limited      | No                       | Built-in
Multi-language     | Yes                        | JVM-focused           | C/C++ focused            | Yes
Monorepo support   | Excellent                  | Poor / multi-module   | Manual                   | Excellent
Build language     | Starlark (Python-like)     | XML / Groovy / Kotlin | Make syntax / CMakeLists | Starlark / BUCK
Incremental builds | Fine-grained (file-level)  | Module-level          | File-level (fragile)     | Fine-grained
Learning curve     | High                       | Medium                | Medium                   | High

When to Use Bazel

  • Large monorepos with multiple languages (Java, Go, Python, C++, Proto)
  • Teams that need reproducible, cacheable CI builds across many machines
  • Projects where build correctness matters more than configuration simplicity
  • Organizations wanting to scale builds horizontally via remote execution
When NOT to use Bazel
Small single-language projects with standard tooling (e.g., a Node.js app or a single Go module) pay a steep upfront cost for minimal gain. Bazel shines at scale; below ~10 engineers or ~100k lines of code, the configuration overhead usually isn't worth it.

Installation via Bazelisk

Bazelisk is the recommended launcher. It reads .bazelversion from your workspace root and downloads the correct Bazel version automatically — no more version mismatch across team members.

# macOS
brew install bazelisk

# Linux (or via npm)
npm install -g @bazel/bazelisk

# Verify — Bazelisk proxies to the Bazel version in .bazelversion
bazel version

# Pin a specific Bazel version for your project
echo "7.0.2" > .bazelversion
Always commit .bazelversion
Checking in .bazelversion ensures every developer and CI runner uses the same Bazel version. Bazelisk downloads that Bazel version on first use and caches it under ~/.cache/bazelisk.

Core Concepts

Workspace

The workspace is the root directory of your project, identified by the presence of a WORKSPACE or WORKSPACE.bazel file (or MODULE.bazel for Bzlmod). Everything inside the workspace can be referenced with Bazel labels. Files outside it are invisible to builds unless explicitly declared as external repositories.

Packages

A package is any directory containing a BUILD or BUILD.bazel file. It is the fundamental unit of ownership and visibility. A package owns all files in its directory that are not in a subdirectory that contains its own BUILD file.
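
For instance, a BUILD file cannot reach across a package boundary with glob() or srcs — files under a subdirectory that has its own BUILD file belong to that sub-package (layout and target names below are hypothetical):

```python
# src/app/BUILD.bazel — owns src/app/Util.java but NOT src/app/sub/Helper.java,
# because src/app/sub/ contains its own BUILD.bazel and is a separate package.
java_library(
    name = "app",
    srcs = ["Util.java"],               # OK — this package owns the file
    # srcs = ["sub/Helper.java"],       # ERROR — crosses a package boundary
    deps = ["//src/app/sub:helper"],    # correct: depend on the sub-package's target
)
```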

Targets

A target is a buildable entity declared in a BUILD file — a rule invocation (like java_library), a file, or a file group. Targets are the nodes in the dependency graph.
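
A single BUILD file can declare all of these kinds of targets (hypothetical package):

```python
# Rule target: a java_library named "lib"
java_library(
    name = "lib",
    srcs = ["Lib.java"],   # Lib.java is a file target owned by this package
)

# File-group target: a named bundle of files other targets can depend on
filegroup(
    name = "docs",
    srcs = glob(["*.md"]),
)
```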

Labels

Labels uniquely identify targets across the entire workspace. The syntax is:

//path/to/package:target_name

# Examples
//src/main/java/com/example:server      # explicit label
//src/main/java/com/example             # shorthand for //src/main/java/com/example:example
:server                                 # relative label within the same BUILD file
@some_repo//path/to/package:target      # target in an external repository

Visibility

Visibility controls which packages can depend on a target. It is set via the visibility attribute.

java_library(
    name = "internal_util",
    srcs = ["InternalUtil.java"],
    # Only targets in the same package can depend on this
    visibility = ["//visibility:private"],
)

java_library(
    name = "public_api",
    srcs = ["PublicApi.java"],
    # Any target anywhere can depend on this
    visibility = ["//visibility:public"],
)

java_library(
    name = "service_only",
    srcs = ["ServiceImpl.java"],
    # Only targets in //src/services/... can depend on this
    visibility = ["//src/services:__subpackages__"],
)

Dependency Graph vs Action Graph

  • Dependency graph — the logical graph of targets and their declared deps. This is what you express in BUILD files.
  • Action graph — the physical graph of actions (compiler invocations, linker runs, etc.) that Bazel derives from the dependency graph. This is what Bazel actually executes. Use bazel aquery to inspect it.
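
The same target can be inspected through either lens (hypothetical label):

```shell
bazel query 'deps(//src:server)'   # logical graph: targets and their declared deps
bazel aquery //src:server          # physical graph: the compile/link actions Bazel will run
```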

Hermeticity and Sandboxing

Each action runs in an isolated sandbox. On Linux, Bazel uses Linux namespaces to create a private view of the filesystem containing only the declared inputs. On macOS, it uses a process sandbox. An action that reads a file it didn't declare will fail at build time rather than produce a subtly wrong output.

Why hermeticity matters
Without hermeticity, a build might succeed on your machine because it reads /usr/local/lib/libfoo.so implicitly — but fail in CI or on a colleague's machine. Bazel forces you to declare every input, making "works on my machine" a thing of the past.
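
A genrule makes the contract concrete — only declared srcs (and tools) exist inside the sandbox (hypothetical target):

```python
genrule(
    name = "version_header",
    srcs = ["version.txt"],   # declared input — visible inside the sandbox
    outs = ["version.h"],
    # $< expands to the declared input. Reading an undeclared file such as
    # /etc/hostname would fail (or be caught) under strict sandboxing.
    cmd = "echo \"#define VERSION \\\"$$(cat $<)\\\"\" > $@",
)
```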

Project Structure

Typical Monorepo Layout

my-monorepo/
├── WORKSPACE.bazel          # Workspace root (legacy) — or MODULE.bazel for Bzlmod
├── MODULE.bazel             # Bzlmod module descriptor (Bazel 6+)
├── MODULE.bazel.lock        # Pinned dependency versions
├── .bazelversion            # Pinned Bazel version for Bazelisk
├── .bazelrc                 # Project-level build flags
├── BUILD.bazel              # Root package (optional but common)
├── tools/
│   ├── BUILD.bazel
│   └── linting/
│       └── BUILD.bazel
├── third_party/             # Vendored or wrapped external deps
│   └── java/
│       └── BUILD.bazel
├── src/
│   ├── main/
│   │   ├── java/
│   │   │   └── com/example/
│   │   │       ├── BUILD.bazel     # Package: //src/main/java/com/example
│   │   │       ├── Server.java
│   │   │       └── api/
│   │   │           ├── BUILD.bazel # Sub-package
│   │   │           └── Handler.java
│   │   └── python/
│   │       └── my_app/
│   │           ├── BUILD.bazel
│   │           └── main.py
│   └── test/
│       └── java/
│           └── com/example/
│               ├── BUILD.bazel
│               └── ServerTest.java
└── proto/
    ├── BUILD.bazel
    └── api.proto

WORKSPACE File

The legacy WORKSPACE.bazel file declares external dependencies and repository names. With Bzlmod (MODULE.bazel), most of this moves to the module system, but many projects still use WORKSPACE for compatibility.

# WORKSPACE.bazel
workspace(name = "my_monorepo")

load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")

# Pull in rules_go
http_archive(
    name = "io_bazel_rules_go",
    sha256 = "d93ef02f1e72c82d8bb3d5169519b36167b33cf68c252525e3b9d3d5dd143de7",
    urls = [
        "https://mirror.bazel.build/github.com/bazelbuild/rules_go/releases/download/v0.49.0/rules_go-v0.49.0.zip",
        "https://github.com/bazelbuild/rules_go/releases/download/v0.49.0/rules_go-v0.49.0.zip",
    ],
)

load("@io_bazel_rules_go//go:deps.bzl", "go_register_toolchains", "go_rules_dependencies")
go_rules_dependencies()
go_register_toolchains(version = "1.22.0")

# Pull in rules_python
http_archive(
    name = "rules_python",
    sha256 = "c68bdc4fbec25de5b5493b8819cfc877c4ea299c0dcb15c244c5a00208cde311",
    strip_prefix = "rules_python-0.31.0",
    url = "https://github.com/bazelbuild/rules_python/releases/download/0.31.0/rules_python-0.31.0.tar.gz",
)

load("@rules_python//python:repositories.bzl", "py_repositories")
py_repositories()

BUILD File

BUILD files contain rule invocations. They are written in Starlark (a Python dialect). Each rule invocation declares a target.

# src/main/java/com/example/BUILD.bazel

# Load custom rules from external repos or .bzl files
load("@rules_java//java:defs.bzl", "java_binary", "java_library")

java_library(
    name = "server_lib",
    srcs = glob(["*.java"], exclude = ["*Test.java"]),
    deps = [
        "//src/main/java/com/example/api:handler",
        "@maven//:com_google_guava_guava",
    ],
    visibility = ["//visibility:private"],
)

java_binary(
    name = "server",
    main_class = "com.example.Server",
    runtime_deps = [":server_lib"],
    visibility = ["//visibility:public"],
)

.bazelrc Configuration

# .bazelrc — project-level flags applied to all Bazel commands

# Common flags for all commands
common --enable_bzlmod

# Build flags
build --java_runtime_version=remotejdk_17
build --incompatible_strict_action_env

# Test flags — show output only on failure
test --test_output=errors
test --build_tests_only

# Enable remote cache for CI (override in CI with --config=ci)
build:ci --remote_cache=grpc://your-cache-server:9090
build:ci --google_default_credentials

# Disk cache for local development
build --disk_cache=~/.cache/bazel-disk-cache

# Enable platform-based C++ toolchain resolution
build --incompatible_enable_cc_toolchain_resolution

Starlark Language

Starlark (formerly Skylark) is a deterministic, Python-like language designed for build configuration. It is intentionally restricted compared to Python to guarantee that evaluation is hermetic and terminating.

Key Differences from Python

Feature      | Python              | Starlark
Classes      | Yes (class)         | No — use structs/providers
While loops  | Yes                 | No — use for only
Import       | import              | load() statement only
Mutability   | Always mutable      | Frozen after function returns
Recursion    | Yes                 | No — prevents infinite loops
Global state | Yes                 | No mutable globals
Try/except   | Yes                 | No — use fail()
eval()       | Yes                 | No
Integers     | Arbitrary precision | 64-bit signed

Basic Syntax

# .bzl file — reusable Starlark functions and rules

# String operations
name = "hello"
upper = name.upper()             # "HELLO"
joined = ", ".join(["a", "b"])   # "a, b"
formatted = "pkg: %s" % name     # "pkg: hello"

# List operations
srcs = ["a.java", "b.java"]
srcs_with_c = srcs + ["c.java"]  # concatenation
first = srcs[0]                  # indexing
sliced = srcs[1:]                # slicing
filtered = [s for s in srcs if s.endswith(".java")]  # comprehension

# Dict operations
attrs = {"name": "foo", "size": 42}
keys = attrs.keys()
merged = dict(attrs, extra="val")

# Boolean
is_debug = True
if is_debug:
    print("debug mode")          # print() is allowed in .bzl during loading

# Functions
def make_label(pkg, name):
    """Return a fully-qualified label string."""
    return "//%s:%s" % (pkg, name)

result = make_label("src/main", "server")  # "//src/main:server"

The load() Statement

load() is the only way to import symbols from other .bzl files. It must appear at the top of the file, before any rule invocations.

# Load specific symbols from a .bzl file in the same repo
load("//tools/build_defs:my_rules.bzl", "my_library", "my_binary")

# Load from an external repository
load("@rules_go//go:defs.bzl", "go_binary", "go_library")

# Load and rename to avoid conflicts
load("//tools:helpers.bzl", my_helper = "helper_fn")

# Use loaded symbols
my_library(name = "foo")

select() — Configurable Attributes

select() makes attribute values conditional on configuration flags or platform constraints. It is evaluated lazily at analysis time, not at loading time.

cc_binary(
    name = "my_binary",
    srcs = ["main.cc"],
    deps = select({
        # Key is a label pointing to a config_setting or constraint_value
        "//config:linux": ["//lib:linux_impl"],
        "//config:macos": ["//lib:macos_impl"],
        "//conditions:default": ["//lib:generic_impl"],
    }),
    copts = select({
        "@bazel_tools//src/conditions:windows": ["/W4"],
        "//conditions:default": ["-Wall", "-Werror"],
    }),
)

# config_setting lets you branch on --define flags
config_setting(
    name = "opt_build",
    values = {"compilation_mode": "opt"},
)

config_setting(
    name = "enable_feature_x",
    define_values = {"feature_x": "true"},  # --define=feature_x=true
)

Macros

A macro is a Starlark function that calls one or more rules. It generates targets at loading time and is a lightweight way to reduce boilerplate. Macros are different from custom rules: they don't have their own providers or analysis phase — they just expand to rule calls.

# tools/build_defs/java_service.bzl

def java_service(name, srcs, deps = [], **kwargs):
    """Macro that produces a java_library + a java_binary for a service."""
    lib_name = name + "_lib"

    native.java_library(
        name = lib_name,
        srcs = srcs,
        deps = deps,
        visibility = ["//visibility:private"],
    )

    native.java_binary(
        name = name,
        main_class = "com.example." + name + ".Main",
        runtime_deps = [":" + lib_name],
        **kwargs
    )
# BUILD.bazel — using the macro
load("//tools/build_defs:java_service.bzl", "java_service")

java_service(
    name = "payment_service",
    srcs = glob(["src/**/*.java"]),
    deps = [
        "//shared/proto:payment_proto_java",
        "@maven//:com_google_guava_guava",
    ],
    visibility = ["//visibility:public"],
)
Macros generate implicit target names
Because macros expand to multiple rules, target names are generated programmatically. If a macro creates name + "_lib", that target is queryable as //pkg:payment_service_lib. Poorly named generated targets can be confusing — document your macro's output targets clearly.

Built-in Rules

Java Rules

load("@rules_java//java:defs.bzl", "java_binary", "java_library", "java_test")

# Library — compiled JAR, depended on by other targets
java_library(
    name = "my_lib",
    srcs = glob(["src/main/java/**/*.java"]),
    resources = glob(["src/main/resources/**"]),
    deps = [
        "//shared:common",
        "@maven//:com_google_guava_guava",
        "@maven//:org_slf4j_slf4j_api",
    ],
    exports = [
        # Targets that depend on my_lib also get these on the classpath
        "@maven//:org_slf4j_slf4j_api",
    ],
    visibility = ["//visibility:public"],
)

# Binary — runnable JAR with a main class
java_binary(
    name = "server",
    main_class = "com.example.Server",
    runtime_deps = [":my_lib"],
    jvm_flags = ["-Xmx512m", "-Dfile.encoding=UTF-8"],
)

# Test — JUnit test target
java_test(
    name = "my_lib_test",
    srcs = glob(["src/test/java/**/*Test.java"]),
    test_class = "com.example.MyLibTest",
    deps = [
        ":my_lib",
        "@maven//:junit_junit",
        "@maven//:org_mockito_mockito_core",
    ],
    size = "small",     # small/medium/large/enormous — sets timeout
)

Python Rules

load("@rules_python//python:defs.bzl", "py_binary", "py_library", "py_test")

py_library(
    name = "my_lib",
    srcs = glob(["src/**/*.py"]),
    deps = [
        "//shared:utils",
        "@pip//requests",   # pip dependency via pip_parse
    ],
    imports = ["src"],      # adds src/ to PYTHONPATH
)

py_binary(
    name = "main",
    srcs = ["main.py"],
    deps = [":my_lib"],
    python_version = "PY3",
    main = "main.py",
)

py_test(
    name = "my_lib_test",
    srcs = ["test_my_lib.py"],
    deps = [
        ":my_lib",
        "@pip//pytest",
    ],
)

Go Rules (rules_go)

load("@io_bazel_rules_go//go:defs.bzl", "go_binary", "go_library", "go_test")

go_library(
    name = "my_lib",
    srcs = glob(["*.go"], exclude = ["*_test.go"]),
    importpath = "github.com/myorg/myrepo/path/to/pkg",
    deps = [
        "//shared:common",
        "@com_github_gorilla_mux//:mux",
    ],
    visibility = ["//visibility:public"],
)

go_binary(
    name = "server",
    embed = [":my_lib"],   # embed the library into the binary
    visibility = ["//visibility:public"],
)

go_test(
    name = "my_lib_test",
    srcs = glob(["*_test.go"]),
    embed = [":my_lib"],
    deps = [
        "@com_github_stretchr_testify//assert",
    ],
)

C++ Rules

cc_library(
    name = "my_lib",
    srcs = ["my_lib.cc"],
    hdrs = ["my_lib.h"],      # public headers — propagated to dependents
    includes = ["."],          # include search path
    copts = ["-std=c++17"],
    deps = ["//third_party:abseil"],
    visibility = ["//visibility:public"],
)

cc_binary(
    name = "my_app",
    srcs = ["main.cc"],
    deps = [":my_lib"],
    linkopts = ["-pthread"],
)

cc_test(
    name = "my_lib_test",
    srcs = ["my_lib_test.cc"],
    deps = [
        ":my_lib",
        "@com_google_googletest//:gtest_main",
    ],
)

Generic Rules

# genrule — run an arbitrary shell command to produce outputs
genrule(
    name = "gen_config",
    srcs = ["config.template"],
    outs = ["config.h"],
    cmd = """sed 's/VERSION/1.2.3/g' $< > $@""",
    # $< = first src, $@ = first out, $(SRCS) = all srcs, $(OUTS) = all outs
)

# filegroup — group files without building anything
filegroup(
    name = "all_protos",
    srcs = glob(["**/*.proto"]),
    visibility = ["//visibility:public"],
)

# test_suite — run multiple tests together
test_suite(
    name = "all_tests",
    tests = [
        ":unit_tests",
        ":integration_tests",
    ],
)

# sh_test — shell script as a test
sh_test(
    name = "smoke_test",
    srcs = ["smoke_test.sh"],
    data = [":server"],   # make :server available in runfiles
)

Dependencies

deps vs runtime_deps vs data

Attribute    | Compile time?            | Runtime?           | Use for
deps         | Yes                      | Yes                | Libraries needed at compile and runtime
runtime_deps | No                       | Yes                | Plugins, drivers, logging backends loaded via reflection
data         | No (files only)          | Yes (via runfiles) | Config files, testdata, fixtures, binaries needed at runtime
exports      | Propagated to dependents | Propagated         | Re-export a dep's API as part of your public API (Java)

java_test(
    name = "db_test",
    srcs = ["DbTest.java"],
    deps = [
        ":db_lib",                  # compile + runtime
        "@maven//:junit_junit",
    ],
    runtime_deps = [
        "@maven//:org_postgresql_postgresql",  # JDBC driver — loaded at runtime via Class.forName()
    ],
    data = [
        "//testdata:schema.sql",    # file available in runfiles at test time
        "//tools:db_server",        # binary launched by the test
    ],
)

http_archive (WORKSPACE)

load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")

http_archive(
    name = "com_google_protobuf",
    sha256 = "616bb3536ac1fff3fb1a141450fa28b875e985712170ea7f1866a1e3f2a36be5",
    strip_prefix = "protobuf-24.4",
    urls = [
        "https://mirror.bazel.build/github.com/protocolbuffers/protobuf/archive/v24.4.tar.gz",
        "https://github.com/protocolbuffers/protobuf/archive/v24.4.tar.gz",
    ],
)
Always specify sha256
Omitting sha256 makes the download non-hermetic — the archive could change between fetches. Bazel will print a warning and the correct hash after the first successful download.

rules_jvm_external (Maven Dependencies)

# WORKSPACE.bazel
load("@rules_jvm_external//:defs.bzl", "maven_install")

maven_install(
    artifacts = [
        "com.google.guava:guava:32.1.3-jre",
        "junit:junit:4.13.2",
        "org.mockito:mockito-core:5.6.0",
        "org.postgresql:postgresql:42.7.1",
    ],
    repositories = [
        "https://repo1.maven.org/maven2",
        "https://repo.maven.apache.org/maven2",
    ],
    # Pin versions — generate with: bazel run @maven//:pin
    maven_install_json = "//:maven_install.json",
)

load("@maven//:defs.bzl", "pinned_maven_install")
pinned_maven_install()
# Generate the lockfile after changing artifacts
bazel run @maven//:pin

# Reference artifacts in BUILD files as:
# @maven//:group_id_artifact_id   (dots and hyphens become underscores)
# e.g. @maven//:com_google_guava_guava

pip_parse (Python Dependencies)

# WORKSPACE.bazel
load("@rules_python//python:pip.bzl", "pip_parse")

pip_parse(
    name = "pip",
    requirements_lock = "//requirements:requirements_lock.txt",
)

load("@pip//:requirements.bzl", "install_deps")
install_deps()
# Generate locked requirements
pip-compile requirements.in -o requirements/requirements_lock.txt

# Reference in BUILD files as:
# @pip//requests      (package name, underscores for hyphens)
# @pip//flask
# @pip//google_cloud_bigquery

go_repository (Go External Deps)

# WORKSPACE.bazel
load("@bazel_gazelle//:deps.bzl", "go_repository")

go_repository(
    name = "com_github_gorilla_mux",
    importpath = "github.com/gorilla/mux",
    sum = "h1:TIyPZe4MgqvfeYDBFedMoGGpEw/LqOeaOT+nhxU+yHo=",
    version = "v1.8.1",
)

# Better: use gazelle to auto-generate go_repository entries
# bazel run //:gazelle-update-repos -- -from_file=go.mod -to_macro=deps.bzl%go_dependencies

Bzlmod (Bazel 6+)

Bzlmod is the new, recommended dependency management system replacing WORKSPACE files. It introduces MODULE.bazel files, a central registry (Bazel Central Registry), and a lockfile — solving dependency diamond problems and version conflicts that WORKSPACE couldn't handle.

MODULE.bazel Syntax

# MODULE.bazel — in the workspace root

# Declare your module
module(
    name = "my_monorepo",
    version = "1.0.0",
    # repo_name overrides the @-name used to reference this module
)

# Declare dependencies — versions are resolved centrally
bazel_dep(name = "rules_go", version = "0.49.0")
bazel_dep(name = "rules_python", version = "0.31.0")
bazel_dep(name = "gazelle", version = "0.36.0")
bazel_dep(name = "rules_jvm_external", version = "6.1")
bazel_dep(name = "protobuf", version = "24.4", repo_name = "com_google_protobuf")

# Dev-only dependencies (not propagated to dependents)
bazel_dep(name = "buildifier_prebuilt", version = "6.4.0", dev_dependency = True)

# Use a module extension to configure go toolchain
go_sdk = use_extension("@rules_go//go:extensions.bzl", "go_sdk")
go_sdk.download(version = "1.22.0")

# Use a module extension to configure Python
python = use_extension("@rules_python//python/extensions:python.bzl", "python")
python.toolchain(
    python_version = "3.12",
    is_default = True,
)

# Expose repos from extensions into your module's scope
use_repo(go_sdk, "go_toolchains")

# Maven deps via rules_jvm_external extension
maven = use_extension("@rules_jvm_external//:extensions.bzl", "maven")
maven.install(
    artifacts = [
        "com.google.guava:guava:32.1.3-jre",
        "junit:junit:4.13.2",
    ],
    repositories = ["https://repo1.maven.org/maven2"],
)
use_repo(maven, "maven")

Lockfile

# Bazel automatically writes MODULE.bazel.lock
# Always commit it — it pins exact resolved versions

# Update the lockfile after changing MODULE.bazel
bazel mod deps --lockfile_mode=update

# Check for outdated dependencies
bazel mod deps --check_direct_dependencies=warning

Overrides (for Development)

# MODULE.bazel — override a dep with a local path during development
# Only valid in root module (not in libraries)

single_version_override(
    module_name = "rules_go",
    version = "0.49.0",
    patches = ["//patches:rules_go_fix.patch"],
)

# Use a local directory instead of the registry version
local_path_override(
    module_name = "my_shared_lib",
    path = "../my_shared_lib",
)

Registry

The Bazel Central Registry (BCR) at bcr.bazel.build is the default source for modules. You can also host a private registry for internal modules.

# MODULE.bazel — add a private registry
# (Bazel checks registries in order, BCR is always the fallback)
bazel_dep(name = "my_internal_lib", version = "2.1.0")

# .bazelrc — specify custom registry
# common --registry=https://my-internal-registry.corp.com
# common --registry=https://bcr.bazel.build
WORKSPACE vs Bzlmod migration
Many projects migrate gradually: MODULE.bazel for deps that are available as Bzlmod modules, plus a shrinking WORKSPACE.bazel for rules that haven't migrated yet. Enable Bzlmod with common --enable_bzlmod in .bazelrc. The goal is to eventually delete WORKSPACE.bazel entirely.

Building & Testing

Core Commands

# Build a specific target
bazel build //src/main/java/com/example:server

# Build all targets in a package
bazel build //src/main/java/com/example/...

# Build everything in the repo
bazel build //...

# Run a binary
bazel run //src/main/java/com/example:server -- --port=8080
#                                              ^^ args passed to the binary

# Run a specific test
bazel test //src/test/java/com/example:server_test

# Run all tests
bazel test //...

# Run tests matching a filter (Java JUnit)
bazel test //... --test_filter=com.example.ServerTest#testStartup

# Run tests with verbose output
bazel test //... --test_output=all       # all output always
bazel test //... --test_output=errors    # output only on failure (CI-friendly)
bazel test //... --test_output=streamed  # stream output in real time

Test Sharding and Parallelism

# Shard a slow test across N workers
bazel test //... --test_sharding_strategy=explicit  # respects shard_count in BUILD

# In BUILD file:
java_test(
    name = "heavy_test",
    srcs = ["HeavyTest.java"],
    shard_count = 4,    # Bazel runs 4 shards in parallel
    size = "large",
)

# Run tests in parallel (default: uses all CPUs)
bazel test //... --jobs=8        # limit to 8 parallel jobs
bazel test //... --local_test_jobs=4  # limit local test executions

Output Paths

# Bazel creates symlinks in the workspace root (don't commit these):
bazel-bin/          # compiled binaries and outputs
bazel-out/          # all outputs, organized by config
bazel-testlogs/     # test logs and XMLs
bazel-genfiles/     # deprecated — generated files now live alongside bazel-bin

# Find a specific output file
bazel build //src:server && ls -la bazel-bin/src/server

# The actual output base (outside workspace)
bazel info output_base     # e.g., ~/.cache/bazel/_bazel_user/abc123/
bazel info bazel-bin       # absolute path to bazel-bin
bazel info execution_root  # where actions run

Useful Flags

# Compilation mode
bazel build //... -c opt      # optimized (release)
bazel build //... -c dbg      # debug symbols
bazel build //... -c fastbuild # default — minimal optimizations

# Verbose — see all commands executed
bazel build //... --subcommands

# Sandbox debugging — keep sandbox after failure
bazel build //... --sandbox_debug

# Show what would be built without building
bazel build //... --nobuild

# Bypass the remote cache (neither read nor write cached results)
bazel build //... --noremote_accept_cached --noremote_upload_results

# Disk cache
bazel build //... --disk_cache=~/.cache/bazel-disk

# Profile a slow build
bazel build //... --profile=/tmp/build_profile.json.gz
# Then open in Chrome: chrome://tracing

Query Language

Bazel has three query tools, each operating on a different graph:

  • bazel query — the dependency graph (targets and their deps), fast, configuration-independent
  • bazel cquery — the configured dependency graph (accounts for select() and transitions)
  • bazel aquery — the action graph (individual compiler/linker invocations)

bazel query Examples

# List all targets in a package
bazel query //src/main/java/com/example/...

# Find direct dependencies of a target
bazel query 'deps(//src:server, 1)'

# Find ALL transitive dependencies
bazel query 'deps(//src:server)'

# Find reverse dependencies (what depends on this?)
bazel query 'rdeps(//..., //shared:common)'

# Find all paths between two targets
bazel query 'allpaths(//src:server, //shared:common)'

# Find a single path (faster)
bazel query 'somepath(//src:server, //shared:common)'

# Find all java_library targets
bazel query 'kind(java_library, //...)'

# Find targets with a specific attribute
bazel query 'attr(visibility, "//visibility:public", //...)'

# Filter by name pattern
bazel query 'filter(".*_test", //src/...)'

# Set operations
bazel query '//src/... except //src/test/...'
bazel query '//src/... intersect kind(java_library, //...)'

# Find all tests
bazel query 'tests(//...)'

# Output formats
bazel query 'deps(//src:server)' --output=graph | dot -Tpng > deps.png
bazel query 'deps(//src:server)' --output=label_kind
bazel query '//...' --output=build    # show BUILD file representation

bazel cquery

# cquery is like query but accounts for configuration
# Use it when select() causes issues with plain query

# Show which deps are selected for the linux platform
bazel cquery 'deps(//src:server)' --platforms=@io_bazel_rules_go//go/toolchain:linux_amd64

# Show configured target output files
bazel cquery //src:server --output=files

# Starlark output format for custom processing
bazel cquery //... \
  --output=starlark \
  --starlark:expr='target.label.name + " " + str(target.label.package)'

bazel aquery

# Show all actions for a target
bazel aquery //src:server

# Show actions that produce a specific output
bazel aquery 'outputs(".*ServerImpl.class", //src:server_lib)'

# Show the exact command line of a compile action
bazel aquery //src:server --output=text | grep -A5 "action 'Javac'"

# Find all genrule actions
bazel aquery 'kind(Genrule, //...)'

Custom Rules

When macros aren't enough — when you need a new provider, a new action type, or integration with an external tool — you write a custom rule.

Rule Anatomy

# tools/build_defs/proto_gen.bzl

# Step 1: Declare a provider — the typed output of your rule
ProtoInfo = provider(
    "Information about compiled protobuf outputs",
    fields = {
        "java_srcs": "depset of generated Java source files",
        "descriptor": "FileInfo for the .descriptor file",
    },
)

# Step 2: Write the implementation function
# ctx is the analysis context — it has everything you need
def _proto_java_library_impl(ctx):
    # ctx.attr — declared attributes
    proto_srcs = ctx.files.srcs
    output_dir = ctx.actions.declare_directory("gen_java")

    # ctx.executable — executable tools declared in attrs
    protoc = ctx.executable._protoc

    # Declare outputs
    descriptor = ctx.actions.declare_file(ctx.label.name + ".descriptor")

    # Declare an action — hermetic, sandboxed
    ctx.actions.run(
        inputs = depset(proto_srcs),
        outputs = [output_dir, descriptor],
        executable = protoc,
        arguments = [
            "--java_out=" + output_dir.path,
            "--descriptor_set_out=" + descriptor.path,
        ] + [f.path for f in proto_srcs],
        mnemonic = "ProtoJava",
        progress_message = "Generating Java from protos %{label}",
    )

    # Return providers — other rules consume these
    return [
        DefaultInfo(files = depset([output_dir])),
        ProtoInfo(
            java_srcs = depset([output_dir]),
            descriptor = descriptor,
        ),
    ]

# Step 3: Call rule() to create the rule
proto_java_library = rule(
    implementation = _proto_java_library_impl,
    attrs = {
        "srcs": attr.label_list(
            allow_files = [".proto"],
            doc = "Proto source files",
        ),
        "deps": attr.label_list(
            providers = [ProtoInfo],   # only accept targets that provide ProtoInfo
            doc = "Proto library dependencies",
        ),
        # Private attrs for tools — prefixed with underscore by convention
        "_protoc": attr.label(
            default = "@com_google_protobuf//:protoc",
            allow_single_file = True,
            executable = True,
            cfg = "exec",     # run the tool in the execution config, not target config
        ),
    },
    # Note: the legacy outputs = {...} implicit-outputs parameter is deprecated.
    # The descriptor output is already declared with ctx.actions.declare_file()
    # in the implementation, so nothing more is needed here.
)

ctx.actions — Declaring Work

def _my_rule_impl(ctx):
    # Run a command
    ctx.actions.run(
        inputs = ctx.files.srcs,
        outputs = [ctx.outputs.out],
        executable = ctx.executable.tool,
        arguments = ["--input", ctx.files.srcs[0].path, "--output", ctx.outputs.out.path],
    )

    # Run a shell command (less hermetic — avoid when possible)
    ctx.actions.run_shell(
        inputs = ctx.files.srcs,
        outputs = [ctx.outputs.out],
        command = "cat %s | gzip > %s" % (
            " ".join([f.path for f in ctx.files.srcs]),
            ctx.outputs.out.path,
        ),
    )

    # Write a file with known content
    config = ctx.actions.declare_file("config.json")
    ctx.actions.write(
        output = config,
        content = json.encode({"version": ctx.attr.version}),
    )

    # Expand a template
    ctx.actions.expand_template(
        template = ctx.file.template,
        output = ctx.outputs.out,
        substitutions = {
            "{VERSION}": ctx.attr.version,
            "{NAME}": ctx.label.name,
        },
    )

Aspects

Aspects propagate along dependency edges to collect information or add actions to targets they don't own. They're used for IDE integration, code generation, and cross-cutting analysis.

# An aspect that collects all source files transitively
CollectedSrcs = provider(fields = ["srcs"])

def _collect_srcs_impl(target, ctx):
    # target — the target this aspect is applied to
    # ctx — analysis context (ctx.rule has the rule attributes)
    direct_srcs = []
    if hasattr(ctx.rule.attr, "srcs"):
        direct_srcs = ctx.rule.files.srcs

    # Collect srcs from deps that also have this aspect applied
    transitive = [dep[CollectedSrcs].srcs for dep in ctx.rule.attr.deps
                  if CollectedSrcs in dep]

    return [CollectedSrcs(srcs = depset(direct_srcs, transitive = transitive))]

collect_srcs = aspect(
    implementation = _collect_srcs_impl,
    attr_aspects = ["deps"],   # propagate along 'deps' edges
)

Remote Execution & Caching

How Remote Caching Works

Bazel computes a cache key for each action based on the hash of its inputs (source files, tool versions, environment variables, command-line flags). Before running an action, Bazel checks the remote cache. On a hit, it downloads the output instead of executing the action. On a miss, it executes locally and uploads the result.
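The key derivation can be sketched as a stable hash over everything that may affect an action's output — a simplified Python model, not Bazel's exact scheme (Bazel hashes a canonical protobuf encoding of the action, but the principle is the same):

```python
import hashlib

def action_cache_key(input_digests, argv, env):
    """Toy model of an action cache key: any change to an input file's
    digest, the command line, or the environment produces a new key,
    i.e. a cache miss."""
    h = hashlib.sha256()
    for path, digest in sorted(input_digests.items()):  # sorted for stability
        h.update(path.encode())
        h.update(digest.encode())
    for arg in argv:
        h.update(arg.encode())
    for k, v in sorted(env.items()):
        h.update(("%s=%s" % (k, v)).encode())
    return h.hexdigest()

k1 = action_cache_key({"a.c": "sha-1111"}, ["cc", "-c", "a.c"], {"PATH": "/usr/bin"})
k2 = action_cache_key({"a.c": "sha-2222"}, ["cc", "-c", "a.c"], {"PATH": "/usr/bin"})
assert k1 != k2  # changed input digest -> different key -> miss
```

Because the key covers tool versions and flags as well, two machines with identical checkouts and toolchains compute identical keys — which is what makes sharing a cache across machines safe.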

Remote Cache Configuration

# .bazelrc — remote cache (read/write)
build:ci --remote_cache=grpc://cache.internal.corp:9092
build:ci --remote_upload_local_results  # upload local results to the cache

# HTTP-based cache (simpler setup)
build:ci --remote_cache=http://cache.internal.corp:8080

# Read-only cache (safe for local dev — don't pollute shared cache)
build --remote_cache=grpc://cache.internal.corp:9092
build --noremote_upload_local_results   # don't upload from local machines

# Google Cloud Storage as cache backend (via bazel-remote or Remote Build Execution)
build:gcs --remote_cache=https://storage.googleapis.com/my-bazel-cache
build:gcs --google_default_credentials

Remote Execution

# Remote execution — Bazel sends actions to a build farm
# Compatible servers: BuildBuddy, EngFlow, Buildbarn, Google Cloud RBE

build:rbe --remote_executor=grpc://rbe.buildbuddy.io:443
build:rbe --remote_cache=grpc://rbe.buildbuddy.io:443
build:rbe --remote_instance_name=projects/my-gcp-project/instances/default
build:rbe --google_default_credentials

# Platform config — tell the build farm what container to use
build:rbe --extra_execution_platforms=//platforms:remote_linux_amd64

# Execute locally but check remote cache first
build --remote_executor=          # unset executor = local execution
build --remote_cache=grpc://...   # but still use the shared cache

Disk Cache

# Local persistent cache — survives bazel clean
build --disk_cache=~/.cache/bazel-disk

# Combined: disk cache as local fallback, remote as primary
build --disk_cache=~/.cache/bazel-disk
build --remote_cache=grpc://cache.internal.corp:9092

# Clear the disk cache
rm -rf ~/.cache/bazel-disk

Build Event Protocol

# Publish build events to a BES backend (e.g., BuildBuddy)
build --bes_backend=grpc://events.buildbuddy.io:443
build --bes_results_url=https://app.buildbuddy.io/invocation/

# This gives you a build dashboard with logs, test results, timing

Configuration

.bazelrc Hierarchy

| File | Location | Committed? | Purpose |
|---|---|---|---|
| system bazelrc | /etc/bazel.bazelrc | N/A | Sysadmin defaults for all users |
| user bazelrc | ~/.bazelrc | No | Personal developer preferences |
| workspace bazelrc | .bazelrc in workspace | Yes | Project-wide flags shared by the team |
| invocation bazelrc | specified with --bazelrc= | Sometimes | CI environment-specific flags |
# .bazelrc structure: command[:config] flags
# "command" is the Bazel command (build, test, run, query, common)
# ":config" is an optional named config, activated with --config=name

common --enable_bzlmod
common --announce_rc             # print flags being used (helpful for debugging)

build --incompatible_strict_action_env
build --sandbox_default_allow_network=false

# Named config — activated with: bazel build --config=asan
build:asan --strip=never
build:asan --copt=-fsanitize=address
build:asan --linkopt=-fsanitize=address

build:tsan --strip=never
build:tsan --copt=-fsanitize=thread
build:tsan --linkopt=-fsanitize=thread

test --test_output=errors
test --test_verbose_timeout_warnings

# CI config
build:ci --remote_cache=grpc://cache.corp.com:9090
build:ci --remote_upload_local_results   # CI machines should populate the shared cache
build:ci --build_event_publish_all_actions

Platforms and Toolchains

# platforms/BUILD.bazel — define target and execution platforms

platform(
    name = "linux_x86_64",
    constraint_values = [
        "@platforms//os:linux",
        "@platforms//cpu:x86_64",
    ],
)

platform(
    name = "macos_arm64",
    constraint_values = [
        "@platforms//os:macos",
        "@platforms//cpu:aarch64",
    ],
)

# Toolchain — maps a platform to a concrete tool
toolchain(
    name = "my_cc_toolchain_linux",
    toolchain = "//toolchains:cc_linux",
    toolchain_type = "@bazel_tools//tools/cpp:toolchain_type",
    target_compatible_with = [
        "@platforms//os:linux",
        "@platforms//cpu:x86_64",
    ],
)
# Select a target platform at build time
bazel build //... --platforms=//platforms:linux_x86_64

# Cross-compilation: build for linux from macOS
bazel build //src:server \
  --platforms=//platforms:linux_x86_64 \
  --host_platform=//platforms:macos_arm64

Transitions

Transitions change the build configuration for a specific dep or across an entire subtree. They're used for fat binaries, multi-arch builds, and feature flags.

# A configuration transition that switches to opt mode
def _opt_transition_impl(settings, attr):
    return {"//command_line_option:compilation_mode": "opt"}

opt_transition = transition(
    implementation = _opt_transition_impl,
    inputs = [],
    outputs = ["//command_line_option:compilation_mode"],
)

# Apply the transition to a dep
def _my_rule_impl(ctx):
    # ctx.attr.optimized_dep is built in opt mode
    pass

my_rule = rule(
    implementation = _my_rule_impl,
    attrs = {
        "optimized_dep": attr.label(cfg = opt_transition),
        # Older Bazel versions (6 and earlier) also require the allowlist attr:
        # "_allowlist_function_transition": attr.label(
        #     default = "@bazel_tools//tools/allowlists/function_transition_allowlist",
        # ),
    },
)

Performance

Understanding Build Phases

Bazel builds in three phases:

  1. Loading — reads and evaluates all BUILD and .bzl files reachable from the targets requested
  2. Analysis — runs rule implementation functions, constructs the action graph
  3. Execution — runs actions (compiles, links, tests), in parallel where possible
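When diagnosing slowness it helps that the phases can be exercised (approximately) in isolation — bazel query stops after loading, and --nobuild stops before execution:

```
# Loading only — evaluates BUILD files, no rule analysis
bazel query //... > /dev/null

# Loading + analysis — builds the action graph but executes nothing
bazel build --nobuild //...

# All three phases
bazel build //...
```

Comparing the wall time of these three invocations tells you which phase dominates before you reach for a full profile.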

Profiling

# Generate a profile
bazel build //... --profile=/tmp/build.json.gz

# Visualize in chrome://tracing or https://ui.perfetto.dev (load the file)
# OR use Bazel's built-in analysis
bazel analyze-profile /tmp/build.json.gz

# Profile the loading phase separately
bazel build //... --profile=/tmp/profile.gz \
  --record_full_profiler_data

Persistent Workers

Workers are long-lived processes that handle multiple compile requests. They avoid JVM startup overhead for Java/Kotlin and are critical for build speed at scale.

# Enable persistent workers (usually on by default for Java)
build --strategy=Javac=worker
build --strategy=KotlinCompile=worker

# Control worker parallelism
build --worker_max_instances=4

# Workers use a work request protocol — actions must explicitly support it
# Check: bazel build //... --worker_verbose to see worker activity
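The work request protocol mentioned above can be sketched in Python — a simplified model of the JSON flavor of the worker protocol (the field names arguments, requestId, exitCode, and output match that protocol; the compile step is a stand-in):

```python
import json
import sys

def handle_request(request):
    """Handle one WorkRequest dict and return a WorkResponse dict."""
    args = request.get("arguments", [])
    # A real worker would invoke the compiler here, reusing warm state
    # (loaded classes, in-memory caches) across requests.
    return {
        "exitCode": 0,
        "output": "compiled %d args" % len(args),
        "requestId": request.get("requestId", 0),
    }

def worker_loop(stdin=sys.stdin, stdout=sys.stdout):
    # One JSON request object per line. The process stays alive between
    # requests — that persistence is what amortizes startup cost.
    for line in stdin:
        response = handle_request(json.loads(line))
        stdout.write(json.dumps(response) + "\n")
        stdout.flush()
```

A real worker must also advertise support via the rule's execution requirements; this sketch only shows the request/response shape.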

Repository Cache

# External repos (http_archive downloads) are cached here
bazel info repository_cache
# Typically: ~/.cache/bazel/_bazel_user/cache/repos/v1/

# This cache is shared across workspaces on the same machine
# Useful for offline builds after first fetch

# Prefetch all external deps
bazel fetch //...

Avoiding Slow Builds

  • Keep packages small — large glob() patterns are evaluated on every build. Split big packages into smaller ones.
  • Minimize deps — every transitive dep increases analysis time. Prefer narrow interfaces and thin layers.
  • Use exports sparingly — exports in Java rules propagate the entire transitive closure upward, increasing classpath size.
  • Avoid glob() in deeply nested dirs — use explicit file lists or smaller patterns.
  • Mark tests with size — size = "small" (60s timeout), "medium" (300s), "large" (900s), "enormous" (3600s). Accurate sizing improves scheduling.
  • Use --build_tests_only — when running bazel test //..., skip building non-test targets.
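For instance, sizing a test is a single attribute; the target and file names below are illustrative:

```
java_test(
    name = "parser_test",
    size = "small",            # 60s timeout — accurate sizing helps the scheduler
    srcs = ["ParserTest.java"],
    deps = [":parser"],
)
```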

Build Graph Optimization

# Use depset for efficient transitive collection — O(1) merge vs O(n) list append
def _collect_impl(target, ctx):
    # BAD: O(n^2) for deep graphs
    all_srcs = []
    for dep in ctx.rule.attr.deps:
        all_srcs += dep[MyInfo].srcs   # list concatenation copies everything

    # GOOD: O(1) merge, O(n) traversal only at the end
    transitive = [dep[MyInfo].srcs for dep in ctx.rule.attr.deps]
    all_srcs = depset(ctx.rule.files.srcs, transitive = transitive)
    return [MyInfo(srcs = all_srcs)]

CI/CD Integration

GitHub Actions Example

# .github/workflows/ci.yml
name: CI
on:
  push:
    branches: [main]
  pull_request:

jobs:
  build-and-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Mount Bazel cache
        uses: actions/cache@v4
        with:
          path: ~/.cache/bazel
          key: ${{ runner.os }}-bazel-${{ hashFiles('.bazelversion', 'MODULE.bazel', 'MODULE.bazel.lock') }}
          restore-keys: ${{ runner.os }}-bazel-

      - name: Build
        run: bazel build --config=ci //...

      - name: Test
        run: bazel test --config=ci --test_output=errors //...

      - name: Upload test logs on failure
        if: failure()
        uses: actions/upload-artifact@v4
        with:
          name: test-logs
          path: bazel-testlogs/

Caching Strategy for CI

# Option 1: GitHub Actions cache (free, limited to 10GB)
# Works well for small/medium repos
# Cache key on .bazelversion + lockfiles

# Option 2: Self-hosted remote cache (bazel-remote server)
# docker run -p 9090:9090 -v /data/bazel-cache:/data \
#   buchgr/bazel-remote-cache:latest --dir=/data --max_size=50

# Then in .bazelrc:
# build:ci --remote_cache=grpc://your-cache-host:9090
# build:ci --remote_upload_local_results

# Option 3: BuildBuddy / EngFlow (managed remote cache + RBE)
# build:ci --remote_cache=grpcs://remote.buildbuddy.io
# build:ci --remote_header=x-buildbuddy-api-key=$BUILDBUDDY_API_KEY

Release Stamping

# Embed git info into binaries — useful for release artifacts
# Create tools/bazel/workspace_status.sh:
#!/bin/bash
echo "STABLE_GIT_COMMIT $(git rev-parse HEAD)"
echo "STABLE_GIT_TAG $(git describe --tags --always --dirty)"
echo "BUILD_TIMESTAMP $(date -u +%Y-%m-%dT%H:%M:%SZ)"
# .bazelrc
build --workspace_status_command=tools/bazel/workspace_status.sh
# Add --stamp for release builds only (disable for dev to preserve caching)
# bazel build --stamp //...

# In rules, access stamp values
def _my_binary_impl(ctx):
    if ctx.attr.stamp:
        stamp_files = [ctx.info_file, ctx.version_file]
        # ctx.info_file (stable-status.txt) holds the STABLE_* keys;
        # ctx.version_file (volatile-status.txt) holds the volatile values

Continuous Testing Patterns

# Only test targets affected by changed files

# Changed files since the merge base
git diff --name-only origin/main...HEAD > changed_files.txt

# Use bazel query to find tests that transitively depend on them
# (each path must resolve to a source-file label in the workspace)
bazel query \
  "tests(rdeps(//..., set($(cat changed_files.txt | tr '\n' ' '))))" \
  --keep_going 2>/dev/null

# Or use a tool like bazel-diff for more robust affected target detection

Common Pitfalls

Non-Hermetic Actions

Reading undeclared inputs
An action that reads a file not in its inputs is non-hermetic. Bazel may or may not catch this via sandboxing — on macOS, the sandbox is less strict. The symptom is builds that succeed locally but fail in CI (different filesystem state).
# BAD — network access inside an action: the output depends on the outside world
ctx.actions.run_shell(
    command = "curl http://example.com/data > " + output.path,
    outputs = [output],
)

# BAD — reads an undeclared input file
ctx.actions.run(
    inputs = [src],             # forgot to declare headers!
    outputs = [out],
    executable = compiler,
    arguments = [src.path, "-I", "include/"],  # include/ not in inputs
)

# GOOD — declare all inputs
ctx.actions.run(
    inputs = depset([src] + ctx.files.hdrs),
    outputs = [out],
    executable = compiler,
    arguments = [src.path] + ["-I" + h.dirname for h in ctx.files.hdrs],
)

glob() Gotchas

# Pitfall 1: glob() does NOT cross package boundaries
# If src/foo/ has its own BUILD file, glob won't reach files in it
srcs = glob(["**/*.java"])  # stops at sub-package boundaries

# Pitfall 2: glob() is evaluated at loading time and matches only source files
# Newly added files are picked up on the next invocation (Bazel re-globs when
# directory contents change), but generated files are NEVER matched —
# list those explicitly or depend on the generating rule

# Pitfall 3: glob() with allow_empty
# Recent Bazel releases make an empty glob an error by default
# (--incompatible_disallow_empty_glob)
srcs = glob(["optional/**/*.py"], allow_empty = True)

# Pitfall 4: exclude patterns — easy to miss
srcs = glob(
    ["src/**/*.java"],
    exclude = [
        "src/**/*Test.java",    # exclude tests
        "src/**/generated/**",  # exclude generated code
    ],
)

Label Syntax Errors

# WRONG — missing // for absolute label
deps = ["src/main/java/com/example:lib"]     # interpreted as relative

# WRONG — extra colon
deps = ["//src/main:java:lib"]

# WRONG — spaces in label
deps = ["//src/main/java : lib"]

# CORRECT — absolute label
deps = ["//src/main/java/com/example:lib"]

# CORRECT — shorthand when target name = last segment of package path
deps = ["//src/main/java/com/example"]  # == //src/main/java/com/example:example

# CORRECT — relative label (within same BUILD file)
deps = [":lib"]

Circular Dependencies

# Bazel will error with a cycle detected message
# Diagnose with:
bazel query 'allpaths(//pkg:a, //pkg:b)'
bazel query 'allpaths(//pkg:b, //pkg:a)'

# Fix by:
# 1. Extracting shared code into a common library with no deps on either
# 2. Inverting the dependency (interface + implementation split)
# 3. Merging the two targets if they truly belong together
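Fix 1 in BUILD terms — a sketch with hypothetical targets, where the shared code moves into a leaf library both sides depend on:

```
# Before: //pkg:a <-> //pkg:b (cycle)
# After:  a -> common <- b, no cycle

java_library(
    name = "common",
    srcs = ["Shared.java"],    # the code a and b needed from each other
)

java_library(
    name = "a",
    srcs = ["A.java"],
    deps = [":common"],
)

java_library(
    name = "b",
    srcs = ["B.java"],
    deps = [":common"],
)
```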

Cache Poisoning

Mutating outputs breaks caching
If an action modifies a file after Bazel records its hash, or if two actions write to the same output, the cache is corrupted. Symptoms: flaky builds, wrong outputs that only clear after bazel clean.
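The corruption mechanism is easy to see with a toy content-addressed store — this models the idea, not Bazel's actual cache structures:

```python
import hashlib

def digest(data):
    """Content address: the hash a Bazel-style cache keys outputs by."""
    return hashlib.sha256(data).hexdigest()

# The build records the output's digest when the action completes
output = bytearray(b"compiled artifact v1")
recorded = digest(bytes(output))

# A later mutation (a test writing into bazel-bin, two actions sharing an
# output path, ...) silently breaks the recorded association
output[0:8] = b"TAMPERED"
assert digest(bytes(output)) != recorded
# Cache entries now serve bytes that no longer match their key — the
# "wrong outputs until bazel clean" symptom
```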
# Diagnose cache issues
bazel build //... --explain=/tmp/explain.txt
bazel build //... --verbose_explanations  # why each action ran

# Nuclear option — clear everything
bazel clean                # clears the output base for this workspace
bazel clean --expunge      # also clears the action cache and external repos
bazel clean --expunge_async  # same but in background

Slow First Builds

# First build fetches all external deps — can take minutes
# Warm up dev machines with:
bazel fetch //...      # downloads all external repos
bazel build //...      # populates the action cache

# On CI, use a combination of:
# 1. Remote cache (share results across all CI machines)
# 2. Persistent workers (avoid JVM startup)
# 3. Build only affected targets on PRs

Runfiles Confusion

Runfiles are the runtime data files a target needs to execute. They live in a .runfiles tree next to the binary, not in bazel-bin directly.

# In a test, access runfiles correctly
# Java
import com.google.devtools.build.runfiles.Runfiles;
Runfiles runfiles = Runfiles.create();
String path = runfiles.rlocation("my_workspace/testdata/config.json");

# Python (the runfiles helper from rules_python / the bazel-runfiles package)
from python.runfiles import Runfiles
r = Runfiles.Create()
path = r.Rlocation("my_workspace/testdata/config.json")

# Go
import "github.com/bazelbuild/rules_go/go/tools/bazel"
path, err := bazel.Runfile("testdata/config.json")

Output Base Issues

# Bazel's output base is outside the workspace — don't delete it manually
bazel info output_base     # see where it is
bazel info output_path     # output path for current workspace

# If builds fail mysteriously after upgrading Bazel or changing WORKSPACE:
bazel clean --expunge      # nuclear clean — full reset
# Then: bazel fetch //... to pre-warm

# Each workspace gets its own output base keyed by workspace path hash
# Moving a workspace to a different path = new output base = cold build

Quick Reference Cheatsheet

Frequently used commands
| Task | Command |
|---|---|
| Build everything | bazel build //... |
| Test everything | bazel test //... |
| Run a binary | bazel run //pkg:target -- args |
| List targets in package | bazel query //pkg/... |
| Find who depends on X | bazel query 'rdeps(//..., //pkg:X)' |
| Show dep tree | bazel query 'deps(//pkg:target)' --output=graph |
| Pin Maven deps | bazel run @maven//:pin |
| Update go deps | bazel run //:gazelle-update-repos |
| Fetch all external deps | bazel fetch //... |
| Profile build | bazel build --profile=/tmp/p.json.gz //... |
| Show build flags | bazel build //... --announce_rc |
| Clean workspace | bazel clean --expunge |
| Format BUILD files | buildifier -r . |
| Lint BUILD files | buildifier --lint=warn -r . |
| Update Bzlmod lock | bazel mod deps --lockfile_mode=update |