1# Build Performance
2
3## Debugging Build Performance
4
5### Tracing
6
7soong_ui has tracing built in, so that every build execution's trace can be
8viewed.  Just open `$OUT_DIR/build.trace.gz` in Chrome's <chrome://tracing>, or
9with [catapult's trace viewer][catapult trace_viewer]. The last few traces are
10stored in `build.trace.#.gz` (larger numbers are older). The associated logs
11are stored in `soong.#.log` and `verbose.#.log.gz`.
12
13![trace example](./trace_example.png)
14
15### Critical path
16
17soong_ui logs the wall time of the longest dependency chain compared to the
18elapsed wall time in `$OUT_DIR/soong.log`.  For example:
19```
20critical path took 3m10s
21elapsed time 5m16s
22perfect parallelism ratio 60%
23critical path:
24    0:00 build out/target/product/generic_arm64/obj/FAKE/sepolicy_neverallows_intermediates/policy_2.conf
25    0:04 build out/target/product/generic_arm64/obj/FAKE/sepolicy_neverallows_intermediates/sepolicy_neverallows
26    0:13 build out/target/product/generic_arm64/obj/ETC/plat_sepolicy.cil_intermediates/plat_sepolicy.cil
27    0:01 build out/target/product/generic_arm64/obj/ETC/plat_pub_versioned.cil_intermediates/plat_pub_versioned.cil
28    0:02 build out/target/product/generic_arm64/obj/ETC/vendor_sepolicy.cil_intermediates/vendor_sepolicy.cil
29    0:16 build out/target/product/generic_arm64/obj/ETC/sepolicy_intermediates/sepolicy
30    0:00 build out/target/product/generic_arm64/obj/ETC/plat_seapp_contexts_intermediates/plat_seapp_contexts
31    0:00 Install: out/target/product/generic_arm64/system/etc/selinux/plat_seapp_contexts
32    0:02 build out/target/product/generic_arm64/obj/NOTICE.txt
33    0:00 build out/target/product/generic_arm64/obj/NOTICE.xml.gz
34    0:00 build out/target/product/generic_arm64/system/etc/NOTICE.xml.gz
35    0:01 Installed file list: out/target/product/generic_arm64/installed-files.txt
36    1:00 Target system fs image: out/target/product/generic_arm64/obj/PACKAGING/systemimage_intermediates/system.img
37    0:01 Install system fs image: out/target/product/generic_arm64/system.img
38    0:01 Target vbmeta image: out/target/product/generic_arm64/vbmeta.img
39    1:26 Package target files: out/target/product/generic_arm64/obj/PACKAGING/target_files_intermediates/aosp_arm64-target_files-6663974.zip
40    0:01 Package: out/target/product/generic_arm64/aosp_arm64-img-6663974.zip
41    0:01 Dist: /buildbot/dist_dirs/aosp-master-linux-aosp_arm64-userdebug/6663974/aosp_arm64-img-6663974.zip
42```
43
44If the elapsed time is much longer than the critical path then additional
45parallelism on the build machine will improve total build times. If there are
46long individual times listed in the critical path then improving build times
47for those steps or adjusting dependencies so that those steps can run earlier
48in the build graph will improve total build times.
49
50### Soong
51
52Soong proper (i.e., `soong_build` executable that processes the blueprint
53files) can be traced and profiled using the standard Go tools. It understands
54the `-trace`, `-cpuprofile`, and `-memprofile` command line arguments.
55Setting `SOONG_PROFILE_CPU` and/or `SOONG_PROFILE_MEM` environment variables
56for the build enables respective profiling, e.g., running
57
58```shell
59SOONG_PROFILE_CPU=/tmp/foo m ..._
60```
61
62saves CPU profile for each Soong invocation in /tmp/foo._step_ file, where
63_step_ is Soong execution step. The main step is `build`. The others as
64`bp2build_files`, `bp2build_workspace`, `modulegraph`, `queryview`,
65`api_bp2build`, `soong_docs` (not all of them necessarily run during the build).
66The profiles can be inspected with `go tool pprof` from the command line or
67with _Run>Open Profiler Snapshot_ in IntelliJ IDEA.
68
69### Kati
70
71In general, the slow path of reading Android.mk files isn't particularly
72performance sensitive, since it doesn't need to happen on every build. It is
73important for the fast-path (detecting whether it needs to regenerate the ninja
74file) to be fast however. And it shouldn't hit the slow path too often -- so
75don't rely on output of a `$(shell)` command that includes the current timestamp,
76or read a file that's going to change on every build.
77
78#### Regen check is slow
79
80In most cases, we've found that the fast-path is slow because all of the
81`$(shell)` commands need to be re-executed to determine if their output changed.
82The `$OUT_DIR/verbose.log.gz` contains statistics from the regen check:
83
84```
85verbose: *kati*: regen check time: 0.754030
86verbose: *kati*: glob time (regen): 0.545859 / 43840
87verbose: *kati*: shell time (regen): 0.278095 / 66 (59 unique)
88verbose: *kati*:   0.012 / 1 mkdir -p out/target/product/generic && echo Android/aosp_arm/generic:R/AOSP.MASTER/$(date -d @$(cat out/build_date.txt) +%m%d%H%M):eng/test-keys >out/target/product/generic/build_fingerprint.txt && grep " " out/target/product/generic/build_fingerprint.txt
89verbose: *kati*:   0.010 / 1 echo 'com.android.launcher3.config.FlagOverrideSampleTest com.android.launcher3.logging.FileLogTest com.android.launcher3.model.AddWorkspaceItemsTaskTest com.android.launcher3.model.CacheDataUpdatedTaskTest com.android.launcher3.model.DbDowngradeHelperTest com.android.launcher3.model.GridBackupTableTest com.android.launcher3.model.GridSizeMigrationTaskTest com.android.launcher3.model.PackageInstallStateChangedTaskTest com.android.launcher3.popup.PopupPopulatorTest com.android.launcher3.util.GridOccupancyTest com.android.launcher3.util.IntSetTest' | tr ' ' '\n' | cat
90verbose: *kati*:   0.010 / 1 cd cts/tests/framework/base/windowmanager ; find -L  * -name "Components.java" -and -not -name ".*"
91verbose: *kati*:   0.010 / 1 git -C test/framework/build log -s -n 1 --format="%cd" --date=format:"%Y%m%d_%H%M%S" 2>/dev/null
92verbose: *kati*:   0.009 / 2 cd development/samples/ShortcutDemo/publisher ; find -L  ../common/src -name "*.java" -and -not -name ".*"
93verbose: *kati*:   0.009 / 2 cd development/samples/ShortcutDemo/launcher ; find -L  ../common/src -name "*.java" -and -not -name ".*"
94verbose: *kati*:   0.009 / 1 if ! cmp -s out/target/product/generic/obj/CONFIG/kati_packaging/dist.mk.tmp out/target/product/generic/obj/CONFIG/kati_packaging/dist.mk; then mv out/target/product/generic/obj/CONFIG/kati_packaging/dist.mk.tmp out/target/product/generic/obj/CONFIG/kati_packaging/dist.mk; else rm out/target/product/generic/obj/CONFIG/kati_packaging/dist.mk.tmp; fi
95verbose: *kati*:   0.008 / 1 mkdir -p out/target/product/generic && echo R/AOSP.MASTER/$(cat out/build_number.txt):eng/test-keys >out/target/product/generic/build_thumbprint.txt && grep " " out/target/product/generic/build_thumbprint.txt
96verbose: *kati*:   0.007 / 1 echo 'com.android.customization.model.clock.BaseClockManagerTest com.android.customization.model.clock.ClockManagerTest com.android.customization.model.grid.GridOptionsManagerTest com.android.customization.model.theme.ThemeManagerTest' | tr ' ' '\n' | cat
97verbose: *kati*:   0.007 / 1 uname -sm
98verbose: *kati*: stat time (regen): 0.361907 / 1241
99```
100
101In this case, the total time spent checking was 0.75 seconds, even though the
102other "(regen)" numbers add up to more than that (some parts are parallelized
103where possible). Often times, the biggest contributor is the `$(shell)` times
104-- in this case, 66 calls took 0.27s. The top 10 longest shell functions are
105printed.
106
107All the longest commands in this case are all variants of a call to `find`, but
108this is where using pure make functions instead of calling out to the shell can
109make a performance impact -- many calls to check if `26 > 20` can add up. We've
110added some basic math functions in `math.mk` to help some common use cases that
111used to be rather expensive when they were used too often.
112
113There are some optimizations in place for find commands -- if Kati can
114understand the find command, the built-in find emulator can turn some of them
115into glob or stat checks (falling back to calling `find` if one of those imply
116that the output may change). Many of the common macros produce find commands
117that Kati can understand, but if you're writing your own, you may want to
118experiment with other options if they're showing up in this list. For example,
119if this was significantly more expensive (either in runtime, or was called
120often):
121
122```
123.../kati.go:127: *kati*:  0.015 cd libcore && (find luni/src/test/java -name "*.java" 2> /dev/null) | grep -v -f java_tests_blacklist
124```
125
126It may be more efficient to move the grep into make, so that the `find` portion
127can be rewritten and cached:
128
129```
130$(filter-out $(file <$(LOCAL_PATH)/java_tests_blacklist),$(call all-java-files-under,luni/src/test/java))
131```
132
133Others can be simplified by just switching to an equivalent find command that
134Kati understands:
135
136```
137.../kati.go:127: *kati*:  0.217 find device vendor -type f -name \*.pk8 -o -name verifiedboot\* -o -name \*.x509.pem -o -name oem\*.prop | sort
138```
139
140By adding the implicit `-a` and moving the `| sort` to Make, this can now be
141cached by Kati:
142
143```
144$(sort $(shell find device vendor -type -f -a -name \*.pk8 -o -name verifiedboot\* -o -name \*.x509.pem -o -name oem\*.prop))
145```
146
147Kati has now learned about the implicit `-a`, so this particular change is no
148longer necessary, but the basic concept holds.
149
150#### Kati regens too often
151
152Kati prints out what triggered the slow path to be taken -- this can be a
153changed file, a changed environment variable, or different output from a
154`$(shell)` command:
155
156```
157out/soong/Android-aosp_arm.mk was modified, regenerating...
158```
159
160The state is stored in `$OUT_DIR/.kati_stamp*` files, and can be (partially)
161read with the `ckati_stamp_dump` tool in prebuilts/build-tools. More debugging
162is available when ckati is run with `--regen_debug`, but that can be a lot of
163data to understand.
164
165#### Debugging the slow path
166
167Kati will now dump out information about which Makefiles took the most time to
168execute. This is also in the `verbose.log.gz` file:
169
170```
171verbose: *kati*: included makefiles: 73.640833 / 232810 (1066 unique)
172verbose: *kati*:  18.389 /     1 out/soong/Android-aosp_arm.mk
173verbose: *kati*:  13.137 / 20144 build/make/core/soong_cc_prebuilt.mk
174verbose: *kati*:  11.743 / 27666 build/make/core/base_rules.mk
175verbose: *kati*:   2.289 /     1 art/Android.mk
176verbose: *kati*:   2.054 /     1 art/build/Android.cpplint.mk
177verbose: *kati*:   1.955 / 28269 build/make/core/clear_vars.mk
178verbose: *kati*:   1.795 /   283 build/make/core/package.mk
179verbose: *kati*:   1.790 /   283 build/make/core/package_internal.mk
180verbose: *kati*:   1.757 / 17382 build/make/core/link_type.mk
181verbose: *kati*:   1.078 /   297 build/make/core/aapt2.mk
182```
183
184This shows that soong_cc_prebuilt.mk was included 20144 times, for a total time
185spent of 13.137 secounds. While Android-aosp_arm.mk was only included once, and
186took 18.389 seconds. In this case, Android-aosp_arm.mk is the only file that
187includes soong_cc_prebuilt.mk, so we can safely assume that 13 of the 18 seconds
188in Android-aosp_arm.mk was actually spent within soong_cc_prebuilt.mk (or
189something that it included, like base_rules.mk).
190
191By default this only includes the top 10 entries, but you can ask for the stats
192for any makefile to be printed with `$(KATI_profile_makefile)`:
193
194```
195$(KATI_profile_makefile build/make/core/product.mk)
196```
197
198With these primitives, it's possible to get the timing information for small
199chunks, or even single lines, of a makefile. Just move the lines you want to
200measure into a new makefile, and replace their use with an `include` of the
201new makefile. It's possible to analyze where the time is being spent by doing
202a binary search using this method, but you do need to be careful not to split
203conditionals across two files (the ifeq/else/endif must all be in the same file).
204
205### Ninja
206
207#### Understanding why something rebuilt
208
209Add `NINJA_ARGS="-d explain"` to your environment before a build, this will cause
210ninja to print out explanations on why actions were taken. Start reading from the
211beginning, as this much data can be hard to read:
212
213```
214$ cd art
215$ mma
216$ touch runtime/jit/profile_compilation_info.h
217$ NINJA_ARGS="-d explain" mma
218...
219ninja explain: output out/soong/.intermediates/art/tools/cpp-define-generator/cpp-define-generator-data/linux_glibc_x86_64/obj/art/tools/cpp-define-generator/main.o older than most recent input art/runtime/jit/profile_compilation_info.h (
2201516683538 vs 1516685188)
221ninja explain: out/soong/.intermediates/art/tools/cpp-define-generator/cpp-define-generator-data/linux_glibc_x86_64/obj/art/tools/cpp-define-generator/main.o is dirty
222ninja explain: out/soong/.intermediates/art/tools/cpp-define-generator/cpp-define-generator-data/linux_glibc_x86_64/cpp-define-generator-data is dirty
223ninja explain: out/soong/host/linux-x86/bin/cpp-define-generator-data is dirty
224ninja explain: out/soong/.intermediates/art/tools/cpp-define-generator/cpp-define-generator-asm-support/gen/asm_support_gen.h is dirty
225ninja explain: out/soong/.intermediates/art/cmdline/art_cmdline_tests/android_arm_armv7-a_core_cmdline_parser_test/obj/art/cmdline/cmdline_parser_test.o is dirty
226...
227```
228
229In this case, art/cmdline/cmdline_parser_test.o was rebuilt because it uses
230asm_support_gen.h, which was generated by cpp-define-generator-data, which uses
231profile_compilation_info.h.
232
233You'll likely need to cross-reference this data against the build graph in the
234various .ninja files. The files are (mostly) human-readable, but a (slow) web
235interface can be used by running `NINJA_ARGS="-t browse <target>" m`.
236
237There is also `SOONG_UI_NINJA_ARGS`, which passes ninja arguments to soong ui's
238ninja invocations, e.g. to emit $OUT_DIR/soong/build.ninja, $OUT_DIR/soong/module-graph.json, etc.
239
240```bash
241$ m nothing
242$ touch Android.bp
243$ SOONG_UI_NINJA_ARGS="-d explain" m nothing
244...
245ninja explain: restat of output out/soong/build.ninja older than most recent input Android.bp
246...
247```
248
249#### Builds take a long time
250
251If the long part in the trace view of a build is a relatively solid block, then
252the performance is probably more related to how much time the actual build
253commands are taking than having extra dependencies, or slowdowns in
254soong/kati/ninja themselves.
255
256Beyond looking at visible outliers in the trace view, we don't have any tooling
257to help in this area yet. It's possible to aggregate some of the raw data
258together, but since our builds are heavily parallelized, it's particularly easy
259for build commands to impact unrelated build commands. This is an area we'd
260like to improve -- we expect keeping track of user/system time per-action would
261provide more reliable data, but tracking some full-system data (memory/swap
262use, disk bandwidth, etc) may also be necessary.
263
264## Known Issues
265
266### Common
267
268### <= Android 10 (Q): mm
269
270Soong always loads the entire module graph, so as modules convert from Make to
271Soong, `mm` is becoming closer to `mma`. This produces more correct builds, but
272does slow down builds, as we need to verify/produce/load a larger build graph.
273
274As of Android Q, loading large build graphs is fast, and in Android R, `mm` is
275now an alias of `mma`.
276
277### Android 8.1 (Oreo MR1)
278
279In some cases, a tree would get into a state where Soong would be run twice on
280every incremental build, even if there was nothing to do. This was fixed in
281master with [these changes][blueprint_microfactory], but they were too
282significant to backport at the time. And while they fix this particular issue,
283they appear to cause ninja to spend more time during every build loading the
284`.ninja_log` / `.ninja_deps` files, especially as they become larger.
285
286A workaround to get out of this state is to remove the build.ninja entry from
287`$OUT_DIR/.ninja_log`:
288
289```
290sed -i "/\/build.ninja/d" $(get_build_var OUT_DIR)/.ninja_log
291```
292
293[catapult trace_viewer]: https://github.com/catapult-project/catapult/blob/master/tracing/README.md
294[ninja parse optimization]: https://android-review.googlesource.com/c/platform/external/ninja/+/461005
295[blueprint_microfactory]: https://android-review.googlesource.com/q/topic:%22blueprint_microfactory%22+status:merged
296