Debugging an embedded Linux system - part I
Industrializing an embedded Linux based system is a complex task. Making sure that the system will be easy to debug is part of that complexity. In this series of articles, I will share my experience regarding the debugging of those systems.
- In the first article, I will cover the basics: how to use the most common embedded Linux debug tools and how to set up a coredump infrastructure.
- In the second article, we will discuss kernel debugging and how to set up a kdump infrastructure for your system.
- In the last article, I will discuss some more advanced debugging methods around custom backtraces with libunwind.
The value of logging
Across all the embedded Linux systems that I have worked on, one thing stood out: the first step of debugging is almost always to have a look at the system logs.
It is true that such systems often use eMMC as persistent storage. That technology offers a limited number of read/write cycles, so one has to be careful about how it is used. It is important not to flood the system logs with debug-level messages that will prematurely wear out the eMMC.
On systems that have uptimes of months or years, the logs also have to be rotated and compressed so that the persistent storage is not filled with system logs.
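For instance, on a system using a classic syslog daemon, a logrotate configuration along those lines handles both concerns (a minimal sketch; the log path and retention values are assumptions to adapt to your system):
/var/log/messages {
    weekly
    rotate 4
    compress
    missingok
    notifempty
}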
Regarding the logging technology itself, my preference goes to having a single system log file where all the applications write their logs. The embedded engineers and the validation team will quickly be able to parse this single log file in a few seconds, scrolling down that file as if they were half machine, half human.
When a bug occurs, it is often a combination of multiple factors: the RTC clock was not up to date, causing a delay in a first application, which spawned a second application too late, causing a third one to crash. Having to browse multiple log files scattered all around the place often makes debugging harder, and is the best way to miss the crucial path to the failure.
Using the same logging framework in all applications, and being quite picky about the messages and the log levels, is, in my opinion, often the best guarantee of a system that can be debugged easily.
There are multiple logging frameworks out there: I have worked with syslog and rsyslog on System V based embedded Linux systems and with journald on systemd based systems.
I have also seen systems using derivatives of the Android logger, such as ulog.
Some systems are using third-party libraries such as spdlog configured to log to the syslog.
Higher level frameworks such as Qt or Gtk have their own logging facilities that can also be configured to use the syslog.
Regardless of the logging framework, the important thing, once again, is to have a library that can be used consistently all over the code base. That library must provide support for different log levels. During code reviews, logging needs to be treated with extra care so that, at the default log level, only a few meaningful log messages are printed in the system logs.
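As an illustration, here is a minimal sketch of what that looks like in plain C with the POSIX syslog API; the program name and the messages are made up for the example:
#include <syslog.h>

int main(void)
{
    /* Tag every message with the program name so the single log file stays readable. */
    openlog("my-daemon", LOG_PID, LOG_DAEMON);

    /* Default log level: only NOTICE and above end up in the system logs. */
    setlogmask(LOG_UPTO(LOG_NOTICE));

    syslog(LOG_DEBUG, "dropped at the default log level");
    syslog(LOG_ERR, "failed to open the sensor device");

    closelog();
    return 0;
}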
In addition to the application logs, having the Linux kernel logs, as well as the logs of other kernels running concurrently, such as OP-TEE or dedicated systems running on co-processors, can be valuable.
Finally, once the embedded Linux system runs in the field, the system log file(s) have to be collectable by the client or remotely so that, in case of a crash, they can be analyzed. System logs can be part of diagnostic dumps, together with the coredumps and kdumps that will be covered in future sections of this article series.
Strace-ing everything
Now you have an embedded Linux system with a logging framework. Yet, some binaries or scripts do not behave the way you want them to. You have added plenty of debug-level traces, but you still do not get why your read call is returning early.
At that point, and in many other situations, the tool that I find the most useful to debug an embedded Linux system, right after logging, is strace. Your code base can be written in C or in a higher-level language such as Java; at some point, regardless of the abstraction layers, your programs and libraries will need to talk to the Linux kernel.
Opening a file, sending a network packet, rebooting the system: all of that will require one or multiple syscalls. Some syscalls, such as the read syscall mentioned above, are incredibly error-prone. Have you handled the case where your forked process dies, sends a SIGCHLD that interrupts your read call, which then returns early with an EINTR error? Probably not, and in that case strace will definitely help you.
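For reference, the classic way to handle that case in C is simply to retry the interrupted call; a minimal sketch (the helper name is mine):
#include <errno.h>
#include <unistd.h>

/* Retry read() when it is interrupted by a signal such as SIGCHLD. */
static ssize_t read_retry(int fd, void *buf, size_t count)
{
    ssize_t ret;

    do {
        ret = read(fd, buf, count);
    } while (ret == -1 && errno == EINTR);

    return ret;
}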
A few notes on how I use strace. If I have a process, say ls, that is somehow standalone and behaving funny, then the easiest is to start that given process with strace this way:
strace -f -s 100000 ls
As we are most of the time fishing here, following forks (-f) and avoiding truncated strings (-s something-big) are my default options.
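When the output becomes overwhelming, strace can also filter on specific syscalls and write to a file instead of the console; a hypothetical invocation, with a filter to adapt to what you are chasing:
strace -f -s 100000 -e trace=%file -o /tmp/ls.strace ls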
If we are dealing with a daemon that is started by the init manager under a specific environment, it is often handier to attach to the daemon process.
Something like:
strace -f -s 100000 $(pgrep wireplumber)
attaches to the running wireplumber daemon and gives you the opportunity to see every single syscall that is issued by the daemon, its arguments and its result:
strace: Process 1072 attached with 6 threads
[pid 1120] epoll_wait(6, <unfinished ...>
[pid 1093] restart_syscall(<... resuming interrupted read ...> <unfinished ...>
[pid 1091] futex(0x11a80b10, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid 1089] restart_syscall(<... resuming interrupted read ...> <unfinished ...>
[pid 1072] restart_syscall(<... resuming interrupted read ...> <unfinished ...>
[pid 1202] ppoll([{fd=48, events=POLLIN}, {fd=49, events=POLLIN}], 2, NULL, NULL, 8 <unfinished ...>
[pid 1093] <... restart_syscall resumed>) = 1
[pid 1093] write(13, "\1\0\0\0\0\0\0\0", 8) = 8
[pid 1093] recvmsg(31, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="l\4\1\18\0\0\0\r\f\0\0\225\0\0\0", iov_len=16}], msg_iovlen=1, msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_CMSG_CLOEXEC) = 16
[pid 1093] poll([{fd=31, events=POLLIN}], 1, 0) = 1 ([{fd=31, revents=POLLIN}])
[pid 1093] recvmsg(31, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\1\1o\0%\0\0\0/org/bluez/hci0/dev_06_65_7D_45_7A_16\0\0\0\2\1s\0\37\0\0\0org.freedesktop.DBus.Properties\0\3\1s\0\21\0\0\0PropertiesChanged\0\0\0\0\0\0\0\10\1g\0\10sa{sv}as\0\0\0\7\1s\0\4\0\0\0:1.1\0\0\0\0\22\0\0\0org.bluez.Battery1\0\0\23\0\0\0\0\0\0\0\n\0\0\0Percentage\0\1y\0@\0\0\0\0\0", iov_len=208}], msg_iovlen=1, msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_CMSG_CLOEXEC) = 208
[pid 1093] write(13, "\1\0\0\0\0\0\0\0", 8) = 8
[pid 1093] write(13, "\1\0\0\0\0\0\0\0", 8) = 8
[pid 1093] poll([{fd=11, events=POLLIN}, {fd=12, events=POLLIN}, {fd=13, events=POLLIN}], 3, 0) = 1 ([{fd=13, revents=POLLIN}])
[pid 1093] read(13, "\3\0\0\0\0\0\0\0", 8) = 8
[pid 1093] read(13, 0x7feb30c8fd98, 8) = -1 EAGAIN (Resource temporarily unavailable)
[pid 1093] write(3, "\1\0\0\0\0\0\0\0", 8) = 8
[pid 1072] <... restart_syscall resumed>) = 1
[pid 1093] write(3, "\1\0\0\0\0\0\0\0", 8 <unfinished ...>
Using strace requires no specific compilation option, symbols, or anything fancy. It just works with every binary or script out of the box and can help you in many cases.
Now, at some point, it can be that you have a proper crash, a memory corruption, or a timing issue in your program, and in that case strace won’t be of any help. You will need to put your faith in GDB.
Please help me GDB
I have seen GDB used in three different ways over the years:
- Integrated with a heavy IDE, to step through the code for hours trying to figure out how “the myself of three months ago” could write something that hard to understand.
- To debug a memory corruption using hardware breakpoints, or to analyze a process crash post-mortem by printing out the backtrace and the content of the stack and registers.
- Not used at all because it was “too scary”.
I have always felt that browsing the code, trying to build a mental representation of the code structure, and adding a few traces here and there was way more valuable in the long run than stepping through the code from an IDE. The good thing with (the only real) text editors such as Emacs is that the GDB integration is bad enough that you are not tempted to use it too much.
Let’s go back to a real world use case. You have built an embedded Linux image for your system and a C program is crashing once every two days. The only thing that you get is the elusive:
Segmentation fault
and now good luck. Nothing fancy like the super easy to understand Python backtraces:
Traceback (most recent call last):
  File "example.py", line 12, in <module>
    result = divide_numbers(10, 0)
  File "example.py", line 6, in divide_numbers
    return a / b
ZeroDivisionError: division by zero
only two words that you can meditate upon: Segmentation fault.
As you are using Yocto to build your image, you can rebuild it with GDB inside, but that won’t be enough. The program will still crash in the same way, and you will still be clueless.
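For reference, adding GDB to a Yocto image is usually a one-line configuration change, assuming your image recipe honors IMAGE_INSTALL:
IMAGE_INSTALL:append = " gdb"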
What you can do is enable coredumps, by doing something like:
ulimit -c unlimited
echo /tmp/core > /proc/sys/kernel/core_pattern
./your-program
and you should get something like:
Segmentation fault (core dumped)
which is way better than before even if we only have two new words.
We can use the GDB program that we have added to the image and the /tmp/core file that has been produced by the kernel. That file consists of the recorded state of the working memory of the program, and it will help us in our quest to learn more about the crash.
By running:
gdb ./your-program /tmp/core
we will get something slightly more helpful like:
GNU gdb (GDB) 15.2
Copyright (C) 2024 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
...
Reading symbols from ./your-program...
(No debugging symbols found in ./your-program)
...
(gdb) bt
#0 0x00000000004011e0 in ?? ()
#1 0x0000000000401235 in ?? ()
In a nutshell, GDB has read the unwinding tables of your-program, extracted the stack of the program at the time of the crash from /tmp/core, unwound the stack, and displayed a backtrace as instructed by the bt command.
The problem here is that we have used Yocto, which by default strips the symbols (basically, the names of the functions and variables) from every binary of the image to keep it lightweight. That’s why we get ?? instead of the names of the functions corresponding to the 0x00000000004011e0 and 0x0000000000401235 addresses.
We then have three options here:
- Instruct Yocto to keep the symbols and not to strip them, by modifying the configuration this way:
DEBUG_BUILD = "1"
INHIBIT_PACKAGE_STRIP = "1"
INHIBIT_PACKAGE_DEBUG_SPLIT = "1"
The image will get quite a bit bigger, but depending on the system and the underlying persistent storage, that can be acceptable.
- Instruct Yocto to keep compressed, minimal debug symbols, by modifying the configuration this way:
DISTRO_FEATURES:append = " minidebuginfo"
That will only add a ~10% overhead to the image but will make backtraces such as the one above more meaningful.
- Ask Yocto to produce archives with the debug symbols, download the /tmp/core file, and use GDB on the host system to analyze the core file. That option will be discussed in more detail right after.
Regardless of the option picked, we can get, on-target or off-target, something like:
GNU gdb (GDB) 15.2
Copyright (C) 2024 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
...
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00007f0a5f158c78 in __strlen_evex () from /gnu/store/ln6hxqjvz6m9gdd9s97pivlqck7hzs99-glibc-2.35/lib/libc.so.6
(gdb) bt
#0 0x00007f0a5f158c78 in __strlen_evex () from /gnu/store/ln6hxqjvz6m9gdd9s97pivlqck7hzs99-glibc-2.35/lib/libc.so.6
#1 0x00007f0a5f03f2e1 in __vfprintf_internal () from /gnu/store/ln6hxqjvz6m9gdd9s97pivlqck7hzs99-glibc-2.35/lib/libc.so.6
#2 0x00007f0a5f04ed83 in __vsnprintf_internal () from /gnu/store/ln6hxqjvz6m9gdd9s97pivlqck7hzs99-glibc-2.35/lib/libc.so.6
#3 0x00000000004011e0 in format_to_buffer (buffer=0x7ffd799ea690 "5.000001\t", buffer_size=256, format=0x402004 "%ld.%06ld\t%s\t%s") at tmp/t.c:11
#4 0x0000000000401235 in main () at tmp/t.c:20
which is already way more useful.
As of today, I feel like coredumps + GDB are the only portable and efficient way to get consistent backtraces. The problem is that setting up coredumps on your embedded Linux system is not always as simple as the ulimit -c unlimited && echo /tmp/core > /proc/sys/kernel/core_pattern call above.
You need to take into consideration the following elements:
- The coredumps can get quite heavy, a few gigabytes sometimes.
- The ulimit parameter needs to be set before the dumpable programs are started. If some of those are daemons, then you will have to fiddle with your init manager to have the ulimit set (see the sketch after this list).
- In some corner cases you won’t have coredumps, because your program is using capabilities for instance.
- Depending on the process, dumping all of its memory into a file can have some security implications.
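Regarding the second point, if your init manager is systemd, a drop-in on the daemon unit is usually enough; a minimal sketch, with a hypothetical unit name:
# /etc/systemd/system/your-daemon.service.d/coredump.conf
[Service]
LimitCORE=infinity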
Note that the core man page does a great job of describing the different reasons why the kernel might not create a coredump for your process.
If your system is based on systemd, it is also important to note that the systemd-coredump service can help you cope with some of the concerns raised above.
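With that service enabled, the coredumps are collected automatically and can be listed and opened in GDB directly with coredumpctl, for example with the program from the earlier example:
coredumpctl list
coredumpctl gdb your-program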
Offline symbols
Let’s get back to the third option presented above. When your root file system has to be as small as possible because the updates are sent over the air, having minidebuginfo or, worse, full symbols on target is clearly not an option. In that case, the only thing that can be considered is to store, along with every Yocto image, the associated symbol archive.
The Yocto documentation gives detailed instructions on how to set up such an infrastructure: https://docs.yoctoproject.org/dev-manual/debugging.html#using-the-gdbserver-method
The idea is to instruct Yocto to build an archive with the debug symbols that have been stripped out from the regular image with the following configuration:
IMAGE_GEN_DEBUGFS = "1"
IMAGE_FSTYPES_DEBUGFS = "tar.bz2"
Note that there’s a 1-to-1 match between your image and this debug archive. If you are trying to get symbols for a coredump with an image that does not exactly match the symbols archive generated by the very same bitbake command, it won’t work at all. GDB will print the backtrace without any symbols, with no specific error message, and that will be it.
It means that you need to store your release images and your debug archives close together on your release image server.
Then, you will need to prepare a directory with the uncompressed root file system archive and the debug symbols archive:
mkdir debugfs
cd debugfs
tar xvfj build-dir/tmp/deploy/images/machine/image.rootfs.tar.bz2
tar xvfj build-dir/tmp/deploy/images/machine/image-dbg.rootfs.tar.bz2
You can then fire up GDB this way:
gdb-multiarch debugfs/usr/bin/ls
(gdb) set sysroot debugfs
(gdb) set substitute-path /usr/src/debug debugfs/usr/src/debug
(gdb) core-file /tmp/core #the core file that you have downloaded from the target
(gdb) bt
to get the same backtrace as the one presented above.
I often set up a project-specific helper script that takes a core file and a release number as arguments. The script takes care of downloading the release root file system and debug archive, extracts them, and fires up GDB in the exact same way as presented above:
import os
import re
import subprocess


def execute(cmd):
    """Run a shell command and return its standard output as a string."""
    return subprocess.check_output(cmd, shell=True, text=True)


def analyze_core(core, temp_dir):
    """
    Run GDB with the environment set up to analyze the given core file.
    Use temp_dir as the GDB sysroot.
    """
    # "file" prints the path of the executable that produced the core (execfn).
    file_info = execute("file {}".format(core))
    print("core file info", file_info)
    m__ = re.search(r"execfn: '([^']*)'", file_info)
    executable = m__.group(1)
    # The execfn path is absolute; drop the leading '/' to join it with temp_dir.
    binary_path = os.path.join(temp_dir, executable[1:])
    print("The path of the binary that generated the dump is: " + binary_path)
    gdb_cmds = (f"-ex 'set auto-load safe-path' "
                f"-ex 'set sysroot {temp_dir}' "
                f"-ex 'set directories {temp_dir}' "
                f"-ex 'set auto-load safe-path /' "
                f"-ex 'core-file {core}' "
                f"-ex bt")
    subprocess.run("gdb-multiarch " + binary_path + " " + gdb_cmds,
                   shell=True, check=True)
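The surrounding glue is little more than downloading and extracting the two archives before calling analyze_core(); a hypothetical sketch, where the release server URL and the archive names are pure assumptions:
import tempfile

def analyze_release_core(core, release):
    """Download the release and debug archives, then analyze the core file."""
    base_url = "https://releases.example.com"  # hypothetical release server
    with tempfile.TemporaryDirectory() as temp_dir:
        for archive in (f"image-{release}.rootfs.tar.bz2",
                        f"image-{release}-dbg.rootfs.tar.bz2"):
            execute(f"wget -q {base_url}/{archive} -O /tmp/{archive}")
            execute(f"tar xjf /tmp/{archive} -C {temp_dir}")
        analyze_core(core, temp_dir)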
Online debugging
We have a last subject to cover in this first part. What if you cannot survive without the debugging capabilities of your precious VS Code, but you are still operating with a Yocto based system without any symbols?
This time we are not speaking of post-mortem coredump analysis, but of the interactive debugging of a live program on the target. Luckily, you can reuse most of what was covered in the Offline symbols section above. GDB still needs to be configured in the exact same way, except for the core-file part: we now need to instruct GDB to attach to a live program on the target. To do that, you can use the following command instead:
(gdb) target remote 172.16.0.1:3389
Note that, beforehand, you will need to start a GDB server instance on your target with a command that will look like:
gdbserver :3389 /usr/bin/ls
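If the program is already running, a daemon started by the init manager for instance, gdbserver can also attach to it, in the same spirit as the strace example earlier:
gdbserver --attach :3389 $(pgrep wireplumber)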
You can improve on the helper script that was mentioned above and add the capability to connect to a remote GDB server in addition to the coredump debugging support.
Regarding how to set up your VS Code to interact with GDB, I will only say that some have succeeded before. On my side, I much prefer the GDB CLI, which is more than enough to issue the few GDB commands required to understand the source of the crash.
Conclusion
With logging, strace and GDB covered, you have what’s needed to debug 90% of the problems that you can encounter in user space on your embedded Linux system. There are always more complex problems that require more complex tooling, but those will be really dependent on your use cases.
The hidden part of the iceberg is that the user-space applications will probably represent only a fraction of the code base compared to the Linux kernel sources. Those sources are written in C and also have bugs, and those bugs will cause your entire system to crash without notice.
We will see in part II of this series how to prepare a kdump infrastructure to get backtraces in case of kernel crashes.