GDB GNU Debugger Intro

Introduction

Background

Have you ever experienced a core dump and had no idea what to do with those multi-megabyte files that just seem to be filling up your hard disk? In this article we'll take a look at how the GNU Debugger, gdb, can help you make sense of those obscure core dumps. We'll start with one of the simplest forms of problem analysis available: backtracing.

This article is filed under the Linux section, but it's equally applicable to any other platform on which you have access to the gcc suite. These platforms include Solaris, FreeBSD, OpenBSD and NetBSD.

What you'll need

For this introductory course, you will only need access to gcc and gdb. Ubuntu users can install the build-essential package for the compiler toolchain; gdb itself may have to be installed separately through the gdb package. You will also need a simple text editor for entering the demo program sources. Other than this, you're free to apply this information to your own situation, for debugging your own programs and source code.

First Steps

The first program

Let's jump right in with a simple but effective demonstration program. Enter the following code in a file called main.c:

#include <stdio.h>

int main(int argc, char* argv[]) {
    char buffer[16];

    buffer[80000] = 3;   /* deliberate bug: writes far outside the 16-byte buffer */

    return 0;
}

Once you've saved the file, it's time to compile it. Use the -g gcc switch to enable debug information. Without this information we can't debug the program using symbols later on. Debugging using symbols means that you can browse the program using the function, parameter and variable names and the line numbers from the source code. Without -g, the compiler does not put this information into the binary.

Compiling the program should be done using the following command:

gcc -g -o main main.c

This will create the binary main which can be executed.

Executing the program

When executed, the program attempts to write to memory far outside the allocated buffer. This leads to a so-called segmentation fault, which usually indicates an invalid pointer or an out-of-bounds index producing a memory address the process is not allowed to read or write. Let's see if our binary produces this error at runtime:

frank@tightrope:~/tmp/gdbtest$ ./main
Segmentation fault

That is pretty clear. But as you might notice on a Linux host, the shell only prints Segmentation fault (often shortened to segfault) and not the "core dumped" message. This is because core dumps are disabled by default on some Linux distributions. To enable core dumps for your current shell, use ulimit -c to set a maximum core file size. The value is given in blocks, not megabytes; in bash a block is 1024 bytes. For this example, we'll "limit" core dumps to 1024 blocks, or 1 megabyte, which is plenty for this small program:

frank@tightrope:~/tmp/gdbtest$ ulimit -c 1024
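If you'd rather not depend on the shell setting, a program can also raise its own core file limit. The following is only a minimal sketch for POSIX systems, using getrlimit and setrlimit; the function name enable_core_dumps is my own invention for illustration. It raises the soft limit to the hard limit, so if the hard limit is 0 you still won't get a core file.

```c
#include <stdio.h>
#include <sys/resource.h>

/* Hypothetical helper: raise the soft core file size limit of this
 * process to its hard limit. Returns 0 on success, -1 on failure. */
int enable_core_dumps(void) {
    struct rlimit limit;

    if (getrlimit(RLIMIT_CORE, &limit) != 0) {
        perror("getrlimit");
        return -1;
    }

    /* The soft limit may never exceed the hard limit. */
    limit.rlim_cur = limit.rlim_max;

    if (setrlimit(RLIMIT_CORE, &limit) != 0) {
        perror("setrlimit");
        return -1;
    }
    return 0;
}
```

Calling enable_core_dumps() at the top of main has roughly the same effect for that one process as running ulimit -c in the shell before starting it.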

Now, after setting the core dump size, let's retry executing the program:

frank@tightrope:~/tmp/gdbtest$ ./main
Segmentation fault (core dumped)

frank@tightrope:~/tmp/gdbtest$ ls -als core
72 -rw------- 1 frank frank 147456 2006-01-13 21:19 core

That's it. We've got a core dump on disk. Let's analyze it.

Analyzing Core Dumps

Starting GDB

Starting GDB is fairly easy once it's installed. Run "gdb" and supply the executable name and the core dump file as parameters. In this case, the executable is called "main" and the core dump is simply called "core". I'll start GDB using this command:

gdb ./main ./core

Don't worry when GDB prints quite a few lines of information to the screen after you execute this command. The output in my case looks something like this. Notice that GDB gives you a (gdb) prompt at the end of the output:

frank@tightrope:~/tmp/gdbtest$ gdb ./main ./core
GNU gdb 6.3-debian
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i486-linux-gnu"...
Using host libthread_db library "/lib/tls/i686/cmov/libthread_db.so.1".

Core was generated by `./main'.
Program terminated with signal 11, Segmentation fault.

warning: current_sos: Can't read pathname for load map: Input/output error

Reading symbols from /lib/tls/i686/cmov/libc.so.6...done.
Loaded symbols for /lib/tls/i686/cmov/libc.so.6
Reading symbols from /lib/ld-linux.so.2...done.
Loaded symbols for /lib/ld-linux.so.2
#0 main (argc=1, argv=0xbfca9944) at main.c:6
6 buffer[80000] = 3;
(gdb)

The output might differ slightly from platform to platform and you can ignore most of the lines here. I'll discuss the more important ones now.

Initial GDB information analysis

The first two interesting lines are these:

Core was generated by `./main'.
Program terminated with signal 11, Segmentation fault.

Actually, these two lines tell us what we already knew, but in a debugging situation where you didn't witness the crash this information is very useful. In this case it tells us that the program "main" terminated because of a memory access violation, or segmentation fault. Because we compiled the program with debugging symbols, the last two lines are also very informative:

#0 main (argc=1, argv=0xbfca9944) at main.c:6
6 buffer[80000] = 3;

The first of these two lines indicates where the segmentation fault occurred. The #0 at the start is the frame number in the call stack; frame #0 is always the innermost function, the one that was executing when the program crashed. Since it is the only frame here, main is both the innermost and the outermost function. Next comes the name of the function in which the error occurred, "main" in this case.

After the function name, the parameter values are shown, which is useful for spotting null pointers. In this example, the argv pointer has a plausible, non-null value, so it is not the problem. After the parameters come the source file and the line number on which the segmentation fault occurred: file "main.c", line 6.

The line below the #0 line simply displays the source line that triggered the segmentation fault.

As you can see, simply invoking GDB gives you detailed information on the error location. There's not a lot I can add to this, so stop GDB by typing "quit" and pressing the Enter key. We'll look into a more complex example now.
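Before moving on, here is one way the out-of-bounds write from the first program could be guarded. This is just a sketch; the helper name set_byte and its return convention are assumptions of mine, not something from the demo program.

```c
#include <stddef.h>

/* Hypothetical helper: store value at buffer[index] only if the index
 * lies inside the buffer. Returns 0 on success, -1 if out of range. */
int set_byte(char *buffer, size_t size, size_t index, char value) {
    if (buffer == NULL || index >= size) {
        return -1;
    }
    buffer[index] = value;
    return 0;
}
```

With this in place, a call such as set_byte(buffer, sizeof buffer, 80000, 3) returns -1 instead of touching memory outside the 16-byte buffer.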

Null Pointer Experiment

A new program

To make matters just one step more complex, I give you the following program, which contains two functions instead of just one:

#include <stdio.h>
int printString(char* string);   // declared before use; line numbers are unchanged
int main(int argc, char* argv[]) {

    // notice the erroneous "=", the coder meant "=="
    if(argv = 0) return 0;

    printString(argv);

    return 0;
}

int printString(char* string) {
    sprintf(string, "This is a test.\n");
    return 0;
}

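I've added a small comment to indicate where the problem lies. The coder wanted to check whether "argv" was a NULL pointer. I admit this is a very artificial example, but it shows the problem clearly. Depending on your gcc version, expect warnings, or even errors, about argv being passed with the wrong pointer type; if your compiler refuses, a cast such as printString((char*)argv) keeps the example compiling. Let's compile and run this application. I've called it "main.c" again: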

frank@tightrope:~/tmp/gdbtest$ gcc -g -o main main.c

frank@tightrope:~/tmp/gdbtest$ ./main
Segmentation fault (core dumped)

frank@tightrope:~/tmp/gdbtest$ ls -als core
68 -rw------- 1 frank frank 147456 2006-01-13 21:53 core

I still had the core limit set so it correctly says core dumped. Let's start GDB again.

The backtrace command

Starting GDB using the new core dump gives a bit of information which mainly differs at the bottom (superfluous output stripped):

frank@tightrope:~/tmp/gdbtest$ gdb ./main ./core
GNU gdb 6.3-debian
Copyright 2004 Free Software Foundation, Inc.

...

#0 0x08048396 in printString (string=0x0) at main.c:14
14 sprintf(string, "This is a test.\n");

In the #0 line it's apparent that the string pointer is a null pointer. We can't see where it's coming from, but at least we have a lead. Because the error didn't occur in the "main" function, we can trace back the path the code took to end up at line 14 in "main.c".

To trace the code flow, which is called a backtrace, type "backtrace" or "bt" for short at the GDB prompt like this:

(gdb) backtrace
#0 0x08048396 in printString (string=0x0) at main.c:14
#1 0x0804837c in main (argc=1, argv=0x0) at main.c:8

The backtrace command gives you a reverse chronological list of the function calls that were still active at the time of the crash. The topmost line shows the source line that caused the segmentation fault; each following line (only one in this case) tells you where the function on the line directly above it was called from.

So, in the example, line 8 of "main.c" caused line 14 in "main.c" to be called which caused the segmentation fault. Using the "frame" command you can retrieve more information about each entry in the backtrace. Let's investigate both frame 0 and 1.
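As an aside, the call stack that GDB prints here can also be captured by a running program. The sketch below assumes glibc, which provides backtrace and backtrace_symbols_fd in execinfo.h; other C libraries may not have them. The function name print_backtrace is made up for this example.

```c
#include <execinfo.h>
#include <unistd.h>

#define MAX_FRAMES 32

/* Hypothetical helper: write the current call stack to stderr,
 * innermost frame first. Returns the number of frames captured. */
int print_backtrace(void) {
    void *frames[MAX_FRAMES];
    int count = backtrace(frames, MAX_FRAMES);

    /* Writes one line per frame straight to the file descriptor. */
    backtrace_symbols_fd(frames, count, STDERR_FILENO);
    return count;
}
```

The order matches GDB's: the function that called backtrace comes first and main comes near the end. The output is much less detailed than what GDB shows, and function names may only appear when the program is linked with -rdynamic.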

Walking through the frames

Type "frame 0" for the #0-line and type "frame 1" for the #1-line:

(gdb) frame 0
#0 0x08048396 in printString (string=0x0) at main.c:14
14 sprintf(string, "This is a test.\n");
(gdb) frame 1
#1 0x0804837c in main (argc=1, argv=0x0) at main.c:8
8 printString(argv);

The frame information for frame 1 shows that the printString routine gets called while argv is already 0x0. This gives the crucial information that argv gets set to 0 before line 8 in "main.c", which points straight at the assignment on line 6. Again, this is an artificial example, but it shows the basic tricks of backtracing a crashed program by stepping through the stack frames.
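For completeness, this is roughly what a safer version of the printing routine might look like. It's only a sketch: the name print_string, the extra size parameter and the return values are my own choices, and it uses "==" for the null check and snprintf so the write cannot run past the destination.

```c
#include <stdio.h>

/* Hypothetical rewrite of printString: write the test message into
 * dest, which holds size bytes. Returns 0 on success, -1 if dest is
 * NULL or too small. */
int print_string(char *dest, size_t size) {
    int written;

    if (dest == NULL) {          /* "==" compares; "=" would assign */
        return -1;
    }

    written = snprintf(dest, size, "This is a test.\n");
    if (written < 0 || (size_t)written >= size) {
        return -1;               /* formatting failed or was truncated */
    }
    return 0;
}
```

The caller passes a real buffer, for example a local char array together with sizeof, instead of argv.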

Miscellaneous tips

Feel free to experiment with other programs and different crashes. There was a time I got really depressed when seeing a program issuing a core dump, but nowadays I'm glad since it provides a lot of information for making the program better and more robust.

One other very useful thing I learned is that when you're using pthreads or some other kind of threads, you should use threadsafe functions. For instance, don't use localtime: it may return a pointer to a shared static buffer, so concurrent calls can overwrite each other's results and cause hard-to-trace failures sooner or later. Use localtime_r instead, which is threadsafe because it stores its result in a buffer you supply. The same goes for the MySQL libraries and functions: use the threadsafe _r variants if available.
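To illustrate the localtime_r tip, here is a small sketch for POSIX systems. The function name format_local_time and the timestamp format are assumptions for the example; the point is that every caller brings its own struct tm and output buffer, so nothing is shared between threads.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <time.h>

/* Hypothetical helper: format a timestamp as local time into out,
 * using only caller-owned storage. Returns the number of characters
 * written, or 0 on failure. */
size_t format_local_time(time_t when, char *out, size_t size) {
    struct tm parts;   /* private to this call, unlike localtime's result */

    if (localtime_r(&when, &parts) == NULL) {
        return 0;
    }
    return strftime(out, size, "%Y-%m-%d %H:%M:%S", &parts);
}
```

A thread would call it with something like format_local_time(time(NULL), text, sizeof text), where text is a local char array of at least 20 bytes.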

About this article

This article was added to the site on the 13th of January 2006.
