Exploring the DWARF debug format information
DWARF (debugging with attributed record formats) is a debugging file format used by many compilers and debuggers to support source-level debugging. It defines how debugging information is stored within an object file. The DWARF description of a program is a tree structure in which each node can have children and siblings. The nodes might represent types, variables, or functions.
DWARF uses a series of debugging information entries (DIEs) to define a low-level representation of a source program. Each debugging information entry consists of an identifying tag and a series of attributes. An entry, or a group of entries together, provides a description of a corresponding entity in the source program. The tag specifies the class to which an entry belongs, and the attributes define the specific characteristics of the entry.
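The tag, attribute, and tree relationships described above can be modeled in a few lines of Python. This is only an illustrative sketch: the `DIE` class, its field names, and the sample entries are assumptions made for this example and are not part of any DWARF library.

```python
from dataclasses import dataclass, field


@dataclass
class DIE:
    """One debugging information entry: a tag plus named attributes."""
    tag: str
    attributes: dict = field(default_factory=dict)
    children: list = field(default_factory=list)

    def walk(self, depth=0):
        """Yield (depth, entry) for this entry and all of its descendants."""
        yield depth, self
        for child in self.children:
            yield from child.walk(depth + 1)


# Hypothetical fragment: a compilation unit that owns a class with one member.
unit = DIE("DW_TAG_compile_unit", {"DW_AT_name": "test.C"}, [
    DIE("DW_TAG_class_type", {"DW_AT_name": "base"}, [
        DIE("DW_TAG_member", {"DW_AT_name": "basemember"}),
    ]),
])

for depth, die in unit.walk():
    print("  " * depth + die.tag, die.attributes.get("DW_AT_name"))
```

Entries that share a parent are siblings, so the sibling relationship is implicit in each `children` list.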
The different DWARF sections that make up the DWARF data are:
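|.debug_abbrev|Abbreviations used in the .debug_info section|
|.debug_aranges|Lookup table for mapping addresses to compilation units|
|.debug_frame|Call frame information|
|.debug_info|Core DWARF information section|
|.debug_line|Line number information|
|.debug_loc|Location lists used in the DW_AT_location attributes|
|.debug_pubnames|Lookup table for global objects and functions|
|.debug_pubtypes|Lookup table for global types|
|.debug_ranges|Address ranges used in the DW_AT_ranges attributes|
|.debug_str|String table used in .debug_info|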
The .debug_abbrev section contains the abbreviation tables for all the compilation units that are compiled with DWARF information. The abbreviations table for a single compilation unit consists of a series of abbreviation declarations. Each declaration specifies the tag and attributes for a particular debugging information entry. The appropriate entry in the abbreviations table guides the interpretation of the information contained directly in the .debug_info section. The .debug_info section contains the raw information regarding the symbols. Each compilation unit is associated with a particular abbreviation table, but multiple compilation units can share the same table.
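To make the layout of an abbreviation table concrete, the following sketch decodes one from raw bytes. It relies on the encoding defined by the DWARF standard: each declaration is an unsigned LEB128 code, an unsigned LEB128 tag, a 1-byte children flag, and a list of (attribute, form) pairs that ends with a (0, 0) pair; a code of 0 ends the table. The function names and the hand-assembled sample bytes are assumptions for illustration, not output taken from a real object file.

```python
def read_uleb128(data, pos):
    """Decode one unsigned LEB128 value; return (value, next_position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return result, pos
        shift += 7


def parse_abbrev_table(data, pos=0):
    """Parse one table into {code: (tag, has_children, [(attr, form), ...])}."""
    table = {}
    while True:
        code, pos = read_uleb128(data, pos)
        if code == 0:                      # a zero code ends the table
            return table
        tag, pos = read_uleb128(data, pos)
        has_children = data[pos] == 1      # DW_CHILDREN_yes is encoded as 1
        pos += 1
        specs = []
        while True:
            attr, pos = read_uleb128(data, pos)
            form, pos = read_uleb128(data, pos)
            if attr == 0 and form == 0:    # a (0, 0) pair ends the attribute list
                break
            specs.append((attr, form))
        table[code] = (tag, has_children, specs)


# Hand-assembled sample: abbreviation 1 is DW_TAG_compile_unit (0x11) with
# children, carrying DW_AT_name (0x03) as DW_FORM_string (0x08) and
# DW_AT_language (0x13) as DW_FORM_data1 (0x0b).
sample = bytes([0x01, 0x11, 0x01, 0x03, 0x08, 0x13, 0x0B, 0x00, 0x00, 0x00])
print(parse_abbrev_table(sample))
```

In practice it is usually easier to let dwarfdump or readelf do this decoding; the sketch only shows what those tools are reading.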
There are tools, such as readelf and dwarfdump, and libraries, such as libdwarf, available to read DWARF information. A script or program can read the output of these tools to find and interpret the required information. To write such scripts, it is important to know the tag and attribute definitions.
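As a rough illustration of such a script, the following Python sketch turns a textual dump into a list of entries. The exact layout of dwarfdump output differs between versions, so the line format assumed here (a `<level><offset>` prefix followed by the tag, with indented attribute lines underneath) and the sample text are simplified assumptions rather than a verbatim dump.

```python
import re

# Assumed layout: "<level><offset> DW_TAG_xxx", then indented "DW_AT_xxx value" lines.
DIE_LINE = re.compile(r"^<\s*(\d+)><\s*(0x[0-9a-fA-F]+|\d+)>\s+(DW_TAG_\w+)")
ATTR_LINE = re.compile(r"^\s+(DW_AT_\w+)\s+(.*\S)\s*$")


def parse_dump(text):
    """Return a list of {"level", "offset", "tag", "attrs"} dictionaries."""
    entries = []
    for line in text.splitlines():
        m = DIE_LINE.match(line)
        if m:
            level, offset, tag = m.groups()
            offset = int(offset, 16) if offset.startswith("0x") else int(offset)
            entries.append({"level": int(level), "offset": offset,
                            "tag": tag, "attrs": {}})
            continue
        m = ATTR_LINE.match(line)
        if m and entries:
            # Attach the attribute to the most recently seen entry.
            entries[-1]["attrs"][m.group(1)] = m.group(2).strip('"')
    return entries


# Hypothetical, simplified sample text (not copied from a real dump).
SAMPLE = """\
<1><82> DW_TAG_base_type
        DW_AT_name       int
<1><94> DW_TAG_class_type
        DW_AT_name       base
        DW_AT_byte_size  4
<2><106> DW_TAG_member
        DW_AT_name       basemember
        DW_AT_type       <82>
"""

for entry in parse_dump(SAMPLE):
    print(entry["level"], entry["offset"], entry["tag"], entry["attrs"])
```

The regular expressions would need adjusting to match the output of the specific tool and version in use.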
Common tags and attributes
The following list shows the tags and attributes that are of most interest when debugging a C++ application.
|DW_TAG_class_type|Represents the class name and type information|
|DW_TAG_structure_type|Represents the structure name and type information|
|DW_TAG_union_type|Represents the union name and type information|
|DW_TAG_enumeration_type|Represents the enum name and type information|
|DW_TAG_typedef|Represents the typedef name and type information|
|DW_TAG_array_type|Represents the array name and type information|
|DW_TAG_subrange_type|Represents the array size information|
|DW_TAG_inheritance|Represents the inherited class name and type information|
|DW_TAG_member|Represents the members of class|
|DW_TAG_subprogram|Represents the function name information|
|DW_TAG_formal_parameter|Represents the function arguments' information|
|DW_AT_name|Represents the name string|
|DW_AT_type|Represents the type information|
|DW_AT_artificial|Is set when it is created by compiler|
|DW_AT_sibling|Represents the sibling location information|
|DW_AT_location|Represents the location information|
|DW_AT_virtuality|Is set when it is virtual|
The following command compiles a program with debugging information in the DWARF format using the XL C/C++ compiler.
/usr/vacpp/bin/xlC -g -qdbgfmt=dwarf -o test test.C
Figure 1. Sample test program
The dwarfdump output of the above example can be interpreted in the following way.
The .debug_abbrev section for DW_TAG_compile_unit looks as shown in Figure 2.
Figure 2. .debug_abbrev section
DW_TAG_* is generally followed by DW_CHILDREN_* and a series of attributes (DW_AT_*) along with their forms (DW_FORM_*). DW_CHILDREN_* is a 1-byte value that determines whether a debugging information entry using this abbreviation has child entries. If the value is DW_CHILDREN_yes, the next physically succeeding entry of any debugging information entry using this abbreviation is the first child of that entry. If the 1-byte value following the abbreviation's tag encoding is DW_CHILDREN_no, the next physically succeeding entry of any debugging information entry using this abbreviation is a sibling of that entry. Each chain of sibling entries is terminated by a null entry.
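The child and sibling rules above are enough to rebuild the tree from the flat order in which entries are stored. The sketch below assumes the entries have already been decoded into `(name, has_children)` tuples, with `None` standing for a null entry; that representation and the sample names are assumptions made for this example.

```python
def build_tree(flat):
    """Rebuild the entry tree from a flat list.

    flat holds (name, has_children) tuples in storage order, with None
    marking the null entry that terminates a chain of siblings.
    """
    roots = []
    stack = []          # sibling lists of the enclosing levels
    current = roots     # sibling list currently being filled
    for item in flat:
        if item is None:
            # Null entry: this sibling chain is finished, go back up one level.
            if stack:
                current = stack.pop()
            continue
        name, has_children = item
        node = {"name": name, "children": []}
        current.append(node)
        if has_children:
            # DW_CHILDREN_yes: the next entry is the first child of this node.
            stack.append(current)
            current = node["children"]
    return roots


# Hypothetical flat sequence for a small compilation unit.
flat = [
    ("compile_unit", True),
    ("base_type int", False),
    ("class base", True),
    ("member basemember", False),
    None,                       # ends the children of "class base"
    ("subprogram main", False),
    None,                       # ends the children of "compile_unit"
]

tree = build_tree(flat)
print([child["name"] for child in tree[0]["children"]])
```

An entry whose abbreviation says DW_CHILDREN_yes but that has no actual children is followed directly by a null entry, which this loop handles without special treatment.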
Figure 3. DW_TAG_compile_unit in the .debug_info section
The DW_FORM_* value specifies the way to read the corresponding DW_AT_* value in the .debug_info section. In this case, DW_AT_name is of the form string. So the first attribute of DW_TAG_compile_unit has to be handled as a string in the .debug_info section, which is
- The file type is C_plus_plus and it is present at
- The file is compiled using IBM XL C/C++ v12.
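The way the form drives the reading of each attribute value can be sketched as follows. Only two forms are handled: DW_FORM_string (an inline, null-terminated string) and DW_FORM_data1 (a 1-byte constant). The abbreviation table, the file name `test.C`, and the sample bytes are assumptions assembled by hand for illustration. A real .debug_info section also starts each compilation unit with a header, which this sketch skips.

```python
DW_FORM_STRING = 0x08   # inline, null-terminated string
DW_FORM_DATA1 = 0x0B    # one-byte constant


def read_uleb128(data, pos):
    """Decode one unsigned LEB128 value; return (value, next_position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return result, pos
        shift += 7


def decode_die(info, pos, abbrevs):
    """Decode the entry starting at info[pos]; return (tag, values, next_pos)."""
    code, pos = read_uleb128(info, pos)
    tag, _has_children, specs = abbrevs[code]
    values = {}
    for attr, form in specs:
        if form == DW_FORM_STRING:
            end = info.index(0, pos)            # find the terminating null byte
            values[attr] = info[pos:end].decode("ascii")
            pos = end + 1
        elif form == DW_FORM_DATA1:
            values[attr] = info[pos]
            pos += 1
        else:
            raise NotImplementedError(f"form 0x{form:02x} is not handled in this sketch")
    return tag, values, pos


# Abbreviation 1: DW_TAG_compile_unit (0x11) with DW_AT_name (0x03) as a
# string and DW_AT_language (0x13) as a 1-byte constant.
abbrevs = {1: (0x11, True, [(0x03, DW_FORM_STRING), (0x13, DW_FORM_DATA1)])}

# Hand-assembled entry: abbreviation code 1, the name "test.C", and the
# language code 0x04 (DW_LANG_C_plus_plus).
info = bytes([0x01]) + b"test.C\x00" + bytes([0x04])

tag, values, end = decode_die(info, 0, abbrevs)
print(hex(tag), values, end)
```

Supporting further forms means adding one branch per form, each consuming exactly the number of bytes that the form defines.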
Extract class information
DW_TAG_class_type - Represents the class name and type information
DW_TAG_member - Represents the members of class
Figure 4. Class name and member information
Figure 4 shows that:
- There is a data type named int, and its size is
- There is a class named base, its size is 4 bytes, and its sibling entry is at location
- There is a class member named basemember. The type of this member is at location <82>, which is int. The scope is public, and the member is at offset 0 from the start of the class.
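Resolving a member's type means following its type reference to the entry stored at that location. The sketch below shows this lookup on a small dictionary keyed by location. The dictionary layout and the location 94 chosen for the class are assumptions for illustration; only the class name, its 4-byte size, the member name, and the type at <82> come from the description above.

```python
# Entries keyed by their location; the structure is a simplification.
dies = {
    82: {"tag": "DW_TAG_base_type", "name": "int"},
    94: {  # location assumed for this example
        "tag": "DW_TAG_class_type",
        "name": "base",
        "byte_size": 4,
        "members": [
            {"name": "basemember", "type": 82,
             "accessibility": "public", "offset": 0},
        ],
    },
}


def describe_class(dies, offset):
    """Return one line for the class and one line per member."""
    cls = dies[offset]
    lines = [f'class {cls["name"]} ({cls["byte_size"]} bytes)']
    for member in cls["members"]:
        # Follow the DW_AT_type reference to find the type's name.
        type_name = dies[member["type"]]["name"]
        lines.append(f'  {member["accessibility"]} {type_name} '
                     f'{member["name"]} at offset {member["offset"]}')
    return lines


print("\n".join(describe_class(dies, 94)))
```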
Extract array size
The immediate child of DW_TAG_array_type is DW_TAG_subrange_type, which has the array size. The array size is calculated as (DW_AT_upper_bound - DW_AT_lower_bound) + 1. If it is a two-dimensional array, there is another immediate sibling of type DW_TAG_subrange_type. In this case, the array size is 8 (7 + 1).
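The size calculation is simple enough to express directly. In this sketch each DW_TAG_subrange_type entry is represented as a dictionary, and a missing lower bound is treated as 0, which is the default for C and C++. The dictionary keys and the two-dimensional sample are assumptions for illustration.

```python
from math import prod


def array_dimensions(subranges):
    """Return the element count of each dimension, outermost first."""
    return [s["upper_bound"] - s.get("lower_bound", 0) + 1 for s in subranges]


# One subrange with an upper bound of 7, as in the example above.
one_dim = array_dimensions([{"upper_bound": 7}])
print(one_dim, prod(one_dim))   # [8] 8

# Hypothetical two-dimensional array with two sibling subranges.
two_dim = array_dimensions([{"upper_bound": 2}, {"upper_bound": 7}])
print(two_dim, prod(two_dim))   # [3, 8] 24
```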
Figure 5. Array size
Extract function names and arguments
DW_TAG_subprogram - Represents function name information
DW_TAG_formal_parameter - Represents function arguments' information
Figure 6. Function name
Figure 7. Function arguments
Figure 7 shows that:
- There is a function named display, its scope is public, and its sibling is located at
- The mangled name is
- The first argument to the function is this. It is created by the compiler, so DW_AT_artificial is set to yes, and the type is at location <421>, which is
- The second argument is named x, and the type is at location <82>, which is int.
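A script that reconstructs the source-level signature usually wants to skip compiler-generated arguments such as `this`. The following sketch filters on the artificial flag and resolves the remaining argument types by location. The dictionary layout and the helper name `visible_signature` are assumptions for illustration.

```python
def visible_signature(function, type_names):
    """Build "name(type arg, ...)" from the arguments the programmer wrote."""
    # Arguments flagged as artificial were created by the compiler.
    params = [p for p in function["params"] if not p.get("artificial")]
    args = ", ".join(f'{type_names[p["type"]]} {p["name"]}' for p in params)
    return f'{function["name"]}({args})'


# Type names keyed by location; <82> is int in the example above.
type_names = {82: "int"}

function = {
    "name": "display",
    "params": [
        {"name": "this", "type": 421, "artificial": True},
        {"name": "x", "type": 82},
    ],
}

print(visible_signature(function, type_names))   # display(int x)
```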
DW_TAG_typedef represents the typedef name and type information.
Figure 8. typedef
Figure 8 shows a typedef entry named int_type whose type is at location <82>, which is int.
Extract enum information
DW_TAG_enumeration_type has the enum name, and DW_TAG_enumerator represents its elements' information. DW_AT_const_value specifies the values assigned to the elements.
Figure 9. enum values
Figure 9 shows that:
- myenum is the name of the enum, and its size is
- Jan is the first element, and its value is
- Feb is the second element, and its value is
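Once the enumerator names and their DW_AT_const_value values have been extracted, they map naturally onto an enumeration in the script itself. In the sketch below, the values 0 and 1 are placeholders chosen for illustration and are not taken from Figure 9; the helper name `build_enum` is also an assumption.

```python
from enum import IntEnum


def build_enum(name, enumerators):
    """Create an IntEnum from (enumerator_name, const_value) pairs."""
    return IntEnum(name, dict(enumerators))


# Placeholder values: the real ones come from DW_AT_const_value.
MyEnum = build_enum("myenum", [("Jan", 0), ("Feb", 1)])

for member in MyEnum:
    print(member.name, member.value)
```

Mapping values back to names (for example, `MyEnum(1).name`) is then available when printing a variable of this type.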
Extract inheritance information
DW_TAG_inheritance represents the inherited class name and type information.
Figure 10. inheritance
Figure 10 shows that:
- There is a derived class named myclass, and its size is
- The base class is at location <94>, and is named
The DW_VIRTUALITY_none value specifies that the inheritance is non-virtual.