Debug symbol explained

A debug symbol is a special kind of symbol that attaches additional information to the symbol table of an object file, such as a shared library or an executable. This information allows a symbolic debugger to gain access to information from the source code of the binary, such as the names of identifiers, including variables and routines.

The symbolic information may be compiled into the module's binary file, distributed in a separate file, or simply discarded during compilation or linking.

This information can be helpful while trying to investigate and fix a crashing application or any other fault.

Debugging information

Debug symbols typically include not only the name of a function or global variable, but also the name of the source code file in which the symbol occurs, as well as the line number at which it is defined. Other information includes the type of the symbol (integer, float, function, exception, etc.), the scope (block scope or global scope), the size, and, for classes, the name of the class, and the methods and members in it.
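As a rough illustration of the fields listed above, the following Python sketch models a single debug symbol as a record and looks one up by name. The class name `DebugSymbol`, the helper functions, and the sample entries are all hypothetical; real formats encode this information in a compact binary layout rather than as objects like these.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class DebugSymbol:
    """Hypothetical record holding the kinds of fields a debug symbol carries."""
    name: str                          # identifier as written in the source
    source_file: str                   # file in which the symbol occurs
    line: int                          # line number at which it is defined
    kind: str                          # e.g. "function", "integer", "float"
    scope: str                         # "global" or "block"
    size: int                          # size in bytes
    class_name: Optional[str] = None   # set only for class members


def find_symbol(table: List[DebugSymbol], name: str) -> Optional[DebugSymbol]:
    """Return the first symbol with the given name, or None if absent."""
    for sym in table:
        if sym.name == name:
            return sym
    return None


def describe(sym: DebugSymbol) -> str:
    """Format a symbol the way a debugger might present it to the user."""
    return (f"{sym.name} ({sym.kind}, {sym.scope} scope) "
            f"defined at {sym.source_file}:{sym.line}")


# Made-up sample data for illustration only.
SYMBOLS = [
    DebugSymbol("main", "app.c", 12, "function", "global", 96),
    DebugSymbol("counter", "app.c", 4, "integer", "global", 4),
]

if __name__ == "__main__":
    print(describe(find_symbol(SYMBOLS, "counter")))
```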

Debug information also records the line of code in the source file that defines each symbol (a function or global variable), as well as symbols associated with exception frames.

This information may be stored in the symbol table of an object file, executable file, or shared library, or may be in a separate file.

On some systems, e.g., z/OS, the debug information contains more than just the symbol table; for example, the ADATA file described under "OS/390 et al" contains source code.

Debugging information can take up quite a bit of space, especially the filenames and line numbers. Thus, binaries with debug symbols can become quite large, often several times the stripped file size. To avoid this extra size, most operating system distributions ship binaries that are stripped, i.e. from which all of the debugging symbols have been removed. This is accomplished, for example, with the strip command in Unix. If the debugging information is in separate files, those files are usually not shipped with the distribution.
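The size effect can be sketched with a toy calculation. The section sizes below are invented for illustration, and the sketch assumes that debugging data lives only in sections whose names begin with `.debug_` (as DWARF sections in ELF files do); a real strip operation may remove other data as well, so this is an approximation rather than a model of the strip command.

```python
from typing import Dict, Tuple


def split_sizes(sections: Dict[str, int]) -> Tuple[int, int]:
    """Return (total size, size without debug sections) in bytes.

    Assumption: every section whose name starts with ".debug_" holds
    debugging information and nothing else does.
    """
    debug = sum(size for name, size in sections.items()
                if name.startswith(".debug_"))
    total = sum(sections.values())
    return total, total - debug


# Hypothetical section sizes in bytes, not measured from a real binary.
SECTIONS = {
    ".text": 40_000,
    ".data": 8_000,
    ".debug_info": 90_000,
    ".debug_line": 30_000,
    ".debug_str": 24_000,
}

total, stripped = split_sizes(SECTIONS)
ratio = total / stripped  # how many times larger the unstripped file is

if __name__ == "__main__":
    print(f"with symbols: {total} bytes, stripped: {stripped} bytes, "
          f"ratio: {ratio:.1f}x")
```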

Embedded symbols

Unix-like systems

stabs was an early format for debugging symbols on Unix-like systems. The newer DWARF format, for which formal specifications exist, has largely supplanted it. The specification allows any compatible compiler or assembler to create debug symbols in a standardized format, and any debugger, such as the GNU Debugger (GDB), to read and display these symbols.

IBM

The compilers for the IBM mainframe line descended from the System/360 have a TEST option that causes the compiler to include debugging information[1][2] in the object file. Similarly, the Binder and the linkage editors have a TEST option that causes the debug information to be retained in the load module. Various debug tools, e.g., OS/360 TESTRAN and TSO TEST, can use the embedded symbol definitions.

External debug files

OS/390 et al

The IBM High Level Assembler (HLASM) and other compilers running on, e.g., z/OS, have an ADATA option that produces an associated data (ADATA) file[3] containing more information than that produced by the older TEST option. In particular, the ADATA file includes lines of source code and their metadata.

Microsoft debug symbols

Microsoft compilers generate a program database (PDB) file containing debug symbols. Some companies ship the PDB files on their CD/DVD to enable troubleshooting, while others (such as Microsoft and the Mozilla Corporation) allow debug symbols to be downloaded from the Internet. The WinDbg debugger and the Visual Studio IDE can be configured to automatically download debug symbols for Windows dynamic-link libraries (DLLs) on demand. The PDB debug symbols that Microsoft distributes include only public functions, global variables and their data types. The Mozilla Corporation has similar infrastructure but distributes full debug information.
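One common way to configure on-demand download is a symbol path of the form `srv*<local cache>*<server URL>`, typically supplied through the `_NT_SYMBOL_PATH` environment variable. The Python sketch below only assembles such a string; it assumes that syntax and Microsoft's public symbol server address, and the cache directory `C:\symbols` and the helper name `make_symbol_path` are hypothetical.

```python
def make_symbol_path(cache_dir: str, server_url: str) -> str:
    """Build a symbol path string, assuming the 'srv*cache*server' syntax."""
    return f"srv*{cache_dir}*{server_url}"


# Address of Microsoft's public symbol server (assumed here, check current docs).
MICROSOFT_SYMBOL_SERVER = "https://msdl.microsoft.com/download/symbols"

# Hypothetical local cache directory for downloaded PDB files.
path = make_symbol_path(r"C:\symbols", MICROSOFT_SYMBOL_SERVER)

if __name__ == "__main__":
    # The resulting value would be assigned to _NT_SYMBOL_PATH.
    print(path)
```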

Apple

On Apple platforms, debug symbols can optionally be emitted as separate files during the build process. Apple uses the term "symbolicate" to refer to the replacement of addresses in diagnostic files with human-readable values.
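The idea behind symbolication can be sketched in a few lines of Python: each hexadecimal address in a report is replaced by the nearest preceding symbol plus an offset. This is a generic illustration, not Apple's tooling; the symbol table, addresses, and function names are invented, and the sketch assumes every symbol extends up to the start of the next one.

```python
import bisect
import re
from typing import Optional

# Hypothetical (start address, name) pairs, sorted by address.
SYMBOL_TABLE = [
    (0x1000, "main"),
    (0x1040, "parse_args"),
    (0x10C0, "crash_here"),
]
STARTS = [start for start, _ in SYMBOL_TABLE]


def lookup(address: int) -> Optional[str]:
    """Map an address to 'name + offset', or None if it precedes all symbols.

    Simplification: symbol sizes are not tracked, so any address at or
    above the last start address is attributed to the last symbol.
    """
    index = bisect.bisect_right(STARTS, address) - 1
    if index < 0:
        return None
    start, name = SYMBOL_TABLE[index]
    return f"{name} + {address - start}"


def symbolicate(report: str) -> str:
    """Replace every 0x... address in the report that can be resolved."""
    def replace(match: "re.Match[str]") -> str:
        resolved = lookup(int(match.group(0), 16))
        return resolved if resolved is not None else match.group(0)

    return re.sub(r"0x[0-9a-fA-F]+", replace, report)


if __name__ == "__main__":
    print(symbolicate("frame 0: 0x10c8"))
```

Addresses that fall below the first symbol are left untouched, which mirrors how unresolved frames stay as raw addresses in a partially symbolicated report.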

History

Symbolic debuggers have existed since the mainframe era, almost since the first introduction of suitable computer displays on which to display the symbolic debugging information (and even earlier with symbolic dumps on paper). They were not restricted to high-level compiled languages and were also available for assembly language programs. For the IBM System/360, assemblers produced object code (on request) that included "SYM cards". These were usually ignored by the program loader but were useful to a symbolic debugger, as they were kept in the same program library as the executable code.

Notes and References

  1. IBM System/360 Operating System - TESTRAN - Program Logic Manual - Program Number 3605-PT-516. GY28-6611-0, TNL GN26-8016. April 1, 1971. Appendix D: TESTRAN Editor Input Record Formats, pp. 119–120. https://ia804503.us.archive.org/14/items/bitsavers_ibm360testProgramLogicManual197104_9236226/GY28-6611-0_TESTRAN_Program_Logic_Manual_197104.pdf#page=123. Retrieved July 11, 2024.
  2. MVS/370 - Linkage Editor Logic - Data Facility Product 5665-295 - Release 1.0. LY26-3921-0. First edition, April 1983. Appendix: Input Conventions and Record Formats, pp. 195–206. http://bitsavers.org/pdf/ibm/370/MVS/LY26-3921-0_MVS-370_Linkage_Editor_and_Loader_Logic_1st_ed_198304.pdf#page=204. Retrieved July 11, 2024.
  3. High Level Assembler for z/OS & z/VM & z/VSE - Programmer's Guide - Version 1 Release 6. SC26-4941-07. 2015. Appendix C: Associated data file output, pp. 227–275. https://publibz.boulder.ibm.com/epubs/pdf/asmp1022.pdf#page=251. Retrieved July 11, 2024.