NAME

egypt - create call graph from gcc RTL dump

SYNOPSIS

egypt <options> <rtl-file>... | dotty -
egypt <options> <rtl-file>... | dot <dot-options>

DESCRIPTION

Egypt is a simple tool for creating call graphs of C programs.

OPTIONS

--callers function,function...

Show only the given function(s) and their callers, recursively including all functions from which the given functions can be reached. Taking the address of a function is considered equivalent to a call for the purpose of determining reachability. For the supported forms of the "function" arguments, see the section titled "Selecting functions".

--callees function,function...

Show only the given function(s) and their callees, recursively including all functions that can be reached from the given functions. Taking the address of a function is considered equivalent to a call for the purpose of determining reachability. For the supported forms of the "function" arguments, see the section titled "Selecting functions".

--cluster-by-file

Create a Graphviz cluster for each input file, containing the functions defined in that file.

--include-external

Include calls to external functions in the call graph. A function is considered external if it is not defined in any of the input files. For example, functions in the standard C library are external. Only direct function calls will be displayed; there is no way to display the action of taking the address of an external function.

--omit function,function...

Omit the given function(s) from the call graph. For the supported forms of the "function" arguments, see the section titled "Selecting functions".

--summarize-callers n

When a function has n or more callers, don't show a separate arrow for each caller, but just a single arrow from a pseudo-node indicating the number of callers. This is useful to reduce clutter resulting from utility functions called in a large number of places.
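
The filtering options above can be pictured as simple graph operations. The following Python sketch is illustrative only and is not egypt's actual Perl code; the names reachable, restrict, and summarize_callers, the sample edge list, and the pseudo-node label are assumptions made for this example. An edge stands for either a call or taking an address, since both count as a call for reachability.

```python
from collections import defaultdict

# Hypothetical call graph as (caller, callee) pairs.
EDGES = [("main", "parse"), ("main", "run"),
         ("parse", "log"), ("run", "log"), ("run", "step")]


def reachable(edges, roots, reverse=False):
    """Return the roots plus everything reachable from them.

    reverse=False follows edges forward (the idea behind --callees);
    reverse=True follows them backward (the idea behind --callers).
    """
    adjacency = defaultdict(set)
    for src, dst in edges:
        if reverse:
            src, dst = dst, src
        adjacency[src].add(dst)
    seen = set(roots)
    stack = list(roots)
    while stack:
        node = stack.pop()
        for nxt in adjacency[node]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def restrict(edges, keep):
    """Keep only the edges whose endpoints are both in the set keep."""
    return [(a, b) for a, b in edges if a in keep and b in keep]


def summarize_callers(edges, n):
    """Replace the incoming edges of any function with n or more callers
    by a single edge from a pseudo-node (label format is an assumption)."""
    callers = defaultdict(set)
    for src, dst in edges:
        callers[dst].add(src)
    result = []
    for src, dst in edges:
        count = len(callers[dst])
        if count >= n:
            pseudo = (f"{count} callers of {dst}", dst)
            if pseudo not in result:
                result.append(pseudo)
        else:
            result.append((src, dst))
    return result


print(restrict(EDGES, reachable(EDGES, ["run"])))
print(restrict(EDGES, reachable(EDGES, ["parse"], reverse=True)))
print(summarize_callers(EDGES, 2))
```

In this sketch, restricting to the callees of run keeps only the edges leaving run, and summarizing with n set to 2 collapses the two arrows into log into one arrow from a pseudo-node.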

HOW IT WORKS

The two major tasks in creating a call graph are analyzing the syntax of the source code to find the function calls and laying out the graph, but Egypt actually does neither. Instead, it delegates the source code analysis to GCC and the graph layout to Graphviz, both of which are better at their respective jobs than egypt could ever hope to be itself. Egypt itself is just a small Perl script that acts as glue between these existing tools.

Egypt takes advantage of GCC's capability to dump an intermediate representation of the program being compiled into a file (an RTL file); this file is much easier to extract information from than a C source file. Egypt extracts information about function calls from the RTL file and massages it into the format used by Graphviz.
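
The extraction step can be sketched in a few lines of Python. This is a rough illustration of the idea, not egypt's actual Perl code. It assumes a simplified shape for the dump: a ";; Function name" header line, "(call ..." expressions naming the callee through a symbol_ref, and other symbol_ref operands tagged with function_decl when a function's address is taken. Real dumps differ in detail between GCC versions, and the function name rtl_to_dot and the sample text are made up for this example.

```python
import re

# Assumed, simplified patterns; real RTL dumps vary between GCC versions.
FUNCTION_RE = re.compile(r'^;; Function (\S+)')
CALL_RE = re.compile(r'\(call\b.*\(symbol_ref\S* \("([^"]+)"\)')
ADDRESS_RE = re.compile(r'\(symbol_ref\S* \("([^"]+)"\).*<function_decl')

# Hypothetical, heavily abbreviated dump text.
SAMPLE = '''\
;; Function helper (helper, funcdef_no=0)
;; Function cleanup (cleanup, funcdef_no=1)
;; Function main (main, funcdef_no=2)
(call (mem:QI (symbol_ref:DI ("helper") [flags 0x3] <function_decl 0x1 helper>)))
(call (mem:QI (symbol_ref:DI ("printf") [flags 0x41] <function_decl 0x2 printf>)))
(set (reg:DI 5) (symbol_ref:DI ("cleanup") [flags 0x3] <function_decl 0x3 cleanup>))
'''


def rtl_to_dot(rtl_text):
    """Turn simplified RTL text into a Graphviz digraph description."""
    current = None
    defined, calls, refs = set(), set(), set()
    for line in rtl_text.splitlines():
        match = FUNCTION_RE.match(line)
        if match:
            current = match.group(1)
            defined.add(current)
            continue
        if current is None:
            continue
        match = CALL_RE.search(line)
        if match:
            calls.add((current, match.group(1)))
            continue
        match = ADDRESS_RE.search(line)
        if match:
            refs.add((current, match.group(1)))
    out = ["digraph callgraph {"]
    # Only functions defined in the input are shown, so external
    # functions such as printf are dropped in this sketch.
    for src, dst in sorted(calls):
        if dst in defined:
            out.append(f'"{src}" -> "{dst}" [style=solid];')
    for src, dst in sorted(refs):
        if dst in defined:
            out.append(f'"{src}" -> "{dst}" [style=dotted];')
    out.append("}")
    return "\n".join(out)


print(rtl_to_dot(SAMPLE))
```

The printed text is a small digraph that dot or dotty can read, with a solid edge for the direct call and a dotted edge for the address-taking.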

GENERATING THE RTL FILE

Compile the program or source file you want to create a call graph for with gcc, adding the option "-fdump-rtl-expand" to CFLAGS. This option causes gcc to dump its intermediate code representation of each file it compiles into a file. In old versions of GCC this option was called "-dr", but GCC 4.4.0 and newer accept only the "-fdump-rtl-expand" form. For best results, compile without optimization.

For example, the following works for many programs:

make clean
make CFLAGS=-fdump-rtl-expand

Depending on the GCC version, the RTL file for a source file foo.c may be called something like foo.c.rtl, foo.c.00.rtl, foo.c.00.expand, or foo.c.229r.expand.
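
Because the file name varies, a build script may want to collect the dumps by suffix rather than by exact name. The Python sketch below is one possible approach; the helper name find_rtl_files and the choice of the two suffixes are assumptions based only on the example names listed above.

```python
from pathlib import Path

# Suffixes taken from the example names above (foo.c.rtl, foo.c.229r.expand, ...).
RTL_SUFFIXES = (".rtl", ".expand")


def find_rtl_files(root):
    """Return the sorted paths of all files under root that look like RTL dumps."""
    return sorted(
        str(path)
        for path in Path(root).rglob("*")
        if path.is_file() and path.name.endswith(RTL_SUFFIXES)
    )


for name in find_rtl_files("."):
    print(name)
```

The resulting list can then be passed to egypt as its file arguments.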

SELECTING FUNCTIONS

In the command line options that take function names as arguments, multiple C function names can be given separated by commas.

C++ function names can be given either in their name-mangled or demangled form; see the section titled "C++ Support" for details about demangling. Because the demangled form can include commas, multiple functions must be given by specifying the command line option multiple times rather than as a single option with a comma-separated argument.
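
When egypt is driven from a script, repeating the option once per function is easy to automate. The following Python sketch builds such a command line and prints it with shell quoting applied. The helper name egypt_command, the function names, and the RTL file name are hypothetical; shlex.join requires Python 3.8 or newer.

```python
import shlex


def egypt_command(option, functions, rtl_files):
    """Build an egypt argument list that repeats option once per function."""
    argv = ["egypt"]
    for name in functions:
        argv += [option, name]
    argv += list(rtl_files)
    return argv


command = egypt_command(
    "--callers",
    ["void C::f(int, int)", "C::g"],  # hypothetical demangled names
    ["foo.c.229r.expand"],
)
# Print the command with shell quoting applied to each argument.
print(shlex.join(command))
```

The argument list can also be passed directly to subprocess.run, in which case no shell quoting is needed because no shell is involved.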

VIEWING THE CALL GRAPH

To view the call graph in an X11 window, run egypt with one or more RTL files as command line arguments and pipe its output to the dotty program from the Graphviz package. For example, if you compiled foo.c with gcc -fdump-rtl-expand to generate foo.c.229r.expand, you can use

egypt foo.c.229r.expand | dotty -

If the program contains many source files, consider using shell wildcards or the find command:

egypt *.expand | dotty -
egypt $(find . -name '*.expand' -print) | dotty -

PRINTING THE CALL GRAPH

To convert the call graph into a file format suitable for printing or embedding into a document, use the dot program from the Graphviz package.

For example, to generate a PDF file callgraph.pdf fitting everything on a US letter size page in landscape mode, try

egypt *.expand | dot -Grotate=90 -Gsize=11,8.5 -Tpdf -o callgraph.pdf

Sometimes, the graph will fit better if function calls go from left to right instead of top to bottom. The dot option -Grankdir=LR will do that:

egypt *.expand | dot -Gsize=8.5,11 -Grankdir=LR -Tpdf -o callgraph.pdf

For nontrivial programs, the graph may end up too small to comfortably read. If that happens, try N-up printing using the PostScript driver:

egypt *.expand | dot -Gpage=8.5,11 -Tps -o callgraph.ps

You can also try playing with other dot options such as -Gratio, or for a different style of graph, try using neato instead of dot. See the Graphviz documentation for more information about the various options available for customizing the style of the graph and the supported output file formats.

READING THE CALL GRAPH

Function calls are displayed as solid arrows. A dotted arrow means that the function the arrow points from takes the address of the function the arrow points to.

INDIRECT FUNCTION CALLS

Egypt does not display indirect function calls. Doing that is impossible in the general case: determining which functions will call each other indirectly at runtime would require solving the halting problem.

The dotted arrows generated by egypt are sometimes misunderstood to represent indirect calls, but they actually represent taking the address of a function, resulting in a function pointer. That function pointer will typically be used to make an indirect function call at some later time, but the call is not necessarily made from the same function where the address was taken, and it is generally not possible to determine where or even whether that call will take place.

The dotted arrows may or may not be useful for understanding the program structure depending on the particular style of programming used. One case where they are often useful is with event-driven programs where a sequence of events is handled by a chain of callback functions, each one registering the address of the next with the event handling framework before returning to the event loop. In such a program, the dotted arrows will indicate which callbacks cause which other callbacks to be invoked; such a graph may be more useful than a graph of the actual indirect calls, which would just show the event loop calling every callback.

C++ SUPPORT

Egypt provides limited support for C++. Producing readable call graphs from C++ is more challenging than the C case because the fully qualified names of C++ functions tend to be long, especially when using templates.

Egypt attempts to automatically demangle C++ function names and display them in the native C++ syntax. This demangling support relies on comments in the RTL output generated by GCC, and the exact format of the demangled names varies between GCC versions. Some include the return type and argument list (e.g., gcc 7.5.0), while others don't (e.g., gcc 10.2), so the same member function can appear in the call graph as either void C::f(int, int) or just C::f depending on the GCC version. If GCC does not include the argument list, overloaded functions will appear in the graph as distinct nodes with identical labels.

When specifying C++ function names on the command line using the demangled form, they must be given exactly as they appear in the call graph using your version of GCC, including any embedded whitespace. Shell quoting may be needed to preserve whitespace or special characters in the function name.

Egypt will not display virtual function calls, because there is no easy way to determine which virtual function is being called based on the RTL. Inline functions may be subsumed into their callers.

When using the --include-external option to display external C++ functions, such as those in the C++ standard library, they will be displayed in their name-mangled form, and when they are given as arguments to command line options, they must be given in their name-mangled form.

WHY IS IT CALLED EGYPT?

The original plan was to call it rtlcg, short for RTL Call Graph, but it turned out to be one of those rare cases where ROT13'ing the name made it easier to remember and pronounce.

SEE ALSO

gcc, dotty, dot, neato

COPYRIGHT

Copyright 1994-2022 Andreas Gustafsson

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

AUTHOR

Andreas Gustafsson