# Notes on understanding the backtraces generated by PVFS2
#
# Dec 19, 2003

PVFS2 is capable of generating backtrace information at runtime on
systems that use glibc.  This is sometimes useful in tracking down bugs
that are difficult to reproduce in a debugger.  The backtrace
information may appear in a log file or on stderr, depending on which
part of the file system generated the error and how your error logs are
configured.

The backtraces can be triggered by one of two events:

- the gossip_lerr() error message function in PVFS2 (generally used for
  unexpected internal error paths)
- a segmentation fault, if --enable-segv-backtrace was specified at
  configure time

Backtraces take on the following general form:

    [bt] <binary>(<function>+<offset>) [<address>]
    ...
    [bt] <binary>(<function>+<offset>) [<address>]

The stack may be up to 8 levels deep, with the most recent frame on top.
The (<function>+<offset>) part is omitted when no symbol name is
available for a frame.

Here is an example of a backtrace generated by manually sending a
segmentation fault signal to a running server:

    PVFS2 server: signal 11, faulty address is 0x42fa, from 0x400c8a46
    [bt] /lib/libpthread.so.0(nanosleep+0x46) [0x400c8a46]
    [bt] /lib/libc.so.6 [0x40148db0]
    [bt] /lib/libpthread.so.0 [0x400c1ec9]
    [bt] ./pvfs2-server(job_testcontext+0x169) [0x8060d73]
    [bt] ./pvfs2-server(main+0x38d) [0x80502de]
    [bt] /lib/libc.so.6(__libc_start_main+0xc7) [0x401357a7]
    [bt] ./pvfs2-server(read+0x65) [0x804fd81]

The memory addresses can be translated into source file names and line
numbers after the fact using the "addr2line" utility.  For example:

    addr2line -e pvfs2-server 0x8060d73
    src/io/job/job.c:3310

This tells us that the server received the segfault signal while
executing line 3310 of job.c, which is a pthread_cond_timedwait() call.

You can do the same for any address in the dump that corresponds to
PVFS2 code.

That's it!

-----

An alternative approach is to use gdb:

1) load the offending executable into gdb (do not run it)
2) set listsize 1
3) list * <address>
4) repeat step 3) for any other addresses you wish to inspect

    src/server> gdb pvfs2-server
    GNU gdb 5.3
    ...
    (gdb) set listsize 1
    (gdb) list * 0x8060d73
    0x8060d73 is in job_testcontext (src/io/job/job.c:3310).
    3310        ret = pthread_cond_timedwait(&completion_cond,
    (gdb)
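
-----

When a dump contains several PVFS2 frames, the addr2line step described
earlier can be scripted.  The following is a minimal sketch in Python,
not part of PVFS2: the function names (parse_backtrace, addresses_for,
addr2line_command) are hypothetical helpers introduced here for
illustration, and the regular expression assumes the "[bt]" line layout
shown in the example backtrace above.  It only builds the addr2line
command line; it does not run it.

```python
import re

# Matches "[bt]" frames with or without a "(function+offset)" part, e.g.
#   [bt] ./pvfs2-server(main+0x38d) [0x80502de]
#   [bt] /lib/libc.so.6 [0x40148db0]
BT_LINE = re.compile(
    r"\[bt\]\s+(?P<binary>\S+?)"        # path of the binary or library
    r"(?:\((?P<symbol>[^)]*)\))?"       # optional function+offset
    r"\s+\[(?P<addr>0x[0-9a-fA-F]+)\]"  # absolute address
)

SAMPLE = """\
PVFS2 server: signal 11, faulty address is 0x42fa, from 0x400c8a46
[bt] /lib/libpthread.so.0(nanosleep+0x46) [0x400c8a46]
[bt] /lib/libc.so.6 [0x40148db0]
[bt] /lib/libpthread.so.0 [0x400c1ec9]
[bt] ./pvfs2-server(job_testcontext+0x169) [0x8060d73]
[bt] ./pvfs2-server(main+0x38d) [0x80502de]
[bt] /lib/libc.so.6(__libc_start_main+0xc7) [0x401357a7]
[bt] ./pvfs2-server(read+0x65) [0x804fd81]
"""


def parse_backtrace(text):
    """Return one (binary, symbol, address) tuple per [bt] line.

    symbol is None when the frame has no "(function+offset)" part.
    """
    frames = []
    for line in text.splitlines():
        match = BT_LINE.search(line)
        if match:
            frames.append(
                (match.group("binary"), match.group("symbol"), match.group("addr"))
            )
    return frames


def addresses_for(frames, binary_name):
    """Keep only the addresses of frames whose binary path ends in binary_name."""
    return [addr for binary, _symbol, addr in frames if binary.endswith(binary_name)]


def addr2line_command(executable, addresses):
    """Build an addr2line invocation; addr2line accepts several addresses at once."""
    return ["addr2line", "-e", executable] + list(addresses)


frames = parse_backtrace(SAMPLE)
command = addr2line_command("pvfs2-server", addresses_for(frames, "pvfs2-server"))
print(" ".join(command))
# prints: addr2line -e pvfs2-server 0x8060d73 0x80502de 0x804fd81
```

The resulting list can be passed to subprocess.run() from a directory
that contains the same pvfs2-server binary that produced the dump.
Frames from shared libraries are filtered out because their addresses
depend on where each library was loaded and cannot be resolved against
the pvfs2-server executable.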