====== Stefan Hajnoczi: GDB Remote Debugging ======

===== Journal =====

==== Week 1 ====

**Milestone:** Set up IDT and write an interrupt handler.

=== Fri May 23 ===

Some notes after chatting with mdc and mcb30:

  * Place IDT code in ''arch/i386/transitions/librm.S'' - in similar places to ''lgdt'' and ''sgdt''.
  * GDB stub should be written with portability in mind, separate out arch-specific parts.
  * GDB stub should be a build option.
  * Source-level debugging and symbols should work (mostly) out-of-the-box due to ELF build.

=== Sat May 24 ===

Git commit: [[http://git.etherboot.org/?p=people/stefanha/gpxe.git;a=commit;h=ac29ad53aff6e89f12bd5a163861d1afb1846049|ac29ad53aff6e89f12bd5a163861d1afb1846049]]

Implemented an interrupt handler in ''arch/i386/transitions/librm.S''. It currently sets ''eax'' to ''0xcafebabe'' and spins in an infinite loop. Have tested that it is working using QEMU.

=== Sun May 25 ===

Git commits: [[http://git.etherboot.org/?p=people/stefanha/gpxe.git;a=commit;h=12144ffbfadef9c6597f9ac754685223bb736368|12144ffbfadef9c6597f9ac754685223bb736368]], [[http://git.etherboot.org/?p=people/stefanha/gpxe.git;a=commit;h=1ff72edaa0c68966e1bc102ae5167d714eeb03e6|1ff72edaa0c68966e1bc102ae5167d714eeb03e6]], [[http://git.etherboot.org/?p=people/stefanha/gpxe.git;a=commit;h=c15542a614961acc1051296fc2367d1539db57ff|c15542a614961acc1051296fc2367d1539db57ff]]

When GDB reads or writes registers on x86, it wants a snapshot like this: ''EAX'', ''ECX'', ''EDX'', ''EBX'', ''ESP'', ''EBP'', ''ESI'', ''EDI'', ''EIP'', ''EFLAGS'', ''CS'', ''SS'', ''DS'', ''ES'', ''FS'', ''GS''. This snapshot is a blob that gets sent between the GDB stub and GDB.

The interrupt handler now takes this register snapshot and passes it to the GDB stub. It also applies the register snapshot to the actual CPU state when the GDB stub returns. So if the GDB stub changes ''EAX'' in the register snapshot, then the ''EAX'' register will be changed when the interrupt handler returns.

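The register order above can be written down as a small C sketch. This is only an illustration: the names ''gdb_reg'' and ''gdb_regs_to_hex'' are made up here and are not taken from the gPXE tree, and it assumes one 32-bit slot per register. The hex encoding (two hex digits per byte, bytes in target byte order) follows the GDB remote protocol's register packet format.

```c
#include <stddef.h>
#include <stdint.h>

/* Register order GDB expects on i386, as listed in the journal entry. */
enum gdb_reg {
    GDB_EAX, GDB_ECX, GDB_EDX, GDB_EBX, GDB_ESP, GDB_EBP, GDB_ESI, GDB_EDI,
    GDB_EIP, GDB_EFLAGS, GDB_CS, GDB_SS, GDB_DS, GDB_ES, GDB_FS, GDB_GS,
    GDB_NUM_REGS
};

/*
 * Encode a register snapshot as the hex blob exchanged with GDB.
 * Each register is emitted as 4 bytes, least significant byte first
 * (x86 is little-endian), two hex digits per byte.
 * `out` must have room for GDB_NUM_REGS * 8 + 1 characters.
 * Hypothetical helper, not the stub's actual code.
 */
static void gdb_regs_to_hex(const uint32_t *regs, char *out)
{
    static const char digits[] = "0123456789abcdef";
    size_t i, b;

    for (i = 0; i < GDB_NUM_REGS; i++) {
        for (b = 0; b < 4; b++) {
            uint8_t byte = (uint8_t)(regs[i] >> (8 * b));
            *out++ = digits[byte >> 4];
            *out++ = digits[byte & 0xf];
        }
    }
    *out = '\0';
}
```

With ''EAX'' set to ''0xcafebabe'', the first eight characters of the blob are ''bebafeca'', since the least significant byte goes first.
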
Each interrupt is mapped onto a POSIX signal number (e.g. ''SIGSEGV''). The GDB protocol communicates these numbers when reporting that execution was interrupted.

=== Mon May 26 ===

Git commits: [[http://git.etherboot.org/?p=people/stefanha/gpxe.git;a=commit;h=f6c6b14468fffff0cf55df77ee7bb796113bcb4a|f6c6b14468fffff0cf55df77ee7bb796113bcb4a]], [[http://git.etherboot.org/?p=people/stefanha/gpxe.git;a=commit;h=6f8c3b03af1fa4733958a0ad66496a0acc8ce882|6f8c3b03af1fa4733958a0ad66496a0acc8ce882]]

Asked mcb30 for feedback on the code so far. The latest git commit includes his suggested cleanups and simplifications.

The interrupt handler calls ''gdbstub_handler(regs)'', where ''regs'' is a pointer to the register snapshot. The GDB stub may change the values in the register snapshot. When the interrupt handler exits, it applies the snapshot to the CPU state.

**Changing ''ESP'' is currently not supported**, since it is more difficult to implement and we do not anticipate it ever being changed.

=== Tue May 27 ===

Git commit: [[http://git.etherboot.org/?p=people/stefanha/gpxe.git;a=commit;h=d1e823a19d9c847fb7a965f8fbb9345f68875c3a|d1e823a19d9c847fb7a965f8fbb9345f68875c3a]]

The GDB stub has initial support for:

  * Register read/write
  * Memory read/write
  * Continue and step
  * Breakpoints
  * Source-level debugging

Here is a screenshot:

{{:soc:2008:stefanha:journal:gdbstub.png|Early version of the GDB stub in action.}}

The stub currently uses the serial driver directly. I need to design a clean GDB transport interface.

To try it out:

<code>
$ git clone git://git.etherboot.org/scm/people/stefanha/gpxe.git gdbstub
$ cd gdbstub/src
$ make
$ qemu -serial tcp::4444,server bin/gpxe.usb
[From a different terminal]
$ cd gdbstub/src
$ gdb
(gdb) file bin/gpxe.hd.tmp
(gdb) target remote localhost:4444
</code>

=== Wed May 28 ===

Git commit: [[http://git.etherboot.org/?p=people/stefanha/gpxe.git;a=commit;h=6f5d000a673209278400b9a04e12ee36cab07d28|6f5d000a673209278400b9a04e12ee36cab07d28]]

Slow day today because I need to do university work for Thurs and Fri, my last ever assessment. Will get back to gPXE tomorrow afternoon.

Talked to mcb30 about improvements to the GDB stub. Checked that the register ordering is indeed correct and that GDB uses the reverse order to the ''pushal'' instruction.

I tried running a process that polls for serial activity and breaks into the GDB stub on activity. This eliminates the need for a hardcoded breakpoint during gPXE startup. gPXE will boot normally when GDB is not being used. If GDB is connected, then it will break into the GDB stub. I am not sure if this is the best solution, but I'll use it for a while and see how effective it is.

=== Thur May 29 ===

Git commits: [[http://git.etherboot.org/?p=people/stefanha/gpxe.git;a=commit;h=d828b65e7182372fdf3f9174f76a6d31341d16a9|d828b65e7182372fdf3f9174f76a6d31341d16a9]], [[http://git.etherboot.org/?p=people/stefanha/gpxe.git;a=commit;h=00f6bbbb61348865d57ebc979617f5f962bd037c|00f6bbbb61348865d57ebc979617f5f962bd037c]]

I cleaned up the code today and did manual testing to see if things behave the way they are expected to.

Moved register snapshot and single step support into their own header file, ''arch/i386/include/gdbmach.h''. This keeps the GDB stub portable.

Implemented retransmit support for the case where GDB sends a NACK after receiving a corrupted reply from the GDB stub.

=== Fri May 30 ===

Had the weekly mentor meeting today with mdc. The aim for next week is **to push the GDB stub into mainline**.

Mainline support will make using the GDB stub easy for developers and advanced users. I hope that others will find debugging a useful way to develop faster. Mainline exposure will also help the GDB stub to improve.

The mainline effort requires making the GDB stub a ''config.h'' option and writing usage documentation. I think the GDB stub should not be enabled in default builds since only a small number of users will ever need it. When these tasks are complete I will ask mcb30 to consider merging it into mainline.

I have been reading the [[http://sourceware.org/gdb/current/onlinedocs/gdb_toc.html|GDB Manual]] and playing with GDB command files (aka scripts). I am writing a test suite for the GDB stub using GDB scripts. To run the test suite you start gPXE and launch GDB with the test suite scripts. The scripts step through a series of test procedures, like writing to memory and reading that memory back to check the write was correctly performed.

Having a test suite is important to me because I want to have the freedom of experimenting with the GDB stub code without worrying about breaking things. The test suite will make sure I don't introduce regressions.

=== Sat May 31 ===

Git commits: [[http://git.etherboot.org/?p=people/stefanha/gpxe.git;a=commit;h=42838a3f3147236be9a586dabc3524dc128235ec|42838a3f3147236be9a586dabc3524dc128235ec]], [[http://git.etherboot.org/?p=people/stefanha/gpxe.git;a=commitdiff;h=95b94a699fb0e62613885f0936890e902f624ae0|95b94a699fb0e62613885f0936890e902f624ae0]]

**Added ''GDBSTUB'' ''config.h'' option**. GDB remote debugging support is not built in when ''config.h'' contains ''#undef GDBSTUB''. I have enabled ''GDBSTUB'' by default for now, but will make it default to off when proposing a patch to mainline.

This option is implemented by placing a few ''#ifdefs'' in ''arch/i386/transitions/librm.S''. Hopefully there is a way to eliminate the need for ''#ifdefs''. Please let me know if you have ideas :-).

**Introduced ''config-local.h'' to prevent accidental commits**. Since we often need to tweak ''config.h'' options during development, for example enabling ''GDBSTUB'' ;-), it sometimes happens that we forget about these temporary changes before committing. Then we have to go back and amend the commit to avoid introducing changes to ''config.h''.

A neat solution proposed by mdc was to have ''config-local.h''. This file never gets checked in to git and can override the defaults in ''config.h''. I have implemented this idea by adding support for the ''@TRYSOURCE'' directive to ''util/mkconfig.pl''. The main ''config.h'' now tries to source ''config-local.h'' if it exists.

=== Sun Jun 1 ===

I played with the [[http://sourceforge.net/mailarchive/forum.php?thread_name=20080527223721.GA7464%40motherbox.xtech.com.ar&forum_name=etherboot-developers|ipv4_arp_check hang bug]]. This was a great test for the GDB stub.

The GDB stub worked well except for the usual issues with optimized code. The hang was caused by NULL pointer memory corruption. I wanted to use a watchpoint to find out where the NULL pointer was being written from.

Unfortunately, the current GDB stub does not support hardware breakpoints or watchpoints. I have not looked at implementing this on i386 using the debug registers yet, but think it would be a very useful feature. I had to work around this missing functionality by dumping memory at several points in time to find out when the corruption occurred.

Thinking about this class of bugs also suggests having a default watchpoint on 0x00000000 whenever gPXE is built with ''GDBSTUB''. A memory read or write to 0x00000000 will result in breaking into the debugger. This is like the ''NULL_TRAP'' feature on steroids :-).

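As a rough idea of what the debug-register approach would involve, here is a minimal sketch of computing a ''DR7'' control value for one watchpoint slot. Nothing here is from the gPXE tree: ''dr7_enable'' and the enum names are hypothetical, and the bit layout is the standard i386 one (enable bits in the low byte, a 2-bit R/W field and a 2-bit LEN field per slot starting at bit 16). Actually loading ''DR0'' and ''DR7'' needs privileged ''mov'' instructions, which are left out so the snippet only does the arithmetic.

```c
#include <stdint.h>

/* R/W field encodings in DR7 (hypothetical names, standard values). */
enum dr7_rw {
    DR7_RW_EXEC      = 0x0,  /* break on instruction execution */
    DR7_RW_WRITE     = 0x1,  /* break on data writes */
    DR7_RW_READWRITE = 0x3   /* break on data reads or writes */
};

/* LEN field encodings in DR7: size of the watched region. */
enum dr7_len {
    DR7_LEN_1 = 0x0,
    DR7_LEN_2 = 0x1,
    DR7_LEN_4 = 0x3
};

/*
 * Return `dr7` with breakpoint slot `slot` (0-3) globally enabled for
 * the given access type and length. The watched address itself would
 * go into the matching DR0-DR3 register, which is not shown here.
 */
static uint32_t dr7_enable(uint32_t dr7, unsigned int slot,
                           enum dr7_rw rw, enum dr7_len len)
{
    unsigned int ctrl_shift = 16 + 4 * slot;

    dr7 &= ~((uint32_t)0xf << ctrl_shift);  /* clear old R/W and LEN */
    dr7 &= ~((uint32_t)0x3 << (2 * slot));  /* clear old L/G enables */
    dr7 |= (uint32_t)1 << (2 * slot + 1);   /* set global enable Gn */
    dr7 |= ((uint32_t)rw | ((uint32_t)len << 2)) << ctrl_shift;
    return dr7;
}
```

For the NULL pointer guard idea, the stub would put 0 in ''DR0'' and use something like ''dr7_enable(0, 0, DR7_RW_READWRITE, DR7_LEN_4)''. Note that a 4-byte watch only covers addresses 0 to 3, so accesses at small offsets from a NULL pointer would still go unnoticed.
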
==== Week 2 ====

=== Mon Jun 2 ===

Git commit: [[http://git.etherboot.org/?p=people/stefanha/gpxe.git;a=commit;h=0004992b5bb35b4d5df324d8afffae04e3ee8285|0004992b5bb35b4d5df324d8afffae04e3ee8285]]

**I hit an interesting bug while writing the GDB stub test suite**. The interrupt handler shares the stack with the main thread of control. If the main thread uses stack memory below the stack pointer when an interrupt comes in, then that memory may be corrupted. The interrupt handler uses the stack to store the CPU state and assumes that memory below the stack pointer is unused.

One option is a separate stack for the interrupt handler and GDB stub. This stack doesn't need to be very big (definitely less than 4K, probably 512 bytes is fine).

Another option is to subtract an arbitrary amount from the stack pointer in the interrupt handler to "bump" it past any such memory. This is a hack and would not guarantee anything.

For now I am going to leave this issue alone, because I think there is very little code (if any) which behaves as described. GCC tends to reserve space for local variables at the beginning of a function.

**Committed a test suite for the GDB stub**. It works in two parts: a GDB script that executes tests and some assembly to set things up on gPXE's end. The two parts pass control back and forth using the GDB ''continue'' command and the x86 ''int $3'' breakpoint. The tests include register and memory read/write. I plan to add more as needed by the feature set.

To run the test suite, build gPXE with ''GDBSTUB'' and make sure ''tests/gdbstub_test.S'' gets linked in.

Then run:

<code>
$ make bin/gpxe.hd.tmp
$ make
$ qemu -serial tcp::4444,server bin/gpxe.usb
[From another terminal]
$ gdb -x tests/gdbstub_test.gdb
</code>

=== Tue Jun 3 ===

Git commits: [[http://git.etherboot.org/?p=people/stefanha/gpxe.git;a=commit;h=813fd7500972968b19f17326c5cac308c43b59b6|813fd7500972968b19f17326c5cac308c43b59b6]], [[http://git.etherboot.org/?p=people/stefanha/gpxe.git;a=commit;h=6b0bc333ef9123a2f255db49f1346a3cb2d28285|6b0bc333ef9123a2f255db49f1346a3cb2d28285]]

**''librm.S'' ''#ifdef'' removal**. The ''GDBSTUB'' ''config.h'' option was implemented using ''#ifdef''. Today I got some advice from mcb30 and removed the ''#ifdefs''. The GDB stub interrupt handlers are now linked in using weak symbols. That means certain symbols are overridden when ''GDBSTUB'' is enabled, causing GDB IDT setup code to be called.

**Split serial console from serial driver**. This patch makes the serial driver interface available in ''include/gpxe/serial.h''. It moves the serial console code from ''core/serial.c'' to ''core/serial_console.c'', leaving just the serial driver itself.

There is currently no check against ''GDBSTUB'' and ''CONSOLE_SERIAL'' being enabled at the same time. Just don't do it :-).

=== Wed Jun 4 ===

Git commit: [[http://git.etherboot.org/?p=people/stefanha/gpxe.git;a=commit;h=b739cae42af8a7b37def48c9284381e98a91043e|b739cae42af8a7b37def48c9284381e98a91043e]]

**Removed unused ''arch/i386/core/gdbsym.c''**. This was an attempt at GDB debugging via QEMU and mcb30 suggested it could be removed. Its ''GDBSYM'' ''config.h'' option has potential to confuse users wanting to build the GDB stub, so I removed it.

Next steps:

  * Send rom-o-matic patch to mdc so that ''GDBSTUB'' can be chosen when configuring a ROM.
  * Design a GDB protocol transport interface that serial and UDP can implement. Discuss with mentors.
  * Handle disconnect properly - ignore breakpoints when GDB is not attached.
  * Implement hardware breakpoint and watchpoint support using debug registers.
  * Using the debug registers, implement a NULL pointer bug guard.
  * Memory read/write support for device memory. Check kgdb implementation for rules on device memory read/write.
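
One possible shape for the transport interface item in the list above, purely as a sketch: a small table of function pointers that a serial or UDP backend could fill in. None of these names (''gdb_transport'', ''loopback_transport'') exist in gPXE, and the in-memory loopback backend is only there to exercise the interface without real hardware.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical transport interface; not an existing gPXE API. */
struct gdb_transport {
    const char *name;
    /* Send len bytes of an outgoing packet to GDB. */
    void (*send)(const char *buf, size_t len);
    /* Receive up to len bytes from GDB; returns the number received. */
    size_t (*recv)(char *buf, size_t len);
};

/* In-memory loopback backend: whatever is sent can be received back. */
static char loop_buf[256];
static size_t loop_len;

static void loop_send(const char *buf, size_t len)
{
    /* Drop anything that does not fit in the buffer. */
    if (len > sizeof(loop_buf) - loop_len)
        len = sizeof(loop_buf) - loop_len;
    memcpy(loop_buf + loop_len, buf, len);
    loop_len += len;
}

static size_t loop_recv(char *buf, size_t len)
{
    if (len > loop_len)
        len = loop_len;
    memcpy(buf, loop_buf, len);
    /* Shift the unread bytes to the front of the buffer. */
    memmove(loop_buf, loop_buf + len, loop_len - len);
    loop_len -= len;
    return len;
}

static struct gdb_transport loopback_transport = {
    "loopback", loop_send, loop_recv
};
```

The stub core would then only call ''send'' and ''recv'' through a pointer to the selected transport, which keeps the serial driver details out of the protocol code.
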