Stefan Hajnoczi: GDB Remote Debugging

Project Plan

Summary

This project will enable gPXE debugging using the GDB remote target feature. Developers will be able to inspect gPXE from a remote machine and control its execution.

Due to the low-level nature of gPXE and the environment it executes in, there is little support for debugging and crash analysis. This can make understanding and fixing errors frustrating.

The GNU Debugger (GDB) can debug remotely using a simple protocol. The code that implements this protocol on the target is called a GDB stub.

Using remote debugging, developers will be able to get much better access to gPXE while it is running or when it has crashed. It will be easier to diagnose crashes and to understand error conditions.

Outline

The GDB Remote Protocol

The GNU Debugger (GDB) has support for remote debugging. This lets users debug applications running on other machines or in special environments, like under virtualization. Remote debugging works by speaking a simple protocol that carries checksummed packets of commands and replies.
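As a rough illustration of the packet format, a packet in the GDB remote protocol is framed as `$<payload>#<checksum>`, where the checksum is the modulo-256 sum of the payload bytes written as two hex digits. The sketch below shows that framing. The function names `gdb_checksum` and `gdb_frame_packet` are hypothetical and are not taken from gPXE, and escaping of special characters in the payload is left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Modulo-256 sum of the payload bytes, as used by the GDB remote protocol. */
static uint8_t gdb_checksum(const char *data, size_t len)
{
    uint8_t sum = 0;
    for (size_t i = 0; i < len; i++)
        sum = (uint8_t)(sum + (uint8_t)data[i]);
    return sum;
}

/*
 * Frame a payload as "$<payload>#<two hex digits>".
 * Returns the framed length (excluding the NUL), or 0 if out is too small.
 * Hypothetical helper: it does not escape special characters in the payload.
 */
static size_t gdb_frame_packet(const char *payload, size_t len,
                               char *out, size_t out_size)
{
    static const char hex[] = "0123456789abcdef";
    uint8_t sum;
    size_t n = 0;

    if (out_size < len + 5) /* '$' + payload + '#' + 2 digits + NUL */
        return 0;

    sum = gdb_checksum(payload, len);
    out[n++] = '$';
    for (size_t i = 0; i < len; i++)
        out[n++] = payload[i];
    out[n++] = '#';
    out[n++] = hex[sum >> 4];
    out[n++] = hex[sum & 0xf];
    out[n] = '\0';
    return n;
}
```

For example, framing the payload `OK` gives `$OK#9a`. The receiver acknowledges a packet with `+` if the checksum matches and `-` if it does not, which is not shown here.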

The protocol is extensible and most commands are optional. A minimal GDB stub must implement the following commands:

  • Register read/write. These commands allow the debugger to get and set the state of the CPU.
  • Memory read/write. These commands allow the debugger to peek and poke memory, as well as set breakpoints.
  • Single-step and continue. These commands control the execution of a program.
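The commands listed above correspond to single-letter packets in the GDB remote protocol: `g`/`G` for registers, `m`/`M` for memory, and `c`/`s` for continue and single-step. A minimal sketch of how a stub might classify an incoming payload follows. The enum and the function name `gdb_classify` are hypothetical. A command the stub does not support is answered with an empty reply, which is how GDB learns that an optional command is unavailable.

```c
/* Hypothetical command classes for a minimal stub. */
enum gdb_cmd {
    GDB_CMD_READ_REGS,   /* 'g' */
    GDB_CMD_WRITE_REGS,  /* 'G<hex register data>' */
    GDB_CMD_READ_MEM,    /* 'm<addr>,<length>' */
    GDB_CMD_WRITE_MEM,   /* 'M<addr>,<length>:<hex data>' */
    GDB_CMD_CONTINUE,    /* 'c' */
    GDB_CMD_STEP,        /* 's' */
    GDB_CMD_UNSUPPORTED  /* anything else: answered with an empty reply */
};

/* Classify a decoded packet payload by its first character. */
static enum gdb_cmd gdb_classify(const char *payload)
{
    switch (payload[0]) {
    case 'g': return GDB_CMD_READ_REGS;
    case 'G': return GDB_CMD_WRITE_REGS;
    case 'm': return GDB_CMD_READ_MEM;
    case 'M': return GDB_CMD_WRITE_MEM;
    case 'c': return GDB_CMD_CONTINUE;
    case 's': return GDB_CMD_STEP;
    default:  return GDB_CMD_UNSUPPORTED;
    }
}
```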

Additional GDB protocol commands provide more advanced features, like hardware breakpoints, or optimizations of existing commands, like binary memory dumps for faster transfer.

I will implement the minimal set of commands. If we find that additional commands are useful in practice, I will also implement them.

Execution Model

The GDB stub is interrupt-driven. Control is transferred to the GDB stub when an exception occurs. When not in the interrupt context, the GDB stub is inactive.

Upon entering an interrupt context, the GDB stub notes the state of the registers and the exception that was raised. This information is sent to the remote GDB.

GDB displays its prompt and lets the user invoke commands. In the meantime, the GDB stub holds control of gPXE and blocks on input until the remote GDB sends a command.

Commands are processed in a loop by the GDB stub until a continue or step command is received. These commands pass control back to gPXE by leaving the interrupt context.
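The loop described above might look roughly like the following sketch. All names here (`gdb_command_loop`, `gdb_next_cmd_fn`, `struct gdb_script`) are hypothetical, and the in-memory script stands in for a real transport so that the loop can be exercised on its own. Command handling is reduced to a counter.

```c
#include <stddef.h>

/* Hypothetical source of decoded command payloads; returns NULL when none is left. */
typedef const char *(*gdb_next_cmd_fn)(void *ctx);

enum gdb_resume { GDB_RESUME_CONTINUE, GDB_RESUME_STEP, GDB_RESUME_NONE };

/*
 * Process commands until 'c' or 's' arrives, then report how to resume.
 * *handled counts the commands that were serviced inside the loop.
 */
static enum gdb_resume gdb_command_loop(gdb_next_cmd_fn next, void *ctx,
                                        unsigned *handled)
{
    const char *cmd;

    *handled = 0;
    while ((cmd = next(ctx)) != NULL) {
        switch (cmd[0]) {
        case 'c':
            return GDB_RESUME_CONTINUE;
        case 's':
            return GDB_RESUME_STEP;
        default:
            /* A real stub would service the command and send a reply here. */
            (*handled)++;
            break;
        }
    }
    return GDB_RESUME_NONE; /* input exhausted without a resume command */
}

/* In-memory command script used to exercise the loop without a transport. */
struct gdb_script {
    const char **cmds;
    size_t count;
    size_t pos;
};

static const char *gdb_script_next(void *ctx)
{
    struct gdb_script *s = ctx;
    return (s->pos < s->count) ? s->cmds[s->pos++] : NULL;
}
```

Given the script `g`, `m1000,4`, `s`, `g`, the loop services two commands and returns `GDB_RESUME_STEP`, leaving the final `g` unread. On x86, a step would typically be arranged by setting the trap flag in the saved EFLAGS before returning from the interrupt; that detail is not shown.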

Note that this execution model makes the GDB stub a blocking, top-level thread of control in gPXE. After discussion with mcb30, we decided that although this is the anticipated execution model, the GDB stub should not be written to assume blocking. This makes it possible to support other modes of debugging later, like periodic memory dumps while the program runs.

Only 32-bit protected mode support is planned. The bulk of gPXE runs in 32-bit protected mode and gdb/binutils support this mode well.

Isolation

The GDB stub controls the execution of gPXE, but the GDB stub is part of gPXE. This sounds recursive - and it is! Therefore, care must be taken to isolate the GDB stub from gPXE. If the GDB stub is not isolated, it can hang itself. For example, if the GDB stub calls strcmp(), then placing a breakpoint inside strcmp() may lead to recursion in the GDB stub.

The GDB stub must be designed to depend on as few gPXE functions as possible. That way, as much of gPXE as possible remains debuggable. This conflicts with code reuse, so we will have to keep an eye on this during development.
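One way to limit such dependencies is to give the stub its own small helpers instead of calling shared library routines. The sketch below shows stub-private hex conversion of the kind needed for memory and register replies. The names `gdbstub_mem_to_hex` and `gdbstub_hex_value` are hypothetical, and this is only an illustration of the idea, not a decision about which helpers the stub will actually duplicate.

```c
#include <stddef.h>

/*
 * Stub-private helper: write len bytes of mem as lowercase hex into out.
 * out must have room for 2 * len + 1 bytes. Returns the number of hex
 * characters written. No shared string or memory functions are called,
 * so breakpoints placed in those functions cannot re-enter the stub.
 */
static size_t gdbstub_mem_to_hex(const volatile unsigned char *mem,
                                 size_t len, char *out)
{
    static const char digits[] = "0123456789abcdef";
    size_t n = 0;

    for (size_t i = 0; i < len; i++) {
        unsigned char byte = mem[i];
        out[n++] = digits[byte >> 4];
        out[n++] = digits[byte & 0xf];
    }
    out[n] = '\0';
    return n;
}

/* Stub-private helper: value of one hex digit, or -1 if c is not a hex digit. */
static int gdbstub_hex_value(char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    if (c >= 'a' && c <= 'f')
        return c - 'a' + 10;
    if (c >= 'A' && c <= 'F')
        return c - 'A' + 10;
    return -1;
}
```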

Transports

Several transports are supported by GDB including serial, UDP, and TCP. Serial is simple and serves as a good starting point. UDP and TCP are more flexible but also more complex. After discussion with mcb30, it looks like UDP is in scope and should be implemented.

I believe TCP is not a big win if UDP support is already in place. The problem with TCP is that it depends on the TCP/IP stack, which would put a lot of gPXE code off limits for breakpoints because of the isolation issues discussed above.
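To keep the protocol code independent of the transport, the stub could talk to serial or UDP through a small table of function pointers. The following is only a sketch of what such an interface might look like; `struct gdb_transport` and the loopback transport are hypothetical, and the real interface is still to be decided. The loopback buffer exists purely so the interface can be exercised without hardware.

```c
#include <stddef.h>

/* Hypothetical transport interface: the stub only sees send and recv. */
struct gdb_transport {
    const char *name;
    /* Copy up to len received bytes into buf; returns 0 if none are available. */
    size_t (*recv)(void *priv, char *buf, size_t len);
    /* Queue len bytes from buf for transmission. */
    void (*send)(void *priv, const char *buf, size_t len);
    void *priv;
};

/* Loopback transport for testing: whatever is sent can be received back. */
struct loopback {
    char data[64];
    size_t len;
};

static void loopback_send(void *priv, const char *buf, size_t len)
{
    struct loopback *lb = priv;
    /* Bytes that do not fit in the buffer are silently dropped. */
    for (size_t i = 0; i < len && lb->len < sizeof(lb->data); i++)
        lb->data[lb->len++] = buf[i];
}

static size_t loopback_recv(void *priv, char *buf, size_t len)
{
    struct loopback *lb = priv;
    size_t n = (len < lb->len) ? len : lb->len;

    for (size_t i = 0; i < n; i++)
        buf[i] = lb->data[i];
    /* Shift the unread bytes to the front of the buffer. */
    for (size_t i = n; i < lb->len; i++)
        lb->data[i - n] = lb->data[i];
    lb->len -= n;
    return n;
}
```

Because `recv` may return 0 when no data is available, the same interface can serve a blocking caller that polls in a loop as well as a non-blocking one.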

Milestones and Timeline

  • Week 1
    1. Set up IDT and write interrupt handler.
  • Week 2
    1. Decide on interface for GDB transports and refactor serial console code to support serving as a GDB transport.
    2. Implement GDB protocol encoder and decoder.
  • Week 3
    1. Implement memory read/write, including GDB scripts as tests.
    2. Implement register read/write, including GDB scripts as tests.
  • Week 4
    1. Implement continue and single-step, including GDB scripts as tests.
    2. Ensure breakpoints are working, including GDB scripts as tests.
  • Week 5
    1. Ensure source-level debugging works.
  • Week 6
    1. Half-term buffer for any schedule slip.
  • Week 7
    1. Implement UDP transport.
  • Week 8
    1. Documentation (how to debug, how to run tests).
  • Week 9
    1. Usability and testing. Work with other gPXE developers, encourage remote GDB usage, fix issues.
  • (Possibly) support running as a gPXE process for memory peek/poke during execution.
  • (Possibly) refactor PXE UDP to bypass IP stack?
  • (Possibly) implement TCP transport and support TCP listen sockets.
