Stefan Hajnoczi: GDB Remote Debugging

Week 5

Milestones:

  • [b44] Tested and clean for mainline review.
  • [bzImage] Fix Lilo stack clobber in prefix.

Mon Jun 23

Git commit: [bzImage] Place our own stack to prevent clobbering

Lilo bzImage problem identified. Lilo cannot boot gpxe.lkrn images. This happens because Lilo places gPXE's stack into the BIOS free memory area. gPXE uses the free memory area when decompressing and loading itself into memory.

This means that the stack is overwritten during decompression and the machine triple-faults. I tested this by hardcoding the stack pointer to the value GRUB assigns; this results in a clean boot.

A solution was suggested by mcb30 and is already implemented by some of the other prefixes: set our own stack below 0000:7c00.

I am continuing work on the b44 driver. RX and TX are working but there is lots of cleanup and error handling left.

Tue Jun 24

Git commit: [b44] Interrupt status error reporting

Added error handling to the b44 driver and refactored code. The driver resets the card if a serious error is reported. I haven't had much success at triggering errors during testing. I tried unplugging the cable, taking the network interface down on the other end, etc.

I successfully booted gtest.gpxe (Tom's Root Boot disk) from Etherboot.org today. It was cool to boot from the internet with the b44 driver. By the way, I did not experience the problem that DrV has reported on Tulip-based hardware.

To speed up development and take it easy on my USB stick, I set up PXE chaining to network boot directly from the freshly compiled image. After doing make bin/b44.kpxe, I simply power on the b44 laptop and it loads my latest gPXE image.

Noticed that “Linux Device Drivers, Third Edition” is freely available online. Some of the book applies to gPXE since our code is often inspired by or based on the Linux kernel.

The b44 cannot access memory above 1 GB. We need a workaround that allocates descriptor rings and IO buffers from memory below 1 GB. We hit this issue today when dmb tested the driver on his machine with 2 GB memory. I have a temporary hack to place the gPXE heap at 4 MB into the address space. Need to talk to mentors about a long term solution.
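As a rough illustration of the constraint, the check below tests whether a buffer lies entirely within the first gigabyte. This is only a sketch: `B44_DMA_LIMIT` and `b44_dma_reachable()` are hypothetical names and not part of the actual driver, and the physical address is passed in as a plain integer rather than derived from a pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption for illustration: the device reaches only the first 1 GB. */
#define B44_DMA_LIMIT ((uint64_t)1 << 30)

/*
 * Hypothetical helper: returns true if the physical range
 * [phys, phys + len) lies entirely below the 1 GB limit.
 * The subtraction form avoids overflow in phys + len.
 */
static bool b44_dma_reachable(uint64_t phys, size_t len)
{
    assert(len > 0);
    return phys < B44_DMA_LIMIT && len <= B44_DMA_LIMIT - phys;
}
```

A buffer in a heap placed at 4 MB passes this check, while a buffer that starts below 1 GB but runs past it does not, which is why the end of the range has to be checked as well as the start.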

Wed Jun 25

Tested and sent bzImage patch for review. The bzImage patch updates gPXE lkrn images to a modern Linux kernel image format. It also includes a fix for zImage loading in gPXE and allows Lilo to load gPXE lkrn images.

Currently investigating a solution to the b44 1 GB memory limitation. Will also continue simplifying the driver. While testing, dmb mentioned that the driver downloaded images slowly, so I may work on performance. I did not port the performance optimizations from the Linux driver, so there is plenty of low-hanging fruit.

Fri Jun 27

Linux DMA mapping as a solution to driver memory constraints. The b44 driver can only access the first gigabyte of memory. When I asked how to work around this limitation, mcb30 suggested looking at Linux's pci_map_single(). I am going to implement something similar to manage bounce buffers for devices that cannot access gPXE's heap.

Sat Jun 28

Git commit:

Working on DMA mapping. I have designed and implemented DMA mapping for gPXE, see commits above. The b44 driver uses DMA mapping to work around the chip addressing limitations. I am currently writing tests.

I am not done with the code but wanted to commit instead of keeping this out of tree. Before submitting the code for review I will break it up into several patches:

  • uhmalloc, general-purpose external memory allocator. Separates the umalloc heap from its memory allocator (which I call uhmalloc). The idea is that DMA mapping reuses uhmalloc to manage its DMA heap.
  • DMA mapping for transparently managing bounce buffers when needed by hardware. The DMA mapping API provides a way to structure DMA transactions. If the hardware has addressing limitations and is unable to access gPXE's heap, bounce buffers are used to communicate via regions of memory that the device has access to.
  • b44 with DMA mapping. Update the b44 code to use the DMA mapping API so it runs on machines with more than 1 GB of RAM. This reverts the hack that places the gPXE heap at 4 MB into the physical address space.
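
The bounce buffer idea in the second item can be sketched in plain C. This is not the gPXE API from the commits: every name here (`dma_map`, `dma_unmap`, `struct dma_mapping`, `dma_pool`, `device_can_reach`) is hypothetical, a static array stands in for device-reachable memory, and a trivial bump allocator stands in for uhmalloc.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum dma_dir { DMA_TX, DMA_RX }; /* TX: CPU to device, RX: device to CPU */

/* Stand-in for memory the device can reach (e.g. below 1 GB). */
static uint8_t dma_pool[4096];
static size_t dma_pool_used;

struct dma_mapping {
    void *buf;      /* caller's buffer */
    void *dev_buf;  /* buffer handed to the device: buf or a bounce buffer */
    size_t len;
    enum dma_dir dir;
};

/* Hypothetical check: here "reachable" just means "inside dma_pool".
 * Real code would compare physical addresses against the device limit. */
static bool device_can_reach(const void *buf, size_t len)
{
    uintptr_t start = (uintptr_t)buf;
    uintptr_t lo = (uintptr_t)dma_pool;

    return start >= lo && start - lo <= sizeof(dma_pool) &&
           len <= sizeof(dma_pool) - (start - lo);
}

/* Map a buffer for DMA; returns 0 on success, -1 if the pool is exhausted. */
static int dma_map(struct dma_mapping *map, void *buf, size_t len,
                   enum dma_dir dir)
{
    map->buf = buf;
    map->len = len;
    map->dir = dir;

    if (device_can_reach(buf, len)) {
        map->dev_buf = buf;             /* no bounce buffer needed */
        return 0;
    }
    if (len > sizeof(dma_pool) - dma_pool_used)
        return -1;
    map->dev_buf = dma_pool + dma_pool_used;
    dma_pool_used += len;               /* bump allocation, never freed */
    if (dir == DMA_TX)
        memcpy(map->dev_buf, buf, len); /* copy out before the device reads */
    return 0;
}

/* Finish a transaction; for RX, copy device data back to the caller. */
static void dma_unmap(struct dma_mapping *map)
{
    if (map->dev_buf != map->buf && map->dir == DMA_RX)
        memcpy(map->buf, map->dev_buf, map->len);
    map->dev_buf = NULL;
}
```

The point of the map/unmap pairing is that a driver always hands `dev_buf` to the hardware and never needs to know whether a copy took place, so the same driver code works whether or not the original buffer was reachable.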

Next week

On to Week 6!

