[gPXE] IBM netboot, large network, and KVM virtualization issues

Jarrod Johnson jarrod.b.johnson+gpxe at gmail.com
Fri Jan 29 08:22:39 EST 2010


Since 1.0.0 is in rc phase, thought I'd mention three issues I've been
patching in my usage.

One is that on IBM firmware (not BBS compliant), boot order works better
when gPXE hooks int18 instead of int19 (along with a few other changes).
I have attached the patch I use; it does the following:
-Does not save off BIOS handlers (simply because they are never used in this
case)
-Hooks int18 instead of int19
-Exits via iret rather than calling int18/int19
-Saves and restores registers
Note that I am not in a position to evaluate where besides IBM firmware BBS
is not available.  I ask that if there is concern over other non-BBS systems
having a problem with this, that someone make this controllable via a
config/general.h flag, analogous to how it was done in etherboot.
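
To make the request concrete, here is a rough sketch of what such a
compile-time switch could look like.  The flag name BOOT_HOOK_INT18 and both
helper functions are hypothetical; they are not taken from the attached patch
or from gPXE's config/general.h.  The only hard fact relied on is that each
real-mode interrupt vector occupies four bytes at address vector * 4.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag (not an existing gPXE option): set to 1 on non-BBS
 * firmware such as IBM's to hook int18 instead of int19. */
#ifndef BOOT_HOOK_INT18
#define BOOT_HOOK_INT18 1
#endif

#if BOOT_HOOK_INT18
#define BOOT_HOOK_VECTOR 0x18
#else
#define BOOT_HOOK_VECTOR 0x19
#endif

/* Catch a bad configuration at build time. */
static_assert(BOOT_HOOK_VECTOR == 0x18 || BOOT_HOOK_VECTOR == 0x19,
	      "boot hook must be int18 or int19");

/* Real-mode IVT entries are 4 bytes each (offset:segment), starting at
 * address 0000:0000. */
static inline uint16_t ivt_entry_offset(uint8_t vector)
{
	return (uint16_t)(vector * 4u);
}

/* Offset within the IVT of the vector this build would hook. */
static inline uint16_t boot_hook_ivt_offset(void)
{
	return ivt_entry_offset(BOOT_HOOK_VECTOR);
}
```

With the flag left at 1, the hooked entry sits at offset 0x60 (int18); with it
set to 0, the entry would be at 0x64 (int19), which I assume matches the
stock behaviour.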

My second issue is a tad more tricky.  I got gPXE to netboot on a single
ethernet segment of thousands of servers, but I had to patch as the general
noise of such a large network caused gPXE some pain allocating resources
(see etherboot-discuss thread from July 1, 2009).  I noted an afflicted gPXE
system encountered:
-Massive unrelated ARP traffic
-Various systems unrelated to the boot process trying to do ICMP ECHO
REQUEST to the system, evoking echo replies
-Various systems attempting to connect to TCP services on the system while
it was in gPXE, causing it to have to send TCP reset packets
My big problem here is that my attempts to reproduce the problem on a
smaller testbed than thousands of systems have not panned out, and after
this patch, I have no justification in scale-testing without the patch on a
large system anymore, since the patch can get them into production faster.
As you can guess, my patch includes the one from Michael Brown that only
pulls in ARP if it is a reply, and additionally disables ICMP ECHO REPLY and
TCP RESET packets.  I have no idea if problematic conditions would have been
resolved there.  I have attached that patch for discussion as well.
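
For discussion, the policy described above can be sketched as three small
predicates.  This is not the attached patch: the struct, its field names, and
the function names are all made up for illustration, and the "stock"
behaviour (learning from ARP requests as well as replies) is my assumption.
Only the ARP opcode and ICMP type numbers are standard values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ARP_OPER_REQUEST  1	/* standard ARP opcode values */
#define ARP_OPER_REPLY    2
#define ICMP_ECHO_REPLY   0	/* standard ICMP type values */
#define ICMP_ECHO_REQUEST 8

/* Hypothetical policy knobs; none of these are real gPXE options. */
struct quiet_policy {
	bool arp_replies_only;	 /* learn neighbours from ARP replies only */
	bool no_icmp_echo_reply; /* never answer ICMP echo requests */
	bool no_tcp_reset;	 /* never send RST for unexpected segments */
};

/* Should an incoming ARP packet with this opcode update the ARP cache? */
static inline bool should_learn_from_arp(const struct quiet_policy *p,
					 uint16_t oper)
{
	assert(p != NULL);
	if (p->arp_replies_only)
		return oper == ARP_OPER_REPLY;
	return oper == ARP_OPER_REQUEST || oper == ARP_OPER_REPLY;
}

/* Should an incoming ICMP packet of this type evoke an echo reply? */
static inline bool should_send_echo_reply(const struct quiet_policy *p,
					  uint8_t icmp_type)
{
	assert(p != NULL);
	return icmp_type == ICMP_ECHO_REQUEST && !p->no_icmp_echo_reply;
}

/* Should a TCP segment that matches no open connection evoke a RST? */
static inline bool should_send_tcp_reset(const struct quiet_policy *p,
					 bool has_matching_connection)
{
	assert(p != NULL);
	return !has_matching_connection && !p->no_tcp_reset;
}
```

With all three knobs set, none of the unrelated traffic listed above (ARP
requests, pings, stray TCP connection attempts) causes the booting system to
store anything or transmit a response.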

A third issue is a bug with older KVM stacks (including RHEL5.4).  The bug
is KVM's, but I worked around it in gPXE.  I have attached the rather
simplistic, if odd-looking, workaround I've been using, which sticks some
nops in.

I apologize if the patches turn out unclean; they are provided as I use them,
and I don't look closely to see whether they apply cleanly or with fuzz (they
shouldn't hit rejects, though).  If there is an issue, let me know.  I apply other
patches (iBFT on PXE, 'hdboot' command to call int13), but those are
features and I've submitted those ideas (available via etherboot-discuss
somewhere if someone wants them) and potentially better designs than mine
were suggested.  I don't know if any were implemented though.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: gpxe-int18.patch
Type: application/octet-stream
Size: 3478 bytes
Desc: not available
Url : http://etherboot.org/pipermail/gpxe/attachments/20100129/117da9e7/attachment.obj 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: gpxe-ignorepackets.patch
Type: application/octet-stream
Size: 2025 bytes
Desc: not available
Url : http://etherboot.org/pipermail/gpxe/attachments/20100129/117da9e7/attachment-0001.obj 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: gpxe-kvmworkaround.patch
Type: application/octet-stream
Size: 390 bytes
Desc: not available
Url : http://etherboot.org/pipermail/gpxe/attachments/20100129/117da9e7/attachment-0002.obj 

