Differences

This shows you the differences between two versions of the page.

unity-patch [2009/11/25 10:05]
rwcr
unity-patch [2009/11/25 11:20]
rwcr
Line 46: Line 46:
 Currently, each SAN boot protocol has four components (example): the block device protocol (''scsi.c''), the networked backend transport (''iscsi.c''), the firmware table creator (''ibft.c''), and the boot glue (''iscsiboot.c''). The latter two are OS-specific, and the boot glue is the entry point; it creates a block device of the appropriate type, calls the networked backend to "attach" it, calls the firmware table creator to fill in data about it, hooks the device via int13h, attempts to boot it, and undoes all of that if keep-san isn't set and the boot fails. This is all rather undesirable, as it involves a lot of code duplication and makes SAN booting inherently platform-specific because that's where its entry point lies.
  
-In the new system, SAN booting is not a special case; any data source that looks like a hard disk or CD can be booted, thanks to a new ''bootsector'' image format (a semi-thin wrapper around the existing ''call_bootsector()'') and a generalization of gPXE's ElTorito support. One can ''chain'' or ''imgfetch'' a SAN disk in the same way as a URI, and ''sanboot'' would be identical to ''chain'' were it not for the need to keep support for the ''keep-san'' setting. As such, the boot glue is removed entirely in the unity patch. The firmware table creator is extended with a small glue function to make it work as a data source attacher, so SAN protocol code need not know about its existence directly; this allows the SAN code to remain platform-independent. The block device protocol provides a data source interface instead of a ''struct blockdev'' interface (''blockdev'' and ''ramdisk'' are both done away with) and the network backend transport provides a VFS binding (see below) to continue the existing URI-like syntax for lookups.
+In the new system, SAN booting is not a special case; any data source that looks like a hard disk or CD can be booted, thanks to a new ''bootsector'' image format (a semi-thin wrapper around the existing ''call_bootsector()'') and a generalization of gPXE's ElTorito support. One can ''chain'' or ''imgfetch'' a SAN disk in the same way as a URI, and ''sanboot'' would be identical to ''chain'' were it not for the need to keep legacy support for the ''keep-san'' setting. The boot glue is removed entirely in the unity patch. The firmware table creator is extended with a small glue function to make it work as a data source attacher, so SAN protocol code need not know about its existence directly; this allows the SAN code to remain platform-independent. The block device protocol provides a data source interface instead of a ''struct blockdev'' interface (''blockdev'' and ''ramdisk'' are both done away with) and the network backend transport provides a VFS binding (see below) to continue the existing URI-like syntax for lookups.
  
-To be continued...
+Attachment of a data source now occurs in three places: before attempting a SAN boot if ''keep-san'' is set; just before executing a bootsector or ElTorito image (the source is detached again if execution fails); and when the user explicitly requests it using a new ''attach'' command. The traditional use-case for ''keep-san'', a Windows install, is replaced simply by
 +  gPXE> attach -f iscsi:​1.2.3.4::::​iqn.2009-06.com.example.host:​wininst 
 +  gPXE> exit 
 +and can be automated by serving a gPXE script with the "attach" line in it. (The ''-f''/''--fetch'' option tells ''attach'' to create an image for a URI and attach that, instead of attaching an already-fetched image.) ''attach'' also supports an option ''-t extra'' to attach the source as an "extra" disk (numbered after existing hard drives) instead of the default of a "boot" disk (first hard drive, pushing others down). You can even attach a "boot" disk that's blank, an "extra" disk containing WinPE, boot the "extra" disk, and use it to install Windows onto the blank iSCSI target :-)
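
The attach/detach rule for booting can be sketched roughly as below. This is only an illustration, not gPXE code: ''struct source'', ''source_attach()'', ''source_detach()'' and ''try_boot()'' are hypothetical names, and the rule that a source attached earlier (via ''keep-san'' or an explicit ''attach'') stays attached after a failed boot is an assumption read from the description above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a data source; not gPXE's real structure. */
struct source {
	bool attached;   /* currently hooked (e.g. via int13h)? */
	bool bootable;   /* pretend outcome of executing its bootsector */
};

static void source_attach(struct source *src) { src->attached = true; }
static void source_detach(struct source *src) { src->attached = false; }

/* Returns 0 if the (pretend) boot succeeds, -1 if it fails. */
static int try_boot(struct source *src, bool keep_san)
{
	bool pinned;

	/* Place 1: keep-san attaches before the boot attempt. */
	if (keep_san && !src->attached)
		source_attach(src);

	/* Attached by keep-san or by an earlier explicit "attach"
	 * (assumption: such attachments survive a failed boot). */
	pinned = src->attached;

	/* Place 2: attach just before executing the image. */
	if (!src->attached)
		source_attach(src);

	if (src->bootable)
		return 0;

	/* Execution failed: undo only the attachment made for it. */
	if (!pinned)
		source_detach(src);
	return -1;
}
```

With ''keep_san'' false a failed boot leaves nothing attached; with it true the disk stays visible, which is what a Windows installer needs.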
 + 
 +**Size impact:** 
 + 
 +^ object ​   ^ size change | 
 +| autoboot ​ | +33         | 
 +| int13     | +320        | 
 +| keepsan ​  | -128        | 
 +| abft      | +23         | 
 +| ibft      | +29         | 
 +| aoe       | +121        | 
 +| iscsi     | +105        | 
 +| aoeboot ​  | -427        | 
 +| iscsiboot | -453        | 
 +| ata       | +149        | 
 +| scsi      | +350        | 
 +^ Total     | +122        | 
 + 
 +===== Binding abstraction ===== 
 +How does one acquire a data source in the first place? Well, if you're downloading it, you get it using ''​download_uri()'',​ which calls ''​create_downloader()'',​ which calls ''​xfer_open()''​. It would be a mistake to try to fit random-access storage into the ''​xfer_interface''​ framework; that framework does a marvelous job of handling network sockets, but it's very stream-oriented. So URI openers will stay download-only. How do we fit in SAN protocols, and eventually filesystem access? 
 + 
 +The unity patch introduces the concept of a //binding//, an object that lets one look up a URI and get a data source back. Bindings are registered with a name, and when one attempts to fetch a URI with that name as the scheme, it gets looked up in the appropriate binding instead of downloaded. SAN boot protocols are implemented as global bindings named ''iscsi'', ''aoe'', ''ib_srp'', etc., so when you do
 +  gPXE> imgfetch iscsi:​1.2.3.4::::​iqn.2009-06.com.example.host:​mydisk 
 +gPXE passes ''iscsi_lookup()'' a URI that has ''scheme'' set to ''iscsi'' and ''opaque'' set to ''1.2.3.4::::iqn.2009-06.com.example.host:mydisk''. The fact that a full URI is passed allows something like HTTPDisk or NFS to work intuitively; you can (assuming proper implementation of an ''httpdisk'' SAN boot protocol) run
 +  gPXE> chain httpdisk://​my.server/​myimage.hdd 
 +and it'll work exactly like ''​chain http://​...''​ except the whole image won't be downloaded before it's booted. 
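
The scheme-based dispatch described above might look something like the following. It is a minimal sketch under stated assumptions: ''struct binding'' is cut down to just a name, the ''bindings'' table and ''find_binding()'' are hypothetical, and everything after the first colon is handed over as-is (the real code passes a parsed URI, so a hierarchical URI such as ''httpdisk://...'' would arrive as host and path rather than as one string).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, cut-down binding: only the registered name. */
struct binding {
	const char *name;
};

/* Global bindings named in the text; the table itself is illustrative. */
static const struct binding bindings[] = {
	{ "iscsi" }, { "aoe" }, { "ib_srp" },
};

/*
 * Find the binding whose name equals the URI scheme.  On success, *rest
 * points just past the colon.  Returns NULL when no binding claims the
 * scheme, in which case the caller would fall back to a normal download.
 */
static const struct binding *find_binding(const char *uri, const char **rest)
{
	const char *colon = strchr(uri, ':');
	size_t len, i;

	if (!colon)
		return NULL;
	len = (size_t)(colon - uri);

	for (i = 0; i < sizeof(bindings) / sizeof(bindings[0]); i++) {
		if (strlen(bindings[i].name) == len &&
		    memcmp(bindings[i].name, uri, len) == 0) {
			*rest = colon + 1;
			return &bindings[i];
		}
	}
	return NULL;
}
```

An ''http:'' URI finds no binding here and so would still go through ''download_uri()'', which matches the rule that URI openers stay download-only.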
 + 
 +Looking ahead a bit, this patch implements the concept of a //binding type//, a way of creating bindings that are based on some data source instead of being global. For instance, if ''​ext2''​ is a binding type, you can do 
 +  gPXE> imgfetch -n disk aoe:e0.0 
 +  gPXE> attach disk 
 +  gPXE> bind -t ext2 disk bootfs 
 +  disk on bootfs type ext2 
 +  gPXE> chain bootfs:/​boot/​vmlinuz 
 +The explicit ''​attach'',​ to fill the aBFT for the kernel, will probably become unnecessary. 
 + 
 +A binding is memory-managed much like a network device; the allocation for its structure contains some amount of private data requested by the binding type creating it, and ''​binding->​priv''​ points at that private data. Sources looked up in the binding hold a reference to it, and a reference is taken when it is registered with a name as well. The binding holds a reference to the source it's based on. This system keeps sources and bindings around as long as anyone is using them. 
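
As a rough illustration of that memory-management scheme, here is a sketch in the style of a network device allocator. Only ''binding->priv'' comes from the text; ''alloc_binding()'', ''binding_get()'', ''binding_put()'', the field names and the ''bindings_freed'' counter are hypothetical, and the source's own lifetime is reduced to a bare counter.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical source: just a reference count for this sketch. */
struct source {
	int refcnt;
};

struct binding {
	int refcnt;
	struct source *base;  /* source this binding is based on, or NULL */
	void *priv;           /* private data for the binding type */
};

static int bindings_freed;    /* demo counter, not part of the design */

/* One allocation holds the structure followed by priv_len bytes. */
static struct binding *alloc_binding(struct source *base, size_t priv_len)
{
	struct binding *b = calloc(1, sizeof(*b) + priv_len);

	if (!b)
		return NULL;
	b->refcnt = 1;                    /* creator's reference */
	b->base = base;
	if (base)
		base->refcnt++;           /* binding pins its source */
	b->priv = (char *)b + sizeof(*b); /* private area follows struct */
	return b;
}

static void binding_get(struct binding *b)
{
	b->refcnt++;
}

static void binding_put(struct binding *b)
{
	assert(b->refcnt > 0);
	if (--b->refcnt > 0)
		return;
	if (b->base)
		b->base->refcnt--;        /* release the underlying source */
	free(b);
	bindings_freed++;
}
```

Registering the binding under a name, or looking a source up in it, would each call ''binding_get()''; the binding and its base source go away only after the last ''binding_put()''.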
 + 
 +A global binding is created as a ''​struct global_binding'',​ which serves as the template for an autoregistered ''​struct binding''​ at init-time. Data source attachers can specify a ''​struct global_binding''​ to limit themselves to, so only AoE disks will be recorded in the aBFT, etc. 
 + 
 +There'​s also a special URI syntax for recursive binding in a single command: 
 +  gPXE> chain ext2(part(aoe:​e0.0):​1):/​boot/​bzimage 
 +If both ext2 filesystems and partition tables can be autodetected,​ that reduces to 
 +  gPXE> chain ((aoe:​e0.0):​1):/​boot/​bzimage 
 +This is rather obscure, but it does allow a complicated boot path to be specified in a single DHCP filename option.
 + 
 +**Size impact**: ''​uri.o''​ +56, ''​vfs.o''​ +895, ''​vfs_cmd.o''​ +2037.
