Differences
This shows you the differences between two versions of the page.
unity-patch [2009/11/25 10:05] rwcr |
unity-patch [2009/11/25 11:20] (current) rwcr |
||
---|---|---|---|
Line 46: | Line 46: | ||
Currently, each SAN boot protocol has four components (example): the block device protocol (''scsi.c''), the networked backend transport (''iscsi.c''), the firmware table creator (''ibft.c''), and the boot glue (''iscsiboot.c''). The latter two are OS-specific, and the boot glue is the entry point; it creates a block device of the appropriate type, calls the networked backend to "attach" it, calls the firmware table creator to fill in data about it, hooks the device via int13h, attempts to boot it, and undoes all of that if keep-san isn't set and the boot fails. This is all rather undesirable, as it involves a lot of code duplication and makes SAN booting inherently platform-specific because that's where its entry point lies. | ||
- | In the new system, SAN booting is not a special case; any data source that looks like a hard disk or CD can be booted, thanks to a new ''bootsector'' image format (a semi-thin wrapper around the existing ''call_bootsector()'') and a generalization of gPXE's ElTorito support. One can ''chain'' or ''imgfetch'' a SAN disk in the same way as a URI, and ''sanboot'' would be identical to ''chain'' were it not for the need to keep support for the ''keep-san'' setting. As such, the boot glue is removed entirely in the unity patch. The firmware table creator is extended with a small glue function to make it work as a data source attacher, so SAN protocol code need not know about its existence directly; this allows the SAN code to remain platform-independent. The block device protocol provides a data source interface instead of a ''struct blockdev'' interface (''blockdev'' and ''ramdisk'' are both done away with) and the network backend transport provides a VFS binding (see below) to continue the existing URI-like syntax for lookups. | + | In the new system, SAN booting is not a special case; any data source that looks like a hard disk or CD can be booted, thanks to a new ''bootsector'' image format (a semi-thin wrapper around the existing ''call_bootsector()'') and a generalization of gPXE's ElTorito support. One can ''chain'' or ''imgfetch'' a SAN disk in the same way as a URI, and ''sanboot'' would be identical to ''chain'' were it not for the need to keep legacy support for the ''keep-san'' setting. The boot glue is removed entirely in the unity patch. The firmware table creator is extended with a small glue function to make it work as a data source attacher, so SAN protocol code need not know about its existence directly; this allows the SAN code to remain platform-independent. 
The block device protocol provides a data source interface instead of a ''struct blockdev'' interface (''blockdev'' and ''ramdisk'' are both done away with) and the network backend transport provides a VFS binding (see below) to continue the existing URI-like syntax for lookups. |
- | To be continued... | + | Attachment of a data source now occurs in three places: before attempting a SAN boot if ''keep-san'' is set; just before executing a bootsector or ElTorito image (and detached if execution fails); and when the user explicitly requests it using a new ''attach'' command. The traditional use-case for ''keep-san'', a Windows install, is replaced simply by |
+ | gPXE> attach -f iscsi:1.2.3.4::::iqn.2009-06.com.example.host:wininst | ||
+ | gPXE> exit | ||
+ | and can be automated by serving a gPXE script with the "attach" line in it. (The ''-f''/''--fetch'' option asks to create an image for a URI and attach that, instead of attaching an already-fetched image.) Also, ''attach'' now supports an option ''-t extra'' to attach the source as an "extra" disk (numbered after existing hard drives) instead of the default of a "boot" disk (first hard drive, pushing others down). You can even attach a "boot" disk that's blank, an "extra" disk containing WinPE, boot the "extra" disk, and use it to install Windows onto the blank iSCSI target :-) | ||
+ | |||
+ | **Size impact:** | ||
+ | |||
+ | ^ object ^ size change | | ||
+ | | autoboot | +33 | | ||
+ | | int13 | +320 | | ||
+ | | keepsan | -128 | | ||
+ | | abft | +23 | | ||
+ | | ibft | +29 | | ||
+ | | aoe | +121 | | ||
+ | | iscsi | +105 | | ||
+ | | aoeboot | -427 | | ||
+ | | iscsiboot | -453 | | ||
+ | | ata | +149 | | ||
+ | | scsi | +350 | | ||
+ | ^ Total | +122 | | ||
+ | |||
+ | ===== Binding abstraction ===== | ||
+ | How does one acquire a data source in the first place? Well, if you're downloading it, you get it using ''download_uri()'', which calls ''create_downloader()'', which calls ''xfer_open()''. It would be a mistake to try to fit random-access storage into the ''xfer_interface'' framework; that framework does a marvelous job of handling network sockets, but it's very stream-oriented. So URI openers will stay download-only. How do we fit in SAN protocols, and eventually filesystem access? | ||
+ | |||
+ | The unity patch introduces the concept of a //binding//, an object that lets one look up a URI and get a data source back. Bindings are registered with a name, and when one attempts to fetch a URI with that name as the scheme, the URI is looked up in the matching binding instead of being downloaded. SAN boot protocols are implemented as global bindings named ''iscsi'', ''aoe'', ''ib_srp'', etc., so when you do | ||
+ | gPXE> imgfetch iscsi:1.2.3.4::::iqn.2009-06.com.example.host:mydisk | ||
+ | it's passing a URI to ''iscsi_lookup()'' that has ''scheme'' set to ''iscsi'' and ''opaque'' set to ''1.2.3.4::::iqn.2009-06.com.example.host:mydisk''. The fact that a full URI is passed allows something like HTTPDisk or NFS to work intuitively; you can (assuming proper implementation of an ''httpdisk'' SAN boot protocol) | ||
+ | gPXE> chain httpdisk://my.server/myimage.hdd | ||
+ | and it'll work exactly like ''chain http://...'' except the whole image won't be downloaded before it's booted. | ||
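The scheme-based dispatch described above can be sketched roughly as follows. This is only an illustration, not the patch's actual code: ''register_binding()'', ''find_binding()'', the fixed-size table and the one-field ''struct binding'' are all hypothetical simplifications, and the real lookup returns a data source rather than just the opaque string.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, heavily simplified binding: just a registered name. */
struct binding {
	const char *name;	/* URI scheme this binding answers to */
};

#define MAX_BINDINGS 8
static struct binding *bindings[MAX_BINDINGS];
static size_t num_bindings;

/* Register a binding under its name; returns 0, or -1 if the table is full */
static int register_binding ( struct binding *binding ) {
	if ( num_bindings >= MAX_BINDINGS )
		return -1;
	bindings[num_bindings++] = binding;
	return 0;
}

/*
 * Find the binding whose name equals the URI scheme (the text before the
 * first ':').  On success, *opaque points at the rest of the URI.  A NULL
 * return means no binding matched, so the caller would fall back to an
 * ordinary download.
 */
static struct binding * find_binding ( const char *uri, const char **opaque ) {
	const char *colon = strchr ( uri, ':' );
	size_t len;
	size_t i;

	if ( ! colon )
		return NULL;
	len = ( size_t ) ( colon - uri );
	for ( i = 0 ; i < num_bindings ; i++ ) {
		if ( ( strlen ( bindings[i]->name ) == len ) &&
		     ( strncmp ( bindings[i]->name, uri, len ) == 0 ) ) {
			*opaque = colon + 1;
			return bindings[i];
		}
	}
	return NULL;
}
```

With a binding named ''iscsi'' registered, the ''imgfetch'' URI above would match it and leave ''1.2.3.4::::iqn.2009-06.com.example.host:mydisk'' as the opaque part, while an ''http:'' URI would match nothing and be downloaded as usual.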
+ | |||
+ | Looking ahead a bit, this patch implements the concept of a //binding type//, a way of creating bindings that are based on some data source instead of being global. For instance, if ''ext2'' is a binding type, you can do | ||
+ | gPXE> imgfetch -n disk aoe:e0.0 | ||
+ | gPXE> attach disk | ||
+ | gPXE> bind -t ext2 disk bootfs | ||
+ | disk on bootfs type ext2 | ||
+ | gPXE> chain bootfs:/boot/vmlinuz | ||
+ | The explicit ''attach'', to fill the aBFT for the kernel, will probably become unnecessary. | ||
+ | |||
+ | A binding is memory-managed much like a network device; the allocation for its structure contains some amount of private data requested by the binding type creating it, and ''binding->priv'' points at that private data. Sources looked up in the binding hold a reference to it, and a reference is taken when it is registered with a name as well. The binding holds a reference to the source it's based on. This system keeps sources and bindings around as long as anyone is using them. | ||
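A minimal sketch of that reference pattern, assuming plain integer reference counts: every name here (''alloc_source()'', ''alloc_binding()'', ''source_put()'', ''binding_put()'', the ''live_objects'' counter) is made up for illustration, and the real structures carry far more than this.

```c
#include <assert.h>
#include <stdlib.h>

static int live_objects;	/* illustration only: counts unfreed objects */

struct binding;
static void binding_put ( struct binding *binding );

/* Hypothetical data source; holds a reference to the binding it came from */
struct source {
	int refcnt;
	struct binding *parent;	/* NULL for a source not looked up in a binding */
};

/* Hypothetical binding; holds a reference to the source it is based on */
struct binding {
	int refcnt;
	struct source *base;
	void *priv;		/* private data allocated right after the struct */
};

static struct source * source_get ( struct source *source ) {
	source->refcnt++;
	return source;
}

static struct binding * binding_get ( struct binding *binding ) {
	binding->refcnt++;
	return binding;
}

/* Create a source; if it was looked up in a binding, pin that binding */
static struct source * alloc_source ( struct binding *parent ) {
	struct source *source = calloc ( 1, sizeof ( *source ) );

	if ( ! source )
		return NULL;
	source->refcnt = 1;
	source->parent = ( parent ? binding_get ( parent ) : NULL );
	live_objects++;
	return source;
}

/* Drop a source reference; the last one also releases the parent binding */
static void source_put ( struct source *source ) {
	struct binding *parent = source->parent;

	assert ( source->refcnt > 0 );
	if ( --source->refcnt == 0 ) {
		free ( source );
		live_objects--;
		if ( parent )
			binding_put ( parent );
	}
}

/* Create a binding on top of a source, with priv_len bytes of private data */
static struct binding * alloc_binding ( struct source *base, size_t priv_len ) {
	struct binding *binding = calloc ( 1, sizeof ( *binding ) + priv_len );

	if ( ! binding )
		return NULL;
	binding->refcnt = 1;
	binding->base = source_get ( base );
	binding->priv = ( binding + 1 );
	live_objects++;
	return binding;
}

/* Drop a binding reference; the last one also releases the base source */
static void binding_put ( struct binding *binding ) {
	assert ( binding->refcnt > 0 );
	if ( --binding->refcnt == 0 ) {
		source_put ( binding->base );
		free ( binding );
		live_objects--;
	}
}
```

In this sketch, a file looked up in a filesystem binding keeps both the binding and the underlying disk alive, even after the caller has dropped its own references to them; releasing the file then tears down the whole chain.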
+ | |||
+ | A global binding is created as a ''struct global_binding'', which serves as the template for an autoregistered ''struct binding'' at init-time. Data source attachers can specify a ''struct global_binding'' to limit themselves to, so that, for example, only AoE disks are recorded in the aBFT. | ||
+ | |||
+ | There's also a special URI syntax for recursive binding in a single command: | ||
+ | gPXE> chain ext2(part(aoe:e0.0):1):/boot/bzimage | ||
+ | If both ext2 filesystems and partition tables can be autodetected, that reduces to | ||
+ | gPXE> chain ((aoe:e0.0):1):/boot/bzimage | ||
+ | This is rather obscure, but it does allow a complicated boot path to be specified in a single DHCP filename option. | ||
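One way to read the nested syntax is as "head, optional parenthesised base URI, colon, path", applied recursively to the base URI. The sketch below splits off one level at a time under that assumed grammar; ''split_uri()'', ''nesting_depth()'' and ''struct nested_uri'' are hypothetical names, and the patch's real parser may well work differently.

```c
#include <stddef.h>
#include <string.h>

/* One level of a (possibly nested) binding URI; pointers into the input */
struct nested_uri {
	const char *type;	/* binding type or scheme; may be empty */
	size_t type_len;
	const char *inner;	/* parenthesised base URI, or NULL if plain */
	size_t inner_len;
	const char *path;	/* text after the separating ':' */
	size_t path_len;
};

/* Split one level off uri[0..len); returns 0 on success, -1 if malformed */
static int split_uri ( const char *uri, size_t len, struct nested_uri *out ) {
	size_t depth = 0;
	size_t open;
	size_t i;

	/* The head ends at the first '(' or ':' */
	for ( i = 0 ; ( i < len ) && ( uri[i] != '(' ) && ( uri[i] != ':' ) ; i++ ) {
		/* nothing to do */
	}
	if ( i == len )
		return -1;
	out->type = uri;
	out->type_len = i;
	out->inner = NULL;
	out->inner_len = 0;

	if ( uri[i] == '(' ) {
		/* Scan forward to the matching ')' */
		open = i;
		for ( ; i < len ; i++ ) {
			if ( uri[i] == '(' ) {
				depth++;
			} else if ( ( uri[i] == ')' ) && ( --depth == 0 ) ) {
				break;
			}
		}
		if ( i == len )
			return -1;	/* unbalanced parentheses */
		out->inner = ( uri + open + 1 );
		out->inner_len = ( i - open - 1 );
		i++;			/* step past ')' */
		if ( ( i == len ) || ( uri[i] != ':' ) )
			return -1;	/* a ':' must follow the base URI */
	}
	out->path = ( uri + i + 1 );
	out->path_len = ( len - i - 1 );
	return 0;
}

/* Count how many bindings are stacked on the innermost URI; -1 on error */
static int nesting_depth ( const char *uri, size_t len ) {
	struct nested_uri parts;
	int inner_depth;

	if ( split_uri ( uri, len, &parts ) != 0 )
		return -1;
	if ( ! parts.inner )
		return 0;
	inner_depth = nesting_depth ( parts.inner, parts.inner_len );
	return ( ( inner_depth < 0 ) ? -1 : ( inner_depth + 1 ) );
}
```

Under this reading, ''ext2(part(aoe:e0.0):1):/boot/bzimage'' splits into type ''ext2'', base URI ''part(aoe:e0.0):1'' and path ''/boot/bzimage''; the autodetected form simply has an empty type at each level.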
+ | |||
+ | **Size impact**: ''uri.o'' +56, ''vfs.o'' +895, ''vfs_cmd.o'' +2037. |