Differences

This shows you the differences between two versions of the page.

unity-patch [2009/11/25 01:23]
rwcr
unity-patch [2009/11/25 10:47]
rwcr
Line 1: Line 1:
 ====== Unity patch ======
  
-I (rwcr) have been working on a rather extensive modification of gPXE, to allow images and SAN devices (and eventually files on filesystems) to be treated with more unity. This resolves a great many "ugly hack" comments, makes SAN booting less architecture-dependent, and allows one to SAN-boot ISO images (to name a few possibilities). The cost is a tiny size increase in image code due to an additional layer of indirection, and a more significant size increase in block device code for the same reason. Mainly, though, since this changes rather a lot in the codebase, I want to make sure people understand the patch so they can review it sensibly. :-)
+I (rwcr) have been working on a rather extensive modification of gPXE, to allow images and SAN devices (and eventually files on filesystems) to be treated with more unity. This resolves a great many "ugly hack" comments, makes SAN booting less architecture-dependent, and allows one to SAN-boot ISO images (to name a few possibilities). The cost is a tiny increase in image type code size due to an additional layer of indirection, and a more significant increase in block device code size for the same reason. This page is meant to summarize the changes, since a commit message can only be so long.
  
 ===== Data source abstraction =====
 A new abstraction is introduced, that of a "data source" (''struct source''), that can support random access and splitting and blocking of reads. In the case of an image already in memory it reduces to constructions like ''copy_from_user()''; to preserve size (about 800 bytes) in ROM images, it is possible to define ''MEM_SOURCE'' in ''config/general.h'' such that this reduction occurs at compile-time. Normally, though, a layer of indirection in ''core/source.c'' is kept around to support SAN devices and eventually files on a filesystem, which may not be always resident in memory, may have requirements that they are accessed in fixed-size blocks, and may only support reading or writing a certain number of blocks at a time.
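The "splitting and blocking of reads" described above can be illustrated with a minimal sketch. It is not the code in ''core/source.c'': every ''demo_'' name is hypothetical, and it assumes that ''blkshift'' is the log2 of the block size and that ''blkburst'' is the maximum number of blocks per backend request (0 meaning no limit). An arbitrary byte-range read is turned into block-aligned requests through a bounce buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_BOUNCE 4096	/* bounce buffer size in bytes (arbitrary choice) */

/* Hypothetical stand-in for a block-based data source */
struct demo_source {
	const uint8_t *backing;	/* pretend remote disk contents */
	uint64_t len;		/* length in bytes (a multiple of the block size here) */
	unsigned int blkshift;	/* assumed meaning: log2 of the block size */
	unsigned int blkburst;	/* assumed meaning: max blocks per request, 0 = no limit */
	unsigned int requests;	/* backend requests issued so far (for illustration) */
};

/* Backend: can only read whole blocks, at most blkburst at a time */
static void demo_read_blocks(struct demo_source *src, uint64_t blk,
			     unsigned int count, uint8_t *buf)
{
	assert(src->blkburst == 0 || count <= src->blkburst);
	memcpy(buf, src->backing + (blk << src->blkshift),
	       (size_t)count << src->blkshift);
	src->requests++;
}

/* Read an arbitrary byte range by issuing aligned, burst-limited requests */
static int demo_read(struct demo_source *src, uint64_t offset,
		     void *dest, size_t len)
{
	uint8_t bounce[DEMO_BOUNCE];
	size_t blksize = (size_t)1 << src->blkshift;
	unsigned int burst = DEMO_BOUNCE >> src->blkshift;
	uint8_t *out = dest;

	assert(blksize <= DEMO_BOUNCE);
	assert((src->len & (blksize - 1)) == 0);
	if (offset > src->len || len > src->len - offset)
		return -1;	/* range lies outside the source */
	if (src->blkburst && src->blkburst < burst)
		burst = src->blkburst;

	while (len) {
		uint64_t blk = offset >> src->blkshift;
		size_t skip = (size_t)(offset & (blksize - 1));
		/* Blocks needed to cover the rest of the request */
		uint64_t need = ((uint64_t)skip + len + blksize - 1) >> src->blkshift;
		unsigned int count = (need < burst) ? (unsigned int)need : burst;
		size_t avail = ((size_t)count << src->blkshift) - skip;
		size_t chunk = (len < avail) ? len : avail;

		demo_read_blocks(src, blk, count, bounce);
		memcpy(out, bounce + skip, chunk);
		out += chunk;
		offset += chunk;
		len -= chunk;
	}
	return 0;
}
```

With 8-byte blocks and a burst limit of 2, a 20-byte read starting at offset 5 is served by two backend requests of two blocks each. An in-memory source needs none of this, which is presumably why the ''MEM_SOURCE'' option can drop the layer.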
  
-A data source is an implementation-specific structure (''struct download'', ''struct scsi_device'', etc) that contains a ''struct source'' by value. The containing structure must be reference-counted, and ''source.refcnt'' points to that reference counter. One fills in ''source.read'' and optionally ''source.write'' with appropriate functions, optionally defines ''source.blkshift'' and ''source.blkburst'' to restrict the alignment and length of requests they can receive, and sets ''source.len'' to the length of the data source in bytes. After this point, the conceptual data source is passed around as a pointer to the ''struct source'' member; the implementation-specific containing structure can be retrieved with ''container_of()'', and it will automatically be freed when the last reference to its source is dropped. (References taken against the data source increment the reference counter in the containing structure.)
+A data source is an implementation-specific structure (''struct download'', ''struct scsi_device'', etc) that contains a ''struct source'' by value. The containing structure must be reference-counted, and ''source.refcnt'' points to that reference counter. One fills in ''source.read'' and optionally ''source.write'' with appropriate functions, optionally defines ''source.blkshift'' and ''source.blkburst'' to restrict the alignment and length of requests they can receive, and sets ''source.len'' to the length of the data source in bytes. After this point, the data source is passed around as a pointer to the ''struct source''; the implementation-specific containing structure can be retrieved with ''container_of()'', and it will automatically be freed when the last reference to its source is dropped. (References taken against the data source increment the reference counter in the containing structure.)
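The embedding pattern described here might look roughly like the following sketch. It deliberately does not use the real gPXE types: the ''demo_'' structures, the shape of the reference counter and the ''demo_disk'' container are all assumptions made for illustration. Only the idea (a source embedded by value, a pointer back to the container's counter, and ''container_of()'' to recover the container) comes from the text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Local equivalent of container_of(), built on offsetof() */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical reference counter with a release callback */
struct demo_refcnt {
	int count;
	void (*free)(struct demo_refcnt *refcnt);
};

/* Hypothetical, heavily reduced data source */
struct demo_source {
	struct demo_refcnt *refcnt;	/* points at the container's counter */
	uint64_t len;			/* length of the source in bytes */
};

/* Implementation-specific container embedding the source by value */
struct demo_disk {
	struct demo_refcnt refcnt;
	struct demo_source source;
	int id;
};

static int demo_disks_freed;	/* lets a caller observe that the release ran */

static void demo_disk_free(struct demo_refcnt *refcnt)
{
	struct demo_disk *disk =
		demo_container_of(refcnt, struct demo_disk, refcnt);
	demo_disks_freed++;
	free(disk);
}

/* Take a reference: this increments the counter in the container */
static struct demo_source *demo_source_get(struct demo_source *source)
{
	source->refcnt->count++;
	return source;
}

/* Drop a reference: the container is freed when the last one goes */
static void demo_source_put(struct demo_source *source)
{
	assert(source->refcnt->count > 0);
	if (--source->refcnt->count == 0)
		source->refcnt->free(source->refcnt);
}

/* Create a container and hand out only the embedded source */
static struct demo_source *demo_disk_create(int id, uint64_t len)
{
	struct demo_disk *disk = calloc(1, sizeof(*disk));

	if (!disk)
		return NULL;
	disk->refcnt.count = 1;
	disk->refcnt.free = demo_disk_free;
	disk->source.refcnt = &disk->refcnt;
	disk->source.len = len;
	disk->id = id;
	return &disk->source;
}

/* Recover implementation-specific state from the generic pointer */
static int demo_disk_id(struct demo_source *source)
{
	return demo_container_of(source, struct demo_disk, source)->id;
}
```

Generic code only ever sees the ''struct demo_source *''; the backend gets its own state back through ''demo_container_of()'', and nothing outside the backend has to know how the container is released.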
  
-Data sources support two additional features. First, they can be //loaded//, to allow for anything that needs the whole source in memory to work with it but doesn't particularly care where in memory it goes. Second, they can be //attached//, using platform-specific handlers to make the contents of the source available (as an emulated disk or otherwise) to a booted operating system. Both INT13 hooks and iBFT/aBFT/sBFT filling are implemented as source attachers. Both loading and attaching can be done recursively, so one can attach a SAN disk, boot from it (which will attach, execute, detach), and if the boot fails, still have the disk attached when gPXE exits; this is a cleaner way of achieving the "keep-san" functionality. One fills in ''source.data'' with a user pointer to indicate a source already resident in memory (loading and unloading become a no-op), or sets ''source.loaded'' to a nonzero integer while keeping ''source.data'' null to indicate a source that cannot sensibly be loaded in its entirety (e.g. a SAN disk).
+Data sources support two additional features. First, they can be //loaded//, to allow for anything that needs the whole source in memory to work with it but doesn't particularly care where in memory it goes. (Loaded sources wind up on the external heap like downloaded images.) Second, they can be //attached//, using platform-specific handlers to make the contents of the source available (as an emulated disk or otherwise) to a booted operating system. Both INT13 hooks and iBFT/aBFT/sBFT filling are implemented as source attachers. The code requesting that a source be attached doesn't need to know how that attachment is done, which keeps things as platform-independent as possible.
+Both loading and attaching can be done recursively, so one can attach a SAN disk, boot from it (which will attach, execute, detach), and if the boot fails, still have the disk attached when gPXE exits; this is a cleaner way of achieving the "keep-san" functionality. One fills in ''source.data'' with a user pointer to indicate a source already resident in memory (loading and unloading become a no-op), or sets ''source.loaded'' to a nonzero integer while keeping ''source.data'' null to indicate a source that cannot sensibly be loaded in its entirety (e.g. a SAN disk).
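One way to read "can be done recursively" is as a nesting count, where the platform handler runs only on the first attach and the last detach. The sketch below shows that interpretation; the counter, the ''demo_'' names and the boot helper are assumptions, not the patch's actual code.

```c
#include <assert.h>

/* Hypothetical per-source attachment state */
struct demo_source {
	unsigned int attached;	/* nesting depth of attach requests */
	unsigned int hooks;	/* times the platform attach handler ran */
	unsigned int unhooks;	/* times the platform detach handler ran */
};

/* Attach: only the outermost request reaches the platform handler */
static void demo_attach(struct demo_source *source)
{
	if (source->attached++ == 0)
		source->hooks++;	/* e.g. hook INT13, fill in the iBFT */
}

/* Detach: only the last outstanding request undoes the hook */
static void demo_detach(struct demo_source *source)
{
	assert(source->attached > 0);
	if (--source->attached == 0)
		source->unhooks++;
}

/* Boot attempt: attach, execute, and detach again if execution fails */
static int demo_boot(struct demo_source *source, int boot_succeeds)
{
	demo_attach(source);
	if (boot_succeeds)
		return 0;	/* control would pass to the booted OS here */
	demo_detach(source);
	return -1;
}
```

If the disk was attached once before the boot attempt, a failed ''demo_boot()'' leaves the depth at 1 and the platform hook in place, which matches the keep-san behaviour described above.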
 + 
 +**Size impact:** source.o +792 bytes, unless the minimalist ''MEM_SOURCE'' option is enabled.
  
 ===== Changes to downloads =====
Line 17: Line 19:
   * The downloading abilities of ''imgfetch()'' are separated into a new function, ''download_uri()'', in ''usr/dlmgmt.c''.
   * Instead of calling ''download_uri()'' directly, ''imgfetch()'' calls ''vfs_fetch_uri()'', which does some magic multiplexing so you can ''imgfetch'' a SAN disk or eventually a file on a filesystem as well as a downloadable URI. The reference to ''vfs_fetch_uri()'' is weak, so unless ''vfs.c'' is linked in by a common feature in the API of SAN protocols and filesystem types, it will reduce to ''download_uri()'' at compile time.
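The weak-reference fallback in the second bullet can be sketched with the GCC/Clang ''weak'' attribute on an ELF target. This is an illustration only: the ''demo_'' functions, their signatures and their return values are made up, and the real ''imgfetch()'' takes different arguments. In this sketch the choice is settled when the program is linked.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Weak declaration: if no linked object defines this symbol, its
 * address is a null pointer instead of causing a link error.
 * (GCC/Clang attribute; hypothetical function for illustration.)
 */
extern int demo_vfs_fetch_uri(const char *uri) __attribute__((weak));

/* Plain downloader that is always linked in (hypothetical) */
static int demo_download_uri(const char *uri)
{
	assert(uri != NULL);
	return 1;	/* pretend the download was started */
}

/* Prefer the multiplexer when present, else fall back */
static int demo_imgfetch(const char *uri)
{
	if (demo_vfs_fetch_uri)
		return demo_vfs_fetch_uri(uri);
	return demo_download_uri(uri);
}
```

Because nothing in this snippet defines ''demo_vfs_fetch_uri()'', ''demo_imgfetch()'' always takes the ''demo_download_uri()'' path; linking in an object that defines it would switch the behaviour without touching the caller.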
 +
 +**Size impact:** dlmgmt.o +166, imgmgmt.o -29, downloader.o +120; net +257 bytes.
  
 ===== Changes to images =====
Line 22: Line 26:
  
 A new image API function, ''image_set_source()'', can be used to set or change the data source associated with an image. It handles reference counting properly, and an image releases its reference to its data source when freed.
 +
 +**Size impact:** image.o +41, image_cmd.o -13 (bytes).
 +
 +^ image type ^ mem - old (bytes) ^ full - mem (bytes) ^ net (bytes) ^
 +| bootsector | +121              | +74                | +195        |
 +| bzimage    | +98               | +39                | +137        |
 +| com32      | +25               | +9                 | +34         |
 +| comboot    | +17               | -4                 | +13         |
 +| elf        | +30               | +12                | +42         |
 +| elfboot    | +3                | +2                 | +5          |
 +| multiboot  | +47               | +22                | +69         |
 +| pxe_image  | +7                | -4                 | +3          |
 +| script     | +38               | +19                | +57         |
 +^ Totals     | +386              | +169               | +555        |
 +
 +Here ''old'' is the code before the patch, ''mem'' is a build with ''MEM_SOURCE'' defined, and ''full'' is a build with the full indirection layer. Most of the ''mem - old'' impact comes from ''image->source->len'' being 64 bits wide and from the additional level of indirection required to access the fields of ''image->source''. The ''full - mem'' impact arises because ''source_read_user()'' takes two more parameters (one of them 64-bit) than the ''memcpy()'' that ''copy_from_user()'' previously reduced to.
  
 ===== Changes to SAN booting =====
 Currently, each SAN boot protocol has four components (example): the block device protocol (''scsi.c''), the networked backend transport (''iscsi.c''), the firmware table creator (''ibft.c''), and the boot glue (''iscsiboot.c''). The latter two are OS-specific, and the boot glue is the entry point; it creates a block device of the appropriate type, calls the networked backend to "attach" it, calls the firmware table creator to fill in data about it, hooks the device via int13h, attempts to boot it, and undoes all of that if keep-san isn't set and the boot fails. This is all rather undesirable, as it involves a lot of code duplication and makes SAN booting inherently platform-specific because that's where its entry point lies.
  
-In the new system, SAN booting is not a special case; any data source that looks like a hard disk or CD can be booted, thanks to a new ''bootsector'' image format (a semi-thin wrapper around the existing ''call_bootsector()'') and a generalization of gPXE's ElTorito support. One can ''chain'' or ''imgfetch'' a SAN disk in the same way as a URI, and ''sanboot'' would be identical to ''chain'' were it not for the need to keep support for the ''keep-san'' setting. As such, the boot glue is removed entirely in the unity patch. The firmware table creator is extended with a small glue function to make it work as a data source attacher, so SAN protocol code need not know about its existence directly; this allows the SAN code to remain platform-independent. The block device protocol provides a data source interface instead of a ''struct blockdev'' interface (''blockdev'' and ''ramdisk'' are both done away with) and the network backend transport provides a VFS binding (see below) to continue the existing URI-like syntax for lookups.
+In the new system, SAN booting is not a special case; any data source that looks like a hard disk or CD can be booted, thanks to a new ''bootsector'' image format (a semi-thin wrapper around the existing ''call_bootsector()'') and a generalization of gPXE's ElTorito support. One can ''chain'' or ''imgfetch'' a SAN disk in the same way as a URI, and ''sanboot'' would be identical to ''chain'' were it not for the need to keep legacy support for the ''keep-san'' setting. The boot glue is removed entirely in the unity patch. The firmware table creator is extended with a small glue function to make it work as a data source attacher, so SAN protocol code need not know about its existence directly; this allows the SAN code to remain platform-independent.
+The block device protocol provides a data source interface instead of a ''struct blockdev'' interface (''blockdev'' and ''ramdisk'' are both done away with) and the network backend transport provides a VFS binding (see below) to continue the existing URI-like syntax for lookups.
 + 
 +Attachment of a data source now occurs in three places: before attempting a SAN boot if ''keep-san'' is set; just before executing a bootsector or ElTorito image (with the source detached again if execution fails); and when the user explicitly requests it using a new ''attach'' command. The traditional use-case for ''keep-san'', a Windows install, is replaced simply by
 +  gPXE> attach iscsi:1.2.3.4::::iqn.2009-06.com.example.host:wininst
 +  gPXE> exit
 +and can be automated by serving a gPXE script with the "attach" line in it. Also, ''attach'' now supports an option ''-t extra'' to attach the source as an "extra" disk (numbered after existing hard drives) instead of the default of a "boot" disk (first hard drive, pushing others down). You can even attach a "boot" disk that's blank, an "extra" disk containing WinPE, boot the "extra" disk, and use it to install Windows onto the blank iSCSI target :-)
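The difference between "boot" and "extra" attachment comes down to BIOS drive numbering, where hard drives are numbered from 0x80 upwards. The sketch below shows the arithmetic only; the enum and function names are hypothetical and are not taken from the patch's INT13 code.

```c
#include <assert.h>

#define DEMO_FIRST_HD 0x80	/* BIOS number of the first hard drive */

/* Hypothetical attachment types mirroring "boot" and "-t extra" */
enum demo_attach_type { DEMO_ATTACH_BOOT, DEMO_ATTACH_EXTRA };

/* Drive number given to the newly attached disk */
static unsigned int demo_new_drive(enum demo_attach_type type,
				   unsigned int existing_drives)
{
	if (type == DEMO_ATTACH_BOOT)
		return DEMO_FIRST_HD;			/* becomes the first hard drive */
	return DEMO_FIRST_HD + existing_drives;	/* numbered after existing ones */
}

/* Number under which a pre-existing drive is seen afterwards */
static unsigned int demo_remap_existing(enum demo_attach_type type,
					unsigned int drive)
{
	assert(drive >= DEMO_FIRST_HD);
	if (type == DEMO_ATTACH_BOOT)
		return drive + 1;	/* pushed down by one */
	return drive;			/* unchanged */
}
```

On a machine with two local hard drives, a "boot" attachment takes 0x80 and moves the local drives to 0x81 and 0x82, while an "extra" attachment leaves them alone and takes 0x82.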
 + 
 +**Size impact:**
 +
 +^ object    ^ size change (bytes) ^
 +| autoboot  | +33                 |
 +| int13     | +320                |
 +| keepsan   | -128                |
 +| abft      | +23                 |
 +| ibft      | +29                 |
 +| aoe       | +121                |
 +| iscsi     | +105                |
 +| aoeboot   | -427                |
 +| iscsiboot | -453                |
 +| ata       | +149                |
 +| scsi      | +350                |
 +^ Total     | +122                |
  
-To be continued... 
