Differences
This shows you the differences between two versions of the page.
soc:alanshieh [2006/06/12 10:03] ashieh |
soc:alanshieh [2006/08/11 13:15] ashieh |
||
---|---|---|---|
Line 1: | Line 1: | ||
====== Alan Shieh, Linux UNDI Driver ====== | ====== Alan Shieh, Linux UNDI Driver ====== | ||
- | IRC logs, e-mails, and development notes coming soon! | + | |
+ | ===== Deliverables and Timeline ===== | ||
+ | |||
+ | Note: Since I am working with Etherboot 5.4.x, I am going directly for 16:32 UNDI stack support. As of 7/30, the UNDI driver works with the NE2K-PCI, which uses PIO to send data to/from the card. | ||
+ | |||
+ | * Test on alternate Etherboot hardware, including real hardware | ||
+ | ** Test card that uses PIO to set up DMA | ||
+ | ** Test card that uses memory mapped registers to set up DMA | ||
+ | * Test with full network boot (LTSP, NFS root) | ||
+ | |||
+ | * Experiment with getting other PXE stacks working -- inference of segment lengths via E820 holes. | ||
+ | |||
+ | These steps are done: | ||
+ | |||
+ | * Implement memory map functionality for Linux | ||
+ | * Set up UNDI Probe memory map | ||
+ | * Find UNDI ROM | ||
+ | * Make sure E820 Map is sane [[E820IRC:IRC Logs for E820 issue]]. I am Here (6/15/2006). Estimated completion time 6/20/2006 | ||
+ | * Hard code segment descriptor & location. 16:32 downcall (est 6/27/2006) | ||
+ | * Test UNDI calls, see proposal for details (est 7/4/2006) | ||
+ | * Integration with TUN/TAP device; transmit data with Linux (est 7/11/2006) | ||
+ | * PXE Extensions for segment descriptor & location | ||
+ | * Interrupt processing cleanup (est 7/18/2006) | ||
+ | |||
+ | |||
+ | ===== Resources ===== | ||
+ | [[Alan's test / development infrastructure]] | ||
+ | ===== UNDI proposal ===== | ||
+ | |||
+ | [[OldUNDIProposal]] | ||
+ | |||
+ | <file> | ||
+ | = Goals = | ||
+ | * Support both 16:16 and 16:32 protected mode UNDI stacks | ||
+ | |||
+ | = Phase 1: 16:16 UNDI stack = | ||
+ | |||
+ | Most work should be reusable in the 16:32 mode. The main difference | ||
+ | will be the page table and LDT/GDT setup, which will be driven by the | ||
+ | PXE extensions to provide the necessary information to the kernel. | ||
+ | |||
+ | == Linux UNDI execution process == | ||
+ | |||
+ | The process will interact with a network driver to gain access to the | ||
+ | kernel send and receive queues, and to perform interrupt processing. | ||
+ | |||
+ | The 16:16 version will make assumptions about page table / memory | ||
+ | layout. This restriction will be removed in the 16:32 version. | ||
+ | |||
+ | High level requirements | ||
+ | 1. Page table will be initialized for two "regions": | ||
+ | a) PXE execution environment: the physical/virtual address range that | ||
+ | Etherboot is known to reside in. | ||
+ | b) Area for Linux process | ||
+ | 2. Implement 16:32=>16:16 thunks between Linux process and PXE code. | ||
+ | ** 16:16 parameter passing area for all the parameter structures | ||
+ | 3. Poll for interrupts using PXENV_UNDI_ISR_IN_START. | ||
+ | 4. Implement bottom-half processing using | ||
+ | PXENV_UNDI_ISR.PXENV_UNDI_ISR_IN_PROCESS, | ||
+ | PXENV_UNDI_ISR.PXENV_UNDI_ISR_IN_GET_NEXT | ||
+ | |||
+ | The processing flowchart is provided in the PXE 2.1 specification. | ||
+ | |||
+ | == Implement stub driver for NIC == | ||
+ | Prototype will support only PCI devices. | ||
+ | |||
+ | 1. Perform PCI probe using PCI_IDs specified as module | ||
+ | parameters. | ||
+ | 2. Initialize GDT using !PXE information. | ||
+ | 3. Pump packets between UNDI execution process and kernel | ||
+ | transmit/receive queues. | ||
+ | 4. Rmmod should clean up properly so that a full driver can be loaded | ||
+ | later. | ||
+ | |||
+ | == Communications between UNDI execution process and driver == | ||
+ | The driver will export two pipes under /proc with the following interfaces: | ||
+ | |||
+ | kernel_to_process: | ||
+ | * TxPacket(char data[len], int len); // Kernel requests to send a packet | ||
+ | |||
+ | Maybe mii/ethtool/ifconfig? | ||
+ | |||
+ | process_to_kernel: | ||
+ | * RxPacket(char data[len], int len); // Process received packet, tell | ||
+ | kernel to queue it for network stack | ||
+ | |||
+ | == Etherboot & boot process modifications == | ||
+ | 1. Modify Etherboot to report 16:16 segment descriptors via !PXE. | ||
+ | 2. Add a configuration flag to Etherboot to prevent it from unloading UNDI | ||
+ | 3. Pass PCI_IDs to kernel | ||
+ | 4. Linux: | ||
+ | Before switching to protected mode, reserve !PXE.SegDescCnt | ||
+ | descriptors in GDT, and set !PXE.FirstSelector to the appropriate | ||
+ | location. | ||
+ | |||
+ | The descriptors will not be copied from !PXE structure until the | ||
+ | stub driver is loaded. Hopefully, the PXE stack will not gain | ||
+ | control in the interim and crash because the segment descriptors | ||
+ | have not been initialized. | ||
+ | |||
+ | 5. Use PCI_IDs to initialize NIC stub driver | ||
+ | |||
+ | == Initialization and testing == | ||
+ | I. Internal testing from within the driver process. | ||
+ | Implement an interactive debugging console? | ||
+ | |||
+ | 1. UNDI_OPEN, UNDI_CLOSE, UNDI_GET_STATE | ||
+ | |||
+ | UNDI_GET_INFORMATION, UNDI_GET_STATISTICS, | ||
+ | UNDI_CLEAR_STATISTICS | ||
+ | (stats are useful for debugging later functionality) | ||
+ | |||
+ | UNDI_INITIATE_DIAGS | ||
+ | (sanity check the driver and execution environment) | ||
+ | |||
+ | 2. UNDI_TRANSMIT | ||
+ | 3. UNDI_RECEIVE | ||
+ | |||
+ | Optional: UNDI_SET_STATION_ADDRESS, UNDI_GET_NIC_TYPE, UNDI_GET_IFACE_INFO | ||
+ | |||
+ | Ignored: UNDI_SET_PACKET_FILTER, UNDI_SET_MULTICAST_ADDRESS, UNDI_FORCE_INTERRUPT, UNDI_GET_MULTICAST_ADDRESS | ||
+ | |||
+ | 4. IPC pipes | ||
+ | 5. Interrupt polling, Rx/Tx handling | ||
+ | |||
+ | II. Linux milestones & tests | ||
+ | 1. Transmit & receive | ||
+ | 2. DHCP | ||
+ | 3. ARP | ||
+ | 4. TCP & NFS | ||
+ | 5. Simulated boot process: | ||
+ | a) fetch network module from remote host, e.g. via HTTP or NFS | ||
+ | b) unload UNDI module, kill UNDI execution process | ||
+ | c) load network module | ||
+ | d) network tests (as before) | ||
+ | |||
+ | = Phase 2: 16:32 mode = | ||
+ | |||
+ | == Extend PXE interface == | ||
+ | |||
+ | This mode will require additional information from the PXE stack, such | ||
+ | as the precise page table / memory layout and LDT/GDT entries for the | ||
+ | PXE stack. | ||
+ | |||
+ | This information will be returned via a new UNDI op-code | ||
+ | PXENV_GET_UNDI_ENV32. This entry point will be specially coded so that | ||
+ | it can execute in 16:16 mode (e.g. with KEEP_IT_REAL compile/link | ||
+ | options) via !PXE.EntryPointSP. | ||
+ | |||
+ | This will be the only op-code supported via that entry point, so the | ||
+ | vast majority of Etherboot will use 16:32. | ||
+ | |||
+ | typedef struct s_PXENV_GET_UNDI_ENV32 | ||
+ | { | ||
+ | /* Outputs */ | ||
+ | PXENV_STATUS Status; | ||
+ | ADDR32 PageDirectoryBase; | ||
+ | SEGOFF16 EntryPoint16_32; | ||
+ | |||
+ | /* Inputs */ | ||
+ | UINT32 DescriptorBufferSize; | ||
+ | ADDR32 DescriptorBuffer; | ||
+ | |||
+ | } t_PXENV_GET_UNDI_ENV32; | ||
+ | |||
+ | All UNDI op-codes will be accessible only through EntryPoint16_32. | ||
+ | |||
+ | PageDirectoryBase will be used to pass the memory map to the kernel. | ||
+ | The format is the native IA-32 2-level page table. The AVAIL fields of | ||
+ | the PTEs and PDEs will be used to convey additional information about | ||
+ | each page: | ||
+ | |||
+ | 000 = Normal page, can be relocated in physical memory | ||
+ | 001 = PCI DMA page, can only be relocated if the relocation is transparent to the PCI device. | ||
+ | 010 = PCI MMIO page. Must use these exact physical addresses. | ||
+ | |||
+ | min(!PXE.SegDescCnt, DescriptorBufferSize / sizeof(descriptor)) | ||
+ | entries will be set in DescriptorBuffer. | ||
+ | |||
+ | Each descriptor entry is 64 bits, in the native IA-32 segment | ||
+ | descriptor format. This way, any arbitrary set of descriptors can be | ||
+ | specified. | ||
+ | |||
+ | == Delta from 16:16 prototype == | ||
+ | 1. Page table, LDT/GDT initialization | ||
+ | 2. Thunks will call into 16:32 if available, otherwise 16:16 | ||
+ | |||
+ | ========================== | ||
+ | End of base functionality goals | ||
+ | ========================== | ||
+ | |||
+ | = Compatibility improvements = | ||
+ | |||
+ | 1. Interrupt-driven operation for quirky cards | ||
+ | |||
+ | Polling is much easier to deal with. However, according to comments in | ||
+ | undi.c, it does not work for all cards, so interrupt-driven operation | ||
+ | will increase the compatibility of the driver. | ||
+ | |||
+ | This will be quite tricky, since incorrect top-half handling can | ||
+ | easily jam the system. | ||
+ | |||
+ | After PCI probe, the kernel module will install a small interrupt handler. | ||
+ | The interrupt handler will need to use | ||
+ | PXENV_UNDI_ISR.PXENV_UNDI_ISR_IN_START. However, this is | ||
+ | not readily accessible, as it resides in the UNDI execution process. | ||
+ | |||
+ | If the NIC IRQ line is not shared, then this is trivial. The IRQ line | ||
+ | can be disabled while invoking PXENV_UNDI_ISR_IN_START in the | ||
+ | UNDI execution process, and then re-enabled when the process returns. | ||
+ | |||
+ | If the IRQ line is shared, we'll still need to mask the IRQ line while | ||
+ | dispatching to the ISR. However, this will also mask the other | ||
+ | devices, which might be needed to execute the process (e.g. disk IRQs | ||
+ | for demand paging). To decrease the likelihood of problems, the entire | ||
+ | process should be pinned in memory, linked statically, | ||
+ | etc. Alternatively, the kernel could switch to a polling strategy on | ||
+ | all IRQs while waiting for the user application to return. | ||
+ | |||
+ | Another solution would be to pull an UNDI execution environment into | ||
+ | the kernel context and dispatch directly to the UNDI ISR. However, | ||
+ | this would require changes to the kernel memory map and would probably | ||
+ | end up being messy. | ||
+ | |||
+ | == New kernel<=>user pipe commands == | ||
+ | |||
+ | kernel_to_process: | ||
+ | * UNDI_Int(t_PXENV_UNDI_ISR isr); // UNDI interrupt received by top-half | ||
+ | |||
+ | process_to_kernel: | ||
+ | * UNDI_Int_Ack(); // Acknowledge interrupt, which will reenable the IRQ line | ||
+ | |||
+ | = Support for other PXE stacks = | ||
+ | ** Experience & experiment with other PXE stacks | ||
+ | *** how they use memory (DMA, code locations) | ||
+ | *** tricks to support generic PXE stack | ||
+ | (paging, IOMMU, Linux kernel layout modifications) | ||
+ | *** Attempt to support unmodified PXE stacks | ||
+ | </file> |
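The Phase 2 section of the proposal describes two pieces of arithmetic that a consumer of PXENV_GET_UNDI_ENV32 would need: reading the per-page AVAIL code from a PTE or PDE, and computing how many descriptors were written to DescriptorBuffer. The following is a minimal sketch of both, not code from the driver. The identifiers `undi_page_kind`, `undi_entry_avail`, `undi_desc_count` and `UNDI_DESC_SIZE` are hypothetical names chosen for illustration. It assumes the AVAIL field is the standard software-available bits 9..11 of a 32-bit IA-32 page table or page directory entry, and that each descriptor is 8 bytes, as the proposal states.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names; only the 3-bit encodings come from the proposal. */
enum undi_page_kind {
    UNDI_PAGE_NORMAL   = 0x0, /* 000: can be relocated in physical memory */
    UNDI_PAGE_PCI_DMA  = 0x1, /* 001: relocate only if transparent to the PCI device */
    UNDI_PAGE_PCI_MMIO = 0x2  /* 010: must use these exact physical addresses */
};

/* Each entry is a 64-bit native IA-32 segment descriptor. */
#define UNDI_DESC_SIZE 8u

/* Extract the AVAIL field (assumed to be bits 9..11) from a 32-bit PTE or PDE. */
static unsigned undi_entry_avail(uint32_t entry)
{
    return (entry >> 9) & 0x7u;
}

/*
 * Number of descriptors set in DescriptorBuffer:
 * min(!PXE.SegDescCnt, DescriptorBufferSize / sizeof(descriptor)).
 */
static uint32_t undi_desc_count(uint32_t seg_desc_cnt, uint32_t desc_buffer_size)
{
    uint32_t fit = desc_buffer_size / UNDI_DESC_SIZE;
    return seg_desc_cnt < fit ? seg_desc_cnt : fit;
}
```

A caller would walk the page directory at PageDirectoryBase, pass each present entry to `undi_entry_avail`, and compare the result against the `undi_page_kind` values to decide whether the page may be relocated. Values other than the three listed are not defined by the proposal, so a real implementation would need to decide how to treat them.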