====== Joshua Oreman: 802.11 wireless development ======

===== Journal Week 12 =====

I can hardly believe there's only a week left of SoC. It's been a wonderful experience working with such talented developers, and I hope my coursework in the fall will leave me enough time to continue contributing :-)

Also, I believe this IRC message needs to find a permanent record here:

10:45 At the point you're talking about, the system is not fully initialised. On many systems, the memory map is not yet valid. If running normal BIOS-level code is marked with "Here be dragons", running during POST is marked with "Here be huge, ugly, vindictive, sociopathic dragons with a mean sense of humour"

Well put indeed!

==== Monday, 10 August ====

Not too much gPXE work today. I pushed a cleaned-up version of the large-ROM fix from my ath5k branch to staging as **bigrom-oremanj** (following the new [[:staging|staging tree protocol]]). A suggestion by Michael for making some of the overflow checks more intuitive revealed the rather surprising fact that bit-shifting in C by as many places as the width of the type, or more, is undefined; on gcc-x86, ''1ul << var'' when var is 32 will be not zero but one! This led to a small-scale [[:todo:audit-the-shifts|audit of variable-amount bitshifts]] in the gPXE source, but I didn't find any code that would cause problems with this undefined behavior.

I received a new e1000 card, and was able to use it to restore the flash on the old one following a procedure that I've outlined on the [[:romburning#recovering_from_a_bad_flash|ROM burning page]]. The issue was indeed one of option ROM overflow; gPXE loads to segment ''CC00'', meaning it has exactly 80kB of ROM space on my test system. The ROM that had caused trouble was about 90kB.

I found a regression in the 802.11 code caused by recent changes to ''process_add()'' to ensure the same process is not added twice.
The changes assume that all callers use ''process_init_stopped()'' to initialize all fields of the process structure, instead of setting just ''step'' and ''refcnt'' manually (which has worked fine in the past). The 802.11 code used the latter method, and now does not start the association process at all. I pushed a two-line fix to staging as **wiprocfix**, and it will probably be merged tomorrow.

I rebased my **linker** branch against recent changes and pushed it to staging. I updated my **firmware** branch to use the new symbol requirement macros defined in linker, and pushed it to my personal repository as **firmware-pretty**. It will go to staging after linker is merged, since it depends on the macros in linker.

Priorities for the rest of the week:

  * Write a page for 802.11 users and a page for driver developers
  * Post a brain-dump of the 802.11 knowledge I've gained working on this project (about halfway done writing it)
  * Once linker is merged, rebase and push firmware and wireless branches
  * Start working on flash-stub large ROM idea

Regarding the last bullet point, I think I'm going to try using the PCI ROM BAR before device-specific flash code. Video cards almost universally have very large memory regions compared to a typical flash size, so it should be easy enough to look for a BAR larger than the flash size, disable it, and map the flash in its place for long enough to copy its contents to RAM. Disabling the BAR doesn't affect the card's internal operation, so as long as we don't output anything while the flash is mapped this method should work. (If anyone reading this knows something I don't about PCI architecture and can see that this is a stupid idea, please let me know.)

==== Tuesday, 11 August ====

Wrote some documentation for [[:wirelessboot|users of the 802.11 code]] and [[:wirelessboot:drivers|driver writers]].
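As an aside on Monday's ROM BAR idea: looking for "a BAR larger than the flash size" means sizing each BAR, which is conventionally done by writing all-ones to the register and reading back which address bits stick. The following is only a minimal sketch of the decode step for a 32-bit memory BAR. ''bar_size_from_readback()'' and ''bar_can_hold()'' are hypothetical helper names, not gPXE functions, the config-space accesses themselves are left out, and 64-bit BARs are not handled.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Decode the value read back from a 32-bit memory BAR after all-ones
 * has been written to it. Hypothetical helper, not part of gPXE.
 *
 * Returns the size of the region in bytes, or 0 if the BAR is
 * unimplemented or is an I/O BAR (bit 0 set).
 */
static uint32_t bar_size_from_readback ( uint32_t readback ) {
	uint32_t addr_bits;

	/* Bit 0 set means an I/O BAR, which is of no use for mapping flash */
	if ( readback & 0x1 )
		return 0;

	/* The low four bits of a memory BAR are flags, not address bits */
	addr_bits = ( readback & ~( ( uint32_t ) 0xf ) );
	if ( addr_bits == 0 )
		return 0;

	/* The writable address bits form a mask; size is its two's complement */
	return ( ( uint32_t ) ( ~addr_bits + 1 ) );
}

/* Nonzero if the BAR's region is at least as large as the flash */
static int bar_can_hold ( uint32_t readback, uint32_t flash_size ) {
	return ( bar_size_from_readback ( readback ) >= flash_size );
}
```

For example, a readback of ''0xFFF00000'' decodes to a 1MB region, which would be large enough for a 128kB flash under the assumptions above. The original BAR value would have to be saved before sizing and restored afterwards.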
Updated **wireless** branch to cope with a quirk of Linksys routers' WPA support; they don't accept 4-Way Handshake packets that include the optional capabilities field. Since we don't advertise any capabilities, there's no reason to include the field, so we now omit it.

Updated **firmware** branch to clean up the makefile changes a bit.

Fixed my **wiprocfix** fix to properly set the reference count for the process object (no behavioral change, but the correctness is more intuitive) and saw it merged. **bigrom-oremanj** was also merged.

Rebased **linker** in staging against recent changes to mainline, and then **wireless** and **firmware** in my personal repository against linker.

Thought about future driver support; it seems that the combination of //b43//, //ath9k//, and //iwlwifi// should support almost all currently-unsupported cards in common use. Each of these drivers is over 20,000 lines of C in the Linux kernel, though, so this won't be an easy task.

==== Wednesday, 12 August ====

Mostly worked on the ROM-from-PCI loader, which can be found as branch **xrom** in my personal repository for the curious. I doubt I'll be cleaning this up for mainline, as there seems to be no really safe way of doing it, and newer systems with PCI3.0 and PMM get the benefit for free.

==== Thursday, 13 August ====

Wrote some [[:wirelessboot:implementation|implementation notes]] for the wireless code.

Updated branch **linker** in staging to add the line number to symbols generated by ''REQUIRE_SYMBOL(foo)'', so that they now look like e.g. ''_''''_require_foo_47''. This fixes an issue with using ''REQUIRE_OBJECT()'' multiple times in the same file (e.g. with both ''GDBUDP'' and ''GDBSERIAL'' defined); now that the symbols that macro introduces are initialized data rather than common, the compiler refuses to allow two with the same name.

Discovered that a stock e1000 gPXE ROM does not work on my development system (very recent BIOS, PCI3.0/BBS/PMM).
It seems the BIOS will refuse to hand out 1MB or more at a time using PMM, and since gPXE keeps requesting larger allocations until it gets one aligned to 2MB, gPXE doesn't use PMM at all. The ROM is 70656 bytes, and it's relocated 71680 bytes below the end of option ROM space. Loading gPXE from the POST-time prompt works fine (except that e820 is not yet available there on my system); loading it as a boot device freezes immediately after "gPXE starting execution...", and I get garbage onscreen after 10 seconds or so. I suspect some other card is trampling on our tail and hopelessly confusing the decompressor.

It turns out my test system exhibits the same underlying problem (BIOS won't give out 1MB or more via PMM); it just has slightly more option ROM space. When I tried to test a fix that would accept A20-set allocations if the BIOS had set up the A20 line properly, I managed to brick my e1000 //again//. Forty minutes of shuffling around PCI cards later, I fixed things, and verified that the state of the A20 line during POST is no indication of the state of A20 when our BEV or int19h is called. The BIOS disables it before booting.

A fix that would allow us to use PMM (and thus larger ROMs) on such limited BIOSes: accept any PMM buffer address, and set the A20 gate ourselves in the BEV. The code to do this is rather messy, though, and might not be worth it.

Also discovered a small bug in ''src/arch/i386/firmware/pcbios/gateA20.c'':

  #define A20_KBC_RETRIES (2^21)

"You keep on using that operator. I do not think it means what you think it means." :-)

==== Friday, 14 August ====

On my test system, if we make use of a PMM buffer with A20 set, we don't even get to the BEV entry point to have a chance to set the A20 gate up properly. Adding a ''ljmp $0xf000, $0xfff0'' immediately after ''bev_entry:'', which reboots the system at that point on a PMMless gPXE, does not prevent the freeze. There may be a subtler issue here.
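For readers who missed the joke in Thursday's ''A20_KBC_RETRIES'' find: in C, ''^'' is bitwise XOR rather than exponentiation, so ''2^21'' evaluates to 23. The sketch below shows the difference; it assumes the intent was 2 to the 21st power, and the ''_BUGGY'' and ''_FIXED'' macro names and the two accessor functions are made up for illustration rather than taken from the gPXE source.

```c
#include <assert.h>

/* As written in the source: XOR of 0b00010 and 0b10101, which is 23 */
#define A20_KBC_RETRIES_BUGGY ( 2 ^ 21 )

/* Assumed intent: 2 to the 21st power, expressed as a shift.
 * A shift count of 21 is well below the width of unsigned long,
 * so this shift is fully defined. */
#define A20_KBC_RETRIES_FIXED ( 1UL << 21 )

/* Hypothetical accessors so the two values can be compared at run time */
static unsigned long retries_buggy ( void ) {
	return A20_KBC_RETRIES_BUGGY;
}

static unsigned long retries_fixed ( void ) {
	return A20_KBC_RETRIES_FIXED;
}
```

The practical effect of the XOR version is a retry loop that gives up after a couple of dozen attempts instead of roughly two million.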
Started taking a look at the Linux ''b43'' (Broadcom wireless) driver. It's quite well-written and -commented, especially for a reverse-engineered driver, but the hardware is really a mess. Some models have the 30-bit DMA restriction Stefan dealt with during his SoC last year. The hardware uses an SSB interface, which seems to be on the level of a whole different bus bridged to PCI. And then there's this line:

  err = request_firmware(&blob, ctx->fwname, ctx->dev->dev->dev);

''dev->dev->dev''? Seriously? :-)

Figured out a possible solution to the **xrom** problem that we can't know about devices like APICs whose mappings aren't in PCI BARs: just read the entire space we're going to cover with our mapping before we map it. The standard on x86 is for unmapped memory to read all-ones, and designers of MMIO interfaces actively avoid all-ones being normal in a register. If all 128k or whatever read as ''0xFF'', plus we find no overlap in BARs or e820, it's almost certainly safe to map. The ROM-mapping logic could also be used for UNDI.

Split up the FireWire branch into a more logical separation of commits (first the generic interface, then the gPXE code that uses it, then the host-side utilities to make it useful). Pushed it as **firewire** to my personal repository and removed it from staging, as I have other code there that I think is more important (specifically **linker** and the various things depending on it).

==== Final Thoughts ====

Well, Summer of Code is over, and what an adventure it's been. I've immensely enjoyed working on such a mature and well-developed codebase, with a great many talented people, and in a very interesting field with lots of room for innovation. Thank you to everyone who's helped to make it possible!

Things I'd like to still get done, in rough order of priority:

  * Merge branch **linker** to mainline.
  * Merge branch **wireless** to mainline, with all the crypto and iwmgmt stuff.
  * Get branch **firmware** ready for mainline.
  * Get branch **firewire** ready for mainline, in one form or another. (It may be a very useful interface for driving gPXE for testing, a la DrV's project.)
  * Write **xrom** in a fashion that's remotely valid, doing the grunt work of scanning PCI BARs and e820 and such to look for a valid place to map instead of the ugly and immoral hack I initially went with.
  * In the Maybe category:
    * Extend branch **eap** into some real support for EAP / WPA Enterprise authentication, with a few common methods implemented.
    * Port driver ''ath9k'' or ''b43'' or ''iwlwifi'' from Linux. The latter two require firmware loading, and all are something of a mess.
    * Add standardized support for NV options using the VPD area or (on EFI) RuntimeServices->SetVariable(). The storing in EEPROM using NVS is nice, but most cards don't support it because their EEPROM is earmarked for other purposes. With wireless cards especially, another spot is needed for SSID and encryption key.

Final sanity check of local git branches related to the work I've done:

  ath5k                Merged             (ath5k wireless driver)
  bigrom-oremanj       Merged             (small patch to support big ROMs)
  sky2                 Merged             (sky2 wired NIC driver)
  mainline-review      Merged             (initial bout of wireless code)
  wiprocfix            Merged             (small patch to wireless code)
  linker               In staging         (improve linker macros, object-specific config)
  firewire             Waiting            (debugging interface over FireWire)
  firmware-pretty      Waiting            (firmware image embedding and loading)
  wireless-pretty      Waiting            (wireless crypto and improvements)
  eap                  To-do              (802.1X authentication, WPA Enterprise)
  xrom                 To-do              (load ROM from the PCI card)
  ath5k-old            History            (superseded by ath5k)
  firewire-old         History            (superseded by firewire)
  firewire-really-old  History            (superseded by firewire)
  wireless             History            (superseded by wireless-pretty)
  fwtrans              Academic interest  (load files over firewire debug link)

And so we go, again. Thank you to everyone who's made this summer great, and I hope to be able to continue contributing! :-)