Joshua Oreman: 802.11 wireless development

Journal Week 4

Monday, 15 June

Worked on a bunch of small things today. My video card didn't come until late in the day, so I was only able to do a little iSCSI testing, but what I found was revealing.

Commits:

I have no idea how adding a function argument managed to decrease code size, but it did. (Measuring uncompressed text size of gpxe.lkrn.tmp with no debugging.)

So far the changes on mainline-review are all the commits I made before yesterday that are not in my rtl8180 driver or in 802.11-specific code. I want to get a few things sorted out with 802.11 handling before I push it into mainline review - specifically some kind of rate handling, as working at 1 Mbps is a little undesirable :-)

Questions I'd like to get answered by someone with the authority to decide such things:

  • Is my print_status() addition consistent with the gPXE Way? I think having the 802.11 state is important - if a user says “it's really slow” the first thing we'd ask is an ifstat to show signal strength and transmission speed. An alternative would be to special-case 802.11 in the ifstat() code using net80211_get(); if we did that, we could only avoid always linking in 802.11 support by using linker tricks like weak symbols.
  • Is removing -Wformat-nonliteral an acceptable tradeoff for 60 bytes of uncompressed code size?
  • How asynchronous should association be? Currently the probe part is synchronous (freezes gPXE for a few seconds) and the rest is asynchronous. I think we should probably push it all in one direction or the other. A blocking association with a status message would have the advantage of obvious error reporting (user doesn't have to check ifstat to see it failed and compile with DEBUG=net80211 to see why), and of preventing the possibility of over-eager networking code sending packets before the link is up, but it detracts somewhat from the “it works just like a wired link with respect to link-up” abstraction.

iSCSI testing using DOS scandisk produced a complete scan with no errors, but one significantly slower than wired and quite jagged in its progress. (The unit of progress, 16 clusters, took anywhere between 3 seconds and a minute.) I did this testing before I fixed a bug in the rate-choosing code; what was supposed to be a “conservative” choice of the first 802.11b rate (1Mbps) actually had its logic inverted to use the first 802.11g rate, which in this case was 18Mbps. My host-AP and test machines are about a meter apart, which 802.11 RF modulation is not designed for, so I expect there was a great deal of packet loss between the two. The periodic stalling suggests some TCP strangeness that's borne out by the packet captures. I don't know enough about TCP to diagnose this, but I've posted the packet captures for posterity anyway. Both represent a scan of about 500 clusters on the same disk.

Update: Testing at 1Mbps produced even more horrible results, with upwards of 43 duplicate ACKs detected from my computer to gPXE for the same segment. For almost every single TCP segment. I think this may be an issue with capturing on the same node that's hosting the Access Point; I'll test more with the WRT54G when I receive it tomorrow.

Tuesday, 16 June

No other commits today - I spent the day debugging the iSCSI retransmission issue. It was a productive if slightly maddening time, and I'm hopeful that I'll get this thing squashed tomorrow.

The wireless packet captures referenced yesterday were indeed made worse by capturing on the access point; it turns out that running Wireshark on mon.wlan0 (or running airodump-ng) does very poorly in that situation, while Wiresharking on wlan0 works fine. You don't get 802.11 headers, but this is a TCP-level issue where those aren't really necessary.

At about 1am, after setting up a “serial” console over FireWire and poring over the debugging information thus created, I finally identified the root cause of the problem. gPXE debugging output:

TCP 0x1da94 TX 1024->3260 3ee9656a..3ee9676a           c227fd74  512 PSH ACK [send a packet]
TCP 0x1da94 RX 1024<-3260           3ee9676a c227fd74..c227fd74    0 ACK     [packet ACKed ok]
TCP 0x1da94 TX 1024->3260 3ee9676a..3ee9679a           c227fd74   48 PSH ACK [send another packet]
TCP 0x1da94 RX 1024<-3260           3ee9676a c227fd74..c227fd74    0 ACK     [that ACK seems to be stale...]
TCP 0x1da94 TX 1024->3260 3ee9676a..3ee9679a           c227fd74   48 PSH ACK [so resend packet]
TCP 0x1da94 RX 1024<-3260           3ee9676a c227fd74..c227fd74    0 ACK     [get another stale ACK]
TCP 0x1da94 TX 1024->3260 3ee9676a..3ee9679a           c227fd74   48 PSH ACK [resend it again]
TCP 0x1da94 RX 1024<-3260           3ee9679a c227fd74..c227fd74    0 ACK     [finally, ACKed properly]

Wireshark summary output, prettied up a bit and with sequence and ACK numbers in absolute hex:

iSCSI    SCSI: Data Out LUN: 0x00 (Write(10) Request Data)       Seq=3ee9656a
TCP      iscsi-target > 1024 [ACK]                                            Ack=3ee9676a
TCP      [TCP segment of a reassembled PDU]                      Seq=3ee9676a
TCP      iscsi-target > 1024 [ACK]                                            Ack=3ee9679a
TCP      [TCP Retransmission] [TCP segment of a reassembled PDU] Seq=3ee9676a
TCP      [TCP Dup ACK 4864#1] iscsi-target > 1024 [ACK]                       Ack=3ee9679a
TCP      [TCP Retransmission] [TCP segment of a reassembled PDU] Seq=3ee9676a
TCP      [TCP Dup ACK 4864#2] iscsi-target > 1024 [ACK]                       Ack=3ee9679a

gPXE thinks the iSCSI target is ACKing the packet before the one it just sent, so good TCP denizen that it is, it assumes the packet it just sent didn't get through and resends it. The number of retransmissions per packet grows over time; the capture just shown is from early on, with only 2 retransmissions per packet, but I've seen over 75 per packet. At some point, about a minute after the retransmissions start to occur, they abruptly cut off and the link is fine for another minute or so.

Wireshark shows that the iSCSI target is actually ACKing the packet gPXE just sent.
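
To make the mismatch concrete, here is a minimal Python sketch (not gPXE's C code) that classifies an incoming ACK number against a sender's state, in the spirit of the ACK checks in RFC 793. The function names are made up for illustration, and treating an ACK equal to the oldest unacknowledged sequence number as a duplicate is this sketch's reading, since such an ACK acknowledges nothing new. The sequence numbers are taken from the gPXE debug output above.

```python
MASK32 = 0xFFFFFFFF

def seq_diff(a, b):
    """Signed distance a - b in 32-bit sequence space (handles wraparound)."""
    d = (a - b) & MASK32
    return d - (1 << 32) if d & 0x80000000 else d

def classify_ack(snd_una, snd_nxt, seg_ack):
    """Classify an incoming ACK number against the sender's state.

    snd_una: oldest unacknowledged sequence number
    snd_nxt: next sequence number to be sent
    """
    if seq_diff(seg_ack, snd_una) <= 0:
        return "duplicate"      # acknowledges nothing new
    if seq_diff(seg_ack, snd_nxt) > 0:
        return "unsent"         # acknowledges data never transmitted
    return "new"                # advances snd_una

# From the debug output: the 48-byte segment 3ee9676a..3ee9679a is in flight.
SND_UNA, SND_NXT = 0x3EE9676A, 0x3EE9679A
print(classify_ack(SND_UNA, SND_NXT, 0x3EE9676A))  # duplicate
print(classify_ack(SND_UNA, SND_NXT, 0x3EE9679A))  # new
```

The first ACK number is the one gPXE logged repeatedly; the second is the one Wireshark shows the target sending.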

This could be a bug in gPXE's TCP stack (unlikely), an rtl8180 driver-level issue causing it to resubmit stale received packets, memory corruption somewhere, or something to do with 802.11's longer link-layer header. Tomorrow I try to figure out which one it is. Wheee!

Wednesday, 17 June

It was duplicate ACKs: a silly bug (signed versus unsigned) in the 802.11 layer caused the duplicate RX elimination code to only work half the time, and gPXE's TCP stack did not elegantly handle the duplicate ACKs thus generated. (802.11 can generate duplicate packets when a packet is received but its link-layer ACK is not, causing a retransmission which is also received.) I've patched the issue in both the 802.11 layer and the TCP stack, since TCP is meant to be resilient against such things. RFC793 allows my fix: “If the ACK is a duplicate, it can be ignored” (p.72).
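
For readers unfamiliar with 802.11 duplicate elimination, the idea can be sketched as follows. This is a minimal Python illustration, not the gPXE code: the class and method names are invented, and it shows only the general scheme in which a receiver remembers the last sequence control value seen from each transmitter and drops a frame that has the Retry flag set and repeats that value. It does not reproduce the signed/unsigned bug itself, whose details aren't given here.

```python
class DuplicateFilter:
    """Drop retransmitted 802.11 frames that were already received."""

    def __init__(self):
        # transmitter address -> last sequence control value seen
        self.last_seq_ctl = {}

    def accept(self, sender, seq_ctl, retry):
        """Return True if the frame should be passed up the stack."""
        seq_ctl &= 0xFFFF        # sequence control is a 16-bit field
        duplicate = retry and self.last_seq_ctl.get(sender) == seq_ctl
        self.last_seq_ctl[sender] = seq_ctl
        return not duplicate


rx = DuplicateFilter()
print(rx.accept("ap", 0x0120, retry=False))  # True: first copy
print(rx.accept("ap", 0x0120, retry=True))   # False: retransmitted copy
print(rx.accept("ap", 0x0130, retry=True))   # True: retry of a frame we missed
```

If a filter like this fails for some frames, the retransmitted copy reaches TCP as a second identical segment, which is how duplicate ACKs can end up at the sender.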

I also found an unrelated bug in rtl8180 that caused it to cycle through its whole TX ring whenever one packet was completed, reporting the spurious TX completions with iob set to NULL. I believe the Linux driver does this too. No symptoms, but it's best to fix such things.

Thus, commits:

Remaining things for this week: rate control, answers to the questions from Monday's entry, and pushing 802.11 code to mainline-review after I get the first two sorted out. Figuring out the iSCSI issue took much longer than I anticipated, but hopefully I'll still be able to get everything done.

Thursday, 18 June

Lots of progress today! I didn't get to rate control, but every other outstanding issue that I know about has been fixed.

Commits:

Wheee!

I heard from Michael concerning my questions above. He suggested I try to make association fully asynchronous, implementing some kind of link-up error indicator in net_device to handle the problem of errors never showing up. His original suggestion was to replace the link-up bit with the link-up rc value, using -EINPROGRESS to indicate that link-up was still in progress; I chose not to do it that way because of a subtle downside to gPXE's error-reporting system:

If netdevice sets rc to -EINPROGRESS, it's setting a value for that error that has been defined (by errno.h) to include a constant showing that the error came from netdevice.c. If ifmgmt then wants to check whether the error is -EINPROGRESS, its comparison will be against its own EINPROGRESS, with a field showing it came from ifmgmt.c, and it will thus never conclude that the net_device's error code is really EINPROGRESS. The error-reporting system is optimized for the assumption that errors will either be handled within the file that originated them, dropped at some layer, or propagated all the way back to the user for display. Indeed, a quick grep of the gPXE source showed not a single error equality comparison for an error originating outside the file the comparison was in. This tradeoff is very well-suited to gPXE's use cases, but it means we have to be careful about how we use the error codes. :-)
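
The comparison problem can be illustrated with a toy model. The bit layout, file identifiers, and helper names below are all hypothetical and chosen only for illustration; gPXE's real error encoding lives in its C headers and is not reproduced here. The sketch just shows why two values that both mean "EINPROGRESS" compare unequal once each carries the identity of the file that built it.

```python
POSIX_MASK = 0xFF     # hypothetical: low byte holds the POSIX error number
FILE_SHIFT = 8        # hypothetical: file identifier sits above it

def make_error(posix_code, file_id):
    """Build a negative error value tagged with the originating file."""
    return -((file_id << FILE_SHIFT) | posix_code)

def posix_part(rc):
    """Extract only the POSIX error number from a tagged error value."""
    return (-rc) & POSIX_MASK

FILE_NETDEVICE, FILE_IFMGMT = 0x21, 0x42   # made-up file identifiers
EINPROGRESS = 115                          # illustrative value

rc = make_error(EINPROGRESS, FILE_NETDEVICE)   # as set in "netdevice.c"
local = make_error(EINPROGRESS, FILE_IFMGMT)   # as seen from "ifmgmt.c"

print(rc == local)                           # False
print(posix_part(rc) == posix_part(local))   # True
```

The masked comparison is there only to make the mismatch visible; nothing above says gPXE compares errors that way.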

Michael also explained a bit about a conceptual separation in gPXE between kernel-ish code and user-ish code; user-ish code uses printf() to report status and errors, while kernel-ish code uses DBG() (which is normally invisible) and reports errors via return codes. I realized that my earlier attempt at including wireless status violated that separation by putting a function using printf() in the 802.11 stack directly; so I scrapped it and worked out a mechanism that uses a wireless-specific command (iwstat, to which I added iwlist and iwassoc).

The netdev→link_rc addition produces a code size increase in gpxe.lkrn that is disproportionate to its real impact; it adds only a few bytes at a time, but they come in at every driver's use of netdev_link_down or netdev_link_up (to keep the link_rc field consistent). For the real size-critical case of a ROM with support for only one driver, the size impact will be negligible (~20 uncompressed bytes for rtl8139).

Tomorrow is rate control, figuring out a subset of 802.11 error codes to include human-readable definitions for, and cleaning all of this up to push it to mainline-review in time for Saturday's meeting. Hopefully I can manage it :-)

Friday, 19 June

I got rate control working today, and squashed a few other outstanding bugs that I found. With the rate-control algorithm in place, I was able to transfer a 13MB file over TFTP in 15 seconds on an 802.11g network with the access point about 12 feet away. For comparison, the same transfer over wired gigabit Ethernet took 4 seconds. I think the level this RC algorithm achieves is probably more than adequate for gPXE's performance requirements; there are various constants that can be tuned, but I'm not going to worry about that side of things right now.
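
As a quick back-of-the-envelope check on those numbers, the following Python snippet converts the two transfer times into average throughput. It assumes 1 MB means 10**6 bytes and ignores protocol overhead, so the figures are rough.

```python
def throughput_mbps(megabytes, seconds):
    """Average throughput in Mbit/s, taking 1 MB as 10**6 bytes."""
    return megabytes * 8 / seconds

wireless = throughput_mbps(13, 15)
wired = throughput_mbps(13, 4)
print(f"wireless: {wireless:.1f} Mbit/s")    # wireless: 6.9 Mbit/s
print(f"wired:    {wired:.1f} Mbit/s")       # wired:    26.0 Mbit/s
print(f"ratio:    {wired / wireless:.2f}x")  # ratio:    3.75x
```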

I designed gPXE's rate-control algorithm mostly from scratch, based on my thoughts about how it would work well; a few aspects (such as the overriding rate decrease if we get 3 failed TXes) are based on Linux implementations.

We keep data for every rate that could be used (<16 for all practical purposes), for the TX and RX paths separately. The TX information for our current TX rate is updated when we receive TX completion status on a packet, and the RX information for a received packet's RX rate is updated whenever we receive a data packet (management packets are generally sent at 1Mbps and so would skew the results). The data for each (rate, direction) combination is kept in a simple 32-bit integer, with two bits per packet (3 = OK, 2 = retried once, 1 = retried multiple times, 0 = didn't get through); when new packets are received, old data is automatically shifted off the end, so that we always keep information for at most 16 packets for each (rate, direction) combination.

The average of TX and RX qualities for a given rate, weighted by number of packets of data available for each and weighting TX packets more heavily than RX packets (they're more reliable), is munged into a “goodness” value for that rate between 0 and 99. Whenever the current rate's “goodness” falls below 85, we switch to the fastest rate with “goodness” over 85, or the rate with best “goodness” if none is over 85.
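
The scheme described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the rc80211 code: the names are invented, the 2:1 TX/RX weighting and the exact scaling to 0..99 are guesses (the description above gives only the general shape), and the override after 3 failed TXes is left out. The two-bit encoding, the 16-packet window, and the threshold of 85 come from the description.

```python
MASK32 = 0xFFFFFFFF
THRESHOLD = 85                # switch when goodness drops below this
TX_WEIGHT, RX_WEIGHT = 2, 1   # assumption: TX just has to count for more

class History:
    """Results of the last 16 packets, two bits each, in one 32-bit word."""

    def __init__(self):
        self.bits = 0
        self.count = 0        # number of valid entries, at most 16

    def record(self, result):
        """result: 3 = OK, 2 = retried once, 1 = retried more, 0 = lost."""
        # Shifting left drops the oldest entry off the top of the word.
        self.bits = ((self.bits << 2) | (result & 3)) & MASK32
        self.count = min(self.count + 1, 16)

    def total(self):
        """Sum of the stored two-bit results."""
        return sum((self.bits >> (2 * i)) & 3 for i in range(self.count))


def goodness(tx, rx):
    """Weighted average quality scaled to 0..99; None if there is no data."""
    weight = TX_WEIGHT * tx.count + RX_WEIGHT * rx.count
    if weight == 0:
        return None
    score = TX_WEIGHT * tx.total() + RX_WEIGHT * rx.total()
    return 99 * score // (3 * weight)


def choose_rate(goodness_by_rate, current):
    """Pick a rate; goodness_by_rate maps each rate with data to 0..99."""
    if goodness_by_rate[current] >= THRESHOLD:
        return current
    good = [r for r, g in goodness_by_rate.items() if g > THRESHOLD]
    if good:
        return max(good)      # fastest rate that is good enough
    return max(goodness_by_rate, key=goodness_by_rate.get)


tx, rx = History(), History()
for _ in range(16):
    tx.record(3)
print(goodness(tx, rx))       # 99
tx.record(0)                  # one lost packet pushes out one old "OK"
print(goodness(tx, rx))       # 92
# Rates in units of 100 kbit/s: 54 Mbps has gone bad, 36 Mbps is fine.
print(choose_rate({10: 99, 360: 90, 540: 60}, current=540))  # 360
```

Keeping the packet count next to the 32-bit word is one way to tell "no data yet" apart from "16 lost packets", since both leave the word at zero.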

I got lucky with this one: the algorithm worked well as I designed it with only minor modifications, despite the fact that most of the numeric parameters would best be classified as educated guesses. :-) And it's quite small:

oremanj@xenon /home/oremanj/dev/gpxe % size src/bin/rc80211.o linux-2.6.30/net/mac80211/rc80211_minstrel.o
   text   data    bss	    dec	    hex	filename
    602      0      0	    602	    25a	src/bin/rc80211.o
   3472     96      0	   3568	    df0	linux-2.6.30/net/mac80211/rc80211_minstrel.o

With the addition of rate control, and a few other bugfixes that came up while I was testing today, I think the wireless code is just about ready for mainline review. I'm going to go through it over the weekend and make sure nothing is missing documentation, remove whitespace that's crept onto line ends, and so forth; I plan to have everything I've worked on thus far over the summer in my mainline-review branch by Monday.

A quick size sanity check on the wireless code:

    220     24      0	    244	     f4	bin/iwmgmt_cmd.o
    602      0      0	    602	    25a	bin/rc80211.o
   1479     56      0	   1535	    5ff	bin/iwmgmt.o
   7282    100     24	   7406	   1cee	bin/net80211.o

And for the rtl8180/rtl8185 driver: (only one of the RF handlers is required for any given card, in addition to rtl8180.o)

   3096    200      0	   3296	    ce0	bin/rtl8180.o
   1264     24      0	   1288	    508	bin/rtl8180_grf5101.o
    609     24      0	    633	    279	bin/rtl8180_max2820.o
   8174     24      0	   8198	   2006	bin/rtl8180_rtl8225.o
    985     24      0	   1009	    3f1	bin/rtl8180_sa2400.o

It's big by gPXE standards, but not enormous, and I think the size is reasonable given the complexity of the 802.11 protocol. And with the mucurses stuff (login and config) taken out but everything else default including iSCSI linked in, it meets the real test:

68 -rw-r--r-- 1 oremanj oremanj 65024 2009-06-20 01:59 bin/rtl8180--rtl8180_rtl8225.rom

Under 64k—yippee! (With a completely default config it's 66,560 bytes.) Of course, the real challenge will be squeezing encryption support in there… but I've got the rest of the summer to figure that out ;-)

Saturday, 20 June

Did a bunch of cleanup with no feature changes, and everything has been pushed to mainline-review. :-)

Commits ready for mainline review, in reverse order:

oremanj@xenon /home/oremanj/dev/gpxe/src % git log --pretty=oneline mainline-review | head -n 12
dcd4ae5d0edbc9abd429bce50f0e58726cdfe00b [iwmgmt] Add user-level 802.11 management commands and common error tables
b23fba30c9847b8fabf651d50cf6e4e323753548 [rtl818x] Add driver for Realtek 8180/8185 wireless cards
4edf4718760dfb35c6a0c811fc2e019fb176e9fc [802.11] Add support for 802.11 devices with software MAC layer
6d63a4a5f928b46422e2eb79837a8aba103e5bb7 [dhcp] Await link-up before starting DHCP
22c261e77bda0984f4cb052037008f487a4bcaa6 [hci] Expose ifcommon_exec() in a local header so wireless commands can use it
30df822acbf6a207201f111d886effa0e4fc97d3 [ifmgmt] Move link-up status messages from autoboot() to iflinkwait()
4602299f4b96f7692766553d5972066dfd567b4e [netdevice] Add netdev->link_rc field for errors encountered during link-up
8041741323b40d9f5c482d3c6e1391bee7be759d [tcp] Ignore duplicate ACKs in TCP ESTABLISHED state
a9a0567225493046f70e6252e21ab5c6d8219e87 [image] Modify imgfree command to accept an argument
f77b486f42b2b12604ce94d65cbba33b55a589e5 [netdevice] Adjust maximum link-layer header length for 802.11
d429b31ac28760004e753dc79178400d507975e2 [netdevice] Add netdev argument to link-layer push and pull handlers
18e6470d06d8846d531d97d881be6f1278bd2f15 [nvs] Add init function for Atmel 93C66 EEPROM

Next up: encryption…

