An 802.11 wireless network is composed of a set of nodes (called STAs in the standard) communicating over radio frequencies in the 2.4GHz (b/g/n) or 5GHz (a/n) unlicensed band.
Most of the details of the RF modulation are handled by the card, meaning we don't have to worry much about them - which is a very good thing. However, we do need to worry about regulatory domains, speed control, and congestion control.
802.11 devices can be used worldwide, but different jurisdictions have different regulations concerning what RF channels can be used at what power. As such, software must be responsible for telling the card what channel and transmit power to use.
The regulatory situation for 2.4GHz band networks is fairly simple; there are only fourteen channels, and they have well-defined frequencies and bandwidths. The channel centers increase in 5 MHz increments from channel 1 at 2.412 GHz to channel 13 at 2.472 GHz, and channel 14 (unique to Japan) lies at 2.484 GHz. The usual channel bandwidth is 22 MHz. Channels 1-11 are allowed in most of the world; 12 and 13 are OK most everywhere except North America; and 14 is only allowed in Japan for 802.11b (DSSS/CCK style transmission rather than 802.11g's OFDM).
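The channel arithmetic above is simple enough to write down directly. This is a minimal sketch; the function names are made up for illustration, and the overlap test simply assumes every channel occupies the usual 22 MHz.

```python
def channel_to_mhz_2ghz(channel: int) -> int:
    """Return the center frequency in MHz of a 2.4 GHz band channel."""
    if 1 <= channel <= 13:
        # Channels 1-13 are spaced 5 MHz apart, starting at 2412 MHz.
        return 2412 + 5 * (channel - 1)
    if channel == 14:
        # Channel 14 (Japan only) breaks the 5 MHz pattern.
        return 2484
    raise ValueError(f"not a 2.4 GHz channel: {channel}")


def channels_overlap(a: int, b: int, bandwidth_mhz: int = 22) -> bool:
    """True if two channels of the given bandwidth share any spectrum."""
    return abs(channel_to_mhz_2ghz(a) - channel_to_mhz_2ghz(b)) < bandwidth_mhz
```

With 22 MHz wide channels, centers need to be at least 22 MHz apart to stay clear of each other, which is why channels 1, 6, and 11 (25 MHz apart) are the usual non-overlapping choice.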
The situation for the 5GHz band is considerably more complex. There are over 40 channels defined, numbered noncontiguously based on their frequency, and none can be used worldwide. The US, Europe, and Japan allow transmissions centered from 5.180 to 5.320 GHz and from 5.500 to 5.700 GHz at intervals of 20 MHz, with a bandwidth of 20 MHz; the US, Singapore, China, and Korea allow from 5.745 to 5.825 GHz at intervals of 20 MHz, with 20 MHz bandwidth; and there are many more rules. http://en.wikipedia.org/wiki/List_of_WLAN_channels has a complete set culled from Annex J of the IEEE specification. “Channel numbers” for 802.11a/n are simply an index to frequency: frequency = 5000 + 5*channel MHz, or 4000 + 5*channel MHz if channel is above 180.
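The channel-number formula can be sketched as a pair of helpers. The function names are hypothetical, and neither function checks whether a channel is actually legal in any particular regulatory domain.

```python
def channel_to_mhz_5ghz(channel: int) -> int:
    """Center frequency in MHz for an 802.11a/n channel number."""
    if channel > 180:
        # Channels above 180 index from 4000 MHz instead of 5000 MHz.
        return 4000 + 5 * channel
    return 5000 + 5 * channel


def mhz_to_channel_5ghz(mhz: int) -> int:
    """Inverse of channel_to_mhz_5ghz for frequencies on the 5 MHz grid."""
    base = 4000 if mhz < 5000 else 5000
    if (mhz - base) % 5 != 0:
        raise ValueError(f"{mhz} MHz is not on the 5 MHz channel grid")
    return (mhz - base) // 5
```

For example, the 5.180 to 5.320 GHz range at 20 MHz intervals corresponds to channels 36, 40, ... 64, and 5.745 to 5.825 GHz to channels 149 through 165.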
There are also regulatory restrictions covering the maximum power that can be transmitted on a given channel. However, such information is (ideally!) contained in the beacon frames sent out by an access point, so if we know what channels to listen on we might be able to determine it at runtime.
Some cards are manufactured only in certain countries, and/or do not support generic setting of frequency and power. This is especially the case with 802.11b/g cards, which usually restrict one to the defined 2.4GHz channels.
802.11b supports speeds up to 11Mbps and 802.11a/g up to 54Mbps, but the faster you transmit, the more likely errors become in a suboptimal RF environment. It often produces much better throughput to transmit at a lower speed that has a greater likelihood of getting all bits through to the destination. Speed is generally controlled by software; we need to see how many packets are getting dropped (un-ACKed), and adjust our speed accordingly. There are good algorithms (the Linux driver uses one called “minstrel”) available to do this.
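To make the tradeoff concrete, here is a deliberately simplified sketch of ACK-based rate selection. It is not minstrel and not what any real driver does; `RateStats` and `pick_rate` are names invented for this example. The idea is just to count ACKed versus attempted frames per rate and pick the rate with the best expected throughput.

```python
from dataclasses import dataclass


@dataclass
class RateStats:
    """Per-rate bookkeeping (hypothetical structure for this sketch)."""
    mbps: float
    attempts: int = 0
    acked: int = 0

    def success_probability(self) -> float:
        # With no data yet, optimistically assume the rate works.
        return self.acked / self.attempts if self.attempts else 1.0

    def expected_throughput(self) -> float:
        # Nominal rate scaled by the fraction of frames that were ACKed.
        return self.mbps * self.success_probability()


def pick_rate(stats: list) -> RateStats:
    """Choose the rate with the highest expected throughput."""
    return max(stats, key=RateStats.expected_throughput)
```

With 54 Mbps succeeding 10% of the time and 24 Mbps succeeding 90% of the time, this picks 24 Mbps (21.6 expected versus 5.4). A real algorithm also has to probe other rates occasionally and age out old statistics, which this sketch ignores.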
In order to prevent every node from trying to transmit at the same time, each frame has embedded in it a field marking a dead-time period to be enforced after it is received, by all nodes except the receiving one. For instance, when sending a fragment, you set “duration” to the amount of time it will take to send the return ACK and the next fragment, to make sure all fragments can get through at once without competition.
While the NIC interprets received durations in hardware, software must know enough about the transmission modes to set the duration field properly in frames it sends.
IEEE 802.11 networks can be organized in two ways: a “managed” mode, where all traffic to the network goes through an Access Point (AP) that manages authentication as well, and an “ad-hoc” mode, where nodes communicate directly with each other. The two modes of operation are called ESS and IBSS respectively in the specification.
Over the RF link maintained by the physical layer, software sends frames to transmit data and manage network state. There are three types of frames: control, management, and data.
Control frames are used for low-level network control; for instance, before starting a transmit operation you can send an RTS and wait to receive a CTS from the recipient [you can also send a CTS to yourself if the transmission will take place at nonstandard speed]. Every data and management frame must be ACK'ed to show that it was received; non-ACK'ed frames are retransmitted. (The possibility that the ACK was lost instead of the frame forces software-level elimination of duplicate frames.)
In general, control frames are all handled by hardware; the software 802.11 layer doesn't need to RTS or CTS or ACK or retransmit. We hope.
Tests I've performed on some high-traffic live wireless networks [my school] show that the norm is for RTS/CTS not to be used at all. Cards have built-in activity sensing that makes it mostly unnecessary. The newest Linux wireless stack doesn't even have code for doing flow control in software.
Management frames are used to manage associations with the network.
The specific types are detailed below, but the general association process is as follows:
Unlike most Ethernet headers, the 802.11 frame header does not have a protocol type field. This is provided by an 8-byte LLC header at the start of the MAC-encapsulated packet; amongst a lot of superfluous information is the 2-byte EtherType.
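Pulling the EtherType out of that header can be sketched as follows. This assumes the usual LLC/SNAP encapsulation (DSAP 0xAA, SSAP 0xAA, control 0x03, a 3-byte OUI, then the EtherType in big-endian order); `parse_llc_snap` is a name made up for this example, and anything that doesn't match the pattern is simply rejected.

```python
import struct

# DSAP, SSAP, and control bytes that announce a SNAP header.
SNAP_PREFIX = bytes([0xAA, 0xAA, 0x03])


def parse_llc_snap(payload: bytes):
    """Split a data-frame body into (ethertype, remaining payload)."""
    if len(payload) < 8:
        raise ValueError("payload too short for an 8-byte LLC/SNAP header")
    if payload[:3] != SNAP_PREFIX:
        raise ValueError("not an LLC/SNAP header")
    # Bytes 3-5 are the OUI (ignored here); bytes 6-7 are the EtherType.
    (ethertype,) = struct.unpack("!H", payload[6:8])
    return ethertype, payload[8:]
```

For an IPv4 packet the header bytes would be `aa aa 03 00 00 00 08 00`, giving an EtherType of 0x0800.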
Everything in the air is sent using a frame format, like Ethernet's but more convoluted [as almost everything in 802.11 is]. As with Ethernet, there is a 32-bit CRC “Frame Check Sequence” *after* the data payload of each frame. EtherType information is NOT contained in the 802.11 frame header; there is an additional 802.2 header tacked on before the data.
Overall frame structure:
Apparently “network byte order” isn't standard enough: multi-byte fields are sent in little-endian. In the below descriptions, bit numbers follow the Intel standard, with bit 0 the LSB. (The IEEE spec confuses this horribly by putting bit 0 on the left-hand side of the page.)
The first byte in the Frame Control field defines the type of frame:
The second byte in the Frame Control field is a set of flags:
There are other frame types defined to support QoS, but we don't have to handle those if we advertise ourselves as a QoS-incapable node. APs are required to degrade gracefully when talking to a client if they support QoS and that client doesn't, at no expense to other QoS-capable nodes.
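Decoding the two Frame Control bytes can be sketched like this. The bit positions are the standard 802.11 layout as I understand it (protocol version in bits 0-1, type in bits 2-3, subtype in bits 4-7 of the first byte; one flag per bit in the second byte); the function and dictionary key names are invented for this example.

```python
FRAME_TYPES = {0: "management", 1: "control", 2: "data"}

# Flag names for bits 0..7 of the second Frame Control byte.
FLAG_NAMES = [
    "to_ds", "from_ds", "more_fragments", "retry",
    "power_mgmt", "more_data", "protected", "order",
]


def parse_frame_control(fc: bytes) -> dict:
    """Decode the 2-byte Frame Control field (bit 0 is the LSB)."""
    if len(fc) < 2:
        raise ValueError("Frame Control is two bytes long")
    first, second = fc[0], fc[1]
    return {
        "version": first & 0x03,
        "type": FRAME_TYPES.get((first >> 2) & 0x03, "reserved"),
        "subtype": (first >> 4) & 0x0F,
        "flags": {name for bit, name in enumerate(FLAG_NAMES)
                  if second & (1 << bit)},
    }
```

A first byte of 0x80 decodes as a management frame with subtype 8 (a beacon), and 0xD4 as a control frame with subtype 13 (an ACK).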
For PS-Poll control frames (used by low-power nodes to poll the AP, which we don't need to worry about), the lower 14 bits of this field contain the association ID of the node transmitting the frame. AIDs range from 1 to 2007. The upper 2 bits are 1.
During a contention-free period, this field is fixed at 32768.
Otherwise, it is usually some measurement of the duration for which this frame is expected to be “live”. Further details below.
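The three cases can be told apart from the 16-bit value alone, roughly as sketched below. This assumes that a clear top bit means a plain duration in microseconds; `decode_duration_id` and its return labels are made up for this example, and a real implementation would look at the frame type first rather than guess from the value.

```python
def decode_duration_id(value: int):
    """Classify a 16-bit Duration/ID value as (kind, payload)."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("Duration/ID is a 16-bit field")
    if value == 32768:
        # Fixed value used during a contention-free period.
        return ("cfp", None)
    if value & 0x8000 == 0:
        # Top bit clear: duration in microseconds (0..32767).
        return ("duration_us", value)
    if value & 0xC000 == 0xC000:
        # Top two bits set: lower 14 bits carry the association ID.
        aid = value & 0x3FFF
        if 1 <= aid <= 2007:
            return ("aid", aid)
    return ("reserved", value)
```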
The bottom 4 bits contain a fragment number, and the top 12 bits contain a sequence number. This field is only present in Management and Data frames, not Control frames.
Non-QoS nodes (that's us) use a single monotonically increasing counter for sequence numbers. There are no security issues with starting at 0. All fragments of a packet contain the same sequence number and monotonically increasing fragment numbers, which must start at 0. Retransmissions preserve the value of the Sequence control field.
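Packing and unpacking the Sequence Control field, plus the single counter a non-QoS node needs, might look like the sketch below. The names are hypothetical. Since the sequence number is only 12 bits wide, the counter here wraps back to 0 after 4095.

```python
import struct


def pack_sequence_control(sequence: int, fragment: int) -> bytes:
    """Build the 2-byte little-endian Sequence Control field."""
    value = ((sequence & 0xFFF) << 4) | (fragment & 0xF)
    return struct.pack("<H", value)


def unpack_sequence_control(raw: bytes):
    """Return (sequence number, fragment number)."""
    (value,) = struct.unpack("<H", raw[:2])
    return value >> 4, value & 0xF


class SequenceCounter:
    """Single increasing counter, wrapping at 12 bits."""

    def __init__(self) -> None:
        self._next = 0  # starting at 0 is fine

    def next(self) -> int:
        sequence = self._next
        self._next = (self._next + 1) & 0xFFF
        return sequence
```

All fragments of one packet would reuse the same value from `next()` with fragment numbers 0, 1, 2, and so on; a retransmission reuses the exact bytes it was first sent with.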
The FCS is a 32-bit cyclic redundancy check computed with polynomial arithmetic over GF(2), using the generator polynomial whose coefficients are one on the terms with exponents [32, 26, 23, 22, 16, 12, 11, 10, 8, 7, 5, 4, 2, 1, 0].
I'll leave the details to mathematicians. The CRC is nice and self-contained, so we can just steal the Linux version.
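For experimenting outside the kernel, the same CRC is available in most standard libraries. The sketch below builds the generator from the exponent list, checks a slow bit-by-bit version against Python's `zlib.crc32`, and appends the result to a frame. It assumes the usual CRC-32 conventions (all-ones initial value, bit-reflected processing, final inversion) and that the FCS goes on the wire least significant byte first; the helper names are made up.

```python
import struct
import zlib

EXPONENTS = [32, 26, 23, 22, 16, 12, 11, 10, 8, 7, 5, 4, 2, 1, 0]

# The x^32 term is implicit in a 32-bit register, so drop it.
POLY = sum(1 << e for e in EXPONENTS if e < 32)

# Bit-reversed form, used when processing the least significant bit first.
POLY_REFLECTED = int(f"{POLY:032b}"[::-1], 2)


def crc32_bitwise(data: bytes) -> int:
    """Slow reference CRC-32, one bit at a time."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (POLY_REFLECTED if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


def append_fcs(frame: bytes) -> bytes:
    """Return the frame with its 4-byte FCS appended."""
    return frame + struct.pack("<I", zlib.crc32(frame))


def fcs_ok(frame_with_fcs: bytes) -> bool:
    """Check the trailing 4 bytes against a freshly computed CRC."""
    if len(frame_with_fcs) < 4:
        return False
    body, fcs = frame_with_fcs[:-4], frame_with_fcs[-4:]
    return struct.pack("<I", zlib.crc32(body)) == fcs
```

The bitwise version exists only to show that the exponent list really does describe the same polynomial; in practice the table-driven library routine is the one to use.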
Control frames are used to manage contention and noise on the wireless network. All timing information is intended to be used with the low-level explanation below, and assumes we do not implement QoS.
Used to indicate a Request To Send a management or data frame.
Used to tell the sender of an RTS frame that it is Clear To Send the management or data frame it wants to send.
Used to acknowledge receipt of a data, management, PS-Poll, or block ACK frame (an optimization we don't have to support).
Sent by nodes to an AP; used for managing low-power devices, which we won't be using.
Currently I don't think we need them…
Data frames transmit… drumroll please… data.
All data frames we will see contain 3 address fields.
For frames in an ad-hoc network (To DS and From DS both 0), address 1 is the final destination node, address 2 is the sending node, and address 3 is the BSSID for the ad-hoc network.
For frames to an AP (To DS = 1, From DS = 0), address 1 is the BSSID (MAC of the AP), address 2 is the sending node, and address 3 is the MAC of the ultimate receiver.
For frames from an AP (To DS = 0, From DS = 1), address 1 is the receiving node (us), address 2 is the BSSID, and address 3 is the MAC of the original sender.
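The three address layouts above can be collapsed into one lookup on the To DS and From DS flags. A minimal sketch, with an invented function name; the fourth combination (both flags set) is not covered in these notes, so the sketch rejects it.

```python
def interpret_addresses(to_ds: bool, from_ds: bool,
                        addr1: bytes, addr2: bytes, addr3: bytes) -> dict:
    """Map the three address fields of a data frame to their roles."""
    if not to_ds and not from_ds:
        # Ad-hoc network.
        return {"destination": addr1, "source": addr2, "bssid": addr3}
    if to_ds and not from_ds:
        # Station to AP.
        return {"bssid": addr1, "source": addr2, "destination": addr3}
    if from_ds and not to_ds:
        # AP to station.
        return {"destination": addr1, "bssid": addr2, "source": addr3}
    raise ValueError("To DS = From DS = 1 is not handled in this sketch")
```

Note that address 1 is always the node that should receive the frame over the air and address 2 the node that transmitted it; only their higher-level roles change.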
The duration field is set to 0 for frames sent to a group/multicast address.
For the final or only fragment of a unicast packet, the duration field is set to the microseconds required to transmit one ACK frame plus the interframe space before it.
For a non-final fragment of a unicast packet, the duration field is set to the number of microseconds required to transmit the next fragment and two ACK frames, plus three interframe spaces.
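The three duration rules for data frames amount to a little arithmetic. In this sketch the ACK time, interframe space, and next-fragment time are passed in as parameters, because they depend on the PHY and rate in use; the function name is hypothetical, and rounding fractional microseconds up is an assumption of the sketch.

```python
import math


def data_frame_duration_us(group_addressed: bool,
                           final_fragment: bool,
                           ack_time_us: float,
                           interframe_space_us: float,
                           next_fragment_time_us: float = 0.0) -> int:
    """Duration field value for a data frame, in microseconds."""
    if group_addressed:
        # Group/multicast frames are not ACKed.
        return 0
    if final_fragment:
        # One ACK plus the interframe space before it.
        return math.ceil(ack_time_us + interframe_space_us)
    # Next fragment, two ACKs, and three interframe spaces.
    return math.ceil(next_fragment_time_us
                     + 2 * ack_time_us
                     + 3 * interframe_space_us)
```

With made-up timings of 44 µs per ACK, a 16 µs interframe space, and a 300 µs next fragment, a final fragment gets a duration of 60 and a non-final one 436.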
Management frames are used to communicate and change the state of the network - most importantly, for one node to get onto it.
The duration field in a management frame is set using the same logic as a data frame.
Address 1 is set to the final destination of the frame; Address 2 is set to the origin of the frame; and Address 3 is the BSSID, for all management frames.
Each management frame is defined as an ordered combination of some fields, present in all frames, and some “information elements”, present at the whim of the sender. In the below lists, anything not marked “(field)” is an information element and may not be present.
An information element is structured as a one-byte element ID, followed by a one-byte length field (whose value does not include the two header bytes), followed by that many bytes of information.
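Walking a run of information elements is a straightforward type-length-value loop. A minimal sketch, assuming the fixed fields have already been stripped off the front of the body; `parse_information_elements` is a made-up name.

```python
def parse_information_elements(body: bytes) -> list:
    """Split a byte string into (element_id, data) tuples."""
    elements = []
    offset = 0
    while offset < len(body):
        if offset + 2 > len(body):
            raise ValueError("truncated information element header")
        element_id = body[offset]
        length = body[offset + 1]  # does not count the 2 header bytes
        start = offset + 2
        end = start + length
        if end > len(body):
            raise ValueError("information element runs past end of frame")
        elements.append((element_id, body[start:end]))
        offset = end
    return elements
```

Elements with unknown IDs can be skipped safely this way, since the length byte says how far to jump.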
Beacon frames are sent periodically by APs to advertise their status to potential clients. They are unencrypted and contain some or all of the following, in the listed order.
I'm not sure what this does yet.
No body.
Informs a node that we are no longer a part of a managed (AP'ed) network. We can send this to the AP to disassociate, or the AP can send it to us to force us off. It is a declaration, not a request. The network can cope OK if a node simply disappears, but sending a disassociation frame is ideal.
All that is required in the body is a reason code (2 bytes, the complete list of codes is in IEEE 802.11-2007 pp. 92-93).
Informs an AP that we want to join its network. The body of the frame contains some or all of the following fields, in the listed order.
Sent by an AP in response to our association request. The body of the frame contains some or all of the following fields, in the listed order.
Sent for reasons I'm not sure about. The format of the body is identical to that of an association request frame, except that a “Current AP address” (field, 6-byte MAC address) is included before the SSID.
Sent in response to a reassociation request frame. The format of the body is identical to that of an association response frame.
Sent to ask for additional information about an AP, beyond what it advertises in its beacon, before joining. Contains:
Sent by an AP in response to a probe request frame. The first part of the body is identical to a beacon frame body, except that it never includes the “Traffic indication map” element. After the beacon-like body, including any vendor-specific elements, come specific responses to the “Request information” from the probe request frame.
An authentication frame is required before associating. Authentication is used on both ad-hoc and AP-managed networks, while one can only associate with an access point. The name “authentication” is a bit of a misnomer, because the only way to actually use this frame for authentication purposes (Shared Key use) is so insecure that it is never used. The normal way (Open System) relies for security on the fact that, whether you've associated or not, you're not going to be able to send or receive data if your encryption key is wrong. In addition, WPA or better networks (RSNs) have a 4-way handshake for initializing security parameters after authentication.
The frame contains:
This frame is sent by another node to tell us it needs us to reauthenticate. The only body content is a reason code field (2 bytes).
This frame is commonly used in active attacks, because it's fairly easy to capture the authentication sequence that will follow after a spoofed deauth frame. From there, if the user used a weak passphrase, it's easy enough to crack - and if they used WEP there are statistical attacks as well.
This frame is sent to invoke an extension to the set of defined management frames. There is a one-byte Action Category field in the body, followed by a variable-length Action Details field.
The capability field is 16 bits wide and contains information about requested or advertised optional capabilities. The bits are
The status code field is a two-byte integer. It is set to 0 for a successful operation and positive for a failure; the failure modes are enumerated in IEEE 802.11-2007 pp. 94-95.
The following information elements bear data that can be interpreted as a simple string:
Each byte of the data represents one supported transfer rate. “Basic” rates have the high bit set to 1 and the low 7 bits equal to the rate in units of 500 kb/sec. “Non-basic” rates have the high bit set to 0 and the low 7 bits used for a table lookup.
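Decoding a supported-rates element can be sketched as below. One loud assumption: the sketch treats the low 7 bits of non-basic rates as 500 kb/sec units too, which as far as I can tell is what the table works out to for the common 802.11a/b/g rates, but it is not a substitute for the table itself. The function name is invented.

```python
def decode_supported_rates(data: bytes) -> list:
    """Return a list of (rate in Mb/s, is_basic) tuples."""
    rates = []
    for byte in data:
        is_basic = bool(byte & 0x80)   # high bit marks a "basic" rate
        units = byte & 0x7F            # low 7 bits, assumed 500 kb/sec units
        rates.append((units * 0.5, is_basic))
    return rates
```

For example, the bytes `82 84 0b 16` would decode as 1 and 2 Mb/s (basic) followed by 5.5 and 11 Mb/s (non-basic).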
And on and on. I didn't feel it was necessary to document each one of these, since the standard is fairly clear on the details.
To be continued…