Joshua Oreman: 802.11 wireless development

802.11 protocol notes

Functional overview

An 802.11 wireless network is composed of a set of nodes (called STAs in the standard) communicating over radio frequencies in the 2.4GHz (b/g/n) or 5GHz (a/n) unlicensed band.

Physical layer (PHY)

Most of the details of the RF modulation are handled by the card, meaning we don't have to worry much about them - which is a very good thing. However, we do need to worry about regulatory domains, speed control, and congestion control.

Regulatory domains

802.11 devices can be used worldwide, but different jurisdictions have different regulations concerning what RF channels can be used at what power. As such, software must be responsible for telling the card what channel and transmit power to use.

The regulatory situation for 2.4GHz band networks is fairly simple; there are only fourteen channels, and they have well-defined frequencies and bandwidths. The channel centers increase in 5 MHz increments from channel 1 at 2.412 GHz to channel 13 at 2.472 GHz, and channel 14 (unique to Japan) lies at 2.484 GHz. The usual channel bandwidth is 22 MHz. Channels 1-11 are allowed in most of the world; 12 and 13 are OK most everywhere except North America; and 14 is only allowed in Japan for 802.11b (DSSS/CCK style transmission rather than 802.11g's OFDM).

The situation for the 5GHz band is considerably more complex. There are over 40 channels defined, numbered noncontiguously based on their frequency, and none can be used worldwide. The US, Europe, and Japan allow transmissions centered from 5.180 to 5.320 GHz and from 5.500 to 5.700 GHz at intervals of 20 MHz, with a bandwidth of 20 MHz; the US, Singapore, China, and Korea allow from 5.745 to 5.825 GHz at intervals of 20 MHz, with 20 MHz bandwidth; and there are many more rules. http://en.wikipedia.org/wiki/List_of_WLAN_channels has a complete set culled from Annex J of the IEEE specification. “Channel numbers” for 802.11a/n are simply an index to frequency: frequency = 5000 + 5*channel MHz, or 4000 + 5*channel MHz if channel is above 180.
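The channel-to-frequency rules for both bands fit in a few lines. This is only a sketch: the function name and the `band` argument are made up for illustration, and it does no regulatory checking at all.

```python
def channel_to_mhz(channel, band):
    """Return the center frequency in MHz for an 802.11 channel number.

    `band` is "2.4" or "5". Both the name and the argument are
    illustrative, not taken from any driver.
    """
    if band == "2.4":
        if 1 <= channel <= 13:
            # Channel 1 is 2412 MHz; centers step up by 5 MHz.
            return 2412 + 5 * (channel - 1)
        if channel == 14:
            # Japan-only channel, off the 5 MHz grid.
            return 2484
        raise ValueError("2.4 GHz channels run from 1 to 14")
    if band == "5":
        if channel > 180:
            return 4000 + 5 * channel
        return 5000 + 5 * channel
    raise ValueError("unknown band: %r" % (band,))
```

Whether a given channel may actually be used still depends on the regulatory domain; this only converts numbers.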

There are also regulatory restrictions covering the maximum power that can be transmitted on a given channel. However, such information is (ideally!) contained in the beacon frames sent out by an access point, so if we know what channels to listen on we might be able to determine it at runtime.

Some cards are manufactured only in certain countries, and/or do not support generic setting of frequency and power. This is especially the case with 802.11b/g cards, which usually restrict one to the defined 2.4GHz channels.

Speed control

802.11b supports speeds up to 11Mbps and 802.11a/g up to 54Mbps, but the faster you transmit, the more likely errors become in a suboptimal RF environment. It often produces much better throughput to transmit at a lower speed that has a greater likelihood of getting all bits through to the destination. Speed is generally controlled by software; we need to see how many packets are getting dropped (un-ACKed), and adjust our speed accordingly. There are good algorithms (the Linux driver uses one called “minstrel”) available to do this.
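To make the tradeoff concrete, here is a toy rate picker that chooses the rate with the best expected throughput (rate times fraction of frames ACKed). It is NOT minstrel, just the simplest possible illustration of the idea; the function name and the shape of the `stats` table are assumptions.

```python
def best_rate(stats):
    """Pick the rate with the highest expected throughput.

    `stats` maps a rate in Mbps to a (frames_sent, frames_acked) pair.
    Rates with no samples yet are treated as delivering nothing.
    """
    def expected_throughput(rate):
        sent, acked = stats[rate]
        return rate * acked / sent if sent else 0.0

    return max(stats, key=expected_throughput)


# Hypothetical counters: 54 Mbps mostly fails, 24 Mbps mostly works.
example_stats = {54: (100, 10), 24: (100, 90), 11: (100, 100)}
chosen = best_rate(example_stats)
```

A real algorithm also has to keep probing rates it is not currently using, or the statistics go stale.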

Congestion control

In order to prevent every node from trying to transmit at the same time, each frame has embedded in it a field marking a dead-time period to be enforced after it is received, by all nodes except the receiving one. For instance, when sending a fragment, you set “duration” to the amount of time it will take to send the return ACK and the next fragment, to make sure all fragments can get through at once without competition.

While the NIC interprets received durations in hardware, software must know enough about the transmission modes to set the duration field properly in frames it sends.

MAC layer

IEEE 802.11 networks can be organized in two ways: a “managed” mode, where all traffic to the network goes through an Access Point (AP) that manages authentication as well, and an “ad-hoc” mode, where nodes communicate directly with each other. The two modes of operation are called ESS and IBSS respectively in the specification.

Over the RF link maintained by the physical layer, software sends frames to transmit data and manage network state. There are three types of frames: control, management, and data.

Control frames

Control frames are used for low-level network control; for instance, before starting a transmit operation you can send an RTS and wait to receive a CTS from the recipient [you can also send a CTS to yourself if the transmission will take place at nonstandard speed]. Every data and management frame must be ACK'ed to show that it was received; non-ACK'ed frames are retransmitted. (The possibility that the ACK was lost instead of the frame forces software-level elimination of duplicate frames.)

In general, control frames are all handled by hardware; the software 802.11 layer doesn't need to RTS or CTS or ACK or retransmit. We hope.

Tests I've performed on some high-traffic live wireless networks [my school] show that the norm is for RTS/CTS not to be used at all. Cards have built-in activity sensing that makes it mostly unnecessary. The newest Linux wireless stack doesn't even have code for doing flow control in software.

Management frames

Management frames are used to manage associations with the network.

The specific types are detailed below, but the general association process is as follows:

At frequent periodic intervals, every AP not configured in “hidden SSID” mode (which is against the standard) sends out beacon frames with information about its network: its MAC address, ESSID (network name), regulatory information (how much power are we allowed to transmit?), etc. The AP can choose how much information to include in the beacon frame.
If we're not satisfied with the information in the beacon we get, we're trying to associate with a hidden network, or we just don't want to wait for a beacon, we can send a probe frame requesting specific information in a beacon-like response from a particular network. The network is referenced by name, not AP MAC address, so large networks with multiple APs work fine.
Once we have enough information to connect, we send an authentication frame. In most cases there's no actual authentication involved; anyone requesting authentication is granted it. The one exception is a WEP network configured in “Shared Key” mode, in which one must encrypt a challenge with the WEP key to authenticate. This is even more insecure than ordinary “Open System” WEP, and in practice no one uses it.
Once we're authenticated, we send an association frame requesting to be added to the AP's list of clients.
Once we're associated, there may be additional security steps if we're connecting to a WPA network. There's a four-way handshake involved, the details of which I'm still researching.

LLC layer

Unlike most Ethernet headers, the 802.11 frame header does not have a protocol type field. This is provided by an 8-byte LLC header at the start of the MAC-encapsulated packet; amongst a lot of superfluous information is the 2-byte EtherType.
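A sketch of pulling the EtherType out of that header. The source text only says the header is 8 bytes with a 2-byte EtherType in it; the specific layout used here (LLC bytes AA AA 03, a zero 3-byte OUI, then a big-endian EtherType) is the usual LLC/SNAP encapsulation and is an assumption of this example, as is the function name.

```python
import struct

# Assumed LLC/SNAP prefix: DSAP=0xAA, SSAP=0xAA, control=0x03, OUI=00:00:00.
LLC_SNAP_PREFIX = bytes([0xAA, 0xAA, 0x03, 0x00, 0x00, 0x00])


def split_llc(payload):
    """Return (ethertype, rest) for a data frame body starting with LLC/SNAP."""
    if len(payload) < 8 or payload[:6] != LLC_SNAP_PREFIX:
        raise ValueError("not an LLC/SNAP-encapsulated payload")
    # Unlike the 802.11 header fields, the EtherType is big-endian.
    (ethertype,) = struct.unpack(">H", payload[6:8])
    return ethertype, payload[8:]
```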

Frame format

Everything in the air is sent using a frame format, like Ethernet's but more convoluted [as almost everything in 802.11 is]. As with Ethernet, there is a 32-bit CRC “Frame Check Sequence” *after* the data payload of each frame. EtherType information is NOT contained in the 802.11 frame header; there is an additional 802.2 header tacked on before the data.

Overall frame structure:

Frame control (2 bytes).
Duration/ID (2 bytes).
Address 1 (6 bytes, MAC address). Specific meaning depends on the frame, but we only accept frames where this is our address or multicast/broadcast.
Address 2 (6 bytes, MAC address). Specific meaning depends on the frame, but this is the address we ACK to if we have to ACK.
Address 3 (6 bytes, MAC address). Specifies another address; e.g. if we're sending through the AP (Address 1) from ourselves (Address 2), the final destination goes in Address 3.
Sequence control (2 bytes). Contains sequence number and fragment number information.
Data (up to 2304 bytes plus encryption overhead).
Frame check sequence (4 bytes). CRC-32 over the rest of the frame *after* everything is packaged up/encrypted, just before it's sent over the wire.

Apparently “network byte order” isn't standard enough: multi-byte fields are sent in little-endian. In the below descriptions, bit numbers follow the Intel standard, with bit 0 the LSB. (The IEEE spec confuses this horribly by putting bit 0 on the left-hand side of the page.)
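The three-address header described above is 24 bytes and can be unpacked in one go. A minimal sketch, assuming a frame that does carry Address 3 and Sequence Control (so not an RTS/CTS/ACK); the names `MacHeader`, `parse_header` and `format_mac` are made up here.

```python
import struct
from collections import namedtuple

MacHeader = namedtuple(
    "MacHeader", "frame_control duration addr1 addr2 addr3 seq_ctrl")

# "<" selects little-endian with no padding: 2 + 2 + 6 + 6 + 6 + 2 = 24 bytes.
HEADER = struct.Struct("<HH6s6s6sH")


def parse_header(frame):
    """Unpack the fixed 24-byte header from the start of a frame."""
    return MacHeader(*HEADER.unpack_from(frame))


def format_mac(addr):
    """Render a 6-byte address in the usual colon-separated hex form."""
    return ":".join("%02x" % b for b in addr)
```

The data payload starts at offset `HEADER.size`, and the last 4 bytes of the frame are the FCS.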

Frame control field (2 bytes)

The first byte in the Frame Control field defines the type of frame:

Bits 0-1: 2-bit Protocol Version field. Always 0; if nonzero we must drop the packet.
Bits 2-3: 2-bit Type field.
- 0: Management frame (association, beacon, authentication, etc).
- 1: Control frame (ACK, RTS, CTS).
- 2: Data.
Bits 4-7: 4-bit Subtype field.
- For frames of type 0 (Management):
  - 0: Association request; 1: Association response.
  - 2: Reassociation request; 3: Reassociation response.
  - 4: Probe request; 5: Probe response.
  - 8: Beacon.
  - 9: ATIM.
  - 10: Disassociation.
  - 11: Authentication; 12: Deauthentication.
  - 13: Action.
- For frames of type 1 (Control): [we hope the card handles these]
  - 11: RTS [request to send].
  - 12: CTS [clear to send].
  - 13: ACK [acknowledgement].
- For frames of type 2 (Data):
  - 0: Data.
  - 4: Null (no data).

The second byte in the Frame Control field is a set of flags:

Bit 8: To DS; bit 9: From DS. The “DS” is IEEE-speak for the access point, on the theory that one might have a mesh network (Distribution System) of them. If both bits are 0, the packet is being transferred between two nodes in ad-hoc mode; if To DS is 1, it's from a node to the AP, and if From DS is 1, it's from an AP to the node. If both bits are 1 the packet is inter-AP communication and we should drop it.
Bit 10: More Frag. Set on all fragments except the last; fragment details are included in the Sequence Control field (described below). The spec requires that we handle fragments of three different packets all mixed up before we start dropping. Maximum packet size is 64k.
Bit 11: Retry. Set if this is a retransmission, clear if not.
Bit 12: Pwr Mgmt. Set iff the sending node is a low-power device that wants to go to sleep. We can probably ignore this; APs aren't allowed to be low-power devices.
Bit 13: More Data. Set on packets transmitted from an AP if more packets than this one are buffered for transmit, as an indication to low-power devices that they shouldn't go to sleep yet.
Bit 14: Protected. Set if this frame is encrypted by any method.
Bit 15: Order. Set if this frame is transmitted in strict order (with respect to its fragments, I assume). Probably can be ignored; I've never seen it used in the wild.
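The bit layout above translates directly into shifts and masks on the 16-bit (little-endian) Frame Control value. A sketch, with field names invented for this example:

```python
FLAG_BITS = {
    8: "to_ds", 9: "from_ds", 10: "more_frag", 11: "retry",
    12: "pwr_mgmt", 13: "more_data", 14: "protected", 15: "order",
}

TYPE_MANAGEMENT, TYPE_CONTROL, TYPE_DATA = 0, 1, 2


def decode_frame_control(fc):
    """Split a 16-bit Frame Control value into its fields."""
    fields = {
        "version": fc & 0x3,          # bits 0-1
        "type": (fc >> 2) & 0x3,      # bits 2-3
        "subtype": (fc >> 4) & 0xF,   # bits 4-7
    }
    for bit, name in FLAG_BITS.items():
        fields[name] = bool(fc & (1 << bit))
    return fields
```

For instance, a beacon (type 0, subtype 8, no flags) is 0x0080, and an ACK (type 1, subtype 13) is 0x00D4.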

There are other frame types defined to support QoS, but we don't have to handle those if we advertise ourselves as a QoS-incapable node. APs are required to degrade gracefully when talking to a client if they support QoS and that client doesn't, at no expense to other QoS-capable nodes.

Duration/ID field (2 bytes)

For PS-Poll control frames (polling low-power nodes, which we don't need to worry about), the lower 14 bits of this field contain the association ID of the node transmitting the frame. AIDs range from 1 to 2007. The upper 2 bits are 1.

During a contention-free period, this field is fixed at 32768.

Otherwise, it is usually some measurement of the duration for which this frame is expected to be “live”. Further details below.
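The three interpretations can be told apart as follows. This is a sketch: the caller has to say whether the frame is a PS-Poll (that comes from the Frame Control field), and the return format is just something invented for this example.

```python
def decode_duration_id(value, is_ps_poll=False):
    """Interpret the 16-bit Duration/ID field.

    Returns a (kind, number) pair; `kind` is one of "aid",
    "contention_free" or "duration_us".
    """
    if is_ps_poll:
        if value & 0xC000 != 0xC000:
            raise ValueError("PS-Poll AID must have the top two bits set")
        return ("aid", value & 0x3FFF)  # lower 14 bits
    if value == 32768:
        return ("contention_free", None)
    return ("duration_us", value)
```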

Sequence control field (2 bytes)

The bottom 4 bits contain a fragment number, and the top 12 bits contain a sequence number. This field is only present in Management and Data frames, not Control frames.

Non-QoS nodes (that's us) use a single monotonically increasing counter for sequence numbers. There are no security issues with starting at 0. All fragments of a packet contain the same sequence number and monotonically increasing fragment numbers, which must start at 0. Retransmissions preserve the value of the Sequence control field.
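Packing and unpacking the field is a one-liner each way. The sketch below also includes a single-counter sequence number source; wrapping at 4096 is my reading of "monotonically increasing" for a 12-bit field, and all the names are made up.

```python
def pack_seq_ctrl(sequence, fragment):
    """Build the 16-bit Sequence Control value."""
    return ((sequence & 0xFFF) << 4) | (fragment & 0xF)


def unpack_seq_ctrl(value):
    """Return (sequence_number, fragment_number)."""
    return value >> 4, value & 0xF


class SequenceCounter:
    """Single counter shared by all outgoing frames (non-QoS behaviour)."""

    def __init__(self):
        self.value = 0

    def next(self):
        seq = self.value
        self.value = (self.value + 1) & 0xFFF  # 12 bits, so wrap at 4096
        return seq
```

All fragments of one packet would call `next()` once and then use fragment numbers 0, 1, 2, ... with that same sequence number.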

Frame check sequence (4 bytes)

The FCS is a 32-bit cyclic redundancy check computed as a polynomial remainder over GF(2), using the generator polynomial whose coefficients are 1 on the terms with exponents [32, 26, 23, 22, 16, 12, 11, 10, 8, 7, 5, 4, 2, 1, 0].

I'll leave the details to mathematicians. The CRC is nice and self-contained, so we can just steal the Linux version.
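For experimenting outside the driver, the exponent list above works out to the same polynomial that zlib's `crc32` uses, so Python can stand in for the Linux code. The sketch below assumes the usual CRC-32 conventions (all-ones initial value, bit-reflected processing, final inversion) and that the FCS is appended least significant byte first; the source text only gives the polynomial, so treat those details as assumptions. Function names are made up.

```python
import struct
import zlib

EXPONENTS = [32, 26, 23, 22, 16, 12, 11, 10, 8, 7, 5, 4, 2, 1, 0]
# Dropping the implicit x^32 term leaves the familiar 32-bit constant.
POLY = sum(1 << e for e in EXPONENTS) & 0xFFFFFFFF
# Bit-reversed form, used by the LSB-first algorithm below.
POLY_REFLECTED = int(format(POLY, "032b")[::-1], 2)


def crc32_bitwise(data):
    """Slow reference CRC-32, one bit at a time."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (POLY_REFLECTED if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


def append_fcs(frame_without_fcs):
    """Return the frame with its 4-byte FCS appended."""
    fcs = zlib.crc32(frame_without_fcs) & 0xFFFFFFFF
    return frame_without_fcs + struct.pack("<I", fcs)


def fcs_ok(frame):
    """Check the trailing 4-byte FCS of a received frame."""
    if len(frame) < 4:
        return False
    (fcs,) = struct.unpack("<I", frame[-4:])
    return fcs == (zlib.crc32(frame[:-4]) & 0xFFFFFFFF)
```

The bitwise version is only there to tie the exponent list to the library routine; real code would use a table-driven CRC.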

Control frames description

Control frames are used to manage contention and noise on the wireless network. All timing information is intended to be used with the low-level explanation below, and assumes we do not implement QoS.

RTS frame

Used to indicate a Request To Send a management or data frame.

Address 1 is the intended immediate recipient of the frame we want to send. (Probably the AP's MAC.)
Address 2 is our MAC address.
Address 3 and the Sequence Control field are not included. There is no data; the frame is 20 bytes long including FCS.
Duration is the number of microseconds we expect to be required for one CTS frame, one ACK frame, the data or management frame we're asking to send, and three short interframe spaces.

CTS frame

Used to tell the sender of an RTS frame that it is Clear To Send the management or data frame it wants to send.

Address 1 is copied from Address 2 of the preceding RTS frame: it's the MAC of the node we're allowing to send.
Addresses 2 and 3 and the Sequence Control field are not included. There is no data; the frame is 14 bytes long including FCS.
Duration is the duration value from the RTS frame minus the time required to transmit the CTS frame and its preceding interframe space.

ACK frame

Used to acknowledge receipt of a data, management, PS-Poll, or block ACK frame (an optimization we don't have to support).

Address 1 is copied from Address 2 of the frame we're ACKing.
Duration is 0 if the frame we're ACKing had no more fragments; otherwise, it's the duration value from the frame we're ACKing minus the time to transmit the ACK frame and its preceding interframe space.
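The Duration rules for RTS, CTS and ACK above are plain arithmetic once the air times are known. In this sketch every time is a parameter in microseconds; working out how long a frame takes at a given rate is a PHY question this example does not attempt, and the function names are made up.

```python
def rts_duration(frame_us, cts_us, ack_us, sifs_us):
    """Duration for an RTS: CTS + ACK + the pending frame + three SIFS."""
    return cts_us + ack_us + frame_us + 3 * sifs_us


def cts_duration(rts_duration_us, cts_us, sifs_us):
    """Duration for a CTS: the RTS value minus the CTS itself and one SIFS."""
    return rts_duration_us - cts_us - sifs_us


def ack_duration(acked_duration_us, more_fragments, ack_us, sifs_us):
    """Duration for an ACK: 0 unless more fragments follow."""
    if not more_fragments:
        return 0
    return acked_duration_us - ack_us - sifs_us
```

With purely hypothetical air times of 304 us for a CTS or ACK, 1000 us for the data frame and a 10 us SIFS, the RTS carries 1638 and the answering CTS carries 1324.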

PS-Poll frame

Sent by nodes to an AP; used for managing low-power devices, which we won't be using.

Address 1 is the MAC address of the AP we're polling.
Address 2 is our MAC address.
Address 3 and the Sequence Control field are not included. There is no data; the frame is 20 bytes long including FCS.
As explained above, Duration is here used as an association ID field.

Other types

Currently I don't think we need them…

Data frames description

Data frames transmit… drumroll please… data.

Addressing

All data frames we will see contain 3 address fields.

For frames in an ad-hoc network (To DS and From DS both 0), address 1 is the final destination node, address 2 is the sending node, and address 3 is the BSSID for the ad-hoc network.

For frames to an AP (To DS = 1, From DS = 0), address 1 is the BSSID (MAC of the AP), address 2 is the sending node, and address 3 is the MAC of the ultimate receiver.

For frames from an AP (To DS = 0, From DS = 1), address 1 is the receiving node (us), address 2 is the BSSID, and address 3 is the MAC of the original sender.
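The three addressing cases can be folded into one helper that says which address plays which role. A sketch only; the function name and the dictionary keys are made up.

```python
def address_roles(to_ds, from_ds, addr1, addr2, addr3):
    """Map the three address fields of a data frame to their meaning."""
    if not to_ds and not from_ds:
        # Ad-hoc: straight from one node to another.
        return {"destination": addr1, "source": addr2, "bssid": addr3}
    if to_ds and not from_ds:
        # Node to AP: the AP relays to the ultimate receiver.
        return {"bssid": addr1, "source": addr2, "destination": addr3}
    if from_ds and not to_ds:
        # AP to node: the AP relays from the original sender.
        return {"destination": addr1, "bssid": addr2, "source": addr3}
    raise ValueError("To DS and From DS both set: inter-AP frame, drop it")
```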

Duration

The duration field is set to 0 for frames sent to a group/multicast address.

For the final or only fragment of a unicast packet, the duration field is set to the microseconds required to transmit one ACK frame plus the interframe space before it.

For a non-final fragment of a unicast packet, the duration field is set to the number of microseconds required to transmit the next fragment and two ACK frames, plus three interframe spaces.
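Those three cases as code, again with all air times passed in as microsecond parameters (computing them is a PHY matter) and with a made-up function name:

```python
def data_duration(is_group, more_fragments, ack_us, sifs_us,
                  next_fragment_us=0):
    """Duration value for an outgoing data frame or fragment."""
    if is_group:
        # Group-addressed frames are not ACKed.
        return 0
    if not more_fragments:
        # Final or only fragment: just cover the ACK and the gap before it.
        return ack_us + sifs_us
    # Non-final fragment: cover its ACK, the next fragment and that
    # fragment's ACK, plus the three gaps in between.
    return next_fragment_us + 2 * ack_us + 3 * sifs_us
```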

Management frames description

Management frames are used to communicate and change the state of the network - most importantly, for one node to get onto it.

The duration field in a management frame is set using the same logic as a data frame.

Address 1 is set to the final destination of the frame; Address 2 is set to the origin of the frame; and Address 3 is the BSSID, for all management frames.

Each management frame is defined as an ordered combination of some fields, present in all frames, and some “information elements”, present at the whim of the sender. In the below lists, anything not marked “(field)” is an information element and may not be present.

An information element is structured as a one-byte element ID, followed by a one-byte length field (whose value does not include the two header bytes), followed by that many bytes of information.
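Walking a run of information elements is a simple ID/length/data loop. A minimal sketch (the function name is made up) that refuses truncated input rather than guessing:

```python
def parse_elements(body):
    """Yield (element_id, data) pairs from a block of information elements."""
    offset = 0
    while offset + 2 <= len(body):
        element_id = body[offset]
        length = body[offset + 1]      # does not count the 2 header bytes
        data = body[offset + 2:offset + 2 + length]
        if len(data) < length:
            raise ValueError("element %d is truncated" % element_id)
        yield element_id, data
        offset += 2 + length
    if offset != len(body):
        raise ValueError("stray byte after the last element")
```

Because elements are optional and self-describing, a receiver can skip any ID it does not understand and carry on with the next one.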

Beacon frame

Beacon frames are sent periodically by APs to advertise their status to potential clients. They are unencrypted and contain some or all of the following, in the listed order.

Timestamp (field, 64bit): value of the timing sync function of the AP.
Beacon interval (field, 16bit): number of 1024-us intervals between beacons.
Capability (field, 16bit): see below.
SSID (element #0, 2-34byte).
Supported rates (element #1, 3-10byte).
Frequency-hopping parameter set (element #2, 7byte).
DS parameter set (element #3, 3byte).
CF parameter set (element #4, 8byte).
IBSS parameter set (element #6, 4byte).
Traffic indication map (element #5, 6-256byte).
Country information (element #7, 8-256byte).
Frequency-hopping parameters (element #8, 4byte) -OR- FH pattern table (element #9, 6-256byte).
Power constraint (element #32, 3byte).
Channel switch announcement (element #37, 5byte).
Quiet (element #40, 8byte).
IBSS DFS (element #41, 10-255byte).
TPC report (element #35, 4byte).
ERP [extended rate protocol] information (element #42, 3byte).
Extended supported rates (beyond 8) (element #50, 3-257byte).
RSN (information on better-than-WEP security) (element #48, 36-256byte).
BSS load (element #11, 7byte).
EDCA parameter (element #12, 20byte) -OR- QoS capability (element #46, 3byte).
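The three fixed fields at the front of a beacon body take up 12 bytes; everything after them is information elements. A sketch of splitting the two apart, assuming `body` starts right after the MAC header and has had the FCS removed (the function name and dictionary keys are made up):

```python
import struct

# Timestamp (64 bit), beacon interval (16 bit), capability (16 bit),
# all little-endian: 12 bytes in total.
BEACON_FIXED = struct.Struct("<QHH")


def parse_beacon_fixed(body):
    """Split a beacon body into its fixed fields and the raw element bytes."""
    timestamp, interval, capability = BEACON_FIXED.unpack_from(body)
    return {
        "timestamp": timestamp,
        "interval_us": interval * 1024,   # field counts 1024-us units
        "capability": capability,
        "elements": body[BEACON_FIXED.size:],
    }
```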

ATIM frame

I'm not sure what this does yet.

No body.

Disassociation frame

Informs a node that we are no longer a part of a managed (AP'ed) network. We can send this to the AP to disassociate, or the AP can send it to us to force us off. It is a declaration, not a request. The network can cope OK if a node simply disappears, but sending a disassociation frame is ideal.

All that is required in the body is a reason code (2 bytes, the complete list of codes is in IEEE 802.11-2007 pp. 92-93).

Association request frame

Informs an AP that we want to join its network. The body of the frame contains some or all of the following fields, in the listed order.

Capability (field, 16bit): see below.
Listen interval (field, 16bit): multiple of the beacon interval indicating how often a low-power device wakes up to listen for beacon frames. We can set this to 1.
SSID (element #0, 2-34byte).
Supported rates (element #1, 3-10byte).
Extended supported rates (beyond 8) (element #50, 3-257byte).
Power capability (element #33, 4byte).
Supported channels (element #36, 4-256byte).
RSN information (element #48, 36-256byte).

Association response frame

Sent by an AP in response to our association request. The body of the frame contains some or all of the following fields, in the listed order.

Capability (field, 16bit): see below.
Status code (field, 16bit): see below.
Association ID (field, 16bit): tells the associating node what its ID in this iteration of the network is. The top 2 bits are set to 1, for quick matching against the AID field in PS-Poll control frames.
Supported rates (element #1, 3-10byte).
Extended supported rates (beyond 8) (element #50, 3-257byte).
EDCA parameter set (element #12, 20byte).

Reassociation request frame

Sent for reasons I'm not sure about. The format of the body is identical to that of an association request frame, except that a “Current AP address” (field, 6-byte MAC address) is included before the SSID.

Reassociation response frame

Sent in response to a reassociation request frame. The format of the body is identical to that of an association response frame.

Probe request frame

Sent to ask for additional information about an AP, beyond what it advertises in its beacon, before joining. Contains:

SSID (element #0, 2-34byte).
Supported rates (element #1, 3-10byte).
Request information (element #10, 2-256byte).
Extended supported rates (beyond 8) (element #50, 3-257byte).

Probe response frame

Sent by an AP in response to a probe request frame. The first part of the body is identical to a beacon frame body, except that it never includes the “Traffic indication map” element. After the beacon-like body, including any vendor-specific elements, come specific responses to the “Request information” from the probe request frame.

Authentication frame

An authentication frame is required before associating. Authentication is used on both ad-hoc and AP-managed networks, while one can only associate with an access point. The name “authentication” is a bit of a misnomer, because the only way to actually use this frame for authentication purposes (Shared Key use) is so insecure that it is never used. The normal way (Open System) relies for security on the fact that, whether you've associated or not, you're not going to be able to send or receive data if your encryption key is wrong. In addition, WPA or better networks (RSNs) have a 4-way handshake for initializing security parameters after authentication.

The frame contains:

Authentication algorithm number (field, 16bit): 0 for open system or 1 for shared key.
Authentication transaction sequence number (field, 16bit): starts at 1, increases with each frame; frames sent by node wanting to be authenticated are odd, replies are even.
Status code (field, 16bit): see below. Used only for even sequence numbers (AP replies).
Challenge text (element #16, 3-255byte) (if Shared Key type and sequence number 2 or 3).
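Building the body of an authentication frame from the fields above is short enough to show whole. A sketch with a made-up function name; it builds only the frame body, not the MAC header or FCS.

```python
import struct

AUTH_OPEN_SYSTEM, AUTH_SHARED_KEY = 0, 1
ELEMENT_CHALLENGE_TEXT = 16


def auth_body(algorithm, transaction, status=0, challenge=None):
    """Return the body of an authentication frame."""
    body = struct.pack("<HHH", algorithm, transaction, status)
    if challenge is not None:
        # Element is 3-255 bytes in total, so 1-253 bytes of text.
        if not 1 <= len(challenge) <= 253:
            raise ValueError("challenge text must be 1 to 253 bytes")
        body += bytes([ELEMENT_CHALLENGE_TEXT, len(challenge)]) + challenge
    return body
```

The first frame of an Open System exchange is then `auth_body(AUTH_OPEN_SYSTEM, 1)`, six bytes in all.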

Deauthentication frame

This frame is sent by another node to tell us it needs us to reauthenticate. The only body content is a reason code field (2 bytes).

This frame is commonly used in active attacks, because it's fairly easy to capture the authentication sequence that will follow after a spoofed deauth frame. From there, if the user used a weak passphrase, it's easy enough to crack - and if they used WEP there are statistical attacks as well.

Action frame

This frame is sent to invoke an extension to the set of defined management frames. There is a one-byte Action Category field in the body, followed by a variable-length Action Details field.

Management frame field description

Capability field

The capability field is 16 bits wide and contains information about requested or advertised optional capabilities. The bits are

Bit 0: ESS; bit 1: IBSS. Ad-hoc networks are IBSS, AP-managed are ESS. Exactly one of these bits should be set.
Bit 2: CF-Pollable. Used for high-performance “contention-free” connections, which gPXE will not support. Bit 3 (CF-Poll Request) should in such cases also be 0.
Bit 4: Privacy. Set by the AP if all data on its network is encrypted. Cleared in association request frames we send (it is ignored).
Bit 5: Short Preamble. (PHY-level)
Bit 6: PBCC. (PHY-level)
Bit 7: Channel agility. (PHY-level)
Bit 8: Spectrum management. (PHY-level)
Bit 9: QoS. Indicates support for (duh) QoS.
Bit 10: Short slot time. (PHY-level)
Bit 11: APSD. No idea.
Bit 13: DSSS-OFDM. (PHY-level)
Bit 14: Delayed block ACK. Set by nodes that implement delayed block ACK.
Bit 15: Immediate block ACK. Set by nodes that implement immediate block ACK.
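The bit list above turns into a lookup table easily. A sketch, with a made-up function name, that returns the names of whichever capabilities are set:

```python
CAPABILITY_BITS = {
    0: "ESS", 1: "IBSS", 2: "CF-Pollable", 3: "CF-Poll Request",
    4: "Privacy", 5: "Short Preamble", 6: "PBCC", 7: "Channel agility",
    8: "Spectrum management", 9: "QoS", 10: "Short slot time",
    11: "APSD", 13: "DSSS-OFDM", 14: "Delayed block ACK",
    15: "Immediate block ACK",
}


def decode_capability(value):
    """Return the set of capability names whose bits are set in `value`."""
    return {name for bit, name in CAPABILITY_BITS.items()
            if value & (1 << bit)}
```

For example, 0x0411 decodes to ESS, Privacy and Short slot time.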

Status code field

The status code field is a two-byte integer. It is set to 0 for a successful operation and positive for a failure; the failure modes are enumerated in IEEE 802.11-2007 pp. 94-95.

Plain data elements

The following information elements bear data that can be interpreted as a simple string:

SSID (element #0)
Challenge text (element #16)

Supported Rates element

Each byte of the data represents one supported transfer rate. “Basic” rates have the high bit set to 1 and the low 7 bits equal to the rate in units of 500 kb/sec. “Non-basic” rates have the high bit set to 0 and the low 7 bits taken from a table in the standard; the table values are likewise the rate in units of 500 kb/sec.
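A sketch of decoding the element's data bytes. It assumes the low 7 bits are in units of 500 kb/sec for non-basic rates as well as basic ones, which matches the values usually seen in practice but should be checked against the standard's table; the function name is made up.

```python
def decode_rates(data):
    """Return a list of (rate_in_mbps, is_basic) pairs."""
    rates = []
    for byte in data:
        is_basic = bool(byte & 0x80)          # high bit marks a basic rate
        rates.append(((byte & 0x7F) * 0.5, is_basic))
    return rates
```

For instance, the bytes 0x82 0x84 0x0C 0x6C would decode as 1 and 2 Mbps (basic) plus 6 and 54 Mbps (non-basic).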

And on and on. I didn't feel it was necessary to document each one of these, since the standard is fairly clear on the details.

Security

To be continued…