How USB works

The USB protocol

USB is a polled bus: devices cannot initiate any communication with the host, not even interrupts. A device responds only when the host asks it to. There are four types of transfer a device can use, listed here in the order in which they are handled.

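Isochronous Transfer - Meant for streaming devices such as video cameras. Delivery is unreliable: failed packets are not retried.
Interrupt Transfer - Short, low-latency, reliable delivery.
Control Transfer - Meant for configuring the device, etc.
Bulk Transfer - Large, reliable transfers with no timing guarantee; they use whatever bus time is left over.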

We will mostly be dealing with Bulk Transfers, as USB NICs use that method to carry packet data.

USB Device Addressing

Every device on a USB bus is given a device address during configuration (the time when it is detected as present on the bus). A USB device is made up of a certain number of endpoints. An endpoint is a simplex communication channel that acts as either a source or a sink of data. It is a uniquely addressable portion of a USB device. An endpoint is associated with a direction and a type of data transfer. The tuple {device address, endpoint number, direction} uniquely identifies an endpoint. Endpoints are configured during device probe and have certain characteristics, such as MaxPacketSize, associated with them.
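The endpoint tuple can be sketched in a few lines of Python. The sketch assumes the standard endpoint descriptor encoding, in which bit 7 of bEndpointAddress gives the direction and the low four bits give the endpoint number. The names EndpointId and parse_endpoint_address are made up for illustration.

```python
from typing import NamedTuple

USB_DIR_IN = 0x80  # bit 7 of bEndpointAddress: set means IN (device to host)


class EndpointId(NamedTuple):
    """Hypothetical holder for the {device address, endpoint number, direction} tuple."""
    device_address: int   # assigned by the host during configuration
    endpoint_number: int  # 0..15
    is_in: bool           # True for IN (device to host), False for OUT


def parse_endpoint_address(device_address: int, b_endpoint_address: int) -> EndpointId:
    """Split a bEndpointAddress byte into endpoint number and direction."""
    number = b_endpoint_address & 0x0F
    is_in = bool(b_endpoint_address & USB_DIR_IN)
    return EndpointId(device_address, number, is_in)
```

Note that endpoint 1 IN and endpoint 1 OUT on the same device are two different endpoints, which is why the direction is part of the tuple.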

USB NICs are expected to have at least one Bulk IN endpoint and a Bulk OUT endpoint in addition to the Control Endpoint (Number 0) which every device is expected to provide as a basic requirement.
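As a rough illustration, a driver could pick the Bulk IN and Bulk OUT endpoints out of the endpoint descriptors it read during probe. The sketch below assumes the standard descriptor fields (transfer type in the low two bits of bmAttributes, direction in bit 7 of bEndpointAddress). EndpointDescriptor and find_bulk_pair are hypothetical names, not part of any real driver.

```python
from typing import NamedTuple, Sequence, Tuple

# Transfer type codes stored in bits 1..0 of bmAttributes.
XFER_CONTROL, XFER_ISOCHRONOUS, XFER_BULK, XFER_INTERRUPT = 0, 1, 2, 3


class EndpointDescriptor(NamedTuple):
    b_endpoint_address: int  # bit 7 = direction, bits 3..0 = endpoint number
    bm_attributes: int       # bits 1..0 = transfer type
    w_max_packet_size: int   # MaxPacketSize of the endpoint


def find_bulk_pair(
    endpoints: Sequence[EndpointDescriptor],
) -> Tuple[EndpointDescriptor, EndpointDescriptor]:
    """Return (bulk_in, bulk_out), taking the first endpoint of each kind."""
    bulk_in = None
    bulk_out = None
    for ep in endpoints:
        if ep.bm_attributes & 0x03 != XFER_BULK:
            continue  # skip interrupt, isochronous and control endpoints
        if ep.b_endpoint_address & 0x80:
            if bulk_in is None:
                bulk_in = ep
        elif bulk_out is None:
            bulk_out = ep
    if bulk_in is None or bulk_out is None:
        raise ValueError("device lacks a Bulk IN / Bulk OUT endpoint pair")
    return bulk_in, bulk_out
```

The control endpoint (number 0) does not appear in the list because it is always present and needs no descriptor of its own.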

USB software lives in two places: on the PC (called the host) and on the peripheral (called the device). We will mainly be dealing with the host side in this document.

The host side of USB mainly involves a piece of hardware known as a Host Controller. It is the host controller that sends out bits on the serial bus in the format specified by the USB Specification. Host controller specifications are not covered by the USB specification. Their implementation is left to hardware vendors, who in turn create specifications for their own products. Examples of Host Controllers are UHCI, EHCI, and OHCI, along with other, much less well-known ones.

We'll first look at the UHCI host controller. It is a simple host controller supporting only Low Speed (1.5 Mbps) and Full Speed (12 Mbps) devices (the other speed being High Speed, operating at 480 Mbps). On a PC, it is usually available as a PCI device. The UHCI documentation lists all registers used by the card.

Now, let's see how a Host Controller works. For now, I'll be dealing with the UHCI host controller only. As an example, let's say our job is to tell the host controller to transfer certain data from a memory location to some USB address on the bus. Here's how we do it with the UHCI host controller.

Working of the UHCI host controller.

( Works with Full Speed (12 Mbps) and Low Speed (1.5 Mbps) only )

Telling the host controller what to transfer, who to transfer it to, and in what direction is done using a data structure called a TD (Transfer Descriptor). The structure of a TD can be read off the schedule shown below in Fig. 4. A TD points to other TDs and QHs (described ahead).

As seen above, a host controller maintains a list of “frames” in system memory, whose address is held in a register called the “Frame List Base Address register”. The list of frames occupies one page (4K) of memory. Each frame entry is 32 bits wide, allowing for 1024 frame entries.
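The arithmetic is easy to check: 1024 entries of 4 bytes each fill exactly 4096 bytes. The sketch below builds such a page in Python. It assumes the frame entry layout as I read it in the UHCI documentation (16-byte aligned pointer in the upper bits, bit 1 marking a QH, bit 0 marking an empty entry), so verify the bit positions before relying on them. The function names are hypothetical.

```python
import struct
from typing import Optional

FRAME_LIST_ENTRIES = 1024
LP_TERMINATE = 0x1  # assumed T bit: entry points at nothing
LP_QH = 0x2         # assumed Q bit: entry points at a QH rather than a TD


def frame_pointer(addr: Optional[int] = None, is_qh: bool = False) -> int:
    """Encode one 32-bit frame list entry."""
    if addr is None:
        return LP_TERMINATE
    if addr & 0xF or addr >> 32:
        raise ValueError("address must be 16-byte aligned and fit in 32 bits")
    return addr | (LP_QH if is_qh else 0)


def build_frame_list(entry: int) -> bytes:
    """Return a 4K frame list in which every frame holds the same entry."""
    return struct.pack("<%dI" % FRAME_LIST_ENTRIES, *([entry] * FRAME_LIST_ENTRIES))
```

For example, build_frame_list(frame_pointer()) gives a page of empty frames, which is a reasonable starting state before anything is scheduled.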

The USB specification defines the idea of a frame. A frame is a one millisecond time window on the bus which can be used to transfer data. Another register called “Frame Counter” indexes into the list of frames to select a particular frame for transfer. In RAM, a frame is a data structure (32 bits long) which points to a list of TDs (Transfer Descriptors - which contain what, whom, and in which direction to transfer data).
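How the counter selects a frame can be written out as a small calculation. This is only a sketch: it assumes the counter simply wraps around the 1024 entries, and frame_entry_address is a made-up helper name.

```python
FRAME_LIST_ENTRIES = 1024
FRAME_ENTRY_SIZE = 4  # each frame entry is 32 bits


def frame_entry_address(frame_list_base: int, frame_counter: int) -> int:
    """Physical address of the frame entry selected by the frame counter."""
    index = frame_counter % FRAME_LIST_ENTRIES  # counter wraps around the list
    return frame_list_base + index * FRAME_ENTRY_SIZE
```

With one frame per millisecond, the same frame entry is visited again every 1024 ms.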

The Queue Head (QH) remains to be explained. It is a data structure that exists just to organise TDs. It looks like this. It contains fields which point to a QH or a TD.
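In memory, a QH boils down to two 32-bit link fields. The sketch below packs one, assuming the layout I understand from the UHCI documentation: the first field links to the next QH or TD in the schedule, the second to the first element queued under this QH, with bit 1 marking a QH and bit 0 marking "no link". link_pointer and pack_qh are hypothetical helpers.

```python
import struct
from typing import Optional

LP_TERMINATE = 0x1  # assumed T bit: this link points at nothing
LP_QH = 0x2         # assumed Q bit: this link points at a QH, not a TD


def link_pointer(addr: Optional[int], is_qh: bool = False) -> int:
    """Encode a link field; None means the link is terminated."""
    if addr is None:
        return LP_TERMINATE
    if addr % 16:
        raise ValueError("QHs and TDs must be 16-byte aligned")
    return addr | (LP_QH if is_qh else 0)


def pack_qh(next_addr: Optional[int], next_is_qh: bool,
            element_addr: Optional[int], element_is_qh: bool = False) -> bytes:
    """Pack a QH as two little-endian 32-bit words: head link, element link."""
    return struct.pack("<II",
                       link_pointer(next_addr, next_is_qh),
                       link_pointer(element_addr, element_is_qh))
```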

Finally, now that we know how TDs and QHs are arranged in memory, we can describe how the Host Controller processes them. The Frame Counter register is incremented every millisecond. It indexes into the Frame List and selects a frame. The TDs reachable from that frame are executed one by one, and the status of each transfer is written back into the TD itself. A TD can optionally be marked to generate an Interrupt on Completion (IOC). This goes on and on. When interrupts are not available for use, we can poll the status of the TDs to recognize their completion.
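The processing loop can be mimicked with a toy model. This is not how the hardware is implemented: it leaves out QHs, bus timing and errors, and only shows the frame counter selecting a frame, TDs being executed in order, and status being written back. All class and field names are invented for the sketch.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TD:
    name: str
    active: bool = True          # set by software, cleared by the controller
    ioc: bool = False            # Interrupt on Completion requested
    next: Optional["TD"] = None  # link to the next TD in the frame


class ToyController:
    def __init__(self, frames: int = 1024) -> None:
        self.frame_list: List[Optional[TD]] = [None] * frames
        self.frame_counter = 0

    def run_frame(self) -> List[str]:
        """Process one frame; return the names of TDs that asked for an interrupt."""
        interrupts = []
        td = self.frame_list[self.frame_counter % len(self.frame_list)]
        while td is not None:
            if td.active:
                td.active = False  # status is marked in the TD itself
                if td.ioc:
                    interrupts.append(td.name)
            td = td.next
        self.frame_counter += 1  # happens once per millisecond on real hardware
        return interrupts
```

A driver without interrupts would ignore the returned list and simply check each TD's active flag from its poll routine.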

How do we use this in a USB NIC

Transmitting Packets

Suppose we are given a buffer of a certain length to transmit. The first step is creating TDs for that buffer. Each TD can transfer at most the maximum packet size defined by the endpoint. Hence, we need to split up our buffer into several TDs. The structure of a TD is shown above. The 'Buffer Pointer' of each TD is set to the start of that TD's portion of the buffer, and the length field is set to the maximum packet size, except in the last TD, which carries whatever remains. The 'Link pointer' field of each TD is set to point to the next TD. A dummy TD is added at the end to mark the end of the TD chain.
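The splitting step can be sketched as follows. split_into_tds only computes the buffer pointer and length for each TD; the dummy terminating TD and the link pointers are left out. td_token shows how the length might end up in the TD, assuming the token layout I recall from the UHCI documentation (length stored as n-1 in bits 31..21, data toggle in bit 19, endpoint in bits 18..15, device address in bits 14..8, packet ID in bits 7..0). Both function names are hypothetical, and the bit positions should be checked against the specification.

```python
from typing import List, Tuple

PID_OUT = 0xE1  # packet ID for host-to-device data


def split_into_tds(buffer_addr: int, length: int,
                   max_packet_size: int) -> List[Tuple[int, int]]:
    """Return (buffer pointer, length) for each TD needed to cover the buffer."""
    tds = []
    offset = 0
    while offset < length:
        chunk = min(max_packet_size, length - offset)  # last TD may be short
        tds.append((buffer_addr + offset, chunk))
        offset += chunk
    return tds


def td_token(pid: int, device_address: int, endpoint: int,
             toggle: int, length: int) -> int:
    """Encode a TD token word; a length of 0 is stored as 0x7FF."""
    max_len = (length - 1) & 0x7FF
    return ((max_len << 21) | (toggle << 19) | (endpoint << 15)
            | (device_address << 8) | pid)
```

For a 150-byte packet on an endpoint with a 64-byte MaxPacketSize this yields three TDs of 64, 64 and 22 bytes. The data toggle would alternate between 0 and 1 from one TD to the next.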

Now we have a chain of TDs which must be introduced into the schedule as shown in the figure above, i.e. the 'Queue Element pointer' of some QH must point to the beginning of our TD chain. Finding this QH is simple if we make it a rule that all TDs of a given endpoint are anchored to one particular QH that is unique to that endpoint.
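The "one QH per endpoint" rule amounts to a lookup table keyed by the endpoint tuple. A minimal sketch, with invented names and with TDs represented as plain strings:

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

EndpointKey = Tuple[int, int, bool]  # (device address, endpoint number, is_in)


@dataclass(eq=False)
class QH:
    tds: List[str] = field(default_factory=list)  # stand-in for the TD chain


class EndpointQueues:
    """Hands out exactly one QH per endpoint, creating it on first use."""

    def __init__(self) -> None:
        self._qhs: Dict[EndpointKey, QH] = {}

    def qh_for(self, device_address: int, endpoint_number: int,
               is_in: bool) -> Tuple[QH, bool]:
        """Return (qh, created); created is True the first time an endpoint is seen."""
        key = (device_address, endpoint_number, is_in)
        created = key not in self._qhs
        if created:
            self._qhs[key] = QH()
        return self._qhs[key], created
```

The created flag tells the caller that the QH is new and still has to be linked into the schedule.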

Now, this QH has to be introduced into the schedule, i.e. it must be linked to the Frame List as shown above. The approach used by the Linux kernel solves this problem elegantly. It creates a skeleton of QHs for the various types of transfers, the addresses of which can be stored in the Host Controller descriptor data structure.

To summarize, when we are asked to transmit data, we create a chain of TDs describing the data. From the network device descriptor we reach the USB device descriptor, which contains the endpoint numbers of the transmitting and receiving endpoints. With these we can obtain the QH to which the TD chain should be anchored. We introduce the QH into the schedule if that has not been done already (this happens when the endpoint is being used for the first time). This requires that we know the address of the appropriate skeleton QH, which can be obtained from the Host Controller descriptor, whose address is given by the USB device descriptor of our device (which is known at device configuration time).

Skeleton QHs

As mentioned above, a skeleton QH is created for each type of transfer. Their addresses are put in the appropriate entries of the Frame List, which is just a page (4K) of memory. Once this is done, we put the address of this page into the 'Frame List Base Address register', which is described above.
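A simplified skeleton might look like the sketch below. It is loosely modelled on the idea described here, not on the actual Linux code: every frame points at an interrupt skeleton QH, which links to a control skeleton QH, which links to a bulk skeleton QH, and endpoint QHs are hooked in behind the matching skeleton QH. Isochronous transfers are left out, and all names are made up.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(eq=False)
class QH:
    name: str
    link: Optional["QH"] = None  # next QH in the schedule


def build_skeleton(frames: int = 1024) -> Tuple[List[QH], Dict[str, QH]]:
    """Create skeleton QHs and a frame list in which every frame reaches all of them."""
    bulk = QH("skel_bulk")
    control = QH("skel_control", link=bulk)
    interrupt = QH("skel_interrupt", link=control)
    skeleton = {"interrupt": interrupt, "control": control, "bulk": bulk}
    return [interrupt] * frames, skeleton


def insert_after(skeleton_qh: QH, qh: QH) -> None:
    """Link an endpoint QH into the schedule right behind a skeleton QH."""
    qh.link = skeleton_qh.link
    skeleton_qh.link = qh


def walk(frame_entry: Optional[QH]) -> List[str]:
    """List the QH names the controller would visit starting from one frame."""
    names = []
    qh = frame_entry
    while qh is not None:
        names.append(qh.name)
        qh = qh.link
    return names
```

Because every frame shares the same chain, inserting an endpoint QH once makes it visible in all 1024 frames.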

Receiving Packets

First, we allocate empty buffers for incoming packets. Next, we need to instruct the device to put data into those buffers. To do this, we create a chain of TDs in a way similar to the method used for transmitting packets. The direction of the transfer follows from the endpoint we address the TDs to, since each endpoint is a simplex channel; TDs for a Bulk IN endpoint request data from the device. Once our TDs have been 'executed' in the schedule, the buffers they point to will contain the received data.

When the device is initialized, we create many empty buffers and ask the USB controller to insert them into the schedule. In the poll method, we check for filled buffers. We hand received packets off to the upper layers and add fresh buffers to the schedule for receiving more data. This process continues as long as the device is open.
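The receive side can be pictured as a ring of buffers that the poll method drains and refills. In this sketch the 'done' flag and 'actual_length' stand in for the status the host controller would write back into the TDs. RxBuffer, RxRing and the deliver callback are hypothetical names.

```python
from collections import deque
from typing import Callable, Deque


class RxBuffer:
    def __init__(self, size: int) -> None:
        self.data = bytearray(size)
        self.done = False        # would be derived from the TD status
        self.actual_length = 0   # bytes actually received


class RxRing:
    def __init__(self, count: int, size: int) -> None:
        # Created at device initialization and handed to the schedule.
        self.pending: Deque[RxBuffer] = deque(RxBuffer(size) for _ in range(count))

    def poll(self, deliver: Callable[[bytes], None]) -> int:
        """Hand filled buffers to the upper layer and queue a fresh one for each."""
        delivered = 0
        while self.pending and self.pending[0].done:
            buf = self.pending.popleft()
            deliver(bytes(buf.data[:buf.actual_length]))
            self.pending.append(RxBuffer(len(buf.data)))  # keep the ring full
            delivered += 1
        return delivered
```

The loop stops at the first buffer that is still waiting for data, so packets are delivered in the order they arrived.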

Also, in the poll method, we handle completion of tx packets by signalling (un)successful transmission of data to the upper layers.
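Checking whether a transmit TD has completed comes down to reading its status word. The sketch below assumes the control/status layout as I recall it from the UHCI documentation: bit 23 is the Active flag, bits 22, 21, 20, 18 and 17 flag errors, and bits 10..0 hold the actual length stored as n-1. Treat these positions as assumptions to verify; td_result is a made-up helper.

```python
from typing import Tuple

TD_ACTIVE = 1 << 23
# Assumed error bits: stalled, data buffer error, babble, CRC/timeout, bitstuff.
TD_ERROR_MASK = (1 << 22) | (1 << 21) | (1 << 20) | (1 << 18) | (1 << 17)
TD_ACTLEN_MASK = 0x7FF


def td_result(control_status: int) -> Tuple[str, int]:
    """Classify a TD as 'pending', 'ok' or 'error' and report bytes transferred."""
    if control_status & TD_ACTIVE:
        return ("pending", 0)  # the controller has not finished with this TD
    actual = ((control_status & TD_ACTLEN_MASK) + 1) & 0x7FF  # stored as n-1
    if control_status & TD_ERROR_MASK:
        return ("error", actual)
    return ("ok", actual)
```

The poll method would call something like this on the last TD of each transmitted packet and report success or failure to the upper layers accordingly.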
