Michael Decker: Driver Development

Week 5


23 & 24 June

OK, so I rewrote the tx path again, using the suspend bit as necessary. To give you an idea of what I'm working with, here is a short overview.

The 8255x processes command blocks to perform actions, such as configuration and transmit.

The gPXE driver executes an individual address setup command and then a configure command in ifec_net_open(). In addition, a ring of transmit control blocks (TCB) are initialized. The configure command's link member points to the first TCB in the list, and the suspend bit is set. When ifec_net_transmit() is called, the next TCB in the ring is configured to transmit and suspend, then the card is issued a resume command.

The card will fetch the next command block, the address of which was cached from the link field of the previous command block. In our case, the link fields never change, and they form the ring of TCBs. The transmit command will be processed, and the card will suspend.

If multiple ifec_net_transmit()s are issued quickly, they may form a chain of TCBs without intermediate suspends occuring. This is enabled by clearing the suspend bit in the previous TCB when preparing a TCB for transmission.

The code to do this is rather straight-forward, so why did it take two days to rewrite this? In a word, bugs.

There were a few mistakes in my code, such as calls ifec_scb_cmd_wait ( ioaddr ) instead of ifec_scb_cmd_wait ( ioaddr + SCBCmd ), or assigning tcb→link = ptr instead of tcb→link = virt_to_bus ( ptr ). However, there was a momma-jomma bug that eluded me for some time. This bug caused the machine to triple fault or simply freeze at seemingly random moments, although the moment was the same for any given compile.

The bug was located in a loop in ifec_net_poll():

	/* Check status of transmitted packets */
	while ( ( status = tcb->status ) && tcb->iob ) {
		if ( status & TCB_U ) {
			netdev_tx_complete_err ( netdev, tcb->iob, -ENOMEM );
		} else {
			netdev_tx_complete ( netdev, tcb->iob );
		}
		DBGIO ( "tx completion\n" );

		tcb->iob = NULL;
		tcb->status = 0;
		tcb->command &= ~CmdSuspend;	/* Allow controller to resume. */

		tcb = a->tcb_tail = tcb->next;	/* Next TCB */
	}

The above is the proper code. The bug was one line, line 453 in this commit. The line free_iob ( tcb→iob ); is not only unnecessary, it is wrong. The netdev_tx_complete() functions free this io_buffer.

After finally fixing this bug, I tested it out more and it was working just fine. Except, a few times after it successfully downloaded the kernel and initrd over http, the system froze. I haven't been able to duplicate this behavior since; perhaps it's the same bug that DrV encountered.

25 June

Marty spared a few hours today and performed some code inspection on my driver. The bug mentioned at the end of the previous journal entry continued to intermittently occur, so we prepared to debug the beast. We scrutinized ifec_net_close(), as the bug occurs after downloading everything, just prior to booting the kernel. This would be when this function is called.

/*
 * Close a network device.
 *
 * @v netdev		Device to close.
 *
 * This is a gPXE Network Device Driver API function.
 */
static void ifec_net_close ( struct net_device *netdev ) {
	struct ifec_private *priv   = netdev->priv;
	struct ifec_active  *a      = priv->active;
	unsigned long        ioaddr = priv->ioaddr;
	unsigned short       intr_status;
	int i;
	
	ifec_net_irq ( netdev, 0 );                     /* Mask ints */

	intr_status = inw ( ioaddr + SCBStatus );       /* Ack & clear ints */
	outw ( intr_status, ioaddr + SCBStatus );
	inw ( ioaddr + SCBStatus );











	ifec_reset ( ioaddr );

	/* Free any necessary resources */
	for ( i = 0; i < RFD_COUNT; i++ )
		if ( priv->rx_iobs[i] ) free_iob ( priv->rx_iobs[i] );
	free ( a );
}

This is the correct code. The offending lines were lines 241 & 242 in this commit.

 241         for ( i = 0; i < TCB_COUNT; i++ )
 242                 if ( a->tcbs[i].iob ) free_iob ( a->tcbs[i].iob );

Look at that, more free_iob calls on tx io_buffers! I erroneously assumed the driver owns the io_buffer received in ifec_net_transmit(). I reasoned, if the driver has not relinquished the io_buffers via netdev_tx_complete() in ifec_net_poll(), then the driver is responsible for freeing these buffers. You know what they say about assumptions.

Marty showed that netdevice.c tracks all issued io_buffers, and will free those not-yet-completed after closing a net_device. The removal of the offending lines from my driver placed it into a bug-free state.

During code inspection, Marty suggested key points for more comment documentation. Additionally, I mentioned I may modify the rx ring traversal code slightly, in an effort to clean it up and slightly reduce code size. These changes should be completed by tomorrow.

26 June

The tx io_buffer free bug was removed. Additional comments were added. Some straggling printfs were converted to DBGs at Marty's suggestion. Some extraneous DBGs were removed. Some testing was performed. A commit was made.

More on the way. 8-)

All data members of ifec_private supporting the rx ring were moved into ifec_active. ifec_active is allocated in ifec_net_open() and freed in ifec_net_close(), so the driver's footprint is minimized while not open.

I also converted the IAS command block allocation in ifec_net_open() to use malloc_dma() instead of malloc(). The card requires command blocks to be aligned on even physical addresses. Any alignment provided via gcc will affect virtual addresses. Although the functionality of page-mapping hardware dictates the lower bits of a virtual address will correspond to a physical address, I'm unsure of how much of an assumption the driver should make as to the underlying hardware. malloc_dma() provides an interface that guarantees aligned physical addresses, so this is the best method of allocation.

The second commit corrects a small typo in ifec_net_open(), caused by some temporary changes to the ias allocation. The destination of the link field shouldn't be followed, since the card is ordered to suspend, and is later followed by the issuance of a start command. However, the IAS command is a little touchy in how it is performed, so to be safe it links to itself.

The EEPROM caching sequence in ifec_pci_probe() was moved into a separate routine, ifec_eeprom_cache(). Additionally eeprom was changed from a static array member of ifec_private to a pointer. The cache is allocated in ifec_eeprom_cache() and is later freed after its usage in ifec_net_open()'s call to ifec_mdio_setup(). If the pointer is not freed, e.g. an error occurs or ifec_net_open() is never called, then ifec_pci_remove() will free the pointer.

I also changed ifec_scb_cmd_wait() to not be inline. No reason for it.

Previously, most of the non-API functions took ioaddr as a parameter. Marty suggested I change them to take a net_device *netdev as did the API functions, so all the function prototypes will be consistent. This should make it easier for newbies browsing the code to figure things out. Only one function, ifec_rfd_init(), doesn't take a netdev. This function only encapsulates a data structure filling sequence.

Tomorrow we shall see what's next!

27 June

Had the weekly meeting today. Marty suggested reworking my error handling technique. He also pointed me to this Linux coding style guide. So the code will be getting a style upgrade ;)

28 & 29 June

Formatting changes were made throughout the code to conform with the aforementioned style guide. Multiple variable declarations on a single line were separated. Multiple assignments on a line were separated. if-statements on a single line were split. Large comments were moved to function comments or to the header.

In ifec_net_open(), the struct ifec_cfg code was redundant. The struct's data declaration and the run-time modification were not necessary. I moved most of the assignments into the data declaration for the struct. Also, Individual Address Setup (ias) command references were coalesced into one sequence for code clarity.

In ifec_net_poll(), tx processing and rx processing were separated into ifec_tx_process() and ifec_rx_process(). In ifec_net_transmit(), the switch statement responsible for issuing resume commands as necessary was moved into ifec_tx_wake().

Also, a pointer check was added in ifec_net_close() and error handline in ifec_net_open() was cleaned up.

The code segment at the beginning of ifec_net_open() is currently commented out. I don't believe this is needed, as it was added while trying to debug earlier. I will do more thorough testing before removing it completely.