Michael Decker: Driver Development

Week 8


16 July

This morning I created a new Windows disk image to test AoE with. The last image I made was on a different machine, so that wouldn't work. The AoE driver seemed most sensitive to the change of hardware.

I connected a spare hard disk and CD drive to the test machine. I booted from a Windows XP x64 CD, created a 3GB partition, and installed Windows there (it barely fit!). I then downloaded the AoE driver on my development machine, copied it over via a pen drive, and installed it. Then I rebooted the test machine with an Ubuntu Live CD, mounted an SMB folder from the vblade server, and dd'd the 3GB partition into a file in the SMB folder.

I then reconfigured vblade to use the new image, removed the disc from the test machine, and rebooted it. I watched in amazement as it booted up Windows without a hitch. 8-o

I then recompiled my eepro100 driver, copied it to the tftp path, and rebooted the test machine. Again, it booted up Windows using my latest driver. I ran a disk benchmark and defragged the drive without a hitch.

My two eepro100 cards successfully tested:

  • 82559-based PCI card - Vendor ID: 8086 Device ID: 1229
  • 82558-based PCI card - Vendor ID: 8086 Device ID: 1229

They operate flawlessly when installed alone. With both installed, booting works correctly as long as the first NIC that gPXE tries is connected. If the first NIC times out waiting for link-up, the second NIC then times out during the AoE boot. I'm not convinced this is a problem with my driver.
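Both cards report the same vendor and device ID, so an ID-table match alone cannot tell them apart. A minimal sketch of such a lookup follows. The names (`demo_pci_id`, `demo_ids`, `demo_pci_match`) are hypothetical and are not gPXE's actual PCI structures.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table entry; gPXE's real PCI ID structures differ. */
struct demo_pci_id {
    uint16_t vendor;   /* PCI vendor ID */
    uint16_t device;   /* PCI device ID */
    const char *name;  /* human-readable driver name */
};

/* One entry covers both tested cards, since they share 8086:1229. */
static const struct demo_pci_id demo_ids[] = {
    { 0x8086, 0x1229, "eepro100" },
};

/* Return the matching table entry, or NULL if the device is unsupported. */
static const struct demo_pci_id *demo_pci_match(uint16_t vendor,
                                                uint16_t device)
{
    size_t i;

    for (i = 0; i < sizeof(demo_ids) / sizeof(demo_ids[0]); i++) {
        if (demo_ids[i].vendor == vendor && demo_ids[i].device == device)
            return &demo_ids[i];
    }
    return NULL;
}
```

With a table like this, the 82558 and 82559 cards both resolve to the same entry, so any chip-specific handling would have to rely on something other than the device ID.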

18 July

Had a meeting this morning. Key points:

  • Need to finish up eepro100
  • Start writing Marvell Yukon driver
  • Perform testing with DOS

Marty was using DOS over iSCSI for thorough testing. I already have AoE set up, so he sent over a DOS image and I configured it for AoE.

Initial DOS AoE testing showed a bug: the system reported 'Sector not found' when I attempted to execute a command. After the meeting, however, I have been unable to duplicate this error. Everything appears to work correctly. I tested via scandisk and doom on the DOS image. So... weird. I'll continue to test occasionally to see if it will reproduce.

In the meantime, I added a few updates to eepro100.

It's pretty straightforward. The link-checking is cool; I booted with the network cable unplugged, and once gPXE reported it was waiting for link-up, I connected the cable, and within a second it reported OK and continued booting. I only check the link state every thousand poll()s or so, as mcb30 recommended.
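The throttled link check can be sketched roughly as below. This is an illustration only, not the eepro100 code: the structure, the `LINK_CHECK_INTERVAL` constant, and the `read_link` callback are all hypothetical stand-ins for whatever the driver actually uses to read link status.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical: number of poll() calls between link-state checks. */
#define LINK_CHECK_INTERVAL 1000

struct demo_nic {
    unsigned int polls_since_check;          /* poll() calls since last check */
    bool link_up;                            /* cached link state */
    bool (*read_link)(struct demo_nic *nic); /* slow hardware/PHY read */
    unsigned int link_reads;                 /* count of slow reads made */
};

/* Called on every poll(); touches the hardware only once per interval. */
static void demo_poll(struct demo_nic *nic)
{
    assert(nic != NULL && nic->read_link != NULL);

    if (++nic->polls_since_check >= LINK_CHECK_INTERVAL) {
        nic->polls_since_check = 0;
        nic->link_up = nic->read_link(nic);
        nic->link_reads++;
    }
    /* ... normal RX/TX completion processing would go here ... */
}

/* Stand-in for the hardware: always reports link up. */
static bool demo_read_link_up(struct demo_nic *nic)
{
    (void)nic;
    return true;
}
```

Caching the link state this way keeps the comparatively slow status read out of the hot polling path, at the cost of a short delay before a newly connected cable is noticed.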

As for the Marvell Yukon card, I have an 88E8003 Gigabit card to test with. The Marvell website doesn't provide any developer information on their hardware; it only offers an application form for access to such information, while alluding to an NDA. Marvell doesn't appear to be open-source friendly.

So, I installed the card, booted off an Ubuntu Live CD, and found that the system used the skge driver. I will tear apart skge.c and skge.h and, using the guts of that driver without additional documentation, attempt to gPixify it.

Amazing, week 8 is just about over. Time flies.

19 July

I found a memory-freeing bug in ifec_net_close() of eepro100. It was a small, one-line commit, in which I made a subsequently-corrected mistake:

For the Marvell Yukon, I moved skge.c and skge.h into the source tree and I added skge to errfile.h:

I then removed all the useless code I could find. By remove I mean it's been commented out. At this stage I'm preserving code so I can quickly draw it back in if need be.

I also converted the sk_buff code to use io_buffer, and I converted the memory allocation and freeing routines. The device ID table was adjusted, and the necessary PCI and network driver structures were added. There was a lot of ethtool-related code and some WOL-related code, which I commented out for now.
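For readers unfamiliar with the buffer model, here is a simplified, self-contained sketch of the head/data/tail/end layout that an io_buffer-style packet buffer uses. It is not gPXE's implementation: every `demo_` name is hypothetical, and plain malloc() stands in for the real allocator. The rough correspondence is that Linux calls such as skb_put() and skb_pull() become their iob_put() and iob_pull() counterparts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-in for an io_buffer-style packet buffer. */
struct demo_iobuf {
    unsigned char *head; /* start of allocated storage */
    unsigned char *data; /* start of valid data */
    unsigned char *tail; /* end of valid data */
    unsigned char *end;  /* end of allocated storage */
};

/* Allocate a buffer with len bytes of storage; returns NULL on failure. */
static struct demo_iobuf *demo_alloc_iob(size_t len)
{
    struct demo_iobuf *iob = malloc(sizeof(*iob) + len);

    if (!iob)
        return NULL;
    iob->head = iob->data = iob->tail = (unsigned char *)(iob + 1);
    iob->end = iob->head + len;
    return iob;
}

static void demo_free_iob(struct demo_iobuf *iob)
{
    free(iob);
}

/* Number of valid data bytes currently in the buffer. */
static size_t demo_iob_len(const struct demo_iobuf *iob)
{
    return (size_t)(iob->tail - iob->data);
}

/* Append len bytes at the tail; returns a pointer to the new region. */
static void *demo_iob_put(struct demo_iobuf *iob, size_t len)
{
    unsigned char *old_tail = iob->tail;

    assert(len <= (size_t)(iob->end - iob->tail));
    iob->tail += len;
    return old_tail;
}

/* Strip len bytes from the front; returns the new start of data. */
static void *demo_iob_pull(struct demo_iobuf *iob, size_t len)
{
    assert(len <= demo_iob_len(iob));
    iob->data += len;
    return iob->data;
}
```

A receive path would typically allocate a buffer, "put" the received frame length, and hand it up the stack, which later "pulls" each protocol header off the front.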

My basic methodology was to identify the core routines needed by gPXE, enumerate all the functions they depend on, and then eliminate all other functions. Then I changed the #included headers and ran it through the compiler many times, correcting the lines it flagged. I also kept an eye on what each routine was doing so I could remove or change any obviously unneeded code.

At this point it will compile, and I need to go through and determine what needs to be adjusted to meet gPXE's needs. I also placed a significant number of todo comments listing things I need to double-check and what-not.

Woot!

Driver is now fully operational!

I netbooted a DOS image via AoE and ran scandisk and doom without any sign of trouble, and without any obvious packet errors via wireshark.

How's that for next-day delivery! :)

The driver is still currently cluttered throughout with commented-out original-driver code. Once I get feedback from mdc and others, I'll begin pruning out the excess.

20 July

I decided to jump the gun and prune away on skge.

First, a new branch and a merge of recent updates into the tree:

Most of those cluttering commented lines of old code were removed:

A few functions were revived to support the XM PHY fully:

Additional functions were clipped, and debug statements were placed throughout. All todos were taken care of. Note that I left the EEPROM functions in place, but commented out, in case nvs support for this is added later:

I currently don't foresee additional work on this driver until a code review with mdc.

