Oh no! I need to write a PCI driver!

Step by step debug guide for PCIe device drivers

The despair is real when you have to develop a PCI driver with no experience in driver development. This is a step-by-step guide for those who are stuck among memory spaces and system calls.

Because driver development is hard, a driver shouldn’t be written from scratch. Download one or more example projects from your vendors and cherry-pick the parts you need. But what if an error comes up while merging the example drivers? This guide helps you locate the bug in the code, step by step. It will lead you from “It’s not working” to “It’s not working because …”.

Start with a working example

Driver development requires broad knowledge of both software and hardware.

Yes, it requires hardware knowledge. Drivers sit very close to the hardware, so you must know something about it. More precisely, you need to know the endpoint’s registers and its behavior, and you should know something about the physical interface standard (PCIe).

Linux drivers are written in C. Hard, native C. You will become a C expert during driver development: functions, macros, pointers, structures, unions, special data types, etc.

Drivers are plugins (i.e., modules) of the operating system, so driver development requires a deep understanding of the operating system. You will use kernel APIs, virtual and physical memory operations, address translation, and memory spaces.

Drivers are concurrent. A driver is the glue between the threads of multiple user-space applications and the asynchronous hardware. You have to be clear about which data structures are accessible locally and which globally, and you have to prevent incoherent access to data using mutexes and other locks. Thread safety is a challenging topic on its own. Common problems are deadlocks, data loss, or a crash of the whole system.

After all of these daunting facts, I recommend starting the development from a working example. Buy a dev board, download the example driver code from the vendor’s site, and iterate from that driver.

Sometimes you just can’t find anything relevant, or something just doesn’t work. The following chapters guide you through the basic steps to find and pinpoint errors in the driver, which I hope will save you hours or days of headache.

Step by step debug guide

OK, I connected my boards, loaded the driver, and it does nothing. What should I do? Help! Here are the necessary steps:

  1. Is the device visible in lspci? (Link-up — lspci)
  2. Has the driver compiled successfully? (Build the driver)
  3. Has the driver insertion completed faultlessly? (Driver insertion)
  4. Has the probing passed successfully? (Probing)
  5. Do file operations work? (File operations)
  6. Does your driver access the required registers? (Control the hardware)

I recommend checking these points in order whenever an error occurs. Make sure each point is OK before you step to the next one.

Link-up — lspci

The first question is: Is your PCI link up? Has enumeration completed successfully? Check it with lspci. lspci is your best friend. It must be clear: your driver won’t be bound to the device if the device is not listed by lspci (i.e., not visible to Linux).

Running lspci with no arguments lists all bridges and endpoints. Note: an embedded Linux system may have no PCI devices at all (lspci won’t print anything), while a desktop Linux system has ~10 (or more) PCI devices (SATA controller, USB controller, video card, audio card, etc.). Find your device in the list. How? Let’s see an example:

user@mypc:~$ lspci
0001:00:00.0 PCI bridge: NVIDIA Corporation Device 1ad2 (rev a1)
0001:01:00.0 SATA controller: Marvell Technology Group Ltd. Device 9171 (rev 13)

I suggest using it with the -vmm switches:

user@mypc:~$ lspci -vmm
Slot: 0001:00:00.0
Class: PCI bridge
Vendor: NVIDIA Corporation
Device: Device 1ad2
Rev: a1

Slot: 0001:01:00.0
Class: SATA controller
Vendor: Marvell Technology Group Ltd.
Device: Device 9171
SVendor: Marvell Technology Group Ltd.
SDevice: Device 9171
Rev: 13
ProgIf: 01

The essential parts are the Vendor and the Device. Your PCI endpoint should have a fixed Vendor ID and Device ID pair. Look it up in your endpoint device’s documentation. If your endpoint is an FPGA design, the FPGA IP generator tool should have an option to set it. Here is an example of setting the IDs in Xilinx’s XDMA IP:

[Figure: Set Device and Vendor IDs in Xilinx’s XDMA IP]

By default, lspci prints a human-readable name for the vendor instead of its hex ID. Use the https://www.pcilookup.com/ page to see which hex ID belongs to which vendor, or use the -n argument of lspci:

user@mypc:~$ lspci -vmn
Device: 0001:00:00.0
Class: 0604
Vendor: 10de
Device: 1ad2
Rev: a1

Device: 0001:01:00.0
Class: 0106
Vendor: 1b4b
Device: 9171
SVendor: 1b4b
SDevice: 9171
Rev: 13
ProgIf: 01

If you don’t see your device here (i.e., the desired vendor and device ID pair), unfortunately, you have to dive into hardware debugging...
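The same check can be scripted. Below is a minimal user-space sketch, assuming the usual sysfs layout in which each PCI device has a directory /sys/bus/pci/devices/<slot>/ containing `vendor` and `device` files with hex text such as `0x10de`. The function names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Parse the text of a sysfs "vendor" or "device" file (e.g. "0x10de\n").
 * Returns the ID, or -1 if the text is not a 16-bit hex number. */
long parse_pci_id(const char *text)
{
    char *end;
    unsigned long id;

    assert(text != NULL);
    id = strtoul(text, &end, 16);   /* base 16 accepts an optional "0x" prefix */
    if (end == text || id > 0xffff)
        return -1;
    return (long)id;
}

/* Read one ID file of the device in the given slot,
 * e.g. /sys/bus/pci/devices/0001:01:00.0/vendor.
 * Returns the ID, or -1 if the file is missing or unreadable. */
long read_pci_id(const char *slot, const char *file)
{
    char path[256], line[32];
    FILE *f;
    long id = -1;

    snprintf(path, sizeof(path), "/sys/bus/pci/devices/%s/%s", slot, file);
    f = fopen(path, "r");
    if (!f)
        return -1;
    if (fgets(line, sizeof(line), f))
        id = parse_pci_id(line);
    fclose(f);
    return id;
}

/* Returns 1 if the device in the slot reports the expected ID pair, 0 otherwise. */
int pci_id_matches(const char *slot, unsigned vendor, unsigned device)
{
    return read_pci_id(slot, "vendor") == (long)vendor &&
           read_pci_id(slot, "device") == (long)device;
}
```

For the SATA controller in the listing above, the call would be pci_id_matches("0001:01:00.0", 0x1b4b, 0x9171).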

Debug tips:
Note that if your system does not support PCI hotplugging, your PCI endpoint should be up and running before the host (i.e., the Linux machine) is powered on. Power on the endpoint first, and then power on the host.

Check the connections! A PCIe link requires N lanes (N TX and N RX pairs) and a reference clock on both sides. Some devices also require the PERSTN reset pin. But first of all, read your device’s documentation about the required PCI pins.

Try to read and dump the endpoint’s internal registers. Check whether it has locked to the reference clock, whether the PCI core is held in reset, etc.

Build the driver

I want to encourage you again to start from an example driver. Open your favorite search engine and find a driver for your device or a similar one. I recommend starting with your host’s examples. For instance, if your host Linux runs on an NXP ARM processor and your endpoint is a Xilinx FPGA, start from an NXP driver, then merge it with your FPGA vendor’s driver. The build will be simpler this way: the NXP example driver will compile on the NXP device successfully, or at least with less pain than a Xilinx example driver.

So let’s download your driver and try to build it. Note that building a driver requires the kernel headers and slightly different settings than a user-space program. (A plain gcc -o mydriver won’t succeed.) The example driver should contain a makefile for building it. If you are interested in more details of driver building, please check out this article.

Driver insertion

After a successful build, you should see a *.ko file (kernel object) in the build folder: your fresh driver. You can insert it with the insmod command (this requires root privileges).

user@mypc:~$ insmod mydriver.ko

Then you should see it among the loaded modules. List them with lsmod:

user@mypc:~$ lsmod
Module Size Used by
mydriver 28503 0
fuse 103841 3
nvgpu 1579891 21 picoevb_rdma
ip_tables 19441 0
x_tables 28951 1 ip_tables

If your driver is not listed here, the module insertion has failed. Note that the driver should be listed here even if no related device is found in the system (i.e., when lspci lists no relevant device). Note also that “Used by” counts the applications that have opened the device, not the number of handled devices. (So “Used by” will be 0 even after your device is visible in lspci.)
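If you want to do this check from a test program instead of by eye, here is a minimal sketch. It assumes that the module list is available in /proc/modules (the file lsmod reads) and that the module name is the first word of each line. The function names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Scan text in /proc/modules format: one module per line, name first.
 * Returns 1 if a module with exactly this name is listed, 0 otherwise. */
int module_listed(FILE *modules, const char *name)
{
    char line[512], first[128];

    assert(modules != NULL && name != NULL);
    while (fgets(line, sizeof(line), modules)) {
        if (sscanf(line, "%127s", first) == 1 && strcmp(first, name) == 0)
            return 1;
    }
    return 0;
}

/* Returns 1 if the module is loaded, 0 if not, -1 if /proc/modules is unreadable. */
int module_loaded(const char *name)
{
    FILE *f = fopen("/proc/modules", "r");
    int found;

    if (!f)
        return -1;
    found = module_listed(f, name);
    fclose(f);
    return found;
}
```

Calling module_loaded("mydriver") right after insmod gives the same answer as looking for the mydriver line in the lsmod output.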

Debug tips:
If you cannot see your driver, let’s debug it. The main entry point of a driver is the function passed to the module_init() macro. This function is called when the module is inserted with insmod, and it must return 0 on success.

From this point, printk() and dmesg will be your first debug tools. Place a printk() on the first line and at every return point of the init function. Watch your prints in another window using dmesg -w, which prints new messages as they arrive.

Some errors crash the whole system, so you can’t see all the prints in dmesg. In this case, check the /var/log/ directory for the logs of the crashed system.

For more details, read the Building and Running Modules chapter of LDD (Linux Device Drivers).

Probing

Once the module insertion has finished successfully, probing starts. From kernel.org:

Probing function gets called (during execution of pci_register_driver() for already existing devices or later if a new device gets inserted) for all PCI devices which match the ID table and are not “owned” by the other drivers yet. This function gets passed a “struct pci_dev *” for each device whose entry in the ID table matches the device. The probe function returns zero when the driver chooses to take “ownership” of the device or an error code (negative number) otherwise. The probe function always gets called from process context, so it can sleep.

The probing function should be registered in the module’s init function by the pci_register_driver() call. It takes a pci_driver structure, which holds all the driver operations, including the probing function. The probing function is called when a relevant device is available in the system. Note that this is where the two threads of this guide meet: the module can be inserted even if our PCI device is not visible in lspci, but probing requires a relevant PCI device.

What does a relevant device mean? The device has a vendor/device ID pair (and some further identifiers, which can be printed by lspci), while the device driver has a table of pci_device_id structures with one or more vendor/device IDs. One of the driver’s IDs must match the device’s IDs for the driver to handle it.

So when our device is listed by lspci (see the Link-up section) and our driver is listed by lsmod (see the Driver insertion section), the probing function must be called. As a result, the verbose mode of lspci lists your driver as the driver in use by the device:

user@mypc:~$ lspci -v
0005:01:00.0 Memory controller: NVIDIA Corporation Device 8034
Subsystem: NVIDIA Corporation Device 0001
Flags: bus master, fast devsel, latency 0, IRQ 39
Memory at 1f40000000 (32-bit, non-prefetchable) [size=512M]
... Capabilities: [300] #19
Kernel driver in use: mydriver

If the last line is missing, the probing has failed.
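The binding can also be checked programmatically. The sketch below assumes the usual sysfs layout, where a bound device has a `driver` symbolic link in /sys/bus/pci/devices/<slot>/ that points to the driver’s directory. The function names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* The link target ends with the driver name,
 * e.g. "../../../bus/pci/drivers/mydriver" gives "mydriver". */
const char *driver_name_from_link(const char *target)
{
    const char *slash;

    assert(target != NULL);
    slash = strrchr(target, '/');
    return slash ? slash + 1 : target;
}

/* Store the name of the driver bound to the device in `out`.
 * Returns 0 on success, -1 if there is no such device or no driver is bound. */
int bound_driver(const char *slot, char *out, size_t out_len)
{
    char path[256], target[256];
    ssize_t n;

    snprintf(path, sizeof(path), "/sys/bus/pci/devices/%s/driver", slot);
    n = readlink(path, target, sizeof(target) - 1);
    if (n < 0)
        return -1;
    target[n] = '\0';               /* readlink() does not terminate the string */
    snprintf(out, out_len, "%s", driver_name_from_link(target));
    return 0;
}
```

For the device above, bound_driver("0005:01:00.0", name, sizeof(name)) should return 0 and fill name with mydriver once the probing has succeeded.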

Debug tips:
Trace the probing function using printk() and dmesg, as in the Driver insertion section. (The probing function must return 0 on success. Check it.) If you cannot see any print (or any relevant error), the probing function was never called, so the pci_device_id table is probably wrong. Check that the vendor/device ID in that structure matches the vendor/device ID shown by lspci -vmn.

The I/O mapping of the PCI BAR(s) usually takes place in the probing function. If your probing fails around the mapping, you can comment it out for a while and fix it later, or check out the related part in the Control the hardware section.

File operations

We have a device and a driver for it so far. But how can we communicate with the device? The probing function should create a character device using the cdev_init(), cdev_add(), and device_create() functions. cdev_init() takes a file_operations structure, which describes the valid file operations and registers their handler functions.

A driver usually registers one or more character devices. So after probing (not after inserting), a new file should appear under the /dev folder. The name of the file is given by the name format argument at the end of the device_create() parameter list. Find that filename in the /dev folder:

user@mypc:~$ ls /dev/mydriver
/dev/mydriver

If you cannot see your character device under /dev, make sure that the previous points (insertion and probing) have passed and that you have registered and added the character device correctly.

The most important handlers in file_operations are open and release (i.e., close). These two functions usually only allocate and release memory regions, and they don’t do any hardware operation on your device.

I suggest implementing the read operation (besides open and release) by filling the given buffer with some useful information. Here is a basic example:

static ssize_t fops_read(struct file *filep, char __user *buffer, size_t len, loff_t *offset_ptr)
{
	char kbuff[64]; /* kernel buffer, kept small because the kernel stack is limited */
	size_t kbuff_len;

	printk(KERN_INFO "Read start\n");

	if (*offset_ptr > 0) {
		/* End of file: reading in chunks, seeking, etc. are not supported */
		return 0;
	}

	kbuff_len = scnprintf(kbuff, sizeof(kbuff), "Hello from My driver\n");
	if (len < kbuff_len) {
		/* The user buffer is too small for the whole message */
		return -EINVAL;
	}

	/* copy_to_user() returns the number of bytes that could NOT be copied */
	if (copy_to_user(buffer, kbuff, kbuff_len) != 0) {
		printk(KERN_ERR "Failed to copy the message to user space\n");
		return -EFAULT;
	}

	printk(KERN_INFO "Read success!\n");
	*offset_ptr += kbuff_len; /* Advance the offset, so the next read returns EOF */
	return kbuff_len;         /* Return the number of bytes read */
}

After that, you can cat your device:

user@mypc:~$ cat /dev/mydriver
Hello from My driver
user@mypc:~$

If cat fails, one of open, release, or read has failed. Locate the error with our favorite printk and dmesg.
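A small user-space test program can tell you which of the three calls failed before you even look at dmesg. This is a minimal sketch using only the standard open(), read(), and close() calls. The function name read_device() is hypothetical, and /dev/mydriver is the example device name from above.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Open the device, read at most len - 1 bytes into buf, and close it.
 * Returns the number of bytes read, or a negative errno value.
 * The failing call is reported on stderr. */
ssize_t read_device(const char *path, char *buf, size_t len)
{
    ssize_t n;
    int fd, err;

    assert(path != NULL && buf != NULL && len > 0);

    fd = open(path, O_RDONLY);
    if (fd < 0) {
        err = errno;    /* save errno before any other library call */
        fprintf(stderr, "open(%s) failed: %s\n", path, strerror(err));
        return -err;
    }

    n = read(fd, buf, len - 1);
    if (n < 0) {
        err = errno;
        fprintf(stderr, "read(%s) failed: %s\n", path, strerror(err));
        close(fd);
        return -err;
    }
    buf[n] = '\0';      /* terminate the string so it can be printed */

    if (close(fd) < 0) {
        err = errno;
        fprintf(stderr, "close(%s) failed: %s\n", path, strerror(err));
        return -err;
    }
    return n;
}
```

With the read handler above, read_device("/dev/mydriver", buf, sizeof(buf)) should return 21 and leave the hello message in buf. A negative return value is the error code that your driver’s open, read, or release handler returned.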

Control the hardware

At this point, we have a driver with an attached hardware device, and we have a basic knowledge of implementing the file operations. It’s time to control the hardware using our device driver.

The most elegant way to communicate with a driver from a user-space application is ioctl. Let’s implement our ioctl handler alongside open, release, and read in the file_operations.

Define your hardware-specific commands and call them from the main ioctl function. For example, define LED_ON and LED_OFF commands, and call the led_on() and led_off() functions from a switch-case in the main ioctl function. These hardware-specific command functions should implement the actual device access by reading and writing its registers.

At this point, you should open your device’s documentation, register map, or example code to see how to control the device. In an ideal world, no error would appear… but what should we do if the device won’t do what we want? How do we debug at this point?

Debug tips:
Again, the printk and dmesg pair is a good debugging tool, but I suggest another mechanism besides them. A useful debug tool is the devmem command of busybox (the single command, not the whole software suite). With it, you can read from and write to a physical address. For this, you will need the BAR of the PCI device. But what is this?

The BAR (Base Address Register) shows where a memory region of the device has been mapped in the host’s physical address space. The mapping is done during bus enumeration, so by the time the device is visible in lspci, the BAR(s) have already been set. Let’s see lspci:

user@mypc:~$ lspci -v
0005:01:00.0 Memory controller: NVIDIA Corporation Device 8034
Subsystem: NVIDIA Corporation Device 0001
Flags: bus master, fast devsel, latency 0, IRQ 39
Memory at 1f40000000 (32-bit, non-prefetchable) [size=512M]
Memory at 1f60000000 (32-bit, non-prefetchable) [size=64K]

Capabilities: [80] Power Management version 3
...

In this case, we have two BARs that have been mapped to 0x1f40000000 and 0x1f60000000, respectively. Device registers are accessed at an offset from these addresses. So take an arbitrary register address from the device’s documentation and add it to the proper BAR address to access the register:

busybox devmem <BAR + device_register> # Read a register
busybox devmem <BAR + device_register> 32 <value> # Write a 32-bit register

busybox devmem can be a handy debug tool: you can check all of your registers after your faulty driver has run.

Note that with busybox devmem, you can control your PCIe device without your device driver (as soon as the device is visible in lspci). Sure, this is not what we want in the end, but it is a handy way of bringing up a board.
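Since a wrong physical address can hit something other than your device, it is worth checking the BAR arithmetic before typing it into devmem. Here is a minimal sketch of that check. The function name is hypothetical, and the BAR base and size are the values read from the lspci -v output.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Compute the physical address of a device register: BAR base + offset.
 * Returns 0 and stores the address in *addr, or returns -1 if the offset
 * is not aligned to the access width or the access would leave the BAR. */
int reg_phys_addr(uint64_t bar_base, uint64_t bar_size,
                  uint64_t reg_offset, unsigned width_bytes, uint64_t *addr)
{
    assert(addr != NULL);

    if (width_bytes == 0 || reg_offset % width_bytes != 0)
        return -1;      /* unaligned access */
    if (reg_offset >= bar_size || bar_size - reg_offset < width_bytes)
        return -1;      /* outside of the BAR */

    *addr = bar_base + reg_offset;
    return 0;
}
```

For the second BAR above (base 0x1f60000000, size 64K), a 32-bit register at offset 0x10 is at 0x1f60000010, while offset 0x10000 is rejected because it lies just past the end of the region.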

+1 Concurrency

Concurrency is a crucial, non-negligible part of device drivers. You may say that you will use the driver from only one application, which runs on only one thread (which seems a hard limitation), but what about the hardware interrupts? If you handle interrupts from the hardware in your driver (i.e., you register your callback with request_irq()), you have to care about data coherency and thread safety.

There are many great descriptions of thread safety on the internet, so I won’t explain how to do it. I just want to highlight that concurrency is a real and challenging topic in device drivers. First, I suggest modeling your driver’s race conditions in a higher-level language (Java, Python), which can handle crashes and deadlocks without crashing the system. Then I would implement the same in C, but in a simple hello_thread application. When these models work, I would implement the locks and mutexes in my driver.

Please check out the Concurrency and Race Conditions chapter of LDD, which describes which lock has to be used where to prevent deadlocks and incoherent data access.
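As an illustration of the hello_thread idea, here is a minimal user-space sketch with POSIX threads. It is a model, not driver code: the structure, the thread count, and the function names are made up for this example, and the driver would use the kernel’s own locking primitives instead of pthread mutexes.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define THREADS    4
#define ITERATIONS 100000

/* Shared state, standing in for the data your driver shares between contexts */
struct shared_state {
    pthread_mutex_t lock;
    long counter;
};

/* Each worker increments the shared counter under the lock */
static void *worker(void *arg)
{
    struct shared_state *s = arg;
    int i;

    assert(s != NULL);
    for (i = 0; i < ITERATIONS; i++) {
        pthread_mutex_lock(&s->lock);
        s->counter++;               /* the critical section */
        pthread_mutex_unlock(&s->lock);
    }
    return NULL;
}

/* Start the workers, wait for them, and return the final counter value
 * (or -1 if the mutex or any of the threads could not be created). */
long run_workers(void)
{
    struct shared_state s;
    pthread_t threads[THREADS];
    int started = 0, i;

    s.counter = 0;
    if (pthread_mutex_init(&s.lock, NULL) != 0)
        return -1;

    while (started < THREADS &&
           pthread_create(&threads[started], NULL, worker, &s) == 0)
        started++;
    for (i = 0; i < started; i++)
        pthread_join(threads[i], NULL);

    pthread_mutex_destroy(&s.lock);
    return started == THREADS ? s.counter : -1;
}
```

With the lock in place, run_workers() returns THREADS * ITERATIONS every time. Remove the lock and unlock calls and the result will usually be lower, which is exactly the kind of lost update you want to find in a model rather than in the kernel. Depending on your toolchain, you may need to build with the -pthread option.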

Summary

Linux device driver development is not an easy task, but this article is not meant to scare you. It just gives you steps for development. I highly recommend using this step-by-step methodology, and do not move on to the next step while you have problems with a former one. What’s more, check which step you are in before posting your issues on forums. You will be able to ask the right question if you know which step has failed.

So let’s develop device drivers step by step!

Open heart, open source. I’m an FPGA developer.