External USB fuzzing for Linux kernel
syzkaller supports fuzzing the Linux kernel USB subsystem from the external USB device side. Instead of relying on physical hardware (like Facedancer-based boards) or VM management software features (like QEMU usbredir), syzkaller fuzzes USB fully within a (potentially-virtualized) environment that runs the Linux kernel.
The USB fuzzing support in syzkaller is based on:
- Raw Gadget — a kernel module that implements a low-level interface for the Linux USB Gadget subsystem (now in the mainline kernel);
- Dummy HCD/UDC — a kernel module that sets up virtual USB Device and Host controllers that are connected to each other inside the kernel (existed in the mainline kernel for a long time);
- KCOV changes that allow collecting coverage from background kernel threads and interrupts (now in the mainline kernel);
- syzkaller changes built on top of the other parts; see the Internals section.
Besides this documentation page, for details, see:
-
Coverage-Guided USB Fuzzing with Syzkaller talk (video) from OffensiveCon 2019 (was given while the USB fuzzing support was a work-in-progress, so some details are outdated);
-
Fuzzing USB with Raw Gadget talk (video) from BSides Munich 2022 (more up-to-date, but less in-depth);
-
External fuzzing of Linux kernel USB drivers with syzkaller talk (outlines steps for adding new basic USB descriptions).
See this page for the list of bugs found in the Linux kernel USB stack so far.
Setting up
-
Use the kernel version 5.7 or later.
It's also recommended to backport all kernel patches that touch
drivers/usb/gadget/legacy/raw_gadget.c,drivers/usb/gadget/udc/dummy_hcd.c, andkernel/kcov.c; -
Enable
CONFIG_USB_RAW_GADGET=yandCONFIG_USB_DUMMY_HCD=yin the kernel config.Optionally, also enable the config options for the specific USB drivers that you wish to fuzz.
As an alternative, you can use the config from the syzbot USB fuzzing instance. This config has many USB drivers commonly-used in distro kernels enabled;
-
Build the kernel;
-
Optionally, update syzkaller descriptions by extracting the USB IDs.
This step is recommended if you wish to just rely on the existing syzlang descriptions to fuzz all USB drivers enabled in your kernel config.
If you plan to add new syzlang descriptions that target a specific USB driver, this step can be skipped;
-
Optionally, write syzlang descriptions for the USB driver you wish to fuzz;
-
Enable
syz_usb_connect,syz_usb_disconnect,syz_usb_control_io,syz_usb_ep_writeandsyz_usb_ep_readpseudo-syscalls (or any of their specific variants) in the manager config; -
Set
sandboxtononein the manager config; -
Pass
dummy_hcd.num=4(or whichever number you use forprocs) to the kernel command-line parameters in the manager config; -
Run syzkaller.
Make sure that you see both
USBEmulationandExtraCoverageenabled in themachine checksection in the log.You should also see a decent amount of coverage in
drivers/usb/core/*after the first few programs get added to the corpus.
Internals
syzkaller defines 5 main pseudo-syscalls for fuzzing USB drivers (see the syzlang descriptions and the C implementations):
syz_usb_connect— handles the enumeration process of a new USB device (in simple terms: connects a new USB device; in detail: handles all control requests up until aSET_CONFIGURATIONrequest is received);syz_usb_disconnect— disconnects a USB device;syz_usb_control_io— handles a post-enumeration control request (INorOUT);syz_usb_ep_write— handles a non-controlINrequest on an endpoint;syz_usb_ep_read— handles a non-controlOUTrequest on an endpoint.
Additionally, there is the syz_usb_connect_ath9k pseudo-syscall targeted to handle a few post-enumeration control requests the ath9k driver expects.
The syzlang descriptions for these pseudo-syscalls are targeted at a few different parts of the USB subsystem:
-
The common USB enumeration code is targeted by the generic
syz_usb_connectvariant.In addition, this generic variant also briefly touches the enumeration code in various USB drivers: the USB device descriptor fields get patched during the program generation;
-
The code of the class-specific drivers is targeted by
syz_usb_connect$hid,syz_usb_connect$cdc_ecm, and other variants (accompanied by matchingsyz_usb_control_io$*andsyz_usb_ep_read/write$*pseudo-syscalls).The descriptor fields for these
syz_usb_connectvariants are also intended to be patched during the program generation based on the matching rules specific to the driver class (to exercise various driver quirks). So far, this has only been implemented for the HID (seegenerateUsbHidDeviceDescriptor) and the printer (seegenerateUsbPrinterDeviceDescriptor) classes; -
The code of the device-specific drivers is intended to be targeted by more
syz_usb_connectvariants whose descriptors do not get patched and are fully defined in syzlang instead. (However, they can be patched as well for drivers that define quirks.)So far, the only such (partial) descriptions have been added for the
ath9k,rtl8150,sierra_net, andlan78xxdrivers.
Dealing with complicated descriptors
Many USB drivers expect a fixed USB descriptors structure from the connected USB device. Writing syzlang descriptions for these drivers is straightforward: just describe the required structures in syzlang.
However, some drivers allow various permutations of the interface/endpoint descriptors (typically the case for class-specific drivers like UVC).
Describing a fixed descriptor structure for such drivers in syzlang would allow passing the driver descriptor checks and efficiently fuzz the post-enumeration communication. But it would also prevent syzkaller from exploring the various error-checking paths within the driver's enumeration code. On the other hand, defining a relaxed structure for the descriptors would make syzkaller often fail the driver's enumeration and thus would not allow efficiently fuzzing the post-enumeration communication.
There are two proposed ways to handle such drivers:
-
Writing relaxed descriptions and adding seed programs (aka runtests).
This works by only defining a single
syz_usb_connectvariant with the relaxed descriptions of the USB descriptors but also adding a seed program (intosys/linux/test/) that contains thesyz_usb_connectpseudo-syscall with the specific fixed values to pass the driver's enumeration checks; -
Adding two
syz_usb_connectvariants.One with relaxed descriptions to explore the error-checking paths; and the other with fixed descriptions to allow fuzzing the post-enumeration communication.
Handling post-enumeration control requests
Many USB drivers finish their enumeration procedure with the SET_CONFIGURATION request.
Following this, these drivers start functioning normally and are ready to accept both control and non-control requests.
However, some USB drivers expect certain control requests to be handled by the device directly following the SET_CONFIGURATION request.
Without these requests being handled, these drivers abort their execution and disconnect the device.
There are two proposed ways to handle such drivers:
-
Adding seed programs (aka runtests).
This works by adding a seed program (into
sys/linux/test/) that contains both thesyz_usb_connectpseudo-syscall for the specific driver and a fewsyz_usb_control_iopseudo-sycalls that handle the required control requests.This is the currently recommended way of handling such drivers, as it allows syzkaller to permute the expected post-
SET_CONFIGURATIONcontrol requests and thus possibly trigger unexpected driver behavior during their handling.See #6283 for a reference implementation of this approach;
-
Modifying the behavior of the
syz_usb_connectpseudo-syscall.An alternative approach is to modify the C implementation of the
syz_usb_connectpseudo-syscall (or add a similar new pseudo-syscall) to handle the required post-SET_CONFIGURATIONrequests directly from C.Right now, this approach is only implemented for the
ath9kdriver viasyz_usb_connect_ath9k. Unlikesyz_usb_connect,syz_usb_connect_ath9kalso handles the post-SET_CONFIGURATIONfirmware download requests expected byath9k.If this approach is to be used for other drivers, the proper implementation should rework and extend
syz_usb_connect_ath9kinstead of adding more similar pseudo-syscalls. For example, add a new argument tosyz_usb_connectthat allows specifiying (in syzlang) the responses to post-SET_CONFIGURATIONcontrol requests and which one them is the last one (i.e., the condition for exiting the pseudo-syscall). And then port the hardcoded responses from the C implementation ofsyz_usb_connect_ath9kinto syzlang and extend the functionality based on the specific new encountered cases.
Note: There is also a way to describe unusual pre-SET_CONFIGURATION GET_DESCRIPTOR requests via the conn_descs argument of syz_usb_connect, but none of the class/driver-specific descriptions use this feature at the moment.
However, this approach does not allow specifying the order of the responses if there are multiple requests of the same type and also does not allow handling pre-SET_CONFIGURATION non-GET_DESCRIPTOR requests, as no described drivers required this so far.
USB IDs patching
Many USB drivers implement various quirks depending of which device is connected.
Such quirks are either defined in the mathing rule table for the driver (a table of usb_device_id structures; see e.g. usb_audio_ids) or hardcoded in the drivers code (by manually checking e.g. the Vendor and Product IDs against certain values; see e.g. quirk_printers).
To exercise these driver quirks, syzkaller has to provide certain values (aka IDs) within the USB descriptors for the emulated device. Listing these USB IDs manually in syzlang descriptions is inefficient (there are too many), so syzkaller employs the way of automatically extracting the IDs before building and then patching them in during the program generation.
The IDs defined in the driver matching rules can be extracted (and patched in) automatically; see Updating USB IDs.
However, the hardcoded IDs need to be specified manually (extacting them automatically is hard, as they are often embedded into the drivers code in non-standard ways).
The proposed approach is to read the driver code to find out the hardcoded IDs and then modify the program generation code to embed them into the appropriate USB decriptors (see e.g. generateUsbPrinterDeviceDescriptor).
Updating USB IDs
syzkaller relies on a list of automatically-extracted USB IDs that are patched into syz_usb_connect (for the generic, the HID, and the printer variants) during the program generation.
To update the syzkaller USB IDs based on the USB drivers enabled in your kernel config:
-
Apply this kernel patch;
-
Build and boot the kernel;
-
Connect a USB HID device.
Assuming you have
CONFIG_USB_RAW_GADGET=yenabled, you can just run the keyboard emulation program; -
Use syz-usbgen to update sys/linux/init_vusb_ids.go:
bash ./bin/syz-usbgen $KERNEL_LOG ./sys/linux/init_vusb_ids.go -
Revert the applied kernel patch and rebuild the kernel before starting syzkaller.
Seed programs
There are a few seed programs (aka runtests) for the USB pseudo-syscalls (named starting with the vusb prefix).
These seed programs serve two purposes:
-
Allow syzkaller to go deeper into the driver code without having to write detailed syzlang descriptions;
-
Allow veryfing the USB fuzzing functionality and catching potential descriptions breaking changes by running the seed programs as runtests:
bash ./bin/syz-manager -config usb-manager.cfg -mode run-tests -tests vusb
Limitations
Most of the limitations of the USB fuzzing support in syzkaller stem from the features missing from the Raw Gadget and Dummy HCD/UDC implementations (USB 3, isochronous transfers, etc).
In addition, syz_usb_connect only supports devices with a single configuration (but this can be fixed).
This is not a critical limitation, as most kernel drivers are tied to specific interfaces and do not care about the configurations.
However, there are USB devices whose drivers assume the device to have multiple specific configurations.
Things to improve
The core support for USB fuzzing is in place, but there's still space for improvement:
-
Remove the device from
usb_devicesinsyz_usb_disconnectand don't calllookup_usb_indexmultiple times withinsyz_usb_connect. Currently, this causes some reproducers to have therepeatflag set when it's not required; -
Add descriptions for more relevant USB classes and drivers and resolve TODOs from sys/linux/vusb.txt;
-
Implement a proper way for dynamically extracting relevant USB ids from the kernel (a related discussion);
-
Add a mode for standalone fuzzing of physical USB hosts (from a Linux-based board with a UDC). This includes: a. Making sure that the current way
syz_usb_connecthandles the enumeration works properly with different OSes (there are some differences); b. Using USB requests coming from the host as a signal (like coverage) to enable "signal-driven" fuzzing; c. Making UDC driver name configurable forsyz-execprogandsyz-prog2c; -
Generate syzkaller programs from a
usbmontrace produced by real USB devices (this should make syzkaller go significantly deeper into the USB drivers code).
Running reproducers on Linux-based boards
It is possible to run the reproducers for USB bugs generated by syzkaller on a Linux-based board plugged into a physical USB host.
See Running syzkaller USB reproducers in the Raw Gadget repository for the instructions.
