aboutsummaryrefslogtreecommitdiffstats
path: root/executor
Commit message (Collapse)AuthorAgeFilesLines
* executor: update executor code to match new flatbuffersPimyn Girgis2025-12-031-4/+4
| | | | flatbuffers changed some function signatures. Update executor code to match.
* executor: update flatbuffersPimyn Girgis2025-12-0331-903/+1460
| | | | Update flatbuffers to v23.5.26, which matches the compiler version in the new env container.
* executor: apply optnone to guest_handle_nested_vmentry_intel()Alexander Potapenko2025-11-282-2/+14
| | | | | | | | | | | | | Florent Revest reported ThinLTO builds failing with the following error: <inline asm>:2:1: error: symbol 'after_vmentry_label' is already defined after_vmentry_label: ^ error: cannot compile inline asm , which turned out to be caused by the compiler not respecting `noinline`. Adding __attribute__((optnone)) (or optimize("O0") on GCC) fixes the issue.
* executor: improve startup time on machines with many CPUsFlorent Revest2025-11-261-0/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I observed that on machines with many CPUs (480 on my setup), fuzzing with a handful of procs (8 on my setup) would consistently fail to start because syz-executors would fail to respond within the default handshake timeout of 1 minute. Reducing procs to 4 would fix it but sounds ridiculous on such a powerful machine. As part of the default sandbox policy, a syz-executor creates a large number of virtual network interfaces (16 on my kernel config, probably more on other kernels). This step vastly dominates the executor startup time and was clearly responsible for the timeout I observed that prevented me from fuzzing. When fuzzing or reproducing with procs > 1, all executors run their sandbox setup in parallel. Creating network interfaces is done by socket operations to the RTNL (routing netlink) subsystem. Unfortunately, all RTNL operations in the kernel are serialized by a "rtnl_mutex" mega lock so instead of paralellizing the 8*16 interfaces creation, they effectively get serialized and the timing it takes to set up the default sandbox for one executor scales lineraly with the number of executors started "in parallel". This is currently inherent to the rtnl_mutex in the kernel and as far as I can tell there's nothing we can do about it. However, it makes it very important that each critical section guarded by "rtnl_mutex" stays short and snappy, to avoid long waits on the lock. Unfortunately, the default behavior of a virtual network interface creation is to create one RX and one TX queue per CPU. Each queue is associated with a sysfs file whose creation is quite slow and goes through various sanitized paths that take a long time. This means that each critical section scales linearly to the number of CPUs on the host. For example, in my setup, starting fuzzing needs 2 minutes 25. I found that I could bring this down to 10 seconds (15x faster startup time!) by limiting the number of RX and TX queues created per virtual interface to 2 using the IFLA_NUM_*X_QUEUES RTNL attributes. I opportunistically chose 2 to try and keep coverage of the code that exercises multiple queues but I don't have evidences that choosing 1 here would actually reduce the code coverage. As far as I can tell, reducing the number of queues would be problematic in a high performance networking scenario but doesn't matter for fuzzing in a namespace with only one process so this seems like a fair trade-off to me. Ultimately, this lets me start a lot more parallel executors and take better advantage of my beefy machine. Technical detail for review: veth interfaces actually create two interfaces for both side of the virtual ethernet link so both sides need to be configured with a low number of queues.
* executor: sys/linux: implement SYZOS_API_NESTED_AMD_VMCB_WRITE_MASKAlexander Potapenko2025-11-211-0/+23
| | | | | | | | | | | The new command allows mutation of AMD VMCB block with plain 64-bit writes. In addition to VM ID and VMCB offset, @nested_amd_vmcb_write_mask takes three 64-bit numbers: the set mask, the unset mask, and the flip mask. This allows to make bitwise modifications to VMCB without disturbing the execution too much. Also add sys/linux/test/amd64-syz_kvm_nested_amd_vmcb_write_mask to test the new command behavior.
* executor: sys/linux: implement SYZOS_API_NESTED_INTEL_VMWRITE_MASKAlexander Potapenko2025-11-211-0/+28
| | | | | | | | | | | | The new command allows mutation of Intel VMCS fields with the help of vmwrite instruction. In addition to VM ID and field ID, @nested_intel_vmwrite_mask takes three 64-bit numbers: the set mask, the unset mask, and the flip mask. This allows to make bitwise modifications to VMCS without disturbing the execution too much. Also add sys/linux/test/amd64-syz_kvm_nested_vmwrite_mask to test the new command behavior.
* executor: sys/linux/test: handle RDTSCP in L2Alexander Potapenko2025-11-212-3/+19
| | | | | | | Enable basic RDTSCP handling. Ensure that Intel hosts exit on RDTSCP in L2, and that both Intel and AMD can handle RDTSCP exits. Add amd64-syz_kvm_nested_vmresume-rdtscp to test that.
* executor: x86: factor out common code in rdmsr()/wrmsr()Alexander Potapenko2025-11-211-37/+21
| | | | | | | | | | While at it, fix a bug in rdmsr() that apparently lost the top 32 bits. Also fix a bug in Intel's Secondary Processor-based Controls: we were incorrectly using the top 32 bits of X86_MSR_IA32_VMX_PROCBASED_CTLS2 to enable all the available controls without additional setup. This only worked because rdmsr() zeroed out those top bits.
* executor: sys/linux/test: handle RDTSC in L2Alexander Potapenko2025-11-212-3/+16
| | | | | | | Enable basic RDTSC handling. Ensure that Intel hosts exit on RDTSC in L2, and that both Intel and AMD can handle RDTSC exits. Add amd64-syz_kvm_nested_vmresume-rdtsc to test that.
* executor: sys/linux/test: basic CPUID handling in L2Alexander Potapenko2025-11-211-9/+20
| | | | | Ensure L2 correctly exits to L1 on CPUID and resumes properly. Add a test.
* executor: sys/linux: implement SYZOS_API_NESTED_VMRESUMEAlexander Potapenko2025-11-202-19/+60
| | | | | | | | | | | | | | | Provide the SYZOS API command to resume L2 execution after a VM exit, using VMRESUME on Intel and VMRUN on AMD. For testing purpose, implement basic handling of the INVD instruction: - enable INVD interception on AMD (set all bits in VMCB 00Ch); - map EXIT_REASON_INVD and VMEXIT_INVD into SYZOS_NESTED_EXIT_REASON_INVD; - advance L2 RIP to skip to the next instruction. While at it, perform minor refactorings of L2 exit reason handling. sys/linux/test/amd64-syz_kvm_nested_vmresume tests the new command by executing two instructions, INVD and HLT, in the nested VM.
* executor: x86: retire UEXIT_STOP_L2Alexander Potapenko2025-11-201-6/+3
| | | | | | | It was useful initially for vendor-agnostic tests, but given that we have guest_uexit_l2() right before it, we can save an extra L2-L1 exit. Perhaps this should increase the probability of executing more complex payloads (fewer KVM_RUN calls to reach the same point in L2 code).
* executor: sys/linux: implement SYZOS_API_NESTED_VMLAUNCHAlexander Potapenko2025-11-192-1/+202
| | | | | | | | | | | | Provide a SYZOS API command to launch the L2 VM using the VMLAUNCH (Intel) or VMRUN (AMD) instruction. For testing purposes, each L2->L1 exit is followed by a guest_uexit_l2() returning the exit code to L0. Common exit reasons (like HLT) will be mapped into a common exit code space (0xe2e20000 | reason), so that a single test can be used for both Intel and AMD. Vendor-specific exit codes will be returned using the 0xe2110000 mask for Intel and 0xe2aa0000 for AMD.
* executor: sys/linux: implement SYZOS_API_NESTED_LOAD_CODEAlexander Potapenko2025-11-191-0/+41
| | | | The new command loads an instruction blob into the specified L2 VM.
* executor: sys/linux: renumber SYZOS API IDsAlexander Potapenko2025-11-191-13/+13
| | | | | | | | Now that we are using volatiles in guest_main(), there is no particular need to base the numbers on primes (this didn't work well with Clang anyway). Instead, group the commands logically and leave some space between the groups for future updates.
* executor: x86: implement SYZOS_API_NESTED_CREATE_VMAlexander Potapenko2025-11-192-1/+666
| | | | | | Provide basic setup for registers, page tables, and segments to create Intel/AMD-based nested virtual machines. Note that the machines do not get started yet.
* executor: x86: implement SYZOS_API_ENABLE_NESTEDAlexander Potapenko2025-11-192-0/+121
| | | | | | Add vendor-specific code to turn on nested virtualization on Intel and AMD. Also provide get_cpu_vendor() to pick the correct implementation.
* executor: x86: Configure L1 guest TSS for nested virtualizationAlexander Potapenko2025-11-192-3/+38
| | | | Set up the L1 guest's 64-bit Task State Segment (TSS), a prerequisite for VMX/SVM.
* executor: x86: Prepare memory layout and hardware constants for NVAlexander Potapenko2025-11-192-2/+61
| | | | | | | | | | | This patch lays the groundwork for nested virtualization by rearranging the KVM guest's memory map. Key changes include: - Introducing a dedicated per-VCPU memory region for L2 VMs. - Updating `executor/kvm.h` with: - Adjusted stack addresses for the L1 guest. - Detailed memory layout macros for L2 VM structures
* executor: disallow O_CREAT in syz_open_devFlorent Revest2025-11-141-1/+1
| | | | | | | | | | I can't think of a valid reason to create nodes under /dev/ if they don't already exist. On systems where /dev/ isn't backed by a virtual/temp file system, O_CREAT lets syzkaller create persistent files on disk and may unnecessarily clutter or fill the disk with files that have nothing to do with the intended syscall descriptions.
* executor: enable periodic leak checkingPimyn Girgis2025-11-031-6/+63
| | | | | This commit enables the periodic execution of a leak checker within the executor. The leak checker will now run every 2 * num_procs executions, but only after the corpus has been triaged and all executor processes are in an idle state.
* executor: add include guards to KVM headersAlexander Potapenko2025-10-279-0/+45
| | | | | Not having these results in three copies of every KVM-related #define in each reproducer.
* executor: common_kvm.h: fix compilationAlexander Potapenko2025-10-171-1/+1
| | | | | | Add #if checks to define executor_fn_guest_addr() for __NR_syz_kvm_setup_cpu and __NR_syz_kvm_setup_syzos_vm. This fixes a compilation error spotted by csource_test.go
* executor: common_kvm_ppc64.h: drop kvm_ppc_mmuv3_cfgAlexander Potapenko2025-10-171-6/+0
| | | | | | struct kvm_ppc_mmuv3_cfg seems to be defined in /usr/powerpc64le-linux-gnu/include/asm/kvm.h, remove the duplicate definition.
* executor: s/true/1 in common_kvm_ppc64.hAlexander Potapenko2025-10-171-1/+1
| | | | Fix a compilation error spotted by csource_test.go
* executor: introduce __addrspace_guestAlexander Potapenko2025-10-176-64/+63
| | | | | | | | | | | | Apply __addrspace_guest to every guest function and use a C++ template to statically validate that host functions are not passed to executor_fn_guest_addr(). This only works in Clang builds of syz-executor, because GCC does not support address spaces, and C reproducers cannot use templates. The static check allows us to drop the dynamic checks in DEFINE_GUEST_FN_TO_GPA_FN(). While at it, replace DEFINE_GUEST_FN_TO_GPA_FN() with explicit declarations of host_fn_guest_addr() and guest_fn_guest_addr().
* executor: unify ARM64_ADDR_EXECUTOR_CODE and X86_SYZOS_ADDR_EXECUTOR_CODEAlexander Potapenko2025-10-173-5/+13
| | | | | Use SYZOS_ADDR_EXECUTOR_CODE instead of both. Also put platform-specific definitions under #if GOARCH_xxx.
* executor: amd64: remove the switch from guest_main()Alexander Potapenko2025-10-171-31/+35
| | | | Somehow Clang still manages to emit a jump table for it.
* executor: fix setup_cpuid() declarationAlexander Potapenko2025-10-171-0/+2
| | | | Make sure setup_cpuid() is only declared together with install_user_code()
* executor: sys/linux: implement SYZOS_API_SET_IRQ_HANDLERAlexander Potapenko2025-10-172-13/+70
| | | | | | | | | | The new API call allows to initialize the handler with one of the three possible values: - NULL (should cause a page fault) - dummy_null_handler (should call iret) - uexit_irq_handler (should perform guest_uexit(UEXIT_IRQ)) Also add a test for uexit_irq_handler()
* executor: use dynamic page table allocation for guestAlexander Potapenko2025-10-172-63/+38
| | | | | | | | Use a pool of 32 pages to allocate PT and PE entries for the guest page tables. This eliminates the need for manually assigned page table entries that are brittle and may break when someone changes the memory layout.
* executor: refactor x86 SYZOS setupAlexander Potapenko2025-10-172-65/+67
| | | | | Pass around struct kvm_syzos_vm instead of one-off pointers to various guest memory ranges.
* executor: rework GDT setup for SYZOSAlexander Potapenko2025-10-173-54/+109
| | | | | Untangle SYZOS GDT setup from the legacy one. Drop LDT and TSS for now.
* executor: fix the definition of struct tss64Alexander Potapenko2025-10-171-2/+2
| | | | | Per https://wiki.osdev.org/Task_State_Segment#Long_Mode, io_bitmap and reserved3 should be 16-bit.
* executor: use a list of memory regions to set up SYZOS guestAlexander Potapenko2025-10-171-35/+44
| | | | | Instead of open-coding every memory region in several places, use a single array to configure their creation.
* executor: more robust x86 page table creation in SYZOSAlexander Potapenko2025-10-172-16/+117
| | | | | | Provide map_4k_region() to ease page table creation for different regions. While at it, also move the stack from 0x0 to 0x90000.
* executor: introduce DEFINE_GUEST_FN_TO_GPA_FN()Alexander Potapenko2025-10-173-2/+39
| | | | | DEFINE_GUEST_FN_TO_GPA_FN() allows to define helper functions to calculate guest addresses in the host/guest code.
* executor: rename SYZOS-related address definitionsAlexander Potapenko2025-10-173-21/+25
| | | | | | | To distinguish SYZOS addresses from other x86 definitions, change them to start with X86_SYZOS_ADDR_ No functional change.
* prog: fix syz_kfuzztest_run allocation strategyEthan Graham2025-09-221-4/+4
| | | | | | | | | | | | | | | | | | | | Previously, the generated KFuzzTest programs were reusing the address of the top-level input struct. A problem could arise when the encoded blob is large and overflows into another allocated region - this certainly happens in the case where the input struct points to some large char buffer, for example. While this wasn't directly a problem, it could lead to racy behavior when running KFuzzTest targets concurrently. To fix this, we now introduce an additional buffer parameter into syz_kfuzztest_run that is as big as the maximum accepted input size in the KFuzzTest kernel code. When this buffer is allocated, we ensure that we have some allocated space in the program that can hold the entire encoded input. This works in practice, but has not been tested with concurrent KFuzzTest executions yet.
* kfuzztest: introduce syz_kfuzztest_run pseudo-syscallEthan Graham2025-09-221-0/+54
| | | | | | | | | | | | | Add syz_kfuzztest_run pseudo-syscall, KFuzzTest attribute, and encoding logic. KFuzzTest targets, which are invoked in the executor with the new syz_kfuzztest_run pseudo-syscall, require specialized encoding. To differentiate KFuzzTest calls from standard syzkaller calls, we introduce a new attribute called KFuzzTest or "kfuzz_test" in syzkaller descriptions that can be used to annotate calls. Signed-off-by: Ethan Graham <ethangraham@google.com>
* sys/linux: executor: add IN_DX and OUT_DX to SYZOS x86 APIAlexander Potapenko2025-09-191-0/+67
| | | | | | | | Add SYZOS calls that correspond to the IN and OUT x86 instructions that perform port I/O. These instructions have different variants, for now we just implement the one that takes the port number from DX instead of encoding it in the opcode.
* sys/linux: executor: implement SYZOS_API_WR_DRN on x86Alexander Potapenko2025-09-191-0/+45
| | | | | Add a SYZOS call to write to one of the debug registers (DR0-DR7).
* executor: sys/linux/: pkg/runtest: pkg/vminfo: add syz_kvm_assert_syzos_kvm_exitAlexander Potapenko2025-09-195-1/+35
| | | | Implement a pseudo-syscall to check the value of kvm_run.exit_reason
* executor: introduce __no_stack_protector and use it for guest codeAlexander Potapenko2025-09-113-23/+37
| | | | | | | | | | | When compiling the executor in syz-env-old, -fstack-protector may kick in and introduce global accesses that tools/check-syzos.sh reports. To prevent this, introduce the __no_stack_protector macro attribute that disable stack protection for the function in question, and use it for guest code. While at it, factor out some common definitions into common_kvm_syzos.h
* executor: x86: fix check-syzos errorAlexander Potapenko2025-09-111-14/+16
| | | | | Replace the switch statement in guest_handle_wr_crn() with a series of if statements.
* executor: refactor execute_req parsing to use names for IPC flagsJann Horn2025-09-021-5/+5
| | | | | | This makes it easier to figure out where the flags go by grepping for them by name. No functional change intended.
* executor: move proc opts to a separate structAleksandr Nogikh2025-08-211-36/+41
| | | | This will reduce code duplication and simplify adding new fields.
* executor: arm64: syzos: add flush_cache_range()Alexander Potapenko2025-08-081-3/+32
| | | | | | | | ARMv8-A architecture mandates how caches should be flushed when writing self-modifying code. Although it would be nice to catch some bugs caused by omitting this synchronization, we want it to happen in most cases, so that our code actually works.
* executor: arm64: syzos: fix the constraints in gicv3_cpu_init()Alexander Potapenko2025-08-081-2/+1
| | | | | | Somehow we were using an input constraint instead of an output one in the assembly code performing a read of ICC_SRE_EL1 into a GP register.
* executor: arm64: syzos: delete clobbers from one_irq_handler_fn()Alexander Potapenko2025-08-081-3/+1
| | | | | In fact this function does not clobber any registers, they all are restored. Therefore, just delete the registers from the clobber list.