path: root/executor/executor.cc
Commit message | Author | Age | Files | Lines
* executor: fix MAP_FIXED_NOREPLACE dependency | Taras Madan | 2025-02-11 | 1 | -0/+3
    Some environments don't define MAP_FIXED_NOREPLACE.
* executor: favor MAP_FIXED_NOREPLACE over MAP_FIXED | Aleksandr Nogikh | 2025-02-04 | 1 | -0/+8
    MAP_FIXED_NOREPLACE allows us to fail early if we happen to overlap with an existing memory mapping. It should help detect bug #5674 at an earlier stage, before it leads to memory corruption.
    MAP_FIXED_NOREPLACE is supported from Linux 4.17, which is okay for all syzkaller use cases on syzbot.
    There's no such option for some of the supported OSes, so set it depending on the configuration we're building for.
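The flag selection described in the two commits above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from executor.cc: the names `kFixedFlag` and `map_fixed` are hypothetical, and the sketch assumes a Linux-style `mmap` with `MAP_ANONYMOUS`.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <sys/mman.h>

// Pick the strictest fixed-mapping flag the build environment defines.
// Some environments (older headers, other OSes) lack MAP_FIXED_NOREPLACE.
#if defined(MAP_FIXED_NOREPLACE)
static const int kFixedFlag = MAP_FIXED_NOREPLACE;
#else
static const int kFixedFlag = MAP_FIXED;
#endif

// Hypothetical helper: map `size` anonymous bytes exactly at `addr`.
// Returns MAP_FAILED if the kernel refuses or places the mapping elsewhere.
static void* map_fixed(void* addr, size_t size)
{
	assert(addr != nullptr && size > 0);
	void* p = mmap(addr, size, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS | kFixedFlag, -1, 0);
	// Kernels older than 4.17 ignore MAP_FIXED_NOREPLACE and treat `addr`
	// as a hint, so the returned address has to be checked explicitly.
	if (p != MAP_FAILED && p != addr) {
		munmap(p, size);
		errno = EEXIST;
		return MAP_FAILED;
	}
	return p;
}
```

With MAP_FIXED_NOREPLACE, an overlap with an existing mapping shows up as an immediate `mmap` failure instead of a silently replaced mapping.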
* executor: query globs in the test program context | Dmitry Vyukov | 2024-12-11 | 1 | -4/+36
    We query globs for 2 reasons:
    1. Expand glob types in syscall descriptions.
    2. Dynamic file probing for automatic descriptions generation.
    In both of these contexts we are interested in files that will be present during test program execution (rather than normal unsandboxed execution). For example, some files may not be accessible to test programs after pivot root. On the other hand, we create and link some additional files for the test program that don't normally exist.
    Add a new request type for querying of globs that are executed in the test program context.
* executor: don't revert coverage order | Dmitry Vyukov | 2024-11-26 | 1 | -1/+2
    Currently we write coverage backwards. This is visible e.g. when running syz-execprog -coverfile, and in the manager raw cover mode. Write it in the right order.
* executor: increase coverage buffer size | Dmitry Vyukov | 2024-11-20 | 1 | -7/+6
    The coverage buffer frequently overflows. We cannot increase it radically because the buffers consume lots of memory (num procs x num kcovs x buffer size) and lead to OOM kills (at least with 8 procs and 2GB KASAN VM). So increase it 2x and slightly reduce the number of threads/kcov descriptors. However, in snapshot mode we can be more aggressive (only 1 proc).
    This reduces the number of overflows by ~2-4x depending on syscall.
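The memory formula in this message (num procs x num kcovs x buffer size) can be written out as a small compile-time calculation. The function name and the concrete numbers below are hypothetical and chosen only to show the arithmetic; they are not the values used by the executor.

```cpp
#include <cstdint>

// Total memory consumed by coverage buffers:
// one buffer per kcov descriptor, several descriptors per proc.
constexpr uint64_t cover_memory_bytes(uint64_t procs, uint64_t kcovs_per_proc,
				      uint64_t buffer_bytes)
{
	return procs * kcovs_per_proc * buffer_bytes;
}

// Hypothetical example: 8 procs x 16 kcov descriptors x 2 MiB buffers
// already amounts to 256 MiB, a large share of a 2GB VM.
static_assert(cover_memory_bytes(8, 16, 2ull << 20) == (256ull << 20),
	      "8 * 16 * 2 MiB == 256 MiB");
```

Doubling the buffer size doubles the total, which is why the commit pairs the increase with fewer threads/kcov descriptors.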
* pkg/manager: show number of times coverage for each call has overflowed | Dmitry Vyukov | 2024-11-20 | 1 | -0/+5
    If the overflows happen often, it's bad. Add visibility into this.
* executor: better handling for hanged test processes | Dmitry Vyukov | 2024-10-24 | 1 | -3/+4
    Currently we kill hanged processes and consider the corresponding test finished. We don't kill/wait for the actual test subprocess (we don't know its pid to kill, and waiting will presumably hang). This has 2 problems:
    1. If the hanged process causes a "task hung" report, we can't reproduce it, since the test finished too long ago (manager thinks it's finished and discards the request).
    2. The test process still consumes per-pid resources.
    Explicitly detect and handle such cases: Manager keeps these hanged tests forever, and we assign a new proc id for future processes (don't reuse the hanged one).
* executor: fix corner case of misinterpreting comparison data | Dmitry Vyukov | 2024-08-28 | 1 | -0/+12
    Reset coverage right before scheduling next syscall for execution. See the added comment for details.
* executor: protect kcov/output regions with pkeys | Dmitry Vyukov | 2024-08-16 | 1 | -10/+41
    Protect KCOV regions with pkeys if they are available. Protect output region with pkeys in snapshot mode. Snapshot mode is especially sensitive to output buffer corruption since its location is not randomized.
* executor: set process name before taking snapshot | Dmitry Vyukov | 2024-08-13 | 1 | -7/+6
    It's not necessary to set process name in snapshot mode since we execute only 1 program each time.
* executor: fix coverage collection in snapshot mode | Dmitry Vyukov | 2024-08-06 | 1 | -3/+5
    Fixes #5143
* pkg/fuzzer: try to triage on different VMs | Dmitry Vyukov | 2024-08-02 | 1 | -1/+1
    Distribute triage requests to different VMs.
* all: add qemu snapshotting mode | Dmitry Vyukov | 2024-07-25 | 1 | -34/+62
* executor: fix writing of remote coverage | Dmitry Vyukov | 2024-07-22 | 1 | -1/+4
    We never reset remote coverage, so if there is one block, we will write it after every call and multiple times at the end. It can lead to "too many calls in output" and just writes a quadratic amount of coverage/signal. Reset remote coverage after writing.
* executor: refactor argument parsing | Dmitry Vyukov | 2024-07-22 | 1 | -4/+8
    Check that we have at least the command argument in the beginning.
* prog: restricts hints to at most 10 attempts per single kernel PC | Dmitry Vyukov | 2024-07-22 | 1 | -16/+10
    We are getting too many generated candidates, the fuzzer may not keep up with them at all (hints jobs keep growing infinitely). If a hint indeed came from the input w/o transformation, then we should guess it on the first attempt (or at least after a few attempts). If it did not come from the input, or came with a non-trivial transformation, then any number of attempts won't help. So limit the total number of attempts (until the next restart).
* executor: deduplicate signal per-call | Aleksandr Nogikh | 2024-07-18 | 1 | -13/+15
    This kind of deduplication is confusing for the fuzzer, which expects to control the process itself (by MaxSignal and by specifying the calls for which full signal must be returned).
    There's also a chance that it may contribute to the difficulties during program triage and minimization.
    Let's err on the safe side and deduplicate signal only per-call.
* executor: factor output finishing into separate function | Dmitry Vyukov | 2024-07-11 | 1 | -0/+43
    This will allow reusing the finish_output function for snapshot mode as well. NFC
* executor: handle EINTR when reading from control pipe | Dmitry Vyukov | 2024-07-08 | 1 | -3/+3
    Handle EINTR errors. Sometimes I see them happening when running in debug mode. Before the previous commit, each such error was printed to output and detected as a bug. Without debug these should be retried by restarting the process, but it is still better to handle them w/o restarting the process (which may be expensive).
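Retrying an interrupted `read` is a standard POSIX pattern. The sketch below shows one common shape of it; the helper name `read_full` is hypothetical and this is not the executor's actual control-pipe code.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <unistd.h>

// Hypothetical helper: read exactly `size` bytes from `fd`.
// A read interrupted by a signal (EINTR) is retried instead of
// being reported as an error. Returns false on EOF or any other error.
static bool read_full(int fd, void* buf, size_t size)
{
	assert(buf != nullptr);
	char* p = static_cast<char*>(buf);
	while (size > 0) {
		ssize_t n = read(fd, p, size);
		if (n < 0) {
			if (errno == EINTR)
				continue; // interrupted before any data arrived: retry
			return false;
		}
		if (n == 0)
			return false; // unexpected end of stream
		p += n;
		size -= static_cast<size_t>(n);
	}
	return true;
}
```

Handling EINTR in place avoids tearing down and restarting the whole process for what is only a transient condition.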
* pkg/mgrconfig: allow to disable remote coverage and coverage edges | Dmitry Vyukov | 2024-07-02 | 1 | -7/+9
* pkg/rpcserver: move kernel test/data range checks from executor | Dmitry Vyukov | 2024-07-01 | 1 | -49/+34
    We see some errors of the form:

    SYZFAIL: coverage filter is full pc=0x80007000c0008 regions=[0xffffffffbfffffff 0x243fffffff 0x143fffffff 0xc3fffffff] alloc=156

    Executor shouldn't send non kernel addresses in signal, but somehow it does. It can happen if the VM memory is corrupted, or if the test program does something very nasty (e.g. discovers the output region and writes to it). It's not possible to reliably filter signal in the tested VM. Move all of the filtering logic to the host.
    Fixes #4942
* executor: don't trace PCs as comparisons | Dmitry Vyukov | 2024-06-28 | 1 | -0/+2
    Currently we always write PCs into the buffer even if tracing comparisons. Such bogus data will fail comparison consistency checks (type/pc) and executor will crash. Don't trace PCs as comparisons.
* executor: prohibit malloc/calloc via linter | Dmitry Vyukov | 2024-06-25 | 1 | -13/+1
    We include a number of C++ headers in the runner. On FreeBSD some of them mention malloc, and our defines break the build. Use the style test to check only our files for these things.
* executor: add runner mode | Dmitry Vyukov | 2024-06-24 | 1 | -358/+377
    Move all syz-fuzzer logic into syz-executor and remove syz-fuzzer. Also restore syz-runtest functionality in the manager.
    Update #4917 (sets most signal handlers to SIG_IGN)
* executor: refactor coverage filter | Dmitry Vyukov | 2024-06-24 | 1 | -19/+32
* executor: fix compiler warnings in 32-bit mode | Alexander Egorenkov | 2024-06-13 | 1 | -2/+2
    executor/executor.cc: In function ‘uint64 read_input(uint8**, bool)’:
    executor/executor.cc:1487:59: error: format ‘%zu’ expects argument of type ‘size_t’, but argument 3 has type ‘int’ [-Werror=format=]
    executor/executor.cc:1495:67: error: format ‘%zu’ expects argument of type ‘size_t’, but argument 3 has type ‘int’ [-Werror=format=]
    Signed-off-by: Alexander Egorenkov <eaibmz@gmail.com>
* executor: fix extraction of number of KCOV comparisons from coverage data | Alexander Egorenkov | 2024-06-12 | 1 | -3/+3
    KCOV always stores the number of KCOV comparisons as a 64-bit integer at offset 0 of the coverage buffer. Don't use the field size of the coverage object, which is initialized in cover_collect() and depends on kernel bitness, because this field is intended only for KCOV PC coverage and not for KCOV comparisons.
    Signed-off-by: Alexander Egorenkov <eaibmz@gmail.com>
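The point of this fix, that the comparison count is always a 64-bit word at offset 0 regardless of kernel bitness, can be illustrated with a small sketch. The function name `kcov_cmp_count` is hypothetical and the code is not taken from executor.cc.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: return the number of comparison records stored
// in a KCOV comparison buffer. The count is read as a fixed 64-bit value
// at offset 0, not with a width derived from the kernel's pointer size.
static uint64_t kcov_cmp_count(const void* cover_data)
{
	assert(cover_data != nullptr);
	uint64_t count;
	// memcpy avoids alignment and aliasing problems when reading the word.
	memcpy(&count, cover_data, sizeof(count));
	return count;
}
```

Reading the same field with a 32-bit width happens to return the right value for small counts on little-endian machines, but it returns the wrong half of the word on big-endian ones.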
* executor: ignore kernel text addresses in comparisons | Dmitry Vyukov | 2024-06-11 | 1 | -2/+2
    We ignore comparisons of kernel data/physical addresses b/c these are not coming from user space. Ignore kernel text addresses for the same reason.
* executor: factor out is_kernel_pc helper | Dmitry Vyukov | 2024-06-11 | 1 | -17/+17
    Factor out is_kernel_pc helper and add kernel pc range for test OS for testing.
* executor: add end-to-end coverage/signal/comparisons test | Dmitry Vyukov | 2024-06-11 | 1 | -13/+7
* executor: map input buffer as shared | Cameron Finucane | 2024-06-11 | 1 | -1/+1
    To receive data, executor relies on changes propagating to its copy of the shared memory buffer. This is only guaranteed with MAP_SHARED, whereas behavior is "unspecified" for MAP_PRIVATE (but happened to work on most implementations).
* executor: allow to run a single test | Dmitry Vyukov | 2024-06-05 | 1 | -2/+2
* executor: remove noshmem mode | Dmitry Vyukov | 2024-06-04 | 1 | -66/+2
    All OSes we have now support shmem. Support for Fuchsia/Starnix/Windows wasn't implemented, but generally they support shared memory. Remove all of the complexity and code associated with noshmem mode. If/when we revive these OSes, it's easier to properly implement shmem mode for them.
* executor: repair asan build | Dmitry Vyukov | 2024-06-04 | 1 | -12/+29
    Asan build with shared memory mode has been broken for a long time since the address for the output region is incompatible with asan (asan doesn't have shadow for these addresses). We did not notice it b/c we only tested no shared memory mode in the short test mode used on CI. Don't use fixed mmap address under asan.
* executor: fix gvisor signal | Dmitry Vyukov | 2024-06-03 | 1 | -4/+4
    Fix 2 bugs:
    1. We remove low 12 bits of every PC on amd64 b/c use_cover_edges returns true. This results in extremely low signal (gvisor PCs are dense integers).
    2. We hash prev/next PC on arm64 which does not make sense since gvisor coverage is not a trace. This results in falsely large signal.
* executor: rework feature setup | Dmitry Vyukov | 2024-06-03 | 1 | -6/+8
    Return failure reason from setup functions rather than crash. This will provide better error messages, but also allow setup w/o creating subprocesses which will be needed when we combine fuzzer and executor.
    Also close all resources created during setup. This is also useful for in-process setup, but also should improve chances of reproducing a bug with C reproducer. Currently leaked file descriptors may disturb repro execution (e.g. it may act on a wrong fd).
* prog: introduce a remote_cover call attribute | Aleksandr Nogikh | 2024-05-27 | 1 | -4/+2
    Update the descriptions to mark calls that cause remote coverage collection. Remove some hacky code from the executor.
* executor: always send 64bit pc and sig | Joey Jiao | 2024-05-27 | 1 | -6/+5
    On a 64-bit machine with CONFIG_RANDOMIZE_BASE enabled, even bits [32:64] change across reboots. Also, the core kernel and modules can differ in bits [31:64].
    We need to add 64-bit pc support, so always send 64-bit pc and sig to syz-fuzzer.
    Sending 64-bit pc and sig is compatible with a 32-bit OS.
* pkg/vminfo: move feature checking to host | Dmitry Vyukov | 2024-05-15 | 1 | -18/+26
    Feature checking procedure is split into 2 phases:
    1. syz-fuzzer invokes "syz-executor setup feature" for each feature one-by-one, and checks if executor does not fail. Executor can also return a special "this feature does not need custom setup", this allows to not call setup of these features in each new VM.
    2. pkg/vminfo runs a simple program with ipc.ExecOpts specific for a concrete feature, e.g. for wifi injection it will try to run a program with wifi feature enabled, if setup of the feature fails, executor should also exit with an error. For coverage features we also additionally check that we actually got coverage.
    Then pkg/vminfo combines results of these 2 checks into final result.
    syz-execprog now also uses vminfo package and mimics the same checking procedure.
    Update #1541
* executor: make flatrpc build for C++ | Dmitry Vyukov | 2024-05-03 | 1 | -0/+3
* pkg/vminfo: check enabled syscalls on the host | Dmitry Vyukov | 2024-05-02 | 1 | -2/+0
    Move the syscall checking logic to the host.
    Diffing sets of disabled syscalls before/after this change in different configurations (none/setuid sandboxes, amd64/386 arches, large/small kernel configs) shows only some improvements/bug fixes.
    1. socket$inet[6]_icmp are now enabled. Previously they were disabled due to net.ipv4.ping_group_range sysctl in the init namespace which prevented creation of ping sockets. In the new net namespace the sysctl gets default value which allows creation.
    2. get_thread_area and set_thread_area are now disabled on amd64. They are available only in 32-bit mode, but they are present in /proc/kallsyms, so we enabled them always.
    3. socket$bt_{bnep, cmtp, hidp, rfcomm} are now disabled. They cannot be created in non init net namespace. bt_sock_create() checks init_net and returns EAFNOSUPPORT immediately. This is a bug in descriptions we need to fix. Now we see it due to more precise checks.
    4. fstat64/fstatat64/lstat64/stat64 are now enabled in 32-bit mode. They are not present in /proc/kallsyms as syscalls, so we have not enabled them. But they are available in 32-bit mode.
    5. 78 openat variants + 10 socket variants + mount are now disabled with setuid sandbox. They are not permitted w/o root permissions, but we ignored that. This additionally leads to 700 transitively disabled syscalls.
    In all cases checking in the actual executor context/sandbox looks very positive, esp. for more restrictive sandboxes. Android sandbox should benefit as well.
    The additional benefit is full testability of the new code. The change includes only a basic test that covers all checks, and ensures the code does not crash/hang, all generated programs parse successfully, etc. But it's possible to unit-test every condition now.
    The new version also parallelizes checking across VMs, checking on a slow emulated qemu drops from 210 seconds to 140 seconds.
* pkg/ipc: consistently set ENOSYS for non-executed syscalls | Dmitry Vyukov | 2024-05-02 | 1 | -1/+1
    Currently we set errno=999 in executor for non-finished syscalls, but syscalls that were not even started still have errno=0. They also don't have Executed flag, but it's still handy to have a non-0 errno when the call is not successful.
* prog: include number of calls into exec encoding | Dmitry Vyukov | 2024-04-16 | 1 | -0/+1
    Prepend total number of calls to the exec encoding. This will allow pkg/ipc to better parse executor response without full parsing of the encoded program.
* prog: more compact exec encoding for addresses | Dmitry Vyukov | 2024-04-15 | 1 | -5/+21
    1. Don't write size/flags for addresses.
    2. Write address w/o data offset (fewer bytes in leb128 encoding).
    Median exec size shrinks by 25%:
    - exec sizes: 10%:584 50%:1423 90%:7076
    + exec sizes: 10%:448 50%:1065 90%:6319
* prog: don't pad data in exec encoding | Dmitry Vyukov | 2024-04-15 | 1 | -3/+2
    With leb128 ints it does not make any sense. Reduces exec sizes a bit more:
    - exec sizes: 10%:597 50%:1438 90%:7145
    + exec sizes: 10%:584 50%:1423 90%:7076
* prog: use leb128 for exec encoding | Dmitry Vyukov | 2024-04-15 | 1 | -22/+42
    Switch from uint64 to leb128 encoding for integers. This more than halves serialized size:
    - exec sizes: 10%:2160 50%:4792 90%:14288
    + exec sizes: 10%:597 50%:1438 90%:7145
    and makes it smaller than the text serialization:
    text sizes: 10%:837 50%:1591 90%:10156
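To show why a variable-length integer encoding shrinks the serialized program, here is a minimal unsigned LEB128 sketch. It is an illustration only: the helper names are hypothetical, and the actual exec encoding may use a different variant (for example a signed one).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Append `v` as unsigned LEB128: 7 payload bits per byte,
// with the high bit set on every byte except the last.
static void leb128_put(std::vector<uint8_t>& out, uint64_t v)
{
	while (v >= 0x80) {
		out.push_back(static_cast<uint8_t>((v & 0x7f) | 0x80));
		v >>= 7;
	}
	out.push_back(static_cast<uint8_t>(v));
}

// Decode one unsigned LEB128 value starting at `pos` and advance `pos`.
static uint64_t leb128_get(const std::vector<uint8_t>& in, size_t& pos)
{
	uint64_t v = 0;
	for (int shift = 0; shift < 64; shift += 7) {
		assert(pos < in.size());
		uint8_t b = in[pos++];
		v |= static_cast<uint64_t>(b & 0x7f) << shift;
		if ((b & 0x80) == 0)
			return v;
	}
	assert(false && "leb128 value longer than 10 bytes");
	return 0;
}
```

Small values, which dominate typical programs, take one or two bytes here instead of the eight bytes of a fixed uint64.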
* all: remove akaros support | Dmitry Vyukov | 2024-04-15 | 1 | -17/+0
    Akaros support is unused, it was shut down on syzbot for a while, and the akaros development seems to have been frozen for years as well. We have a bunch of hacks for Akaros since it supported only super old gcc and did not support Go. Remove it.
* executor: prevent netlink_send_ext with dofail=true | Aleksandr Nogikh | 2024-01-05 | 1 | -0/+5
    This should never be happening during fuzzing. Otherwise we let syz-executor silently crash and restart an insane number of times.
* executor: move setup_ext() below other features | Aleksandr Nogikh | 2023-06-15 | 1 | -4/+4
    It makes these extensions much more flexible as they can now also customize what other features set up.
* executor: use exitf instead of fail outside of setup sequence (#3959) | Andrei Vagin | 2023-06-15 | 1 | -4/+4
    We have a long history of executor managing to corrupt itself in various interesting ways (e.g. using read with a pointer pointing to some global/stack variable and then kernel overwrites it). Or rt_sigreturn can corrupt other registers which won't cause immediate SIGSEGV, but rather some random behavior later.
    This is the race we can't win. We can't rely on memory consistency when the test already started, so we should use exitf instead of fail outside of setup sequence (and relying more on unit testing to ensure that executor works as expected for sane programs).
    Suggested-by: Dmitry Vyukov <dvyukov@google.com>
    Signed-off-by: Andrei Vagin <avagin@google.com>