| Commit message (Collapse) | Author | Age | Files | Lines |
| ... | |
| | |
|
| |
|
|
|
|
| |
Using actual VM indices for VM identification lets us match these indices to VMs in the pool,
makes it possible to use dense arrays to store information about runners (e.g. in queue.Distributor),
and removes string names as unnecessary additional entities.
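A minimal sketch of the dense-array idea, assuming a fixed pool size known up front. `RunnerInfo`, `RunnerTable`, and their fields are hypothetical names for illustration and are not taken from queue.Distributor:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-runner bookkeeping; field names are illustrative only.
struct RunnerInfo {
	bool active = false;
	int executed = 0;
};

// Dense per-VM storage: slot i belongs to the VM with index i in the pool,
// so no string-keyed map is needed.
class RunnerTable {
public:
	explicit RunnerTable(size_t pool_size) : slots_(pool_size) {}

	RunnerInfo& at(size_t vm_index)
	{
		assert(vm_index < slots_.size());
		return slots_[vm_index];
	}

	size_t size() const { return slots_.size(); }

private:
	std::vector<RunnerInfo> slots_;
};
```

A lookup such as `table.at(2)` is then a plain array access, and the index is the same one the pool uses.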
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently we force restart in rpcserver, but this has 2 problems:
1. It does not know the proc where the request will land.
2. It does not take into account if the proc has already restarted
recently for other reasons.
Restart procs in executor only if they haven't restarted recently.
Also make it deterministic. Given all other randomness we have,
there does not seem to be a reason to use randomized restarts
and restart after fewer/more runs.
Also restart only after corpus triage.
Corpus triage is slow already and there does not seem to be enough
benefit to restart during corpus triage.
Also restart at most 1 proc at a time,
since there is a lot of serial work in the kernel.
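The rules above can be sketched as a single predicate. This is an illustration under assumptions, not the executor's code: `ProcState`, `RestartPolicy`, and the `restart_every` interval are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical per-proc counter: executions since the proc last restarted
// (for any reason).
struct ProcState {
	uint64_t execs_since_restart = 0;
};

// Hypothetical policy state shared by all procs.
struct RestartPolicy {
	uint64_t restart_every = 0;       // fixed interval, no randomization
	bool corpus_triaged = false;      // restarts are allowed only after triage
	bool restart_in_progress = false; // at most one proc restarts at a time

	bool ShouldRestart(const ProcState& proc) const
	{
		if (!corpus_triaged)
			return false;
		if (restart_in_progress)
			return false;
		// A proc that restarted recently has a low counter and is skipped.
		return proc.execs_since_restart >= restart_every;
	}
};
```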
|
| |
|
|
| |
Distribute triage requests to different VMs.
|
| |
|
|
| |
ConnectWait is directly from OpenBSD man page.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Allow guest payload to call syzos API functions. The available calls
are enumerated by SYZOS_API_* constants, and have a form of:
  struct api_call {
    uint64 call;
    uint64 struct_size;
    /* arbitrary call-related data here */
  };
Complex instruction sequences are too easy to break, so most of the time
fuzzer won't be able to efficiently mutate them.
We replace kvm_text_arm64 with a sequence of `struct api_call`, making it
possible to intermix assembly instructions (SYZOS_API_CODE) with
higher-level constructs.
Right now the supported calls are:
- SYZOS_API_UEXIT - abort from KVM_RUN (1 argument: exit code, uint64)
- SYZOS_API_CODE - execute an ARM64 assembly blob
(1 argument: inline array of int32's)
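A hedged sketch of how a sequence of `struct api_call` records could be walked. It assumes that `struct_size` counts the header plus the payload, which the message does not state. The numeric IDs are placeholders because the real SYZOS_API_* values are not given here, and `WalkCalls` and `AppendCall` are hypothetical helpers:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Header layout from the commit message (uint64 spelled as uint64_t).
struct api_call {
	uint64_t call;
	uint64_t struct_size; // assumed: header size + payload size
};

// Placeholder IDs, not the real SYZOS_API_* constants.
enum : uint64_t { kApiUexit = 0, kApiCode = 1 };

// Returns the call IDs in order, stopping at the first malformed record.
std::vector<uint64_t> WalkCalls(const uint8_t* buf, size_t size)
{
	std::vector<uint64_t> ids;
	size_t pos = 0;
	while (size - pos >= sizeof(api_call)) {
		api_call hdr;
		memcpy(&hdr, buf + pos, sizeof(hdr));
		if (hdr.struct_size < sizeof(api_call) || hdr.struct_size > size - pos)
			break;
		ids.push_back(hdr.call);
		pos += static_cast<size_t>(hdr.struct_size);
	}
	return ids;
}

// Test helper: appends one record with the given payload bytes.
void AppendCall(std::vector<uint8_t>& buf, uint64_t call, const std::vector<uint8_t>& payload)
{
	api_call hdr;
	hdr.call = call;
	hdr.struct_size = static_cast<uint64_t>(sizeof(api_call) + payload.size());
	const uint8_t* p = reinterpret_cast<const uint8_t*>(&hdr);
	buf.insert(buf.end(), p, p + sizeof(hdr));
	buf.insert(buf.end(), payload.begin(), payload.end());
}
```

Because each record carries its own size, a mutator can insert, drop, or reorder whole records without having to understand the instruction bytes inside a code record.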
|
| |
|
|
| |
Do not report errors when a function name contains '[_]exit' as a substring.
|
| |
|
|
|
|
|
|
|
|
| |
For KVM fuzzing we are going to need some library code that will be
running inside KVM to perform common tasks (e.g. register accesses,
device setup etc.)
This code will reside in a special ".guest" section that the executor
will map at address 0xeeee8000. For now it contains just the main function,
but will be extended in further patches.
|
| |
|
|
|
|
|
| |
Refactor phys page allocation in syz_kvm_setup_cpu$arm64 to prepare for
more address ranges.
Load user-supplied code at ARM64_ADDR_USER_CODE and allocate EL1 stack
at ARM64_ADDR_EL1_STACK_BOTTOM.
|
| | |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Running the vusb_ath9k runtest (with [1] and [2] applied) produces ~100k of
extra coverage, which is somewhat close to the current 256k limit. A more
complicated program might produce more extra coverage and overflow the
coverage buffer.
Increase kExtraCoverSize to 1024k.
As the extra coverage buffer is maintained per-executor and not per-thread,
the total increase of the coverage mapping is ~9%, which is not too bad.
[1] https://lore.kernel.org/all/eaf54b8634970b73552dcd38bf9be6ef55238c10.1718092070.git.dvyukov@google.com/
[2] https://lore.kernel.org/all/20240722223726.194658-1-andrey.konovalov@linux.dev/T/#u
|
| |
|
|
|
|
|
|
| |
We never reset remote coverage, so if there is one block,
we will write it after every call and multiple times at the end.
It can lead to "too many calls in output" and writes a quadratic
amount of coverage/signal.
Reset remote coverage after writing.
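The fix can be illustrated with a small model. `RemoteCover` and `FlushRemote` are hypothetical names, and the real buffer layout is not shown in this message:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of the remote coverage buffer.
struct RemoteCover {
	std::vector<uint64_t> pcs; // PCs collected since the last flush
};

// Appends pending remote PCs to the output and resets the buffer, so the
// same block is not written again after the next call. Returns the number
// of PCs written.
size_t FlushRemote(RemoteCover& remote, std::vector<uint64_t>& out)
{
	size_t n = remote.pcs.size();
	out.insert(out.end(), remote.pcs.begin(), remote.pcs.end());
	remote.pcs.clear(); // the reset that was missing
	return n;
}
```

Without the `clear()`, every flush would rewrite everything collected so far, which is where the quadratic growth comes from.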
|
| |
|
|
| |
Check that we have at least the command argument in the beginning.
|
| |
|
|
|
|
|
|
|
| |
We are getting too many generated candidates, the fuzzer may not keep up
with them at all (hints jobs keep growing infinitely). If a hint indeed came
from the input w/o transformation, then we should guess it on the first
attempt (or at least after a few attempts). If it did not come from the input,
or came with a non-trivial transformation, then any number of attempts won't
help. So limit the total number of attempts (until the next restart).
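One way to picture such a limit is a simple attempt budget. `HintBudget` is a hypothetical name, and the actual limit value is not given in this message:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical budget for hint-derived candidates.
class HintBudget {
public:
	explicit HintBudget(uint64_t limit) : limit_(limit) {}

	// Consumes one attempt and returns true while budget remains.
	bool TryAcquire()
	{
		if (used_ >= limit_)
			return false;
		used_++;
		return true;
	}

	// The limit only holds "until the next restart".
	void Reset() { used_ = 0; }

private:
	uint64_t limit_;
	uint64_t used_ = 0;
};
```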
|
| |
|
|
|
|
|
|
|
|
|
| |
This kind of deduplication is confusing for the fuzzer, which expects to
control the process itself (by MaxSignal and by specifying the calls for
which full signal must be returned).
There's also a chance that it may contribute to the difficulties during
program triage and minimization.
Let's err on the safe side and deduplicate signal only per-call.
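A minimal sketch of per-call deduplication, assuming signal is a list of 32-bit values per call. `Signal` and `DedupPerCall` are hypothetical names:

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_set>
#include <vector>

using Signal = std::vector<uint32_t>;

// Removes duplicates within each call only. A value already seen in an
// earlier call is still reported for a later call.
std::vector<Signal> DedupPerCall(const std::vector<Signal>& calls)
{
	std::vector<Signal> out;
	for (const Signal& call : calls) {
		std::unordered_set<uint32_t> seen; // fresh set for every call
		Signal unique;
		for (uint32_t s : call) {
			if (seen.insert(s).second)
				unique.push_back(s);
		}
		out.push_back(unique);
	}
	return out;
}
```

With this shape the caller still sees, per call, everything that call produced, so it can make its own decisions about what is new.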
|
| |
|
|
|
| |
In case only ipv6 is supported, we should try ipv4-localhost first and see if it
fails, and then go on to trying ipv6.
|
| |
|
|
|
|
|
| |
It should fix errors like this one:
SYZFAIL: failed to resolve manager addr
addr=localhost h_errno=2
(errno 11: Resource temporarily unavailable
|
| |
|
|
|
| |
See commit bc144f9a58782daa2399d417b56aad80e82a219e. The justification
applies to other BSDs as well, so apply the same workaround.
|
| |
|
|
|
| |
There are also synchronous fatal signals that can happen due to bugs
in executor code. So handle them as SIGSEGV.
|
| |
|
|
|
|
|
| |
cad_pid must not point to a persistent runner process,
b/c it will be killed on ctrl+alt+del.
Fixes #5027
|
| |
|
|
|
| |
This will allow reusing the finish_output function for snapshot mode as well.
NFC
|
| | |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Unlike Linux, the BSDs used to check the result of setsid.
This suddenly became a problem a couple of weeks ago. It's hard
to figure out why because there were a number of problems in the
area preventing the test from working:
gmake executor execprog && \
./bin/openbsd_amd64/syz-execprog -stress -executor ./bin/openbsd_amd64/syz-executor
At least with this change the test above successfully executes some
coverage and exits cleanly.
|
| | |
|
| |
|
|
|
|
|
| |
Handle EINTR errors. Sometimes I see them happening when running in debug mode.
Before the previous commit, each such error was printed to output and detected
as a bug. Without debug these should be retried by restarting the process,
but still better to handle w/o restarting the process (may be expensive).
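The usual way to handle EINTR in place is a retry loop around the interrupted call. The message does not say which call was affected, so `read` is used here purely as an illustration, and `ReadRetry` is a hypothetical helper name:

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <unistd.h>

// Retries read() while it fails with EINTR. Any other result (success,
// end of file, or a different error) is returned to the caller unchanged.
ssize_t ReadRetry(int fd, void* buf, size_t len)
{
	for (;;) {
		ssize_t n = read(fd, buf, len);
		if (n >= 0 || errno != EINTR)
			return n;
	}
}
```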
|
| |
|
|
|
| |
Don't print SYZFAIL messages during machine check.
Otherwise each of them is detected as a bug.
|
| |
|
|
|
|
|
|
|
|
| |
mount() in gVisor returns EFAULT if source is NULL. It is a gVisor issue
and we will fix it. Let's explicitly set a string source for the proc
mount to unblock gVisor jobs. The source string will additionally be
useful for troubleshooting mount-related problems in the future, because
it is shown in /proc/pid/mountinfo.
Signed-off-by: Andrei Vagin <avagin@google.com>
|
| |
|
|
|
|
|
| |
Android sets fs.mount-max to 100, making it impossible to create new chroots.
Relax the limit, setting it to a value used on desktops.
Tracking bug: https://github.com/google/syzkaller/issues/4972
|
| |
|
|
|
|
|
| |
Signal rotation is intended to make the fuzzer re-discover flaky coverage
in a non-flaky way. However, taking into account that we get effectively
the same effect after each manager restart, and that the fuzzer is overloaded
with triage/smash jobs, it does not look to be worth it.
|
| |
|
|
|
|
|
|
|
|
|
| |
To prevent the executor from accidentally making the whole root file system
immutable (which breaks fuzzing), modify sandbox=none to create a tmpfs mount
and chroot into it before executing programs in a process.
According to `syz-manager -mode=smoke-test`, the number of enabled syscalls on
x86 doesn't change with this patch.
Fixes #4939, #2933, #971.
|
| | |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We see some errors of the form:
SYZFAIL: coverage filter is full
pc=0x80007000c0008 regions=[0xffffffffbfffffff 0x243fffffff 0x143fffffff 0xc3fffffff] alloc=156
Executor shouldn't send non kernel addresses in signal,
but somehow it does. It can happen if the VM memory is corrupted,
or if the test program does something very nasty (e.g. discovers
the output region and writes to it).
It's not possible to reliably filter signal in the tested VM.
Move all of the filtering logic to the host.
Fixes #4942
|
| |
|
|
|
|
| |
SIGBUS means OOM on Linux.
Most of the crashes that happen during fuzzing are SIGBUS,
so separate them from SIGSEGV and suppress.
|
| |
|
|
|
|
|
|
|
|
| |
Flatbuffers represents each scalar in little-endian format
(https://flatbuffers.dev/flatbuffers_internals.html). Therefore,
the size of the received root table must be converted to the host endianness
format before its first usage.
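A portable way to do such a conversion is to assemble the value from individual bytes, which gives the same result on little-endian and big-endian hosts. This sketch assumes a 4-byte size field; `LoadLE32` is a hypothetical helper, not a FlatBuffers API:

```cpp
#include <cassert>
#include <cstdint>

// Decodes a 4-byte little-endian value independent of host endianness.
uint32_t LoadLE32(const uint8_t* p)
{
	return static_cast<uint32_t>(p[0]) |
	       (static_cast<uint32_t>(p[1]) << 8) |
	       (static_cast<uint32_t>(p[2]) << 16) |
	       (static_cast<uint32_t>(p[3]) << 24);
}
```

Reading the raw bytes as a native `uint32_t` would only give the right size on little-endian hosts, which is why the bug shows up on big-endian machines.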
Signed-off-by: Alexander Egorenkov <eaibmz@gmail.com>
Fixes: e16e2c9a4cb6 ("executor: add runner mode")
|
| |
|
|
|
|
| |
It's a more general name that says what happened
rather than a detail of what executor should do.
We can use this notification for other things as well.
|
| |
|
|
| |
This allows enabling the test executor with coverage.
|
| |
|
|
|
|
| |
Currently we always write PCs into the buffer even if tracing comparisons.
Such bogus data will fail comparison consistency checks (type/pc)
and executor will crash. Don't trace PCs as comparisons.
|
| |
|
|
|
|
|
|
|
|
|
| |
There is a quirk related to posix_spawn_file_actions_adddup2:
it just executes the specified dup's in order in the child process.
In our case we do dups as follows:
20 -> 4 (output region)
4 -> 5 (max signal)
So we dup the output region onto 4 first, and then dup the same output region
(fd 4 becomes the output region) onto 5 (max signal).
So we have output region as both output region and max signal.
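The quirk can be reproduced with a small model of a file descriptor table. This is a simulation, not real `posix_spawn` code, and `FdTable` and `ApplyDups` are hypothetical names. The reordering shown in the usage note is only one possible way to avoid the overlap and is an assumption, not necessarily what the commit does:

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Models which object each fd refers to.
using FdTable = std::map<int, std::string>;

// Applies (from, to) dup2 actions strictly in order, the way the child
// process executes them.
FdTable ApplyDups(FdTable fds, const std::vector<std::pair<int, int>>& dups)
{
	for (const auto& d : dups)
		fds[d.second] = fds.at(d.first);
	return fds;
}
```

Starting from `{20: output, 4: max signal}`, the order `20 -> 4, 4 -> 5` leaves both fd 4 and fd 5 pointing at the output region, while `4 -> 5, 20 -> 4` yields the intended layout in this model.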
|
| |
|
|
|
| |
Fail some features in various ways for test OS,
and check that features are detected properly.
|
| |
|
|
|
|
|
| |
Coverage setup fails with exitf if not supported.
Currently we consider it a transient error that needs to be retried.
As a result we reach 20 attempts and crash the VM.
Return an error in such a case instead.
|
| |
|
|
|
| |
Somehow it's very slow in syzbot arm64 image.
This speeds up pkg/runtest tests about a hundredfold.
|
| |
|
|
|
|
|
| |
Otherwise we may leave orphaned executor process children, which prevent
the cleanup of the executor directory.
Closes #4920.
|
| |
|
|
|
|
|
| |
FreeBSD says:
executor/conn.h:100:3: error: unknown type name 'sockaddr_in'; did you mean 'sockaddr'?
sockaddr_in saddr4 = {};
|
| |
|
|
|
|
|
|
| |
OpenBSD says:
executor/executor_runner.h:750:51: error: no member named 'uc_mcontext' in 'sigcontext'
auto& mctx = static_cast<ucontext_t*>(ucontext)->uc_mcontext;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^
|
| |
|
|
| |
Otherwise, it fails with "SYZFAIL: failed to parse manager port".
|
| |
|
|
| |
OpenBSD has neither fallocate nor posix_fallocate.
|
| |
|
|
|
|
| |
We include a number of C++ headers in the runner.
On FreeBSD some of them mention malloc, and our defines break the build.
Use the style test to check only our files for these things.
|
| | |
|
| |
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit 215eef4ad85fb6124af70d1e5c9729b69554a32b.
The gvisor "stdin" address still crashes in executor
Connection::Connect on atoi(ports) with ports == NULL.
The gvisor "stdin" address is not tested, so it's better to make it less
special rather than add more special cases in manager, executor,
and now also in Connection to handle it.
It may still crash in the future after some changes.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
My gcc-10 in the testing VM complains during the reproducer [0] build with the
following error:
rep.c: In function ‘remove_dir’:
rep.c:662:3: error: a label can only be part of a statement and a declaration is not a statement
662 | const int umount_flags = MNT_FORCE | UMOUNT_NOFOLLOW;
| ^~~~~
A label followed by a declaration is a C23 extension, so only new compilers
support it.
Fix it by moving the declaration above the `retry` label and adding the unused
attribute to suppress a possible warning.
[0] https://syzkaller.appspot.com/bug?extid=dcc068159182a4c31ca3
Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
|