| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
| |
Right now closing a kcov fd on Linux won't disable coverage, so further
attempts to open an fd and enable coverage on the same thread will
not work.
Add cover_close() which will disable the coverage if necessary, and
close the file descriptor.
|
| | |
|
| |
|
|
|
|
|
|
|
|
|
|
| |
On different platforms and in different coverage collection modes
the pointer to the beginning of kcov buffer may or may not differ
from the pointer to the region that mmap() returned.
Decouple these two pointers, so that the memory is always allocated
and deallocated with cov->mmap_alloc_ptr and cov->mmap_alloc_size, and the
buffer is accessed via cov->data and cov->data_size.
I tried my best to not break Darwin and BSD, but I did not test them.
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Add a new vminfo feature, FeatureKcovResetIoctl, that is true if the
kernel supports ioctl(KCOV_RESET_TRACE) making it possible to reset the
coverage buffer on the kernel side. This, in turn, allows us to map the
coverage buffer read-only, which will prevent all sorts of
userspace-generated corruptions at a cost of an extra syscall per program
execution.
The corresponding exec env flag, ExecEnv::ReadOnlyCoverage, turns on
read-only coverage in the executor. It is enabled by default
if FeatureKcovResetIoctl is on.
|
| |
|
|
|
|
|
|
|
| |
clang-tidy-20 generates many more failures, many of which are in the
flartrpc library. Let's disable clang-analyzer-optin.core.EnumCastOutOfRange
for now.
It also complained about PROT_EXEC in the executor, but that is
necessary to support syz_execute_func().
|
| |
|
|
|
|
|
|
|
|
|
|
| |
MAP_FIXED_NOREPLACE allows to fail early if we happened to overlap with
an existing memory mapping. It should help detects bugs #5674 at an
earlier stage, before it led to memory corruptions.
MAP_FIXED_NOREPLACE is supported from Linux 4.17, which is okay for all
syzkaller use cases on syzbot.
There's no such option for some of the supported OSes, so set it
depending on the configuration we're building for.
|
| |
|
|
|
|
|
|
|
|
|
| |
The coverage buffer frequently overflows.
We cannot increase it radically b/c they consume lots of memory
(num procs x num kcovs x buffer size) and lead to OOM kills
(at least with 8 procs and 2GB KASAN VM).
So increase it 2x and slightly reduce number of threads/kcov descriptors.
However, in snapshot mode we can be more aggressive (only 1 proc).
This reduces number of overflows by ~~2-4x depending on syscall.
|
| |
|
|
|
| |
If the overflows happen often, it's bad.
Add visibility into this.
|
| |
|
|
|
|
|
| |
Protect KCOV regions with pkeys if they are available.
Protect output region with pkeys in snapshot mode.
Snapshot mode is especially sensitive to output buffer corruption
since its location is not randomized.
|
| | |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We see some errors of the form:
SYZFAIL: coverage filter is full
pc=0x80007000c0008 regions=[0xffffffffbfffffff 0x243fffffff 0x143fffffff 0xc3fffffff] alloc=156
Executor shouldn't send non kernel addresses in signal,
but somehow it does. It can happen if the VM memory is corrupted,
or if the test program does something very nasty (e.g. discovers
the output region and writes to it).
It's not possible to reliably filter signal in the tested VM.
Move all of the filtering logic to the host.
Fixes #4942
|
| |
|
|
|
|
|
| |
Move all syz-fuzzer logic into syz-executor and remove syz-fuzzer.
Also restore syz-runtest functionality in the manager.
Update #4917 (sets most signal handlers to SIG_IGN)
|
| |
|
|
| |
The address ranges in is_kernel_data/pc are only true for normal Linux.
|
| |
|
|
| |
Factor out is_kernel_pc helper and add kernel pc range for test OS for testing.
|
| | |
|
| |
|
|
|
|
|
| |
Currently we sleep only for 1 ms, which may produce some excessive CPU load
(we usually have 6/8 such processes waiting).
Make it sleep for 10 ms, but also make the sleep return immediately on child exit.
This shuold both improve latency and reduce CPU load.
|
| |
|
|
|
|
|
|
|
| |
All OSes we have now support shmem.
Support for Fuchia/Starnix/Windows wasn't implemented,
but generally they support shared memory.
Remove all of the complexity and code associated with noshmem mode.
If/when we revive these OSes, it's easier to properly
implement shmem mode for them.
|
| |
|
|
|
|
|
|
|
| |
Fix 2 bugs:
1. We remove low 12 bits of every PC on amd64 b/c use_cover_edges return true.
This results in extremly low signal (gvisor PC are dense integers).
2. We hash prev/next PC on arm64 which does not make sense
since gvisor coverage is not a trace. This results in falsely large signal.
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Return failure reason from setup functions rather than crash.
This will provide better error messages, but also allow setup
w/o creating subprocesses which will be needed when we combine
fuzzer and executor.
Also close all resources created during setup.
This is also useful for in-process setup, but also should improve
chances of reproducing a bug with C reproducer. Currently leaked
file descriptors may disturb repro execution (e.g. it may act
on a wrong fd).
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Feature checking procedure is split into 2 phases:
1. syz-fuzzer invokes "syz-executor setup feature" for each feature one-by-one,
and checks if executor does not fail.
Executor can also return a special "this feature does not need custom setup",
this allows to not call setup of these features in each new VM.
2. pkg/vminfo runs a simple program with ipc.ExecOpts specific for a concrete feature,
e.g. for wifi injection it will try to run a program with wifi feature enabled,
if setup of the feature fails, executor should also exit with an error.
For coverage features we also additionally check that we actually got coverage.
Then pkg/vminfo combines results of these 2 checks into final result.
syz-execprog now also uses vminfo package and mimics the same checking procedure.
Update #1541
|
| |
|
|
|
|
|
|
|
|
|
| |
Because the executor may place other mappings next to the buffer used by
kcov, occasional out-of-bound writes to them may corrupt the coverage,
creating garbage PCs (see https://github.com/google/syzkaller/issues/4531).
To prevent those, map two extra pages for the kcov buffer, and protect them,
so that OOB writes cause a segfault.
Fixes https://github.com/google/syzkaller/issues/4532
|
| |
|
|
|
| |
If the feature is supported on the device, allocate a 128MB swap file
after VM boot and activate it.
|
| |
|
|
|
|
|
| |
This commit adds a new VM for fuzzing starnix.
The VM will boot a fuchsia image using the `ffx` tool and will connect to an adb server inside it. Fuzzing will be done using HostFuzzer mode due to some features not being implemented yet in starnix. Once this is possible, fuzzing will be performed without HostFuzzer mode.
Co-authored-by: Juampi Miceli <jpmiceli@google.com>
|
| |
|
|
|
|
| |
A fixed-address mmap can fail completely or return a different address.
Log what it was. Based on:
https://groups.google.com/g/syzkaller/c/lto00RwlDIQ
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As was found out in #2921, fork bombs are still possible in Linux-based
instances. One of the possible reasons is described below.
An invalid stack can be passed to the clone() call, thus causing it to stumble
on an invalid memory access right during returning from the clone() call. This
is in turn catched by the NONFAILING() macro and the control actually jumps
over it and eventually both the child and the parent continue executing the
same code.
Prevent it by handling SIGSEGV and SIGBUS differently during the clone process.
Co-authored-by: Andrei Vagin <avagin@google.com>
|
| |
|
|
|
|
|
|
|
| |
The previous strategy (delay kcov instance creation) seems not to work
very well in carefully sandboxed environments. Let's see if the new
approach is more versatile.
Open a kcov handle for each thread at syz-executor's initialization, but
don't mmap it right away.
|
| |
|
|
|
|
|
|
|
| |
As now kcov instances may get set up during fuzzing, performing dup2 in
cover_open is no longer safe as it may close some important resource.
Prevent that by reserving most of fds that belong to the kcov fds range.
Unfortunately we must duplicate the code because of the way kcov
implementations are organized.
|
| |
|
|
|
|
|
| |
pkg/host.Setup never asks to setup "sysctl" feature explicitly,
sysctl's are assumed to be setup whenever "syz-executor setup" is executed.
Thus "sysctl" does not need to be present in the list of available
things to setup.
|
| |
|
|
|
|
|
|
|
| |
Currently the data_offset field of cover_t is only initialized for
per-syscall coverage collection. As a result, remote coverage is read
from an invalid location, fails to pass sanity checks and is not
returned to syzkaller.
Fix the initialization of cover_t fields.
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Currently all executor fail errors go into "lost connection" bucket.
This is not very useful. First, there are different executor failures.
Second, it's not possible to understand what failures happen how frequently.
Third, there are not authentic lost connection.
Create separate SYZFAIL: bugs for them.
Update #573
Update #502
Update #318
|
| | |
|
| |
|
|
|
| |
kcov_remote_arg was changed to a portable format
so we don't need to handle differences between 64/32-bits anymore.
|
| |
|
|
| |
gvisor coverage is not a trace, so producing edges won't work.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
1. Apply ignore_return to semctl$GETVAL which produces random errno
values on linux and freebsd.
2. Apply ignore_return to prctl and remove the custom code in executor.
3. Remove the custom errno ignoring code in fuchsia executor.
The calls are already marked as ignore_return, so this is just a leftover.
4. Only reset errno for ignore_return.
The syscall can still return a resource (maybe).
We only need to reset errno for fallback coverage.
|
| |
|
|
|
|
| |
Sysctl's are not captured as part of reproducers.
This can result in failure to reproduce a bug on developer machine.
Include sysctl setup as part of C reproducers.
|
| |
|
|
|
|
|
|
|
|
|
| |
Currently we assume that sysctl's are setup as part of machine boot.
This introduces a non-trivial dependency on image creation
and sysctl's are not captured by as part of C reproducers
and are not captured by syzbot dashboard. This can make some
reproducers fail on developer machines or on syzbot later
when sysctl's change.
Setup sysctl's in executor as part of machine setup.
It makes it much more controllable and hermetic.
|
| |
|
|
|
| |
Since ENOENT problem was solved by commit 318430cbb3b2ceef ("executor/linux:
change mount propagation type to private"), remove the debug code for this problem.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
fail()'s are often used during the validation of kernel reactions to
queries that were issued by pseudo syscalls implementations. As fault
injection may cause the kernel not to succeed in handling these
queries (e.g. socket writes or reads may fail), this could ultimately
lead to unwanted "lost connection to test machine" crashes.
In order to avoid this and, on the other hand, to still have the
ability to signal a disastrous situation, the exit code of this
function now depends on the current context.
All fail() invocations during system call execution with enabled fault
injection lead to termination with zero exit code. In all other cases,
the exit code is kFailStatus.
This is achieved by introduction of a special thread-specific variable
`current_thread` that allows to access information about the thread in
which the current code is executing.
Also, this commit eliminates current_cover as it is no longer needed.
|
| |
|
|
|
|
| |
gvisor coverage is not in the range of linux kernel coverage.
So the coverage filter does not work. Detect if running under gvisor
and skip the coverage filter.
|
| |
|
|
|
|
|
|
|
|
|
|
| |
With commit 50e21c6be6188f42 ("executor/linux: dump mount information when
failed to open kcov file"), we got an unexpected result.
/sys/kernel/ does not exist despite /sys/ exists.
/proc/mounts cannot be opened despite /proc/ exists.
If sysfs is not mounted on /sys/ and proc is not mounted on /proc/ ,
maybe other filesystems (e.g. devtmpfs, cgroup) are not mounted as well.
Let's dump "/", "/proc/" and "/sys/", and then mount /proc/ and dump /proc/mounts .
|
| |
|
|
|
|
|
|
|
| |
There are many "lost connection to test machine (5)" reports where the
testing terminated due to ENOENT upon open("/sys/kernel/debug/kcov").
Since some testcase might be unintendedly modifying mount information,
let's start from checking whether/how mount is broken.
This commit might be reverted after the cause is identified and fixed.
|
| |
|
|
|
|
|
|
| |
Use native byte-order for IPC and program serialization.
This way we will be able to support both little- and big-endian
architectures.
Signed-off-by: Alexander Egorenkov <Alexander.Egorenkov@ibm.com>
|
| |
|
|
|
|
|
|
|
|
|
| |
Surround the main data mapping with PROT_NONE pages to make virtual address layout more consistent
across different configurations (static/non-static build) and C repros.
One observed case before: executor had a mapping above the data mapping (output region),
while C repros did not have that mapping above, as the result in one case VMA had next link,
while in the other it didn't and it caused a bug to not reproduce with the C repro.
The bug that reproduces only with the mapping above:
https://lkml.org/lkml/2020/4/17/819
|
| |
|
|
|
|
|
|
|
| |
The feature gets enabled when /dev/raw-gadget is present and accessible.
With this feature enabled, executor will do chmod 0666 /dev/raw-gadget on
startup, which makes it possible to do USB fuzzing in setuid and namespace
sandboxes. There should be no backwards compatibility issues with syz
reproducers that don't explicitly enable this feature, as they currently only
work in none sandbox.
|
| |
|
|
|
|
|
|
| |
nmi_check_duration() prints "INFO: NMI handler took too long" on slow debug kernels.
It happens a lot in qemu, and the messages are frequently corrupted
(intermixed with other kernel output as they are printed from NMI)
and are not matched against the suppression in pkg/report.
This write prevents these messages from being printed.
|
| |
|
|
|
|
| |
Some prctl commands don't respect the normal convention for return values
(e.g. PR_GET_TIMERSLACK, but there are more) and may produce all possible
errno values. This conflicts with fallback coverage.
|
| |
|
|
|
|
|
| |
1. It always crashed in cover_reset when coverage is disabled.
2. Use NONFAILING when accessing image segments.
3. Give it additional 100 ms as it may be slow.
4. Add a test for syz_mount_image.
|
| |
|
|
|
| |
Not all gcc's everywhere support C++11 by default.
We have some old on Travis.
|
| |
|
|
|
|
|
|
|
| |
Layout of kcov_remote_arg is ABI-dependent,
as the result when 32-bit userspace talks to 64-bit kernel
it does not work out of the box. We need both statically
different structs for kernels of different bitnesses,
but also dynamic dispatch because a 32-bit userspace
can talk to both 64-bit and 32-bit kernels.
|
| |
|
|
| |
The kcov extension is being upstreamed and the interfaces has been changed.
|