| Commit message | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
| |
This change adds VirtualBox support to syzkaller. It implements the VM
interface for VirtualBox and provides:
- full VM lifecycle operations (create, boot, stop, snapshot restore)
- serial console hookup and integration with the output merger
- proper boot wait logic similar to qemu, using SSH readiness
- boot-time crash capture using collected console output
|
| |
|
|
|
|
|
|
|
|
| |
Enable external abortion of the instance creation process. This is
especially useful for the qemu case where we retry the creation/boot up
to 1000 times, which can take significant time (e.g. it times out
syz-cluster pods on unstable kernels).
The context can be further propagated to WaitForSSH, but that requires
another quite significant vm/ refactoring.
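A minimal sketch of the idea, not the actual vm/ code: a retry loop that checks the context before every attempt, so an external caller can abort a long creation sequence. The names createWithRetries and abortOnThirdAttempt, and the boot callback signature, are assumptions made for illustration.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// createWithRetries is a hypothetical helper (not syzkaller's code). It calls
// boot up to maxAttempts times and gives up early once ctx is cancelled.
func createWithRetries(ctx context.Context, maxAttempts int, boot func(context.Context) error) error {
	lastErr := errors.New("no attempts were made")
	for i := 0; i < maxAttempts; i++ {
		// Honor external abortion before starting another attempt.
		if err := ctx.Err(); err != nil {
			return fmt.Errorf("creation aborted after %d attempts: %w", i, err)
		}
		if lastErr = boot(ctx); lastErr == nil {
			return nil
		}
	}
	return fmt.Errorf("all %d attempts failed: %w", maxAttempts, lastErr)
}

// abortOnThirdAttempt demonstrates the helper: boot always fails and the
// context is cancelled during the third attempt. It returns the number of
// boot calls made and the resulting error.
func abortOnThirdAttempt() (int, error) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	calls := 0
	err := createWithRetries(ctx, 1000, func(context.Context) error {
		calls++
		if calls == 3 {
			cancel()
		}
		return errors.New("boot failed")
	})
	return calls, err
}
```

Because the context is also passed into boot, a later refactoring could propagate it further down (e.g. into an SSH wait) without changing this loop.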
|
| | |
|
| |
|
|
|
| |
1. func Run optionally accepts the opts.
2. Some refactoring, more comments.
|
| |
|
|
| |
It allows using the context as a single termination signal source.
|
| |
|
|
| |
They are shorter, more readable, and don't require temp vars.
|
| | |
|
| |
|
|
|
|
|
| |
Move the VM count restriction logic into the vm package.
This avoids lots of duplication, makes it supported
for VM types that failed to do this, and allows
unifying more VM count logic in the future.
|
| |
|
|
|
|
|
|
|
| |
Pools and ReproLoop are always created on start,
so there is no need to support lazy set for them.
It only complicates code and makes it harder to reason about.
Also introduce vm.Dispatcher as an alias to dispatcher.Pool,
as it's the only specialization we use in the project.
|
| | |
|
| | |
|
| |
|
|
|
|
|
|
| |
Go package names should generally be singular form:
https://go.dev/blog/package-names
https://rakyll.org/style-packages
https://groups.google.com/g/golang-nuts/c/buBwLar1gNw
|
| |
|
|
|
|
| |
New is a more idiomatic name and is shorter
(lines where stats.Create is used are usually long,
so making them a bit shorter is good).
|
| |
|
|
| |
Fixes #5028
|
| |
|
|
|
| |
Rely on instance.Pool to perform fuzzing and do bug reproductions.
Extract the reproduction queue logic to a separate testable class.
|
| |
|
|
|
|
|
| |
The pool operates on a low level and assumes that there's one default
activity (=fuzzing) that is performed by the VMs and that there are
also occasional non-default activities that must be performed by some
VMs (=bug reproduction).
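One way to picture this split, as a sketch under assumptions rather than the pool's real API: an idle VM takes a queued non-default job if one is waiting and otherwise falls back to the default activity. The job type and nextActivity function are hypothetical names.

```go
package main

// job is an activity run on one VM, identified by its index.
// Hypothetical type; the real pool's types may differ.
type job func(vmIndex int)

// nextActivity picks what an idle VM should do next: a queued non-default
// job (e.g. a bug reproduction) if one is pending, otherwise the default
// activity (e.g. fuzzing). It never blocks.
func nextActivity(jobs <-chan job, defaultActivity job) job {
	select {
	case j := <-jobs:
		return j
	default:
		return defaultActivity
	}
}
```

The non-blocking select keeps VMs busy with the default activity whenever the job queue is empty.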
|
| |
|
|
| |
It's better to follow standard interfaces.
|
| | |
|
| |
|
|
|
| |
It usually means a kernel crash, in which case we want to give the
kernel some more time to print the whole coverage report to the console.
|
| |
|
|
|
|
|
| |
Move all syz-fuzzer logic into syz-executor and remove syz-fuzzer.
Also restore syz-runtest functionality in the manager.
Update #4917 (sets most signal handlers to SIG_IGN)
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Ipc gate slows down overall execution a lot.
Without ipc gate I am getting ~20% more executions with debug kernel
and ~100% more executions with a fast non-debug kernel.
Replace ipc gate with explicit tracking of last executing programs
per proc in syz-manager.
Ipc gate was also used for leak checking, but leak checking seems
to be still broken. At least in my local runs I am not getting
any leaks even with the previous fix.
So remove the gate completely for now. Taking into account that
we are likely to rewrite this code in C++ soon, it makes
little sense to create a special gate for leak checking only in Go.
Update #4728
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Always call the finish callback to make control flow consistent
whether the VM crashes or not. Then users can rely on the callback
always being called.
Fix a bug highlighted by the extended test:
currently we call extractError/callback twice when the fuzzer is preempted.
If the fuzzer is preempted, extractError returns nil,
which makes appendOutput return nil as well,
which makes the main loop continue as if no crash/preemption happened.
It will exit, but only after 5 min "no output" timeout.
Most likely the output will still contain the preemption message,
so no "no output" will be reported, but the additional 5 min wait
is unnecessary.
|
| |
|
|
|
|
| |
Remove things that are only needed for target VM communication:
conditional compression, timeout scaling, traffic stats.
To minimize diffs when we switch target VM communication to flatrpc.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If the crash did not result in a panic, we continue normal fuzzing
operation for several more seconds until a delay has passed and the VM
was stopped.
It results in 20-30 more programs that are:
1) Run on a potentially corrupted machine.
2) Litter the logs -- these programs are definitely not related to the
crash.
3) Make it trickier to automatically track crash risk for individual
syscalls/programs (if we get to this).
Add an option to vm.Run that allows learning about a crash immediately
after it was detected. Use it to stop program exchange and program
injection in rpc.go.
|
| |
|
|
|
|
| |
VM output we receive on the host is effectively equivalent to RPC recv metric.
If we stop printing programs in the fuzzer, traffic will move from output to RPC.
It will be useful to see this change via metrics.
|
| |
|
|
| |
This will allow manager to inject executing programs into output.
|
| |
|
|
|
|
|
|
| |
Delete support for odroid board.
Its build has been broken for >3 years (at least on 8ba8079b119f).
We keep it in history and if it's resurrected, it needs
to be merged with vm/isolated and most code needs to be
at least build-tested (mock out only the C interface).
|
| |
|
|
|
|
|
| |
Delete support for kvmtool.
We don't use it, and I have not heard of anybody using it in the past 5 years.
The last commit in https://github.com/kvmtool/kvmtool was made 2 years ago.
And it was never fully working (see taskset hack to avoid races).
|
| |
|
|
|
|
| |
All callers of Run always call MonitorExecution right after it.
Combine these 2 methods. This allows hiding some implementation
details and simplify users of vm package.
|
| |
|
|
|
|
|
|
| |
RPC compression takes up to 10% of CPU time in profiles,
but it's unlikely to be beneficial for local VM runs
(we are mostly copying memory in this case).
Enable RPC compression based on the VM type
(local VMs don't use it, remote machines use it).
|
| |
|
|
|
|
|
|
| |
In this mode, all syz-fuzzers will be on the same network and will start
competing with each other for binding to the same port.
For now, we don't have the need to use pprof in the host fuzzer mode, so
let's just disable it.
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
In some cases (e.g. gVisor instances using host's network namespace)
attempts to bind() all syz-fuzzer processes to the same port result in
conflicts and fuzzing breakages.
Refactor the code to enable custom pprof configuration depending on the
vm type.
For now, just disable pprof endpoints for gVisor VMs. Once we actually
need the feature there, we can generate custom ports for every gVisor
VM.
|
| | |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It's not correct to mix them since they point to fundamentally different
issues:
1) Boot time errors are caused by a problematic kernel image and can
only be resolved by using another kernel version or config.
2) Infrastructure errors are temporary, so we can just try again some
time later.
Reserve the existing BootError for (1) errors and let all other VM
handling errors refer to (2).
To make it possible to attach more output to the infra error, introduce
the VerboseInfraError type.
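A rough sketch of how callers might act on this distinction, assuming the two kinds of errors are separate Go types. Only the name BootError comes from the commit message; the fields, the infraError type, and the wrap and shouldRetry helpers are illustrative assumptions.

```go
package main

import (
	"errors"
	"fmt"
)

// BootError means the kernel image itself failed to boot, so retrying with
// the same image will not help. The fields here are assumptions.
type BootError struct {
	Title  string
	Output []byte
}

func (e BootError) Error() string { return "boot error: " + e.Title }

// infraError is a hypothetical stand-in for a temporary infrastructure
// failure. Output carries optional extra logs for diagnosis.
type infraError struct {
	Title  string
	Output []byte
}

func (e infraError) Error() string { return "infra error: " + e.Title }

// wrap adds context to an error the way a caller higher up might.
func wrap(err error) error {
	return fmt.Errorf("failed to create VM: %w", err)
}

// shouldRetry reports whether err is a temporary infrastructure problem
// that is worth retrying some time later.
func shouldRetry(err error) bool {
	var infra infraError
	return errors.As(err, &infra)
}
```

Using errors.As means the classification still works after the error has been wrapped with additional context.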
|
| |
|
|
|
|
|
|
|
| |
Booting physical Android devices requires building a few artifacts, as described
at https://source.android.com/docs/setup/build/building-kernels.
When a ProxyVM type is used, we need to differentiate whether or not to
use the Android build logic, so we add an additional mapping which uses
a different name but the same VM logic.
|
| |
|
|
|
|
|
| |
This commit adds a new VM for fuzzing starnix.
The VM will boot a fuchsia image using the `ffx` tool and will connect to an adb server inside it. Fuzzing will be done using HostFuzzer mode due to some features not being implemented yet in starnix. Once this is possible, fuzzing will be performed without HostFuzzer mode.
Co-authored-by: Juampi Miceli <jpmiceli@google.com>
|
| |
|
|
|
|
|
| |
* vm: add pool.Close() support
* vm: add proxyapp client implementation
* vm/proxyapp: autogenerate mocks
* vm/proxyapp: add proxyapp tests
* pkg/mgrconfig: add proxyapp type tests
|
| |
|
|
|
| |
Also update syz-crush to save RawOutput instead of output from the
Report.
|
| |
* vm/cuttlefish: add vm type for cuttlefish on gce
This new VM type embeds the existing 'gce' type to start an instance and
then run a Cuttlefish Android VM on it using the 'launch_cvd' binary
installed on it.
This requires us to make a few fields on the 'gce' type visible so that
'cuttlefish' can set them when starting the instance.
The remaining functionality (SSH forwarding, file copying, and running
commands on the nested Android VM) will be in following changes.
For more information on Cuttlefish, see:
https://source.android.com/setup/create/cuttlefish
https://android.googlesource.com/device/google/cuttlefish/
* vm/cuttlefish: fix missed log.Logf(0 call to log.Logf(1
* vm/cuttlefish: remove unneeded log.Logf() calls
This logging for Count() isn't terribly useful since it's a single-line
call with very simple logic.
For the unimplemented methods the log lines have limited utility since
they're already returning error messages which will get logged.
|
| |
|
|
| |
Current state: every 5 minutes VM reboots.
Fix: signal "executing program" to monitor to prevent this reboot.
|
| |
|
|
|
|
|
|
|
|
| |
Currently syzkaller only applies its suppression regexps to the oops message
itself and a small number of its preceding bytes. A case has been reported
(#2685), where it was important to analyse a bigger portion of output data.
Pass the whole log and a starting position to the `Report.Parse` method
separately instead of passing an already cut log there. Adjust use cases of
the `Report.Parse` method to handle its new behavior.
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Currently a number of report post-processing activities are implemented as a
decorator over the interface that defines OS-specific implementations.
Following exactly the same interface is too restrictive in this case as adding
extra parameters to the post-processing forces the developer to adjust all
implementations that may not need these parameters at all.
Untie the wrapper from the Reporter interface. Use a package-private
reporterImpl interface for the OS-specific implementations, while having an
exported Reporter structure. Make sure that Reporter is stored and
passed as a pointer.
|
| | |
|
| |
|
|
|
|
| |
A significant portion of oopses with qemu emulation gets truncated.
Hard to say if we don't wait long enough or there is something else,
but scaling "wait for output" timeout seems reasonable regardless.
|
| |
|
|
| |
Increase ssh wait timeout according to the target slowdown.
|
| |
|
|
|
|
| |
Add sys/targets.Timeouts struct that parametrizes timeouts throughout the system.
The struct allows controlling syscall/program/no output timeouts for OS/arch/VM/etc.
See comment on the struct for more details.
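To illustrate the general shape of such a struct, here is a small sketch. It is not the actual sys/targets.Timeouts definition: the field names, default values, and the slowdown multiplier are assumptions chosen for the example.

```go
package main

import "time"

// timeouts is a hypothetical, simplified stand-in for a struct that
// parametrizes timeouts for one OS/arch/VM combination.
type timeouts struct {
	Syscall  time.Duration // limit for a single syscall
	Program  time.Duration // limit for a whole program
	NoOutput time.Duration // how long to tolerate a silent VM
}

// defaultTimeouts returns made-up baseline values for a fast target.
func defaultTimeouts() timeouts {
	return timeouts{
		Syscall:  50 * time.Millisecond,
		Program:  5 * time.Second,
		NoOutput: 5 * time.Minute,
	}
}

// scaled multiplies every timeout by slowdown, e.g. for emulated or
// heavily instrumented targets. Values below 1 are treated as 1.
func scaled(base timeouts, slowdown int) timeouts {
	if slowdown < 1 {
		slowdown = 1
	}
	f := time.Duration(slowdown)
	return timeouts{
		Syscall:  base.Syscall * f,
		Program:  base.Program * f,
		NoOutput: base.NoOutput * f,
	}
}
```

Keeping all timeouts in one value lets a single slowdown factor adjust them consistently instead of scaling each call site separately.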
|
| |
|
|
| |
We don't need indirection via strings to declare executingProgram var.
|
| |
|
|
|
|
|
|
|
| |
The way to diagnose generally depends on the issue.
E.g. do we need register dump to debug this issue?
Do we need host dmesg dump? Some diagnosis may be
directly specific to a particular problem (e.g. dumping
a particular debugfs/procfs file).
Pass Report to Diagnose to make this possible.
|
| |
|
|
|
|
| |
The "no output" handling mostly duplicates extractError logic
(with open-coded report.VMDiagnosisStart).
Deduplicate this logic.
|
| |
|
|
|
|
|
| |
Use the "vmrun" utility to manage Workstation VMs. The syzkaller manager
creates temporary VMs (linked clones) from a base image, gets their IP
address and uses ssh to deploy and run programs (similar to the isolated
mode).
|