| Commit message (Collapse) | Author | Age | Files | Lines |
This should be calculated in dispatcher.Pool, which actually boots the
VMs.
|
Go package names should generally be in singular form:
https://go.dev/blog/package-names
https://rakyll.org/style-packages
https://groups.google.com/g/golang-nuts/c/buBwLar1gNw
|
New is a more idiomatic name and is shorter
(lines where stats.Create is used are usually long,
so making them a bit shorter is good).
|
Rely on instance.Pool to perform fuzzing and do bug reproductions.
Extract the reproduction queue logic into a separate testable class.
|
Move all syz-fuzzer logic into syz-executor and remove syz-fuzzer.
Also restore syz-runtest functionality in the manager.
Update #4917 (sets most signal handlers to SIG_IGN)
|
Make the num fuzzing VMs stat more precise:
increment when the VM is actually ready to execute test programs,
decrement as soon as we see an oops in the console output.
Also show it on graphs.
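
The counting rule above can be sketched roughly as follows. This is only an illustration, not the syzkaller code: the `fuzzingVMs` type and its `ready`/`oops` methods are hypothetical names.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// fuzzingVMs is a hypothetical gauge of VMs that can currently run test programs.
type fuzzingVMs struct {
	n atomic.Int64
}

// ready is called once a VM can actually execute test programs.
func (g *fuzzingVMs) ready() { g.n.Add(1) }

// oops is called as soon as a kernel oops shows up in the console output.
func (g *fuzzingVMs) oops() { g.n.Add(-1) }

// value returns the current number of fuzzing VMs.
func (g *fuzzingVMs) value() int64 { return g.n.Load() }

func main() {
	var g fuzzingVMs
	g.ready()
	g.ready()
	g.oops()
	fmt.Println(g.value()) // prints 1
}
```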
|
Switch to flatrpc connection between manager and fuzzer.
With flatrpc we have a goroutine per connection instead of async RPC,
which makes things a bit simpler. Messages are no longer reordered
(in particular start executing and finish executing for programs), so a
race on the program during printing is no longer possible:
we finish handling the start executing request before we even
receive finish executing.
We also don't need to look up the Runner for every RPC since it's
now local to the handling goroutine.
We also don't need to protect the requests map since only a single
goroutine accesses it.
We also send new programs to the fuzzer as soon as we receive
start executing message, which provides better buffering.
We also don't batch new requests and finish executing requests
in a single RPC, which makes things a bit simpler.
In my local run this reduces syz-manager heap size from 1.3GB to 1.1GB.
Update #1541
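
The goroutine-per-connection idea can be sketched as below. This is a simplified illustration under stated assumptions, not flatrpc itself: a channel stands in for the network connection, and `msg` and `serveConn` are hypothetical names. The point it shows is that messages on one connection are handled strictly in order and the requests map needs no lock because only one goroutine touches it.

```go
package main

import "fmt"

// msg is a hypothetical message: "start executing" or "finish executing"
// for the program with the given ID.
type msg struct {
	id     int
	finish bool
}

// serveConn handles all messages of one connection sequentially.
// The requests map is local to this call, so no mutex is needed.
func serveConn(in <-chan msg) []string {
	requests := map[int]bool{}
	var log []string
	for m := range in {
		if !m.finish {
			requests[m.id] = true
			log = append(log, fmt.Sprintf("start %d", m.id))
			continue
		}
		if !requests[m.id] {
			log = append(log, fmt.Sprintf("unexpected finish %d", m.id))
			continue
		}
		delete(requests, m.id)
		log = append(log, fmt.Sprintf("finish %d", m.id))
	}
	return log
}

func main() {
	const conns = 2
	results := make(chan []string, conns)
	for c := 0; c < conns; c++ {
		in := make(chan msg, 2)
		in <- msg{id: c}
		in <- msg{id: c, finish: true}
		close(in)
		// One goroutine per connection.
		go func() { results <- serveConn(in) }()
	}
	for c := 0; c < conns; c++ {
		fmt.Println(<-results)
	}
}
```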
|
Move the syscall checking logic to the host.
Diffing sets of disabled syscalls before/after this change
in different configurations (none/setuid sandboxes, amd64/386 arches,
large/small kernel configs) shows only some improvements/bug fixes.
1. socket$inet[6]_icmp are now enabled.
Previously they were disabled due to net.ipv4.ping_group_range sysctl
in the init namespace which prevented creation of ping sockets.
In the new net namespace the sysctl gets the default value, which allows creation.
2. get_thread_area and set_thread_area are now disabled on amd64.
They are available only in 32-bit mode, but they are present in /proc/kallsyms,
so we enabled them always.
3. socket$bt_{bnep, cmtp, hidp, rfcomm} are now disabled.
They cannot be created in a non-init net namespace.
bt_sock_create() checks init_net and returns EAFNOSUPPORT immediately.
This is a bug in descriptions we need to fix.
Now we see it due to more precise checks.
4. fstat64/fstatat64/lstat64/stat64 are now enabled in 32-bit mode.
They are not present in /proc/kallsyms as syscalls, so we have not enabled them.
But they are available in 32-bit mode.
5. 78 openat variants + 10 socket variants + mount are now disabled
with setuid sandbox. They are not permitted w/o root permissions,
but we ignored that. This additionally leads to 700 transitively
disabled syscalls.
In all cases checking in the actual executor context/sandbox
looks very positive, esp. for more restrictive sandboxes.
Android sandbox should benefit as well.
The additional benefit is full testability of the new code.
The change includes only a basic test that covers all checks,
and ensures the code does not crash/hang, all generated programs
parse successfully, etc. But it's possible to unit-test
every condition now.
The new version also parallelizes checking across VMs;
checking on a slow emulated qemu drops from 210 seconds
to 140 seconds.
|
Simplify code. It does not belong in the RPC handler.
|
Instead of counting executor restarts add executor freshness
(number of tests executed in the same process before this one)
into execution result.
This removes all program-related metrics from syz-fuzzer,
and concentrates all of them in the manager.
The freshness of the concrete test may also be useful
for some analysis later.
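
A minimal sketch of the freshness idea, assuming hypothetical `executor` and `result` types (the real execution result format is not shown here). A restart shows up on the receiving side as a result with freshness 0, so restarts can still be derived without a separate counter.

```go
package main

import "fmt"

// executor is a hypothetical stand-in for one executor process.
type executor struct {
	executed int // tests already run in this process
}

// result is a hypothetical execution result that carries freshness.
type result struct {
	freshness int // number of tests executed in the same process before this one
}

// run executes one test and reports how fresh the process was.
func (e *executor) run() result {
	r := result{freshness: e.executed}
	e.executed++
	return r
}

func main() {
	e := &executor{}
	restarts := 0
	for i := 0; i < 5; i++ {
		if i == 3 {
			e = &executor{} // simulate an executor restart
		}
		r := e.run()
		if r.freshness == 0 {
			restarts++ // counts the initial start as well
		}
		fmt.Println("freshness:", r.freshness)
	}
	fmt.Println("process starts seen:", restarts)
}
```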
|
Don't send text program to the fuzzer,
instead send exec encoding directly.
It's more compact now and does not need complex deserialization.
|
Instead of printing full program from the fuzzer,
send a short notification with program ID to the manager
and let manager emit the program into the log.
This significantly reduces the amount of communication
and makes it possible to not send text programs to the fuzzer at all.
|
There is a non-zero rate of transient executor errors.
Currently we do a full GC, free OS memory, and then sleep for a second.
This was more meaningful when the fuzzer was in the VM as the fuzzer process
consumed lots of memory. Now it consumes only ~20MB, any OOMs are likely
not due to the fuzzer process.
So instead sleep briefly and only after several retries
(I would assume most errors are fixed after 1 retry).
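
The retry policy described above might look roughly like this. It is a sketch under assumptions, not the actual change: `retry`, its parameters, and `errTransient` are hypothetical names, and the thresholds are arbitrary.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errTransient is a hypothetical error used to simulate a transient failure.
var errTransient = errors.New("transient executor error")

// retry calls fn up to attempts times. The first freeRetries failures are
// retried immediately; later failures are followed by a brief pause.
// It returns the number of calls made and the last error.
func retry(attempts, freeRetries int, pause time.Duration, fn func() error) (int, error) {
	var err error
	for i := 0; i < attempts; i++ {
		if err = fn(); err == nil {
			return i + 1, nil
		}
		// Pause only once the error keeps repeating.
		if i >= freeRetries {
			time.Sleep(pause)
		}
	}
	return attempts, err
}

func main() {
	calls := 0
	n, err := retry(5, 3, 10*time.Millisecond, func() error {
		calls++
		if calls < 2 {
			return errTransient // fail once, then succeed
		}
		return nil
	})
	fmt.Println(n, err) // prints: 2 <nil>
}
```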
|
Initially I hid them to save some space on the graphs page.
But if we combine them with total crashes, they don't consume space
and I think they look very reasonable there.
|
We will also use it to determine when we are ready to schedule programs
that are very likely to crash instances.
|
Add ability for each package to create and export own stats.
Each stat is self-contained, describes how it should be presented,
and there is no need to copy them from one package to another.
Stats also keep historical data and allow building graphs over time.
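
One way to picture such self-contained stats is the sketch below. It does not reproduce the real stats package: `Stat`, `New`, `Add`, `Tick`, and `History` are hypothetical names chosen for illustration, and the history here is a plain slice sampled on demand.

```go
package main

import (
	"fmt"
	"sync"
)

// Stat is a hypothetical self-contained metric: it carries its own
// presentation data (name, description) and its own history.
type Stat struct {
	Name, Desc string

	mu      sync.Mutex
	val     int
	history []int
}

var (
	regMu    sync.Mutex
	registry []*Stat // every stat registers itself here on creation
)

// New creates a stat and registers it, so any package can export its own.
func New(name, desc string) *Stat {
	s := &Stat{Name: name, Desc: desc}
	regMu.Lock()
	registry = append(registry, s)
	regMu.Unlock()
	return s
}

// Add increases the current value by n.
func (s *Stat) Add(n int) {
	s.mu.Lock()
	s.val += n
	s.mu.Unlock()
}

// Tick appends the current value of every registered stat to its history.
func Tick() {
	regMu.Lock()
	defer regMu.Unlock()
	for _, s := range registry {
		s.mu.Lock()
		s.history = append(s.history, s.val)
		s.mu.Unlock()
	}
}

// History returns a copy of the sampled values, suitable for graphing.
func (s *Stat) History() []int {
	s.mu.Lock()
	defer s.mu.Unlock()
	return append([]int(nil), s.history...)
}

func main() {
	crashes := New("crashes", "total number of VM crashes")
	crashes.Add(1)
	Tick()
	crashes.Add(2)
	Tick()
	fmt.Println(crashes.Name, crashes.History()) // prints: crashes [1 3]
}
```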
|
Collect total number of calls, average number of requested progs,
average server latency, and average end-to-end latency.
|
Instead of doing fuzzing in parallel in the running VMs, make all decisions
in the host syz-manager process.
Instantiate and keep a fuzzer.Fuzzer object in syz-manager and update
the RPC between syz-manager and syz-fuzzer to exchange exact programs to
execute and their resulting signal and coverage.
To optimize the networking traffic, exchange mostly only the difference
between the known max signal and the detected signal.
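
The traffic optimization can be illustrated with a small set-difference sketch. The `signalSet` type and its methods are hypothetical, and signal elements are assumed to be plain `uint32` values for simplicity; the real signal representation may differ.

```go
package main

import "fmt"

// signalSet is a hypothetical set of signal elements.
type signalSet map[uint32]struct{}

// diff returns the elements of detected that are not yet in the set,
// i.e. the only part that would need to cross the network.
func (s signalSet) diff(detected []uint32) []uint32 {
	var out []uint32
	for _, e := range detected {
		if _, ok := s[e]; !ok {
			out = append(out, e)
		}
	}
	return out
}

// merge adds the given elements to the set.
func (s signalSet) merge(elems []uint32) {
	for _, e := range elems {
		s[e] = struct{}{}
	}
}

func main() {
	maxSignal := signalSet{}
	maxSignal.merge([]uint32{1, 2, 3})

	delta := maxSignal.diff([]uint32{2, 3, 4, 5})
	fmt.Println(delta) // prints: [4 5]

	maxSignal.merge(delta)
	fmt.Println(len(maxSignal)) // prints: 5
}
```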
|
Once we move pkg/fuzzer to the host, we won't be able to reuse this
implementation.
|
Measure and display the total RPC communication traffic.
It will help better evaluate #4579.
|
* syz-manager: add prometheus metrics
Add prometheus metrics client to syz-manager.
Expose metrics on a new port defined in mgrconfig.
Allows for prometheus to scrape metrics from syz-manager.
* syz-manager: expose metrics endpoint in http server
.gitignore : remove local .img path
* mgrconfig: remove unnecessary config option
* syz-manager: update stats to use gaugefunc
added docs for prometheus exported metrics
added more gaugefunc metrics
Signed-off-by: Palash Oswal <oswalpalash@gmail.com>
* syz-manager: minor changes for CI tests
added periods to comments and renamed go variables
Signed-off-by: Palash Oswal <oswalpalash@gmail.com>
* syz-manager: re-position prometheus counter declaration
docs updated with PR comments
Signed-off-by: Palash Oswal <oswalpalash@gmail.com>
|
Show actual coverage intersected with coverage filter
and total size of coverage filter. This may give a more
useful progress metric.
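
As a rough illustration of the metric, the sketch below counts covered PCs that fall inside the filter and reports the filter size alongside. The function name `filteredCoverage` and the use of `uint64` PCs are assumptions made for this example.

```go
package main

import "fmt"

// filteredCoverage returns how many distinct covered PCs fall inside the
// filter, and the total size of the filter.
func filteredCoverage(covered []uint64, filter map[uint64]struct{}) (hit, total int) {
	seen := map[uint64]struct{}{}
	for _, pc := range covered {
		if _, inFilter := filter[pc]; !inFilter {
			continue
		}
		if _, dup := seen[pc]; dup {
			continue
		}
		seen[pc] = struct{}{}
		hit++
	}
	return hit, len(filter)
}

func main() {
	filter := map[uint64]struct{}{0x10: {}, 0x20: {}, 0x30: {}, 0x40: {}}
	covered := []uint64{0x10, 0x10, 0x30, 0x99}
	hit, total := filteredCoverage(covered, filter)
	fmt.Printf("%d / %d\n", hit, total) // prints: 2 / 4
}
```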
|
Originally, syz-manager confusingly logged corpusSignal as "cover".
Change syz-manager's logging to output corpusSignal, corpusCover
and maxSignal.
Add a field in Stats to store maxSignal.
|
Use a random subset of syscalls/corpus/coverage for each individual VM run.
The hypothesis is that this should allow the fuzzer to get more coverage
and find more bugs in a saturated state (stuck in a local optimum).
See the issue and comments for details.
Update #1348
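
Picking a random subset per VM run could be sketched as follows. This is an assumption-laden illustration: `randomSubset`, the per-VM seed, and the subset size are hypothetical, and the actual selection strategy is described in the referenced issue.

```go
package main

import (
	"fmt"
	"math/rand"
)

// randomSubset returns n distinct, randomly chosen elements of items
// (all of them if n >= len(items)). The seed makes the choice repeatable.
func randomSubset(seed int64, items []string, n int) []string {
	if n > len(items) {
		n = len(items)
	}
	if n < 0 {
		n = 0
	}
	r := rand.New(rand.NewSource(seed))
	out := make([]string, 0, n)
	for _, i := range r.Perm(len(items))[:n] {
		out = append(out, items[i])
	}
	return out
}

func main() {
	syscalls := []string{"openat", "read", "write", "mmap", "socket", "ioctl"}
	for vm := 0; vm < 3; vm++ {
		// Each VM run gets its own subset of the enabled syscalls.
		fmt.Println(vm, randomSubset(int64(vm), syscalls, 3))
	}
}
```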
|
Update #605
|
Move work with hub into a separate file and fully separate
its state from the rest of the manager state.
First step towards splitting manager into manageable parts.
This also required reworking stats as they are used throughout the code.
Update #538
Update #605
|