path: root/syz-manager/hub.go
Commit message | Author | Age | Files | Lines
* all: show manager url in syz-hub | Joey Jiao | 2025-03-17 | 1 | -0/+2
* pkg/runtest: rely on pkg/manager seed loading logic | Aleksandr Nogikh | 2024-10-14 | 1 | -1/+1
  It will help us catch broken seeds right in TestParse().
* pkg/manager: factor out the HTTP server code | Aleksandr Nogikh | 2024-10-11 | 1 | -2/+2
  Decouple it from syz-manager. Remove a lot of no longer necessary mutex calls.
* syz-manager: remove syz-hub prog add statistics | Aleksandr Nogikh | 2024-09-14 | 1 | -9/+2
  These no longer make any sense since we only send programs after the corpus triage.
* syz-manager: send new inputs to the hub only once | Dmitry Vyukov | 2024-09-12 | 1 | -29/+26
  We used to send corpus updates (added/removed elements) to the hub in each sync. But that produced too much churn since the hub algorithm is O(N^2) (distributing everything to everybody), and lots of new inputs are later removed (either we can't reproduce coverage after restart, or inputs are removed during corpus minimization).
  So now we don't send new inputs in each sync; instead we aim at sending the corpus once after initial triage. This solves the problem with non-reproducible/removed inputs.
  Typical instance lifetime on syzbot is <24h; for such instances we send the corpus once. If an instance somehow lives for longer (e.g. a local long-running instance), then we re-connect and re-send once in a while.
* syz-manager: don't send fake coverage corpus to hub | Dmitry Vyukov | 2024-09-12 | 1 | -0/+6
  If a manager uses fake coverage, don't send its corpus to the hub. It should be lower quality than a coverage-guided corpus. However, still send repros and accept new inputs.
* syz-manager: move seed functionality to pkg/manager | Aleksandr Nogikh | 2024-09-02 | 1 | -2/+2
* syz-manager: move repro loop to pkg/manager | Aleksandr Nogikh | 2024-09-02 | 1 | -4/+5
  This is a potentially reusable piece of functionality.
* syz-manager: define a reminimization threshold | Aleksandr Nogikh | 2024-08-16 | 1 | -1/+1
  Let it be equal to 15 calls for now. Don't reminimize corpus programs that have fewer calls. Always reminimize hub programs that have no fewer calls.
* syz-manager: move prog helpers to the prog package | Aleksandr Nogikh | 2024-08-06 | 1 | -1/+1
  Reduce the size of syz-manager.
* pkg/stat: rename package name to singular form | Dmitry Vyukov | 2024-07-24 | 1 | -15/+15
  Go package names should generally be singular form:
  https://go.dev/blog/package-names
  https://rakyll.org/style-packages
  https://groups.google.com/g/golang-nuts/c/buBwLar1gNw
* pkg/stats: rename Create to New | Dmitry Vyukov | 2024-07-24 | 1 | -7/+7
  New is a more idiomatic name and is shorter (lines where stats.Create is used are usually long, so making them a bit shorter is good).
* syz-manager: tidy ReproResult | Aleksandr Nogikh | 2024-07-17 | 1 | -3/+0
  There has been some mess and duplication around the Crash/ReproResult data structures. As a result, we've been attempting to upload repro failure logs to the dashboard for bugs that did not originate from the dashboard. It litters the syz-manager logs.
  Refactor the code.
* syz-manager: query hub only after finishing the previous batch | Aleksandr Nogikh | 2024-07-15 | 1 | -1/+2
  It does happen that we see a long tail of "candidate triage jobs" during a big influx of syz-hub programs. This is bad because in 10 minutes we'll query another batch, which will further stretch the triage process.
  By delaying the process more and more we're offloading the start of bug reproduction, so let's control the hub sync process more carefully - only perform the next query after the previous batch has completed.
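The gating rule from the commit message above (only query the hub again once the previous batch has been triaged) can be sketched as follows. This is a minimal illustration, not the code from hub.go: the `hubSync` type, its `pending` counter, and the `tick`/`triaged` methods are hypothetical names introduced here.

```go
package main

import "fmt"

// hubSync is a hypothetical stand-in for the manager's hub state.
type hubSync struct {
	pending int // hub programs from the last batch that still await triage
}

// tick is called on every sync interval. It queries the hub (via fetch, which
// returns the number of received programs) only if the previous batch is done.
// It reports whether a query was performed.
func (s *hubSync) tick(fetch func() int) bool {
	if s.pending > 0 {
		return false // previous batch is still being triaged, skip this round
	}
	s.pending = fetch()
	return true
}

// triaged records that n programs from the current batch finished triage.
func (s *hubSync) triaged(n int) {
	s.pending -= n
	if s.pending < 0 {
		s.pending = 0
	}
}

func main() {
	s := &hubSync{}
	fetch := func() int { return 3 }
	fmt.Println(s.tick(fetch)) // first sync: nothing pending, so the hub is queried
	fmt.Println(s.tick(fetch)) // the batch of 3 is still pending, so no query
	s.triaged(3)
	fmt.Println(s.tick(fetch)) // batch finished, the hub is queried again
}
```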
* syz-manager: deprioritize hub reproducers | Aleksandr Nogikh | 2024-07-11 | 1 | -1/+2
  Only request them if there's nothing else to reproduce.
* all: transition to instance.Pool | Aleksandr Nogikh | 2024-07-11 | 1 | -6/+6
  Rely on instance.Pool to perform fuzzing and do bug reproductions. Extract the reproduction queue logic to a separate testable class.
* pkg/corpus: don't keep serialized programs in memory | Dmitry Vyukov | 2024-07-10 | 1 | -15/+14
  We only need serialized representation on some rare operations (some web UI pages, and first hub connect). Don't keep them in memory. In my instance this saves 503MB (15.5%) of heap, which reduces RSS by 1GB (2x due to GC).
* executor: add runner mode | Dmitry Vyukov | 2024-06-24 | 1 | -8/+18
  Move all syz-fuzzer logic into syz-executor and remove syz-fuzzer. Also restore syz-runtest functionality in the manager.
  Update #4917 (sets most signal handlers to SIG_IGN)
* pkg/fuzzer: refactor progTypes | Dmitry Vyukov | 2024-06-03 | 1 | -3/+5
  The next commit will add another Candidate flag. Candidate flags duplicate the progTypes enum, so to avoid conversions of one to another use progTypes in the Candidate struct directly. Rename progTypes to progFlags since multiple can be set, so this is effectively flags rather than a single type.
* pkg/repro, pkg/ipc: use flatrpc.Feature | Dmitry Vyukov | 2024-05-06 | 1 | -2/+2
  Start switching from host.Features to flatrpc.Features. This change is supposed to be a no-op, just to reduce future diffs that will change how we obtain features.
* pkg/rpctype: prepare for not using for target communication | Dmitry Vyukov | 2024-05-03 | 1 | -2/+2
  Remove things that are only needed for target VM communication: conditional compression, timeout scaling, traffic stats. This minimizes diffs when we switch target VM communication to flatrpc.
* pkg/rpctype: allow to disable timeouts | Dmitry Vyukov | 2024-04-11 | 1 | -2/+2
  The fuzzer doesn't much need timeouts for the RPC connection: if it does not receive new programs, we will kill it due to "no output" anyway. But timeouts are problematic when we do parallel calls (Exchange), e.g. one call can cancel the timeout of an existing call. They will also be more problematic if we also send notifications about programs the fuzzer started executing in parallel. And they also marginally slow things down.
  Disable timeouts in the fuzzer.
* syz-manager: don't store whole CheckResult | Dmitry Vyukov | 2024-04-09 | 1 | -1/+1
  Currently we store the whole CheckResult in the manager and send it back to fuzzers. It is somewhat large for both storing in memory and sending each time to fuzzers. We already clear DisabledCalls in CheckResult before storing it to save space. But we also don't need to store/send EnabledSyscalls.
  Currently we use CheckResult.GlobFiles in the fuzzer to update the prog package, but we don't need to do it. That's a leftover from "fuzzer in the VM" times. We don't generate programs in the fuzzer anymore.
  The only bit we really need is CheckResult.Features, so store/send just that.
* all: refactor stats | Dmitry Vyukov | 2024-04-09 | 1 | -9/+24
  Add ability for each package to create and export its own stats. Each stat is self-contained, describes how it should be presented, and there is no need to copy them from one package to another. Stats also keep historical data and allow building graphs over time.
* pkg/rpctype: make RPC compression optional | Dmitry Vyukov | 2024-04-03 | 1 | -2/+8
  RPC compression takes up to 10% of CPU time in profiles, but it's unlikely to be beneficial for local VM runs (we are mostly copying memory in this case). Enable RPC compression based on the VM type (local VMs don't use it, remote machines use it).
* all: move fuzzer to the host | Aleksandr Nogikh | 2024-03-25 | 1 | -6/+6
  Instead of doing fuzzing in parallel in the running VM, make all decisions in the host syz-manager process.
  Instantiate and keep a fuzzer.Fuzzer object in syz-manager and update the RPC between syz-manager and syz-fuzzer to exchange exact programs to execute and their resulting signal and coverage.
  To optimize the networking traffic, exchange mostly only the difference between the known max signal and the detected signal.
* pkg/corpus: a separate package for the corpus functionality | Aleksandr Nogikh | 2024-03-18 | 1 | -2/+2
  pkg/fuzzer and syz-manager have common corpus functionality that can well be unified. Create a separate pkg/corpus package that would be used by both of them. It will simplify further work of moving pkg/fuzzer to the host.
* syz-manager: prefer non-ANY progs in corpus minimization | Aleksandr Nogikh | 2024-02-08 | 1 | -2/+2
  In case of non-squashed programs we can leverage our descriptions in a much better way than just blind mutations of binary blobs.
* syz-manager: treat progs from hub and dashboard differently | Aleksandr Nogikh | 2024-02-02 | 1 | -2/+2
  From dashboard we receive logs, from syz-hub - ready reproducers. If we failed to find a repro from the log, report a failure back to the dashboard. If we succeeded, prepend the options.
* all: restart unfinished bug reproductions | Aleksandr Nogikh | 2024-01-30 | 1 | -3/+3
  There are cases when syz-manager is killed before it could finish bug reproduction. If the bug is frequent, it's not a problem - we might have more luck next time. However, if the bug happened only once, we risk never finding a repro.
  Let syz-managers periodically query the dashboard for crash logs to reproduce.
  Later we can reuse the same API to move repro sharing functionality out from syz-hub.
* syz-manager: improve prog validation errors logging | Aleksandr Nogikh | 2023-11-22 | 1 | -2/+2
  If we received an invalid program from the fuzzer, log it as an error. This should never happen under normal conditions. Include the exact error text in log messages.
* pkg/report: move report.Type to pkg/report/crash | Aleksandr Nogikh | 2023-07-05 | 1 | -2/+3
  This will help avoid a circular dependency pkg/vcs -> pkg/report -> pkg/vcs.
* pkg/report: extract more report types for Linux | Aleksandr Nogikh | 2023-07-05 | 1 | -1/+1
  Amend oops and oopsFormat to contain report type.
* syz-manager: jump to phaseTriagedHub after a timeout | Aleksandr Nogikh | 2023-05-02 | 1 | -7/+18
  At times, syz-hub gets broken and no syz-manager instance can connect to it for quite a while. This basically prevents corpus rotations and reproducer generation from happening.
  If syz-hub is still unreachable after 3 connection attempts, give up and jump to phaseTriagedHub unconditionally.
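The give-up rule described in the commit message above can be sketched like this. Only the limit of 3 attempts and the name phaseTriagedHub come from the commit message; the other phase names, the `connect` callback, and the absence of any delay between attempts are assumptions made to keep the example short.

```go
package main

import (
	"errors"
	"fmt"
)

type phase int

const (
	phaseTriagedCorpus phase = iota // assumed name: local corpus has been triaged
	phaseQueriedHub                 // assumed name: the hub was reached and queried
	phaseTriagedHub                 // from the commit message: hub step is considered done
)

// maxHubAttempts mirrors the "3 connection attempts" from the commit message.
const maxHubAttempts = 3

// errHubDown is a hypothetical error used to simulate an unreachable hub.
var errHubDown = errors.New("hub unreachable")

// phaseAfterHubConnect tries to connect up to maxHubAttempts times.
// On success the manager proceeds as usual; if every attempt fails,
// it jumps straight to phaseTriagedHub so that later work is not blocked.
func phaseAfterHubConnect(connect func() error) phase {
	for attempt := 1; attempt <= maxHubAttempts; attempt++ {
		if err := connect(); err == nil {
			return phaseQueriedHub
		}
	}
	return phaseTriagedHub
}

func main() {
	broken := func() error { return errHubDown }
	fmt.Println(phaseAfterHubConnect(broken) == phaseTriagedHub)
}
```

A real manager would retry on its periodic sync timer rather than in a tight loop; the loop here only illustrates the counting.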
* syzkaller: remove RPC prefix from rpctypes (#2929) | Taras Madan | 2021-12-16 | 1 | -3/+3
  There is no need to use RPC prefix. It is already a part of the element path.
* syz-manager: support oauth when calling syz-hub | Greg Steuck | 2021-07-30 | 1 | -3/+31
  Permit empty hub_key to indicate oauth.
* all: make timeouts configurable | Dmitry Vyukov | 2020-12-28 | 1 | -2/+2
  Add sys/targets.Timeouts struct that parametrizes timeouts throughout the system. The struct allows controlling syscall/program/no output timeouts for OS/arch/VM/etc. See the comment on the struct for more details.
* syz-manager: fix hub stats calculation | Dmitry Vyukov | 2020-12-05 | 1 | -3/+3
  r.Progs is not filled anymore (for legacy managers). Use r.Inputs instead of r.Progs everywhere.
* syz-manager: send domain to hub | Dmitry Vyukov | 2020-12-03 | 1 | -0/+1
  Actually send domain to the hub...
  Update #2095
* syz-hub: support input domains | Dmitry Vyukov | 2020-12-03 | 1 | -11/+51
  Hub input domain identifier (optional). The domain is used to avoid duplicate work (input minimization, smashing) across multiple managers testing similar kernels and connected to the same hub. If two managers are in the same domain, they will not do input minimization after each other. If additionally they are in the same smashing sub-domain, they will also not do smashing after each other.
  By default (empty domain) all managers testing the same OS are placed into the same domain, this is a reasonable setting if managers test roughly the same kernel. In this case they will not do minimization nor smashing after each other.
  The setting can be either a single identifier (e.g. "foo") which will affect both minimization and smashing; or two identifiers separated with '/' (e.g. "foo/bar"), in this case the first identifier affects minimization and both affect smashing.
  For example, if managers test different Linux kernel versions with different tools, a reasonable use of domains on these managers can be:
   - "upstream/kasan"
   - "upstream/kmsan"
   - "upstream/kcsan"
   - "5.4/kasan"
   - "5.4/kcsan"
   - "4.19/kasan"
  Fixes #2095
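The domain rules from the commit message above can be illustrated with a short sketch. It is not the syz-hub implementation: `splitDomain`, `skipMinimize` and `skipSmash` are hypothetical helpers, and the sketch assumes both managers test the same OS.

```go
package main

import (
	"fmt"
	"strings"
)

// splitDomain is a hypothetical helper. For "foo/bar" the minimization
// domain is "foo" and the smashing sub-domain is the full "foo/bar";
// for a single identifier "foo" both are "foo".
func splitDomain(domain string) (minimize, smash string) {
	minimize, _, _ = strings.Cut(domain, "/")
	return minimize, domain
}

// skipMinimize reports whether managers in domains a and b would skip
// input minimization after each other (same first identifier).
func skipMinimize(a, b string) bool {
	am, _ := splitDomain(a)
	bm, _ := splitDomain(b)
	return am == bm
}

// skipSmash reports whether they would also skip smashing after each
// other (both identifiers match).
func skipSmash(a, b string) bool {
	_, as := splitDomain(a)
	_, bs := splitDomain(b)
	return as == bs
}

func main() {
	a, b := "upstream/kasan", "upstream/kmsan"
	fmt.Println(skipMinimize(a, b), skipSmash(a, b))
}
```

With the example domains from the commit message, "upstream/kasan" and "upstream/kmsan" share minimization work but smash inputs independently, while "upstream/kasan" and "5.4/kasan" share neither.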
* syz-manager: more consistently check disabled syscalls | Dmitry Vyukov | 2020-05-12 | 1 | -7/+12
  We have the program "validity" check duplicated 4 times (initially it was just "does it deserialize?"). Then we added program length and disabled syscall checks. But some of the sites have only a subset of checks.
  Factor out the program checking procedure into a separate function and use it at all sites.
* prog: control program length | Dmitry Vyukov | 2020-03-13 | 1 | -1/+2
  We have _some_ limits on program length, but they are really soft. When we ask to generate a program with 10 calls, sometimes we get 100-150 calls. There are also no checks when we accept external programs from corpus/hub. Issue #1630 contains an example where this crashes the VM (the executor limit of 1000 resources is violated). Larger programs also harm the process overall (slower, consume more memory, lead to monster reproducers, etc).
  Add a set of measures for hard control over program length. Ensure that generated/mutated programs are not too long; drop too long programs coming from corpus/hub in manager; drop too long programs in hub. As a bonus, ensure that mutation doesn't produce programs with 0 calls (which is currently possible and happens).
  Fixes #1630
* syz-manager: don't send more than 100K inputs to hub | Dmitry Vyukov | 2020-01-15 | 1 | -0/+7
  Never send more than 100K, this is never healthy but happens episodically due to various reasons: problems with fallback coverage, bugs in kcov, fuzzer exploiting our infrastructure, etc.
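A cap like the one described above could look roughly as follows. The 100K limit is from the commit message; the constant name, the function, and the choice to simply truncate the excess are assumptions for illustration only.

```go
package main

import "fmt"

// maxHubInputs is the 100K limit named in the commit message
// (the constant name itself is hypothetical).
const maxHubInputs = 100000

// capHubInputs returns at most maxHubInputs inputs. Truncating the tail
// is an assumption; the commit message only says that more than 100K
// inputs are never sent.
func capHubInputs(inputs []string) []string {
	if len(inputs) > maxHubInputs {
		return inputs[:maxHubInputs]
	}
	return inputs
}

func main() {
	oversized := make([]string, maxHubInputs+500)
	fmt.Println(len(capHubInputs(oversized)) <= maxHubInputs)
}
```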
* syz-manager: corpus rotation | Dmitry Vyukov | 2019-12-30 | 1 | -2/+1
  Use a random subset of syscalls/corpus/coverage for each individual VM run. The hypothesis is that this should allow the fuzzer to get more coverage and find more bugs in a saturated state (stuck in a local optimum). See the issue and comments for details.
  Update #1348
* pkg/host: rename some features | Dmitry Vyukov | 2019-11-16 | 1 | -1/+1
  Rename some features in preparation for subsequent changes which will align names across the code base.
* syz-manager: reproduce leaks from hub | Dmitry Vyukov | 2019-05-21 | 1 | -0/+11
  pkg/repro only enables leak checking when report type is MemoryLeak. Since repros from hub always have Unknown type, repro won't reproduce leaks. Always set report type to MemoryLeak on leak instances.
* prog: introduce strict parsing mode | Dmitry Vyukov | 2018-12-10 | 1 | -2/+2
  Over time we relaxed parsing to handle all kinds of invalid programs (excessive/missing args, wrong types, etc). This is useful when reading old programs from corpus. But this is harmful for e.g. reading test inputs as they can become arbitrarily outdated. For runtests this creates the additional problem of executing not what is actually written in the test (or at least what the author meant).
  Add strict parsing mode that does not tolerate any errors. For now it just checks excessive syscall arguments.
* tools/syz-runtest: add tool for program unit testing | Dmitry Vyukov | 2018-08-03 | 1 | -1/+1
  The tool is run as:
  $ syz-runtest -config manager.config
  This runs all programs from sys/*/test/* in different modes on actual VMs and checks results.
  Fixes #603
* syz-manager: refactor work with hub | Dmitry Vyukov | 2018-08-02 | 1 | -0/+192
  Move work with hub into a separate file and fully separate its state from the rest of the manager state. First step towards splitting manager into manageable parts. This also required reworking stats as they are used throughout the code.
  Update #538
  Update #605