path: root/pkg/rpcserver/runner.go
Commit message (Author, Age, Files, Lines)
* pkg/rpcserver: fix fallback coverage (Aleksandr Nogikh, 2025-04-28, 1 file, -4/+6)
| If we set it too early, it will be filtered out as it is not within the addresses of the .text segment.
* all: clarify the error in case of ExecFailure (Aleksandr Nogikh, 2025-01-30, 1 file, -1/+4)
| Whenever the status is set, also include the reason. This should make it easier to debug problems at execution and machine check time.
* pkg/rpcserver: refactor to remove Fatalf calls (Aleksandr Nogikh, 2025-01-29, 1 file, -6/+8)
| Apply necessary changes to pkg/flatrpc and pkg/manager as well.
* pkg/rpcserver: prevent a nil pointer dereference (Aleksandr Nogikh, 2025-01-22, 1 file, -1/+4)
| If we get a Hanged != "" response from a non-RequestTypeProgram request, we used to end up trying to serialize a nil *prog.Prog value. Add a missing if condition.
* executor: query globs in the test program context (Dmitry Vyukov, 2024-12-11, 1 file, -8/+23)
| We query globs for 2 reasons:
| 1. Expand glob types in syscall descriptions.
| 2. Dynamic file probing for automatic descriptions generation.
|
| In both of these contexts we are interested in files that will be present during test program execution (rather than normal unsandboxed execution). For example, some files may not be accessible to test programs after pivot root. On the other hand, we create and link some additional files for the test program that don't normally exist.
|
| Add a new request type for querying of globs that are executed in the test program context.
* executor: use any executor if the avoid mask included all of them (Andrei Vagin, 2024-11-18, 1 file, -3/+0)
| After 9fc8fe026baa ("executor: better handling for hanged test processes"), syz-executor's responses may reference procids outside of the [0;procs] range.
|
| If procids are no longer dense on the syz-executor side, we cannot rely on this check in pkg/rpcserver:
|
| ```
| if avoid == (uint64(1)<<runner.procs)-1 {
|         avoid = 0
| }
| ```
|
| Signed-off-by: Andrei Vagin <avagin@google.com>
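To illustrate the point of the commit above, here is a minimal sketch of why a dense-mask comparison stops working once proc ids are sparse. The function name `denseAllAvoided` and the sample ids are hypothetical and do not come from pkg/rpcserver; only the comparison itself is taken from the commit message.

```go
package main

import "fmt"

// denseAllAvoided mirrors the removed check: it assumes that proc ids
// occupy exactly the bits [0, procs) of the avoid mask.
// (Hypothetical helper, for illustration only.)
func denseAllAvoided(avoid uint64, procs int) bool {
	return avoid == (uint64(1)<<procs)-1
}

func main() {
	const procs = 2
	// Dense case: ids 0 and 1 are both avoided.
	dense := uint64(1)<<0 | uint64(1)<<1
	// Assumed sparse case: proc 1 hanged and was replaced by id 2,
	// so the live ids are 0 and 2, and both are avoided.
	sparse := uint64(1)<<0 | uint64(1)<<2
	fmt.Println(denseAllAvoided(dense, procs))  // true
	fmt.Println(denseAllAvoided(sparse, procs)) // false: the check misses this case
}
```

In the sparse case every live proc is avoided, yet the comparison fails, which is consistent with the commit's decision to drop the check and let the executor handle a fully avoided mask.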
* executor: better handling for hanged test processes (Dmitry Vyukov, 2024-10-24, 1 file, -1/+17)
| Currently we kill hanged processes and consider the corresponding test finished. We don't kill/wait for the actual test subprocess (we don't know its pid to kill, and waiting will presumably hang). This has 2 problems:
| 1. If the hanged process causes a "task hung" report, we can't reproduce it, since the test finished too long ago (the manager thinks it's finished and discards the request).
| 2. The test process still consumes per-pid resources.
|
| Explicitly detect and handle such cases: the manager keeps these hanged tests forever, and we assign a new proc id for future processes (don't reuse the hanged one).
* pkg/rpcserver, syz-manager: always include the program from Comm (Aleksandr Nogikh, 2024-09-10, 1 file, -2/+15)
| It does sometimes happen that the kernel crashes so fast that syz-manager is not notified that the syz-executor has started running the faulty input.
|
| In cases when the exact program is known from Comm, let's make sure it's always present in the log of the last executed programs.
* executor: restart procs more deterministically (Dmitry Vyukov, 2024-08-02, 1 file, -8/+0)
| Currently we force restart in rpcserver, but this has 2 problems:
| 1. It does not know the proc where the request will land.
| 2. It does not take into account if the proc has already restarted recently for other reasons.
|
| Restart procs in executor only if they haven't restarted recently. Also make it deterministic. Given all the other randomness we have, there does not seem to be a reason to use randomized restarts and restart after fewer/more runs.
|
| Also restart only after corpus triage. Corpus triage is slow already and there does not seem to be enough benefit to restart during corpus triage.
|
| Also restart at most 1 proc at a time, since there is a lot of serial work in the kernel.
* pkg/fuzzer: try to triage on different VMs (Dmitry Vyukov, 2024-08-02, 1 file, -2/+17)
| Distribute triage requests to different VMs.
* pkg/stat: rename package name to singular form (Dmitry Vyukov, 2024-07-24, 1 file, -7/+7)
| Go package names should generally be singular form:
| https://go.dev/blog/package-names
| https://rakyll.org/style-packages
| https://groups.google.com/g/golang-nuts/c/buBwLar1gNw
* prog: restricts hints to at most 10 attempts per single kernel PC (Dmitry Vyukov, 2024-07-22, 1 file, -0/+9)
| We are getting too many generated candidates, the fuzzer may not keep up with them at all (hints jobs keep growing infinitely). If a hint indeed came from the input w/o transformation, then we should guess it on the first attempt (or at least after a few attempts). If it did not come from the input, or came with a non-trivial transformation, then any number of attempts won't help. So limit the total number of attempts (until the next restart).
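The limit described above can be sketched as a per-PC attempt counter. This is only an illustration under stated assumptions: the type `hintLimiter`, its method `allow`, and the sample PC value are hypothetical and are not the names used in the prog package; only the limit of 10 attempts per kernel PC comes from the commit subject.

```go
package main

import "fmt"

// maxHintAttempts is the per-PC limit named in the commit subject.
const maxHintAttempts = 10

// hintLimiter counts hint attempts per kernel PC. The counters live in
// memory only, so they reset on the next restart.
// (Hypothetical type, for illustration only.)
type hintLimiter struct {
	attempts map[uint64]int
}

func newHintLimiter() *hintLimiter {
	return &hintLimiter{attempts: make(map[uint64]int)}
}

// allow reports whether one more hint may be tried for pc and, if so,
// records the attempt.
func (l *hintLimiter) allow(pc uint64) bool {
	if l.attempts[pc] >= maxHintAttempts {
		return false
	}
	l.attempts[pc]++
	return true
}

func main() {
	l := newHintLimiter()
	const pc = uint64(0xffffffff81000000) // made-up kernel PC
	allowed := 0
	for i := 0; i < 15; i++ {
		if l.allow(pc) {
			allowed++
		}
	}
	fmt.Println(allowed) // 10
}
```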
* pkg/rpcserver: exit on connection loop abortion (Aleksandr Nogikh, 2024-07-15, 1 file, -0/+1)
| For local rpcserver runs, we do not reboot the executor in case of errors. Moreover, if the error did not lead to the executor process exit, we may never detect that something went wrong.
|
| Return an error channel from CreateInstance() to be able to act on connection loop errors. Explicitly register the instance during local executions and exit from RunLocal() in case of connection problems.
* pkg/rpcserver: debug executor stalls (Aleksandr Nogikh, 2024-07-11, 1 file, -15/+59)
| In some cases, the executor seems to be mysteriously silent when we were awaiting a reply. During pkg/runtest tests, give it 1 minute to prepare a reply, then try to request the current state and abort the connection.
* all: transition to instance.Pool (Aleksandr Nogikh, 2024-07-11, 1 file, -0/+21)
| Rely on instance.Pool to perform fuzzing and do bug reproductions.
|
| Extract the reproduction queue logic to a separate testable class.
* pkg/rpcserver: stop the loop on shutdown (Aleksandr Nogikh, 2024-07-08, 1 file, -1/+4)
| There's no sense in continuing the operation once the Runner has been stopped.
|
| If no new requests are coming, the loop goroutine may last a long time since it never actually interacts with the (possibly already closed) socket.
* pkg/rpcserver: capitalize external interface methods (Aleksandr Nogikh, 2024-07-04, 1 file, -9/+9)
| Make it explicit which methods of Runner refer to its implementation and which are supposed to be invoked by its users.
* pkg/rpcserver: remove direct accesses to Runner fields (Aleksandr Nogikh, 2024-07-04, 1 file, -8/+53)
|
* pkg/rpcserver: move handshake functionality to Runner (Aleksandr Nogikh, 2024-07-04, 1 file, -0/+60)
| This allows for a cleaner interface between RPCServer and Runner.
* pkg/fuzzer: remove signal rotation (Dmitry Vyukov, 2024-07-02, 1 file, -3/+2)
| Signal rotation is intended to make the fuzzer re-discover flaky coverage in a non-flaky way. However, taking into account that we get effectively the same effect after each manager restart, and that the fuzzer is overloaded with triage/smash jobs, it does not look to be worth it.
* pkg/rpcserver: move kernel test/data range checks from executor (Dmitry Vyukov, 2024-07-01, 1 file, -6/+49)
| We see some errors of the form:
|
| SYZFAIL: coverage filter is full
| pc=0x80007000c0008 regions=[0xffffffffbfffffff 0x243fffffff 0x143fffffff 0xc3fffffff] alloc=156
|
| Executor shouldn't send non-kernel addresses in signal, but somehow it does. It can happen if the VM memory is corrupted, or if the test program does something very nasty (e.g. discovers the output region and writes to it). It's not possible to reliably filter signal in the tested VM.
|
| Move all of the filtering logic to the host.
|
| Fixes #4942
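Host-side filtering of the kind described above can be sketched as a range check applied to every PC received from the VM. The names `textRange` and `filterSignal` and the range bounds are hypothetical and are not taken from pkg/rpcserver; the only value borrowed from the commit message is the stray `pc=0x80007000c0008`.

```go
package main

import "fmt"

// textRange is an assumed half-open [start, end) range of kernel text
// addresses. (Hypothetical type, for illustration only.)
type textRange struct {
	start, end uint64
}

func (r textRange) contains(pc uint64) bool {
	return pc >= r.start && pc < r.end
}

// filterSignal keeps only the PCs that fall inside the kernel text range.
// It runs on the host, so a corrupted or misbehaving VM cannot bypass it.
func filterSignal(r textRange, pcs []uint64) []uint64 {
	var out []uint64
	for _, pc := range pcs {
		if r.contains(pc) {
			out = append(out, pc)
		}
	}
	return out
}

func main() {
	// Made-up bounds for the example.
	text := textRange{start: 0xffffffff81000000, end: 0xffffffff82000000}
	pcs := []uint64{
		0xffffffff81000010, // inside the range
		0x80007000c0008,    // the stray PC from the error message
		0xffffffff81fffff0, // inside the range
	}
	fmt.Printf("%#x\n", filterSignal(text, pcs))
}
```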
* pkg/flatrpc: rename StartLeakChecks to CorpusTriaged (Dmitry Vyukov, 2024-07-01, 1 file, -3/+3)
| It's a more general name that says what happened rather than a detail of what the executor should do. We can use this notification for other things as well.
* pkg/rpcserver: split rpcserver.go (Aleksandr Nogikh, 2024-06-28, 1 file, -0/+347)
  Split out most of the Runner functionality into a separate file. This should make it easier to reason about what rpcserver.go does and it also makes further pkg/rpcserver refactoring simpler.