path: root/vm/qemu
Commit message | Author | Age | Files | Lines
* all: use any instead of interface{}Dmitry Vyukov2025-12-221-5/+5
| | | | `any` is now preferred over `interface{}` in Go.
* vm/qemu: additional check for crashes only in DiagnoseBabak Huseynov2025-11-211-2/+33
|
* vm: use error wrapping to detect ssh connection errorsAleksandr Nogikh2025-10-011-4/+3
| | | | This is a much cleaner logic than string matching.
* vm/qemu: don't auto retry ssh connection timeout errorsAleksandr Nogikh2025-10-011-0/+7
| | | | | | In almost all cases these mean some boot time crash. It also doesn't make much sense to continue string matching since the boot output may contain the matched strings in benign contexts.
* vm: add context to Pool.Create()Aleksandr Nogikh2025-10-011-1/+4
| | | | | | | | | | Enable external abortion of the instance creation process. This is especially useful for the qemu case where we retry the creation/boot up to 1000 times, which can take significant time (e.g. it times out syz-cluster pods on unstable kernels). The context can be further propagated to WaitForSSH, but that requires another quite significant vm/ refactoring.
* all: apply linter auto fixesTaras Madan2025-07-171-1/+1
| | | | ./tools/syz-env bin/golangci-lint run ./... --fix
* vm/qemu: use virtio-net-ccw as virtual netdev on s390x archAlexander Egorenkov2025-07-011-2/+2
| | | | | | | | | | | | | | | | virtio-net-ccw is the preferred way to set up a virtual network interface on s390x at the moment because it is faster than virtio-net-pci (eventfd and irqfd are missing). This also allows disabling of zPCI in QEMU which was required only because virtio-net-pci was used as a network interface. PCI is special on s390x and, for instance, does not use MMIO or expose topology [1,2,3]. Furthermore, any features like PXE are not supported with virtio-net-pci on s390x.
| [1] https://people.redhat.com/~cohuck/2018/02/19/notes-on-pci-on-s390x.html
| [2] https://wiki.qemu.org/Documentation/Platforms/S390X#A_note_on_PCI_support
| [3] https://www.qemu.org/docs/master/system/s390x/pcidevices.html
| Signed-off-by: Alexander Egorenkov <eaibmz@gmail.com>
* vm: func Run accepts contextTaras Madan2025-05-191-3/+3
| | | | It allows using the context as the single source of termination signals.
* Revert "vm/qemu: use -machine virt and -cpu max for arm32"Aleksandr Nogikh2025-05-061-2/+2
| | | | This reverts commit 85a5a23f228f2de970f578bf3b452a23a222c09d.
* vm/qemu: use -machine virt and -cpu max for arm32Aleksandr Nogikh2025-04-291-2/+2
| | | | | | | The previously used combination does not boot our buildroot image:
|   [ 6.334727][ T1] Run /sbin/init as init process
|   [ 6.668200][ T1] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000004
* vm/qemu: fix wrong arg usageTaras Madan2025-03-281-1/+1
| | | | Closes #5870.
* vm: use SSHOptions instead of 4 paramsTaras Madan2025-03-271-25/+26
| | | | It reduces WaitForSSH parameter count from 9 to 6.
* vm/qemu: run riscv64 kernel using 4-level page tableAlexandre Ghiti2025-02-131-1/+1
| | | | | RISC-V is far from having hardware with 5-level page table support, so let's focus on 4-level.
* vm/qemu: retry on Address already in use errorsAleksandr Nogikh2025-02-131-0/+3
| | | | | | The chance of port collision is very low, but still not 0. There's no reason to report an error on the first occurrence of the problem; let it first retry 100 times.
* vm: use -cpu cortex-a15 for qemu/arm32Aleksandr Nogikh2024-12-031-2/+3
| | | | | | | | | The new qemu versions began to fail with the settings we previously used. It's probably not worth extensive debugging, so let's just do what qemu suggests.
|   qemu-system-arm: Invalid CPU model: max
|   The only valid type is: cortex-a15
* vm: dedup VM count restriction in debug modeDmitry Vyukov2024-11-251-4/+0
| | | | | | | Move the VM count restriction logic into the vm package. This avoids lots of duplication, makes it supported for VM types that failed to do this, and allows unifying more VM count logic in the future.
* vm/qemu: do not pass `-accel tcg,thread=multi` on arm64Alexander Potapenko2024-10-221-1/+1
| | | | | | Even though we are yet to see arm64 hosts on which `-accel kvm` works properly, require the users to explicitly request TCG in their manager configs.
* vm/qemu: increase max number of VMsDmitry Vyukov2024-09-271-2/+2
| | | | I want to create more than 128.
* vm/qemu: enable sve128 on ARM instancesAlexander Potapenko2024-09-251-1/+1
| | | | This seems to be an acceptable compromise between speed and coverage
* vm/qemu: extend error messagesDmitry Vyukov2024-08-161-3/+3
| | | | | Include VM output into snapshot error messages. Otherwise it's hard to understand what happened.
* vm/qemu: use the maximum available CPU on ARM64Alexander Potapenko2024-07-291-4/+7
| | | | | | This is needed to have access to newer features like nested virtualization. Because those features slow down CPU emulation in QEMU, disable SVE and pointer authentication, which are of less importance for us now.
* vm/qemu: use the maximum available VGIC on arm64Alexander Potapenko2024-07-291-1/+1
| | | | | Newer virtual IRQ controllers provide more features, so this should hopefully increase the coverage.
* all: add qemu snapshotting modeDmitry Vyukov2024-07-253-0/+299
|
* vm/qemu: refactor boot functionDmitry Vyukov2024-07-251-49/+56
| | | | | Move qemu argument building into a separate function to prevent a linter error about max function length in the next commits.
* vmimpl: refactor VM type registrationDmitry Vyukov2024-07-231-1/+4
| | | | | | | | | Pass Type struct directly during registration. This allows adding additional optional parameters to VM types without changing all VM implementations. We will need to add a SupportsSnapshots flag and one flag to resolve #5028. With this change it will be possible to add "SupportsSnapshots: true" to just one VM type implementation.
* vm: make Instance implement io.CloserAleksandr Nogikh2024-07-111-1/+2
| | | | It's better to follow standard interfaces.
* vm/qemu: don't log qmp on level 1Dmitry Vyukov2024-07-111-1/+1
| | | | | If qmp is used all the time for snapshotting, it produces tons of uninteresting logs at level 1 (manager web UI).
* vm/qemu: better handle qmp errorsDmitry Vyukov2024-07-081-2/+14
| | | | | | Sometimes qemu just returns an "Error: ..." string in reply instead of returning an error. Handle these cases. Also log all qmp commands in debug mode.
* vm: refactor vm.Multiplex argumentsAleksandr Nogikh2024-07-011-1/+5
| | | | | Introduce a MultiplexConfig structure that contains optional parameters. Include a Scale parameter to control the intended slowdown.
* vm/qemu: use the default vmimpl.Multiplex() functionAleksandr Nogikh2024-07-011-28/+1
|
* vm/qemu: remove an unused diagnose fieldAleksandr Nogikh2024-07-011-6/+0
| | | | We never write to the channel.
* executor: add runner modeDmitry Vyukov2024-06-241-8/+3
| | | | | | | Move all syz-fuzzer logic into syz-executor and remove syz-fuzzer. Also restore syz-runtest functionality in the manager. Update #4917 (sets most signal handlers to SIG_IGN)
* pkg/rpctype: prepare for not using for target communicationDmitry Vyukov2024-05-031-1/+1
| | | | | | Remove things that are only needed for target VM communication: conditional compression, timeout scaling, traffic stats. This minimizes diffs when we switch target VM communication to flatrpc.
* vm/qemu: don't use UseNewQemuImageOptions for NetBSDAleksandr Nogikh2024-04-161-5/+4
| | | | It seems to bring more problems than it solves.
* all: remove akaros supportDmitry Vyukov2024-04-151-7/+1
| | | | | | | Akaros support is unused, it has been shut down on syzbot for a while, and Akaros development seems to have been frozen for years as well. We have a bunch of hacks for Akaros since it supported only a very old gcc and did not support Go. Remove it.
* vm/qemu: use the new options format for NetBSDAleksandr Nogikh2024-04-111-4/+5
| | | | | We're seeing a lot of `Image format was not specified for '%PATH' and probing guessed raw.` errors.
* pkg/rpctype: make RPC compression optionalDmitry Vyukov2024-04-031-1/+1
| | | | | | | | RPC compression takes up to 10% of CPU time in profiles, but it's unlikely to be beneficial for local VM runs (we are mostly copying memory in this case). Enable RPC compression based on the VM type (local VMs don't use it, remote machines use it).
* vm/isolated: allow the use of system-wide SSH configFlorent Revest2024-03-191-4/+4
| | | | | | | | | | | | Most of the VM types tightly manage the target they SSH into and can safely assume that system-wide SSH configuration would mess with the SSH flags provided by syzkaller. However, in the "isolated" VM type, one can connect to a host that is not at all managed by syzkaller. In this case, it can be useful to leverage system-wide SSH config, maybe provided by a corporate environment. This adds an option to the isolated config to skip some of the SSH and SCP flags that would drop system-wide config.
* vm/qemu.go: fix nil-ptr-deref in ctorSungwoo Kim2024-02-211-1/+1
| | | | | | | | | | | | os.Stat() may return (nil, err) if it fails to open a file. So the check below wrongly dereferences st, which is always nil when err != nil, causing a nil pointer dereference in st.Size(): `if st, err := os.Stat(inst.image); err != nil && st.Size() == 0 {` To fix this, this patch calls st.Size() only if err == nil.
* vm/qemu: forward pprof portAleksandr Nogikh2024-01-101-2/+9
| | | | Forward the default pprof port to enable direct connections from the host.
* all: use special placeholder for errorsTaras Madan2023-07-241-3/+3
|
* vm: speed up arm/arm64 emulationAleksandr Nogikh2023-06-281-2/+2
| | | | | The `-accel tcg,thread=multi` option speeds up boot by ~25%. Execution speed should also increase.
* vm/qemu: initial freebsd/riscv64 supportP1umer2023-01-031-0/+7
|
* vm/qemu: prevent network device renaming on s390x archAlexander Egorenov2022-11-141-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a temporary work-around for s390x until it supports CONFIG_CMDLINE. Failing to do so might cause a failure to establish an SSH connection when syz-ci tests a built image.
| syz-ci output:
|   building kernel...
|   testing image...
|   VM boot failed with: can't ssh into the instance
| failure log:
|   failed to run ["ssh" "-p" "34490" ... "root@localhost" "pwd"]: exit status 255
|   Connection timed out during banner exchange
|   Connection to 127.0.0.1 port 34490 timed out
| qemu dmesg:
|   [ 6.646475] virtio_net virtio0 eno1: renamed from eth0
| Signed-off-by: Alexander Egorenov <eaibmz@gmail.com>
* vm/qemu: return Fuchsia heartbeat period to default (#3389)eepeep2022-09-211-1/+0
| | | | | | PR #3387 inadvertently set the heartbeat period to the same value as the heartbeat age threshold, which is incorrect. This removes that configuration line, allowing the period to revert to its default of 1sec.
* vm/qemu: support VFs > 8George Kennedy2022-09-211-0/+23
| | | | | | | | | | | Syzkaller currently only supports 8 (0-7) pass-through VFs. Add support for VFs > 8 by incrementing the Device # and resetting the VF # to zero when INDEX modulo 8 = zero. Introduce "{{FN%8}}" to trigger this support. Ex: vfio-pci,host=31:0a.{{FN%8}} Signed-off-by: George Kennedy <george.kennedy@oracle.com>
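One way to read the rule above (bump the device number every 8 instances, use the index modulo 8 as the function number) is sketched below. This is an assumption-laden illustration, not the code from the commit: the function `expandVF` is hypothetical, and it assumes the placeholder is directly preceded by a two-digit hexadecimal device number and a dot, as in the example `vfio-pci,host=31:0a.{{FN%8}}`.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// expandVF is a hypothetical helper that replaces "{{FN%8}}" in arg.
// The function number becomes index%8 and the PCI device number that
// precedes the placeholder is incremented by index/8.
func expandVF(arg string, index int) (string, error) {
	const marker = "{{FN%8}}"
	pos := strings.Index(arg, marker)
	if pos < 0 {
		return arg, nil // nothing to expand
	}
	// Expect "DD." right before the marker: two hex digits and a dot.
	if pos < 3 || arg[pos-1] != '.' {
		return "", fmt.Errorf("no device number before %v in %q", marker, arg)
	}
	dev, err := strconv.ParseInt(arg[pos-3:pos-1], 16, 64)
	if err != nil {
		return "", fmt.Errorf("bad device number in %q: %w", arg, err)
	}
	dev += int64(index / 8)
	return fmt.Sprintf("%s%02x.%d%s", arg[:pos-3], dev, index%8, arg[pos+len(marker):]), nil
}

func main() {
	for _, index := range []int{0, 7, 8, 17} {
		out, err := expandVF("vfio-pci,host=31:0a.{{FN%8}}", index)
		fmt.Println(index, out, err)
	}
}
```

The sketch does not validate that the resulting device number stays within the PCI limit; real code would need to handle that case.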
* vm/qemu: move timeout before retry to avoid resource busyGeorge Kennedy2022-09-211-0/+3
| | | | | | | | Add "Device or resource busy" check to delay loop in function Create to avoid resource busy caused by qemu "lazy release" of VFs when VMs are restarted. Signed-off-by: George Kennedy <george.kennedy@oracle.com>
* vm/qemu: relax Fuchsia lockup detector thresholdsCameron Finucane2022-09-201-0/+8
| | | | | Running with nested virtualization, this was causing many false positives, so we relax it to a similar level as used for Linux targets.
* vm/qemu/qemu.go: changed deprecated nowait option to preferred wait=offAdam Goska2021-12-201-1/+1
| | | | | | Invoking qemu with the nowait option produces a warning that the short-form boolean options are deprecated and that wait=off is preferred.
* vm/qemu: handle QMP eventsAlexey Kardashevskiy2021-10-221-4/+17
| | | | | | | | | | | | QEMU occasionally sends events in the same stream used for QMP commands, so from time to time the received packet is not a QMP response but a QMP event, which breaks the parser. For example, events are sent when the machine state changes. This adds basic support for events. For now we skip them and wait until the expected QMP command response arrives. Signed-off-by: Alexey Kardashevskiy <aik@linux.ibm.com>