path: root/prog/encoding.go
Commit message (Author, Age, Files, Lines)
* all: use any instead of interface{} (Dmitry Vyukov, 2025-12-22, 1 file, -4/+4)
| Any is preferred over interface{} now in Go.
* prog: take multiple serialization flags (Aleksandr Nogikh, 2025-11-03, 1 file, -17/+35)
| Refactor Prog.Serialize() to accept a variadic list of flags.
| For now, two are supported:
| 1) Verbose (equal to SerializeVerbose()).
| 2) SkipImages (don't serialize fs images).
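The variadic-flag pattern described above can be sketched as follows. This is only an illustration: the type `SerializeFlag`, the struct `serializeOpts`, and the function `parseFlags` are hypothetical names, not taken from the syzkaller source; only the flag names Verbose and SkipImages come from the commit message.

```go
package main

import "fmt"

// SerializeFlag is a hypothetical flag type; only the flag names below
// come from the commit message.
type SerializeFlag int

const (
	Verbose SerializeFlag = iota
	SkipImages
)

// serializeOpts (hypothetical) collects the effect of all flags passed by the caller.
type serializeOpts struct {
	verbose    bool
	skipImages bool
}

// parseFlags folds a variadic flag list into an options struct.
// Calling it with no flags yields the default behaviour.
func parseFlags(flags ...SerializeFlag) serializeOpts {
	var opts serializeOpts
	for _, f := range flags {
		switch f {
		case Verbose:
			opts.verbose = true
		case SkipImages:
			opts.skipImages = true
		}
	}
	return opts
}

func main() {
	fmt.Printf("%+v\n", parseFlags())
	fmt.Printf("%+v\n", parseFlags(Verbose, SkipImages))
}
```

A variadic parameter keeps existing call sites with no arguments compiling unchanged, which is presumably why it suits a refactoring like this one.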
* all: delete dead code (Taras Madan, 2025-02-10, 1 file, -4/+0)
| go install golang.org/x/tools/cmd/deadcode@latest
| deadcode -test ./...
* all: use min/max functions (Dmitry Vyukov, 2025-01-17, 1 file, -6/+2)
| They are shorter, more readable, and don't require temp vars.
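For illustration, here is the kind of rewrite this refers to, assuming Go 1.21 or later where `min` and `max` are builtins. The clamp functions are made-up examples, not code from the repository.

```go
package main

import "fmt"

// clampOld limits n to [lo, hi] with explicit branches, the way it had to
// be written before the min/max builtins existed.
func clampOld(n, lo, hi int) int {
	if n < lo {
		n = lo
	}
	if n > hi {
		n = hi
	}
	return n
}

// clampNew does the same with the min and max builtins (Go 1.21+).
func clampNew(n, lo, hi int) int {
	return min(max(n, lo), hi)
}

func main() {
	fmt.Println(clampOld(15, 0, 10), clampNew(15, 0, 10))
	fmt.Println(clampOld(-3, 0, 10), clampNew(-3, 0, 10))
}
```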
* all: follow new linter recommendations (Taras Madan, 2024-09-10, 1 file, -4/+8)
|
* prog: fix validation of DataMmapProg (Dmitry Vyukov, 2024-05-06, 1 file, -33/+16)
| Allow serializing/deserializing DataMmapProg and fix validation in debug mode.
|
| Fixes #4750
* prog: add raw deserialization mode (Dmitry Vyukov, 2024-04-29, 1 file, -5/+18)
| Raw deserialization mode does not do any program sanitization and allows the use of global file names, prohibited ioctls, etc.
| This will be useful for moving syscall/feature checking code to the host: we will need to probe opening global files, etc.
* prog: fix selection of args eligible for squashing (Dmitry Vyukov, 2024-04-15, 1 file, -3/+6)
| This fixes 3 issues:
|
| 1. We intended to squash only 'in' pointer elems, but we looked at the pointer direction rather than elem direction. Since pointers themselves are always 'in' we squashed a number of types we didn't want to squash.
|
| 2. We can squash filenames, which can lead to generation of escaping filenames, e.g. fuzzer managed to create "/" filename for blockdev_filename as:
|        mount(&(0x7f0000000000)=ANY=[@ANYBLOB='/'], ...)
|    Don't squash filenames.
|
| 3. We analyzed a concrete arg to see if it contains something we don't want to squash (e.g. pointers). But the whole type can still contain unsupported things in inactive union options, or in 0-sized arrays. E.g. this happened in the mount case above.
|    Analyze the whole type to check for unsupported things.
|
| This also moves most of the analysis to the compiler, so mutation will be a bit faster.
|
| This removes the following linux types from squashing.
|
| 1. These are not 'in':
|    btrfs_ioctl_search_args_v2
|    btrfs_ioctl_space_args
|    ethtool_cmd_u
|    fscrypt_add_key_arg
|    fscrypt_get_policy_ex_arg
|    fsverity_digest
|    hiddev_ioctl_string_arg
|    hidraw_report_descriptor
|    ifreq_dev_t[devnames, ptr[inout, ethtool_cmd_u]]
|    ifreq_dev_t[ipv4_tunnel_names, ptr[inout, ip_tunnel_parm]]
|    ifreq_dev_t["sit0", ptr[inout, ip_tunnel_prl]]
|    io_uring_probe
|    ip_tunnel_parm
|    ip_tunnel_prl
|    poll_cq_resp
|    query_port_cmd
|    query_qp_resp
|    resize_cq_resp
|    scsi_ioctl_probe_host_out_buffer
|    sctp_assoc_ids
|    sctp_authchunks
|    sctp_getaddrs
|    sctp_getaddrs_old
|
| 2. These contain pointers:
|    binder_objects
|    iovec[in, netlink_msg_route_sched]
|    iovec[in, netlink_msg_route_sched_retired]
|    msghdr_netlink[netlink_msg_route_sched]
|    msghdr_netlink[netlink_msg_route_sched_retired]
|    nvme_of_msg
|
| 3. These contain filenames:
|    binfmt_script
|    blockdev_filename
|    netlink_msg_route_sched
|    netlink_msg_route_sched_retired
|    selinux_create_req
* prog: auto-set proper conditional fields in Deserialize() (Aleksandr Nogikh, 2024-03-13, 1 file, -1/+12)
| Treat all default union arguments as transient and reevaluate them after the call was fully parsed.
|
| Before conditional field patching, we do need to have performed arg validation, which also reevaluates conditions. To break the cycle, make validation configurable.
* Revert "prog: auto-set proper conditional fields in Deserialize()" (Aleksandr Nogikh, 2024-03-08, 1 file, -12/+1)
| This reverts commit 8e75c913b6f9b09cab2ad31fd7d66ea0d1703de8.
* prog: auto-set proper conditional fields in Deserialize() (Aleksandr Nogikh, 2024-03-08, 1 file, -1/+12)
| Treat all default union arguments as transient and reevaluate them after the call was fully parsed.
|
| Before conditional field patching, we do need to have performed arg validation, which also reevaluates conditions. To break the cycle, make validation configurable.
* prog: make invalid union field error more explicit (Aleksandr Nogikh, 2024-02-19, 1 file, -1/+4)
| Include the name of the union and list the correct options.
* prog: highlight error location (Aleksandr Nogikh, 2024-02-19, 1 file, -2/+7)
| Simplify seed program debugging by highlighting the actual error location in the source text.
* prog: support AUTO for structs (Paul Chaignon, 2023-11-13, 1 file, -2/+13)
| The AUTO token can currently be used in place of ConstType, LenType, and CsumType. This commit extends this support to StructType on the condition that they are made of the above types. It will allow us to simplify all the {AUTO, AUTO, AUTO, AUTO} in test programs.
|
| The following struct can therefore be written as AUTO.
|
|     auto_struct1 {
|         f0  len[parent, int32]
|         f1  const[0x43, int32]
|     }
|
| In addition, with this commit, AUTO can also stand for nested structs as long as all the leaf fields are ConstType, LenType, or CsumType. For example, the following struct can be written as AUTO.
|
|     auto_struct2 {
|         f0  len[parent, int32]
|         f1  auto_struct1
|     }
|
| Suggested-by: Aleksandr Nogikh <nogikh@google.com>
| Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
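The eligibility rule stated in this message (const, len and csum leaves, plus structs built only from such leaves) amounts to a small recursive check. The sketch below uses hypothetical, simplified `Kind` and `Type` definitions that stand in for the real prog type system; it is not the actual implementation.

```go
package main

import "fmt"

// Kind and Type are hypothetical, simplified stand-ins for the prog types.
type Kind int

const (
	KindConst Kind = iota
	KindLen
	KindCsum
	KindInt
	KindStruct
)

type Type struct {
	Kind   Kind
	Fields []*Type // only meaningful for KindStruct
}

// autoEligible reports whether a value of type t could be written as AUTO:
// it is a const/len/csum leaf, or a struct whose fields all qualify.
func autoEligible(t *Type) bool {
	switch t.Kind {
	case KindConst, KindLen, KindCsum:
		return true
	case KindStruct:
		for _, f := range t.Fields {
			if !autoEligible(f) {
				return false
			}
		}
		return true
	default:
		return false
	}
}

func main() {
	// Mirrors auto_struct1 and auto_struct2 from the commit message.
	s1 := &Type{Kind: KindStruct, Fields: []*Type{{Kind: KindLen}, {Kind: KindConst}}}
	s2 := &Type{Kind: KindStruct, Fields: []*Type{{Kind: KindLen}, s1}}
	fmt.Println(autoEligible(s1), autoEligible(s2))
}
```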
* prog: add helper function parser.HasNext (Paul Chaignon, 2023-11-13, 1 file, -0/+15)
| This helper function will be used in a subsequent commit to take a look ahead at several characters.
|
| Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
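A multi-character lookahead helper of this kind might look roughly like the sketch below. The `parser` struct and its fields are assumptions made for the example; only the name HasNext comes from the commit subject, and the real signature may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// parser is a hypothetical minimal parser state: the input line and the
// offset of the next unread byte (0 <= i <= len(s)).
type parser struct {
	s string
	i int
}

// HasNext reports whether the unread input starts with str.
// It only peeks; the read position is left unchanged.
func (p *parser) HasNext(str string) bool {
	return strings.HasPrefix(p.s[p.i:], str)
}

func main() {
	p := &parser{s: "AUTO, 0x1)"}
	fmt.Println(p.HasNext("AUTO"), p.HasNext("0x"))
}
```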
* prog: more informative errors on parsing failures (Paul Chaignon, 2023-10-12, 1 file, -2/+2)
| This simply prints more information when failing to parse unions and structs in syzkaller programs. This is particularly useful when a syscall has multiple structs or unions (e.g. for bpf$PROG_LOAD).
|
| Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
* all: use special placeholder for errors (Taras Madan, 2023-07-24, 1 file, -2/+2)
|
* pkg/image: make Decompress easier to use (Dmitry Vyukov, 2022-12-22, 1 file, -2/+1)
| Change DecompressWriter to DecompressCheck: checking validity of the image is the only useful use of DecompressWriter.
|
| Change Decompress to MustDecompress which does not return an error. We check validity during program deserialization, so all other uses already panic on errors.
|
| Also add dtor return value in preparation for subsequent changes.
* pkg/image: factor out from prog (Dmitry Vyukov, 2022-12-22, 1 file, -5/+6)
| Move image compression-related functions to a separate package, in preparation for subsequent changes that make decompression more complex.
| Prog package is already large and complex. This also makes running compression tests/benchmarks much faster.
* prog: don't materialize uncompressed image in Deserialize (Dmitry Vyukov, 2022-11-25, 1 file, -3/+3)
| Currently we uncompress all images in Deserialize to check that the data is valid. As a result, deserializing all seeds we have takes ~40 seconds of real time and ~125 seconds of CPU time. And we do this during every syz-manager start.
|
| Don't materialize the uncompressed image. This reduces real time to ~15 seconds and CPU time to 18 seconds (no garbage collections). In syz-manager the benefit is even larger since garbage collections take longer (larger heap).
* prog: handle broken base64 strings (Aleksandr Nogikh, 2022-11-22, 1 file, -2/+2)
| Currently it can fail if there's never a closing quote.
| Add a test to verify this behavior.
* prog: introduce new Base64 syntax for data (Hrutvik Kanabar, 2022-11-21, 1 file, -21/+49)
| The new "$..." syntax is read as Base64-encoded binary data. Note that users cannot specify the size of the Base64 syntax using the `"..."/<size>` notation.
|
| When serialising programs to human-readable form, only compressed types (determined by `IsCompressed()`) are represented using the new Base64 notation.
|
| Also add a couple of serialisation tests, checking behaviour for compressed and non-compressed types.
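As a rough illustration of the "$..." notation, the sketch below encodes and decodes such a literal. The function names are hypothetical, and the use of the standard padded Base64 alphabet (`base64.StdEncoding`) is an assumption; the commit message does not say which variant is used.

```go
package main

import (
	"encoding/base64"
	"errors"
	"fmt"
	"strings"
)

// encodeData (hypothetical) renders binary data as a "$..." literal.
// Assumption: standard padded Base64.
func encodeData(data []byte) string {
	return `"$` + base64.StdEncoding.EncodeToString(data) + `"`
}

// decodeData (hypothetical) parses a "$..." literal back into bytes.
// It rejects input without enclosing quotes or without the '$' marker.
func decodeData(lit string) ([]byte, error) {
	if len(lit) < 2 || lit[0] != '"' || lit[len(lit)-1] != '"' {
		return nil, errors.New("data literal must be enclosed in double quotes")
	}
	body := lit[1 : len(lit)-1]
	if !strings.HasPrefix(body, "$") {
		return nil, errors.New("not a Base64 literal")
	}
	return base64.StdEncoding.DecodeString(body[1:])
}

func main() {
	lit := encodeData([]byte("hello"))
	data, err := decodeData(lit)
	fmt.Println(lit, string(data), err)
}
```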
* prog, pkg/compiler: add `BufferCompressed` buffer type & `compressed_image` builtin (Hrutvik Kanabar, 2022-11-21, 1 file, -0/+9)
| Create the `BufferCompressed` kind of `BufferType`, which will be used to represent compressed data. Create the corresponding `compressed_image` syzlang builtin, which is backed by `BufferCompressed`. For now, no syscalls use this feature - this will be introduced in future commits.
|
| We have to be careful to decompress the data before mutating, and re-compress before storing. We make sure that any deserialised `BufferCompressed` data is valid too.
|
| `BufferCompressed` arguments are mutated using a generic heatmap. In future, we could add variants of `BufferCompressed` or populate the `BufferType` sub-kind, using it to choose different kinds of heatmap for different uncompressed data formats.
|
| Various operations on compressed data must be forbidden, so we check for `BufferCompressed` in key places. We also have to ensure `compressed_image` can only be used in syscalls that are marked `no_{generate,minimize}`. Therefore, we add a generic compiler check which allows type descriptions to require attributes on the syscalls which use them.
* prog: error if program variable refers to non-resource (Dmitry Vyukov, 2022-01-11, 1 file, -0/+2)
| Error if program variable refers to non-resource in strict parsing mode.
| Such errors are hard to diagnose otherwise since the variable is silently discarded.
* all: replace collide mode by `async` call property (Aleksandr Nogikh, 2021-12-10, 1 file, -0/+3)
| Replace the currently existing straightforward approach to race triggering (that was almost entirely implemented inside syz-executor) with a more flexible one.
|
| The `async` call property instructs syz-executor not to block until the call has completed execution and proceed immediately to the next call.
|
| The decision on what calls to mark with `async` is made by syz-fuzzer. Ultimately this should let us implement more intelligent race provoking strategies as well as make more fine-grained reproducers.
* prog: don't use reflect.Value.IsZero (Dmitry Vyukov, 2021-09-30, 1 file, -1/+2)
| reflect.Value.IsZero was added in go1.13 and is not available in Appengine SDK.
| Replace it with DeepEqual+Zero.
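The DeepEqual+Zero replacement can be sketched as below. The `callProps` struct and the `isZero` helper are hypothetical names for this example; the `reflect.Zero` and `reflect.DeepEqual` calls are standard library functions. The helper assumes a non-nil argument, since `reflect.ValueOf(nil).Type()` would panic.

```go
package main

import (
	"fmt"
	"reflect"
)

// callProps is a hypothetical struct used only for this example.
type callProps struct {
	FailNth int
	Async   bool
}

// isZero reports whether x equals the zero value of its own dynamic type,
// without relying on reflect.Value.IsZero. x must be non-nil.
func isZero(x any) bool {
	zero := reflect.Zero(reflect.TypeOf(x)).Interface()
	return reflect.DeepEqual(x, zero)
}

func main() {
	fmt.Println(isZero(callProps{}), isZero(callProps{FailNth: 1}))
}
```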
* all: refactor fault injection into call props (Aleksandr Nogikh, 2021-09-22, 1 file, -4/+2)
| Now that call properties mechanism is implemented, we can refactor fault injection.
|
| Unfortunately, it is impossible to remove all traces of the previous approach. In reprolist and while performing syz-ci jobs, syzkaller still needs to parse the old format.
|
| Remove the old prog options-based approach whenever possible and replace it with the use of call properties.
* all: introduce call properties (Aleksandr Nogikh, 2021-09-22, 1 file, -2/+75)
| Call properties let us specify how each individual call within a program must be executed. So far the only way to enforce extra rules was to pass extra program-level properties (e.g. that is how fault injection was done). However, it entangles the logic and is not flexible enough.
|
| Implement an ability to pass properties along with each individual call.
* all: introduce a prog.Call constructor (Aleksandr Nogikh, 2021-09-22, 1 file, -5/+2)
| Create a constructor for the prog.Call type. It reduces code duplication now and during further changes.
* syz-manager, syz-fuzzer: filter stale glob values in the corpus (Dmitry Vyukov, 2021-06-26, 1 file, -1/+2)
| Corpus may accumulate glob values that are already filtered out by descriptions (e.g. some harmful files), for an example see:
| https://groups.google.com/g/syzkaller-bugs/c/W_R0O4XWpfY/m/sdwwg2_hAwAJ
|
| Pass glob files to the manager and filter out values that are not present in the glob already.
|
| Also use the same caching scheme we use for features and enabled syscalls so that fuzzers don't need to scan globs every time.
* pkg/compiler: add glob type (Joey Jiaojg, 2021-05-26, 1 file, -1/+1)
| * all: add new typename dirname
|
| The current way to check files under sysfs or proc is:
| - define a string to represent each file
| - open the file
| - pass the fd to write / read / close
|
| The issues with this approach are:
| - need to know which files are present on the target device
| - need to write openat for each file
|
| With dirname added, one file in the directory is opened at random and its fd is then passed to write/read/close.
|
| * all: use typename glob to match filename
|
| Fixes #481
* prog: allow arbitrary long lines in serialized programs (Dmitry Vyukov, 2020-09-20, 1 file, -21/+23)
| We use bufio.Scanner and it has a mandatory limit on line length. The file system tests (sys/linux/test/syz_mount_image_*) have very long lines (megabytes).
| Remove the restriction on line length.
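The limit in question is `bufio.MaxScanTokenSize` (64 KiB by default), past which a default `bufio.Scanner` stops with `bufio.ErrTooLong`. The sketch below demonstrates the failure and one way around it, splitting on newlines by hand. The manual split is an assumption chosen for illustration, not necessarily what the commit does, and all function names are hypothetical.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
)

// scannerTooLong reports whether a default bufio.Scanner stops on data with
// bufio.ErrTooLong, i.e. some line exceeded bufio.MaxScanTokenSize.
func scannerTooLong(data []byte) bool {
	s := bufio.NewScanner(bytes.NewReader(data))
	for s.Scan() {
	}
	return s.Err() == bufio.ErrTooLong
}

// lineLengths splits data on '\n' by hand and returns the length of each
// line, so line length is bounded only by available memory.
func lineLengths(data []byte) []int {
	var lens []int
	for len(data) > 0 {
		line := data
		if nl := bytes.IndexByte(data, '\n'); nl >= 0 {
			line, data = data[:nl], data[nl+1:]
		} else {
			data = nil
		}
		lens = append(lens, len(line))
	}
	return lens
}

// makeInput builds one line of n bytes followed by the line "short".
func makeInput(n int) []byte {
	return append(bytes.Repeat([]byte{'a'}, n), []byte("\nshort\n")...)
}

func main() {
	in := makeInput(1 << 20)
	fmt.Println(scannerTooLong(in), lineLengths(in))
}
```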
* prog: extend error message on deserialization error (Dmitry Vyukov, 2020-09-20, 1 file, -1/+1)
|
* pkg, prog: add per-field direction attribute (Necip Fazil Yildiran, 2020-08-13, 1 file, -7/+10)
|
* prog/alloc: align address allocation for aligned[addr] (Albert van der Linde, 2020-07-14, 1 file, -1/+1)
| Calls to alloc didn't respect the alignment attribute. Now Type.Alignment() is used to ensure each type is correctly aligned. Existing descriptions with [align[X]] don't have an issue as they align to small blocks and default align is to 64 bytes. This commit adds support for [align[X]] for an X larger than 64.
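Rounding an address up to a given alignment is the arithmetic underneath a change like this. The following is a generic sketch, not the allocator's actual code: `alignUp` is a hypothetical helper and assumes the alignment is a power of two.

```go
package main

import "fmt"

// alignUp rounds addr up to the next multiple of align.
// align must be a power of two (e.g. 64 or 4096).
func alignUp(addr, align uint64) uint64 {
	return (addr + align - 1) &^ (align - 1)
}

func main() {
	// With the default 64-byte granularity, address 65 moves to 128.
	fmt.Println(alignUp(65, 64))
	// A larger requested alignment, as in [align[4096]], needs more slack.
	fmt.Println(alignUp(65, 4096))
}
```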
* .golangci.yml: enable whitespace checker (Dmitry Vyukov, 2020-06-05, 1 file, -2/+0)
| Points to bad empty lines very precisely.
* prog: introduce Field type (Dmitry Vyukov, 2020-05-02, 1 file, -17/+19)
| Remove FieldName from Type and add a separate Field type that holds field name. Use Field for struct fields, union options and syscalls arguments, only these really have names.
|
| Reduces size of sys/linux/gen/amd64.go from 5665583 to 5201321 (-8.2%).
|
| Allows not creating a new type for squashed any pointer. But main advantages will follow, e.g. removing StructDesc, using TypeRef in Arg, etc.
|
| Update #1580
* prog: rename {PtrType,ArrayType}.Type to Elem (Dmitry Vyukov, 2020-05-01, 1 file, -10/+9)
| Name "Type" is confusing when referring to pointer/array element type. Frequently there are too many Type/typ/typ1/t and typ.Type is not very informative. It _is_ a type, but what's usually more relevant is that it's an _element_ type. Let's leave type checking to compiler and give it a more meaningful name.
* prog: remove Dir from Type (Dmitry Vyukov, 2020-05-01, 1 file, -58/+60)
| Having Dir in Type is handy, but forces us to duplicate lots of types. E.g. if a struct is referenced as both in and out, then we need to have 2 copies and 2 copies of structs/types it includes. It also prevents us from having the struct type as struct identity (because we can have up to 3 of them).
|
| Revert to the old way we used to do it: propagate Dir as we walk syscall arguments. This moves lots of dir passing from pkg/compiler to prog package. Now Arg contains the dir, so once we build the tree, we can use dirs as before.
|
| Reduces size of sys/linux/gen/amd64.go from 6058336 to 5661150 (-6.6%).
|
| Update #1580
* prog: make program parsing more permissive (Dmitry Vyukov, 2020-04-28, 1 file, -1/+2)
| Don't error on wrong vma with value in non strict mode.
| Add more tests and fix use of cmp package (prog.Syscall is not comparable anymore).
* prog: rename target.SanitizeCall to Neutralize (Dmitry Vyukov, 2020-03-17, 1 file, -2/+2)
| We will need a wrapper for target.SanitizeCall that will do more than just calling the target-provided function. To avoid confusion and potential mistakes, give the target function and prog function different names. Prog package will continue to call this "sanitize", which will include target's "neutralize" + more.
|
| Also refactor API a bit: we need a helper function that sanitizes the whole program because that's needed most of the time.
|
| Fixes #477
| Fixes #502
* prog: control program length (Dmitry Vyukov, 2020-03-13, 1 file, -6/+8)
| We have _some_ limits on program length, but they are really soft. When we ask to generate a program with 10 calls, sometimes we get 100-150 calls. There are also no checks when we accept external programs from corpus/hub. Issue #1630 contains an example where this crashes VM (executor limit on number of 1000 resources is violated). Larger programs also harm the process overall (slower, consume more memory, lead to monster reproducers, etc).
|
| Add a set of measures for hard control over program length. Ensure that generated/mutated programs are not too long; drop too long programs coming from corpus/hub in manager; drop too long programs in hub. As a bonus, ensure that mutations don't produce programs with 0 calls (which is currently possible and happens).
|
| Fixes #1630
* prog: dump orig prog if Deserialize panics (Dmitry Vyukov, 2020-02-21, 1 file, -0/+6)
| We are seeing some one-off panics during Deserialization and it's unclear if it's machine memory corruption or an actual bug in prog. I lean towards machine memory corruption but it's impossible to prove without seeing the orig program.
|
| Move git revision to prog, since it is the more basic package (sys can import prog, prog can't import sys).
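One common way to attach the original input to a panic is a deferred `recover` that re-panics with extra context. The sketch below shows that general pattern only; `withProgDump` and `capturePanic` are hypothetical helpers, and the actual commit may format or report the program differently.

```go
package main

import "fmt"

// withProgDump runs parse on data and, if parse panics, re-panics with the
// original program text appended to the message.
func withProgDump(data []byte, parse func([]byte)) {
	defer func() {
		if err := recover(); err != nil {
			panic(fmt.Sprintf("%v\noriginal program:\n%s", err, data))
		}
	}()
	parse(data)
}

// capturePanic returns the panic message produced by f, or "" if f
// returns normally. It exists only to make the example observable.
func capturePanic(f func()) (msg string) {
	defer func() {
		if err := recover(); err != nil {
			msg = fmt.Sprint(err)
		}
	}()
	f()
	return ""
}

func main() {
	msg := capturePanic(func() {
		withProgDump([]byte("r0 = open()"), func([]byte) { panic("boom") })
	})
	fmt.Println(msg)
}
```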
* pkg/compiler: fix another bitfield layout bug (Dmitry Vyukov, 2020-01-07, 1 file, -1/+1)
| See the added test for details.
* prog: fix tests for string enforcement (Dmitry Vyukov, 2020-01-05, 1 file, -1/+3)
| String value enforcement broke a number of tests where we use different values. Be more strict as to what string values we use in tests.
| Required to add tmpfs descriptions to test syz_mount_image.
| Also special-casing AF_ALG algorithms as these are auto-generated.
* prog: don't mutate strings with enumerated values (Dmitry Vyukov, 2020-01-05, 1 file, -3/+17)
| Strings with enumerated values are frequently file names or have complete enumeration of relevant values. Mutating complete enumeration is not very profitable. Mutating file names leads to escaping paths and fuzzer messing with things it is not supposed to mess with as in:
|
|     r0 = openat$apparmor_task_exec(0xffffffffffffff9c, &(0x7f0000000440)='/proc/self//exe\x00', 0x3, 0x0)
* prog: fix a typo in a comment (Dmitry Vyukov, 2019-12-30, 1 file, -1/+1)
|
* prog: don't fail decoding on non-default out args (Dmitry Vyukov, 2019-12-21, 1 file, -1/+8)
| We get them in cross-compilation test where an out const arg has different values in different archs. No reason to fail deserialization in that case, replace with default arg instead.
* tools: add syz-expand (Andrey Konovalov, 2019-09-23, 1 file, -10/+20)
| The syz-expand tool parses a program and prints it including all the default values. This is mainly useful for debugging, like doing manual program modifications while trying to come up with a reproducer for some particular kernel behavior.
* prog: add implementation for resource centric (Veronica Radu, 2019-09-03, 1 file, -1/+1)
|