path: root/prog/encoding_test.go
Commit message | Author | Age | Files | Lines
* prog: take multiple serialization flagsAleksandr Nogikh2025-11-031-0/+13
| Refactor Prog.Serialize() to accept a variadic list of flags.
|
| For now, two are supported:
| 1) Verbose (equal to SerializeVerbose()).
| 2) SkipImages (don't serialize fs images).
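The variadic-flag pattern described above can be sketched as follows. This is a minimal illustration, not the actual prog API: the `SerializeFlag` type, the toy `arg` struct, the `serialize` function and the `<image skipped>` marker are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// SerializeFlag is a hypothetical option type for this sketch.
type SerializeFlag int

const (
	Verbose    SerializeFlag = iota // also print arguments that hold default values
	SkipImages                      // omit filesystem image payloads
)

// arg is a toy argument; the real prog argument types are far richer.
type arg struct {
	text      string
	isDefault bool
	isImage   bool
}

// serialize renders one call, honoring whichever flags were passed.
func serialize(name string, args []arg, flags ...SerializeFlag) string {
	var verbose, skipImages bool
	for _, f := range flags {
		switch f {
		case Verbose:
			verbose = true
		case SkipImages:
			skipImages = true
		}
	}
	var parts []string
	for _, a := range args {
		switch {
		case a.isImage && skipImages:
			// Made-up placeholder, used only in this sketch.
			parts = append(parts, "<image skipped>")
		case a.isDefault && !verbose:
			// Default values are left out of the compact form.
		default:
			parts = append(parts, a.text)
		}
	}
	return name + "(" + strings.Join(parts, ", ") + ")"
}

func main() {
	args := []arg{{text: "0x0", isDefault: true}, {text: "$aW1hZ2U=", isImage: true}}
	fmt.Println(serialize("syz_mount_image", args))
	fmt.Println(serialize("syz_mount_image", args, Verbose, SkipImages))
}
```

Callers that want the old behavior pass no flags at all, which is the main convenience of a variadic parameter over a growing list of boolean arguments.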
* prog: replace MinimizeParams with MinimizeModeDmitry Vyukov2024-08-071-1/+1
| | | | | | | | | | | | | | All callers shouldn't control lots of internal details of minimization (if we have more params, that's just more variations to test, and we don't have more, params is just a more convoluted way to say if we minimize for corpus or a crash). 2 bools also allow to express 4 options, but only 3 make sense. Also when I see MinimizeParams{} in the code, it's unclear what it means. Replace params with mode. And potentially "crash" minimization is not "light", it's just different. E.g. we can simplify int arguments for reproducers (esp in snapshot mode), but we don't need that for corpus.
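A rough sketch of why a single mode value is easier to reason about than two booleans. The mode names and the per-mode step lists below are illustrative assumptions; the commit message only states that the modes distinguish corpus minimization from crash minimization.

```go
package main

import "fmt"

// MinimizeMode values are illustrative names for this sketch.
type MinimizeMode int

const (
	MinimizeCorpus MinimizeMode = iota
	MinimizeCrash
	MinimizeCallsOnly
)

// steps lists the passes a hypothetical minimizer would run for each mode.
// Every valid mode is spelled out, so no meaningless combination can be expressed.
func steps(mode MinimizeMode) []string {
	switch mode {
	case MinimizeCorpus:
		return []string{"remove calls", "simplify args"}
	case MinimizeCrash:
		return []string{"remove calls", "simplify args", "simplify ints"}
	case MinimizeCallsOnly:
		return []string{"remove calls"}
	}
	panic(fmt.Sprintf("unknown minimize mode %d", mode))
}

func main() {
	for _, m := range []MinimizeMode{MinimizeCorpus, MinimizeCrash, MinimizeCallsOnly} {
		fmt.Println(m, steps(m))
	}
}
```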
* prog: make minimization parameters explicitAleksandr Nogikh2024-05-271-1/+1
| | | | Add an explicit parameter to only run call removal.
* prog: fix validation of DataMmapProgDmitry Vyukov2024-05-061-2/+16
| | | | | | | Allow to serialize/deserialize DataMmapProg and fix validation in debug mode. Fixes #4750
* tools/syz-linter: check t.Logf/Errorf/Fatalf messagesDmitry Vyukov2024-04-171-2/+2
| | | | | Fix checking of Logf, it has string in 0-th arg. Add checking of t.Errorf/Fatalf.
* prog: don't require preallocated buffer for exec encodingDmitry Vyukov2024-04-161-15/+12
| | | | | | If we send exec encoding to the fuzzer, it's not necessary to serialize exec encoding into existing buffer (currently we serialize directly into shmem). So simplify code by serializing into a new slice.
* prog: profile what consumes space in exec encodingDmitry Vyukov2024-04-151-2/+2
| | | | | | | | Allow to profile how many bytes are consumed for what in the exec encoding. The profile shows there are not many opportunities left. 53% are consumed by data blobs. 13% for const args. 18% for non-arg things (syscall number, copyout index, props, etc).
* prog: fix selection of args eligible for squashingDmitry Vyukov2024-04-151-0/+5
| This fixes 3 issues:
|
| 1. We intended to squash only 'in' pointer elems, but we looked at the pointer direction rather than elem direction. Since pointers themselves are always 'in' we squashed a number of types we didn't want to squash.
|
| 2. We can squash filenames, which can lead to generation of escaping filenames, e.g. fuzzer managed to create "/" filename for blockdev_filename as:
|     mount(&(0x7f0000000000)=ANY=[@ANYBLOB='/'], ...)
|    Don't squash filenames.
|
| 3. We analyzed a concrete arg to see if it contains something we don't want to squash (e.g. pointers). But the whole type can still contain unsupported things in inactive union options, or in 0-sized arrays. E.g. this happened in the mount case above. Analyze the whole type to check for unsupported things.
|
| This also moves most of the analysis to the compiler, so mutation will be a bit faster.
|
| This removes the following linux types from squashing.
|
| 1. These are not 'in':
|     btrfs_ioctl_search_args_v2
|     btrfs_ioctl_space_args
|     ethtool_cmd_u
|     fscrypt_add_key_arg
|     fscrypt_get_policy_ex_arg
|     fsverity_digest
|     hiddev_ioctl_string_arg
|     hidraw_report_descriptor
|     ifreq_dev_t[devnames, ptr[inout, ethtool_cmd_u]]
|     ifreq_dev_t[ipv4_tunnel_names, ptr[inout, ip_tunnel_parm]]
|     ifreq_dev_t["sit0", ptr[inout, ip_tunnel_prl]]
|     io_uring_probe
|     ip_tunnel_parm
|     ip_tunnel_prl
|     poll_cq_resp
|     query_port_cmd
|     query_qp_resp
|     resize_cq_resp
|     scsi_ioctl_probe_host_out_buffer
|     sctp_assoc_ids
|     sctp_authchunks
|     sctp_getaddrs
|     sctp_getaddrs_old
|
| 2. These contain pointers:
|     binder_objects
|     iovec[in, netlink_msg_route_sched]
|     iovec[in, netlink_msg_route_sched_retired]
|     msghdr_netlink[netlink_msg_route_sched]
|     msghdr_netlink[netlink_msg_route_sched_retired]
|     nvme_of_msg
|
| 3. These contain filenames:
|     binfmt_script
|     blockdev_filename
|     netlink_msg_route_sched
|     netlink_msg_route_sched_retired
|     selinux_create_req
* prog: make invalid union field error more explicitAleksandr Nogikh2024-02-191-2/+2
| | | | Include the name of the union and list the correct options.
* compiler: support const as int first argumentPaul Chaignon2023-11-281-2/+2
| This commit adds support for the following syntax:
|
|     int8[constant]
|
| as an equivalent to:
|
|     const[constant, int8]
|
| The goal is to have a unified const/flags definition that we can use in templates. For example:
|
|     type template[CLASS, ...] {
|         class int8:3[CLASS]
|         // ...
|     }
|
|     type singleClassType template[SINGLE_CONST]
|     type subClassType template[abc_class_flags]
|
| In this example, the CLASS template field can be either a constant or a flag. This is especially useful when defining both a generic instance of the template as well as specialized instances (ex. bpf_alu_ops and bpf_add_op).
|
| Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
* compiler: support flags as int first argumentPaul Chaignon2023-11-281-0/+4
| This commit adds support for the following syntax:
|
|     int_flags = 1, 5, 8, 9
|     int32[int_flags]
|
| which is equivalent to:
|
|     int_flags = 1, 5, 8, 9
|     flags[int_flags, int32]
|
| The second int type argument, align, is not allowed if the first argument is a flag.
|
| The compiler will also error if the first argument appears to be a flag (is ident and has no colon), but can't be found in the map of flags.
|
| Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
* prog, sys: test cases for struct AUTOPaul Chaignon2023-11-131-0/+12
| This commit adds a few test cases for the support of AUTO for structs. It covers:
| - A simple struct with only const and len types.
| - A nested struct case.
| - An error case when a struct has an int type field.
|
| Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
* prog: add helper function parser.HasNextPaul Chaignon2023-11-131-0/+23
| | | | | | | This helper function will be used in a subsequent commit to take a look ahead at several characters. Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
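A look-ahead helper of this kind might look roughly like the sketch below. The `parser` struct and its fields are simplified stand-ins invented for the example; only the name `HasNext` comes from the commit title.

```go
package main

import (
	"fmt"
	"strings"
)

// parser is a toy stand-in for the deserialization parser.
type parser struct {
	s string // line being parsed
	i int    // offset of the next unread byte
}

// HasNext reports whether the unread input starts with str.
// It does not consume anything, so the caller can decide what to parse next.
func (p *parser) HasNext(str string) bool {
	return strings.HasPrefix(p.s[p.i:], str)
}

func main() {
	p := &parser{s: "foo(AUTO)", i: 4}
	fmt.Println(p.HasNext("AUTO"), p.HasNext("AUTO]"))
}
```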
* prog: reject escaping filenames during deserializationDmitry Vyukov2023-02-161-0/+7
| | | | | | | | We already try as hard as possible to not generate escaping (global) filenames. However, it's possible we read them from the corpus if it happens to contain some. Also check for escaping filenames during deserialization. Fixes #3678
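A check for escaping filenames could be approximated as below. This is a sketch under the assumption that "escaping" means either an absolute path or a path that climbs out of the working directory after normalization; the function name and exact rules are illustrative and may differ from what prog implements.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// escapingFilename reports whether file would resolve outside the
// current working directory once "." and ".." elements are normalized.
func escapingFilename(file string) bool {
	clean := path.Clean(file)
	return path.IsAbs(clean) || clean == ".." || strings.HasPrefix(clean, "../")
}

func main() {
	for _, f := range []string{"./file0", "/", "../secret", "a/../../b"} {
		fmt.Printf("%q escaping=%v\n", f, escapingFilename(f))
	}
}
```

A deserializer using such a check would reject the program (or substitute a safe default name) instead of passing the path on to the executor.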
* prog: handle broken base64 stringsAleksandr Nogikh2022-11-221-0/+4
| | | | | | Currently it can fail if there's never a closing quote. Add a test to verify this behavior.
* prog: introduce new Base64 syntax for dataHrutvik Kanabar2022-11-211-1/+8
| The new "$..." syntax is read as Base64-encoded binary data. Note that users cannot specify the size of the Base64 syntax using the `"..."/<size>` notation.
|
| When serialising programs to human-readable form, only compressed types (determined by `IsCompressed()`) are represented using the new Base64 notation.
|
| Also add a couple of serialisation tests, checking behaviour for compressed and non-compressed types.
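Decoding a `"$..."` token can be sketched as follows. The function name and error messages are assumptions made for illustration; the sketch also covers the case of a missing closing quote, which the follow-up commit listed above tests.

```go
package main

import (
	"encoding/base64"
	"errors"
	"fmt"
	"strings"
)

// parseBase64Arg decodes a token of the form "$<base64>" (quotes included).
func parseBase64Arg(tok string) ([]byte, error) {
	if !strings.HasPrefix(tok, `"$`) {
		return nil, errors.New(`data does not start with "$`)
	}
	body := tok[2:]
	end := strings.IndexByte(body, '"')
	if end < 0 {
		// A broken string must produce an error, not a crash or an endless scan.
		return nil, errors.New("missing closing quote")
	}
	return base64.StdEncoding.DecodeString(body[:end])
}

func main() {
	data, err := parseBase64Arg(`"$aGVsbG8="`)
	fmt.Printf("%q %v\n", data, err)
	_, err = parseBase64Arg(`"$aGVsbG8=`)
	fmt.Println(err)
}
```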
* all: add the `rerun` call propertyAleksandr Nogikh2021-12-101-2/+6
| To be able to collide specific syscalls more precisely, we need to repeat the process many times.
|
| Introduce the `rerun` call property, which instructs `syz-executor` to repeat the call the specified number of times. The intended use is:
|
|     call1() (rerun: 100, async)
|     call2() (rerun: 100)
|
| For now, assign rerun values randomly to consecutive pairs of calls, where the first one is async.
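Parsing the trailing property list from the intended-use example might look like the sketch below. The `callProps` struct and `parseProps` function are hypothetical names for this illustration; only the `rerun` and `async` property names and the parenthesized syntax come from the commit message.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// callProps holds the two properties named in the commit messages.
type callProps struct {
	Async bool
	Rerun int
}

// parseProps parses a property list such as "(rerun: 100, async)".
func parseProps(s string) (callProps, error) {
	var props callProps
	s = strings.TrimSpace(s)
	if !strings.HasPrefix(s, "(") || !strings.HasSuffix(s, ")") {
		return props, fmt.Errorf("property list %q is not parenthesized", s)
	}
	for _, item := range strings.Split(s[1:len(s)-1], ",") {
		name, value, _ := strings.Cut(strings.TrimSpace(item), ":")
		switch name {
		case "async":
			props.Async = true
		case "rerun":
			n, err := strconv.Atoi(strings.TrimSpace(value))
			if err != nil {
				return props, fmt.Errorf("bad rerun value %q: %w", value, err)
			}
			props.Rerun = n
		default:
			return props, fmt.Errorf("unknown call property %q", name)
		}
	}
	return props, nil
}

func main() {
	props, err := parseProps("(rerun: 100, async)")
	fmt.Printf("%+v %v\n", props, err)
}
```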
* all: replace collide mode by `async` call propertyAleksandr Nogikh2021-12-101-1/+5
| | | | | | | | | | | | | Replace the currently existing straightforward approach to race triggering (that was almost entirely implemented inside syz-executor) with a more flexible one. The `async` call property instructs syz-executor not to block until the call has completed execution and proceed immediately to the next call. The decision on what calls to mark with `async` is made by syz-fuzzer. Ultimately this should let us implement more intelligent race provoking strategies as well as make more fine-grained reproducers.
* all: refactor fault injection into call propsAleksandr Nogikh2021-09-221-2/+6
| | | | | | | | | | Now that the call properties mechanism is implemented, we can refactor fault injection. Unfortunately, it is impossible to remove all traces of the previous approach. In reprolist and while performing syz-ci jobs, syzkaller still needs to parse the old format. Remove the old prog options-based approach whenever possible and replace it with the use of call properties.
* all: introduce call propertiesAleksandr Nogikh2021-09-221-0/+61
| | | | | | | Call properties let us specify how each individual call within a program must be executed. So far the only way to enforce extra rules was to pass extra program-level properties (e.g. that is how fault injection was done). However, it entangles the logic and is not flexible enough. Implement an ability to pass properties along with each individual call.
* pkg/compiler: optimize array[const] representationDmitry Vyukov2021-04-211-2/+3
| Represent array[const[X, int8], N] as string["XX...X"].
| This replaces potentially huge number of:
|     NONFAILING(*(uint8_t*)0x2000126c = 0);
|     NONFAILING(*(uint8_t*)0x2000126d = 0);
|     NONFAILING(*(uint8_t*)0x2000126e = 0);
| with a single memcpy. In one reproducer we had 3991 such lines.
|
| Also replace memcpy's with memset's when possible.
|
| Update #1070
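The effect on generated C code can be illustrated with a small generator sketch. The `emitFill` function is hypothetical and is not the pkg/csource implementation; it only shows the decision between one `memset` for uniform bytes and one `memcpy` otherwise.

```go
package main

import "fmt"

// emitFill returns a single C statement that writes data at addr,
// instead of one store per byte.
func emitFill(addr uint64, data []byte) string {
	if len(data) == 0 {
		return ""
	}
	uniform := true
	for _, b := range data[1:] {
		if b != data[0] {
			uniform = false
			break
		}
	}
	if uniform {
		return fmt.Sprintf("memset((void*)0x%x, %d, %d);", addr, data[0], len(data))
	}
	// Go's %q quoting stands in for proper C string escaping in this sketch.
	return fmt.Sprintf("memcpy((void*)0x%x, %q, %d);", addr, data, len(data))
}

func main() {
	fmt.Println(emitFill(0x2000126c, []byte{0, 0, 0}))
	fmt.Println(emitFill(0x2000126c, []byte("ab")))
}
```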
* prog: support disabled attributeDmitry Vyukov2020-05-041-2/+4
| | | | | Update #477 Update #502
* prog: make program parsing more permissiveDmitry Vyukov2020-04-281-4/+22
| | | | | Don't error on wrong vma with value in non strict mode. Add more tests and fix use of cmp package (prog.Syscall is not comparable anymore).
* prog: improve TestDeserializeHelperDmitry Vyukov2020-03-241-17/+16
| | | | | 1. Allow to not provide Out if it's the same as In. 2. Always check Out.
* prog: export deserialization test helper for sys/{linux,openbsd}Dmitry Vyukov2020-03-171-143/+89
| | | | | sys/{linux,openbsd} duplicate deserialization test logic as well. Export and reuse the existing helper function.
* prog: factor out common code in testsDmitry Vyukov2020-03-171-62/+55
| | | | Factor out a common test helper for tests that deserialize and check programs.
* pkg/compiler: ensure consistency of syscall argument typesDmitry Vyukov2020-03-171-18/+18
| Ensure that we don't have conflicting sizes for the same argument of the same syscall, e.g.:
|
|     foo$1(a int16)
|     foo$2(a int32)
|
| This is useful for several reasons:
| - we will be able to avoid morphing syscalls into other syscalls
| - we will be able to figure out more precise sizes for args (lots of them are implicitly intptr, which is the largest type on most important arches)
| - found a few bugs in linux descriptions
|
| Update #477
| Update #502
* prog: control program lengthDmitry Vyukov2020-03-131-6/+19
| | | | | | | | | | | | | | | | | We have _some_ limits on program length, but they are really soft. When we ask to generate a program with 10 calls, sometimes we get 100-150 calls. There are also no checks when we accept external programs from corpus/hub. Issue #1630 contains an example where this crashes the VM (the executor limit of 1000 resources is violated). Larger programs also harm the process overall (slower, consume more memory, lead to monster reproducers, etc). Add a set of measures for hard control over program length. Ensure that generated/mutated programs are not too long; drop too long programs coming from corpus/hub in manager; drop too long programs in hub. As a bonus ensure that mutation doesn't produce programs with 0 calls (which is currently possible and happens). Fixes #1630
* prog: fix tests for string enforcementDmitry Vyukov2020-01-051-1/+1
| | | | | | String value enforcement broke a number of tests where we use different values. Be more strict as to what string values we use in tests. Required to add tmpfs descriptions to test syz_mount_image. Also special-casing AF_ALG algorithms as these are auto-generated.
* prog: don't mutate strings with enumerated valuesDmitry Vyukov2020-01-051-26/+43
| Strings with enumerated values are frequently file names or have complete enumeration of relevant values. Mutating a complete enumeration is not very profitable. Mutating file names leads to escaping paths and the fuzzer messing with things it is not supposed to mess with, as in:
|
|     r0 = openat$apparmor_task_exec(0xffffffffffffff9c, &(0x7f0000000440)='/proc/self//exe\x00', 0x3, 0x0)
* prog: don't fail decoding on non-default out argsDmitry Vyukov2019-12-211-0/+5
| | | | | | | We get them in cross-compilation test where an out const arg has different values in different archs. No reason to fail deserialization in that case, replace with default arg instead.
* prog, pkg/csource: more readable serialization for stringsDmitry Vyukov2018-12-151-17/+31
| | | | | | | Always serialize strings in readable format (non-hex). Serialize binary data in readable format in more cases. Fixes #792
* prog: support AUTO args in programsDmitry Vyukov2018-12-101-0/+8
| AUTO arguments can be used for:
| - consts
| - lens
| - pointers
|
| For const's and len's AUTO is replaced with the natural value, addresses for AUTO pointers are allocated linearly.
|
| This greatly simplifies writing test programs by hand as most of the time we want these natural values.
|
| Update tests to use AUTO.
* prog: implement strict parsing modeDmitry Vyukov2018-12-101-27/+36
| | | | | Add bulk of checks for strict parsing mode. Probably not complete, but we can extend them in future as needed. Turns out we can't easily use it for serialized programs as they omit default args and during deserialization it looks like missing args.
* prog: introduce strict parsing modeDmitry Vyukov2018-12-101-37/+60
| | | | | | | | | Over time we relaxed parsing to handle all kinds of invalid programs (excessive/missing args, wrong types, etc). This is useful when reading old programs from corpus. But this is harmful for e.g. reading test inputs as they can become arbitrarily outdated. For runtests this creates the additional problem of executing something other than what is actually written in the test (or at least what the author meant). Add strict parsing mode that does not tolerate any errors. For now it just checks excessive syscall arguments.
* prog: refactor deserialization codeDmitry Vyukov2018-12-101-2/+2
| | | | | | | Move target and vars into parser and make all parsing functions methods of the parser. This reduces number of args that we need to pass around and eases adding more state that needs to be passed around.
* prog: add concept of "special pointers"Dmitry Vyukov2018-08-301-2/+34
| Currently we only generate either valid user-space pointers or NULL. Extend NULL to a set of special pointers that we will use in programs. All targets now contain 3 special values:
| - NULL
| - 0xfffffffffffffff (invalid kernel pointer)
| - 0x999999999999999 (non-canonical address)
| Each target can add additional special pointers on top of this.
|
| Also generate NULL/special pointers for non-opt ptr's. This restriction was always too restrictive. We may want to generate them with very low probability, but we do want to generate them.
|
| Also change pointers to NULL/special during mutation (but still not in the opposite direction).
* prog: collect all prog commentsDmitry Vyukov2018-08-081-0/+9
| | | | Parse and collect all prog comments. Will be needed for runtest annotations (e.g. "requires threaded mode", etc).
* prog: parse comments in serialized programsDmitry Vyukov2018-07-271-0/+32
| Remember per-call comments, will be useful for annotating tests.
| Also support this form:
|
|     call() # comment
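Splitting a trailing comment off a program line can be sketched as below. The helper name is an assumption, and the sketch deliberately ignores the possibility of a `#` inside a quoted string argument, which a real parser would have to handle.

```go
package main

import (
	"fmt"
	"strings"
)

// splitComment separates a program line into the call text and
// the text of a trailing "# comment", if any.
func splitComment(line string) (call, comment string) {
	call, comment, _ = strings.Cut(line, "#")
	return strings.TrimSpace(call), strings.TrimSpace(comment)
}

func main() {
	call, comment := splitComment("call() # comment")
	fmt.Printf("call=%q comment=%q\n", call, comment)
}
```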
* executor: overhaulDmitry Vyukov2018-07-241-34/+34
| | | | | | | | | | | | | | | | | Make as much code as possible shared between all OSes. In particular main is now common across all OSes. Make more code shared between executor and csource (in particular, loop function and threaded execution logic). Also make loop and threaded logic shared across all OSes. Make more posix/unix code shared across OSes (e.g. signal handling, pthread creation, etc). Plus other changes along similar lines. Also support test OS in executor (based on portable posix) and add 4 arches that cover all execution modes (fork server/no fork server, shmem/no shmem). This change paves way for testing of executor code and allows to preserve consistency across OSes and executor/csource.
* prog: parallelize testsDmitry Vyukov2018-05-041-0/+1
| | | | | Parallelize more tests and reduce number of iterations in random tests under race detector.
* prog: fix isDefaultArgDmitry Vyukov2018-03-081-2/+2
| | | | | Test that isDefaultArg returns true for result of DefaultArg. Fix few bugs uncovered by this test.
* prog: harden program parsing against description changes moreDmitry Vyukov2018-03-051-12/+55
| | | | | | | Handle most of type changes, e.g. const is changed to struct, or struct to pointers. In all these cases we create default args. They may not give the coverage anymore, but still better than losing them right away.
* prog: recover after type changes during program deserializationDmitry Vyukov2018-03-051-24/+32
| | | | | Make program deserialization handle and recover after type changes in descriptions.
* prog: handle excessive args and fields during program parsingDmitry Vyukov2018-03-051-0/+20
| | | | | Tolerate excessive args and fields during program parsing. This is useful after description changes to not lose corpus.
* prog: reorder Minimize argumentsDmitry Vyukov2018-02-191-2/+2
| | | | | Make the predicate the last argument. It's more common and convenient (arguments are not separated by multiple lines).
* prog: don't serialize default argumentsDmitry Vyukov2018-02-011-0/+43
| | | | | | | This reduces size of a corpus in half. We store corpus on manager and on hub, so this will reduce their memory consumption. But also makes large programs more readable.
* pkg/csource: fix handling of proc typesDmitry Vyukov2017-12-221-1/+1
| | | | | | | | | | Generated program always uses pid=0 even when there are multiple processes. Make each process use own pid. Unfortunately required to do quite significant changes to prog, because the current format only supported fixed pid. Fixes #490
* prog: don't serialize output data argsDmitry Vyukov2017-12-171-2/+6
| | | | | | | | Fixes #188 We now will write just ""/1000 to denote a 1000-byte output buffer. Also we now don't store 1000-byte buffer in memory just to denote size. Old format is still parsed.
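Reading the `""/1000` notation could look roughly like this sketch. The function name, return values and error texts are assumptions for illustration; only the `"<data>"/<size>` shape comes from the commit message.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// parseDataArg accepts "<hex>" or "<hex>"/<size> and returns the hex
// payload together with the buffer size in bytes.
func parseDataArg(tok string) (payload string, size int, err error) {
	if !strings.HasPrefix(tok, `"`) {
		return "", 0, errors.New("data arg must start with a quote")
	}
	end := strings.IndexByte(tok[1:], '"')
	if end < 0 {
		return "", 0, errors.New("missing closing quote")
	}
	payload = tok[1 : 1+end]
	rest := tok[end+2:]
	if rest == "" {
		// Old format: the size is implied by the payload (two hex digits per byte).
		return payload, len(payload) / 2, nil
	}
	if !strings.HasPrefix(rest, "/") {
		return "", 0, fmt.Errorf("unexpected text %q after data", rest)
	}
	size, err = strconv.Atoi(rest[1:])
	if err != nil {
		return "", 0, fmt.Errorf("bad size: %w", err)
	}
	return payload, size, nil
}

func main() {
	payload, size, err := parseDataArg(`""/1000`)
	fmt.Printf("payload=%q size=%d err=%v\n", payload, size, err)
}
```

With the explicit size suffix, an output buffer costs a few characters in the serialized program regardless of how large the buffer is.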
* prog: introduce more readable format for data argsDmitry Vyukov2017-12-171-0/+47
| Fixes #460
|
| File names, crypto algorithm names, etc in programs are completely unreadable:
|
|     bind$alg(r0, &(0x7f0000408000)={0x26, "6861736800000000000000000000", 0x0, 0x0,
|     "6d6435000000000000000000000000000000000000000000000000
|     000000000000000000000000000000000000000000000000000000000000000
|     00000000000"}, 0x58)
|
| Introduce another format for printable strings. New args are denoted by '' ("" for old args). New format is enabled for printable chars, \x00 and \t, \r, \n.
|
| Example:
|     `serialize(&(0x7f0000408000)={"6861736800000000000000000000", "4849000000"})`,
| vs:
|     `serialize(&(0x7f0000408000)={'hash\x00', 'HI\x00'})`,
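The choice between the two formats can be sketched as follows. The function names are hypothetical, and the sketch does not collapse trailing zero bytes the way the `'hash\x00'` example above suggests the real serializer does; it only shows the rule that readable data (printable characters plus `\x00`, `\t`, `\r`, `\n`) gets single quotes while everything else stays hex in double quotes.

```go
package main

import (
	"encoding/hex"
	"fmt"
	"strings"
)

// readable reports whether every byte is printable ASCII or one of
// \x00, \t, \r, \n.
func readable(data []byte) bool {
	for _, b := range data {
		if b >= 0x20 && b < 0x7f {
			continue
		}
		switch b {
		case 0, '\t', '\r', '\n':
			continue
		}
		return false
	}
	return true
}

// serializeData picks the single-quoted readable form when possible
// and falls back to double-quoted hex otherwise.
func serializeData(data []byte) string {
	if !readable(data) {
		return `"` + hex.EncodeToString(data) + `"`
	}
	var sb strings.Builder
	sb.WriteByte('\'')
	for _, b := range data {
		switch b {
		case 0:
			sb.WriteString(`\x00`)
		case '\t':
			sb.WriteString(`\t`)
		case '\r':
			sb.WriteString(`\r`)
		case '\n':
			sb.WriteString(`\n`)
		case '\'', '\\':
			// Escape the delimiter and the escape character itself.
			sb.WriteByte('\\')
			sb.WriteByte(b)
		default:
			sb.WriteByte(b)
		}
	}
	sb.WriteByte('\'')
	return sb.String()
}

func main() {
	fmt.Println(serializeData([]byte("hash\x00")))
	fmt.Println(serializeData([]byte{0x26, 0xff}))
}
```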