| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
| |
Refactor Prog.Serialize() to accept a variadic list of flags.
For now, two are supported:
1) Verbose (equal to SerializeVerbose()).
2) SkipImages (don't serialize fs images).
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
All callers shouldn't control lots of internal details of minimization
(if we have more params, that's just more variations to test,
and we don't have more, params is just a more convoluted way to say
if we minimize for corpus or a crash).
2 bools also allow to express 4 options, but only 3 make sense.
Also when I see MinimizeParams{} in the code, it's unclear what it means.
Replace params with mode.
And potentially "crash" minimization is not "light", it's just different.
E.g. we can simplify int arguments for reproducers (esp in snapshot mode),
but we don't need that for corpus.
|
| |
|
|
| |
Add an explicit parameter to only run call removal.
|
| |
|
|
|
|
|
| |
Allow to serialize/deserialize DataMmapProg
and fix validation in debug mode.
Fixes #4750
|
| |
|
|
|
| |
Fix checking of Logf, it has string in 0-th arg.
Add checking of t.Errorf/Fatalf.
|
| |
|
|
|
|
| |
If we send exec encoding to the fuzzer, it's not necessary to serialize
exec encoding into existing buffer (currnetly we serialize directly into shmem).
So simplify code by serializing into a new slice.
|
| |
|
|
|
|
|
|
| |
Allow to profile how many bytes are consumed for what in the exec encoding.
The profile shows there are not many opportunities left.
53% are consumed by data blobs.
13% for const args.
18% for non-arg things (syscall number, copyout index, props, etc).
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes 3 issues:
1. We intended to squash only 'in' pointer elems,
but we looked at the pointer direction rather than elem direction.
Since pointers themselves are always 'in' we squashed a number of
types we didn't want to squash.
2. We can squash filenames, which can lead to generation of escaping filenames,
e.g. fuzzer managed to create "/" filename for blockdev_filename as:
mount(&(0x7f0000000000)=ANY=[@ANYBLOB='/'], ...)
Don't squash filenames.
3. We analyzed a concrete arg to see if it contains something
we don't want to squash (e.g. pointers). But the whole type
can still contain unsupported things in inactive union options,
or in 0-sized arrays. E.g. this happened in the mount case above.
Analyze the whole type to check for unsupported things.
This also moves most of the analysis to the compiler,
so mutation will be a bit faster.
This removes the following linux types from squashing.
1. These are not 'in':
btrfs_ioctl_search_args_v2
btrfs_ioctl_space_args
ethtool_cmd_u
fscrypt_add_key_arg
fscrypt_get_policy_ex_arg
fsverity_digest
hiddev_ioctl_string_arg
hidraw_report_descriptor
ifreq_dev_t[devnames, ptr[inout, ethtool_cmd_u]]
ifreq_dev_t[ipv4_tunnel_names, ptr[inout, ip_tunnel_parm]]
ifreq_dev_t["sit0", ptr[inout, ip_tunnel_prl]]
io_uring_probe
ip_tunnel_parm
ip_tunnel_prl
poll_cq_resp
query_port_cmd
query_qp_resp
resize_cq_resp
scsi_ioctl_probe_host_out_buffer
sctp_assoc_ids
sctp_authchunks
sctp_getaddrs
sctp_getaddrs_old
2. These contain pointers:
binder_objects
iovec[in, netlink_msg_route_sched]
iovec[in, netlink_msg_route_sched_retired]
msghdr_netlink[netlink_msg_route_sched]
msghdr_netlink[netlink_msg_route_sched_retired]
nvme_of_msg
3. These contain filenames:
binfmt_script
blockdev_filename
netlink_msg_route_sched
netlink_msg_route_sched_retired
selinux_create_req
|
| |
|
|
| |
Include the name of the union and list the correct options.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit adds support for the following syntax:
int8[constant]
as an equivalent to:
const[constant, int8]
The goal is to have a unified const/flags definition that we can use in
templates. For example:
type template[CLASS, ...] {
class int8:3[CLASS]
// ...
}
type singleClassType template[SINGLE_CONST]
type subClassType template[abc_class_flags]
In this example, the CLASS template field can be either a constant or a
flag. This is especially useful when defining both a generic instance of
the template as well as specialized instances (ex. bpf_alu_ops and
bpf_add_op).
Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit adds support for the following syntax:
int_flags = 1, 5, 8, 9
int32[int_flags]
which is equivalent to:
int_flags = 1, 5, 8, 9
flags[int_flags, int32]
The second int type argument, align, is not allowed if the first
argument is a flag. The compiler will also error if the first argument
appears to be a flag (is ident and has no colon), but can't be found in
the map of flags.
Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
|
| |
|
|
|
|
|
|
|
|
| |
This commit adds a few test cases for the support of AUTO for structs.
It covers:
- A simple struct with only const and len types.
- A nested struct case.
- An error case when a struct has an int type field.
Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
|
| |
|
|
|
|
|
| |
This helper function will be used in a subsequent commit to take a look
ahead at several characters.
Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
|
| |
|
|
|
|
|
|
| |
We already try as hard as possible to not generate escaping (global) filenames.
However, it's possible we read them from the corpus if it happens to contain some.
Also check for escaping filenames during deserialization.
Fixes #3678
|
| |
|
|
|
|
| |
Currently it can fail if there's never a closing quote.
Add a test to verify this behavior.
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
The new "$..." syntax is read as a Base64 encoding binary data.
Note that users cannot specify the size of the Base64 syntax using the
`"..."/<size>` notation.
When serialising programs to human-readable form, only compressed types
(determined by `IsCompressed()`) are represented using the new Base64
notation.
Also add a couple of serialisation tests, checking behaviour for
compressed and non-compressed types.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
To be able to collide specific syscalls more precisely, we need to
repeat the process many times.
Introduce the `rerun` call property, which instructs `syz-executor` to
repeat the call the specified number of times. The intended use is:
call1() (rerun: 100, async)
call2() (rerun: 100)
For now, assign rerun values randomly to consecutive pairs of calls,
where the first one is async.
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Replace the currently existing straightforward approach to race triggering
(that was almost entirely implemented inside syz-executor) with a more
flexible one.
The `async` call property instructs syz-executor not to block until the
call has completed execution and proceed immediately to the next call.
The decision on what calls to mark with `async` is made by syz-fuzzer.
Ultimately this should let us implement more intelligent race provoking
strategies as well as make more fine-grained reproducers.
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Now that call properties mechanism is implemented, we can refactor
fault injection.
Unfortunately, it is impossible to remove all traces of the previous apprach.
In reprolist and while performing syz-ci jobs, syzkaller still needs to
parse the old format.
Remove the old prog options-based approach whenever possible and replace
it with the use of call properties.
|
| |
|
|
|
|
|
|
|
| |
Call properties let us specify how each individual call within a program
must be executed. So far the only way to enforce extra rules was to pass
extra program-level properties (e.g. that is how fault injection was done).
However, it entangles the logic and not flexible enough.
Implement an ability to pass properties along with each individual call.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Represent array[const[X, int8], N] as string["XX...X"].
This replaces potentially huge number of:
NONFAILING(*(uint8_t*)0x2000126c = 0);
NONFAILING(*(uint8_t*)0x2000126d = 0);
NONFAILING(*(uint8_t*)0x2000126e = 0);
with a single memcpy. In one reproducer we had 3991 such lines.
Also replace memcpy's with memset's when possible.
Update #1070
|
| |
|
|
|
| |
Update #477
Update #502
|
| |
|
|
|
| |
Don't error on wrong vma with value in non strict mode.
Add more tests and fix use of cmp package (prog.Syscall is not comparable anymore).
|
| |
|
|
|
| |
1. Allow to not provide Out if it's the same as In.
2. Always check Out.
|
| |
|
|
|
| |
sys/{linux,openbsd} duplicate deserialization test logic as well.
Export and reuse the existing helper function.
|
| |
|
|
| |
Factor out a common test helper for tests that deserialize and check programs.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Ensure that we don't have conflicting sizes for the same argument
of the same syscall, e.g.:
foo$1(a int16)
foo$2(a int32)
This is useful for several reasons:
- we will be able avoid morphing syscalls into other syscalls
- we will be able to figure out more precise sizes for args
(lots of them are implicitly intptr, which is the largest
type on most important arches)
- found few bugs in linux descriptions
Update #477
Update #502
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We have _some_ limits on program length, but they are really soft.
When we ask to generate a program with 10 calls, sometimes we get
100-150 calls. There are also no checks when we accept external
programs from corpus/hub. Issue #1630 contains an example where
this crashes VM (executor limit on number of 1000 resources is
violated). Larger programs also harm the process overall (slower,
consume more memory, lead to monster reproducers, etc).
Add a set of measure for hard control over program length.
Ensure that generated/mutated programs are not too long;
drop too long programs coming from corpus/hub in manager;
drop too long programs in hub.
As a bonus ensure that mutation don't produce programs with
0 calls (which is currently possible and happens).
Fixes #1630
|
| |
|
|
|
|
|
|
| |
String value enforcement broke a number of tests
where we use different values.
Be more string as to what string values we use in tests.
Required to add tmpfs descriptions to test syz_mount_image.
Also special-casing AF_ALG algorithms as these are auto-generated.
|
| |
|
|
|
|
|
|
|
|
| |
Strings with enumerated values are frequently file names
or have complete enumeration of relevant values.
Mutating complete enumeration if not very profitable.
Mutating file names leads to escaping paths and
fuzzer messing with things it is not supposed to mess with as in:
r0 = openat$apparmor_task_exec(0xffffffffffffff9c, &(0x7f0000000440)='/proc/self//exe\x00', 0x3, 0x0)
|
| |
|
|
|
|
|
| |
We get them in cross-compilation test where an out const
arg has different values in different archs.
No reason to fail deserialization in that case, replace with default
arg instead.
|
| |
|
|
|
|
|
| |
Always serialize strings in readable format (non-hex).
Serialize binary data in readable format in more cases.
Fixes #792
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
AUTO arguments can be used for:
- consts
- lens
- pointers
For const's and len's AUTO is replaced with the natural value,
addresses for AUTO pointers are allocated linearly.
This greatly simplifies writing test programs by hand
as most of the time we want these natural values.
Update tests to use AUTO.
|
| |
|
|
|
|
|
| |
Add bulk of checks for strict parsing mode.
Probably not complete, but we can extend then in future as needed.
Turns out we can't easily use it for serialized programs
as they omit default args and during deserialization it looks like missing args.
|
| |
|
|
|
|
|
|
|
|
|
| |
Over time we relaxed parsing to handle all kinds of invalid programs
(excessive/missing args, wrong types, etc).
This is useful when reading old programs from corpus.
But this is harmful for e.g. reading test inputs as they can become arbitrary outdated.
For runtests which creates additional problem of executing not
what is actually written in the test (or at least what author meant).
Add strict parsing mode that does not tolerate any errors.
For now it just checks excessive syscall arguments.
|
| |
|
|
|
|
|
| |
Move target and vars into parser and make all
parsing functions methods of the parser.
This reduces number of args that we need to pass around
and eases adding more state that needs to be passed around.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently we only generate either valid user-space pointers or NULL.
Extend NULL to a set of special pointers that we will use in programs.
All targets now contain 3 special values:
- NULL
- 0xfffffffffffffff (invalid kernel pointer)
- 0x999999999999999 (non-canonical address)
Each target can add additional special pointers on top of this.
Also generate NULL/special pointers for non-opt ptr's.
This restriction was always too restrictive. We may want to generate
them with very low probability, but we do want to generate them.
Also change pointers to NULL/special during mutation
(but still not in the opposite direction).
|
| |
|
|
|
|
| |
Parse and collect and prog comments.
Will be needed for runtest annotations
(e.g. "requires threaded mode", etc).
|
| |
|
|
|
|
| |
Remember per-call comments, will be useful for annotating tests.
Also support this form:
call() # comment
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Make as much code as possible shared between all OSes.
In particular main is now common across all OSes.
Make more code shared between executor and csource
(in particular, loop function and threaded execution logic).
Also make loop and threaded logic shared across all OSes.
Make more posix/unix code shared across OSes
(e.g. signal handling, pthread creation, etc).
Plus other changes along similar lines.
Also support test OS in executor (based on portable posix)
and add 4 arches that cover all execution modes
(fork server/no fork server, shmem/no shmem).
This change paves way for testing of executor code
and allows to preserve consistency across OSes and executor/csource.
|
| |
|
|
|
| |
Parallelize more tests and reduce number of iterations
in random tests under race detector.
|
| |
|
|
|
| |
Test that isDefaultArg returns true for result of DefaultArg.
Fix few bugs uncovered by this test.
|
| |
|
|
|
|
|
| |
Handle most of type changes, e.g. const is changed to struct,
or struct to pointers. In all these cases we create default args.
They may not give the coverage anymore, but still better than
losing them right away.
|
| |
|
|
|
| |
Make program deserialization handle and recover after type changes
in descriptions.
|
| |
|
|
|
| |
Tolerate excessive args and fields during program parsing.
This is useful after description changes to not lose corpus.
|
| |
|
|
|
| |
Make the predicate the last argument.
It's more common and convenient (arguments are not separated by multiple lines).
|
| |
|
|
|
|
|
| |
This reduces size of a corpus in half.
We store corpus on manager and on hub,
so this will reduce their memory consumption.
But also makes large programs more readable.
|
| |
|
|
|
|
|
|
|
|
| |
Generated program always uses pid=0 even when there are multiple processes.
Make each process use own pid.
Unfortunately required to do quite significant changes to prog,
because the current format only supported fixed pid.
Fixes #490
|
| |
|
|
|
|
|
|
| |
Fixes #188
We now will write just ""/1000 to denote a 1000-byte output buffer.
Also we now don't store 1000-byte buffer in memory just to denote size.
Old format is still parsed.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes #460
File names, crypto algorithm names, etc in programs are completely unreadable:
bind$alg(r0, &(0x7f0000408000)={0x26, "6861736800000000000000000000",
0x0, 0x0, "6d6435000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000000000
00000000000"}, 0x58)
Introduce another format for printable strings.
New args are denoted by '' ("" for old args).
New format is enabled for printable chars, \x00
and \t, \r, \n.
Example:
`serialize(&(0x7f0000408000)={"6861736800000000000000000000", "4849000000"})`,
vs:
`serialize(&(0x7f0000408000)={'hash\x00', 'HI\x00'})`,
|