| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Replace the currently existing straightforward approach to race triggering
(that was almost entirely implemented inside syz-executor) with a more
flexible one.
The `async` call property instructs syz-executor not to block until the
call has completed execution and proceed immediately to the next call.
The decision on what calls to mark with `async` is made by syz-fuzzer.
Ultimately this should let us implement more intelligent race provoking
strategies as well as make more fine-grained reproducers.
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Now that call properties mechanism is implemented, we can refactor
fault injection.
Unfortunately, it is impossible to remove all traces of the previous apprach.
In reprolist and while performing syz-ci jobs, syzkaller still needs to
parse the old format.
Remove the old prog options-based approach whenever possible and replace
it with the use of call properties.
|
| |
|
|
|
|
|
|
|
| |
Call properties let us specify how each individual call within a program
must be executed. So far the only way to enforce extra rules was to pass
extra program-level properties (e.g. that is how fault injection was done).
However, it entangles the logic and not flexible enough.
Implement an ability to pass properties along with each individual call.
|
| |
|
|
|
|
| |
Detect the case when a program requires more copyout than executor can handle.
Curretnly these result in: "SYZFAIL: command refers to bad result" failures.
Now syz-fuzzer should ignore them.
|
| |
|
|
|
|
|
| |
We started to see lots of "provided buffer is too small" with seeded
syz_mount_image programs. Currently it fails whole VM, which is not good.
Ignoring them is not perfect, but there does not seem to be any better
simple solution.
|
| | |
|
| |
|
|
|
|
|
|
| |
Use native byte-order for IPC and program serialization.
This way we will be able to support both little- and big-endian
architectures.
Signed-off-by: Alexander Egorenkov <Alexander.Egorenkov@ibm.com>
|
| |
|
|
|
|
|
|
| |
We must pad data arguments with known values when serializing
them into the given destination buffer because it could
be reused and contain random bytes from previous use.
Signed-off-by: Alexander Egorenkov <Alexander.Egorenkov@ibm.com>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Having Dir is Type is handy, but forces us to duplicate lots of types.
E.g. if a struct is referenced as both in and out, then we need to
have 2 copies and 2 copies of structs/types it includes.
If also prevents us from having the struct type as struct identity
(because we can have up to 3 of them).
Revert to the old way we used to do it: propagate Dir as we walk
syscall arguments. This moves lots of dir passing from pkg/compiler
to prog package.
Now Arg contains the dir, so once we build the tree, we can use dirs
as before.
Reduces size of sys/linux/gen/amd64.go from 6058336 to 5661150 (-6.6%).
Update #1580
|
| |
|
|
|
|
|
| |
I bumped input buffer size on Go side in:
a2af37f0 prog: increase encodingexec buffer size
But I forgot to increase the size on the executor side.
Do this and add comments re keeping them in sync.
|
| |
|
|
|
| |
Some of the programs involving netfilter syscalls
produce errors about insufficient buffer size. Bump it more.
|
| |
|
|
|
|
| |
Fixes #1542
Found thanks to syz-check. Update #590
|
| |
|
|
|
|
|
|
| |
All callers of BitfieldMiddle just want static size (0 for middle).
Make it so: Size for middle bitfields just returns 0. Removes lots of if's.
Introduce Type.UnitSize, which now holds the underlying type for bitfields.
This will be needed to fix #1542 b/c even if UnitSize=4 for last bitfield
Size can be anywhere from 0 to 4 (not necessary equal to UnitSize due to overlapping).
|
| |
|
|
|
|
|
| |
Always serialize strings in readable format (non-hex).
Serialize binary data in readable format in more cases.
Fixes #792
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently we only generate either valid user-space pointers or NULL.
Extend NULL to a set of special pointers that we will use in programs.
All targets now contain 3 special values:
- NULL
- 0xfffffffffffffff (invalid kernel pointer)
- 0x999999999999999 (non-canonical address)
Each target can add additional special pointers on top of this.
Also generate NULL/special pointers for non-opt ptr's.
This restriction was always too restrictive. We may want to generate
them with very low probability, but we do want to generate them.
Also change pointers to NULL/special during mutation
(but still not in the opposite direction).
|
| |
|
|
|
|
| |
Move debug validation into a separate function.
Update #538
|
| |
|
|
|
|
|
| |
Factor copyin, copyout and checksums into separate functions.
Also slightly tidy csum analysis.
Update #538
|
| |
|
|
|
|
| |
Reduce cyclomatic complexity.
Update #538
|
| |
|
|
|
| |
fmt type allows to convert intergers and resources
to string representation.
|
| |
|
|
|
| |
No reason to allocate return value if there is no return type.
c.Ret == nil is the reasonable indication that this is a "void" call.
|
| |
|
|
|
|
| |
Now that we don't have ReturnArg and only ResultArg's refer
to other ResultArg's we can remove ArgUser/ArgUsed and
devirtualize lots of code.
|
| |
|
|
| |
It's not all that needed.
|
| |
|
|
|
|
|
|
|
| |
We currently use -1 as default value for resources
when the actual value is not available.
-1 is good for fd's, but is not the right default
value for pointers/keys/etc.
Pass from prog and use in executor proper default
value for resources.
|
| |
|
|
|
| |
If all union options can be syscall arguments,
allow the union itself as syscall argument.
|
| |
|
|
|
|
|
|
|
|
|
|
| |
1. mmap all memory always, without explicit mmap calls in the program.
This makes lots of things much easier and removes lots of code.
Makes mmap not a special syscall and allows to fuzz without mmap enabled.
2. Change address assignment algorithm.
Current algorithm allocates unmapped addresses too frequently
and allows collisions between arguments of a single syscall.
The new algorithm analyzes actual allocations in the program
and places new arguments at unused locations.
|
| |
|
|
|
|
| |
Turns out we never produced NULL pointers because
what's meant to be NULL pointer was actually encoded
as pointer to beginning of the data region.
|
| | |
|
| |
|
|
|
|
|
|
| |
Make Foreach* callback accept the arg and a context struct
that can contain lots of aux info.
This (1) removes lots of unuser base/parent args,
(2) provides foundation for stopping recursion,
(3) allows to merge foreachSubargOffset.
|
| | |
|
| | |
|
| |
|
|
|
|
|
|
|
|
| |
Generated program always uses pid=0 even when there are multiple processes.
Make each process use own pid.
Unfortunately required to do quite significant changes to prog,
because the current format only supported fixed pid.
Fixes #490
|
| |
|
|
| |
Fixes #174
|
| |
|
|
|
|
|
|
|
| |
Factor out program parsing from pkg/csource.
csource code that parses program and at the same time
formats output is very messy and complex.
New aproach also allows to understand e.g.
when a call has copyout instructions which is
useful for better C source output.
|
| | |
|
| |
|
|
|
|
| |
Introduce isUsed(arg) helper, use it in several places.
Move method definitions closer to their types.
Simplify presence check for ArgUsed.Used() in several places.
|
| |
|
|
|
|
|
|
| |
Fixes #188
We now will write just ""/1000 to denote a 1000-byte output buffer.
Also we now don't store 1000-byte buffer in memory just to denote size.
Old format is still parsed.
|
| |
|
|
|
|
| |
Currently we always send 2MB of data to executor in ipc_simple.go.
Send only what's consumed by the program, and don't send the trailing zeros.
Serialized programs usually take only few KBs.
|
| |
|
|
|
|
| |
Now each prog function accepts the desired target explicitly.
No global, implicit state involved.
This is much cleaner and allows cross-OS/arch testing, etc.
|
| |
|
|
|
|
|
|
|
|
|
| |
Large overhaul moves syscalls and arg types from sys to prog.
Sys package now depends on prog and contains only generated
descriptions of syscalls.
Introduce prog.Target type that encapsulates all targer properties,
like syscall list, ptr/page size, etc. Also moves OS-dependent pieces
like mmap call generation from prog to sys.
Update #191
|
| |
|
|
| |
In preparation for moving sys types to prog to reduce later diffs.
|
| |
|
|
| |
It is used only by a single test. Remove it from non-test code.
|
| |
|
|
|
|
|
|
|
|
| |
We currently use uintptr for all values.
This won't work for 32-bit archs.
Moreover in some cases we use uintptr but assume
that it is always 64-bits (e.g. in encodingexec).
Switch everything to uint64.
Update #324
|
| |
|
|
| |
Result of running gofmt -s.
|
| |
|
|
|
|
| |
ResultArg might have const value.
Also add a test.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Right now Arg is a huge struct (160 bytes), which has many different fields
used for different arg kinds. Since most of the args we see in a typical
corpus are ArgConst, this results in a significant memory overuse.
This change:
- makes Arg an interface instead of a struct
- adds a SomethingArg struct for each arg kind we have
- converts all *Arg pointers into just Arg, since interface variable by
itself contains a pointer to the actual data
- removes ArgPageSize, now ConstArg is used instead
- consolidates correspondence between arg kinds and types, see comments
before each SomethingArg struct definition
- now LenType args that denote the length of VmaType args are serialized as
"0x1000" instead of "(0x1000)"; to preserve backwards compatibility
syzkaller is able to parse the old format for now
- multiple small changes all over to make the above work
After this change syzkaller uses twice less memory after deserializing a
typical corpus.
|
| |
|
|
|
| |
This commit moves checksum computation to executor. This will allow to embed
dynamically generated values (like TCP sequence numbers) into packets.
|
| |
|
|
|
|
|
| |
This change adds a `csum[kind, type]` type.
The only available kind right now is `ipv4`.
Using `csum[ipv4, int16be]` in `ipv4_header` makes syzkaller calculate
and embed correct checksums into ipv4 packets.
|
| |
|
|
|
|
|
| |
The optimization change removed validation too aggressively.
We do need program validation during deserialization,
because we can get bad programs from corpus or hub.
Restore program validation after deserialization.
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
A bunch of spot optmizations after cpu/memory profiling:
1. Optimize hot-path coverage comparison in fuzzer.
2. Don't allocate and copy serialized program, serialize directly into shmem.
3. Reduce allocations during parsing of output shmem (encoding/binary sucks).
4. Don't allocate and copy coverage arrays, refer directly to the shmem region
(we are not going to mutate them).
5. Don't validate programs outside of tests, validation allocates tons of memory.
6. Replace the choose primitive with simpler switches.
Choose allocates fullload of memory (for int, func, and everything the func refers).
7. Other minor optimizations.
|
| | |
|