| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Auto-generated syscall descriptions currently do not properly mark
arch-specific syscalls like socketcall (which is only available on 32
bit systems), which leads to TestGenerate breakages.
Until the syz-declextract tool is fixed and descriptions are
re-generated, don't use such calls in TestGenerate tests. It has
recently caused numerous syzkaller update errors on syzbot.
Cc #5410.
Closes #6468.
|
| |
|
|
|
|
|
|
|
|
|
| |
All non-base variants of syz_kfuzztest_run (i.e., those that are
discovered dynamically) are encoded so that they map onto the base
variant which is defined in kfuzztest.txt, and known by the executor.
We add a function for fetching this, which is wrapped in a sync.Once
block to avoid repeated iteration over the target's array of syscalls.
Signed-off-by: Ethan Graham <ethangraham@google.com>
|
| |
|
|
|
|
|
|
| |
As KFuzzTest targets are discovered at boot, we need a mechanism for
adding these to the array of enabled system calls. This is implemented
by the new Extend method, which performs this setup.
Signed-off-by: Ethan Graham <ethangraham@google.com>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instead of generating Go files with descriptions
serialize them as gob and compress with flate.
This significantly reduces build time, go vet time,
and solves scalability problems with some static analysis tools.
Reference times (all after rm -rf ~/.cache/go-build) before:
TIME="%e %P %M" time go install ./syz-manager
48.29 577% 4824820
TIME="%e %P %M" time go test -c ./prog
56.28 380% 6973292
After:
TIME="%e %P %M" time go install ./syz-manager
22.81 865% 859788
TIME="%e %P %M" time go test -c ./prog
12.74 565% 267760
syz-manager size before/after: 194712597 -> 83418407
-57% even provided we now embed all descriptions
instead of just a single arch.
Deflate/decoding time for a single Linux arch is ~330ms.
Fixes #5542
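The gob + flate round trip can be sketched as below. This is only an illustration using the standard encoding/gob and compress/flate packages; the Desc type and the encode/decode helpers are hypothetical and much simpler than real descriptions.

```go
package main

import (
	"bytes"
	"compress/flate"
	"encoding/gob"
	"fmt"
)

// Desc is a hypothetical, heavily simplified description blob.
type Desc struct {
	Syscalls []string
	Consts   map[string]uint64
}

// encode serializes d with gob and compresses the stream with flate.
func encode(d *Desc) ([]byte, error) {
	var buf bytes.Buffer
	w, err := flate.NewWriter(&buf, flate.BestCompression)
	if err != nil {
		return nil, err
	}
	if err := gob.NewEncoder(w).Encode(d); err != nil {
		return nil, err
	}
	// Close flushes the remaining compressed data into buf.
	if err := w.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// decode performs the reverse operation: inflate, then gob-decode.
func decode(data []byte) (*Desc, error) {
	r := flate.NewReader(bytes.NewReader(data))
	defer r.Close()
	d := new(Desc)
	if err := gob.NewDecoder(r).Decode(d); err != nil {
		return nil, err
	}
	return d, nil
}

func main() {
	in := &Desc{Syscalls: []string{"open", "read"}, Consts: map[string]uint64{"O_WRONLY": 1}}
	data, err := encode(in)
	if err != nil {
		panic(err)
	}
	out, err := decode(data)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(data), out.Syscalls, out.Consts)
}
```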
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Several optimizations to reduce amount of hint replacements:
1. Don't mutate int's that are <= 8 bits.
2. Don't mutate data that is <= 3 bytes.
3. Restrict mutation of len to values > 10 and < 1<<20.
Values <= 10 we can produce during normal mutation.
Values > 1<<20 are presumably not length of something
and we have logic to produce various large bogus lengths.
4. Include all small ints <= 16 into specialInts and remove 31, 32, 63
(don't remember where they come from).
5. Don't produce other known flags (and combinations) for flags.
And a larger part computes groups of related arguments
so that we don't try to produce known ioctl's from other known ioctl's,
and similarly for socket/socketpair/setsockopt/etc.
See comments in Target.initRelatedFields for details.
Update #477
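The size thresholds from items 1-3 can be expressed as simple predicates. The function names below are hypothetical; only the numeric limits are taken from the list above.

```go
package main

import "fmt"

// worthReplacingInt: ints of 8 bits or fewer are not hint-mutated (item 1).
func worthReplacingInt(bits int) bool { return bits > 8 }

// worthReplacingData: data of 3 bytes or fewer is not hint-mutated (item 2).
func worthReplacingData(size int) bool { return size > 3 }

// worthReplacingLen: only len values > 10 and < 1<<20 are hint-mutated (item 3).
func worthReplacingLen(v uint64) bool { return v > 10 && v < 1<<20 }

func main() {
	for _, v := range []uint64{4, 11, 4096, 1 << 20} {
		fmt.Printf("len %d: %v\n", v, worthReplacingLen(v))
	}
}
```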
|
| |
|
|
|
|
| |
Little-endian is kind of default (except for s390).
So instead of saying that each arch is little-endian,
mark only s390 as big-endian.
|
| |
|
|
|
|
|
|
|
| |
All OSes we have now support shmem.
Support for Fuchsia/Starnix/Windows wasn't implemented,
but generally they support shared memory.
Remove all of the complexity and code associated with noshmem mode.
If/when we revive these OSes, it's easier to properly
implement shmem mode for them.
|
| |
|
|
|
|
|
| |
Move more complex glob processing to the host (into prog package).
Make fuzzer just read and return globs if requested.
This moves us closer to #1541
|
| |
|
|
|
|
| |
If we send exec encoding to the fuzzer, it's not necessary to serialize
exec encoding into an existing buffer (currently we serialize directly into shmem).
So simplify code by serializing into a new slice.
|
| |
|
|
|
| |
This extends Target to contain a map of possible values for each int
flag.
|
| |
|
|
|
| |
It can be useful to retrieve constants from their names to factorize
constant into flag masks in syz-prog2c for example.
|
| |
|
|
|
|
|
|
|
|
| |
During resource argument generation, we used to randomly select one of
the matching resources. With so many descendants of fd, this becomes
quite inefficient and most of the time syzkaller fails to build correct
programs.
Give precise resource constructions priority. Experiment with other
resource types only in 1/3 of cases.
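A rough sketch of the 2/3 vs 1/3 split, under the assumption that candidates are already divided into precise matches and other compatible resources. The function and resource names are hypothetical.

```go
package main

import (
	"fmt"
	"math/rand"
)

// chooseResource prefers precise matches and tries other compatible
// resources in roughly 1/3 of cases. It returns "" if there are no candidates.
func chooseResource(r *rand.Rand, precise, compatible []string) string {
	if len(precise) == 0 || (len(compatible) != 0 && r.Intn(3) == 0) {
		if len(compatible) == 0 {
			return ""
		}
		return compatible[r.Intn(len(compatible))]
	}
	return precise[r.Intn(len(precise))]
}

// countPrecise runs n trials with a fixed seed and counts precise choices.
func countPrecise(seed int64, n int) int {
	r := rand.New(rand.NewSource(seed))
	precise := []string{"fd_tty"}
	compatible := []string{"fd", "fd_dir", "sock"}
	count := 0
	for i := 0; i < n; i++ {
		if chooseResource(r, precise, compatible) == "fd_tty" {
			count++
		}
	}
	return count
}

func main() {
	fmt.Printf("precise chosen in %d of 3000 trials\n", countPrecise(1, 3000))
}
```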
|
| |
|
|
|
|
| |
Now that images are not linux-specific,
we can move all image-related logic directly into prog package
and significantly simplify the logic.
|
| |
|
|
|
|
|
|
|
|
|
| |
Ideally, we should properly support the already existing fix flag to
distinguish between fixing and checking, but for now at least let it
control whether structural changes are to be made.
Otherwise we get into trouble while hint-mutating syz_mount_image calls,
because we iterate over all call arguments and (possibly) remove them at
the same time. It leads to `bad group arg size %v, should be <= %v for
%#v type %#v` errors.
|
| |
|
|
|
| |
To simplify the extraction code, let's make segments non-overlapping
even before execution.
|
| |
|
|
|
| |
Generate very long file names once in a while to provoke bugs like:
https://github.com/google/gvisor/commit/f857f268eceb1cdee0b2bdfa218c969c84033fcd
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently we loop up to 1000 times in randGen.createResource,
this is necessary because we can't guarantee that the generated syscall
will indeed contain the necessary resources. This is ugly.
Now that we have stricter constructors (no unions) with few additional tweaks
we can guarantee that we generate the resource every time.
Generate at least 1 array element when in createResource.
Don't generate special empty pointers when in createResource.
Record only resource constructors in Syscall.outputResource,
this makes the rotation logic include at least 1 of them.
|
| |
|
|
|
| |
This will allow callbacks to stop iteration early by
setting ctx.Stop flag (as it works for ForeachArg).
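A minimal sketch of a callback-controlled traversal. The Type and TypeCtx types and the foreachType helper are simplified stand-ins, and Stop is interpreted here as "do not descend into the sub-elements of the current type", which is an assumption about the intended semantics.

```go
package main

import "fmt"

// Type is a simplified stand-in for a type tree node.
type Type struct {
	Name  string
	Elems []*Type
}

// TypeCtx is passed to the callback; setting Stop prunes the traversal
// below the current type (assumed semantics).
type TypeCtx struct {
	Stop bool
}

// foreachType visits t and, unless the callback set ctx.Stop, its elements.
func foreachType(t *Type, f func(t *Type, ctx *TypeCtx)) {
	ctx := &TypeCtx{}
	f(t, ctx)
	if ctx.Stop {
		return
	}
	for _, e := range t.Elems {
		foreachType(e, f)
	}
}

func main() {
	root := &Type{Name: "root", Elems: []*Type{
		{Name: "a", Elems: []*Type{{Name: "a1"}}},
		{Name: "b"},
	}}
	foreachType(root, func(t *Type, ctx *TypeCtx) {
		fmt.Println(t.Name)
		if t.Name == "a" {
			ctx.Stop = true // a1 is not visited
		}
	})
}
```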
|
| |
|
|
|
|
|
|
|
|
| |
Currently fallback coverage imposes an implicit 8K limit
on the max number of syscalls. 8K is quite close to the
current number of syscalls we have on Linux.
1. Bump this limit to 2M.
2. Detect limit violation during startup rather than later,
with an obscure error message and only if fallback coverage is used.
|
| | |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* all: add new typename dirname
The current way to check files under sysfs or proc is:
- define a string to represent each file
- open the file
- pass the fd to write / read / close
The issues with this approach are:
- Need to know which files are present on the target device
- Need to write openat for each file
With dirname added, syzkaller will open one file
in the directory randomly and then pass the fd to
write/read/close.
* all: use typename glob to match filename
Fixes #481
|
| |
|
|
| |
Otherwise coverage collection just doesn't work.
|
| |
|
|
|
|
|
|
|
| |
Calls to alloc didn't respect the alignment attribute. Now
Type.Alignment() is used to ensure each type is correctly
aligned. Existing descriptions with [align[X]] don't have an
issue as they align to small blocks and default align is to
64 bytes. This commit adds support for [align[X]] for an X
larger than 64.
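The effect can be illustrated with a toy bump allocator. This is a sketch under the assumption that the allocator uses the larger of the 64-byte default and the type's own alignment; the allocator type and its alloc method are hypothetical.

```go
package main

import "fmt"

// defaultAlign is the default allocation alignment mentioned above.
const defaultAlign = 64

// allocator is a toy bump allocator handing out increasing addresses.
type allocator struct {
	next uint64
}

// alloc returns an address aligned to max(align, defaultAlign).
func (a *allocator) alloc(size, align uint64) uint64 {
	if align < defaultAlign {
		align = defaultAlign
	}
	addr := (a.next + align - 1) / align * align // round up to a multiple of align
	a.next = addr + size
	return addr
}

func main() {
	a := &allocator{}
	fmt.Println(a.alloc(10, 8))  // small alignment falls back to the default
	fmt.Println(a.alloc(16, 4))  // next default-aligned block
	fmt.Println(a.alloc(8, 256)) // alignment larger than the default is respected
}
```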
|
| |
|
|
|
|
|
|
|
|
|
|
| |
* Introduce the new target flag 'LittleEndian' which specifies
the endianness of the target.
* Introduce the new requires flag 'littleendian' for tests to
selectively enable/disable tests on either little-endian architectures
or big-endian ones.
* Disable KD unit test on s390x architecture because the test
works only on little-endian architecture.
Signed-off-by: Alexander Egorenkov <Alexander.Egorenkov@ibm.com>
|
| |
|
|
|
|
|
|
|
| |
The linux string dictionary comes from extremely old times
when we did not have proper descriptions for almost anything,
and the dictionary was a quick hack to guess at least some
special strings.
Now we have way better descriptions and the dictionary
has become both unnecessary and probably even harmful.
|
| |
|
|
| |
Don't allocate 3 parallel slices.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Use Ref in Arg instead of full Type interface.
This reduces size of all args. In particular the most common
ConstArg is reduced from 32 bytes to 16 and now does not
contain any pointers (better for GC).
Running syz-db bench on a beefy corpus: before:
allocs 7262 MB (18 M), next GC 958 MB, sys heap 1279 MB, live allocs 479 MB (8 M), time 9.704699958s
allocs 7262 MB (18 M), next GC 958 MB, sys heap 1279 MB, live allocs 479 MB (8 M), time 9.873792394s
allocs 7262 MB (18 M), next GC 958 MB, sys heap 1279 MB, live allocs 479 MB (8 M), time 9.820479906s
after:
allocs 7163 MB (18 M), next GC 759 MB, sys heap 1023 MB, live allocs 379 MB (8 M), time 8.938939937s
allocs 7163 MB (18 M), next GC 759 MB, sys heap 1087 MB, live allocs 379 MB (8 M), time 9.410243167s
allocs 7163 MB (18 M), next GC 759 MB, sys heap 1023 MB, live allocs 379 MB (8 M), time 9.38225806s
Max heap and live heap are reduced by 20%.
Update #1580
|
| |
|
|
|
|
|
|
|
|
| |
Currently ANY implementation fabricates new types dynamically.
This is something we don't do anywhere else, generally types
come from compiler and all are static.
Dynamic types will conflict with use of Ref in Arg optimization.
Move ANY types creation into compiler.
Update #1580
|
| |
|
|
|
| |
Update #477
Update #502
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Remove StructDesc, KeyedStruct, StructKey and all associated
logic/complexity in prog and pkg/compiler.
We can now handle recursion more generically with the Ref type,
and Dir/FieldName are not a part of the type anymore.
This makes StructType/UnionType simpler and more natural.
Reduces size of sys/linux/gen/amd64.go from 5201321 to 4180861 (-20%).
Update #1580
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Remove FieldName from Type and add a separate Field type
that holds field name. Use Field for struct fields, union options
and syscalls arguments, only these really have names.
Reduces size of sys/linux/gen/amd64.go from 5665583 to 5201321 (-8.2%).
Allows to not create new type for squashed any pointer.
But main advantages will follow, e.g. removing StructDesc,
using TypeRef in Arg, etc.
Update #1580
|
| |
|
|
|
|
|
| |
Name "Type" is confusing when referring to pointer/array element type.
Frequently there are too many Type/typ/typ1/t and typ.Type is not very informative.
It _is_ a type, but what's usually more relevant is that it's an _element_ type.
Let's leave type checking to compiler and give it a more meaningful name.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Having Dir in Type is handy, but forces us to duplicate lots of types.
E.g. if a struct is referenced as both in and out, then we need to
have 2 copies and 2 copies of structs/types it includes.
It also prevents us from having the struct type as struct identity
(because we can have up to 3 of them).
Revert to the old way we used to do it: propagate Dir as we walk
syscall arguments. This moves lots of dir passing from pkg/compiler
to prog package.
Now Arg contains the dir, so once we build the tree, we can use dirs
as before.
Reduces size of sys/linux/gen/amd64.go from 6058336 to 5661150 (-6.6%).
Update #1580
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add prog.Ref Type that serves as a proxy for real types
and allows to deduplicate Types in generated descriptions.
The Ref type is effectively an index in an array of types.
Just before serialization pkg/compiler replaces real types
with the Ref types and prepares corresponding array of real types.
When a Target is registered in prog package, we do the opposite
operation and replace Ref's with the corresponding real types.
This brings improvements across the board:
compiler memory consumption is reduced by 15%,
test building time by 25%, descriptions size by 33%.
Before:
$ du -h sys/linux/gen
54M sys/linux/gen
$ time GOMAXPROCS=1 go test -p=1 -c ./prog
real 0m54.200s
real 0m53.883s
$ time GOMAXPROCS=1 go install -p=1 ./tools/syz-execprog
real 0m27.911s
real 0m27.767s
$ TIME="%e %P %M" GOMAXPROCS=1 time go tool compile ./sys/linux/gen
20.59 100% 3200016
20.97 100% 3445976
20.25 100% 3209684
After:
$ du -h sys/linux/gen
36M sys/linux/gen
$ time GOMAXPROCS=1 go test -p=1 -c ./prog
real 0m42.290s
real 0m43.230s
$ time GOMAXPROCS=1 go install -p=1 ./tools/syz-execprog
real 0m24.337s
real 0m24.727s
$ TIME="%e %P %M" GOMAXPROCS=1 time go tool compile ./sys/linux/gen
19.11 100% 2764952
19.66 100% 2787624
19.35 100% 2749376
Update #1580
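The replace-and-restore idea behind Ref can be sketched as follows. This is a simplified illustration rather than the real prog.Ref: the Type interface, IntType, and the intern/restore helpers are hypothetical, and types are deduplicated by their string form.

```go
package main

import "fmt"

// Type is a simplified stand-in for the real type interface.
type Type interface {
	String() string
}

// IntType is a sample real type.
type IntType struct {
	Bits int
}

func (t *IntType) String() string { return fmt.Sprintf("int%d", t.Bits) }

// Ref is a proxy type: effectively an index into a table of real types.
type Ref int

func (r Ref) String() string { return fmt.Sprintf("ref#%d", int(r)) }

// intern replaces every type with a Ref and returns the table of unique types.
func intern(types []Type) (refs, table []Type) {
	index := make(map[string]Ref)
	for _, t := range types {
		r, ok := index[t.String()]
		if !ok {
			r = Ref(len(table))
			index[t.String()] = r
			table = append(table, t)
		}
		refs = append(refs, r)
	}
	return refs, table
}

// restore is the opposite operation: Refs are replaced with real types.
func restore(refs, table []Type) []Type {
	out := make([]Type, len(refs))
	for i, t := range refs {
		if r, ok := t.(Ref); ok {
			t = table[r]
		}
		out[i] = t
	}
	return out
}

func main() {
	types := []Type{&IntType{32}, &IntType{64}, &IntType{32}}
	refs, table := intern(types)
	fmt.Println(refs, len(table), restore(refs, table))
}
```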
|
| |
|
|
|
|
|
|
|
|
|
| |
Make MakeMmap return more than 1 call.
This is a preparation for future changes.
Also remove addr/size as they are effectively
always the same and can be inferred from the target
(will also conflict with the future changes).
Also rename to MakeDataMmap to better represent
the new purpose: it's not just some arbitrary mmap,
but rather a mapping of the data segment.
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
We will need a wrapper for target.SanitizeCall that will do more
than just calling the target-provided function. To avoid confusion
and potential mistakes, give the target function and prog function
different names. Prog package will continue to call this "sanitize",
which will include target's "neutralize" + more.
Also refactor API a bit: we need a helper function that sanitizes
the whole program because that's needed most of the time.
Fixes #477
Fixes #502
|
| |
|
|
|
|
|
|
|
| |
Use a random subset of syscalls/corpus/coverage for each individual VM run.
Hypothesis is that this should allow fuzzer to get more coverage
and find more bugs in saturated state (stuck in local optimum).
See the issue and comments for details.
Update #1348
|
| |
|
|
|
| |
Allows to use compiled descriptions.
Will be useful for static checking utility.
|
| |
|
|
|
|
|
| |
When we build a list of resource constructors we over and over iterate through
all types in a syscall to find resource types. Speed it up by iterating only
once to build a list of constructors for each resource and then reuse it.
This significantly speeds up syz-execprog startup time on Raspberry Pi Zero.
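The single-pass approach can be sketched like this. The Syscall type, its Creates field, and buildConstructors are hypothetical simplifications; the real code has to walk syscall types to find the resources a call produces.

```go
package main

import "fmt"

// Syscall is a simplified stand-in; Creates lists the resources it produces.
type Syscall struct {
	Name    string
	Creates []string
}

// buildConstructors iterates over all syscalls once and records, for each
// resource, the calls that can construct it. Later lookups reuse the map.
func buildConstructors(calls []*Syscall) map[string][]*Syscall {
	ctors := make(map[string][]*Syscall)
	for _, c := range calls {
		for _, res := range c.Creates {
			ctors[res] = append(ctors[res], c)
		}
	}
	return ctors
}

func main() {
	calls := []*Syscall{
		{Name: "open", Creates: []string{"fd"}},
		{Name: "socket", Creates: []string{"sock"}},
		{Name: "pipe", Creates: []string{"fd"}},
		{Name: "close"},
	}
	ctors := buildConstructors(calls)
	for _, c := range ctors["fd"] {
		fmt.Println("fd constructor:", c.Name)
	}
}
```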
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Providing additional info, especially regarding syscall arguments, in reproducers
can be helpful. An example is device numbers passed to mknod(2).
This commit introduces an optional annotate function on a per target basis.
Example for the OpenBSD target:
$ cat prog.in
mknod(0x0, 0x0, 0x4503)
getpid()
$ syz-prog2c -prog prog.in
int main(void)
{
syscall(SYS_mmap, 0x20000000, 0x1000000, 3, 0x1012, -1, 0, 0);
syscall(SYS_mknod, 0, 0, 0x4503); /* major = 69, minor = 3 */
syscall(SYS_getpid);
return 0;
}
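The comment in the output above can be reproduced with a small helper. This sketch assumes the OpenBSD dev_t layout (major in bits 8-15, minor in the low 8 bits plus the upper 16 bits); the helper names are hypothetical and not the actual annotate function.

```go
package main

import "fmt"

// major and minor assume the OpenBSD dev_t encoding.
func major(dev uint64) uint64 { return (dev >> 8) & 0xff }

func minor(dev uint64) uint64 { return (dev & 0xff) | ((dev & 0xffff0000) >> 8) }

// annotateMknod returns the comment text for the device argument of mknod.
func annotateMknod(dev uint64) string {
	return fmt.Sprintf("major = %d, minor = %d", major(dev), minor(dev))
}

func main() {
	fmt.Printf("/* %s */\n", annotateMknod(0x4503))
}
```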
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
AUTO arguments can be used for:
- consts
- lens
- pointers
For const's and len's AUTO is replaced with the natural value,
addresses for AUTO pointers are allocated linearly.
This greatly simplifies writing test programs by hand
as most of the time we want these natural values.
Update tests to use AUTO.
|
| |
|
|
|
|
| |
golint suggests that "prog.Prog" is a bad naming
because everything in prog package is ProgSomething.
Rename to Builder, "prog.Builder" sounds right.
|
| |
|
|
|
|
|
| |
There are 2 bugs:
1. We always allocate 1 page, even if we use more.
2. VMA addresses are not aligned, so most mmap-like functions fail with EINVAL.
The added test currently panics with "unaligned vma address".
|
| | |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently when we get target consts with target.ConstMap["name"]
during target initialization, we just get 0 for missing consts.
This is error-prone as we can mis-type a const, or a const may
be undefined only on some archs (as we have common unix code
shared between several OSes).
Check that all the consts are actually defined.
The check detects several violations, to fix them:
1. move mremap to linux as it's only defined on linux
2. move S_IFMT to openbsd, as it's only defined and used on openbsd
3. define missing MAP_ANONYMOUS for freebsd and netbsd
4. fix extract for netbsd
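The difference between a plain map lookup and a checked lookup can be sketched as below. checkConsts is a hypothetical helper, not the function added by this change.

```go
package main

import (
	"fmt"
	"strings"
)

// checkConsts reports all requested names that are missing from consts.
// A plain consts["name"] lookup would silently yield 0 for them.
func checkConsts(consts map[string]uint64, names ...string) error {
	var missing []string
	for _, name := range names {
		if _, ok := consts[name]; !ok {
			missing = append(missing, name)
		}
	}
	if len(missing) != 0 {
		return fmt.Errorf("undefined consts: %s", strings.Join(missing, ", "))
	}
	return nil
}

func main() {
	consts := map[string]uint64{"MAP_FIXED": 0x10}
	fmt.Println(consts["MAP_ANONYMOUS"]) // missing const silently reads as 0
	fmt.Println(checkConsts(consts, "MAP_FIXED", "MAP_ANONYMOUS"))
}
```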
|
| |
|
|
|
|
|
|
| |
This avoids the issue of "android" not having any registered configurations
or syscalls / ioctls / etc, when built with GOOS=android.
This occurs when building in Google3, since --config=android_arm64 selects
the Android toolchain.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently we only generate either valid user-space pointers or NULL.
Extend NULL to a set of special pointers that we will use in programs.
All targets now contain 3 special values:
- NULL
- 0xfffffffffffffff (invalid kernel pointer)
- 0x999999999999999 (non-canonical address)
Each target can add additional special pointers on top of this.
Also generate NULL/special pointers for non-opt ptr's.
This restriction was always too restrictive. We may want to generate
them with very low probability, but we do want to generate them.
Also change pointers to NULL/special during mutation
(but still not in the opposite direction).
|
| |
|
|
| |
Fix typos, non-canonical code, remove dead code, etc.
|
| |
|
|
|
|
| |
Squash complex structs into flat byte array and mutate this array
with generic blob mutations. This allows to mutate what we currently
consider as paddings and add/remove paddings from structs, etc.
|
| |
|
|
|
|
| |
IDs change whenever a call is added or removed,
this leads to unnecessarily large diffs.
Assign IDs dynamically.
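A minimal sketch of assigning IDs at initialization time instead of storing them in generated files. Sorting by name before numbering is an assumption made here to keep the result deterministic; the Syscall type and assignIDs are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// Syscall is a simplified stand-in; ID is not stored in generated files.
type Syscall struct {
	ID   int
	Name string
}

// assignIDs numbers syscalls at startup. Sorting by name first (an
// assumption of this sketch) makes the IDs independent of input order.
func assignIDs(calls []*Syscall) {
	sort.Slice(calls, func(i, j int) bool { return calls[i].Name < calls[j].Name })
	for i, c := range calls {
		c.ID = i
	}
}

func main() {
	calls := []*Syscall{{Name: "write"}, {Name: "open"}, {Name: "read"}}
	assignIDs(calls)
	for _, c := range calls {
		fmt.Println(c.ID, c.Name)
	}
}
```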
|