| Commit message (Collapse) | Author | Age | Files | Lines |
|
If conditions of several union fields are satisfied, select one
randomly. This would be a more logical semantics.
When conditional struct fields are translated to unions, negate the
condition for the union alternative.
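The selection rule above can be sketched as follows. This is a minimal illustration, not the syzkaller implementation: the `cond` type and the `satisfied` and `pickRandom` helpers are hypothetical names, and conditions are modeled as plain Go predicates over a single value.

```go
package main

import (
	"fmt"
	"math/rand"
)

// cond is a hypothetical field condition evaluated against one value.
type cond func(v uint64) bool

// satisfied returns the indices of all fields whose condition holds for v.
func satisfied(conds []cond, v uint64) []int {
	var idx []int
	for i, c := range conds {
		if c(v) {
			idx = append(idx, i)
		}
	}
	return idx
}

// pickRandom selects uniformly among the satisfied fields, or returns -1
// if no condition holds.
func pickRandom(r *rand.Rand, conds []cond, v uint64) int {
	idx := satisfied(conds, v)
	if len(idx) == 0 {
		return -1
	}
	return idx[r.Intn(len(idx))]
}

func main() {
	isSet := func(v uint64) bool { return v&1 != 0 }
	// A conditional struct field becomes a union of the value (original
	// condition) and a void alternative (negated condition), so exactly
	// one of the two is selectable for any input.
	conds := []cond{isSet, func(v uint64) bool { return !isSet(v) }}
	r := rand.New(rand.NewSource(1))
	fmt.Println(pickRandom(r, conds, 1)) // 0: only the value field matches
	fmt.Println(pickRandom(r, conds, 2)) // 1: only the void field matches
}
```

Negating the condition on the void alternative keeps the random choice from dropping a struct field whose condition actually holds.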
|
pkg/compiler restructures conditional fields in structures into unions,
so we only have to implement the support for unions.
The semantics are as follows:
If a union has conditions, syzkaller picks the first field whose
condition matches. Since we require the last union field to have no
conditions, we can always construct an object.
Changes from this commit aim at ensuring that the selected union fields
always follow the rule above.
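A minimal sketch of the first-match rule described in this message. The `unionField` type and `pickField` function are hypothetical, and a nil condition stands for an unconditional field.

```go
package main

import "fmt"

// unionField is a hypothetical stand-in for a union option.
// A nil cond means the field has no condition.
type unionField struct {
	name string
	cond func(env map[string]uint64) bool
}

// pickField returns the index of the first field whose condition holds.
// Because the last field must be unconditional, a match always exists.
func pickField(fields []unionField, env map[string]uint64) int {
	for i, f := range fields {
		if f.cond == nil || f.cond(env) {
			return i
		}
	}
	panic("the last union field must have no condition")
}

func main() {
	fields := []unionField{
		{"a", func(env map[string]uint64) bool { return env["type"] == 1 }},
		{"b", func(env map[string]uint64) bool { return env["type"]&2 != 0 }},
		{"def", nil},
	}
	fmt.Println(fields[pickField(fields, map[string]uint64{"type": 3})].name) // b
	fmt.Println(fields[pickField(fields, map[string]uint64{"type": 0})].name) // def
}
```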
|
The expression may either include integers/consts or reference other
fields in the structure via value[field1:field2:field3].
The fields on this path must all belong to structures and must not have
any if conditions themselves.
For unions, mandate that the last field has no conditions (it will be
the default one).
For structs, convert conditional fields into fields of a union type of
the following form:
	anonymous_union [
		value	T	(if[expression])
		void	void
	]
|
Include the name of the union and list the correct options.
|
Simplify seed program debugging by highlighting the actual error
location in the source text.
|
Display the call that violated the rules.
|
Earlier only len[parent, T] was supported and meant the size of the
whole structure.
Logically, len[parent:b, T] should be equivalent to just len[b, T].
Let len[parent:parent:a, T] refer to the structure that encloses the
current one.
Support len fields inside unions.
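The path rules above can be illustrated with a small resolver. This is only a sketch under assumptions: the `node` tree, `newNode`, and `resolve` are hypothetical and do not mirror the compiler's data structures. The first `parent` names the struct that holds the len field, and each further `parent` moves one struct outward.

```go
package main

import (
	"errors"
	"fmt"
)

// node is a hypothetical struct/field tree: a struct has named children
// and every node knows the struct that encloses it.
type node struct {
	name     string
	parent   *node
	children map[string]*node
}

func newNode(name string, parent *node) *node {
	n := &node{name: name, parent: parent, children: map[string]*node{}}
	if parent != nil {
		parent.children[name] = n
	}
	return n
}

// resolve finds the target of a len path for a len field that sits
// directly inside the struct cur.
func resolve(cur *node, path []string) (*node, error) {
	pos := cur
	for i, elem := range path {
		switch {
		case elem == "parent" && i == 0:
			// The first "parent" is the struct holding the len field.
		case elem == "parent":
			if pos.parent == nil {
				return nil, errors.New("no enclosing struct")
			}
			pos = pos.parent
		default:
			next, ok := pos.children[elem]
			if !ok {
				return nil, fmt.Errorf("no field %q in %q", elem, pos.name)
			}
			pos = next
		}
	}
	return pos, nil
}

func main() {
	outer := newNode("outer", nil)
	newNode("a", outer)
	inner := newNode("inner", outer)
	newNode("b", inner)
	paths := [][]string{{"parent"}, {"parent", "b"}, {"b"}, {"parent", "parent", "a"}}
	for _, path := range paths {
		n, err := resolve(inner, path)
		if err != nil {
			fmt.Println(path, "error:", err)
			continue
		}
		fmt.Println(path, "->", n.name)
	}
}
```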
|
In case of non-squashed programs we can leverage our descriptions in a
much better way than just blind mutations of binary blobs.
|
This extends Target to contain a map of possible values for each int
flag.
|
It can be useful to retrieve constants by name, for example to
factorize a constant into flag masks in syz-prog2c.
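As an illustration of why a name-to-value lookup helps, the sketch below factorizes a value into named flags. The `factorize` function and the constants map are hypothetical and are not the syz-prog2c code.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// factorize expresses v as an OR of named constants, leaving any bits
// that no constant explains as a trailing hex literal.
func factorize(consts map[string]uint64, v uint64) string {
	names := make([]string, 0, len(consts))
	for name := range consts {
		names = append(names, name)
	}
	sort.Strings(names) // deterministic output order
	var parts []string
	rest := v
	for _, name := range names {
		c := consts[name]
		if c != 0 && v&c == c {
			parts = append(parts, name)
			rest &^= c
		}
	}
	if rest != 0 || len(parts) == 0 {
		parts = append(parts, fmt.Sprintf("%#x", rest))
	}
	return strings.Join(parts, "|")
}

func main() {
	// Example values only; a real map would come from the target description.
	consts := map[string]uint64{"O_RDWR": 0x2, "O_CREAT": 0x40, "O_TRUNC": 0x200}
	fmt.Println(factorize(consts, 0x242))
	fmt.Println(factorize(consts, 0x1042))
}
```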
|
In many cases we can remove all calls that follow the call of interest.
Try this before deleting them one-by-one.
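The idea can be sketched as follows, assuming a program is just a slice of call names and `pred` reports whether a candidate program still shows the behavior of interest. Both the `minimize` signature and this representation are hypothetical simplifications.

```go
package main

import "fmt"

// minimize first tries to drop every call after the call of interest in
// one step, and only then removes the remaining calls one by one.
func minimize(calls []string, target int, pred func([]string) bool) []string {
	if target+1 < len(calls) {
		if cut := calls[:target+1]; pred(cut) {
			calls = cut
		}
	}
	for i := len(calls) - 1; i >= 0; i-- {
		if i == target {
			continue
		}
		cand := append(append([]string{}, calls[:i]...), calls[i+1:]...)
		if pred(cand) {
			calls = cand
			if i < target {
				target-- // the call of interest shifted left
			}
		}
	}
	return calls
}

func main() {
	attempts := 0
	pred := func(p []string) bool {
		attempts++
		has := map[string]bool{}
		for _, c := range p {
			has[c] = true
		}
		return has["open"] && has["mmap"]
	}
	prog := []string{"open", "read", "mmap", "close", "write"}
	fmt.Println(minimize(prog, 2, pred), "attempts:", attempts)
}
```

A successful truncation costs one predicate evaluation instead of one per trailing call.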
|
Only the values of the returned array are of interest.
|
If no matching resource was already present in the program, we used to
substitute a random value in ~50% of cases. That's not efficient.
Restructure the resource generation process so that, if there are no
other options, we generate a new resource in 80% of cases and in the
remaining 20% we substitute an integer.
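A minimal sketch of the 80/20 fallback described above. The `choice` type and the `fallback` function are hypothetical; the random source is passed in as a function so the decision is easy to test.

```go
package main

import (
	"fmt"
	"math/rand"
)

type choice int

const (
	createResource choice = iota
	substituteInt
)

// fallback decides what to do when the program holds no matching resource:
// create a new resource in 4 out of 5 cases, otherwise substitute an integer.
// intn must return a value in [0, n).
func fallback(intn func(n int) int) choice {
	if intn(5) != 0 {
		return createResource
	}
	return substituteInt
}

func main() {
	created := 0
	const total = 10000
	for i := 0; i < total; i++ {
		if fallback(rand.Intn) == createResource {
			created++
		}
	}
	fmt.Printf("created a resource in %.1f%% of cases\n", 100*float64(created)/total)
}
```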
|
During resource argument generation, we used to randomly select one of
the matching resources. With so many descendants of fd, this becomes
quite inefficient and most of the time syzkaller fails to build correct
programs.
Give precise resource constructions priority. Experiment with other
resource types only in 1/3 of cases.
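The priority rule can be sketched like this. `pickKind` and the split into `precise` and `others` are hypothetical names for illustration; how syzkaller actually classifies resource kinds is not shown here.

```go
package main

import (
	"fmt"
	"math/rand"
)

// pickKind chooses which resource kind to construct. precise holds kinds
// that match the requested resource exactly, others holds the remaining
// compatible kinds. Other kinds are tried in roughly 1/3 of cases.
// intn must return a value in [0, n).
func pickKind(intn func(n int) int, precise, others []string) string {
	if len(precise) == 0 && len(others) == 0 {
		return ""
	}
	if len(precise) == 0 || (len(others) != 0 && intn(3) == 0) {
		return others[intn(len(others))]
	}
	return precise[intn(len(precise))]
}

func main() {
	precise := []string{"fd_bpf_map"}
	others := []string{"fd", "fd_dir", "sock"}
	counts := map[string]int{}
	for i := 0; i < 9000; i++ {
		counts[pickKind(rand.Intn, precise, others)]++
	}
	fmt.Println(counts)
}
```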
|
Per profiling, (*GroupArg).Size() calls from this function were one of
the hottest paths during the BenchmarkMutate() benchmark.
Some of those calls are made only to issue a runtime panic, which we
arguably don't need unless we're testing the code.
After the changes:
          │ /tmp/original │              /tmp/new               │
          │    sec/op     │   sec/op     vs base                │
Mutate-36     221.8µ ± 4%   179.0µ ± 3%  -19.31% (p=0.000 n=15)
|
This commit adds support for the following syntax:
int8[constant]
as an equivalent to:
const[constant, int8]
The goal is to have a unified const/flags definition that we can use in
templates. For example:
	type template[CLASS, ...] {
		class	int8:3[CLASS]
		// ...
	}
	type singleClassType template[SINGLE_CONST]
	type subClassType template[abc_class_flags]
In this example, the CLASS template field can be either a constant or a
flag. This is especially useful when defining both a generic instance of
the template as well as specialized instances (ex. bpf_alu_ops and
bpf_add_op).
Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
|
This commit adds support for the following syntax:
int_flags = 1, 5, 8, 9
int32[int_flags]
which is equivalent to:
int_flags = 1, 5, 8, 9
flags[int_flags, int32]
The second int type argument, align, is not allowed if the first
argument is a flag. The compiler will also error if the first argument
appears to be a flag (is ident and has no colon), but can't be found in
the map of flags.
Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
|
Syz-executor fails if rerun > 0 && fail_nth > 0, but we don't do this
check during prog validation.
It works fine when syzkaller runs as a standalone app (because it never
generates such programs), but it can be a problem when receiving progs
from other instances via syz-hub.
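The missing check amounts to something like the sketch below. `callProps` and `validateProps` are hypothetical names that model only the two properties mentioned in this message.

```go
package main

import (
	"errors"
	"fmt"
)

// callProps is a hypothetical subset of per-call properties.
type callProps struct {
	FailNth int
	Rerun   int
}

// validateProps rejects the combination that syz-executor cannot handle.
func validateProps(p callProps) error {
	if p.Rerun > 0 && p.FailNth > 0 {
		return errors.New("rerun > 0 && fail_nth > 0")
	}
	return nil
}

func main() {
	fmt.Println(validateProps(callProps{FailNth: 1}))
	fmt.Println(validateProps(callProps{FailNth: 1, Rerun: 2}))
}
```

Running the check during validation means programs received from other instances are rejected before they reach the executor.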
|
This commit adds a few test cases for the support of AUTO for structs.
It covers:
- A simple struct with only const and len types.
- A nested struct case.
- An error case when a struct has an int type field.
Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
|
The AUTO token can currently be used in place of ConstType, LenType, and
CsumType. This commit extends this support to StructType on the
condition that they are made of the above types. It will allow us to
simplify all the {AUTO, AUTO, AUTO, AUTO} in test programs.
The following struct can therefore be written as AUTO.
	auto_struct1 {
		f0	len[parent, int32]
		f1	const[0x43, int32]
	}
In addition, with this commit, AUTO can also stand for nested structs as
long as all the leaf fields are ConstType, LenType, or CsumType. For
example, the following struct can be written as AUTO.
	auto_struct2 {
		f0	len[parent, int32]
		f1	auto_struct1
	}
Suggested-by: Aleksandr Nogikh <nogikh@google.com>
Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
|
This helper function will be used in a subsequent commit to take a look
ahead at several characters.
Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
|
This simply prints more information when failing to parse unions and
structs in syzkaller programs. This is particularly useful when a
syscall has multiple structs or unions (ex. for bpf$PROG_LOAD).
Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
|
This commit adds more complex unit tests to cover the bug in
getInputResources fixed by the previous commit.
required_res1 and test_args3 cover the case where a struct is included
both as optional and required. required_res2 and test_args4 cover the
case where a struct is included both as DirOut and DirIn. In both cases
the resource should be recognized as being a required input resource for
the syscall.
Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
|
While reviewing the changes introduced by previous commits, Aleksandr
noticed a bug in the logic to compute all required input resources of a
given syscall (function getInputResources). This commit fixes the bug.
That function has an optimization to avoid walking structs and unions
twice. That optimization however fails to consider the direction and
optional bit. For example, if a struct is first encountered with DirOut
(== output resource), it will never be analyzed again and will not be
included in getInputResources' returned values. That same struct could
however be included elsewhere with DirIn.
The exact same issue affects the optional bit since optional resources
are now also skipped by getInputResources.
One approach to fix this is to only consider a struct or union as 'seen'
if it was seen with the same direction and optional bit. This commit
implements that approach.
Fixes: 852e3d2eae98 ("sys: support recursive structs")
Reported-by: Aleksandr Nogikh <nogikh@google.com>
Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
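The fix described in this message can be illustrated with a simplified walker whose 'seen' set is keyed by type, direction, and optional bit. All types and names below (`typ`, `arg`, `seenKey`, `inputResources`) are hypothetical and much simpler than the real getInputResources.

```go
package main

import "fmt"

type dir int

const (
	dirIn dir = iota
	dirOut
)

// typ is a hypothetical type node: either a resource or a struct/union
// with nested fields.
type typ struct {
	name     string
	resource bool
	fields   []*typ
}

// arg is a top-level use of a type with a direction and an optional bit.
type arg struct {
	t        *typ
	d        dir
	optional bool
}

// seenKey makes the "already visited" check sensitive to the context.
type seenKey struct {
	t        *typ
	d        dir
	optional bool
}

// inputResources returns resources that are required inputs.
func inputResources(args []arg) []string {
	seen := map[seenKey]bool{}
	var out []string
	var walk func(t *typ, d dir, optional bool)
	walk = func(t *typ, d dir, optional bool) {
		key := seenKey{t, d, optional}
		if seen[key] {
			return
		}
		seen[key] = true
		if t.resource {
			if d == dirIn && !optional {
				out = append(out, t.name)
			}
			return
		}
		for _, f := range t.fields {
			walk(f, d, optional)
		}
	}
	for _, a := range args {
		walk(a.t, a.d, a.optional)
	}
	return out
}

func main() {
	fd := &typ{name: "fd", resource: true}
	s := &typ{name: "s", fields: []*typ{fd}}
	// The struct is first seen as DirOut, then as DirIn.
	fmt.Println(inputResources([]arg{{s, dirOut, false}, {s, dirIn, false}}))
}
```

With a 'seen' set keyed by the type alone, the second visit of `s` would be skipped and `fd` would be missing from the result.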
|
This commit adds a unit test for getInputResources, to verify in
particular that it doesn't return input resources that are optional.
Note we can't test the built-in "optional[]" because that relies on
unions and those aren't supported yet.
Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
|
If trying to fuzz only bpf$PROG_LOAD, the executors fail with:
SYZFATAL: Manager.Check call failed: machine check failed: all
system calls are disabled
That is happening because it detects a dependency on fd_bpf_map via two
paths:
1. bpf_prog_t.fd_array is an optional pointer to an array of fd_bpf_map.
2. The bpf_insn union contains descriptions for two instructions,
bpf_insn_map_fd and bpf_insn_map_value, that reference fd_bpf_map.
Both of those cases point to optional uses of fd_bpf_map, but syzkaller
isn't able to recognize that today.
This commit addresses the first case, when a resource or one of the
types using it are explicitly marked as optional. Before this commit,
syzkaller was only able to recognize the case where the resource itself
is marked as optional. However, in the case of e.g. bpf_prog_t.fd_array,
it's the pointer to the array of fd_bpf_map that is marked optional.
To fix this, we propagate the optional bit when walking down the AST. We
then pass this propagated bit to the callback function via the context.
This change was tested on the above bpf$PROG_LOAD case 1, by removing
bpf_insn_map_fd and bpf_insn_map_value from the bpf(2) description to
avoid hitting case 2. Addressing case 2 will require more changes to the
same logic.
Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
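Propagating the optional bit down the tree can be sketched as follows. The `typ` node and `requiredResources` are hypothetical and ignore directions and recursive types; the point is only that a resource under an optional pointer is itself treated as optional.

```go
package main

import "fmt"

// typ is a hypothetical type node; elems are the nested types
// (pointee, array element, struct fields).
type typ struct {
	name     string
	resource bool
	optional bool
	elems    []*typ
}

// requiredResources collects resources that are not reachable only
// through an optional type.
func requiredResources(root *typ) []string {
	var out []string
	var walk func(t *typ, optional bool)
	walk = func(t *typ, optional bool) {
		// The bit is inherited from any enclosing optional type.
		optional = optional || t.optional
		if t.resource {
			if !optional {
				out = append(out, t.name)
			}
			return
		}
		for _, e := range t.elems {
			walk(e, optional)
		}
	}
	walk(root, false)
	return out
}

func main() {
	mapFd := &typ{name: "fd_bpf_map", resource: true}
	array := &typ{name: "array", elems: []*typ{mapFd}}
	optPtr := &typ{name: "fd_array", optional: true, elems: []*typ{array}}
	progFd := &typ{name: "fd_bpf_prog", resource: true}
	prog := &typ{name: "bpf_prog_t", elems: []*typ{optPtr, progFd}}
	fmt.Println(requiredResources(prog))
}
```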
|
We had a problem -- using inout ANYUNION leads to syzkaller generating
copyout instructions for fmt[X, resource] types.
Add a validation rule to detect this during tests.
Fix this by supporting (in) for union fields. Previously, all union
field direction attributes were banned as they were making things more
complicated.
The (in) attribute is definitely safe and allows for more flexibility.
|
Prohibit arg direction from being DirIn if other calls use the resource
as input.
Fix one case where we used to violate it - during argument squashing.
Reported-by: John Miller <jm3228520@gmail.com>
|
Unless overridden, arg directions are inherited from parent types.
Let's follow the same logic in validation.go.
|
There seem to be a lot of unclear dependencies between pseudo syscall
code and global methods. By testing them only together we have little
chance to detect these problems because implementations can indirectly
help one another.
In addition to existing tests, also compile all pseudo syscalls
independently.
|
Adds comments about how choice table runs values are populated
based on information from dynamic and static priorities.
|
We already try as hard as possible to not generate escaping (global) filenames.
However, it's possible we read them from the corpus if it happens to contain some.
Also check for escaping filenames during deserialization.
Fixes #3678
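A check for escaping filenames could look roughly like the sketch below. The `escapes` helper is a hypothetical illustration; it treats absolute paths and paths that climb above the working directory as escaping.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// escapes reports whether a filename refers to something outside of the
// current working directory.
func escapes(file string) bool {
	clean := path.Clean(file)
	return strings.HasPrefix(clean, "/") ||
		clean == ".." ||
		strings.HasPrefix(clean, "../")
}

func main() {
	for _, f := range []string{"./file0", "/etc/passwd", "./a/../../b", "..foo"} {
		fmt.Printf("%-14q escapes: %v\n", f, escapes(f))
	}
}
```

Applying the same predicate during deserialization covers programs that come from the corpus rather than from the generator.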
|
It duplicates random calls in a program and makes the duplicated copies
async.
E.g. it could transform
	r0 = test()
	test2(r0)
to
	r0 = test()
	test2(r0) (async)
	test2(r0)
or
	test() (async)
	r0 = test()
	test2(r0)
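The transformation can be sketched on a toy program representation. The `call` type and `duplicateAsync` are hypothetical; the sketch assumes the async copy is placed before the original and drops its result variable, as in the examples in this message.

```go
package main

import "fmt"

// call is a hypothetical, text-only model of a program call.
type call struct {
	ret   string // result variable, empty if unused
	name  string
	args  string
	async bool
}

func (c call) String() string {
	s := c.name + "(" + c.args + ")"
	if c.ret != "" {
		s = c.ret + " = " + s
	}
	if c.async {
		s += " (async)"
	}
	return s
}

// duplicateAsync inserts an async copy of prog[i] right before it.
// The copy does not define a result, so later uses still refer to
// the original call.
func duplicateAsync(prog []call, i int) []call {
	dup := prog[i]
	dup.ret = ""
	dup.async = true
	out := make([]call, 0, len(prog)+1)
	out = append(out, prog[:i]...)
	out = append(out, dup, prog[i])
	out = append(out, prog[i+1:]...)
	return out
}

func main() {
	prog := []call{{ret: "r0", name: "test"}, {name: "test2", args: "r0"}}
	for _, c := range duplicateAsync(prog, 1) {
		fmt.Println(c)
	}
}
```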
|
When we decompress images for mutation or hints,
we always specially check for empty compressed data
(I assume it can appear after minimization).
Treat it as correctly compressed data and return empty decompressed data.
This removes the need for special handling in callers.
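A minimal sketch of this behavior, assuming zlib-compressed data. The `decompress` and `compress` helpers are hypothetical and only illustrate where the empty-input case is handled.

```go
package main

import (
	"bytes"
	"compress/zlib"
	"fmt"
	"io"
)

// decompress treats empty input as valid and returns empty output,
// so callers need no special case for it.
func decompress(compressed []byte) ([]byte, error) {
	if len(compressed) == 0 {
		return []byte{}, nil
	}
	r, err := zlib.NewReader(bytes.NewReader(compressed))
	if err != nil {
		return nil, err
	}
	defer r.Close()
	return io.ReadAll(r)
}

// compress is a helper for the example only.
func compress(data []byte) []byte {
	var buf bytes.Buffer
	w := zlib.NewWriter(&buf)
	w.Write(data)
	w.Close()
	return buf.Bytes()
}

func main() {
	out, err := decompress(nil)
	fmt.Println(len(out), err)
	out, err = decompress(compress([]byte("image")))
	fmt.Println(string(out), err)
}
```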
|
Images are very large so the generic algorithm for data arguments
can produce too many mutants. For images we consider only
4/8-byte aligned ints. This is enough to handle all magic
numbers and checksums. We also ignore 0 and ^uint64(0) source bytes,
because there are too many of these in lots of images.
With this change the fuzzer was able to get past magic checks
in all of the following functions with our fake images:
- in fs/befs/super.c befs_check_sb()
- in fs/freevxfs/vxfs_super.c vxfs_fill_super()
- in fs/hpfs/super.c hpfs_fill_super()
- in fs/omfs/inode.c omfs_fill_super()
- in fs/qnx6/inode.c qnx6_check_first_superblock()
- in fs/ufs/super.c ufs_fill_super()
It even successfully mounted a sysv filesystem and triggered
"sleeping function called from invalid context in __getblk_gfp"
when opening a file in the mounted filesystem.
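The candidate selection described above can be sketched as follows. The `candidate` type and `candidates` function are hypothetical; little-endian decoding and skipping the all-ones 4-byte value are assumptions of this sketch, not statements about the actual code.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// candidate is an aligned integer found in the image data.
type candidate struct {
	off, size int
	val       uint64
}

// candidates scans only 4-byte and 8-byte aligned positions and skips
// all-zero and all-ones values, which are too common to be useful.
func candidates(data []byte) []candidate {
	var res []candidate
	for off := 0; off+4 <= len(data); off += 4 {
		v32 := uint64(binary.LittleEndian.Uint32(data[off:]))
		if v32 != 0 && v32 != 0xffffffff {
			res = append(res, candidate{off, 4, v32})
		}
		if off%8 == 0 && off+8 <= len(data) {
			v64 := binary.LittleEndian.Uint64(data[off:])
			if v64 != 0 && v64 != ^uint64(0) {
				res = append(res, candidate{off, 8, v64})
			}
		}
	}
	return res
}

func main() {
	data := make([]byte, 16)
	binary.LittleEndian.PutUint32(data[4:], 0xdeadbeef) // a fake magic number
	for _, c := range candidates(data) {
		fmt.Printf("off=%d size=%d val=%#x\n", c.off, c.size, c.val)
	}
}
```

Restricting the scan to aligned positions keeps the number of mutants proportional to the image size divided by four rather than to every byte offset and width.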
|
Benchmark results:
name          old time/op    new time/op    delta
Decompress-8    24.7ms ± 1%    13.4ms ± 4%  -45.81%  (p=0.000 n=16+19)
name          old alloc/op   new alloc/op   delta
Decompress-8    67.2MB ± 0%     0.0MB ± 1%  -99.98%  (p=0.000 n=18+20)
name          old allocs/op  new allocs/op  delta
Decompress-8       188 ± 0%       167 ± 0%  -11.17%  (p=0.000 n=20+20)
Test process memory consumption drops from 220MB to 80MB.
|
Change DecompressWriter to DecompressCheck: checking validity
of the image is the only useful use of DecompressWriter.
Change Decompress to MustDecompress which does not return an error.
We check validity during program deserialization, so all other
uses already panic on errors.
Also add dtor return value in preparation for subsequent changes.
|
If we do, then our code will fail/crash on decompression.
|
Now that images are not linux-specific,
we can move all image-related logic directly into prog package
and significantly simplify the logic.
|
Move image compression-related functions to a separate package.
This is in preparation for subsequent changes that make decompression
more complex. The prog package is already large and complex.
This also makes running compression tests/benchmarks much faster.
|
Use "unknown" revision as a default.
|
Also adjust number of mutations per image.
We do one per 128K of interesting data. But we have no single seed
that has so much interesting data. Here is the amount of interesting data in seeds:
https://gist.github.com/dvyukov/566b20364610d80e5d0534524546c3f0
Reduce this chunk size to 4K.
|
Provide NumMutations method instead of Size.
It allows HeatMap to choose number of mutations better
(e.g. for completely empty/flat images w/o interesting data).
|
Currently we uncompress all images in Deserialize to check that the data is valid.
As a result, deserializing all seeds we have takes ~40 seconds of real time
and ~125 seconds of CPU time. And we do this during every syz-manager start.
Don't materialize the uncompressed image.
This reduces real time to ~15 seconds and CPU time to 18 seconds (no garbage collections).
In syz-manager the benefit is even larger since garbage collections take longer (larger heap).
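Validating without materializing the output can be sketched by streaming the decompressed bytes into a sink that discards them. The `decompressCheck` name and the use of zlib here are assumptions made for the illustration.

```go
package main

import (
	"bytes"
	"compress/zlib"
	"fmt"
	"io"
)

// decompressCheck verifies that the data decompresses cleanly without
// keeping the decompressed image in memory.
func decompressCheck(compressed []byte) error {
	r, err := zlib.NewReader(bytes.NewReader(compressed))
	if err != nil {
		return err
	}
	defer r.Close()
	// io.Copy reuses a small fixed buffer, so heap usage does not grow
	// with the size of the image.
	_, err = io.Copy(io.Discard, r)
	return err
}

// compress is a helper for the example only.
func compress(data []byte) []byte {
	var buf bytes.Buffer
	w := zlib.NewWriter(&buf)
	w.Write(data)
	w.Close()
	return buf.Bytes()
}

func main() {
	image := make([]byte, 1<<20) // a mostly empty fake image
	fmt.Println(decompressCheck(compress(image)))
	fmt.Println(decompressCheck([]byte("garbage")) != nil)
}
```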
|
The current Uint64 produces uniformly distributed integers.
Guessing integers in the full 64-bit range is very inefficient.
Use the randInt function that we use in normal generation/mutation instead:
it produces a variety of more interesting values with higher probability.
|