| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
| |
any is now preferred over interface{} in Go.
|
| |
|
|
|
|
|
|
| |
Refactor Prog.Serialize() to accept a variadic list of flags.
For now, two are supported:
1) Verbose (equal to SerializeVerbose()).
2) SkipImages (don't serialize fs images).
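The variadic-flag shape described above can be sketched as follows. This is only an illustration: `SerializeFlag` and `serialize` are hypothetical stand-ins, not the actual prog package API, and the function merely reports which flags were seen.

```go
package main

import "fmt"

// SerializeFlag is a hypothetical option type; the constant names mirror
// the commit message but are not copied from the prog package.
type SerializeFlag int

const (
	Verbose SerializeFlag = iota
	SkipImages
)

// serialize is a stand-in for Prog.Serialize. It only reports which
// flags were requested instead of producing program text.
func serialize(flags ...SerializeFlag) string {
	verbose, skipImages := false, false
	for _, f := range flags {
		switch f {
		case Verbose:
			verbose = true
		case SkipImages:
			skipImages = true
		}
	}
	return fmt.Sprintf("verbose=%v skipImages=%v", verbose, skipImages)
}

func main() {
	fmt.Println(serialize())                    // no flags: default behaviour
	fmt.Println(serialize(Verbose, SkipImages)) // both flags set
}
```

Callers that need no options keep the short call form, and new flags can be added later without changing existing call sites.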
|
| |
|
|
|
| |
go install golang.org/x/tools/cmd/deadcode@latest
deadcode -test ./...
|
| |
|
|
| |
They are shorter, more readable, and don't require temp vars.
|
| | |
|
| |
|
|
|
|
|
| |
Allow serializing and deserializing DataMmapProg
and fix validation in debug mode.
Fixes #4750
|
| |
|
|
|
|
|
| |
Raw deserialization mode does not do any program sanitization
and allows the use of global file names, prohibited ioctls, etc.
This will be useful for moving syscall/feature checking code
to the host, where we will need to probe opening global files, etc.
|
| |
| |
This fixes 3 issues:
1. We intended to squash only 'in' pointer elems,
but we looked at the pointer direction rather than elem direction.
Since pointers themselves are always 'in' we squashed a number of
types we didn't want to squash.
2. We can squash filenames, which can lead to generation of escaping filenames,
e.g. fuzzer managed to create "/" filename for blockdev_filename as:
mount(&(0x7f0000000000)=ANY=[@ANYBLOB='/'], ...)
Don't squash filenames.
3. We analyzed a concrete arg to see if it contains something
we don't want to squash (e.g. pointers). But the whole type
can still contain unsupported things in inactive union options,
or in 0-sized arrays. E.g. this happened in the mount case above.
Analyze the whole type to check for unsupported things.
This also moves most of the analysis to the compiler,
so mutation will be a bit faster.
This removes the following linux types from squashing.
1. These are not 'in':
btrfs_ioctl_search_args_v2
btrfs_ioctl_space_args
ethtool_cmd_u
fscrypt_add_key_arg
fscrypt_get_policy_ex_arg
fsverity_digest
hiddev_ioctl_string_arg
hidraw_report_descriptor
ifreq_dev_t[devnames, ptr[inout, ethtool_cmd_u]]
ifreq_dev_t[ipv4_tunnel_names, ptr[inout, ip_tunnel_parm]]
ifreq_dev_t["sit0", ptr[inout, ip_tunnel_prl]]
io_uring_probe
ip_tunnel_parm
ip_tunnel_prl
poll_cq_resp
query_port_cmd
query_qp_resp
resize_cq_resp
scsi_ioctl_probe_host_out_buffer
sctp_assoc_ids
sctp_authchunks
sctp_getaddrs
sctp_getaddrs_old
2. These contain pointers:
binder_objects
iovec[in, netlink_msg_route_sched]
iovec[in, netlink_msg_route_sched_retired]
msghdr_netlink[netlink_msg_route_sched]
msghdr_netlink[netlink_msg_route_sched_retired]
nvme_of_msg
3. These contain filenames:
binfmt_script
blockdev_filename
netlink_msg_route_sched
netlink_msg_route_sched_retired
selinux_create_req
|
| |
|
|
|
|
|
|
|
| |
Treat all default union arguments as transient and reevaluate them after
the call was fully parsed.
Before conditional field patching, we do need to have performed arg
validation, which also reevaluates conditions. To break the cycle, make
validation configurable.
|
| |
|
|
| |
This reverts commit 8e75c913b6f9b09cab2ad31fd7d66ea0d1703de8.
|
| |
|
|
|
|
|
|
|
| |
Treat all default union arguments as transient and reevaluate them after
the call was fully parsed.
Before conditional field patching, we do need to have performed arg
validation, which also reevaluates conditions. To break the cycle, make
validation configurable.
|
| |
|
|
| |
Include the name of the union and list the correct options.
|
| |
|
|
|
| |
Simplify seed program debugging by highlighting the actual error
location in the source text.
|
| |
| |
The AUTO token can currently be used in place of ConstType, LenType, and
CsumType. This commit extends this support to StructType on the
condition that they are made of the above types. It will allow us to
simplify all the {AUTO, AUTO, AUTO, AUTO} in test programs.
The following struct can therefore be written as AUTO.
auto_struct1 {
f0 len[parent, int32]
f1 const[0x43, int32]
}
In addition, with this commit, AUTO can also stand for nested structs as
long as all the leaf fields are ConstType, LenType, or CsumType. For
example, the following struct can be written as AUTO.
auto_struct2 {
f0 len[parent, int32]
f1 auto_struct1
}
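The eligibility rule for AUTO described above (every leaf is a const, len, or csum field) could be checked recursively along these lines. `Kind`, `Type`, and `canBeAuto` are simplified, hypothetical stand-ins for illustration and do not reproduce the real prog types.

```go
package main

import "fmt"

// Kind and Type are hypothetical, simplified stand-ins for prog types.
type Kind int

const (
	KindConst Kind = iota
	KindLen
	KindCsum
	KindInt
	KindStruct
)

type Type struct {
	Kind   Kind
	Fields []*Type // only meaningful for KindStruct
}

// canBeAuto reports whether a value of type t could be written as AUTO:
// const/len/csum leaves qualify, and a struct qualifies if all of its
// fields (recursively) qualify.
func canBeAuto(t *Type) bool {
	switch t.Kind {
	case KindConst, KindLen, KindCsum:
		return true
	case KindStruct:
		for _, f := range t.Fields {
			if !canBeAuto(f) {
				return false
			}
		}
		return true
	default:
		return false
	}
}

func main() {
	autoStruct1 := &Type{Kind: KindStruct, Fields: []*Type{{Kind: KindLen}, {Kind: KindConst}}}
	autoStruct2 := &Type{Kind: KindStruct, Fields: []*Type{{Kind: KindLen}, autoStruct1}}
	withInt := &Type{Kind: KindStruct, Fields: []*Type{{Kind: KindInt}, autoStruct1}}
	fmt.Println(canBeAuto(autoStruct1), canBeAuto(autoStruct2), canBeAuto(withInt))
}
```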
Suggested-by: Aleksandr Nogikh <nogikh@google.com>
Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
|
| |
|
|
|
|
|
| |
This helper function will be used in a subsequent commit to take a look
ahead at several characters.
Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
|
| |
|
|
|
|
|
|
| |
This simply prints more information when failing to parse unions and
structs in syzkaller programs. This is particularly useful when a
syscall has multiple structs or unions (ex. for bpf$PROG_LOAD).
Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
|
| | |
|
| |
|
|
|
|
|
|
|
|
| |
Change DecompressWriter to DecompressCheck: checking validity
of the image is the only useful use of DecompressWriter.
Change Decompress to MustDecompress which does not return an error.
We check validity during program deserialization, so all other
uses already panic on errors.
Also add dtor return value in preparation for subsequent changes.
|
| |
|
|
|
|
|
| |
Move image compression-related functions to a separate package,
in preparation for subsequent changes that make decompression
more complex. The prog package is already large and complex.
This also makes running compression tests/benchmarks much faster.
|
| |
|
|
|
|
|
|
|
|
| |
Currently we uncompress all images in Deserialize to check that the data is valid.
As a result, deserializing all seeds we have takes ~40 seconds of real time
and ~125 seconds of CPU time. And we do this during every syz-manager start.
Don't materialize the uncompressed image.
This reduces real time to ~15 seconds and CPU time to 18 seconds (no garbage collections).
In syz-manager the benefit is even larger since garbage collections take longer (larger heap).
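A minimal sketch of validating compressed data without materializing the uncompressed image: the decompressor output is streamed into `io.Discard`. The use of zlib and the helper names `checkCompressed` and `compress` are assumptions made for illustration; the commit message does not name the format or the functions.

```go
package main

import (
	"bytes"
	"compress/zlib"
	"fmt"
	"io"
)

// checkCompressed validates data by streaming it through the decompressor
// and discarding the output, so the full image is never held in memory.
// zlib is an assumption here; the actual format may differ.
func checkCompressed(data []byte) error {
	zr, err := zlib.NewReader(bytes.NewReader(data))
	if err != nil {
		return err
	}
	defer zr.Close()
	_, err = io.Copy(io.Discard, zr)
	return err
}

// compress is a small helper used only to produce valid test input.
func compress(data []byte) []byte {
	var buf bytes.Buffer
	zw := zlib.NewWriter(&buf)
	zw.Write(data)
	zw.Close()
	return buf.Bytes()
}

func main() {
	fmt.Println(checkCompressed(compress([]byte("image bytes"))))
	fmt.Println(checkCompressed([]byte("not zlib")) != nil)
}
```

Because nothing is accumulated, memory use stays bounded by the decompressor's internal buffers rather than by the image size.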
|
| |
|
|
|
|
| |
Currently it can fail if there's never a closing quote.
Add a test to verify this behavior.
|
| |
| |
The new "$..." syntax is read as Base64-encoded binary data.
Note that users cannot specify the size of the Base64 syntax using the
`"..."/<size>` notation.
When serialising programs to human-readable form, only compressed types
(determined by `IsCompressed()`) are represented using the new Base64
notation.
Also add a couple of serialisation tests, checking behaviour for
compressed and non-compressed types.
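As a rough illustration of the notation, the sketch below wraps standard Base64 in a quoted string with a `$` prefix and decodes it back. The exact quoting, the choice of the standard Base64 alphabet, and the helper names `encodeData` and `decodeData` are assumptions, not taken from the prog package.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"strings"
)

// encodeData renders data as a quoted literal with a '$' prefix followed
// by standard Base64. The precise textual form is an assumption.
func encodeData(data []byte) string {
	return `"$` + base64.StdEncoding.EncodeToString(data) + `"`
}

// decodeData reverses encodeData and rejects literals without the prefix.
func decodeData(s string) ([]byte, error) {
	if len(s) < 3 || !strings.HasPrefix(s, `"$`) || !strings.HasSuffix(s, `"`) {
		return nil, fmt.Errorf("not a Base64 literal: %q", s)
	}
	return base64.StdEncoding.DecodeString(s[2 : len(s)-1])
}

func main() {
	lit := encodeData([]byte("hi"))
	fmt.Println(lit)
	data, err := decodeData(lit)
	fmt.Println(string(data), err)
}
```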
|
| |
| |
builtin
Create the `BufferCompressed` kind of `BufferType`, which will be used
to represent compressed data. Create the corresponding `compressed_image`
syzlang builtin, which is backed by `BufferCompressed`. For now, no
syscalls use this feature - this will be introduced in future commits.
We have to be careful to decompress the data before mutating, and
re-compress before storing. We make sure that any deserialised
`BufferCompressed` data is valid too.
`BufferCompressed` arguments are mutated using a generic heatmap. In
future, we could add variants of `BufferCompressed` or populate the
`BufferType` sub-kind, using it to choose different kinds of heatmap for
different uncompressed data formats.
Various operations on compressed data must be forbidden, so we check for
`BufferCompressed` in key places. We also have to ensure `compressed_image`
can only be used in syscalls that are marked `no_{generate,minimize}`.
Therefore, we add a generic compiler check which allows type
descriptions to require attributes on the syscalls which use them.
|
| |
|
|
|
| |
Error if program variable refers to non-resource in strict parsing mode.
Such errors are hard to diagnose otherwise since the variable is silently discarded.
|
| |
| |
Replace the currently existing straightforward approach to race triggering
(that was almost entirely implemented inside syz-executor) with a more
flexible one.
The `async` call property instructs syz-executor not to block until the
call has completed execution and proceed immediately to the next call.
The decision on what calls to mark with `async` is made by syz-fuzzer.
Ultimately this should let us implement more intelligent race provoking
strategies as well as make more fine-grained reproducers.
|
| |
|
|
|
| |
reflect.Value.IsZero is added in go1.13, not available in Appengine SDK.
Replace it with DeepEqual+Zero.
|
| |
| |
Now that call properties mechanism is implemented, we can refactor
fault injection.
Unfortunately, it is impossible to remove all traces of the previous approach.
In reprolist and while performing syz-ci jobs, syzkaller still needs to
parse the old format.
Remove the old prog options-based approach whenever possible and replace
it with the use of call properties.
|
| |
|
|
|
|
|
|
|
| |
Call properties let us specify how each individual call within a program
must be executed. So far the only way to enforce extra rules was to pass
extra program-level properties (e.g. that is how fault injection was done).
However, it entangles the logic and is not flexible enough.
Implement an ability to pass properties along with each individual call.
|
| |
|
|
|
| |
Create a constructor for the prog.Call type. It reduces
code duplication now and during further changes.
|
| |
| |
Corpus may accumulate glob values that are already filtered out
by descriptions (e.g. some harmful files), for an example see:
https://groups.google.com/g/syzkaller-bugs/c/W_R0O4XWpfY/m/sdwwg2_hAwAJ
Pass glob files to the manager and filter out values that
are not present in the glob already.
Also use the same caching scheme we use for features and
enabled syscalls so that fuzzers don't need to scan globs every time.
|
| |
| |
* all: add new typename dirname
The current way to check files under sysfs or proc is:
- define a string to represent each file
- open the file
- pass the fd to write / read / close
The issues with this approach are:
- We need to know which files are present on the target device
- We need to write an openat for each file
With dirname added, one file in the directory is opened
at random and its fd is then passed to
write/read/close.
* all: use typename glob to match filename
Fixes #481
|
| |
|
|
|
|
|
| |
We use bufio.Scanner and it has a mandatory limit on line length.
The file system tests (sys/linux/test/syz_mount_image_*) have
very long lines (megabytes).
Remove the restriction on line length.
|
| | |
|
| | |
|
| |
|
|
|
|
|
|
|
| |
Calls to alloc didn't respect the alignment attribute. Now
Type.Alignment() is used to ensure each type is correctly
aligned. Existing descriptions with [align[X]] don't have an
issue as they align to small blocks and default align is to
64 bytes. This commit adds support for [align[X]] for an X
larger than 64.
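Rounding an address up to a type's alignment is typically done as in the sketch below. `alignUp` is a hypothetical helper, not the allocator code from the commit, and it assumes the alignment is a power of two.

```go
package main

import "fmt"

// alignUp rounds addr up to the next multiple of align.
// It assumes align is a non-zero power of two.
func alignUp(addr, align uint64) uint64 {
	return (addr + align - 1) &^ (align - 1)
}

func main() {
	fmt.Println(alignUp(65, 64))   // next 64-byte boundary after 65
	fmt.Println(alignUp(1, 4096))  // alignment larger than 64
	fmt.Println(alignUp(128, 128)) // already aligned, unchanged
}
```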
|
| |
|
|
| |
Points to bad empty lines very precisely.
|
| |
| |
Remove FieldName from Type and add a separate Field type
that holds the field name. Use Field for struct fields, union options
and syscall arguments; only these really have names.
Reduces size of sys/linux/gen/amd64.go from 5665583 to 5201321 (-8.2%).
Avoids creating a new type for the squashed any pointer.
But main advantages will follow, e.g. removing StructDesc,
using TypeRef in Arg, etc.
Update #1580
|
| |
|
|
|
|
|
| |
Name "Type" is confusing when referring to pointer/array element type.
Frequently there are too many Type/typ/typ1/t and typ.Type is not very informative.
It _is_ a type, but what's usually more relevant is that it's an _element_ type.
Let's leave type checking to compiler and give it a more meaningful name.
|
| |
|
| |
Having Dir in Type is handy, but forces us to duplicate lots of types.
E.g. if a struct is referenced as both in and out, then we need to
have 2 copies and 2 copies of structs/types it includes.
It also prevents us from having the struct type as struct identity
(because we can have up to 3 of them).
Revert to the old way we used to do it: propagate Dir as we walk
syscall arguments. This moves lots of dir passing from pkg/compiler
to prog package.
Now Arg contains the dir, so once we build the tree, we can use dirs
as before.
Reduces size of sys/linux/gen/amd64.go from 6058336 to 5661150 (-6.6%).
Update #1580
|
| |
|
|
|
| |
Don't error on wrong vma with value in non strict mode.
Add more tests and fix use of cmp package (prog.Syscall is not comparable anymore).
|
| |
| |
We will need a wrapper for target.SanitizeCall that will do more
than just calling the target-provided function. To avoid confusion
and potential mistakes, give the target function and prog function
different names. Prog package will continue to call this "sanitize",
which will include target's "neutralize" + more.
Also refactor API a bit: we need a helper function that sanitizes
the whole program because that's needed most of the time.
Fixes #477
Fixes #502
|
| |
| |
We have _some_ limits on program length, but they are really soft.
When we ask to generate a program with 10 calls, sometimes we get
100-150 calls. There are also no checks when we accept external
programs from corpus/hub. Issue #1630 contains an example where
this crashes the VM (the executor limit of 1000 resources is
violated). Larger programs also harm the process overall (slower,
consume more memory, lead to monster reproducers, etc).
Add a set of measures for hard control over program length.
Ensure that generated/mutated programs are not too long;
drop too long programs coming from corpus/hub in manager;
drop too long programs in hub.
As a bonus, ensure that mutation doesn't produce programs with
0 calls (which is currently possible and happens).
Fixes #1630
|
| |
|
|
|
|
|
|
|
|
| |
We are seeing some one-off panics during deserialization
and it's unclear if it's machine memory corruption or
an actual bug in prog. I lean towards machine memory corruption,
but it's impossible to prove without seeing the original program.
Move git revision to prog, as it's a more base package
(sys can import prog, prog can't import sys).
|
| |
|
|
| |
See the added test for details.
|
| |
|
|
|
|
|
|
| |
String value enforcement broke a number of tests
where we use different values.
Be more strict about what string values we use in tests.
This required adding tmpfs descriptions to test syz_mount_image.
Also special-case AF_ALG algorithms, as these are auto-generated.
|
| |
|
|
|
|
|
|
|
|
| |
Strings with enumerated values are frequently file names
or have complete enumeration of relevant values.
Mutating a complete enumeration is not very profitable.
Mutating file names leads to escaping paths and
fuzzer messing with things it is not supposed to mess with as in:
r0 = openat$apparmor_task_exec(0xffffffffffffff9c, &(0x7f0000000440)='/proc/self//exe\x00', 0x3, 0x0)
|
| | |
|
| |
|
|
|
|
|
| |
We get them in cross-compilation test where an out const
arg has different values in different archs.
No reason to fail deserialization in that case, replace with default
arg instead.
|
| |
|
|
|
|
|
| |
The syz-expand tool allows to parse a program and print it including all
the default values. This is mainly useful for debugging, like doing manual
program modifications while trying to come up with a reproducer for some
particular kernel behavior.
|
| | |
|