| author | Dmitry Vyukov <dvyukov@google.com> | 2019-03-14 10:27:11 +0100 |
|---|---|---|
| committer | Dmitry Vyukov <dvyukov@google.com> | 2019-03-14 10:27:11 +0100 |
| commit | d34313cd5d5c25ea3a914140f25168738dc96aef (patch) | |
| tree | c2f642487ab8d775e8c1d40f19e9b60a8faaf6e4 /docs/syscall_descriptions.md | |
| parent | 375815261dbedeaf0e02581d50be9980c9eef8b7 (diff) | |
docs: extend descriptions/programs docs
Extend doc on descriptions, const generation process,
add more links to internals, explain programs, etc.
Clarify that all generated files are checked in.
Diffstat (limited to 'docs/syscall_descriptions.md')
| -rw-r--r-- | docs/syscall_descriptions.md | 118 |
1 files changed, 89 insertions, 29 deletions
diff --git a/docs/syscall_descriptions.md b/docs/syscall_descriptions.md
index 2e4f6bb9c..115e8046f 100644
--- a/docs/syscall_descriptions.md
+++ b/docs/syscall_descriptions.md
@@ -1,7 +1,8 @@
 # Syscall descriptions
-`syzkaller` uses declarative description of syscalls to generate, mutate, minimize, serialize and deserialize programs (sequences of syscalls).
-Below you can see (hopefully self-explanatory) excerpt from the description:
+`syzkaller` uses declarative description of syscall interfaces to manipulate
+programs (sequences of syscalls). Below you can see (hopefully self-explanatory)
+excerpt from the description:
 ```
 open(file filename, flags flags[open_flags], mode flags[open_mode]) fd
@@ -10,32 +11,70 @@ close(fd fd)
 open_mode = S_IRUSR, S_IWUSR, S_IXUSR, S_IRGRP, S_IWGRP, S_IXGRP, S_IROTH, S_IWOTH, S_IXOTH
 ```
-The description is contained in `sys/linux/*.txt` files.
-For example see the [sys/linux/sys.txt](/sys/linux/sys.txt) file.
+The description is contained in `sys/OS/*.txt` files.
+For example see the [sys/linux/dev_snd_midi.txt](/sys/linux/dev_snd_midi.txt) file
+for descriptions of the Linux MIDI interfaces.
-## Syntax
+A more formal description of the description syntax can be found [here](syscall_descriptions_syntax.md).
-The description of the syntax can be found [here](syscall_descriptions_syntax.md).
+## Description compilation
-## Code generation
+These textual syscall descriptions are then compiled into machine-usable form used by `syzkaller`
+to actually generate programs. This process consists of 2 steps.
-Textual syscall descriptions are translated into code used by `syzkaller`.
-This process consists of 2 steps.
-The first step is extraction of values of symbolic constants from Linux sources using `syz-extract` utility.
-`syz-extract` generates a small C program that includes kernel headers referenced by `include` directives,
-defines macros as specified by `define` directives and prints values of symbolic constants.
+The first step is extraction of values of symbolic constants from kernel sources using
+[syz-extract](/sys/syz-extract) utility. `syz-extract` generates a small C program that
+includes kernel headers referenced by `include` directives, defines macros as specified
+by `define` directives and prints values of symbolic constants.
 Results are stored in `.const` files, one per arch.
-For example, [sys/linux/dev_ptmx.txt](/sys/linux/dev_ptmx.txt) is translated into [sys/linux/dev_ptmx_amd64.const](/sys/linux/dev_ptmx_amd64.const).
+For example, [sys/linux/dev_ptmx.txt](/sys/linux/dev_ptmx.txt) is translated into
+[sys/linux/dev_ptmx_amd64.const](/sys/linux/dev_ptmx_amd64.const).
+
+The second step is translation of descriptions into Go code using
+[syz-sysgen](/sys/syz-sysgen) utility (the actual compiler code lives in
+[pkg/ast](/pkg/ast/) and [pkg/compiler](/pkg/compiler/)).
+This step uses syscall descriptions and the const files generated during the first step
+and produces instantiations of `Syscall` and `Type` types defined in [prog/types.go](/prog/types.go).
+Here is an [example](/sys/akaros/gen/amd64.go) of the compiler output for Akaros.
+This step also generates some minimal syscall metadata for C++ code in
+[executor/syscalls.h](/executor/syscalls.h).
+
+## Programs
-The second step is generation of Go code for syzkaller.
-This step uses syscall descriptions and the const files generated during the first step.
-You can see a result in [sys/linux/gen/amd64.go](/sys/linux/gen/amd64.go) and in [executor/syscalls.h](/executor/syscalls.h).
+The translated descriptions are then used to generate, mutate, execute, minimize, serialize
+and deserialize programs. A program is a sequences of syscalls with concrete values for arguments.
+Here is an example (of a textual representation) of a program:
+
+```
+mmap(&(0x7f0000000000), (0x1000), 0x3, 0x32, -1, 0)
+r0 = open(&(0x7f0000000000)="./file0", 0x3, 0x9)
+read(r0, &(0x7f0000000000), 42)
+close(r0)
+```
+
+For actual manipulations `syzkaller` uses in-memory AST-like representation consisting of
+`Call` and `Arg` values defined in [prog/prog.go](/prog/prog.go). That representation is used to
+[analyze](/prog/analysis.go), [generate](/prog/rand.go), [mutate](/prog/mutation.go),
+[minimize](/prog/minimization.go), [validate](/prog/validation.go), etc programs.
+
+The in-memory representation can be [transformed](/prog/encoding.go) to/from
+textual form to store in on-disk corpus, show to humans, etc.
+
+There is also another [binary representation](https://github.com/google/syzkaller/blob/master/prog/decodeexec.go)
+of the programs (called `exec`), that is much simpler, does not contains rich type information (irreversible)
+and is used for actual execution (interpretation) of programs by [executor](/executor/executor.cc).
 ## Describing new system calls
 This section describes how to extend syzkaller to allow fuzz testing of a new system call; this is particularly useful for kernel developers who are proposing new system calls.
+Syscall interfaces are manually-written. There is an
+[open issue](https://github.com/google/syzkaller/issues/590) to provide some aid
+for this process and some ongoing work, but we are yet there.
+There is also [headerparser](headerparser_usage.md) utility that can auto-generate
+some parts of descriptions from header files.
+
 First, add a declarative description of the new system call to the appropriate file:
 - Various `sys/linux/<subsystem>.txt` files hold system calls for particular kernel subsystems, for example `bpf` or `socket`.
@@ -44,22 +83,43 @@ First, add a declarative description of the new system call to the appropriate f
 The description of the syntax can be found [here](syscall_descriptions_syntax.md).
-If the subsystem is present in the mainline kernel, run `make extract TARGETOS=linux SOURCEDIR=$KSRC`
-with `$KSRC` set to the location of a kernel source tree. This will generate const files.
-Note, that this will overwrite `.config` file you have in `$KSRC`.
+After adding/changing descriptions run:
+```
+make extract TARGETOS=linux SOURCEDIR=$KSRC
+make generate
+make
+```
+
+Here `make extract` generates/updates the `*.const` files.
+`$KSRC` should point to the _latest_ kernel checkout.\
+Note: `make extract` overwrites `.config` in `$KSRC` and `mrproper`'s it.
+
+Then `make generate` updates generated code and `make` rebuilds binaries.\
+Note: `make generate` does not require any kernel sources, native compilers, etc
+and is pure text processing.
+
+Note: _all_ generated files (`*.const`, `*.go`, `*.h`) are checked-in with the
+`*.txt` changes in the same commit.
+
+If you want to fuzz the new subsystem that you described locally, you may find
+the `enable_syscalls` configuration parameter useful to specifically target
+the new system calls.
+
+## Non-mainline subsystems
+
+`make extract` extracts constants for all `*.txt` files and for all supported architectures.
+This may not work for subsystems that are not present in mainline kernel or if you have
+problems with native kernel compilers, etc. In such cases the `syz-extract` utility
+used by `make extract` can be run manually for single file/arch as:
-If the subsystem is not present in the mainline kernel, then you need to manually run `syz-extract` binary:
 ```
 make bin/syz-extract
-bin/syz-extract -os linux -arch $ARCH -sourcedir "$LINUX" -builddir "$LINUXBLD" <new>.txt
+bin/syz-extract -os linux -arch $ARCH -sourcedir $KSRC -builddir $LINUXBLD <new>.txt
 ```
+
 `$ARCH` is one of `amd64`, `386` `arm64`, `arm`, `ppc64le`. If the subsystem is supported on several architectures, then run `syz-extract` for each arch.
-`$LINUX` should point to kernel source checkout, which is configured for the corresponding arch (i.e. you need to run `make someconfig && make` there first).
-If the kernel was built into a separate directory (with `make O=...`) then also set `$LINUXBLD` to the location of the build directory.
-
-Then, run `make generate` and `make` which will update generated code and rebuild binaries.
-
-Optionally, adjust the `enable_syscalls` configuration value for syzkaller to specifically target the new system calls.
-
-In order to partially auto-generate system call descriptions you can use [headerparser](headerparser_usage.md).
+`$LINUX` should point to kernel source checkout, which is configured for the
+corresponding arch (i.e. you need to run `make someconfig && make` there first).
+If the kernel was built into a separate directory (with `make O=...`) then also
+set `$LINUXBLD` to the location of the build directory.
