path: root/docs/syscall_descriptions.md
author: Dmitry Vyukov <dvyukov@google.com> 2019-03-14 10:27:11 +0100
committer: Dmitry Vyukov <dvyukov@google.com> 2019-03-14 10:27:11 +0100
commit: d34313cd5d5c25ea3a914140f25168738dc96aef
tree: c2f642487ab8d775e8c1d40f19e9b60a8faaf6e4 /docs/syscall_descriptions.md
parent: 375815261dbedeaf0e02581d50be9980c9eef8b7
docs: extend descriptions/programs docs
Extend doc on descriptions, const generation process, add more links to internals, explain programs, etc. Clarify that all generated files are checked in.
Diffstat (limited to 'docs/syscall_descriptions.md')
-rw-r--r-- docs/syscall_descriptions.md 118
1 files changed, 89 insertions, 29 deletions
diff --git a/docs/syscall_descriptions.md b/docs/syscall_descriptions.md
index 2e4f6bb9c..115e8046f 100644
--- a/docs/syscall_descriptions.md
+++ b/docs/syscall_descriptions.md
@@ -1,7 +1,8 @@
# Syscall descriptions
-`syzkaller` uses declarative description of syscalls to generate, mutate, minimize, serialize and deserialize programs (sequences of syscalls).
-Below you can see (hopefully self-explanatory) excerpt from the description:
+`syzkaller` uses declarative description of syscall interfaces to manipulate
+programs (sequences of syscalls). Below you can see (hopefully self-explanatory)
+excerpt from the description:
```
open(file filename, flags flags[open_flags], mode flags[open_mode]) fd
@@ -10,32 +11,70 @@ close(fd fd)
open_mode = S_IRUSR, S_IWUSR, S_IXUSR, S_IRGRP, S_IWGRP, S_IXGRP, S_IROTH, S_IWOTH, S_IXOTH
```
-The description is contained in `sys/linux/*.txt` files.
-For example see the [sys/linux/sys.txt](/sys/linux/sys.txt) file.
+The description is contained in `sys/OS/*.txt` files.
+For example see the [sys/linux/dev_snd_midi.txt](/sys/linux/dev_snd_midi.txt) file
+for descriptions of the Linux MIDI interfaces.
-## Syntax
+A more formal description of the description syntax can be found [here](syscall_descriptions_syntax.md).
-The description of the syntax can be found [here](syscall_descriptions_syntax.md).
+## Description compilation
-## Code generation
+These textual syscall descriptions are then compiled into machine-usable form used by `syzkaller`
+to actually generate programs. This process consists of 2 steps.
-Textual syscall descriptions are translated into code used by `syzkaller`.
-This process consists of 2 steps.
-The first step is extraction of values of symbolic constants from Linux sources using `syz-extract` utility.
-`syz-extract` generates a small C program that includes kernel headers referenced by `include` directives,
-defines macros as specified by `define` directives and prints values of symbolic constants.
+The first step is extraction of values of symbolic constants from kernel sources using
+[syz-extract](/sys/syz-extract) utility. `syz-extract` generates a small C program that
+includes kernel headers referenced by `include` directives, defines macros as specified
+by `define` directives and prints values of symbolic constants.
Results are stored in `.const` files, one per arch.
-For example, [sys/linux/dev_ptmx.txt](/sys/linux/dev_ptmx.txt) is translated into [sys/linux/dev_ptmx_amd64.const](/sys/linux/dev_ptmx_amd64.const).
+For example, [sys/linux/dev_ptmx.txt](/sys/linux/dev_ptmx.txt) is translated into
+[sys/linux/dev_ptmx_amd64.const](/sys/linux/dev_ptmx_amd64.const).
+
+The second step is translation of descriptions into Go code using
+[syz-sysgen](/sys/syz-sysgen) utility (the actual compiler code lives in
+[pkg/ast](/pkg/ast/) and [pkg/compiler](/pkg/compiler/)).
+This step uses syscall descriptions and the const files generated during the first step
+and produces instantiations of `Syscall` and `Type` types defined in [prog/types.go](/prog/types.go).
+Here is an [example](/sys/akaros/gen/amd64.go) of the compiler output for Akaros.
+This step also generates some minimal syscall metadata for C++ code in
+[executor/syscalls.h](/executor/syscalls.h).
+
+## Programs
-The second step is generation of Go code for syzkaller.
-This step uses syscall descriptions and the const files generated during the first step.
-You can see a result in [sys/linux/gen/amd64.go](/sys/linux/gen/amd64.go) and in [executor/syscalls.h](/executor/syscalls.h).
+The translated descriptions are then used to generate, mutate, execute, minimize, serialize
+and deserialize programs. A program is a sequence of syscalls with concrete values for arguments.
+Here is an example (of a textual representation) of a program:
+
+```
+mmap(&(0x7f0000000000), (0x1000), 0x3, 0x32, -1, 0)
+r0 = open(&(0x7f0000000000)="./file0", 0x3, 0x9)
+read(r0, &(0x7f0000000000), 42)
+close(r0)
+```
+
+For actual manipulations `syzkaller` uses in-memory AST-like representation consisting of
+`Call` and `Arg` values defined in [prog/prog.go](/prog/prog.go). That representation is used to
+[analyze](/prog/analysis.go), [generate](/prog/rand.go), [mutate](/prog/mutation.go),
+[minimize](/prog/minimization.go), [validate](/prog/validation.go), etc programs.
+
+The in-memory representation can be [transformed](/prog/encoding.go) to/from
+textual form to store in on-disk corpus, show to humans, etc.
+
+There is also another [binary representation](https://github.com/google/syzkaller/blob/master/prog/decodeexec.go)
+of the programs (called `exec`), that is much simpler, does not contain rich type information (irreversible)
+and is used for actual execution (interpretation) of programs by [executor](/executor/executor.cc).
## Describing new system calls
This section describes how to extend syzkaller to allow fuzz testing of a new system call;
this is particularly useful for kernel developers who are proposing new system calls.
+Syscall interfaces are manually-written. There is an
+[open issue](https://github.com/google/syzkaller/issues/590) to provide some aid
+for this process and some ongoing work, but we are not yet there.
+There is also [headerparser](headerparser_usage.md) utility that can auto-generate
+some parts of descriptions from header files.
+
First, add a declarative description of the new system call to the appropriate file:
- Various `sys/linux/<subsystem>.txt` files hold system calls for particular kernel
subsystems, for example `bpf` or `socket`.
@@ -44,22 +83,43 @@ First, add a declarative description of the new system call to the appropriate f
The description of the syntax can be found [here](syscall_descriptions_syntax.md).
-If the subsystem is present in the mainline kernel, run `make extract TARGETOS=linux SOURCEDIR=$KSRC`
-with `$KSRC` set to the location of a kernel source tree. This will generate const files.
-Note, that this will overwrite `.config` file you have in `$KSRC`.
+After adding/changing descriptions run:
+```
+make extract TARGETOS=linux SOURCEDIR=$KSRC
+make generate
+make
+```
+
+Here `make extract` generates/updates the `*.const` files.
+`$KSRC` should point to the _latest_ kernel checkout.\
+Note: `make extract` overwrites `.config` in `$KSRC` and `mrproper`'s it.
+
+Then `make generate` updates generated code and `make` rebuilds binaries.\
+Note: `make generate` does not require any kernel sources, native compilers, etc
+and is pure text processing.
+
+Note: _all_ generated files (`*.const`, `*.go`, `*.h`) are checked-in with the
+`*.txt` changes in the same commit.
+
+If you want to fuzz the new subsystem that you described locally, you may find
+the `enable_syscalls` configuration parameter useful to specifically target
+the new system calls.
+
+## Non-mainline subsystems
+
+`make extract` extracts constants for all `*.txt` files and for all supported architectures.
+This may not work for subsystems that are not present in mainline kernel or if you have
+problems with native kernel compilers, etc. In such cases the `syz-extract` utility
+used by `make extract` can be run manually for single file/arch as:
-If the subsystem is not present in the mainline kernel, then you need to manually run `syz-extract` binary:
```
make bin/syz-extract
-bin/syz-extract -os linux -arch $ARCH -sourcedir "$LINUX" -builddir "$LINUXBLD" <new>.txt
+bin/syz-extract -os linux -arch $ARCH -sourcedir $KSRC -builddir $LINUXBLD <new>.txt
```
+
`$ARCH` is one of `amd64`, `386`, `arm64`, `arm`, `ppc64le`.
If the subsystem is supported on several architectures, then run `syz-extract` for each arch.
-`$LINUX` should point to kernel source checkout, which is configured for the corresponding arch (i.e. you need to run `make someconfig && make` there first).
-If the kernel was built into a separate directory (with `make O=...`) then also set `$LINUXBLD` to the location of the build directory.
-
-Then, run `make generate` and `make` which will update generated code and rebuild binaries.
-
-Optionally, adjust the `enable_syscalls` configuration value for syzkaller to specifically target the new system calls.
-
-In order to partially auto-generate system call descriptions you can use [headerparser](headerparser_usage.md).
+`$KSRC` should point to kernel source checkout, which is configured for the
+corresponding arch (i.e. you need to run `make someconfig && make` there first).
+If the kernel was built into a separate directory (with `make O=...`) then also
+set `$LINUXBLD` to the location of the build directory.