path: root/tools/syz-declextract
Commit message | Author | Age | Files | Lines
* all: reformat C/C++ files | Dmitry Vyukov | 2026-01-19 | 32 | -415/+510
* tools/clang/declextract: move from tools/syz-declextract/clangtool | Dmitry Vyukov | 2025-11-17 | 5 | -1568/+1
  Some of the common helpers may be reused across different Clang tools (currently json.h and .clang-format). Move the files to allow such reuse.
* pkg/clangtool/tooltest: add package | Dmitry Vyukov | 2025-11-17 | 1 | -89/+6
  Factor out common clang tool testing helpers from the declextract tool test.
* pkg/clangtool: make more generic | Dmitry Vyukov | 2025-11-17 | 2 | -2/+3
  Make it possible to use pkg/clangtool with other types than declextract.Output.
* tools/syz-declextract: update clangtool to the latest clang | Dmitry Vyukov | 2025-11-17 | 2 | -7/+6
  Fix some minor API changes.
* tools/syz-declextract: ignore files with non US-ASCII chars | Dmitry Vyukov | 2025-04-15 | 1 | -0/+4
* tools/syz-declextract: update test golden files | Dmitry Vyukov | 2025-04-15 | 2 | -2/+1
  Regenerate golden files with up-to-date clang tool. Missed part of commit c7e92da6cb06679b04062786481f50e42c585bfc.
* pkg/declextract: add open fileops callback to interface list | Dmitry Vyukov | 2025-04-15 | 1 | -1/+0
  Add the open callback if there are no other unique callbacks. This happens, for example, for seq files, which only have a unique open, while read is the common seq_read callback.
* pkg/declextract: more precise fileops callback resolution | Dmitry Vyukov | 2025-04-15 | 1 | -5/+5
  Use resolved Function references instead of string names for fileops callback resolution. Function names are not unique: a number of callbacks have the same names.
* tools/syz-declextract: extract function references more precisely | Dmitry Vyukov | 2025-04-15 | 3 | -42/+70
  Currently we misparse some function references, e.g. for:
      .write = (foo) ? bar : baz,
  we extract "foo". Extract the first function reference from such expressions.
* tools/syz-declextract: extract enums declared with a typedef | Dmitry Vyukov | 2025-04-15 | 6 | -12/+120
* tools/syz-declextract: extract ioctls declared with enums | Dmitry Vyukov | 2025-04-15 | 6 | -62/+109
  Some ioctls are declared inconsistently using enums rather than macros. Extract these as well.
* tools/syz-declextract: export info about file ops interfaces | Dmitry Vyukov | 2025-04-11 | 2 | -1/+19
* tools/syz-declextract: add interface coverage info | Dmitry Vyukov | 2025-04-10 | 18 | -73/+372
  Add coverage percent for kernel interfaces. The current data is generated with the Mar coverage report on kernel commit 1e7857b28020ba57ca7fdafae7ac855ba326c697.
* pkg/declextract: export syscall variants as separate interfaces | Dmitry Vyukov | 2025-04-10 | 11 | -83/+177
  Export each syscall variant (e.g. fcntl$*) as a separate interface. Effectively these are separate syscalls. We will want this for ioctl as well (it's not 1 interface).
* tools/syz-declextract: don't say that clang is optional | Dmitry Vyukov | 2025-04-10 | 1 | -1/+1
  pkg/clangtool checks that source files were compiled with clang.
* tools/syz-declextract: handle ints more carefully | Dmitry Vyukov | 2025-04-10 | 5 | -2/+37
  It seems that new clang is more picky about asserts for large ints. It now assert-fails when converting large ints to int64. Be more careful when converting these to ints.
* tools/syz-declextract: fix warnings about unused variables | Dmitry Vyukov | 2025-04-10 | 1 | -2/+2
* tools/syz-declextract: refine arg types for syscall variants | Dmitry Vyukov | 2025-04-09 | 1 | -11/+11
  Use scope-based dataflow analysis for syscall variants (including ioctls). As a result we only consider code that relates to a particular command/ioctl, and can infer argument/return types for each command/ioctl independently.
* tools/syz-declextract: update README.md | Dmitry Vyukov | 2025-04-09 | 1 | -3/+6
  Update the latest tested llvm revision. Add additional compiler flags to suppress unhelpful warnings.
* tools/syz-declextract: extend test data | Dmitry Vyukov | 2025-04-09 | 8 | -28/+250
  Add a few interesting cases for scope analysis. Move functions related to resources to the header file; they must be visible in every file to work.
* tools/syz-declextract: remove support for old clang | Dmitry Vyukov | 2025-04-09 | 1 | -7/+1
* tools/syz-declextract/clangtool: fix getBitWidthValue for LLVM>=21 | Burak Emir | 2025-04-09 | 1 | -1/+7
* tools/syz-declextract: support attributes on types | Dmitry Vyukov | 2025-04-03 | 5 | -11/+74
  Remove __attribute__ on types. Some kernels now use it on some syscall args as shown in the test. The __attribute__ may contain quotes and break json.
* tools/syz-declextract: allow to run on subset of arches | Dmitry Vyukov | 2025-04-03 | 2 | -10/+16
  This may be useful for downstream kernels that only build on, and are only supposed to be used with, a subset of arches. Some esoteric arches may be broken on such kernels. Allow to ignore them.
* tools/syz-declextract: fix README run instruction | Florent Revest | 2025-03-18 | 1 | -1/+1
  When using go run, I had to specify the path of syz-declextract or I'd get the following error:
      package tools/syz-declextract is not in std (/usr/lib/google-golang/src/tools/syz-declextract)
* tools/syz-declextract: fix README build instruction | Florent Revest | 2025-03-18 | 1 | -1/+1
  The cmake command used to generate syz-declextract uses the -GNinja flag so it should be built with ninja rather than make.
* all: remove loop variables scoping | Taras Madan | 2025-02-17 | 1 | -1/+0
* all: replace Walk with WalkDir to reduce os.Lstat calls | Gofastasf | 2025-01-30 | 1 | -2/+2
  filepath.Walk calls os.Lstat for every file or directory to retrieve os.FileInfo. filepath.WalkDir avoids these system calls because it provides an fs.DirEntry, which includes file type information without requiring a stat call. This improves performance by reducing redundant system calls.
* pkg/declextract: infer syscall commands | Dmitry Vyukov | 2025-01-22 | 1 | -0/+11
  Use function scope information extracted in the previous commit to infer multiplexed syscalls (fcntl, prctl, ...) and infer their arguments.

  Descriptions generated on Linux commit c4b9570cfb63501.
* tools/syz-declextract: support function scopes | Dmitry Vyukov | 2025-01-22 | 18 | -507/+1283
  Extract info about function scopes formed by switches on function arguments. For example, if we have:

      void foo(..., int cmd, ...) {
          ...
          switch (cmd) {
          case FOO:
              ... block 1 ...
          case BAR:
              ... block 2 ...
          }
          ...
      }

  We record that any data flow within block 1 is only relevant when foo's arg cmd has value FOO, and similarly for block 2 and BAR. This allows us to do 3 things:
  1. Locate ioctl commands that are switched on within transitively called functions.
  2. Infer the return value for each ioctl command.
  3. Infer the argument type when it's not specified in the _IO macro.

  This will also allow us to infer other multiplexed syscalls.

  Descriptions generated on Linux commit c4b9570cfb63501.
* pkg/compiler: fix struct layout bug | Dmitry Vyukov | 2025-01-20 | 3 | -4/+17
  Currently we have a bug in struct layout that affects some corner cases that involve recursive structs. The result of this bug is that we use the wrong alignment of 1 (not yet calculated) for some structs when calculating the layout of other structs.

  The root cause of this bug is that we calculate struct alignment too early, in typeStruct.Gen, when structs are not yet laid out. For this reason we moved struct size calculation to the later phase (after compiler.layoutStruct). Move alignment calculation from typeStruct.Gen to compiler.layoutStruct to fix this.
* tools/syz-declextract: fix empty structs and arrays | Dmitry Vyukov | 2025-01-20 | 11 | -51/+399
  This fixes 2 bugs:
  1. We completely remove empty structs, but they can have an effect on the parent struct layout if they have >1 alignment. Replace empty structs with a special auto_aligner type that preserves alignment.
  2. Arrays of 0 size are currently emitted as dynamically-sized (we assume 0 size means "this is not a const-size array"). Add a separate IsConstSize flag for arrays that marks const-size arrays.

  Additionally, cross-check that generated structs have exactly the same size/alignment as the corresponding C structs. This allows catching the above bugs.
* pkg/declextract: remove unused includes and defines | Dmitry Vyukov | 2025-01-17 | 10 | -26/+77
  This is nice on its own, but it will also help to prevent lots of problems when we export more info from the clang tool in the future. The clang tool does not know what will end up in the final descriptions, so it exports info about all consts that it encounters. As a result we pull in lots of includes/defines, and lots of kernel includes/defines are broken or create problems. So the fewer we have, the better.
* pkg/declextract: move const handling logic from the clang tool | Dmitry Vyukov | 2025-01-17 | 6 | -52/+126
  Export raw info about consts from the clang tool, and let the Go part handle it. The less logic is in the clang tool, the better. Also this will allow to remove unused includes when we know which consts we ended up using. The more includes we include, the higher the chances we include something that's broken.
* tools/syz-declextract: infer argument/field types | Dmitry Vyukov | 2024-12-17 | 11 | -39/+678
  Use data flow analysis to infer syscall argument, return value, and struct field types. See the comment in pkg/declextract/typing.go for more details.
* pkg/declextract: change auto_todo type to int8 | Dmitry Vyukov | 2024-12-13 | 8 | -9/+8
  We use the auto_todo type as the element of arrays for void*. array[int8] is lowered to the buffer type, which is much better handled by the fuzzer engine and more closely resembles real blobs.
* tools/syz-declextract: extract info about all functions | Dmitry Vyukov | 2024-12-13 | 18 | -12/+302
  Extract info about all functions, and compute total LOC for each interface. For now only static calls are considered; this doesn't handle indirect calls yet. This is just groundwork for more complex callgraph/dataflow analysis.
* tools/syz-declextract: parallelize | Dmitry Vyukov | 2024-12-12 | 3 | -27/+56
  Do kernel probing, source code analysis and loading of the syscall rename map in parallel. Also change probe caching to the scheme we now use for the clang tool cache, for the same reasons.
* pkg/ifaceprobe: optimize cache | Dmitry Vyukov | 2024-12-12 | 1 | -6/+6
  Instead of storing real PC values, store indexes into the PCs table. This significantly reduces the size of the cache (in my case from 1823 MB to 473 MB) and actually makes use of the cache simpler (no separate map is needed).
* pkg/clangtool: cache combined output | Dmitry Vyukov | 2024-12-12 | 2 | -14/+10
  Instead of caching output for each file separately, cache the total combined output in a single file. Caching output for each file is not useful in practice: I either use everything cached, or regenerate the whole cache. Caching combined output is much more efficient. With function info there is lots of duplication across individual output files. E.g. I am getting a 6GB cache for individual files, and only 60MB for the combined cache.

  Also change how caching works. Remove the flag and always use the cache if it exists. It's much more convenient and safer to use (no risk of accidentally not using the cache). The cache file can be removed to force regeneration.
* tools/syz-declextract: generate file_operations descriptions | Dmitry Vyukov | 2024-12-11 | 6 | -4/+96
  Emit descriptions for special files in /dev, /sys, /proc, and ./. pkg/declextract combines file_operations info produced by the clang tool with the dynamic probing info produced by pkg/ifaceprobe in order to produce complete descriptions for special files.
* tools/syz-declextract: extract file_operations descriptions | Dmitry Vyukov | 2024-12-11 | 9 | -2/+396
  Extend the clang tool to locate file_operations variables and arrays and dump open/read/write/mmap/ioctl callbacks for each. It also tries to extract the set of ioctl commands and their argument types in a simple best-effort way (for now): it just locates the switch in the ioctl callback and extracts each case as a command.
* pkg/declextract: emit more netlink families | Dmitry Vyukov | 2024-12-11 | 4 | -0/+70
  Emit families w/o policy, emit duplicate commands.
* pkg/declextract: refactor netlink generation | Dmitry Vyukov | 2024-12-11 | 1 | -10/+13
  Emit all information related to a single netlink family close to each other. Previously we emitted them scattered and grouped by info type. That was inconvenient both to emit and to read. NFC.
* pkg/declextract: rename generated names for consistency | Dmitry Vyukov | 2024-12-11 | 4 | -55/+52
  Currently we append "$auto", or "$auto_record", or prepend "auto_", or insert "auto" somewhere in the middle. Use more consistent naming: always append "$auto".
* tools/syz-declextract: rewrite | Dmitry Vyukov | 2024-12-11 | 39 | -1568/+2773
  syz-declextract accumulated a bunch of code health problems, so that it is now hard to change or extend: lots of new features can only be added in hacky ways and cause lots of code duplication. It's also completely untested. Rewrite the tool to:
  - move as much code as possible to Go (working with the clang tool is painful for a number of reasons)
  - allow testing and add unit tests (the first layer of tests checks what information is produced by the clang tool, the second layer checks how that information is transformed to descriptions)
  - allow extending the clang tool output to export arbitrary info in a non-hacky way (now it produces arbitrary JSON instead of a mix of incomplete descriptions and interfaces)
  - remove code duplication in the clang tool and provide common infrastructure to add new analyses w/o causing more duplication
  - provide more convenient primitives in the clang tool
  - improve code style consistency and stick to the LLVM code style (in particular, variable names must start with a capital letter, single-statement blocks are not surrounded with {})
  - remove intermixing of code that works on different levels (currently we have AST analysis + business logic + printfs all intermixed with each other)
  - provide several helper Go packages for better code structuring (e.g. pkg/clangtool just runs the tool on source files in parallel and returns results, this already separates a bunch of low-level logic from the rest of the code under a simple abstraction)

  I've tried to make the output match the current output as much as possible so that the diff is manageable (in some cases at the cost of code quality, this should be fixed in future commits). There are still some differences, but hopefully they are manageable for review (more includes/defines, reordered some netlink attributes).

  Minor bugs are fixed along the way, but mostly NFC:
  1. Some unions were incorrectly emitted as [varlen] (C unions are never varlen).
  2. Only one of the [packed], [align[N]] attributes was emitted for a struct (both couldn't be emitted).
* tools/syz-declextract: a bunch of refactorings | Dmitry Vyukov | 2024-11-27 | 1 | -69/+116
  Add a caching mode where results of running the Clang tool are cached for each file and can be reused. It saves lots of time when only the Go tool changes. It also allows looking at the output for each file for debugging.

  Group all assorted variables in a context struct. There are lots of assorted vars and there will be more.

  Support defines in the tool output.

  Fix up some includes to more generic ones.
* tools/syz-declextract: accept manager config | Dmitry Vyukov | 2024-11-26 | 1 | -15/+11
  Make the tool accept a manager config. This will be required for dynamic extraction of info from the kernel.
* tools/syz-declextract: prefix flags with auto_ | Dmitry Vyukov | 2024-11-26 | 1 | -1/+1
  They can clash with our manual flag names.