path: root/pkg/declextract
Commit message (Author, Age, Files, Lines)
* pkg/clangtool: allow final verification of output (Dmitry Vyukov, 2025-11-20, 1 file, -1/+1)
| Let tools verify that all source file names, line numbers, etc. are valid/present. If there are any bogus entries, it's better to detect them early than to crash/error much later when the info is used.
* pkg/clangtool: make more generic (Dmitry Vyukov, 2025-11-17, 5 files, -39/+24)
| Make it possible to use pkg/clangtool with other types than declextract.Output.
* all: apply linter auto fixes (Taras Madan, 2025-07-17, 1 file, -2/+2)
| ./tools/syz-env bin/golangci-lint run ./... --fix
* pkg/declextract: add a TODO (Dmitry Vyukov, 2025-04-15, 1 file, -0/+4)
|
* tools/syz-declextract: ignore files with non US-ASCII chars (Dmitry Vyukov, 2025-04-15, 1 file, -0/+7)
|
* pkg/declextract: add open fileops callback to interface list (Dmitry Vyukov, 2025-04-15, 2 files, -1/+10)
| Add open callback if there are no other unique callbacks. This happens for e.g. seq files which only have unique open, while read is a common seq_read callback.
* pkg/declextract: more precise fileops callback resolution (Dmitry Vyukov, 2025-04-15, 3 files, -47/+67)
| Use resolved Function references instead of string names for fileops callback resolution. Function names are not unique: a number of callbacks have the same names.
* tools/syz-declextract: extract function references more precisely (Dmitry Vyukov, 2025-04-15, 2 files, -12/+7)
| Currently we misparse some function references, e.g. for ".write = (foo) ? bar : baz" we extract "foo". Extract the first function reference from such expressions.
* pkg/declextract: add a hack for kernel/sched files compiled together (Dmitry Vyukov, 2025-04-15, 1 file, -0/+8)
| To optimize build time, kernel/sched compiles a number of source files together by including them into another source file. As a result, static functions declared in one source file are effectively referenced from another source file. In order to be able to resolve them, we pretend such functions are not static.
* tools/syz-declextract: export info about file ops interfaces (Dmitry Vyukov, 2025-04-11, 4 files, -18/+130)
|
* tools/syz-declextract: add interface coverage info (Dmitry Vyukov, 2025-04-10, 3 files, -14/+47)
| Add coverage percent for kernel interfaces. The current data is generated with Mar coverage report on kernel commit 1e7857b28020ba57ca7fdafae7ac855ba326c697.
* pkg/declextract: export syscall variants as separate interfaces (Dmitry Vyukov, 2025-04-10, 4 files, -52/+73)
| Export each syscall variant (e.g. fcntl$*) as a separate interface. Effectively these are separate syscalls. We will want this for ioctl as well (it's not 1 interface).
* tools/syz-declextract: refine arg types for syscall variants (Dmitry Vyukov, 2025-04-09, 4 files, -44/+128)
| Use scope-based dataflow analysis for syscall variants (including ioctls). As a result we only consider code that relates to a particular command/ioctl, and can infer arguments/return types for each command/ioctl independently.
* pkg/declextract: infer syscall commands (Dmitry Vyukov, 2025-01-22, 1 file, -13/+33)
| Use function scope information extracted in the previous commit to infer multiplexed syscalls (fcntl, prctl, ...) and infer their arguments.
| Descriptions generated on Linux commit c4b9570cfb63501.
* tools/syz-declextract: support function scopes (Dmitry Vyukov, 2025-01-22, 5 files, -45/+135)
| Extract info about function scopes formed by switch'es on function arguments. For example if we have:
|
|     void foo(..., int cmd, ...)
|     {
|         ...
|         switch (cmd) {
|         case FOO:
|             ... block 1 ...
|         case BAR:
|             ... block 2 ...
|         }
|         ...
|     }
|
| We record that any data flow within block 1 is only relevant when foo's arg cmd has value FOO, similarly for block 2 and BAR.
| This allows to do 3 things:
| 1. Locate ioctl commands that are switched on within transitively called functions.
| 2. Infer return value for each ioctl command.
| 3. Infer argument type when it's not specified in _IO macro.
| This will also allow to infer other multiplexed syscalls.
| Descriptions generated on Linux commit c4b9570cfb63501.
* tools/syz-declextract: fix empty structs and arrays (Dmitry Vyukov, 2025-01-20, 3 files, -24/+73)
| This fixes 2 bugs:
| 1. We completely remove empty structs, but they can have an effect on parent struct layout if they have >1 alignment. Replace empty structs with a special auto_aligner type that preserves alignment.
| 2. Arrays of 0 size are currently emitted as dynamically-sized (we assume 0 size means "this is not a const-size array"). Add a separate IsConstSize flag for arrays that marks const-size arrays.
| Additionally cross-check that generated structs have exactly the same size/alignment as the corresponding C structs. This allows to catch the above bugs.
* pkg/declextract: remove unused includes and defines (Dmitry Vyukov, 2025-01-17, 1 file, -4/+11)
| This is nice on its own, but this will also help to prevent lots of problems when we export more info from the clang tool in future. The clang tool does not know what will end up in the final descriptions, so it exports info about all consts that it encounters. As a result we pull in lots of includes/defines, and lots of kernel includes/defines are broken or create problems. So the fewer we have, the better.
* pkg/declextract: move const handling logic from the clang tool (Dmitry Vyukov, 2025-01-17, 3 files, -25/+52)
| Export raw info about consts from the clang tool, and let the Go part handle it. The less logic is in the clang tool, the better. Also this will allow to remove unused includes when we know which consts we ended up using. The more includes we include, the higher the chances we include something that's broken.
* tools/syz-declextract: infer argument/field types (Dmitry Vyukov, 2024-12-17, 4 files, -7/+342)
| Use data flow analysis to infer syscall argument, return value, and struct field types. See the comment in pkg/declextract/typing.go for more details.
* pkg/declextract: fix static function handling (Dmitry Vyukov, 2024-12-16, 1 file, -2/+2)
| The check did not actually match any header files. Fix the check.
* pkg/declextract: change auto_todo type to int8 (Dmitry Vyukov, 2024-12-13, 1 file, -1/+1)
| We use the auto_todo type as an element of array for void*. array[int8] is lowered to the buffer type, which is much better handled by the fuzzer engine and more closely resembles real blobs.
* tools/syz-declextract: extract info about all functions (Dmitry Vyukov, 2024-12-13, 3 files, -0/+96)
| Extract info about all functions, and compute total LOC for each interface. For now only static calls are considered; this doesn't handle indirect calls yet. This is just groundwork for more complex callgraph/dataflow analysis.
* pkg/declextract: move file update from pkg/clangtool (Dmitry Vyukov, 2024-12-13, 1 file, -1/+4)
| Currently when entities are added/changed, one may need to update both pkg/declextract and pkg/clangtool because clangtool updates paths. Move all of those updates to pkg/declextract, so that pkg/clangtool does not need to be touched when entities change. The idea behind pkg/clangtool is to provide only the lower-level infrastructure function of running the clang tool.
* pkg/declextract: speed up sortAndDedupSlice (Dmitry Vyukov, 2024-12-13, 1 file, -9/+15)
| Marshal to JSON only once and dedup based on hash before sorting.
* pkg/ifaceprobe: optimize cache (Dmitry Vyukov, 2024-12-12, 1 file, -5/+1)
| Instead of storing real PC values, store indexes into the PCs table. This significantly reduces the size of the cache (in my case from 1823 MB to 473 MB) and actually makes use of the cache simpler (no need for a separate map).
* pkg/declextract: reduce cyclomatic complexity (Dmitry Vyukov, 2024-12-11, 2 files, -115/+134)
| The linter points to very large cyclomatic complexity/length of some functions. Fix that.
* pkg/declextract: generated single openat for all related files (Dmitry Vyukov, 2024-12-11, 1 file, -26/+9)
|
* pkg/declextract: restore use of ipv6_addr (Dmitry Vyukov, 2024-12-11, 1 file, -1/+1)
|
* tools/syz-declextract: generate file_operations descriptions (Dmitry Vyukov, 2024-12-11, 4 files, -13/+287)
| Emit descriptions for special files in /dev, /sys, /proc, and ./.
| pkg/declextract combines file_operations info produced by the clang tool with the dynamic probing info produced by pkg/ifaceprobe in order to produce complete descriptions for special files.
* tools/syz-declextract: extract file_operations descriptions (Dmitry Vyukov, 2024-12-11, 1 file, -0/+28)
| Extend the clang tool to locate file_operations variables and arrays and dump open/read/write/mmap/ioctl callbacks for each. It also tries to extract the set of ioctl commands and argument types for them in a simple best-effort way (for now). It just locates the switch in the ioctl callback and extracts each case as a command.
* pkg/declextract: emit more netlink families (Dmitry Vyukov, 2024-12-11, 2 files, -24/+17)
| Emit families w/o policy, emit duplicate commands.
* pkg/declextract: refine more networking types (Dmitry Vyukov, 2024-12-11, 2 files, -11/+29)
|
* pkg/declextract: refactor netlink generation (Dmitry Vyukov, 2024-12-11, 4 files, -78/+42)
| Emit all information related to a single netlink family close to each other. Previously we emitted them scattered and grouped by info type. That was both inconvenient to emit and inconvenient to read. NFC.
* pkg/declextract: rename generated names for consistency (Dmitry Vyukov, 2024-12-11, 2 files, -27/+18)
| Currently we append "$auto", or "$auto_record", or prepend "auto_", or insert "auto" somewhere in the middle. Use more consistent naming: always append "$auto".
* tools/syz-declextract: rewrite (Dmitry Vyukov, 2024-12-11, 5 files, -0/+964)
syz-declextract accumulated a bunch of code health problems, so that now it's hard to change/extend it; lots of new features can only be added in hacky ways and cause lots of code duplication. It's also completely untested. Rewrite the tool to:
- move as much code as possible to Go (working with the clang tool is painful for a number of reasons)
- allow testing and add unit tests (the first layer of tests checks what information is produced by the clang tool, the second layer checks how that information is transformed to descriptions)
- allow extending the clang tool output to export arbitrary info in a non-hacky way (now it produces arbitrary JSON instead of a mix of incomplete descriptions and interfaces)
- remove code duplication in the clang tool and provide common infrastructure to add new analysis w/o causing more duplication
- provide more convenient primitives in the clang tool
- improve code style consistency and stick to the LLVM code style (in particular, variable names must start with a capital letter, single-statement blocks are not surrounded with {})
- remove intermixing of code that works on different levels (currently we have AST analysis + business logic + printfs all intermixed with each other)
- provide several helper Go packages for better code structuring (e.g. pkg/clangtool just runs the tool on source files in parallel and returns results, this already separates a bunch of low-level logic from the rest of the code under a simple abstraction)
I've tried to make the output match the current output as much as possible so that the diff is manageable (in some cases at the cost of code quality, this should be fixed in future commits). There are still some differences, but hopefully they are manageable for review (more includes/defines, reordered some netlink attributes).
Minor bugs are fixed along the way, but mostly NFC:
1. Some unions were incorrectly emitted as [varlen] (C unions are never varlen).
2. Only one of the [packed], [align[N]] attributes was emitted for a struct (both couldn't be emitted).