| Commit message (Collapse) | Author | Age | Files | Lines |
| | |
|
| |
|
|
|
| |
Some of the common helpers may be reused across different Clang tools
(currently json.h and .clang-format). Move the files to allow such reuse.
|
| |
|
|
| |
Factor out common clang tool testing helpers from the declextract tool test.
|
| |
|
|
| |
Make it possible to use pkg/clangtool with types other than declextract.Output.
|
| |
|
|
| |
Fix some minor API changes.
|
| | |
|
| |
|
|
|
| |
Regenerate golden files with up-to-date clang tool.
Missed part of commit c7e92da6cb06679b04062786481f50e42c585bfc.
|
| |
|
|
|
|
| |
Add open callback if there are no other unique callbacks.
This happens for e.g. seq files which only have unique open,
while read is a common seq_read callback.
|
| |
|
|
|
|
| |
Use resolved Function references instead of string names for fileops
callback resolution. Function names are not unique, a number of callbacks
have the same names.
|
| |
|
|
|
|
| |
Currently we misparse some function references, e.g. for:
.write = (foo) ? bar : baz,
we extract "foo". Extract first function reference from such expressions.
|
| | |
|
| |
|
|
|
| |
Some ioctls are declared inconsistently using enums rather than macros.
Extract these as well.
|
| | |
|
| |
|
|
|
|
| |
Add coverage percent for kernel interfaces.
The current data is generated with Mar coverage report
on kernel commit 1e7857b28020ba57ca7fdafae7ac855ba326c697.
|
| |
|
|
|
|
| |
Export each syscall variant (e.g. fcntl$*) as a separate interface.
Effectively these are separate syscalls. We will want this for
ioctl as well (it's not 1 interface).
|
| |
|
|
| |
pkg/clangtool checks that source files were compiled with clang.
|
| |
|
|
|
|
| |
It seems that new clang is more picky about asserts for large ints.
It now assert-fails when converting large ints to int64.
Be more careful when converting these to ints.
|
| | |
|
| |
|
|
|
|
| |
Use scope-based dataflow analysis for syscall variants (including ioctls).
As a result we only consider code that relates to a particular command/ioctl,
and can infer arguments/return types for each command/ioctl independently.
|
| |
|
|
|
| |
Update the latest tested llvm revision.
Add additional compiler flags to suppress unhelpful warnings.
|
| |
|
|
|
|
| |
Add a few interesting cases for scope analysis.
Move functions related to resources to the header file,
since they must be visible in every file to work.
|
| | |
|
| | |
|
| |
|
|
|
|
| |
Remove __attribute__ on types.
Some kernels now use it on some syscall args as shown in the test.
The __attribute__ may contain quotes and break json.
|
| |
|
|
|
|
|
| |
This may be useful for downstream kernels that are only built for,
and meant to be used with, a subset of arches.
Some esoteric arches may be broken on such kernels.
Allow ignoring them.
|
| |
|
|
|
|
| |
When using go run, I had to specify the path of syz-declextract or I'd
get the following error: package tools/syz-declextract is not in std
(/usr/lib/google-golang/src/tools/syz-declextract)
|
| |
|
|
|
| |
The cmake command used to generate syz-declextract uses the -GNinja flag
so it should be built with ninja rather than make.
|
| | |
|
| |
|
|
|
|
|
|
| |
filepath.Walk calls os.Lstat for every file or directory to retrieve os.FileInfo.
filepath.WalkDir avoids unnecessary system calls since it provides a fs.DirEntry,
which includes file type information without requiring a stat call.
This improves performance by reducing redundant system calls.
|
| |
|
|
|
|
|
|
| |
Use function scope information extracted in the previous commit
to infer multiplexed syscalls (fcntl, prctl, ...) and infer
their arguments.
Descriptions generated on Linux commit c4b9570cfb63501.
|
| |
| |
Extract info about function scopes formed by switch'es on function arguments.
For example if we have:
    void foo(..., int cmd, ...)
    {
        ...
        switch (cmd) {
        case FOO:
            ... block 1 ...
        case BAR:
            ... block 2 ...
        }
        ...
    }
We record that any data flow within block 1 is only relevant
when foo's arg cmd has value FOO, similarly for block 2 and BAR.
This allows us to do 3 things:
1. Locate ioctl commands that are switched on within transitively
called functions.
2. Infer return value for each ioctl command.
3. Infer argument type when it's not specified in _IO macro.
This will also allow us to infer other multiplexed syscalls.
Descriptions generated on Linux commit c4b9570cfb63501.
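
A minimal sketch of how such scope information might be consumed on the Go
side. The Scope type, its fields and the factsFor function are hypothetical
names chosen for illustration and do not reflect the real data structures;
fallthrough between cases is ignored for simplicity:

```go
package main

import "fmt"

// Scope (hypothetical) is a block of a function guarded by case labels
// of a switch on a function argument.
type Scope struct {
	Values []string // case labels guarding the block; empty means the whole function body
	Facts  []string // data flow facts recorded within the block
}

// factsFor returns the facts that are relevant when the switched-on
// argument has value cmd: function-wide facts plus matching scopes.
func factsFor(scopes []Scope, cmd string) []string {
	var res []string
	for _, s := range scopes {
		if len(s.Values) == 0 || contains(s.Values, cmd) {
			res = append(res, s.Facts...)
		}
	}
	return res
}

func contains(vals []string, v string) bool {
	for _, x := range vals {
		if x == v {
			return true
		}
	}
	return false
}

func main() {
	foo := []Scope{
		{Facts: []string{"arg is cast to long"}},
		{Values: []string{"FOO"}, Facts: []string{"arg is used as struct foo_args*"}},
		{Values: []string{"BAR"}, Facts: []string{"return value is a file descriptor"}},
	}
	fmt.Println(factsFor(foo, "FOO"))
	fmt.Println(factsFor(foo, "BAR"))
}
```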
|
| |
| |
Currently we have a bug in struct layout that affects
some corner cases that involve recursive structs.
The result of this bug is that we use wrong alignment 1
(not yet calculated) for some structs when calculating
layout of other structs.
The root cause of this bug is that we calculate struct
alignment too early in typeStruct.Gen when structs
are not yet laid out.
For this reason we moved struct size calculation to the
later phase (after compiler.layoutStruct).
Move alignment calculation from typeStruct.Gen to
compiler.layoutStruct to fix this.
|
| |
| |
This fixes 2 bugs:
1. We completely remove empty structs, but they can affect
the parent struct layout if they have >1 alignment.
Replace empty structs with a special auto_aligner type
that preserves alignment.
2. Arrays of 0 size are currently emitted as dynamically-sized
(we assume 0 size means "this is not a const-size array").
Add separate IsConstSize flag for arrays that marks const-size arrays.
Additionally cross-check that generated structs have exactly
the same size/alignment as the corresponding C structs.
This allows us to catch the above bugs.
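
To see why an empty but aligned member matters, here is a minimal sketch of
C-like struct layout. The Field type and the layout function are hypothetical
and much simpler than the real compiler code; they only apply the usual rule
that each field is placed at a multiple of its alignment and the struct size
is rounded up to the largest field alignment:

```go
package main

import "fmt"

// Field (hypothetical) describes a struct member by size and alignment.
// Align must be at least 1.
type Field struct {
	Size  uint64
	Align uint64
}

// roundUp rounds v up to a multiple of a.
func roundUp(v, a uint64) uint64 {
	return (v + a - 1) / a * a
}

// layout computes the size and alignment of a non-packed struct.
func layout(fields []Field) (size, align uint64) {
	align = 1
	for _, f := range fields {
		if f.Align > align {
			align = f.Align
		}
		size = roundUp(size, f.Align) + f.Size
	}
	return roundUp(size, align), align
}

func main() {
	int8Field := Field{Size: 1, Align: 1}
	// An empty struct with 8-byte alignment: no data, but it still
	// forces alignment and tail padding of the parent.
	emptyAligned := Field{Size: 0, Align: 8}

	size, align := layout([]Field{int8Field})
	fmt.Println("without empty member:", size, align)
	size, align = layout([]Field{int8Field, emptyAligned})
	fmt.Println("with empty member:", size, align)
}
```

Dropping the empty member changes both the size and the alignment of the
parent, which is the kind of mismatch a size/alignment cross-check catches.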
|
| |
|
|
|
|
|
|
|
|
| |
This is nice on its own, but this will also help to prevent
lots of problems when we export more info from the clang tool in future.
The clang tool does not know what will end up in the final descriptions,
so it exports info about all consts that it encounters.
As a result we pull in lots of includes/defines, and lots of kernel
includes/defines are broken or create problems.
So the fewer we have, the better.
|
| |
|
|
|
|
|
|
| |
Export raw info about consts from the clang tool, and let the Go part handle it.
The less logic is in the clang tool, the better. This will also allow us to remove
unused includes when we know which consts we ended up using.
The more includes we include, the higher the chances we include something
that's broken.
|
| |
|
|
|
|
| |
Use data flow analysis to infer syscall argument, return value,
and struct field types.
See the comment in pkg/declextract/typing.go for more details.
|
| |
|
|
|
|
| |
We use auto_todo type as an element of array for void*.
array[int8] is lowered to the buffer type, which is handled much
better by the fuzzer engine and more closely resembles real blobs.
|
| |
|
|
|
|
| |
Extract info about all functions, and compute total LOC for each interface.
For now only static calls are considered; indirect calls are not handled yet.
This is just groundwork for more complex callgraph/dataflow analysis.
|
| |
|
|
|
|
| |
Do kernel probing, source code analysis and loading of syscall
rename map in parallel. Also change probe caching to the scheme
we now use for the clang tool cache for the same reasons.
|
| |
|
|
|
|
|
| |
Instead of storing real PC values store indexes into the PCs table.
This significantly reduces size of the cache (in my case from
1823 MB to 473 MB) and actually makes use of the cache simpler
(don't need separate map).
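
A minimal sketch of the idea, assuming the cache holds one deduplicated PC
table plus per-file lists of indexes into it. The compact function and the
exact layout are assumptions made for illustration, not the real cache format:

```go
package main

import "fmt"

// compact (hypothetical) replaces every PC with a 32-bit index into
// a deduplicated table of PCs.
func compact(lists [][]uint64) (table []uint64, indexes [][]uint32) {
	seen := make(map[uint64]uint32)
	indexes = make([][]uint32, len(lists))
	for i, list := range lists {
		for _, pc := range list {
			idx, ok := seen[pc]
			if !ok {
				idx = uint32(len(table))
				seen[pc] = idx
				table = append(table, pc)
			}
			indexes[i] = append(indexes[i], idx)
		}
	}
	return table, indexes
}

func main() {
	lists := [][]uint64{
		{0xffffffff81000000, 0xffffffff81000010},
		{0xffffffff81000010},
	}
	table, indexes := compact(lists)
	fmt.Println(len(table), indexes)
	// A stored index is resolved with a plain slice lookup.
	fmt.Printf("%#x\n", table[indexes[1][0]])
}
```

Indexes are half the size of 64-bit PCs and repeat across files, which is
consistent with the size reduction reported above.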
|
| |
| |
Instead of caching output for each file separately,
cache total combined output in a single file.
Caching output for each file is not useful in practice:
I either use everything cached, or regenerate the whole cache.
Caching combined output is much more efficient.
With function info there is a lot of duplication across
individual output files. E.g. I am getting 6GB cache
for individual files, and only 60MB for the combined cache.
Also change how caching works. Remove the flag and always
use the cache if it exists. It's much more convenient and
safer to use (no risk of accidentally not using the cache).
The cache file can be removed to force regeneration.
|
| |
|
|
|
|
|
|
| |
Emit descriptions for special files in /dev, /sys, /proc, and ./.
pkg/declextract combines file_operations info produced by the clang tool
with the dynamic probing info produced by pkg/ifaceprobe in order
to produce complete descriptions for special files.
|
| |
|
|
|
|
|
|
| |
Extend the clang tool to locate file_operations variables and arrays
and dump open/read/write/mmap/ioctl callbacks for each.
It also tries to extract the set of ioctl commands and argument types
for them in a simple best-effort way (for now). It just locates the switch
in the ioctl callback and extracts each case as a command.
|
| |
|
|
| |
Emit families w/o policy, emit duplicate commands.
|
| |
|
|
|
|
|
| |
Emit all information related to a single netlink family close to each other.
Previously we emitted them scattered and grouped by info type.
That was inconvenient both to emit and to read.
NFC.
|
| |
|
|
|
|
| |
Currently we append "$auto", or "$auto_record", or prepend "auto_",
or insert "auto" somewhere in the middle.
Use more consistent naming: always append "$auto".
|
| |
| |
syz-declextract accumulated a bunch of code health problems
so that now it's hard to change/extend it, lots of new features
can only be added in hacky ways and cause lots of code duplication.
It's also completely untested. Rewrite the tool to:
- move as much code as possible to Go (working with the clang tool
is painful for a number of reasons)
- allow testing and add unit tests (first layer of tests test
what information is produced by the clang tool, second layer
of tests test how that information is transformed to descriptions)
- allow extending the clang tool output to export arbitrary info
in non-hacky way (now it produces arbitrary JSON instead of a mix
of incomplete descriptions and interfaces)
- remove code duplication in the clang tool and provide common
infrastructure to add new analysis w/o causing more duplication
- provide more convenient primitives in the clang tool
- improve code style consistency and stick to the LLVM code style
(in particular, variable names must start with a capital letter,
single-statement blocks are not surrounded with {})
- remove intermixing of code that works on different levels
(currently we have AST analysis + business logic + printfs
all intermixed with each other)
- provide several helper Go packages for better code structuring
(e.g. pkg/clangtool just runs the tool on source files in parallel
and returns results, this already separates a bunch of low-level
logic from the rest of the code under a simple abstraction)
I've tried to make the output match the current output as much as possible
so that the diff is manageable (in some cases at the cost of code quality,
this should be fixed in future commits). There are still some differences,
but hopefully they are manageable for review (more includes/defines,
reordered some netlink attributes).
Some minor bugs are fixed along the way, but mostly NFC:
1. Some unions were incorrectly emitted as [varlen]
(C unions are never varlen).
2. Only one of [packed], [align[N]] attributes was emitted
for a struct (both couldn't be emitted).
|
| |
| |
Add caching mode where results of running the Clang tool
are cached for each file and can be reused.
It saves lots of time when only the Go tool changes.
It also allows looking at the output for each file for debugging.
Group all assorted variables in a context struct.
There are lots of assorted vars and there will be more.
Support defines in the tool output.
Fix up some includes to more generic ones.
|
| |
|
|
|
| |
Make the tool accept a manager config.
This will be required for dynamic extraction of info from the kernel.
|
| |
|
|
| |
They can clash with our manual flag names.
|