| author | Dmitry Vyukov <dvyukov@google.com> | 2026-01-12 12:36:49 +0100 |
|---|---|---|
| committer | Dmitry Vyukov <dvyukov@google.com> | 2026-01-12 12:28:59 +0000 |
| commit | b492d50995971c199eb04edea5d3926010ac92f4 (patch) | |
| tree | 82649fe95d9e0593529b7f4de5819ce57fd7468f | |
| parent | bc54aa9fe40d6d1ffa6f80a1e04a18689ddbc54c (diff) | |
pkg/aflow/flow/assessment: refine KCSAN prompt
Rephrase the prompt to be only about KCSAN.
Currently it has some leftovers from a more generic assessment prompt
that covered KASAN bugs as well (actionability).
Also add Confident bool output.
We may want to act on both benign and non-benign verdicts,
so we need to know when the LLM wasn't actually sure either way.
This should also be useful for manual verification/statistics.
If the LLM is not confident and can admit that, it's much better
than giving a wrong answer. But we will likely want to track
the percentage of non-confident answers.
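
The intent described above (act on a verdict only when the model is confident, and track how often it is not) could be sketched roughly as follows. This is an illustration only: `verdict`, `action`, `triage` and `nonConfidentPercent` are hypothetical names that are not part of this commit or of syzkaller; only the three field names are taken from `kcsanOutputs`.

```go
package main

import "fmt"

// verdict mirrors the three fields of kcsanOutputs from this commit.
// The type itself is a hypothetical stand-in, not syzkaller code.
type verdict struct {
	Confident   bool
	Benign      bool
	Explanation string
}

// action is a hypothetical triage decision derived from a verdict.
type action string

const (
	actionAnnotate action = "annotate"      // confident and benign
	actionReport   action = "report"        // confident and harmful
	actionManual   action = "manual-review" // the model was not sure either way
)

// triage looks at Benign only when the model said it was confident.
func triage(v verdict) action {
	switch {
	case !v.Confident:
		return actionManual
	case v.Benign:
		return actionAnnotate
	default:
		return actionReport
	}
}

// nonConfidentPercent returns the percentage of verdicts with
// Confident == false. It returns 0 for an empty slice.
func nonConfidentPercent(vs []verdict) float64 {
	if len(vs) == 0 {
		return 0
	}
	n := 0
	for _, v := range vs {
		if !v.Confident {
			n++
		}
	}
	return 100 * float64(n) / float64(len(vs))
}

func main() {
	vs := []verdict{
		{Confident: true, Benign: true},
		{Confident: true, Benign: false},
		{Confident: false, Benign: true},
		{Confident: false, Benign: false},
	}
	for _, v := range vs {
		fmt.Printf("confident=%v benign=%v -> %s\n", v.Confident, v.Benign, triage(v))
	}
	fmt.Printf("non-confident: %.0f%%\n", nonConfidentPercent(vs))
}
```

Routing non-confident answers to a separate bucket, rather than folding them into either verdict, keeps the statistic mentioned above easy to compute.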
| -rw-r--r-- | pkg/aflow/flow/assessment/assessment.go | 13 |
| -rw-r--r-- | pkg/aflow/flow/assessment/kcsan.go | 60 |
2 files changed, 34 insertions, 39 deletions
diff --git a/pkg/aflow/flow/assessment/assessment.go b/pkg/aflow/flow/assessment/assessment.go
deleted file mode 100644
index f0e3dadb7..000000000
--- a/pkg/aflow/flow/assessment/assessment.go
+++ /dev/null
@@ -1,13 +0,0 @@
-// Copyright 2025 syzkaller project authors. All rights reserved.
-// Use of this source code is governed by Apache 2 LICENSE that can be found in the LICENSE file.
-
-package assessmenet
-
-// Common inputs for bug assessment when we don't have a reproducer.
-type Inputs struct {
-	CrashReport       string
-	KernelRepo        string
-	KernelCommit      string
-	KernelConfig      string
-	CodesearchToolBin string
-}
diff --git a/pkg/aflow/flow/assessment/kcsan.go b/pkg/aflow/flow/assessment/kcsan.go
index 755113a47..e29ebd5fb 100644
--- a/pkg/aflow/flow/assessment/kcsan.go
+++ b/pkg/aflow/flow/assessment/kcsan.go
@@ -10,13 +10,22 @@ import (
 	"github.com/google/syzkaller/pkg/aflow/tool/codesearcher"
 )
 
-type KCSANOutputs struct {
+type kcsanInputs struct {
+	CrashReport       string
+	KernelRepo        string
+	KernelCommit      string
+	KernelConfig      string
+	CodesearchToolBin string
+}
+
+type kcsanOutputs struct {
+	Confident   bool
 	Benign      bool
 	Explanation string
 }
 
 func init() {
-	aflow.Register[Inputs, KCSANOutputs](
+	aflow.Register[kcsanInputs, kcsanOutputs](
 		ai.WorkflowAssessmentKCSAN,
 		"assess if a KCSAN report is about a benign race that only needs annotations or not",
 		&aflow.Flow{
@@ -29,11 +38,12 @@ func init() {
 						Name:  "expert",
 						Reply: "Explanation",
 						Outputs: aflow.LLMOutputs[struct {
-							Benign bool `jsonschema:"If the data race is benign or not."`
+							Confident bool `jsonschema:"If you are confident in the verdict of the analysis or not."`
+							Benign    bool `jsonschema:"If the data race is benign or not."`
 						}](),
 						Temperature: 1,
-						Instruction: instruction,
-						Prompt:      prompt,
+						Instruction: kcsanInstruction,
+						Prompt:      kcsanPrompt,
 						Tools:       codesearcher.Tools,
 					},
 				},
@@ -42,35 +52,33 @@ func init() {
 	)
 }
 
-const instruction = `
-You are an experienced Linux kernel developer tasked with determining if the given kernel bug
-report is actionable or not. Actionable means that it contains enough info to root cause
-the underlying bug, and that the report is self-consistent and makes sense, rather than
-a one-off nonsensical crash induced by a previous memory corruption.
-
-Use the provided tools to confirm any assumptions, what variables/fields being accessed, etc.
-In particular, don't make assumptions about the kernel source code,
-use codesearch tools to read the actual source code.
-
-The bug report is a data race report from KCSAN tool.
+const kcsanInstruction = `
+You are an experienced Linux kernel developer tasked with determining if the given kernel
+data race is benign or not. The data race report is from KCSAN tool.
 It contains 2 stack traces of the memory accesses that constitute a data race.
-The report would be inconsistent, if the stacks point to different subsystems,
-or if they access different fields.
-The report would be non-actionable, if the underlysing data race is "benign".
-That is, the race is on a simple int/bool or similar field, and the accesses
-are not supposed to be protected by any mutual exclusion primitives.
+
+A "benign" data races are on a simple int/bool variable or similar field,
+and the accesses are not supposed to be protected by any mutual exclusion primitives.
 Common examples of such "benign" data races are accesses to various flags fields,
-statistics counters, and similar.
-An actionable race is "harmful", that is can lead to corruption/crash even with
+statistics counters, and similar. A "benign" data race does not lead to memory corruption/crash
+with a conservative compiler that compiles memory accesses to primitive types
+effectively as atomic.
+
+A non-benign (or "harmful" data race) can lead to corruption/crash even with
 a conservative compiler that compiles memory accesses to primitive types
 effectively as atomic.
 A common example of a "harmful" data races is race on a complex container (list/hashmap/etc),
 where accesses are supposed to be protected by a mutual exclusion primitive.
-In the final reply explain why you think the report is consistent and the data race is harmful.
+
+In the final reply explain why you think the given data race is benign or is harmful.
+
+Use the provided tools to confirm any assumptions, what variables/fields being accessed, etc.
+In particular, don't make assumptions about the kernel source code,
+use codesearch tools to read the actual source code.
 `
 
-const prompt = `
-The bug report is:
+const kcsanPrompt = `
+The data race report is:
 
 {{.CrashReport}}
 `
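
The `{{.CrashReport}}` placeholder in `kcsanPrompt` looks like Go `text/template` syntax. Assuming that is how aflow expands it (the commit itself does not show this), the substitution can be sketched with the standard library alone. `promptInputs` and `renderPrompt` are hypothetical names for illustration and do not exist in syzkaller; the report string in `main` is made up.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// promptInputs is a hypothetical stand-in for kcsanInputs; the prompt
// references only the CrashReport field.
type promptInputs struct {
	CrashReport string
}

// promptText copies the new kcsanPrompt constant from the diff.
const promptText = `
The data race report is:

{{.CrashReport}}
`

// renderPrompt expands the template with the given inputs.
// Assumption: aflow performs an equivalent text/template expansion.
func renderPrompt(in promptInputs) (string, error) {
	tmpl, err := template.New("kcsanPrompt").Parse(promptText)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := tmpl.Execute(&sb, in); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	// The report text below is an invented placeholder, not a real KCSAN report.
	out, err := renderPrompt(promptInputs{CrashReport: "BUG: KCSAN: data-race in foo / bar"})
	if err != nil {
		panic(err)
	}
	fmt.Print(out)
}
```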
