pkg/aflow/flow/assessment: refine KCSAN prompt

Rephrase the prompt to be only about KCSAN. Currently it has some leftovers from a more generic assessment prompt that also covered KASAN bugs (actionability).

Also add a Confident bool output. We may want to act on both benign and non-benign verdicts, so we need to know when the LLM wasn't actually sure either way. This should also be useful for manual verification/statistics. If the LLM is not confident and can admit that, it's much better than giving a wrong answer. But we will likely want to track the percentage of non-confident answers.
author: Dmitry Vyukov <dvyukov@google.com> 2026-01-12 12:36:49 +0100
committer: Dmitry Vyukov <dvyukov@google.com> 2026-01-12 12:28:59 +0000
commit: b492d50995971c199eb04edea5d3926010ac92f4 (patch)
tree: 82649fe95d9e0593529b7f4de5819ce57fd7468f
parent: bc54aa9fe40d6d1ffa6f80a1e04a18689ddbc54c (diff)
2 files changed, 34 insertions, 39 deletions
diff --git a/pkg/aflow/flow/assessment/assessment.go b/pkg/aflow/flow/assessment/assessment.go
deleted file mode 100644
index f0e3dadb7..000000000
--- a/pkg/aflow/flow/assessment/assessment.go
+++ /dev/null
@@ -1,13 +0,0 @@
-// Copyright 2025 syzkaller project authors. All rights reserved.
-// Use of this source code is governed by Apache 2 LICENSE that can be found in the LICENSE file.
-
-package assessment
-
-// Common inputs for bug assessment when we don't have a reproducer.
-type Inputs struct {
-	CrashReport       string
-	KernelRepo        string
-	KernelCommit      string
-	KernelConfig      string
-	CodesearchToolBin string
-}
diff --git a/pkg/aflow/flow/assessment/kcsan.go b/pkg/aflow/flow/assessment/kcsan.go
index 755113a47..e29ebd5fb 100644
--- a/pkg/aflow/flow/assessment/kcsan.go
+++ b/pkg/aflow/flow/assessment/kcsan.go
@@ -10,13 +10,22 @@ import (
 	"github.com/google/syzkaller/pkg/aflow/tool/codesearcher"
 )
 
-type KCSANOutputs struct {
+type kcsanInputs struct {
+	CrashReport       string
+	KernelRepo        string
+	KernelCommit      string
+	KernelConfig      string
+	CodesearchToolBin string
+}
+
+type kcsanOutputs struct {
+	Confident   bool
 	Benign      bool
 	Explanation string
 }
 
 func init() {
-	aflow.Register[Inputs, KCSANOutputs](
+	aflow.Register[kcsanInputs, kcsanOutputs](
 		ai.WorkflowAssessmentKCSAN,
 		"assess if a KCSAN report is about a benign race that only needs annotations or not",
 		&aflow.Flow{
@@ -29,11 +38,12 @@ func init() {
 						Name:  "expert",
 						Reply: "Explanation",
 						Outputs: aflow.LLMOutputs[struct {
-							Benign bool `jsonschema:"If the data race is benign or not."`
+							Confident bool `jsonschema:"If you are confident in the verdict of the analysis or not."`
+							Benign    bool `jsonschema:"If the data race is benign or not."`
 						}](),
 						Temperature: 1,
-						Instruction: instruction,
-						Prompt:      prompt,
+						Instruction: kcsanInstruction,
+						Prompt:      kcsanPrompt,
 						Tools:       codesearcher.Tools,
 					},
 				},
@@ -42,35 +52,33 @@ func init() {
 	)
 }
 
-const instruction = `
-You are an experienced Linux kernel developer tasked with determining if the given kernel bug
-report is actionable or not. Actionable means that it contains enough info to root cause
-the underlying bug, and that the report is self-consistent and makes sense, rather than
-a one-off nonsensical crash induced by a previous memory corruption.
-
-Use the provided tools to confirm any assumptions, what variables/fields being accessed, etc.
-In particular, don't make assumptions about the kernel source code,
-use codesearch tools to read the actual source code.
-
-The bug report is a data race report from KCSAN tool.
+const kcsanInstruction = `
+You are an experienced Linux kernel developer tasked with determining if the given kernel
+data race is benign or not. The data race report is from KCSAN tool.
 It contains 2 stack traces of the memory accesses that constitute a data race.
-The report would be inconsistent, if the stacks point to different subsystems,
-or if they access different fields.
-The report would be non-actionable, if the underlying data race is "benign".
-That is, the race is on a simple int/bool or similar field, and the accesses
-are not supposed to be protected by any mutual exclusion primitives.
+
+A "benign" data race is on a simple int/bool variable or similar field,
+and the accesses are not supposed to be protected by any mutual exclusion primitives.
 Common examples of such "benign" data races are accesses to various flags fields,
-statistics counters, and similar.
-An actionable race is "harmful", that is can lead to corruption/crash even with
+statistics counters, and similar. A "benign" data race does not lead to memory corruption/crash
+with a conservative compiler that compiles memory accesses to primitive types
+effectively as atomic.
+
+A non-benign (or "harmful") data race can lead to corruption/crash even with
 a conservative compiler that compiles memory accesses to primitive types
 effectively as atomic. A common example of a "harmful" data races is race on
 a complex container (list/hashmap/etc), where accesses are supposed to be protected
 by a mutual exclusion primitive.
-In the final reply explain why you think the report is consistent and the data race is harmful.
+
+In the final reply explain why you think the given data race is benign or is harmful.
+
+Use the provided tools to confirm any assumptions, what variables/fields being accessed, etc.
+In particular, don't make assumptions about the kernel source code,
+use codesearch tools to read the actual source code.
 `
 
-const prompt = `
-The bug report is:
+const kcsanPrompt = `
+The data race report is:
 
 {{.CrashReport}}
 `