path: root/pkg/aflow/flow_test.go
Commit message | Author | Age | Files | Lines
* pkg/aflow: ensure we don't register MCP tools with duplicate names | Dmitry Vyukov | 2026-03-10 | 1 | -0/+5
    If we have duplicate names, then only one of the duplicates will be
    used at random. Add a check that we don't have duplicate names.
    Currently the only duplicate is "crash-reproducer" (both an action
    and a tool). Also ignore the "set-results" tool and all tools
    created in tests.
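A minimal sketch of such a duplicate-name check, not the actual pkg/aflow code. `checkUniqueNames` is a hypothetical helper, and "codesearch" is a made-up tool name used only for illustration.

```go
package main

import "fmt"

// checkUniqueNames returns an error for the first name that appears twice.
// Hypothetical helper: the real check in pkg/aflow may look different.
func checkUniqueNames(names []string) error {
	seen := make(map[string]bool, len(names))
	for _, name := range names {
		if seen[name] {
			return fmt.Errorf("duplicate tool name %q", name)
		}
		seen[name] = true
	}
	return nil
}

func main() {
	// "crash-reproducer" is registered twice, so the check reports it.
	err := checkUniqueNames([]string{"crash-reproducer", "codesearch", "crash-reproducer"})
	fmt.Println(err)
}
```

Failing at registration time replaces the silent "one of the duplicates wins" behavior with an explicit error.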
* pkg/aflow: add Flow.Consts instead of Provide | Dmitry Vyukov | 2026-03-09 | 1 | -0/+31
    There is no point in using Provide more than once, and anywhere
    besides the first action of a flow. So it's not really an action,
    but more of a flow property. Add Flow.Consts field to handle this
    case better. Also provide slightly less verbose syntax by using
    a map instead of a struct, and add tests.
* pkg/aflow: abstract away LLM temperature | Dmitry Vyukov | 2026-02-02 | 1 | -7/+7
    Introduce abstract "task type" for LLM agents instead of specifying
    temperature explicitly for each agent. This has 2 advantages:
    - we don't hardcode it everywhere, and can change centrally
      as our understanding of the right temperature evolves
    - we can control other LLM parameters (topn/topk) using task type
      as well

    Update #6576
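The idea of a central task-type mapping can be sketched as follows. Everything here is an assumption for illustration: the `TaskType` constants, the `llmParams` struct, and the numeric values are placeholders, not the names or values used by pkg/aflow.

```go
package main

import "fmt"

// TaskType is a hypothetical abstract description of what an agent does.
type TaskType int

const (
	TaskPrecise  TaskType = iota // e.g. producing structured, exact output
	TaskCreative                 // e.g. exploring alternative hypotheses
)

// llmParams groups the sampling parameters derived from a task type.
type llmParams struct {
	Temperature float32
	TopK        int
}

// paramsFor is the single place where task types map to LLM parameters.
// The numbers are placeholders chosen for the example only.
func paramsFor(t TaskType) llmParams {
	switch t {
	case TaskCreative:
		return llmParams{Temperature: 1.0, TopK: 64}
	default:
		return llmParams{Temperature: 0.1, TopK: 16}
	}
}

func main() {
	fmt.Printf("%+v\n", paramsFor(TaskPrecise))
}
```

With this shape, agents declare only a task type, and tuning the parameters means editing one function.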
* pkg/aflow: keep LLM reply on tool calls | Dmitry Vyukov | 2026-01-26 | 1 | -0/+3
* pkg/aflow: fix Temperature handling | Dmitry Vyukov | 2026-01-26 | 1 | -1/+1
    If LLMAgent.Temperature is assigned an untyped float const (0.5),
    it will be typed as float64 rather than float32. So recast such
    values. Cap Temperature at the model's supported MaxTemperature.
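The typing issue can be reproduced in plain Go: an untyped float constant stored in an interface value takes its default type, float64. The sketch below assumes the field is interface-typed (the commit message implies this but does not say so), and `normalizeTemperature` is a hypothetical helper, not the pkg/aflow function.

```go
package main

import "fmt"

// normalizeTemperature converts an interface-typed temperature to float32
// and caps it at limit. Hypothetical helper for illustration only.
func normalizeTemperature(v any, limit float32) (float32, error) {
	var t float32
	switch x := v.(type) {
	case float32:
		t = x
	case float64: // default type of an untyped float constant such as 0.5
		t = float32(x)
	case int: // default type of an untyped integer constant such as 1
		t = float32(x)
	default:
		return 0, fmt.Errorf("unsupported temperature type %T", v)
	}
	if t > limit {
		t = limit
	}
	return t, nil
}

func main() {
	var temp any = 0.5
	fmt.Printf("%T\n", temp) // prints "float64"
	t, err := normalizeTemperature(2.5, 2)
	fmt.Println(t, err) // prints "2 <nil>"
}
```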
* pkg/aflow: refactor in preparation for DoWhile | Dmitry Vyukov | 2026-01-24 | 1 | -1/+1
    A bunch of NFC refactorings:
    - split action verification into 2 phases (inputs/outputs)
    - change how LLMTool is verified
    - remove some unused fields/parameters
    - improve error messages a bit
* pkg/aflow: unexport Pipeline type | Dmitry Vyukov | 2026-01-23 | 1 | -2/+2
    I've added NewPipeline constructor for a bit nicer syntax, but
    failed to use it in actual workflows. Unexport Pipeline and rename
    NewPipeline to Pipeline. This slightly improves workflows
    definition syntax.
* pkg/aflow: handle empty LLM replies | Dmitry Vyukov | 2026-01-23 | 1 | -0/+2
* pkg/aflow: refactor tests | Dmitry Vyukov | 2026-01-23 | 1 | -1254/+245
    Add helper function that executes test workflows, compares results
    (trajectory, LLM requests) against golden files, and if requested
    updates these golden files.
* pkg/aflow: cache LLM requests | Dmitry Vyukov | 2026-01-21 | 1 | -33/+27
    Using cached replies is faster, cheaper, and more reliable.
    Especially handy during development when the same workflows are
    retried lots of times with some changes.
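A request cache of this kind can be sketched as a map keyed by a hash of the request. The commit message does not describe the real cache key or storage, so `replyCache`, `cacheKey`, and the model and prompt strings below are all assumptions.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// replyCache is a hypothetical in-memory cache of LLM replies.
type replyCache struct {
	replies map[string]string
}

func newReplyCache() *replyCache {
	return &replyCache{replies: make(map[string]string)}
}

// cacheKey hashes the model name and the request text together,
// so the same prompt sent to different models is cached separately.
func cacheKey(model, request string) string {
	sum := sha256.Sum256([]byte(model + "\x00" + request))
	return hex.EncodeToString(sum[:])
}

// query returns the cached reply if present, otherwise calls ask
// and stores its result. Failed requests are not cached.
func (c *replyCache) query(model, request string, ask func() (string, error)) (string, error) {
	key := cacheKey(model, request)
	if reply, ok := c.replies[key]; ok {
		return reply, nil
	}
	reply, err := ask()
	if err != nil {
		return "", err
	}
	c.replies[key] = reply
	return reply, nil
}

func main() {
	cache := newReplyCache()
	calls := 0
	ask := func() (string, error) { calls++; return "reply", nil }
	cache.query("model-a", "explain the crash", ask)
	cache.query("model-a", "explain the crash", ask)
	fmt.Println(calls) // prints "1": the second request is served from the cache
}
```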
* pkg/aflow: handle ints in tool arguments | Dmitry Vyukov | 2026-01-21 | 1 | -3/+6
* pkg/aflow: inject tool errors into trajectory | Dmitry Vyukov | 2026-01-21 | 1 | -11/+83
    Currently we handle several errors in LLMAgent (wrong tool name,
    wrong tool arguments), and return the error to LLM, but nothing is
    injected into the trajectory wrt what happened. This makes the
    trajectory incomplete and confusing: one just sees repeated LLM
    calls w/o understanding what caused them. Inject these tool
    failures into the trace, so that it's clear what happened.
* pkg/aflow: extend TestToolMisbehavior with trajectory checking | Dmitry Vyukov | 2026-01-21 | 1 | -3/+193
* pkg/aflow: ask LLM to call several tools at the same time | Dmitry Vyukov | 2026-01-20 | 1 | -3/+3
    This seems to help a bit with the number of round-trips.
* pkg/aflow: handle common LLM mis-behaviors wrt tool calling | Dmitry Vyukov | 2026-01-20 | 1 | -1/+237
    Gracefully handle (reply to LLM with error):
    - incorrect tool name
    - incorrect tool arg type
    - missing tool arg

    Silently handle:
    - more than one call to set-results
    - excessive tool args

    Fixes #6604
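The argument checks listed above can be sketched as a validation step that runs before a tool is invoked. This is an illustration under assumptions: `toolSpec`, `argKind`, `validateCall`, and the "read-file" tool are hypothetical, and the arguments are assumed to come from JSON decoded into `map[string]any` (where numbers arrive as float64). The duplicate set-results case is not covered here.

```go
package main

import (
	"fmt"
	"math"
)

type argKind int

const (
	kindString argKind = iota
	kindInt
)

// toolSpec lists the arguments a hypothetical tool expects.
type toolSpec struct {
	Args map[string]argKind
}

// validateCall returns the cleaned arguments, or an error whose text
// can be sent back to the LLM so that it can retry the call.
func validateCall(tools map[string]toolSpec, name string, args map[string]any) (map[string]any, error) {
	spec, ok := tools[name]
	if !ok {
		return nil, fmt.Errorf("tool %q does not exist", name)
	}
	clean := make(map[string]any, len(spec.Args))
	for arg, kind := range spec.Args {
		val, ok := args[arg]
		if !ok {
			return nil, fmt.Errorf("tool %q: missing argument %q", name, arg)
		}
		switch kind {
		case kindString:
			if _, ok := val.(string); !ok {
				return nil, fmt.Errorf("tool %q: argument %q must be a string", name, arg)
			}
		case kindInt:
			f, ok := val.(float64)
			if !ok || math.Trunc(f) != f {
				return nil, fmt.Errorf("tool %q: argument %q must be an integer", name, arg)
			}
			val = int(f)
		}
		clean[arg] = val
	}
	// Arguments not named in the spec are silently dropped.
	return clean, nil
}

func main() {
	tools := map[string]toolSpec{
		"read-file": {Args: map[string]argKind{"path": kindString, "line": kindInt}},
	}
	clean, err := validateCall(tools, "read-file",
		map[string]any{"path": "kernel/fork.c", "line": 42.0, "extra": true})
	fmt.Println(clean, err)
	_, err = validateCall(tools, "read-fil", nil)
	fmt.Println(err)
}
```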
* pkg/aflow: handle model quota errors | Dmitry Vyukov | 2026-01-20 | 1 | -0/+22
    Detect model quota violations (assumed to be RPD). Make syz-agent
    not request jobs that use the model until the next quota reset
    time.

    Fixes #6573
* pkg/aflow: make LLM model per-agent rather than per-flow | Dmitry Vyukov | 2026-01-20 | 1 | -7/+35
    Having the LLM model per-agent is even more flexible than per-flow.
    We can run some more complex tasks during patch generation with
    the most elaborate model, and some simpler ones with less
    elaborate models.
* pkg/aflow: add ability to generate several candidate replies for LLM agents | Dmitry Vyukov | 2026-01-19 | 1 | -57/+393
    Add LLMAgent.Candidates parameter. If set to a value N>1, then
    the agent is invoked N times, and all outputs become slices.
    The results can be later aggregated by another agent, as shown
    in the test.
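The "invoke N times, collect outputs into a slice" behavior can be sketched with a generic helper. `runCandidates` is a hypothetical function, not the pkg/aflow implementation, and the string outputs stand in for real agent replies.

```go
package main

import "fmt"

// runCandidates invokes run n times and collects the outputs into a slice.
// Values of n below 1 are treated as a single invocation.
func runCandidates[T any](n int, run func(i int) (T, error)) ([]T, error) {
	if n < 1 {
		n = 1
	}
	out := make([]T, 0, n)
	for i := 0; i < n; i++ {
		v, err := run(i)
		if err != nil {
			return nil, err
		}
		out = append(out, v)
	}
	return out, nil
}

func main() {
	replies, err := runCandidates(3, func(i int) (string, error) {
		return fmt.Sprintf("candidate %d", i), nil
	})
	fmt.Println(replies, err) // prints "[candidate 0 candidate 1 candidate 2] <nil>"
}
```

A later aggregating step would then take the whole slice as its input.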
* pkg/aflow: allow to specify model per-flow | Dmitry Vyukov | 2026-01-14 | 1 | -1/+3
    We may want to use a weaker model for some workflows.
    Allow to use different models for different workflows.
* pkg/aflow: add package for agentic workflows | Dmitry Vyukov | 2026-01-09 | 1 | -0/+554