path: root/syz-cluster/controller
Commit message | Author | Age | Files | Lines
* syz-cluster: refactor Dockerfiles | Aleksandr Nogikh | 2025-12-31 | 1 | -6/+5
  Copy everything into the build context. Add a .dockerignore file to avoid copying the definitely unnecessary files and folders. Check copyrights presence in Dockerfiles.
* syz-cluster: add pkg/osutil to Docker containers | Aleksandr Nogikh | 2025-12-30 | 1 | -0/+1
  It's become necessary after #6533.
* syz-cluster: log finished session status | Aleksandr Nogikh | 2025-08-26 | 1 | -1/+1
  Print a bit more info to the logs to make them more understandable.
* syz-cluster: update Go version in Dockerfiles | Aleksandr Nogikh | 2025-07-14 | 1 | -1/+1
  For some reason, it does not download the newer toolchain versions automatically.
* syz-cluster: generate web dashboard URLs for reports | Aleksandr Nogikh | 2025-07-14 | 2 | -8/+4
  Take web dashboard URL from the config and use it to generate links for logs, reproducers, etc.
* syz-cluster: avoid UUIDs in blob store | Aleksandr Nogikh | 2025-06-17 | 1 | -13/+6
  Make blob store URIs dependent on the IDs explicitly passed into the Write() function. In many cases this removes the need to distinguish between the case when the object has already been saved and we must overwrite it and when it's saved the first time. Keep on first storing the object to the blob storage and only then submitting the entities to Spanner. This will lead to some wasted space, but we'll add garbage collection at some point.
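  The idea of ID-derived blob URIs can be sketched as follows. This is only an illustration: the `memStorage` type, the `mem://` scheme, and the variadic `Write` signature are assumptions made for the example and are not taken from the syz-cluster code.

```go
package main

import (
	"fmt"
	"path"
)

// memStorage is a hypothetical in-memory blob store used only for illustration.
type memStorage struct {
	blobs map[string][]byte
}

// Write stores data under a URI derived from the caller-supplied IDs.
// Writing again with the same IDs overwrites the same object, so the caller
// does not need to know whether the blob was saved before.
func (s *memStorage) Write(data []byte, ids ...string) string {
	uri := "mem://" + path.Join(ids...)
	s.blobs[uri] = data
	return uri
}

func main() {
	s := &memStorage{blobs: map[string][]byte{}}
	first := s.Write([]byte("v1"), "Session", "abc", "log")
	second := s.Write([]byte("v2"), "Session", "abc", "log")
	fmt.Println(first, first == second, len(s.blobs))
}
```

  Because the URI is a pure function of the IDs, a retried write lands on the same object instead of creating a new one.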
* syz-cluster: set proper Service types | Aleksandr Nogikh | 2025-05-21 | 1 | -0/+1
  As the cluster is private, use the ClusterIP type to only request a cluster-internal IP. Since web dashboard will need to be exposed via Load Balancer, set the necessary metadata annotation.
* syz-cluster: separate global env from global config | Aleksandr Nogikh | 2025-04-30 | 4 | -34/+35
  Environment variables are convenient for storing values like DB or GCS bucket names, but structured formats are more convenient for the actual service configuration. Separate global-config from global-config-env and add the functionality that queries and parses the config options.
* syz-cluster: clean up running steps of finished workflows | Aleksandr Nogikh | 2025-04-17 | 2 | -1/+80
  If the workflow step crashed or timed out, we used to have Running status for such steps even though the session itself may be long finished. In order to prevent this inconsistency, on finishing each session go through all remaining running steps and update their status to Error.
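  A minimal sketch of that cleanup pass, assuming a simple in-memory representation. The `step` type, the status strings, and `closeRunningSteps` are hypothetical names chosen for the example; the real code operates on database entities.

```go
package main

import "fmt"

// Status values are assumptions for this sketch, not the project's constants.
const (
	statusRunning  = "running"
	statusFinished = "finished"
	statusError    = "error"
)

// step is a hypothetical stand-in for a stored workflow step.
type step struct {
	Name   string
	Status string
}

// closeRunningSteps marks every step that is still running as failed and
// returns how many steps were updated. It is meant to be called once the
// owning session has finished.
func closeRunningSteps(steps []step) int {
	changed := 0
	for i := range steps {
		if steps[i].Status == statusRunning {
			steps[i].Status = statusError
			changed++
		}
	}
	return changed
}

func main() {
	steps := []step{
		{Name: "build", Status: statusFinished},
		{Name: "fuzz", Status: statusRunning},
	}
	n := closeRunningSteps(steps)
	fmt.Println(n, steps)
}
```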
* syz-cluster: add pagination | Aleksandr Nogikh | 2025-04-08 | 1 | -1/+1
  Add simple Previous/Next navigation for the list of series. For now, just rely on SQL's LIMIT/OFFSET functionality.
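  One common way to drive Previous/Next links from LIMIT/OFFSET is sketched below. It fetches one extra row to learn whether a next page exists; that trick, and the names `pageQuery`, `queryFor`, and `trim`, are assumptions for illustration rather than what the commit necessarily does.

```go
package main

import "fmt"

// pageQuery holds the values to bind to LIMIT and OFFSET in the SQL query.
type pageQuery struct {
	Limit  int
	Offset int
}

// queryFor computes LIMIT/OFFSET for a zero-based page number.
// It asks for one row more than a page holds, so that the caller can tell
// whether a Next link is needed without a separate COUNT query.
func queryFor(page, perPage int) pageQuery {
	if page < 0 {
		page = 0
	}
	return pageQuery{Limit: perPage + 1, Offset: page * perPage}
}

// trim cuts the fetched rows down to one page and reports whether more exist.
func trim(rows []string, perPage int) ([]string, bool) {
	if len(rows) > perPage {
		return rows[:perPage], true
	}
	return rows, false
}

func main() {
	q := queryFor(2, 10)
	fmt.Printf("LIMIT %d OFFSET %d\n", q.Limit, q.Offset)
	rows, hasNext := trim([]string{"a", "b", "c"}, 2)
	fmt.Println(rows, hasNext)
}
```

  A Previous link is needed whenever the page number is greater than zero.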
* syz-cluster: better handle SeriesProcessor restarts | Aleksandr Nogikh | 2025-04-02 | 2 | -3/+10
  If the Loop() was restarted in between the moment we marked the session as started in the DB and the moment we actually started the workflow, there was no way back to the normal operation. That was the reason of the sporadic TestProcessor failures we've seen in the presubmit tests. Handle this case in the code by just continuing the non-finished calls.
  Closes #5776.
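  The recovery rule can be pictured as a small decision function. This is a sketch under assumptions: the `session` fields and the action names are hypothetical, and the real processor works with DB records and Argo workflows.

```go
package main

import "fmt"

// session is a hypothetical view of a stored fuzzing session.
type session struct {
	ID       string
	Started  bool // marked as started in the DB
	Finished bool // marked as finished in the DB
}

// nextAction decides what the processing loop should do with a session when
// it (re)starts. A session that was marked as started but never finished is
// resumed instead of being left in limbo.
func nextAction(s session) string {
	switch {
	case s.Finished:
		return "skip"
	case s.Started:
		return "resume"
	default:
		return "start"
	}
}

func main() {
	for _, s := range []session{
		{ID: "a"},
		{ID: "b", Started: true},
		{ID: "c", Started: true, Finished: true},
	} {
		fmt.Println(s.ID, nextAction(s))
	}
}
```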
* syz-cluster: display and filter by Cc list | Aleksandr Nogikh | 2025-03-11 | 1 | -1/+1
  For each series, display the Cc'd email list and let users filter the patch series list by those addresses.
* syz-cluter: define a service account for services | Aleksandr Nogikh | 2025-02-26 | 1 | -0/+1
  For minikube, it changes nothing, but it will make it easier to plug it into GKE.
* syz-cluster: make image prefix and tag configurable | Aleksandr Nogikh | 2025-02-26 | 1 | -1/+1
  Accept IMAGE_PREFIX and IMAGE_TAG parameters that allow to reuse the Makefile and a lot of k8s configurations both for local and prod environments. Refactor Makefile: define build-* and push-* rules, use templates to avoid repetition.
* syz-cluster/controller: add more logging | Aleksandr Nogikh | 2025-02-19 | 1 | -3/+5
  That should hopefully shed more light on #5776.
* syz-cluster/controller: move the API server to pkg/controller | Aleksandr Nogikh | 2025-02-14 | 4 | -347/+7
  This will facilitate its reuse in tests.
* syz-cluster/controller: move services to pkg/service | Aleksandr Nogikh | 2025-02-14 | 2 | -312/+16
  This will facilitate the reuse of the code. Split off SessionService from SeriesService.
* syz-cluster: report series/sessions via API | Aleksandr Nogikh | 2025-02-14 | 6 | -133/+166
  In the previous version of the code, series-tracker was directly pushing patch series into the DB and the controller auto-created fuzzing sessions. Mediate these via the controller API instead. Instead of creating Session objects on the fly, pre-create them and let processor take them one by one.
  The approach has multiple benefits:
  1) The same API might be used for the patch series sources other than LKML.
  2) If the existence of Session objects is not a sign that we have started working on it, it allows for a more precise status display (not created/waiting/running/finished).
  3) We could manually push older patch series and manually trigger fuzzing sessions to experimentally measure the bug detection rates.
  4) The controller tests could be organized only by relying on the API offered by the component.
* syz-cluster: set resource limits | Aleksandr Nogikh | 2025-02-04 | 1 | -0/+7
  It will be important once we deploy to GKE. For now, let's set just some limits, we'll adjust them over time.
* syz-cluster: use GCS as blob storage | Aleksandr Nogikh | 2025-02-04 | 2 | -7/+1
  We already use a GCS emulator for the dev environment, use a separate bucket for blobs. Keep using the local storage driver for unit tests.
* syz-cluster: store session test logs | Aleksandr Nogikh | 2025-02-04 | 3 | -8/+29
  Record the logs from the build and fuzzing steps.
* syz-cluster/controller: handle a channel closure | Aleksandr Nogikh | 2025-01-28 | 1 | -0/+3
  If we don't consider the possibility, we risk processing a nil value and causing a nil pointer dereference.
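  The hazard comes from Go's channel semantics: receiving from a closed channel of pointers yields nil immediately. A minimal sketch of the guard follows, with a hypothetical `result` type and `countOK` helper that are not taken from the controller code.

```go
package main

import "fmt"

// result is a hypothetical message type passed over the channel.
type result struct {
	Err error
}

// countOK reads results until the channel is closed and returns the number
// of successful ones. The two-value receive form detects the closure;
// without the ok check, res would be nil after close and res.Err would panic.
func countOK(ch <-chan *result) int {
	n := 0
	for {
		res, ok := <-ch
		if !ok {
			return n // channel closed, nothing more to process
		}
		if res.Err == nil {
			n++
		}
	}
}

func main() {
	ch := make(chan *result, 2)
	ch <- &result{}
	ch <- &result{Err: fmt.Errorf("build failed")}
	close(ch)
	fmt.Println(countOK(ch))
}
```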
* syz-cluster: use Request.Context() | Aleksandr Nogikh | 2025-01-27 | 1 | -11/+7
* syz-cluster: order session tests by date | Aleksandr Nogikh | 2025-01-27 | 1 | -0/+2
  This gives a more natural order than just the names.
* syz-cluster: explicitly set the skip reason | Aleksandr Nogikh | 2025-01-27 | 2 | -0/+30
  It lets immediately distinguish the series that were actually processed from the series that were skipped early on. By storing a string, we also make it apparent why exactly the series was skipped.
* syz-cluster/controller: make parallelization configurable | Aleksandr Nogikh | 2025-01-22 | 2 | -19/+28
  Configure the number of patch series processed in parallel via an env variable.
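  A sketch of how such a setting could be read and applied. The variable name `PARALLEL_WORKERS`, the default of 1, and the helpers `parseWorkers` and `processAll` are assumptions for illustration; the commit does not name them here.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"sync"
)

// parseWorkers converts the raw env value into a worker count, falling back
// to def when the variable is unset.
func parseWorkers(raw string, def int) (int, error) {
	if raw == "" {
		return def, nil
	}
	n, err := strconv.Atoi(raw)
	if err != nil {
		return 0, fmt.Errorf("invalid worker count %q: %w", raw, err)
	}
	if n < 1 {
		return 0, fmt.Errorf("worker count must be positive, got %d", n)
	}
	return n, nil
}

// processAll handles every series, running at most `workers` handlers at once.
// A buffered channel serves as a counting semaphore.
func processAll(series []string, workers int, handle func(string)) {
	sem := make(chan struct{}, workers)
	var wg sync.WaitGroup
	for _, s := range series {
		wg.Add(1)
		sem <- struct{}{} // blocks while `workers` handlers are active
		go func(s string) {
			defer wg.Done()
			defer func() { <-sem }()
			handle(s)
		}(s)
	}
	wg.Wait()
}

func main() {
	// PARALLEL_WORKERS is a hypothetical variable name.
	workers, err := parseWorkers(os.Getenv("PARALLEL_WORKERS"), 1)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	processAll([]string{"series-1", "series-2", "series-3"}, workers, func(s string) {
		fmt.Println("processing", s)
	})
}
```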
* syz-cluster: add support for findings | Aleksandr Nogikh | 2025-01-22 | 3 | -24/+151
  Findings are crashes and build/boot/test errors that happened during the patch series processing.
* syz-cluster: initial code | Aleksandr Nogikh | 2025-01-22 | 11 | -0/+876
  The basic code of a K8S-based cluster that:
    * Aggregates new LKML patch series.
    * Determines the kernel trees to apply them to.
    * Builds the basic and the patched kernel.
    * Displays the results on a web dashboard.
  This is a very rudimentary version with a lot of TODOs that provides a skeleton for further work. The project makes use of Argo workflows and Spanner DB. Bootstrap is used for the web interface.
  Overall structure:
    * syz-cluster/dashboard: a web dashboard listing patch series and their test results.
    * syz-cluster/series-tracker: polls Lore archives and submits the new patch series to the DB.
    * syz-cluster/controller: schedules workflows and provides API for them.
    * syz-cluster/kernel-disk: a cron job that keeps a kernel checkout up to date.
    * syz-cluster/workflow/*: workflow steps.
  For the DB structure see syz-cluster/pkg/db/migrations/*.