path: root/dashboard/app/api.go
Commit message | Author | Age | Files | Lines
* dashboard/app: add support for AI workflows (Dmitry Vyukov, 2026-01-05, 1 file, -0/+3)
  Support for:
  - polling for AI jobs
  - handling completion of AI jobs
  - submitting job trajectory logs
  - basic visualization for AI jobs
* all: use any instead of interface{} (Dmitry Vyukov, 2025-12-22, 1 file, -29/+29)
  any is now preferred over interface{} in Go.
* dashboard/app: add typed handler middleware (Dmitry Vyukov, 2025-12-22, 1 file, -121/+53)
  Remove duplicated code related to request deserialization using middleware.
* pkg/coveragedb: update file to subsystem info periodically (Taras Madan, 2025-08-07, 1 file, -6/+1)
  #6070 explains the problem of data propagation.
  1. Add weekly /cron/update_coverdb_subsystems.
  2. Stop updating subsystems from the coverage receiver API.
* all: simplify subsystem revision updates (Aleksandr Nogikh, 2025-07-23, 1 file, -2/+5)
  Don't specify the subsystem revision in the dashboard config; instead, let it be nested in the registered subsystems. This reduces the manual work needed to switch syzbot to a newer subsystem list.
* dashboard: fix In-Reply-To for send_email (Aleksandr Nogikh, 2025-07-23, 1 file, -1/+6)
  For some reason, the default ReplyTo field is converted to a Reply-To header that's not well understood by the mailing lists. Set an In-Reply-To header explicitly.
* dashboard: add send_email method (Aleksandr Nogikh, 2025-07-03, 1 file, -0/+17)
  The API method can be used to send a raw email on behalf of the GAE instance.
* dashboard/app: allow to set spanner context only from tests (Taras Madan, 2025-05-05, 1 file, -2/+1)
  getSpannerClient returns the prod client by default.
* pkg/gcs: simplify interface, remove proxy type (Taras Madan, 2025-04-02, 1 file, -6/+2)
* dashboard/app: pre-gzip all responses (Taras Madan, 2025-02-05, 1 file, -8/+8)
* pkg/coveragedb: store information about covered file functions in db (Taras Madan, 2025-01-29, 1 file, -1/+1)
* dashboard/app: test coverage /file link (Taras Madan, 2025-01-27, 1 file, -7/+2)
  1. Init the coveragedb client once and propagate it through context to enable mocking.
  2. Always init coverage handlers; it simplifies testing.
  3. Read webGit and the coveragedb client from ctx to make them mockable.
  4. Use int for file line numbers and int64 for merged coverage.
  5. Add tests.
* all: use min/max functions (Dmitry Vyukov, 2025-01-17, 1 file, -24/+8)
  They are shorter, more readable, and don't require temp vars.
* dashboard/app: gcsPayloadHandler ungzip the gcs content (Taras Madan, 2024-12-20, 1 file, -1/+11)
  All API handlers expect the input stream to be ungzipped; this handler shouldn't be an exception.
* dashboard/app: fix parsing bug and accept "gs://" prefixes (Taras Madan, 2024-12-20, 1 file, -5/+5)
  Bug: the payload is serialized JSON, not a string.
* dashboard/app: fix create_upload_url bug (Taras Madan, 2024-12-19, 1 file, -1/+1)
  create_upload_url: failed to unmarshal response: json: cannot unmarshal number into Go value of type string
* pkg/coveragedb: test SaveMergeResult (Taras Madan, 2024-12-19, 1 file, -3/+10)
  1. Make the interface testable.
  2. Add Spanner interfaces.
  3. Generate mocks for proxy interfaces.
  4. Test SaveMergeResult.
  5. Test MergeCSVWriteJSONL and coveragedb.SaveMergeResult integration.
* tools/syz-covermerger: upload coverage as jsonl (Taras Madan, 2024-12-19, 1 file, -23/+13)
  The previous implementation stored only a summary of the processed records. The summary was <1GB, and a single processing node was able to manipulate the data.

  The current implementation stores all the details of the records read, to make post-processing more flexible. This change was needed to get access to the source manager name and will help to analyze other details.

  This new implementation requires 20GB of memory to process a single day of records. A CSV log interning experiment allowed merging with 10GB. Quarterly data aggregation would cost ~100 times more.

  The alternative is stream processing: we can process data kernel-file-by-file, which cuts memory consumption by a factor of ~15000. This approach is implemented here. We batch coverage signals by file and store per-file results in a GCS JSONL file.

  See https://jsonlines.org/ to learn about jsonl.
* dashboard/app: upload coverage using GCS bucket (Taras Madan, 2024-12-17, 1 file, -66/+115)
* dashboard/app: dedup error messages (Taras Madan, 2024-12-13, 1 file, -9/+7)
* dashboard/app: deserialize data directly from gz reader (Taras Madan, 2024-12-13, 1 file, -53/+49)
* tools/syz-covermerger: more logs (Taras Madan, 2024-12-12, 1 file, -2/+2)
  I don't see any visible problems, but the records in the DB are not created. Let's report the number of records created at the end of the batch step, and also log the names of the managers.
* prog: annotate image assets with fsck logs (Florent Revest, 2024-12-09, 1 file, -7/+16)
  Syscall attributes are extended with a fsck command field, which lets file system mount definitions specify a fsck-like command to run. This is required because every file system has a custom fsck command invocation style.

  When uploading a compressed image asset to the dashboard, syz-manager also runs the fsck command and logs its output over the dashapi. The dashboard logs these fsck logs into the database.

  This has been requested by fs maintainer Ted Tso, who would like to quickly understand whether a filesystem is corrupted or not before looking at a reproducer in more detail. Ultimately, this could be used as an early triage signal to determine whether a bug is obviously critical.
* dashboard: const attempts in db.RunInTransaction (Sabyrzhan Tasbolatov, 2024-10-28, 1 file, -7/+12)
  Use a common number of attempts (10) everywhere in dashboard/app for db.RunInTransaction() via the custom wrapper runInTransaction(). This should fix the issues where ErrConcurrentTransaction occurs after the default number of attempts (3). If the max limit (10) is not enough, that should hint at a problem in the transaction function itself for further debugging.

  For the very valuable transactions in createBugForCrash() and reportCrash(), let's leave 30 attempts as it is currently.

  Fixes: https://github.com/google/syzkaller/issues/5441
* dashboard: optimize reportCrash() (Aleksandr Nogikh, 2024-10-28, 1 file, -10/+33)
  Instead of iterating over all possible seq values, first query the maximum existing value outside of the transaction and then only consider the higher numbers inside the transaction. That should avoid the "operating on too many entity groups in a single transaction" errors that we observe on syzbot.

  Closes #5440.
* dashboard: refactor relevantBackportJobs() (Aleksandr Nogikh, 2024-09-12, 1 file, -4/+4)
  Make it return a single slice.
* dashboard: cut out tails instead of heads in putText() (Aleksandr Nogikh, 2024-09-11, 1 file, -13/+21)
  For some kinds of logs (e.g. crash logs), it's preferable to leave the tail part because it has more important information. Since for other kinds of logs we don't expect any cuts to happen, let's just always cut like this. Refactor the putText() function and its usages in the dashboard.

  Closes #5250.
* dashboard/app: priority of revoked & no repro (Sabyrzhan Tasbolatov, 2024-09-09, 1 file, -8/+4)
  If there are non-revoked reproducers, we will always prefer them to the revoked ones, and we do update the priority once we have revoked a reproducer. But if the only reproducer was revoked, we still give that crash a higher priority than other crashes which never had a reproducer attached. If "repro revoked" should have the same priority as "no repro", then we just need to update Crash.UpdateReportingPriority.

  Fixes: https://github.com/google/syzkaller/issues/4992
* pkg/spanner/coveragedb: move package to pkg/coveragedb (Taras Madan, 2024-08-29, 1 file, -1/+1)
* dashboard/app/api.go: constant time compare secrets (Taras Madan, 2024-07-24, 1 file, -2/+4)
  Let's resist timing attacks.
* dashboard/app/api.go: handleAPI can return less 5xx errors (Taras Madan, 2024-07-24, 1 file, -12/+7)
* dashboard/app/api.go: more logging (Taras Madan, 2024-07-24, 1 file, -5/+5)
  We're currently receiving only "unauthorized".
* all: move spanner writes to dashboard/app (Taras Madan, 2024-07-23, 1 file, -0/+37)
  dashboard/app knows more about subsystems.
* dashboard: specify the types of LogToRepro replies (Aleksandr Nogikh, 2024-07-22, 1 file, -0/+2)
* dashboard: better filter logs for reproduction (Aleksandr Nogikh, 2024-07-19, 1 file, -0/+3)
  Filter out the obviously unnecessary bugs.
* dashboard: support manual reproduction requests (Aleksandr Nogikh, 2024-06-25, 1 file, -0/+53)
  As it is problematic to set up automatic bidirectional sharing of reproducer files between namespaces, let's support the ability to manually request a reproduction attempt on a specific syz-manager instance. That should help for the time being.
* dashboard: collect and display repro logs (Aleksandr Nogikh, 2024-05-27, 1 file, -0/+3)
  This will help us understand exactly how we arrived at the reproducer.
* all: fix up context import after go fix (Dmitry Vyukov, 2024-04-26, 1 file, -1/+1)
* all: go fix everything (Dmitry Vyukov, 2024-04-26, 1 file, -1/+1)
* dashboard: make errors more detailed (Aleksandr Nogikh, 2024-04-11, 1 file, -7/+7)
  We're periodically seeing "failed to report build error" errors on the CI, but the error message is missing necessary details.
* dashboard: retest missing backports (Aleksandr Nogikh, 2024-02-05, 1 file, -0/+37)
  Poll missing backport commits and, once a missing backport is found to be present in the fuzzed trees, don't display it on the web dashboard and send an email with the suggestion to close the bug.

  Closes #4425.
* all: record diverted bug reproductions (Aleksandr Nogikh, 2024-01-30, 1 file, -6/+24)
  In some cases, we derail during bug reproduction and end up finding and reporting a reproducer for another issue. It causes no problems since it's bucketed using the new title, but it's difficult to trace such situations: on the original bug page there are no failed reproduction logs, and on the new bug page it looks as if we intended to find a reproducer for the bug with the new title.

  Let's record bug reproduction logs in this case, so that we'd also see a failed bug reproduction attempt on the original bug page.
* all: restart unfinished bug reproductions (Aleksandr Nogikh, 2024-01-30, 1 file, -0/+71)
  There are cases when syz-manager is killed before it can finish bug reproduction. If the bug is frequent, it's not a problem: we might have more luck next time. However, if the bug happened only once, we risk never finding a repro.

  Let syz-managers periodically query the dashboard for crash logs to reproduce. Later we can reuse the same API to move the repro sharing functionality out of syz-hub.
* dashboard: capture cover and PCs after corpus triage (Aleksandr Nogikh, 2024-01-23, 1 file, -0/+6)
  These statistics allow us to better estimate the amount of coverage that is lost every time a syzbot instance is restarted.
* dashboard: introduce an emergency stop mode (Aleksandr Nogikh, 2024-01-09, 1 file, -0/+29)
  Add an emergency stop button that can be used by any admin. After it's clicked two times, syzbot stops all reporting and recording of new bugs. It's assumed that the stop mode is revoked by manually deleting an entry from the database.
* dashboard: cache per-ns manager lists (Aleksandr Nogikh, 2023-10-25, 1 file, -0/+1)
  The query may take up to 100ms in some cases, while the result changes only on rare occasions. Let's use a cached version of the data when rendering UI pages.

  We don't need extra tests because it's already exercised in existing tests that trigger web endpoints.
* dashboard: remove too granular config helpers (Aleksandr Nogikh, 2023-10-12, 1 file, -1/+1)
  Now that we mock the config as a whole and not parts of it, these functions have boiled down to 1-liners. We don't need them anymore.
* dashboard: introduce a getNsConfig() helper (Aleksandr Nogikh, 2023-10-12, 1 file, -7/+7)
  In many cases we want to just access the namespace's config. Introduce a special helper function to keep the code shorter and more concise.
* dashboard: access config through context (Aleksandr Nogikh, 2023-10-12, 1 file, -13/+13)
  We used to have a single global `config` variable and access it throughout the whole dashboard application. However, this approach has been complicating test writing more and more: sometimes we want the config to be only slightly different, so that it's not worth adding new namespaces, and sometimes we have to test how the dashboard handles config changes over time.

  This has already led to a number of hacky contextWithXXX methods that mocked various parts of the global variable. The rest of the code had to sometimes still use `config` directly and sometimes invoke getXXX(c) methods. This is very inconsistent and prone to errors.

  With more and more situations appearing where we need to patch the config (see #4118), let's refactor the application to always access the config via the getConfig(c) method. This allows us to uniformly patch the config and be sure that the non-patched copy is not accessible from anywhere else.
* dashboard: improve crash priority calculation (Aleksandr Nogikh, 2023-08-30, 1 file, -1/+3)
  If there's no config record for the manager, activeManager would return a nil pointer to ConfigManager. Assume the priority to be 0 in this case.

  Update the admin.go function that recalculates priorities.