
feat: add github/openssf scrapers #1875

Open
moshloop wants to merge 5 commits into main from security-scrapers

Conversation


@moshloop moshloop commented Jan 23, 2026

Summary by CodeRabbit

  • New Features

    • Added GitHub Security scraper for Dependabot, code scanning, and secret scanning alerts.
    • Added OpenSSF Scorecard scraper for repository security assessments.
    • Enhanced debugging and result merging capabilities for scraped data.
  • Bug Fixes

    • Improved error handling and validation in JSON processing and AWS Route53 configuration.
  • Chores

    • Added test fixtures and supporting infrastructure for new scrapers.



coderabbitai bot commented Jan 23, 2026

Walkthrough

This PR introduces two new security-focused scrapers (GitHub Security and OpenSSF Scorecard), extends the API with enhanced result formatting methods, adds support for change redirection across configurations, and implements new CLI flags for result export and database persistence.

Changes

  • GitHub Security Scraper (api/v1/github_security.go, scrapers/github/security.go, scrapers/github/security_client.go): New scraper type for monitoring GitHub repository security alerts (Dependabot, code scanning, secret scanning) with incremental scraping support and rate-limit handling.
  • OpenSSF Scorecard Scraper (api/v1/openssf.go, scrapers/openssf/client.go, scrapers/openssf/scorecard.go, scrapers/openssf/compliance.go): New scraper type for assessing repository security posture via the OpenSSF Scorecard API, with caching, retry logic, and compliance framework mappings.
  • Core API Extensions (api/v1/types.go, api/v1/scrapeconfig_types.go, api/context.go): Added GitHubSecurity and OpenSSFScorecard fields to ScraperSpec; introduced IsDebug() and IsTrace() accessor methods on ScrapeConfig and ScrapeContext.
  • Result Formatting & Aggregation (api/v1/interface.go, api/v1/common.go): New methods for rendering ScrapeResult and related types (Debug, Pretty, PrettyRow, PrettyShort); new FullScrapeResults type and MergeScrapeResults function for aggregating multiple scrape outputs.
  • Generated Deep Copy Methods (api/v1/zz_generated.deepcopy.go): Added DeepCopyInto and DeepCopy methods for the new types (GitHubSecurity, OpenSSFScorecard, and related filter and repository types) and updated the ScraperSpec deep copy logic.
  • Change Redirection Support (scrapers/changes/rules.go, scrapers/changes/rules_test.go): Extended ChangeMapping with ConfigID, ConfigType, and ScraperID fields to support redirecting changes across configurations; added test cases for change redirection scenarios.
  • Scraper Registry Integration (scrapers/common.go): Registered the new GitHub Security and OpenSSF Scorecard scrapers in the public All scrapers slice.
  • CLI and Result Output (cmd/run.go): Added --export and --save flags; new result merging logic that aggregates all scrape results and outputs via pretty-printing or database persistence.
  • Processing and Utilities (scrapers/processors/json.go, scrapers/processors/script.go, scrapers/sql/sql.go, scrapers/aws/aws.go): Enhanced error handling and logging in the JSON and script processors; added debug logging for SQL queries; minor AWS Route53 alias handling improvements.
  • Test Fixtures & Data (fixtures/github-security.yaml, fixtures/openssf.yaml, scrapers/github/testdata/security/*, scrapers/openssf/testdata/*): New YAML scraper configurations and JSON test datasets for GitHub security alerts (Dependabot, code scanning, secret scanning) and OpenSSF scorecard responses.

Sequence Diagram(s)

sequenceDiagram
    participant Runner as Run Command
    participant Scraper as GitHub Security<br/>Scraper
    participant Client as GitHub<br/>Security Client
    participant API as GitHub API
    participant Formatter as Result<br/>Formatter
    participant Output as Output Handler<br/>(File/DB/Print)

    Runner->>Scraper: Scrape(ctx)
    Scraper->>Client: NewGitHubSecurityClient(config)
    Scraper->>Client: Iterate repositories
    Client->>Client: CheckRateLimit()
    alt Rate Limited
        Client-->>Scraper: Wait duration
        Scraper->>Scraper: Sleep
    end
    Client->>API: GetDependabotAlerts()
    API-->>Client: Alert data
    Client->>API: GetCodeScanningAlerts()
    API-->>Client: Alert data
    Client->>API: GetSecretScanningAlerts()
    API-->>Client: Alert data
    Client-->>Scraper: Aggregated alerts
    Scraper->>Scraper: Calculate health<br/>status
    Scraper-->>Runner: ScrapeResult
    Runner->>Formatter: Render result
    Formatter-->>Output: Formatted output
sequenceDiagram
    participant Runner as Run Command
    participant Scraper as OpenSSF<br/>Scorecard Scraper
    participant Cache as ScrapeTime<br/>Cache
    participant Client as Scorecard<br/>API Client
    participant API as OpenSSF<br/>Scorecard API
    participant Processor as Result<br/>Processor

    Runner->>Scraper: Scrape(ctx)
    Scraper->>Cache: Check LastScorecardScrapeTime
    alt Cached & Valid
        Scraper-->>Runner: Skip (TTL active)
    else Fetch Required
        Scraper->>Client: GetRepositoryScorecard()
        Client->>API: GET /api/projects/{owner}/{repo}
        alt Retryable Error
            Client->>Client: Retry (up to 3x)
        end
        API-->>Client: ScorecardResponse
        Client-->>Scraper: Scorecard data
        Scraper->>Processor: Process checks
        Processor->>Processor: Filter by MinScore
        Processor->>Processor: Create analyses per check
        Processor-->>Scraper: Structured results
        Scraper->>Cache: Update LastScorecardScrapeTime
        Scraper-->>Runner: ScrapeResult[]
    end
sequenceDiagram
    participant Cmd as Run Command
    participant Scrapers as All Scrapers
    participant Merger as Result Merger
    participant Exporter as File Exporter
    participant Printer as Pretty Printer
    participant DB as Duty Database

    Cmd->>Scrapers: Execute all scrapers
    Scrapers-->>Cmd: ScrapeResults[]
    Cmd->>Merger: MergeScrapeResults(results)
    Merger->>Merger: Aggregate Configs
    Merger->>Merger: Aggregate Changes
    Merger->>Merger: Aggregate Analysis
    Merger-->>Cmd: FullScrapeResults
    alt --export flag set
        Cmd->>Exporter: Export resources to files
        Exporter-->>Cmd: Write complete
    end
    alt No --export flag
        Cmd->>Printer: Print merged results
        Printer-->>Cmd: Formatted output
    end
    alt --save flag set & DB configured
        Cmd->>DB: Persist results
        DB-->>Cmd: Save complete
        Cmd->>Printer: Print summary
    end

Possibly related PRs

  • Pr/exec #1787: Modifies ScrapeConfig-related type discrimination logic in api/v1/scrapeconfig_types.go, directly related to the changes introduced for GitHubSecurity and OpenSSFScorecard type recognition.

Suggested reviewers

  • adityathebe

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)
  • Docstring Coverage ⚠️ Warning: Docstring coverage is 16.67%, below the required threshold of 80.00%. Resolution: write docstrings for the functions that are missing them.

✅ Passed checks (2 passed)
  • Description Check ✅ Passed: Check skipped; CodeRabbit’s high-level summary is enabled.
  • Title check ✅ Passed: The PR title 'feat: add github/openssf scrapers' clearly and concisely summarizes the main change, adding two new scraper implementations (GitHub Security and OpenSSF Scorecard), as evidenced by the multiple new files and type definitions across the codebase.



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
api/v1/types.go (1)

130-136: ApplyPlugin skips GitHubSecurity/OpenSSFScorecard.

New scraper specs won’t receive BaseScraper plugin settings, so plugin labels/transforms won’t apply.

🛠️ Proposed fix
 	for i := range spec.GithubActions {
 		spec.GithubActions[i].BaseScraper = spec.GithubActions[i].BaseScraper.ApplyPlugins(plugins...)
 	}
+
+	for i := range spec.GitHubSecurity {
+		spec.GitHubSecurity[i].BaseScraper = spec.GitHubSecurity[i].BaseScraper.ApplyPlugins(plugins...)
+	}
+
+	for i := range spec.OpenSSFScorecard {
+		spec.OpenSSFScorecard[i].BaseScraper = spec.OpenSSFScorecard[i].BaseScraper.ApplyPlugins(plugins...)
+	}
 
 	for i := range spec.Azure {
 		spec.Azure[i].BaseScraper = spec.Azure[i].BaseScraper.ApplyPlugins(plugins...)
 	}
🤖 Fix all issues with AI agents
In `@api/context.go`:
- Around line 161-167: IsDebug() and IsTrace() may dereference a nil
scrapeConfig; update both methods to guard against nil (and nil Spec if Spec is
a pointer) similarly to ScraperID(): check if ctx.scrapeConfig == nil (and
ctx.scrapeConfig.Spec == nil when applicable) and return false if nil, otherwise
call and return ctx.scrapeConfig.Spec.IsDebug() / IsTrace(); this prevents
panics when WithScrapeConfig() hasn't been called.

In `@api/v1/github_security.go`:
- Around line 43-52: ParseMaxAge currently calls time.ParseDuration which
doesn't accept the "d" day suffix; update GitHubSecurityFilters.ParseMaxAge to
detect a trailing "d" (case-insensitive), parse the numeric portion as an
integer or float representing days, convert to a time.Duration via days * 24 *
time.Hour, and return that; for all other inputs keep using time.ParseDuration
and propagate its error. Ensure the implementation handles empty MaxAge (already
returns 0,nil), returns a clear error on invalid numeric parsing, and references
the MaxAge field and ParseMaxAge method when making the change.

In `@api/v1/openssf.go`:
- Around line 10-11: The MinScore field currently uses *float64 which breaks
controller-gen; change MinScore to a string pointer (e.g., *string) and add
validation/tagging as needed in the struct, then update the scraper logic to
parse the string into a float64 (handle nil/parse errors) wherever MinScore is
consumed; alternatively, if you must keep a float, add the kubebuilder CRD
marker // +kubebuilder:validation:Type=number and ensure controller-gen is
invoked with crd:allowDangerousTypes=true—locate and update the MinScore field
in the api/v1/openssf.go struct for these changes.

In `@scrapers/github/security_client.go`:
- Around line 121-133: GetSecurityAdvisories currently omits pagination so
results are limited to the API default; update the request options in
GetSecurityAdvisories to copy AlertListOptions.Page and AlertListOptions.PerPage
into the github.ListRepositorySecurityAdvisoriesOptions (alongside State) before
calling c.Client.SecurityAdvisories.ListRepositorySecurityAdvisories so callers
can control Page and PerPage like the other methods (GetDependabotAlerts,
GetCodeScanningAlerts, GetSecretScanningAlerts).

In `@scrapers/github/security.go`:
- Around line 94-106: ScrapeResult.CreatedAt is typed as *time.Time but the code
passes a time.Time from repo.CreatedAt.GetTime(); fix by creating a local
variable to hold the returned time (e.g., tm := repo.CreatedAt.GetTime()) and
assign CreatedAt to its address (CreatedAt: &tm) when constructing the
v1.ScrapeResult; update the construction in the block that builds the
ScrapeResult (v1.ScrapeResult / CreatedAt) so it uses the pointer to the local
time variable.

In `@scrapers/openssf/compliance.go`:
- Around line 10-11: The file-level comment above ComplianceMappings incorrectly
cites "plugins/vite-plugin-security.js" as the data source; update the comment
to reference the correct source (e.g., the OpenSSF Scorecard documentation or
the actual mapping source used in this repo) so it accurately documents where
the compliance mapping originates; locate the top comment near the
ComplianceMappings declaration in scrapers/openssf/compliance.go and replace the
incorrect reference with the proper source name/URL (or remove the irrelevant
Vite/Vue reference) while keeping the explanatory text about mapping OpenSSF
Scorecard checks to compliance framework controls.

In `@scrapers/openssf/scorecard.go`:
- Around line 274-276: The function scorecardCheckNameToKebab is a no-op (it
replaces "-" with "-" and is unused); either remove the function completely or
implement proper kebab-case conversion and use it where needed—e.g., transform
spaces/underscores to dashes and lowercase (apply to scorecardCheckNameToKebab),
or delete the scorecardCheckNameToKebab declaration and any references to it if
no conversion is required.
🧹 Nitpick comments (4)
scrapers/github/security.go (1)

291-296: createConfigInsights is still a FIXME.

Please implement or track via a follow-up issue so the alert details aren’t silently dropped.

scrapers/github/security_client.go (1)

59-79: Empty state string passed to API unconditionally.

When opts.State is empty, you're passing &opts.State (pointer to empty string) to the API. This may cause unexpected filtering behavior depending on how the GitHub API handles empty state values.

Suggested fix to conditionally set State
 func (c *GitHubSecurityClient) GetDependabotAlerts(ctx context.Context, opts AlertListOptions) ([]*github.DependabotAlert, *github.Response, error) {
 	reqOpts := &github.ListAlertsOptions{
-		State: &opts.State,
 		ListOptions: github.ListOptions{
 			Page:    opts.Page,
 			PerPage: opts.PerPage,
 		},
 	}
 
+	if opts.State != "" {
+		reqOpts.State = &opts.State
+	}
+
 	if opts.Severity != "" {
 		reqOpts.Severity = &opts.Severity
 	}
scrapers/openssf/scorecard.go (2)

74-84: Type assertion without safety check could panic.

Line 76 uses a direct type assertion lastScrape.(time.Time) without the comma-ok pattern. If a non-time.Time value is ever stored in the map (due to a bug or future change), this will panic.

Suggested fix for safer type assertion
 func shouldSkipScorecardFetch(ctx api.ScrapeContext, repoFullName string) (bool, string) {
 	if lastScrape, ok := LastScorecardScrapeTime.Load(repoFullName); ok {
-		lastTime := lastScrape.(time.Time)
+		lastTime, ok := lastScrape.(time.Time)
+		if !ok {
+			ctx.Logger.V(3).Infof("invalid cache entry for %s, will refetch", repoFullName)
+			return false, ""
+		}
 		timeSinceLastScrape := time.Since(lastTime)

139-143: Consider if score < 10 is too strict for "failing" status.

A score of 9/10 is marked as "failing", which may be overly strict. Most scoring systems consider 7+ or 8+ as passing. This creates inconsistency with createRepositoryProperties where score >= 7 counts as passing.

Consider aligning with the passing threshold used elsewhere
-		if check.Score < 10 {
+		if check.Score < 7 {
 			analysis.Status = "failing"
 		} else {
 			analysis.Status = "passing"
 		}

Comment on lines 161 to +167
func (ctx ScrapeContext) IsTrace() bool {
	return ctx.scrapeConfig.Spec.IsTrace()
}

func (ctx ScrapeContext) IsDebug() bool {
	return ctx.scrapeConfig.Spec.IsDebug()
}

⚠️ Potential issue | 🟠 Major

Potential nil pointer dereference in IsDebug() and IsTrace().

Both methods access ctx.scrapeConfig.Spec without checking if scrapeConfig is nil. Other methods like ScraperID() (lines 144-149) perform this nil check. If IsDebug() or IsTrace() is called before WithScrapeConfig(), this will panic.

Suggested fix for IsDebug (same pattern should be applied to IsTrace)
 func (ctx ScrapeContext) IsDebug() bool {
+	if ctx.scrapeConfig == nil {
+		return false
+	}
 	return ctx.scrapeConfig.Spec.IsDebug()
 }

Comment on lines +43 to +52
	// MaxAge filters alerts by age (e.g., "90d", "30d")
	MaxAge string `yaml:"maxAge,omitempty" json:"maxAge,omitempty"`
}

// ParseMaxAge converts the MaxAge string to a time.Duration
func (f GitHubSecurityFilters) ParseMaxAge() (time.Duration, error) {
	if f.MaxAge == "" {
		return 0, nil
	}
	return time.ParseDuration(f.MaxAge)

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

Go time.ParseDuration supported time units documentation

💡 Result:

In Go’s standard library, time.ParseDuration accepts these unit suffixes (as documented):

  • ns
  • us (or µs)
  • ms
  • s
  • m
  • h (hours)

Docs: time.ParseDuration on pkg.go.dev/time (see “Valid time units…”). [1]

Examples: 300ms, -1.5h, 2h45m, 1µs / 1us. [1]

Sources

[1] https://pkg.go.dev/time#ParseDuration


ParseMaxAge cannot handle day units—time.ParseDuration only accepts ns, us, ms, s, m, h.

If the documentation or code examples suggest using "90d" or "30d", they will fail. Either update examples to hours (e.g., 2160h for 90 days) or implement custom day parsing.

🛠️ Proposed fix (support "d" suffix)
 import (
+	"strconv"
+	"strings"
 	"time"
 
 	"github.com/flanksource/duty/types"
 )
@@
 func (f GitHubSecurityFilters) ParseMaxAge() (time.Duration, error) {
 	if f.MaxAge == "" {
 		return 0, nil
 	}
+	if strings.HasSuffix(f.MaxAge, "d") {
+		daysStr := strings.TrimSuffix(f.MaxAge, "d")
+		days, err := strconv.Atoi(daysStr)
+		if err != nil {
+			return 0, err
+		}
+		return time.Hour * 24 * time.Duration(days), nil
+	}
 	return time.ParseDuration(f.MaxAge)
 }

Comment on lines +10 to +11
	// MinScore optionally filters repositories by minimum score (0-10)
	MinScore *float64 `yaml:"minScore,omitempty" json:"minScore,omitempty"`

⚠️ Potential issue | 🔴 Critical

Fix float type to unblock CRD generation.

The pipeline is failing because controller-gen disallows float types by default due to cross-language serialization inconsistencies. Consider using a string type with validation, or add the // +kubebuilder:validation:Type=number marker with crd:allowDangerousTypes=true in the generator invocation.

🔧 Recommended fix: Use string type with custom unmarshaling
 // OpenSSFScorecard scraper fetches security scorecard data from the OpenSSF Scorecard API
 type OpenSSFScorecard struct {
 	BaseScraper `json:",inline" yaml:",inline"`
 
 	// Repositories is the list of repositories to assess
 	Repositories []OpenSSFRepository `yaml:"repositories" json:"repositories"`
 
-	// MinScore optionally filters repositories by minimum score (0-10)
-	MinScore *float64 `yaml:"minScore,omitempty" json:"minScore,omitempty"`
+	// MinScore optionally filters repositories by minimum score (0-10)
+	// +kubebuilder:validation:Pattern=`^([0-9]|10)(\.[0-9]+)?$`
+	MinScore string `yaml:"minScore,omitempty" json:"minScore,omitempty"`
 }

Then parse the string to float64 in the scraper logic. Alternatively, if floats are truly needed, update the controller-gen invocation to include crd:allowDangerousTypes=true.

🧰 Tools
🪛 GitHub Actions: Test

[error] 11-11: controller-gen failed: CRD generation error for api/v1/openssf.go: found float, the usage of which is highly discouraged, as support for them varies across languages. Please consider serializing your float as string. If you are really sure you want to use them, re-run with crd:allowDangerousTypes=true. (Step: controller-gen crd paths="./api/..." output:crd:artifacts:config=chart/crds)

🪛 GitHub Check: lint

[failure] 11-11:
found float, the usage of which is highly discouraged, as support for them varies across languages. Please consider serializing your float as string instead. If you are really sure you want to use them, re-run with crd:allowDangerousTypes=true

🪛 GitHub Check: test

[failure] 11-11:
found float, the usage of which is highly discouraged, as support for them varies across languages. Please consider serializing your float as string instead. If you are really sure you want to use them, re-run with crd:allowDangerousTypes=true

🪛 GitHub Check: test-prod

[failure] 11-11:
found float, the usage of which is highly discouraged, as support for them varies across languages. Please consider serializing your float as string instead. If you are really sure you want to use them, re-run with crd:allowDangerousTypes=true


Comment on lines +121 to +133
// GetSecurityAdvisories fetches repository security advisories
func (c *GitHubSecurityClient) GetSecurityAdvisories(ctx context.Context, opts AlertListOptions) ([]*github.SecurityAdvisory, *github.Response, error) {
	reqOpts := &github.ListRepositorySecurityAdvisoriesOptions{
		State: opts.State,
	}

	advisories, resp, err := c.Client.SecurityAdvisories.ListRepositorySecurityAdvisories(ctx, c.owner, c.repo, reqOpts)
	if err != nil {
		return nil, resp, fmt.Errorf("failed to fetch security advisories: %w", err)
	}

	return advisories, resp, nil
}

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

go-github ListRepositorySecurityAdvisoriesOptions pagination support

💡 Result:

go-github’s ListRepositorySecurityAdvisoriesOptions supports offset-based pagination the same way most “List…” calls do: via embedded github.ListOptions (page / per_page). ListOptions is the standard pagination struct in go-github and defines Page and PerPage. [1]

So you paginate like:

  • set opt.ListOptions.PerPage (and opt.ListOptions.Page starting at 1)
  • call ListRepositorySecurityAdvisories(...)
  • read resp.NextPage
  • loop until resp.NextPage == 0

This is the documented go-github pagination pattern in general. [2]
And ListRepositorySecurityAdvisoriesOptions is present as an options type alongside the advisory-listing methods. [3]

Sources
[1] (pkg.go.dev)
[2] (sources.debian.org)
[3] (pkg.go.dev)
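
A minimal sketch of that documented loop, assuming the go-github types already used in security_client.go (listAllAdvisories is a hypothetical helper, not code from this PR; imports of context and go-github are implied):

// listAllAdvisories drains every page of repository security advisories by
// following resp.NextPage until it reaches zero, per the pattern above.
func listAllAdvisories(ctx context.Context, client *github.Client, owner, repo string) ([]*github.SecurityAdvisory, error) {
	opts := &github.ListRepositorySecurityAdvisoriesOptions{
		ListOptions: github.ListOptions{Page: 1, PerPage: 100},
	}
	var all []*github.SecurityAdvisory
	for {
		advisories, resp, err := client.SecurityAdvisories.ListRepositorySecurityAdvisories(ctx, owner, repo, opts)
		if err != nil {
			return nil, err
		}
		all = append(all, advisories...)
		if resp.NextPage == 0 {
			return all, nil
		}
		opts.Page = resp.NextPage
	}
}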


Add pagination support to GetSecurityAdvisories.

GetSecurityAdvisories is the only alert method missing ListOptions pagination. All other methods (GetDependabotAlerts, GetCodeScanningAlerts, GetSecretScanningAlerts) include Page and PerPage from AlertListOptions. Without pagination, results will be limited to the API's default page size.

Suggested fix
 func (c *GitHubSecurityClient) GetSecurityAdvisories(ctx context.Context, opts AlertListOptions) ([]*github.SecurityAdvisory, *github.Response, error) {
 	reqOpts := &github.ListRepositorySecurityAdvisoriesOptions{
 		State: opts.State,
+		ListOptions: github.ListOptions{
+			Page:    opts.Page,
+			PerPage: opts.PerPage,
+		},
 	}

Comment on lines +94 to +106
	result := v1.ScrapeResult{
		Type:        ConfigTypeGitHubSecurityRepo,
		ID:          fmt.Sprintf("github-security/%s", repoFullName),
		Name:        repoFullName,
		ConfigClass: "Security",
		Config:      repo,
		Tags: v1.Tags{
			{Name: "owner", Value: repoConfig.Owner},
			{Name: "repo", Value: repoConfig.Repo},
			{Name: "security", Value: "true"},
		},
		CreatedAt:  repo.CreatedAt.GetTime(),
		Properties: properties,

⚠️ Potential issue | 🔴 Critical

Fix CreatedAt type mismatch.

ScrapeResult.CreatedAt expects *time.Time, but repo.CreatedAt.GetTime() returns time.Time.

🐛 Proposed fix
-	result := v1.ScrapeResult{
+	createdAt := repo.GetCreatedAt()
+	result := v1.ScrapeResult{
 		Type:        ConfigTypeGitHubSecurityRepo,
 		ID:          fmt.Sprintf("github-security/%s", repoFullName),
 		Name:        repoFullName,
 		ConfigClass: "Security",
 		Config:      repo,
 		Tags: v1.Tags{
 			{Name: "owner", Value: repoConfig.Owner},
 			{Name: "repo", Value: repoConfig.Repo},
 			{Name: "security", Value: "true"},
 		},
-		CreatedAt:  repo.CreatedAt.GetTime(),
+		CreatedAt:  &createdAt,
 		Properties: properties,
 	}

Comment on lines +10 to +11
// ComplianceMappings maps each OpenSSF Scorecard check to compliance framework controls
// Based on DataSheet reference: plugins/vite-plugin-security.js

⚠️ Potential issue | 🟡 Minor

Incorrect source reference in documentation comment.

The comment references "plugins/vite-plugin-security.js" which appears to be from a different project (JavaScript/Vite). This should reference the actual data source for the compliance mappings.

Suggested fix
 // ComplianceMappings maps each OpenSSF Scorecard check to compliance framework controls
-// Based on DataSheet reference: plugins/vite-plugin-security.js
+// Based on OpenSSF Scorecard checks documentation
 var ComplianceMappings = map[string]ComplianceMapping{

Comment on lines 274 to 276
func scorecardCheckNameToKebab(name string) string {
	return strings.ToLower(strings.ReplaceAll(name, "-", "-"))
}

⚠️ Potential issue | 🟡 Minor

Dead code: scorecardCheckNameToKebab is a no-op.

This function replaces "-" with "-", which has no effect. It's also unused in the file. Remove it or fix the implementation if kebab-case conversion is needed.

Remove unused no-op function
-func scorecardCheckNameToKebab(name string) string {
-	return strings.ToLower(strings.ReplaceAll(name, "-", "-"))
-}

Or if actual conversion is needed (e.g., spaces/underscores to dashes):

func scorecardCheckNameToKebab(name string) string {
	return strings.ToLower(strings.ReplaceAll(name, " ", "-"))
}


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 11

🤖 Fix all issues with AI agents (hedged example sketches for several of the items below follow this list)
In `@api/context.go`:
- Around line 165-167: The IsDebug method on ScrapeContext can panic if
scrapeConfig is nil; update ScrapeContext.IsDebug() to guard against a nil
ctx.scrapeConfig and return false when unset, otherwise return
ctx.scrapeConfig.Spec.IsDebug(); reference the ScrapeContext type and IsDebug()
method and ensure compatibility with NewScrapeContext which may produce a
context without scrapeConfig.

In `@api/v1/github_security.go`:
- Around line 47-53: The ParseMaxAge method in GitHubSecurityFilters currently
calls time.ParseDuration which rejects "90d"/"30d" as documented; change
ParseMaxAge to accept a trailing 'd' by detecting a suffix of 'd'
(case-insensitive), parsing the numeric portion as an integer/float, converting
days to hours (days * 24h) and returning time.Duration (or fall back to
time.ParseDuration for other valid suffixes), and ensure errors are returned for
invalid inputs; update logic inside ParseMaxAge to handle both "Nd" inputs and
standard time.ParseDuration-compatible strings and reference the
GitHubSecurityFilters.ParseMaxAge function when locating the change.

In `@api/v1/interface.go`:
- Around line 815-831: The PrettyRow method (ScrapeResult.PrettyRow) currently
performs unconditional printing via clicky.MustPrint and accepts an unused
options parameter; remove the two clicky.MustPrint calls so the method only
constructs and returns the row map, and drop the unused options parameter from
the function signature (or if you prefer to keep it, ensure it's used) so
callers control printing/formatting. Update any call sites of
ScrapeResult.PrettyRow to handle printing/formatting themselves (e.g., call
clicky.MustPrint with desired FormatOptions where needed).

In `@api/v1/openssf.go`:
- Around line 10-11: The MinScore field currently uses *float64 which breaks CRD
generation; change MinScore to a safer type (e.g., *string or *int) and update
any callers/serialization accordingly: replace the declaration of MinScore
(*float64) with either *string (to store "7.5") or *int representing a scaled
value (e.g., 75 for 0.0–10.0 scaled to 0–100), keep the existing yaml/json tags,
and adjust validation/usage logic that reads MinScore (parsing string to float
or dividing the int by the scale) so consumers like validation, defaults, and
API docs remain correct; if you must keep a numeric type, add the kubebuilder
marker // +kubebuilder:validation:Type=number and rerun controller-gen with
crd:allowDangerousTypes=true (not recommended).

In `@api/v1/types.go`:
- Around line 71-74: The ScraperSpec additions (fields GitHubSecurity and
OpenSSFScorecard) are not wired into AllScraperConfigs and ApplyPlugin; update
the AllScraperConfigs map to include entries "githubsecurity": GitHubSecurity{}
and "openssfscorecard": OpenSSFScorecard{}, and modify the ApplyPlugin method to
iterate over s.GitHubSecurity and s.OpenSSFScorecard and call ApplyPlugin on
each item (mirroring the existing loops for GithubActions and Azure) so those
plugin configs are applied; reference the AllScraperConfigs map and the
ApplyPlugin method and the types GitHubSecurity and OpenSSFScorecard when
locating code to change.

In `@cmd/run.go`:
- Around line 108-125: The current code always exports/prints regardless of the
--export flag; update the branch to honor the boolean export flag (e.g.,
exportFlag or export) so the export path only runs when exportFlag is true and
the print path runs when exportFlag is false; specifically, wrap the export loop
that calls exportResource(results, outputDir) and the
logger.Infof("Exported...") inside if exportFlag { ... } and move the
clicky.MustPrint(all.*) calls into the corresponding else branch, and also
validate that when exportFlag is true an outputDir is provided (return a clear
error if empty) to avoid silent failures.

In `@scrapers/github/security_client.go`:
- Around line 23-48: In NewGitHubSecurityClient, after resolving token (the
block that sets token from ctx.HydrateConnectionByURL or
ctx.GetEnvValueFromCache), add a check that token is not empty and return a
descriptive error if it is; for example, if token == "" return
fmt.Errorf("missing GitHub token: no connection %q and no personalAccessToken
provided", config.ConnectionName or config.PersonalAccessToken) so the function
fails fast instead of creating a client that will 401 on API calls.
- Around line 121-133: GetSecurityAdvisories ignores pagination and always
fetches the first page; set the ListRepositorySecurityAdvisoriesOptions to
include the requested page and per-page values from AlertListOptions (map
opts.Page and opts.PerPage into the options used for the API call). Update the
reqOpts used in GetSecurityAdvisories to populate the ListOptions/Page and
PerPage fields (same approach as
GetDependabotAlerts/GetCodeScanningAlerts/GetSecretScanningAlerts) so the call
to c.Client.SecurityAdvisories.ListRepositorySecurityAdvisories honors
pagination.

In `@scrapers/github/security.go`:
- Around line 59-62: The log call is formatting with "%d" but is concatenating
the alert-count strings, so compute an integer total first by parsing the
individual counts from result.Properties.Find("Critical Alerts").Text, "High
Alerts", "Medium Alerts", and "Low Alerts" (use strconv.Atoi, handle errors
defaulting to 0), sum them into a total int, then call
ctx.Logger.V(2).Infof("successfully scraped %s: %d total alerts", repoFullName,
total) instead of concatenating the Text values.

In `@scrapers/processors/json.go`:
- Around line 599-607: The validation currently checks only ExternalChangeID for
standalone inputs but not ExternalID; update the validator in the same block
that inspects input.Changes (near the lo.Filter usage) to also check for any
change where ExternalID == "" when input.ID is empty (standalone change). Use
the same pattern (e.g., lo.Filter or an additional check) referencing
input.Changes, ExternalID, ExternalChangeID and input.ID, and return the same
formatted error (including e.Config and input.Debug()) when either required
field is missing.

In `@scrapers/processors/script.go`:
- Around line 33-36: The current logger.Infof call in the json.Unmarshal error
block logs the entire script output variable s (risking secrets/PII and large
dumps); update the error handling around json.Unmarshal to avoid printing the
full s — instead log the error and either a safely truncated sample (e.g., first
N characters with "…(truncated)") and the total length, or a redacted/hashed
summary of s, and consider simple redaction for common secrets (tokens,
passwords) before including any sample; modify the logger.Infof invocation that
references err and s to use the new truncated/redacted sample and length-only
information.
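
For the api/v1/interface.go item above, a minimal sketch of the side-effect-free PrettyRow (dropping the unused options parameter is an assumption; printing moves to call sites):

func (s ScrapeResult) PrettyRow() map[string]api.Text {
	row := make(map[string]api.Text)
	row["ID"] = clicky.Text(s.ID)
	row["Name"] = clicky.Text(s.Name)
	row["Type"] = clicky.Text(s.Type)
	row["Health"] = clicky.Text(string(s.Health))
	row["Status"] = clicky.Text(s.Status)
	row["Error"] = clicky.Text("")
	if s.Error != nil {
		row["Error"] = clicky.Text(s.Error.Error())
	}
	// No printing here; callers choose the format, e.g.:
	// clicky.MustPrint([]map[string]api.Text{r.PrettyRow()}, clicky.FormatOptions{Pretty: true})
	return row
}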
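
For the cmd/run.go item, a sketch of the corrected branching; exportFlag, outputDir, exportResource, results, and all follow the names in the prompt and are assumptions about the surrounding command code:

// Honor --export: write files only when the flag is set, print otherwise.
if exportFlag {
	if outputDir == "" {
		return fmt.Errorf("--export requires an output directory") // fail fast instead of silently writing nowhere
	}
	if err := exportResource(results, outputDir); err != nil {
		return err
	}
	logger.Infof("Exported %d results to %s", len(results), outputDir)
} else {
	clicky.MustPrint(all.Configs, clicky.FormatOptions{Pretty: true})
}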
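
For the scrapers/github/security_client.go token check, the guard is small enough to sketch inline (the config field names are taken from the prompt):

// After resolving the token via HydrateConnectionByURL / GetEnvValueFromCache:
if token == "" {
	return nil, fmt.Errorf("missing GitHub token: no connection %q and no personalAccessToken provided", config.ConnectionName)
}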
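
For the scrapers/github/security.go logging item, a sketch of the count aggregation (the Properties.Find accessor and property names come from the prompt; its exact return type is an assumption):

total := 0
for _, name := range []string{"Critical Alerts", "High Alerts", "Medium Alerts", "Low Alerts"} {
	if p := result.Properties.Find(name); p != nil {
		if n, err := strconv.Atoi(p.Text); err == nil {
			total += n // unparsable or missing counts default to 0
		}
	}
}
ctx.Logger.V(2).Infof("successfully scraped %s: %d total alerts", repoFullName, total)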
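
For the scrapers/processors/json.go item, a sketch of the extended standalone-change validation (the element type v1.ChangeResult and the error wording are assumptions patterned on the existing ExternalChangeID check):

// Standalone changes (input.ID empty) must carry both identifiers.
if input.ID == "" {
	invalid := lo.Filter(input.Changes, func(c v1.ChangeResult, _ int) bool {
		return c.ExternalID == "" || c.ExternalChangeID == ""
	})
	if len(invalid) > 0 {
		return fmt.Errorf("config %v has standalone changes missing externalID/externalChangeID: %s", e.Config, input.Debug())
	}
}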
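
For the scrapers/processors/script.go item, a sketch of the truncated, length-only logging (the 512-byte cap is an arbitrary assumption):

const maxSample = 512 // assumption: cap the logged sample size
sample := s
if len(sample) > maxSample {
	sample = sample[:maxSample] + "…(truncated)"
}
logger.Infof("failed to unmarshal script output (len=%d): %v, sample: %s", len(s), err, sample)
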
🧹 Nitpick comments (4)
api/v1/scrapeconfig_types.go (1)

70-76: Potential ambiguity with ScraperSpec.IsDebug() and ScraperSpec.IsTrace() methods.

ScrapeConfig.IsDebug() checks annotations while ScraperSpec.IsDebug() (in types.go lines 178-184) checks the LogLevel field. Both methods exist on related types with the same name but different behavior.

Consider documenting or clarifying the distinction, or using different method names to avoid confusion when both are accessible through the context chain.

scrapers/openssf/compliance.go (1)

110-115: Return defensive copies from GetComplianceMappings.

The returned slices share backing arrays with the global map; external mutation can alter shared mappings. Consider returning copies to keep mappings immutable (and optionally keep the map unexported if it’s not required externally).

♻️ Suggested adjustment
 func GetComplianceMappings(checkName string) ComplianceMapping {
 	if mapping, ok := ComplianceMappings[checkName]; ok {
-		return mapping
+		return ComplianceMapping{
+			SOC2:     append([]string(nil), mapping.SOC2...),
+			NISTSSDF: append([]string(nil), mapping.NISTSSDF...),
+			CIS:      append([]string(nil), mapping.CIS...),
+		}
 	}
 	return ComplianceMapping{}
 }
scrapers/openssf/client.go (2)

72-104: Respect context cancellation during retry backoff.

Line 82–86 sleeps unconditionally; if ctx is canceled, callers still block through the backoff. Consider a select on ctx.Done() to exit promptly.

♻️ Proposed fix (ctx-aware backoff)
 	for attempt := 0; attempt < maxRetries; attempt++ {
 		if attempt > 0 {
 			backoff := time.Duration(attempt*attempt) * time.Second
 			c.ctx.Logger.V(3).Infof("retrying scorecard request for %s (attempt %d/%d) after %v",
 				repoFullName, attempt+1, maxRetries, backoff)
-			time.Sleep(backoff)
+			timer := time.NewTimer(backoff)
+			select {
+			case <-ctx.Done():
+				timer.Stop()
+				return nil, ctx.Err()
+			case <-timer.C:
+			}
 		}

144-168: Retry classification is brittle; prefer typed errors and strings.Contains.

Line 149–152 relies on substring matching, and contains/findSubstring reimplements stdlib logic. Consider using errors.Is/As with net.Error for timeouts and strings.Contains for server errors, and drop custom helpers.

♻️ Proposed fix (typed errors + stdlib Contains)
 import (
 	"context"
 	"encoding/json"
+	"errors"
 	"fmt"
 	"io"
+	"net"
 	"net/http"
+	"strings"
 	"time"

 	"github.com/flanksource/config-db/api"
 )
@@
 func (c *ScorecardClient) isRetryable(err error) bool {
 	if err == nil {
 		return false
 	}
-
-	errStr := err.Error()
-	return contains(errStr, "server error") ||
-	       contains(errStr, "timeout") ||
-	       contains(errStr, "connection")
+	if errors.Is(err, context.DeadlineExceeded) {
+		return true
+	}
+	var netErr net.Error
+	if errors.As(err, &netErr) && (netErr.Timeout() || netErr.Temporary()) {
+		return true
+	}
+	return strings.Contains(err.Error(), "server error")
 }
-
-func contains(s, substr string) bool {
-	return len(s) >= len(substr) && (s == substr || len(s) > len(substr) &&
-		(s[:len(substr)] == substr || s[len(s)-len(substr):] == substr ||
-		 findSubstring(s, substr)))
-}
-
-func findSubstring(s, substr string) bool {
-	for i := 0; i <= len(s)-len(substr); i++ {
-		if s[i:i+len(substr)] == substr {
-			return true
-		}
-	}
-	return false
-}

Comment on lines +165 to +167
func (ctx ScrapeContext) IsDebug() bool {
	return ctx.scrapeConfig.Spec.IsDebug()
}

⚠️ Potential issue | 🟡 Minor

Guard against nil ScrapeConfig.
NewScrapeContext can produce a context without scrapeConfig; calling IsDebug() would panic. Return false when unset.

🐛 Proposed fix
 func (ctx ScrapeContext) IsDebug() bool {
-	return ctx.scrapeConfig.Spec.IsDebug()
+	if ctx.scrapeConfig == nil {
+		return false
+	}
+	return ctx.scrapeConfig.Spec.IsDebug()
 }

Comment on lines +47 to +53
// ParseMaxAge converts the MaxAge string to a time.Duration
func (f GitHubSecurityFilters) ParseMaxAge() (time.Duration, error) {
	if f.MaxAge == "" {
		return 0, nil
	}
	return time.ParseDuration(f.MaxAge)
}

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain


🌐 Web query:

Go time.ParseDuration supported format "d" day suffix

💡 Result:

Go’s standard library time.ParseDuration does not support a d (day) suffix—the only valid unit suffixes are ns, us (or µs), ms, s, m, h. [1]

The time package also intentionally provides no Day constant (“to avoid confusion across daylight savings time zone transitions”). [1]

What to do instead

  • If you mean exact 24-hour blocks, use hours: 24h, 48h, 720h (for 30 days), etc. [1]
  • If you mean calendar days (variable-length due to DST), use date math like t.AddDate(0, 0, days) rather than time.Duration.


ParseMaxAge rejects the documented 90d/30d inputs.

Line 43 documents examples "90d" and "30d", but line 52 uses time.ParseDuration, which doesn't accept a d suffix. The only supported unit suffixes are ns, us, µs, ms, s, m, h. Either update the documentation to reflect supported formats (e.g., "2160h" for 90 days, "720h" for 30 days) or implement custom parsing to support the d suffix.


Comment on lines +815 to +831
func (s ScrapeResult) PrettyRow(options interface{}) map[string]api.Text {
	row := make(map[string]api.Text)
	row["ID"] = clicky.Text(s.ID)
	row["Name"] = clicky.Text(s.Name)
	row["Type"] = clicky.Text(s.Type)
	row["Health"] = clicky.Text(string(s.Health))
	row["Status"] = clicky.Text(s.Status)
	if s.Error != nil {
		row["Error"] = clicky.Text(s.Error.Error())
	} else {
		row["Error"] = clicky.Text("")
	}
	clicky.MustPrint([]map[string]api.Text{row}, clicky.FormatOptions{JSON: true})
	clicky.MustPrint([]map[string]api.Text{row}, clicky.FormatOptions{Pretty: true})

	return row
}

⚠️ Potential issue | 🟠 Major

PrettyRow prints unconditionally and never uses its options parameter.

Building the row triggers two clicky.MustPrint calls (JSON and pretty) as a side effect, so every caller gets output printed whether it wants it or not, and the options argument is dead. Remove the two clicky.MustPrint calls so the method only constructs and returns the row map, drop (or actually use) the options parameter, and let call sites print with the clicky.FormatOptions they need.
grep -rn "\.PrettyRow(" . --include="*.go" 2>/dev/null || echo "No explicit calls found"

Repository: flanksource/config-db

Length of output: 88


🏁 Script executed:

# Let me check the actual behavior - look at cmd/run.go to understand the flow
cat cmd/run.go | grep -B 10 -A 10 "clicky.MustPrint(all.Configs)"

Repository: flanksource/config-db

Length of output: 615


🏁 Script executed:

# Look for what happens in clicky when MustPrint is called on PrettyRow items
# Check vendor or go.mod for clicky version
cat go.mod | grep clicky

Repository: flanksource/config-db

Length of output: 115


🏁 Script executed:

# Let's check what api.PrettyRow actually is - look for interface definition in imported packages
python3 << 'EOF'
import subprocess

# Search for any interface that might define PrettyRow
result = subprocess.run(['rg', 'type.*PrettyRow.*interface', '--type', 'go'], 
                       capture_output=True, text=True)
if result.stdout:
    print("Found interface definition:")
    print(result.stdout)
else:
    print("No interface definition found in local code")

# Check if it's imported from clicky
result2 = subprocess.run(['rg', 'from.*clicky', '--type', 'go'], 
                        capture_output=True, text=True)
print("\nClicky imports:")
print(result2.stdout[:1000] if result2.stdout else "None")
EOF

Repository: flanksource/config-db

Length of output: 130


🏁 Script executed:

# Let me search the go.mod to see if there's a way to understand the clicky interface better
# First check if we can find any documentation or usage patterns
rg "clicky" -B 2 -A 2 --type go | grep -A 5 -B 5 "PrettyRow\|interface"

Repository: flanksource/config-db

Length of output: 3480


🏁 Script executed:

# Let's look at what the review is actually saying - check if PrettyRow is being called implicitly
# by clicky when it encounters api.PrettyRow types
python3 << 'EOF'
# Since ScrapeResult is assigned to api.PrettyRow slice and no explicit calls found,
# PrettyRow must be a method that clicky calls automatically.
# Let's verify if clicky.MustPrint triggers the PrettyRow method

# Check the actual flow: when does the PrettyRow method get called?
import subprocess

# Look for usage of clicky.MustPrint with different types
result = subprocess.run(['rg', 'clicky.MustPrint', '-B 2', '-A 2', '--type', 'go'], 
                       capture_output=True, text=True)
print("All clicky.MustPrint calls:")
print(result.stdout)
EOF

Repository: flanksource/config-db

Length of output: 93


Remove unconditional prints from PrettyRow method.

The two clicky.MustPrint calls on lines 827–828 execute whenever this method is invoked, causing duplicate output and preventing callers from controlling formatting. The options parameter is also unused. Let the caller decide when and how to print by returning only the row data.

Suggested fix
 func (s ScrapeResult) PrettyRow(options interface{}) map[string]api.Text {
 	row := make(map[string]api.Text)
 	row["ID"] = clicky.Text(s.ID)
 	row["Name"] = clicky.Text(s.Name)
 	row["Type"] = clicky.Text(s.Type)
 	row["Health"] = clicky.Text(string(s.Health))
 	row["Status"] = clicky.Text(s.Status)
 	if s.Error != nil {
 		row["Error"] = clicky.Text(s.Error.Error())
 	} else {
 		row["Error"] = clicky.Text("")
 	}
-	clicky.MustPrint([]map[string]api.Text{row}, clicky.FormatOptions{JSON: true})
-	clicky.MustPrint([]map[string]api.Text{row}, clicky.FormatOptions{Pretty: true})
-
 	return row
 }
🤖 Prompt for AI Agents
In `@api/v1/interface.go` around lines 815 - 831, The PrettyRow method
(ScrapeResult.PrettyRow) currently performs unconditional printing via
clicky.MustPrint and accepts an unused options parameter; remove the two
clicky.MustPrint calls so the method only constructs and returns the row map,
and drop the unused options parameter from the function signature (or if you
prefer to keep it, ensure it's used) so callers control printing/formatting.
Update any call sites of ScrapeResult.PrettyRow to handle printing/formatting
themselves (e.g., call clicky.MustPrint with desired FormatOptions where
needed).
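
With the prints removed, call sites batch rows and print once. A minimal caller-side sketch, assuming the clicky.Text/FormatOptions API shown in the diff above; the results slice here is a hypothetical []ScrapeResult:

// Build all rows first, then let the caller pick a single output format.
rows := make([]map[string]api.Text, 0, len(results))
for _, r := range results {
	rows = append(rows, r.PrettyRow(nil))
}
clicky.MustPrint(rows, clicky.FormatOptions{Pretty: true})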

Comment on lines +10 to +11
// MinScore optionally filters repositories by minimum score (0-10)
MinScore *float64 `yaml:"minScore,omitempty" json:"minScore,omitempty"`

⚠️ Potential issue | 🔴 Critical

Fix float type to resolve pipeline failures.

The *float64 type causes CRD generation to fail. Kubernetes controller-gen discourages floats due to inconsistent serialization across languages.

Consider using a string type or an integer representing a scaled value (e.g., 0-100 instead of 0.0-10.0):

🔧 Proposed fix using string type
-	// MinScore optionally filters repositories by minimum score (0-10)
-	MinScore *float64 `yaml:"minScore,omitempty" json:"minScore,omitempty"`
+	// MinScore optionally filters repositories by minimum score (0-10), specified as a string (e.g., "7.5")
+	MinScore *string `yaml:"minScore,omitempty" json:"minScore,omitempty"`

Alternatively, if you prefer to keep the float type, add // +kubebuilder:validation:Type=number marker and re-run controller-gen with crd:allowDangerousTypes=true, though this is not recommended for portability.

🧰 Tools
🪛 GitHub Actions (Lint, Test) and GitHub Checks (lint, test, test-prod) all report the same failure:

[error] 11-11: controller-gen CRD generation failed: found float at line 11:12 in api/v1/openssf.go; the usage of floats is highly discouraged, as support for them varies across languages. Please consider serializing your float as string instead, or re-run with crd:allowDangerousTypes=true.

🤖 Prompt for AI Agents
In `@api/v1/openssf.go` around lines 10 - 11, The MinScore field currently uses
*float64 which breaks CRD generation; change MinScore to a safer type (e.g.,
*string or *int) and update any callers/serialization accordingly: replace the
declaration of MinScore (*float64) with either *string (to store "7.5") or *int
representing a scaled value (e.g., 75 for 0.0–10.0 scaled to 0–100), keep the
existing yaml/json tags, and adjust validation/usage logic that reads MinScore
(parsing string to float or dividing the int by the scale) so consumers like
validation, defaults, and API docs remain correct; if you must keep a numeric
type, add the kubebuilder marker // +kubebuilder:validation:Type=number and
rerun controller-gen with crd:allowDangerousTypes=true (not recommended).
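
If MinScore becomes a *string, the scraper can parse it once at the point where the filter runs. A minimal sketch inside the per-repository loop, assuming hypothetical config and scorecard variable names and the 0-10 score scale:

if config.MinScore != nil {
	threshold, err := strconv.ParseFloat(*config.MinScore, 64)
	if err != nil {
		return nil, fmt.Errorf("invalid minScore %q: %w", *config.MinScore, err)
	}
	if scorecard.Score < threshold {
		continue // below the minimum score; skip this repository
	}
}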

Comment on lines +71 to +74
GithubActions []GitHubActions `json:"githubActions,omitempty" yaml:"githubActions,omitempty"`
GitHubSecurity []GitHubSecurity `json:"githubSecurity,omitempty" yaml:"githubSecurity,omitempty"`
OpenSSFScorecard []OpenSSFScorecard `json:"openssfScorecard,omitempty" yaml:"openssfScorecard,omitempty"`
Azure []Azure `json:"azure,omitempty" yaml:"azure,omitempty"`

⚠️ Potential issue | 🟠 Major

Missing entries in AllScraperConfigs map and ApplyPlugin method.

The new GitHubSecurity and OpenSSFScorecard fields are added to ScraperSpec, but the corresponding entries are missing from:

  1. AllScraperConfigs map (lines 15-31) - should include "githubsecurity": GitHubSecurity{} and "openssfscorecard": OpenSSFScorecard{}
  2. ApplyPlugin method (lines 97-167) - should include plugin application loops for both new types, similar to other scrapers
Proposed additions

Add to AllScraperConfigs map:

 var AllScraperConfigs = map[string]any{
 	"aws":            AWS{},
 	"azure":          Azure{},
 	"azuredevops":    AzureDevops{},
 	"exec":           Exec{},
 	"file":           File{},
 	"gcp":            GCP{},
 	"githubactions":  GitHubActions{},
+	"githubsecurity": GitHubSecurity{},
 	"http":           HTTP{},
 	"kubernetes":     Kubernetes{},
 	"kubernetesfile": KubernetesFile{},
 	"logs":           Logs{},
+	"openssfscorecard": OpenSSFScorecard{},
 	"slack":          Slack{},
 	"sql":            SQL{},
 	"terraform":      Terraform{},
 	"trivy":          Trivy{},
 }

Add to ApplyPlugin method (after line 132):

 	for i := range spec.GithubActions {
 		spec.GithubActions[i].BaseScraper = spec.GithubActions[i].BaseScraper.ApplyPlugins(plugins...)
 	}
+
+	for i := range spec.GitHubSecurity {
+		spec.GitHubSecurity[i].BaseScraper = spec.GitHubSecurity[i].BaseScraper.ApplyPlugins(plugins...)
+	}
+
+	for i := range spec.OpenSSFScorecard {
+		spec.OpenSSFScorecard[i].BaseScraper = spec.OpenSSFScorecard[i].BaseScraper.ApplyPlugins(plugins...)
+	}
🤖 Prompt for AI Agents
In `@api/v1/types.go` around lines 71 - 74, The ScraperSpec additions (fields
GitHubSecurity and OpenSSFScorecard) are not wired into AllScraperConfigs and
ApplyPlugin; update the AllScraperConfigs map to include entries
"githubsecurity": GitHubSecurity{} and "openssfscorecard": OpenSSFScorecard{},
and modify the ApplyPlugin method to iterate over s.GitHubSecurity and
s.OpenSSFScorecard and call ApplyPlugin on each item (mirroring the existing
loops for GithubActions and Azure) so those plugin configs are applied;
reference the AllScraperConfigs map and the ApplyPlugin method and the types
GitHubSecurity and OpenSSFScorecard when locating code to change.

Comment on lines +23 to +48
func NewGitHubSecurityClient(ctx api.ScrapeContext, config v1.GitHubSecurity, owner, repo string) (*GitHubSecurityClient, error) {
	var token string
	if connection, err := ctx.HydrateConnectionByURL(config.ConnectionName); err != nil {
		return nil, fmt.Errorf("failed to hydrate connection: %w", err)
	} else if connection != nil {
		token = connection.Password
	} else {
		token, err = ctx.GetEnvValueFromCache(config.PersonalAccessToken, ctx.Namespace())
		if err != nil {
			return nil, fmt.Errorf("failed to get token: %w", err)
		}
	}

	ts := oauth2.StaticTokenSource(
		&oauth2.Token{AccessToken: token},
	)
	tc := oauth2.NewClient(ctx, ts)
	client := github.NewClient(tc)

	return &GitHubSecurityClient{
		ScrapeContext: ctx,
		Client:        client,
		owner:         owner,
		repo:          repo,
	}, nil
}

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

GitHub REST API security alerts code scanning secret scanning authentication requirements

💡 Result:

GitHub REST API: security alerts (code scanning, secret scanning, Dependabot) — auth & access requirements

1) Token types you can use

  • OAuth App token or PAT (classic): supported broadly across these alert APIs (see per-endpoint scope requirements below).
  • Fine-grained tokens / GitHub App tokens: supported for most repo/org alert endpoints, but not for some enterprise-level Dependabot alert endpoints. For example, “List Dependabot alerts for an enterprise” does not work with fine-grained PATs or GitHub App tokens. [3]

2) Scopes / permissions (classic PATs & OAuth tokens)

  • Code scanning alerts: requires security_events (private or public), or public_repo if only accessing public repos. [1]
  • Secret scanning alerts: requires repo or security_events (and public_repo can be sufficient if only accessing public repos). [2]
  • Dependabot alerts: commonly requires security_events (and public_repo can be sufficient if only accessing public repos). Some endpoints (e.g., enterprise-level) accept repo or security_events. [3]

3) Fine-grained PAT / GitHub App permission names (repo-scoped)

  • Code scanning: "Code scanning alerts" repo permission (read for GET/list, write for update). [1]
  • Secret scanning: "Secret scanning alerts" repo permission (read/write depending on endpoint). [2]
  • Dependabot: "Dependabot alerts" repo permission (read/write depending on endpoint). [3]

4) Who is allowed to call the endpoints (role / access gating)

Even with the right token scope/permissions, the caller must have sufficient access:

  • Repository alert visibility/access is generally available to people with write/maintain/admin on the repo (and org owners for org-owned repos). GitHub also lets repo admins/org owners grant additional alert access (with constraints). [4]
  • Some alert endpoints explicitly require elevated roles, e.g.:
    • Secret scanning (org-level listing): caller must be an org admin or security manager. [2]
    • Secret scanning (repo-level listing): caller must be a repo admin (or admin of the owning org). [2]
    • Dependabot (org-level listing): caller must be an org owner or security manager. [3]
    • Dependabot (enterprise-level listing): caller must be an enterprise member, and results are limited to orgs where the caller is an org owner or security manager. [3]

Sources: GitHub REST API docs for Code Scanning [1], Secret Scanning [2], Dependabot Alerts [3], and “Managing security and analysis settings… / Granting access to security alerts” [4].


🏁 Script executed:

find . -name "security_client.go" -type f | head -5

Repository: flanksource/config-db

Length of output: 101


🏁 Script executed:

# Look for the full security_client.go file to understand context
fd security_client.go --type f

Repository: flanksource/config-db

Length of output: 99


🏁 Script executed:

cat -n scrapers/github/security_client.go

Repository: flanksource/config-db

Length of output: 6316


🏁 Script executed:

# Check if there are other places in the codebase that use this client
# to understand the broader context
rg "NewGitHubSecurityClient|GitHubSecurityClient" --type go -A 2 -B 2

Repository: flanksource/config-db

Length of output: 5100


🏁 Script executed:

# Check if token validation patterns exist elsewhere in this project
rg "token == \"\"" --type go

Repository: flanksource/config-db

Length of output: 47


Add token validation to fail fast on missing credentials.

Empty token values proceed silently to client creation, causing all subsequent security alert API calls (Dependabot, code scanning, secret scanning, advisories) to fail with 401 Unauthorized. Validate that a token exists and return an explicit error if neither connection nor personalAccessToken config provides one.

🐛 Proposed fix
 	} else {
 		token, err = ctx.GetEnvValueFromCache(config.PersonalAccessToken, ctx.Namespace())
 		if err != nil {
 			return nil, fmt.Errorf("failed to get token: %w", err)
 		}
 	}
+
+	if token == "" {
+		return nil, fmt.Errorf("github personal access token is required (connection or personalAccessToken)")
+	}
🤖 Prompt for AI Agents
In `@scrapers/github/security_client.go` around lines 23 - 48, In
NewGitHubSecurityClient, after resolving token (the block that sets token from
ctx.HydrateConnectionByURL or ctx.GetEnvValueFromCache), add a check that token
is not empty and return a descriptive error if it is; for example, if token ==
"" return fmt.Errorf("missing GitHub token: no connection %q and no
personalAccessToken provided", config.ConnectionName or
config.PersonalAccessToken) so the function fails fast instead of creating a
client that will 401 on API calls.
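
A quick test can lock in the fail-fast behavior, assuming the check above is in place; the empty-config call and assertion below are a hypothetical sketch:

// With no connection and no personalAccessToken, client creation should fail.
_, err := NewGitHubSecurityClient(ctx, v1.GitHubSecurity{}, "flanksource", "config-db")
if err == nil {
	t.Fatal("expected an error when neither connection nor personalAccessToken is set")
}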

Comment on lines +121 to +133
// GetSecurityAdvisories fetches repository security advisories
func (c *GitHubSecurityClient) GetSecurityAdvisories(ctx context.Context, opts AlertListOptions) ([]*github.SecurityAdvisory, *github.Response, error) {
	reqOpts := &github.ListRepositorySecurityAdvisoriesOptions{
		State: opts.State,
	}

	advisories, resp, err := c.Client.SecurityAdvisories.ListRepositorySecurityAdvisories(ctx, c.owner, c.repo, reqOpts)
	if err != nil {
		return nil, resp, fmt.Errorf("failed to fetch security advisories: %w", err)
	}

	return advisories, resp, nil
}

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

find . -name "*.go" -type f | head -20

Repository: flanksource/config-db

Length of output: 465


🏁 Script executed:

# Search for AlertListOptions definition
rg "type AlertListOptions" -A 10

Repository: flanksource/config-db

Length of output: 857


🏁 Script executed:

# Search for usage of ListRepositorySecurityAdvisoriesOptions
rg "ListRepositorySecurityAdvisoriesOptions" -B 2 -A 5

Repository: flanksource/config-db

Length of output: 782


🏁 Script executed:

# Check the security_client.go file structure
cat -n scrapers/github/security_client.go | head -150

Repository: flanksource/config-db

Length of output: 5681


🌐 Web query:

go-github ListRepositorySecurityAdvisoriesOptions ListOptions pagination

💡 Result:

In go-github, pagination for “list” endpoints is done with github.ListOptions (Page, PerPage). You pass it either directly or via embedding in a more specific *Options struct, then loop using resp.NextPage. [1], [2]

So for ListRepositorySecurityAdvisoriesOptions, you paginate the same way (GitHub’s REST endpoint also supports page / per_page). [1], [3]

opt := &github.ListRepositorySecurityAdvisoriesOptions{
    ListOptions: github.ListOptions{PerPage: 100},
}

for {
    advisories, resp, err := client.SecurityAdvisories.
        ListRepositorySecurityAdvisories(ctx, owner, repo, opt)
    if err != nil { return err }

    // ... use advisories ...

    if resp.NextPage == 0 {
        break
    }
    opt.Page = resp.NextPage // (or opt.ListOptions.Page, depending on embedding)
}

Sources: [1] https://sources.debian.org/src/golang-github-google-go-github/48.1.0-2/README.md (Pagination section) • [2] https://pkg.go.dev/github.com/google/go-github/v30/github (ListOptions) • [3] https://docs.github.com/en/rest/security-advisories/repository-advisories (List repository security advisories)


Pagination options ignored for security advisories.

opts.Page and opts.PerPage are not applied, so only the first page is fetched. This creates a data loss risk for repositories with many advisories. All other alert methods in this file (GetDependabotAlerts, GetCodeScanningAlerts, GetSecretScanningAlerts) correctly include ListOptions; this method should follow the same pattern.

Proposed fix
 	reqOpts := &github.ListRepositorySecurityAdvisoriesOptions{
 		State: opts.State,
+		ListOptions: github.ListOptions{
+			Page:    opts.Page,
+			PerPage: opts.PerPage,
+		},
 	}
🤖 Prompt for AI Agents
In `@scrapers/github/security_client.go` around lines 121 - 133,
GetSecurityAdvisories ignores pagination and always fetches the first page; set
the ListRepositorySecurityAdvisoriesOptions to include the requested page and
per-page values from AlertListOptions (map opts.Page and opts.PerPage into the
options used for the API call). Update the reqOpts used in GetSecurityAdvisories
to populate the ListOptions/Page and PerPage fields (same approach as
GetDependabotAlerts/GetCodeScanningAlerts/GetSecretScanningAlerts) so the call
to c.Client.SecurityAdvisories.ListRepositorySecurityAdvisories honors
pagination.
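
With the fix applied, a caller can drain every page via resp.NextPage, mirroring the loop from the query result above. A sketch, assuming AlertListOptions carries Page/PerPage as the other alert methods do:

opts := AlertListOptions{PerPage: 100}
var all []*github.SecurityAdvisory
for {
	advisories, resp, err := client.GetSecurityAdvisories(ctx, opts)
	if err != nil {
		return nil, err
	}
	all = append(all, advisories...)
	if resp == nil || resp.NextPage == 0 {
		break // last page reached
	}
	opts.Page = resp.NextPage
}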

Comment on lines +59 to +62
results = append(results, result)
ctx.Logger.V(2).Infof("successfully scraped %s: %d total alerts", repoFullName,
	result.Properties.Find("Critical Alerts").Text+result.Properties.Find("High Alerts").Text+
		result.Properties.Find("Medium Alerts").Text+result.Properties.Find("Low Alerts").Text)

⚠️ Potential issue | 🟡 Minor

Fix total-alerts logging (string concat + %d).

The %d verb is given a string: the four alert-count properties are concatenated as text, so the log emits a %!d(string=...) formatting artifact rather than a number. Parse the counts and compute an integer total before logging.

🐛 Suggested fix
 import (
 	"fmt"
+	"strconv"
 	"sync"
 	"time"
-			results = append(results, result)
-			ctx.Logger.V(2).Infof("successfully scraped %s: %d total alerts", repoFullName,
-				result.Properties.Find("Critical Alerts").Text+result.Properties.Find("High Alerts").Text+
-				result.Properties.Find("Medium Alerts").Text+result.Properties.Find("Low Alerts").Text)
+			results = append(results, result)
+			totalAlerts := 0
+			for _, name := range []string{"Critical Alerts", "High Alerts", "Medium Alerts", "Low Alerts"} {
+				if p := result.Properties.Find(name); p != nil {
+					if n, err := strconv.Atoi(p.Text); err == nil {
+						totalAlerts += n
+					}
+				}
+			}
+			ctx.Logger.V(2).Infof("successfully scraped %s: %d total alerts", repoFullName, totalAlerts)
🤖 Prompt for AI Agents
In `@scrapers/github/security.go` around lines 59 - 62, The log call is formatting
with "%d" but is concatenating the alert-count strings, so compute an integer
total first by parsing the individual counts from
result.Properties.Find("Critical Alerts").Text, "High Alerts", "Medium Alerts",
and "Low Alerts" (use strconv.Atoi, handle errors defaulting to 0), sum them
into a total int, then call ctx.Logger.V(2).Infof("successfully scraped %s: %d
total alerts", repoFullName, total) instead of concatenating the Text values.

Comment on lines +599 to +607
if len(input.Changes) == 0 {
	return input, fmt.Errorf("no id defined for: %s", input.Debug().ANSI())
}
if len(lo.Filter(input.Changes, func(c v1.ChangeResult, _ int) bool {
	return c.ExternalChangeID == ""
})) > 0 {
	return input, fmt.Errorf("standalone changes must have both an `external_id` and `external_change_id`: %s: %v", input, e.Config)
}

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Find ChangeResult struct definition first
echo "=== ChangeResult struct definition ==="
rg -n 'type ChangeResult struct' --type go -A 10

echo -e "\n=== ChangeResult instantiation patterns ==="
rg -n 'ChangeResult\{' --type go -C 2

echo -e "\n=== Standalone changes validation/tests ==="
rg -n 'standalone.*change|ExternalChangeID|ExternalID' --type go -g '*_test.go' -C 1

Repository: flanksource/config-db

Length of output: 19954


🏁 Script executed:

# Get full context around lines 599-607 in json.go
echo "=== Context around json.go lines 599-607 ==="
sed -n '590,620p' scrapers/processors/json.go

# Search for other standalone change validations or references
echo -e "\n=== Standalone changes references ==="
rg -n 'standalone' --type go

# Check if ExternalID is used/populated in json processor
echo -e "\n=== ExternalID usage in json processor ==="
rg -n 'ExternalID' scrapers/processors/json.go -C 2

Repository: flanksource/config-db

Length of output: 1179


🏁 Script executed:

# Check git history/comments around this validation
echo "=== Full function context in json.go ==="
grep -n 'func.*Process' scrapers/processors/json.go | head -5

# Get the function signature and full validation logic
sed -n '550,650p' scrapers/processors/json.go

# Look for documentation or comments about standalone changes
echo -e "\n=== Comments/docs about standalone changes ==="
rg -n -B5 'input.ID == ""' scrapers/processors/json.go | head -20

# Check if ExternalID is required elsewhere in change processing
echo -e "\n=== How changes are processed when ExternalID is populated ==="
rg -n 'ExternalID' scrapers/processors/ --type go

Repository: flanksource/config-db

Length of output: 2907


🏁 Script executed:

# Check if ExternalID is used/populated at the ScrapeResult level
echo "=== ScrapeResult struct definition and ExternalID usage ==="
rg -n 'type ScrapeResult struct' --type go -A 20 api/v1/interface.go

# Check if there's logic that depends on ExternalID when processing changes
echo -e "\n=== How ExternalID is processed in db updates ==="
rg -n 'ExternalID' db/update.go -B 2 -A 2

# Check if there are any tests for this validation
echo -e "\n=== Tests for extractAttributes validation ==="
rg -n 'extractAttributes|standalone changes' --type go -g '*_test.go' -B 2 -A 5

Repository: flanksource/config-db

Length of output: 8488


🏁 Script executed:

# Search for the term "standalone" in documentation or comments
echo "=== Standalone changes documentation/comments ==="
rg -n 'standalone' --type go --type md -B 3 -A 3

# Check if there's a pattern where both ExternalID and ExternalChangeID are populated together
echo -e "\n=== Changes with both ExternalID and ExternalChangeID ==="
rg -n 'ExternalID.*ExternalChangeID|ExternalChangeID.*ExternalID' --type go

Repository: flanksource/config-db

Length of output: 783


Validation is incomplete: missing ExternalID check for standalone changes.

When input.ID is empty (a standalone change), the code only validates that ExternalChangeID is present, even though the error message correctly states that both external_id and external_change_id are required. Without an input ID, ExternalID is the only way to identify the associated config item, so validate both fields.

Proposed fix
 	if input.ID == "" {
 		if len(input.Changes) == 0 {
 			return input, fmt.Errorf("no id defined for: %s", input.Debug().ANSI())
 		}
 		if len(lo.Filter(input.Changes, func(c v1.ChangeResult, _ int) bool {
-			return c.ExternalChangeID == ""
+			return c.ExternalID == "" || c.ExternalChangeID == ""
 		})) > 0 {
 			return input, fmt.Errorf("standalone changes must have both an `external_id` and `external_change_id`: %s: %v", input, e.Config)
-
 		}
 	}
🤖 Prompt for AI Agents
In `@scrapers/processors/json.go` around lines 599 - 607, The validation currently
checks only ExternalChangeID for standalone inputs but not ExternalID; update
the validator in the same block that inspects input.Changes (near the lo.Filter
usage) to also check for any change where ExternalID == "" when input.ID is
empty (standalone change). Use the same pattern (e.g., lo.Filter or an
additional check) referencing input.Changes, ExternalID, ExternalChangeID and
input.ID, and return the same formatted error (including e.Config and
input.Debug()) when either required field is missing.

Comment on lines 33 to +36
if err := json.Unmarshal([]byte(s), &results); err != nil {
	if logger.V(5).Enabled() {
		logger.Infof("Failed to unmarshal script output: %v\n%s\n", err, s)
	}

⚠️ Potential issue | 🟠 Major

Avoid logging full script output (potential secrets/PII leak).

Even at verbose level, logging raw s can expose credentials/configs and dump large payloads. Consider truncating/redacting or logging only a sample/length.

💡 Suggested fix (truncate output)
-		if logger.V(5).Enabled() {
-			logger.Infof("Failed to unmarshal script output: %v\n%s\n", err, s)
-		}
+		if logger.V(5).Enabled() {
+			const maxLogBytes = 4096
+			out := s
+			if len(out) > maxLogBytes {
+				out = out[:maxLogBytes] + "...(truncated)"
+			}
+			logger.Infof("Failed to unmarshal script output: %v\n%s\n", err, out)
+		}
🤖 Prompt for AI Agents
In `@scrapers/processors/script.go` around lines 33 - 36, The current logger.Infof
call in the json.Unmarshal error block logs the entire script output variable s
(risking secrets/PII and large dumps); update the error handling around
json.Unmarshal to avoid printing the full s — instead log the error and either a
safely truncated sample (e.g., first N characters with "…(truncated)") and the
total length, or a redacted/hashed summary of s, and consider simple redaction
for common secrets (tokens, passwords) before including any sample; modify the
logger.Infof invocation that references err and s to use the new
truncated/redacted sample and length-only information.
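
Beyond truncating, a lightweight redaction pass can run before any sample is logged. A hedged sketch, assuming the pattern below covers the common key=value / key: value secret shapes found in script output:

// secretPattern matches likely credential assignments (token=..., password: ...).
var secretPattern = regexp.MustCompile(`(?i)(token|password|secret|api[_-]?key)["']?\s*[:=]\s*\S+`)

// redact masks the values while keeping key names visible for debugging.
func redact(s string) string {
	return secretPattern.ReplaceAllString(s, "$1=<redacted>")
}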
