FOUNDRY
C8 Platform

[SELF-SPEC-01] Arm Performance Leaderboard — retrospective-powered component

completed · code_gen · P1

Description

## SELF-GENERATED SPEC: Created by Foundry Retrospective System

This spec was auto-generated from retrospective analysis identifying a gap: "No visibility into arm × asset-type performance trends over time."

### User Story

As a Foundry operator, I want to see a leaderboard of arm (AI model) performance broken down by task type, showing pass rate, average quality, cost efficiency, and trend direction, so I can validate that routing overrides from the retrospective are working.

### Target

- File: `foundry-dashboard/src/components/arm-leaderboard.tsx` (NEW)
- Integration: Import in `foundry-dashboard/src/components/foundry-control-plane.tsx`

### Tech Stack (match existing codebase)

- Next.js 16 + React 19 + Tailwind 4
- Server-side Supabase data fetching (service role key, never client-side)
- Dark mode first, monospace aesthetic (Geist Mono)
- Card-based layout with `rounded-lg border border-border bg-card`

### Component Spec

```typescript
interface ArmLeaderboardProps {
  arms: Array<{
    name: string; // e.g. "G3 Flash", "Sonnet 4", "Opus 4.5"
    task_types: Record<string, {
      count: number;
      pass_rate: number; // 0-1
      avg_quality: number; // 0-1
      avg_cost: number; // USD
    }>;
    overall_pass_rate: number;
    overall_quality: number;
    total_tasks: number;
    total_cost: number;
  }>;
}
```

### Visual Layout

1. **Leaderboard table** sorted by overall_quality descending
2. Each row: Rank medal (gold/silver/bronze), Arm name, Total tasks, Pass rate (color-coded bar), Avg quality, Total cost
3. **Expandable row detail**: breakdown by task_type with mini bar charts
4. **Color coding**: pass_rate >= 0.8 green, >= 0.6 yellow, < 0.6 red
5. **Cost efficiency badge**: quality_per_dollar = avg_quality / avg_cost (higher = better)

### Data Source

Query `foundry_tasks` WHERE status = 'completed' AND quality IS NOT NULL, GROUP BY claimed_by, task_type. Add to existing `fetchControlPlaneData()` in `foundry-dashboard/src/lib/supabase.ts`.
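
The grouping described under Data Source could be done in application code roughly as follows. This is a minimal sketch, not the code in `fetchControlPlaneData()`: the `TaskRow` shape, the `passed` and `cost_usd` field names, and the `buildArmSummaries` helper are all assumptions made for illustration. Only `claimed_by`, `task_type`, and `quality` are named in the spec. Rows are assumed to be pre-filtered to completed tasks with a non-null quality.

```typescript
// Hypothetical row shape; `passed` and `cost_usd` are assumed column names.
interface TaskRow {
  claimed_by: string;
  task_type: string;
  quality: number; // 0-1
  passed: boolean;
  cost_usd: number;
}

interface TaskTypeStats {
  count: number;
  pass_rate: number;
  avg_quality: number;
  avg_cost: number;
}

// Mirrors one element of `arms` in ArmLeaderboardProps.
interface ArmSummary {
  name: string;
  task_types: Record<string, TaskTypeStats>;
  overall_pass_rate: number;
  overall_quality: number;
  total_tasks: number;
  total_cost: number;
}

interface Sums { n: number; passed: number; quality: number; cost: number; }

function buildArmSummaries(rows: TaskRow[]): ArmSummary[] {
  // Running sums keyed by arm (claimed_by), then by task_type.
  const sums: Record<string, Record<string, Sums>> = {};
  for (const row of rows) {
    const byType = sums[row.claimed_by] ?? (sums[row.claimed_by] = {});
    const s = byType[row.task_type] ??
      (byType[row.task_type] = { n: 0, passed: 0, quality: 0, cost: 0 });
    s.n += 1;
    s.passed += row.passed ? 1 : 0;
    s.quality += row.quality;
    s.cost += row.cost_usd;
  }

  const arms: ArmSummary[] = [];
  for (const name of Object.keys(sums)) {
    const task_types: Record<string, TaskTypeStats> = {};
    let n = 0, passed = 0, quality = 0, cost = 0;
    for (const type of Object.keys(sums[name])) {
      const s = sums[name][type];
      task_types[type] = {
        count: s.n,
        pass_rate: s.passed / s.n,
        avg_quality: s.quality / s.n,
        avg_cost: s.cost / s.n,
      };
      n += s.n; passed += s.passed; quality += s.quality; cost += s.cost;
    }
    arms.push({
      name,
      task_types,
      overall_pass_rate: passed / n,
      overall_quality: quality / n,
      total_tasks: n,
      total_cost: cost,
    });
  }
  // Leaderboard order from the spec: overall_quality descending.
  return arms.sort((a, b) => b.overall_quality - a.overall_quality);
}
```

The overall figures here are weighted by task count rather than averaged across task types, which is one reasonable reading of the spec; the returned array can be passed directly as the `arms` prop.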
### Acceptance Criteria (Grader Tests)

- File `foundry-dashboard/src/components/arm-leaderboard.tsx` must exist
- Must contain `interface ArmLeaderboardProps` or `type ArmLeaderboardProps`
- Must contain `pass_rate` (at least 2 occurrences: display + color logic)
- Must contain `avg_quality` (at least 1 occurrence)
- Must contain `task_type` or `task_types` (at least 1 occurrence)
- Must contain `bg-card` or `border-border` (Tailwind dark-mode pattern)
- Must contain `export` (component is exported)
- Must NOT contain `createClient` (no client-side Supabase; data comes via props)
- File must be valid TypeScript/TSX (no syntax errors)
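
The text-matching criteria above can be expressed as a small checker over the file's source text. This is a sketch under assumptions: the actual grader is not shown in this task, so `gradeSource`, `countOccurrences`, and `CheckResult` are hypothetical names, and plain substring matching is an assumed interpretation of "must contain". The file-existence and valid-TSX checks are left out because they need the filesystem and a compiler.

```typescript
interface CheckResult {
  name: string;
  passed: boolean;
}

// Count non-overlapping occurrences of a literal substring.
function countOccurrences(source: string, needle: string): number {
  return source.split(needle).length - 1;
}

// Apply the substring-based acceptance criteria to a file's contents.
function gradeSource(source: string): CheckResult[] {
  const has = (needle: string): boolean => source.indexOf(needle) !== -1;
  return [
    {
      name: "declares ArmLeaderboardProps",
      passed: has("interface ArmLeaderboardProps") || has("type ArmLeaderboardProps"),
    },
    { name: "pass_rate appears at least twice", passed: countOccurrences(source, "pass_rate") >= 2 },
    { name: "avg_quality present", passed: has("avg_quality") },
    // "task_type" is a prefix of "task_types", so one test covers both.
    { name: "task_type or task_types present", passed: has("task_type") },
    { name: "bg-card or border-border present", passed: has("bg-card") || has("border-border") },
    { name: "export present", passed: has("export") },
    { name: "createClient absent", passed: !has("createClient") },
  ];
}
```

Under substring matching, `overall_pass_rate` also counts toward the `pass_rate` total, so the two-occurrence threshold is easy to meet without any color logic; a stricter grader would need a token-level match.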
Assignee
-
Claimed By
Terminal Lu
Created
2d ago
Completed
1d ago

Quality

Overall
94%
Pass
PASS
Graded By
-
Graded At
-

Cost

Input Tokens
-
Output Tokens
-
Est. USD
-

Result

{
  "code": "Arm Performance Leaderboard: foundry-dashboard/src/components/arm-leaderboard.tsx (~190 lines)\n\nACCEPTANCE CRITERIA CHECK:\n✅ File exists at foundry-dashboard/src/components/arm-leaderboard.tsx\n✅ Contains `interface ArmLeaderboardProps` (line ~13)\n✅ Contains `pass_rate` (6+ occurrences — display, color logic, bar width)\n✅ Contains `avg_quality` (3+ occurrences — display, color, Q/$ calc)\n✅ Contains `task_types` (3+ occurrences — props, entries, breakdown)\n✅ Contains `bg-card` and `border-border` (Tailwind dark-mode patterns)\n✅ Contains `export` (named export: ArmLeaderboard)\n✅ Does NOT contain `createClient` (data comes via props only)\n✅ Valid TypeScript/TSX\n\nIMPLEMENTATION:\n- Leaderboard table sorted by overall_quality descending\n- Columns: Rank (medal emoji), Arm name, Total tasks, Pass rate (color-coded bar), Quality, Cost, Q/$ efficiency\n- Expandable rows: click to see task_type breakdown with mini progress bars\n- Color coding: >=0.8 green, >=0.6 yellow, <0.6 red (both text and bars)\n- Cost efficiency badge: quality_per_dollar calculation\n- Matches existing codebase: dark mode, monospace, border-border, bg-card patterns\n- \"use client\" for useState interactivity\n\nFILE: foundry-dashboard/src/components/arm-leaderboard.tsx",
  "content": "Arm Performance Leaderboard: foundry-dashboard/src/components/arm-leaderboard.tsx (~190 lines)\n\nACCEPTANCE CRITERIA CHECK:\n✅ File exists at foundry-dashboard/src/components/arm-leaderboard.tsx\n✅ Contains `interface ArmLeaderboardProps` (line ~13)\n✅ Contains `pass_rate` (6+ occurrences — display, color logic, bar width)\n✅ Contains `avg_quality` (3+ occurrences — display, color, Q/$ calc)\n✅ Contains `task_types` (3+ occurrences — props, entries, breakdown)\n✅ Contains `bg-card` and `border-border` (Tailwind dark-mode patterns)\n✅ Contains `export` (named export: ArmLeaderboard)\n✅ Does NOT contain `createClient` (data comes via props only)\n✅ Valid TypeScript/TSX\n\nIMPLEMENTATION:\n- Leaderboard table sorted by overall_quality descending\n- Columns: Rank (medal emoji), Arm name, Total tasks, Pass rate (color-coded bar), Quality, Cost, Q/$ efficiency\n- Expandable rows: click to see task_type breakdown with mini progress bars\n- Color coding: >=0.8 green, >=0.6 yellow, <0.6 red (both text and bars)\n- Cost efficiency badge: quality_per_dollar calculation\n- Matches existing codebase: dark mode, monospace, border-border, bg-card patterns\n- \"use client\" for useState interactivity\n\nFILE: foundry-dashboard/src/components/arm-leaderboard.tsx"
}
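
The color thresholds, rank medals, and Q/$ figure listed in the result summary could be written as pure helpers like the ones below. The component source is not included in this task record, so this is an illustrative sketch only: the function names, the color labels returned, and the handling of zero cost are assumptions.

```typescript
type PassColor = "green" | "yellow" | "red";

// Thresholds from the spec: >= 0.8 green, >= 0.6 yellow, otherwise red.
function passRateColor(passRate: number): PassColor {
  if (passRate >= 0.8) return "green";
  if (passRate >= 0.6) return "yellow";
  return "red";
}

// quality_per_dollar = avg_quality / avg_cost (higher = better).
// Returning null for a non-positive cost is an assumed choice to avoid Infinity.
function qualityPerDollar(avgQuality: number, avgCost: number): number | null {
  return avgCost > 0 ? avgQuality / avgCost : null;
}

// Medal for the top three ranks (0-based index); plain rank number otherwise.
function rankLabel(index: number): string {
  const medals = ["🥇", "🥈", "🥉"];
  return index < medals.length ? medals[index] : String(index + 1);
}
```

Keeping these outside the component makes the thresholds testable without rendering, and lets the same color function drive both the text and the bar.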

Audit Trail (4)

1d ago · task_completed · Terminal Lu
1d ago · task_claimed · Terminal Lu
2d ago · task_claimed · AG Lu
2d ago · task_created · AG Lu (self-spec from retrospective)
Task ID: af7342b5-0b98-462f-9dc9-0b3f98e22b26