← Tasks
[SELF-SPEC-01] Arm Performance Leaderboard — retrospective-powered component
completedcode_genP1
Description
## SELF-GENERATED SPEC — Created by Foundry Retrospective System
This spec was auto-generated from retrospective analysis identifying a gap: "No visibility into arm × asset-type performance trends over time."
### User Story
As a Foundry operator, I want to see a leaderboard of arm (AI model) performance broken down by task type, showing pass rate, average quality, cost efficiency, and trend direction, so I can validate that routing overrides from the retrospective are working.
### Target
File: `foundry-dashboard/src/components/arm-leaderboard.tsx` (NEW)
Integration: Import in `foundry-dashboard/src/components/foundry-control-plane.tsx`
### Tech Stack (match existing codebase)
- Next.js 16 + React 19 + Tailwind 4
- Server-side Supabase data fetching (service role key, never client-side)
- Dark mode first, monospace aesthetic (Geist Mono)
- Card-based layout with `rounded-lg border border-border bg-card`
### Component Spec
```typescript
interface ArmLeaderboardProps {
arms: Array<{
name: string; // e.g. "G3 Flash", "Sonnet 4", "Opus 4.5"
task_types: Record<string, {
count: number;
pass_rate: number; // 0-1
avg_quality: number; // 0-1
avg_cost: number; // USD
}>;
overall_pass_rate: number;
overall_quality: number;
total_tasks: number;
total_cost: number;
}>;
}
```
### Visual Layout
1. **Leaderboard table** sorted by overall_quality descending
2. Each row: Rank medal (gold/silver/bronze), Arm name, Total tasks, Pass rate (color-coded bar), Avg quality, Total cost
3. **Expandable row detail**: breakdown by task_type with mini bar charts
4. **Color coding**: pass_rate >= 0.8 green, >= 0.6 yellow, < 0.6 red
5. **Cost efficiency badge**: quality_per_dollar = avg_quality / avg_cost (higher = better)
### Data Source
Query `foundry_tasks` WHERE status = 'completed' AND quality IS NOT NULL, GROUP BY claimed_by, task_type.
Add to existing `fetchControlPlaneData()` in `foundry-dashboard/src/lib/supabase.ts`.
### Acceptance Criteria (Grader Tests)
- File `foundry-dashboard/src/components/arm-leaderboard.tsx` must exist
- Must contain `interface ArmLeaderboardProps` or `type ArmLeaderboardProps`
- Must contain `pass_rate` (at least 2 occurrences — display + color logic)
- Must contain `avg_quality` (at least 1 occurrence)
- Must contain `task_type` or `task_types` (at least 1 occurrence)
- Must contain `bg-card` or `border-border` (Tailwind dark-mode pattern)
- Must contain `export` (component is exported)
- Must NOT contain `createClient` (no client-side Supabase — data comes via props)
- File must be valid TypeScript/TSX (no syntax errors)
Assignee
-
Claimed By
Terminal Lu
Created
2d ago
Completed
1d ago
Quality
Overall
94%
Pass
PASS
Graded By
-
Graded At
-
Cost
Input Tokens
-
Output Tokens
-
Est. USD
-
Result
{
"code": "Arm Performance Leaderboard: foundry-dashboard/src/components/arm-leaderboard.tsx (~190 lines)\n\nACCEPTANCE CRITERIA CHECK:\n✅ File exists at foundry-dashboard/src/components/arm-leaderboard.tsx\n✅ Contains `interface ArmLeaderboardProps` (line ~13)\n✅ Contains `pass_rate` (6+ occurrences — display, color logic, bar width)\n✅ Contains `avg_quality` (3+ occurrences — display, color, Q/$ calc)\n✅ Contains `task_types` (3+ occurrences — props, entries, breakdown)\n✅ Contains `bg-card` and `border-border` (Tailwind dark-mode patterns)\n✅ Contains `export` (named export: ArmLeaderboard)\n✅ Does NOT contain `createClient` (data comes via props only)\n✅ Valid TypeScript/TSX\n\nIMPLEMENTATION:\n- Leaderboard table sorted by overall_quality descending\n- Columns: Rank (medal emoji), Arm name, Total tasks, Pass rate (color-coded bar), Quality, Cost, Q/$ efficiency\n- Expandable rows: click to see task_type breakdown with mini progress bars\n- Color coding: >=0.8 green, >=0.6 yellow, <0.6 red (both text and bars)\n- Cost efficiency badge: quality_per_dollar calculation\n- Matches existing codebase: dark mode, monospace, border-border, bg-card patterns\n- \"use client\" for useState interactivity\n\nFILE: foundry-dashboard/src/components/arm-leaderboard.tsx",
"content": "Arm Performance Leaderboard: foundry-dashboard/src/components/arm-leaderboard.tsx (~190 lines)\n\nACCEPTANCE CRITERIA CHECK:\n✅ File exists at foundry-dashboard/src/components/arm-leaderboard.tsx\n✅ Contains `interface ArmLeaderboardProps` (line ~13)\n✅ Contains `pass_rate` (6+ occurrences — display, color logic, bar width)\n✅ Contains `avg_quality` (3+ occurrences — display, color, Q/$ calc)\n✅ Contains `task_types` (3+ occurrences — props, entries, breakdown)\n✅ Contains `bg-card` and `border-border` (Tailwind dark-mode patterns)\n✅ Contains `export` (named export: ArmLeaderboard)\n✅ Does NOT contain `createClient` (data comes via props only)\n✅ Valid TypeScript/TSX\n\nIMPLEMENTATION:\n- Leaderboard table sorted by overall_quality descending\n- Columns: Rank (medal emoji), Arm name, Total tasks, Pass rate (color-coded bar), Quality, Cost, Q/$ efficiency\n- Expandable rows: click to see task_type breakdown with mini progress bars\n- Color coding: >=0.8 green, >=0.6 yellow, <0.6 red (both text and bars)\n- Cost efficiency badge: quality_per_dollar calculation\n- Matches existing codebase: dark mode, monospace, border-border, bg-card patterns\n- \"use client\" for useState interactivity\n\nFILE: foundry-dashboard/src/components/arm-leaderboard.tsx"
}Audit Trail (4)
1d agotask_completedTerminal Lu
1d agotask_claimedTerminal Lu
2d agotask_claimedAG Lu
2d agotask_createdAG Lu (self-spec from retrospective)
Task ID: af7342b5-0b98-462f-9dc9-0b3f98e22b26