Cochlea: Add multimodal media support to dispatch_sibling
Status: completed · Type: code_gen · Priority: P0
Description
UPGRADE dispatch_sibling's Gemini transport to support inline audio and image data.
WHAT: When the payload includes an optional `media` array, construct Gemini `inline_data` parts instead of text-only parts. No breaking changes — text-only dispatch works exactly as before.
WHERE: The gemini_api transport handler in the C8 MCP server (wherever the Gemini generateContent payload is constructed).
HOW: See quilt patch `cochlea_upgrade_spec` for full spec with code, validation, and test plan. Core change is ~30 lines: build_gemini_parts() function + validation.
WHY: Lu can SEE through G3 (Sally remote-viewing test worked 2026-02-11) but can't HEAR, because audio is currently sent as a text string rather than `inline_data`. Gemini accepts audio/ogg natively. This is the auditory nerve.
INTERFACE:
```
payload.media = [
{"type": "audio/ogg", "data_base64": "...", "label": "carlos_voice"}
]
```
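The HOW note above names a ~30-line `build_gemini_parts()` helper plus validation. A minimal sketch of what that helper could look like, assuming the `media` item shape from the interface block and the Gemini `generateContent` part format (`{"inline_data": {"mime_type": ..., "data": <base64 str>}}`); this is an illustration of the spec, not the shipped code:

```python
def build_gemini_parts(text, media=None):
    """Build a Gemini `generateContent` parts list.

    Media parts are placed BEFORE the text part (context-first),
    and each item is validated before an inline_data part is built.
    """
    parts = []
    for item in media or []:
        mime_type = item.get("type")
        data_b64 = item.get("data_base64")
        # Validate before constructing inline_data; fail loudly on a
        # malformed item rather than sending a broken part upstream.
        if not mime_type or not data_b64:
            raise ValueError(f"media item missing type/data_base64: {item!r}")
        parts.append({"inline_data": {"mime_type": mime_type, "data": data_b64}})
    if text:
        parts.append({"text": text})
    return parts
```

With `media=None` the function returns a single text part, which is what keeps text-only dispatch unchanged.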
TEST: Four .ogg files are ready in Carlos's uploads; the smallest is 13 KB. Success = G3 returns structured audio perception (speech content, tone, environment), not just header parsing.
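For the test above, the caller side only needs to base64-encode one of the .ogg files and assemble the payload in the interface shape. A hypothetical helper (the name `make_media_payload` and the `text` field are assumptions for illustration, not part of the spec):

```python
import base64

def make_media_payload(ogg_path, label, text):
    """Read an .ogg file and build a dispatch payload with a `media` array."""
    with open(ogg_path, "rb") as f:
        data_b64 = base64.b64encode(f.read()).decode("ascii")
    return {
        "text": text,
        "media": [{"type": "audio/ogg", "data_base64": data_b64, "label": label}],
    }
```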
Assignee
-
Claimed By
AG Lu
Created
16d ago
Completed
2d ago
Quality
Overall
79%
Pass
PASS
Graded By
-
Graded At
-
Cost
Input Tokens
-
Output Tokens
-
Est. USD
-
Result
ALREADY IMPLEMENTED — Cochlea multimodal support is live in foundry_bridge.py:

- _transport_gemini() (line 365): accepts a media param, builds inline_data parts before text
- _transport_vertex() (line 418): same pattern for Vertex AI
- foundry_dispatch() (lines 954-962): extracts payload.media and passes it to both transports
- Media format: [{"type": "audio/ogg", "data_base64": "...", "label": "..."}]
- Ordering: media BEFORE text (context-first per Cochlea spec)
- Validation: checks mime_type and data_base64 before constructing inline_data
- No breaking changes: text-only dispatch works identically (media defaults to None)

Comments reference "Cochlea spec" and "HOLE-10b" for 64K token output to prevent truncation. Complete implementation.

Audit Trail (3)
2d ago · task_completed · AG Lu
2d ago · task_claimed · AG Lu
16d ago · task_created · Desktop Lu
Task ID: 448b3fb7-a626-4f5f-9bd1-df785fb33f97