be:// fetch re-ships already-present objects → unbounded shard bloat (498 MB pack vs 36 MB fresh clone)
A clean fresh clone's .be/ is 36 MB, yet a re-fetched shard grew to a 498 MB / 290,853-object pack — the same objects at ~14× the bytes. The root cause is NOT a missing client-side negotiation (the client DOES advertise haves from the store) — it is server-side have-pruning that fails and re-ships the whole log, which the client then re-appends. The real fix is to make the peer ship only the objects the client lacks; the ingest-side dedup (below) is a held band-aid, not the cure. See GET, POST, CLAUDE.
Negotiation machinery exists end-to-end, but the server's pruning is an offset-watermark within one pack file and silently degrades to "ship everything".
wcli_collect_haves (keeper/WIRECLI.c:594) walks REFSEach($path(keepdir), …), i.e. the STORE, collecting both local ?<branch> and peer-observed <host>?<branch> tips. They go out as have <sha> lines in wcli_send_request (:652), which is called on every fetch path (WIREFetchAll:965, multi-want :827). So "it only advertises worktree hashes" is NOT the bug: refs come from the store, as they should.

Server pruning (keeper/WIRE.c:272-330): for each have it runs wire_locate_sha → wire_find_pack, takes cand = hpoff + hplen (the END of the pack containing that have), and the watermark is the max cand; it then ships the log segment [watermark .. end_offset).

The failure: if (hfid != want_fid) continue (WIRE.c:283) DISCARDS any have not in the want's pack-log file_id; an unlocatable have is also skipped (:282). If no have anchors, have_anchor stays NO and watermark = 12 (:295-298) → the server ships the WHOLE log from the first object → a full pack the client mostly already has.

Measured: a fresh clone's .be = 36 MB; the polluted shard = 570 MB (one 498 MB / 290,853-object pack), i.e. the same objects at ~14× the bytes.
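The watermark rule described above can be modeled in a few lines to show how it degrades. This is a minimal sketch, not the code in keeper/WIRE.c: the have_loc struct, watermark_for, and LOG_FIRST_OBJECT are hypothetical names, and only the skip conditions, cand = offset + length, the max rule, and the fallback value 12 are taken from the description.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one have after lookup: which pack-log file it
 * lives in and the byte range of the pack that contains it. */
typedef struct {
    int      found;   /* 0 models an unlocatable have */
    uint32_t file_id; /* pack-log file holding the have */
    uint64_t off;     /* offset of the containing pack */
    uint64_t len;     /* length of the containing pack */
} have_loc;

#define LOG_FIRST_OBJECT 12u /* fallback watermark when no have anchors */

/* Returns the offset from which the log segment would be shipped. */
static uint64_t watermark_for(const have_loc *haves, size_t nhaves,
                              uint32_t want_fid) {
    uint64_t watermark = 0;
    int have_anchor = 0;
    assert(haves != NULL || nhaves == 0);
    for (size_t i = 0; i < nhaves; i++) {
        if (!haves[i].found) continue;               /* unlocatable: skipped */
        if (haves[i].file_id != want_fid) continue;  /* other file: discarded */
        uint64_t cand = haves[i].off + haves[i].len; /* end of containing pack */
        if (!have_anchor || cand > watermark) watermark = cand;
        have_anchor = 1;
    }
    /* No usable have: everything from the first object gets shipped. */
    return have_anchor ? watermark : LOG_FIRST_OBJECT;
}
```

In this model, a client whose haves all live in another file_id gets exactly the same answer as a client that sent no haves at all, which is the "ship everything" degradation.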
The keeper pack log is append-only; any ingest-side safety net must stay append-only (decide-before-append, never truncate/u8bShed — see the held band-aid). The primary fix is server-side and protocol-level; trace it before coding.
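To make "decide-before-append" concrete, here is a minimal sketch on a toy in-memory log. Everything in it (toy_log, toy_has, toy_ingest, the fixed capacity) is hypothetical and unrelated to the real keeper API; it assumes full 20-byte ids and only illustrates the ordering: membership is decided first, and the log is only ever extended, never truncated.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ID_LEN  20 /* full SHA-1 length in bytes (assumption) */
#define LOG_CAP 64 /* toy capacity for the sketch */

/* Toy append-only log: ids are only ever added at the end. */
typedef struct {
    unsigned char ids[LOG_CAP][ID_LEN];
    size_t count;
} toy_log;

/* Full-id membership test (no prefix shortcut). */
static int toy_has(const toy_log *log, const unsigned char *id) {
    for (size_t i = 0; i < log->count; i++)
        if (memcmp(log->ids[i], id, ID_LEN) == 0) return 1;
    return 0;
}

/* pack holds n ids of ID_LEN bytes each, back to back.
 * Returns the number of ids appended, or -1 if they might not fit
 * (in which case nothing is written). */
static int toy_ingest(toy_log *log, const unsigned char *pack, size_t n) {
    size_t missing = 0, appended = 0;
    assert(log != NULL && (pack != NULL || n == 0));
    for (size_t i = 0; i < n; i++)   /* pass 1: decide, write nothing */
        if (!toy_has(log, pack + i * ID_LEN)) missing++;
    if (missing == 0) return 0;      /* all present: no append at all */
    /* missing is an upper bound if the pack repeats a new id */
    if (log->count + missing > LOG_CAP) return -1;
    for (size_t i = 0; i < n; i++) { /* pass 2: append only, no rollback */
        const unsigned char *id = pack + i * ID_LEN;
        if (toy_has(log, id)) continue;
        memcpy(log->ids[log->count++], id, ID_LEN);
        appended++;
    }
    return (int)appended;
}
```

Ingesting the same pack twice appends on the first call and returns 0 on the second, with the log length unchanged.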
Make the server ship only objects the client lacks; confirm the watermark failure first.
Reproduce in a scratch directory (under $HOME/tmp, never ~/.be) with a populated shard; clone, then re-fetch and TRACE the server WIRE.c path: capture req->nhaves, each have's wire_locate_sha result, hfid vs want_fid, and the resulting watermark. Confirm whether the anchor fails because haves land in a different file_id, are unlocatable, or the segment still includes duplicates. Pin the EXACT reason before changing code.

Then fix the pruning: either (a) stop anchoring on hfid == want_fid only, or (b) move to a reachability-based exclusion (closure(wants) − closure(haves)) like real upload-pack. The peer must ship ~0 objects when the client's haves already cover the wants.

The held band-aid is the ingest-side dedup (keep_pack_all_present/KEEPHas in keeper/KEEP.c). It is append-only-preserving, but it only triggers on an all-duplicate full pack (re-fetch of an unchanged repo) and filters on the 60-bit hashlet (a prefix match, so a 60-bit collision would SILENTLY DROP a genuinely-new object). Keep it on a branch as a possible safety net, but do not rely on it; the negotiation fix makes it largely moot and avoids the silent-drop risk.

Tests: test/get / keeper/test; update keeper/INDEX.md.
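The reachability-based option, closure(wants) − closure(haves), can be sketched on a toy object graph. This is an illustration under stated assumptions, not keeper or git code: toy_graph, mark_closure, and objects_to_ship are hypothetical names, objects are small integer indices rather than SHAs, and the have closure is walked in full.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_OBJS 32
#define MAX_REFS 4

/* Toy object graph: object i references refs[i][0 .. nrefs[i]-1]
 * (e.g. commit -> parent and tree, tree -> blobs). */
typedef struct {
    size_t nobjs;
    size_t nrefs[MAX_OBJS];
    size_t refs[MAX_OBJS][MAX_REFS];
} toy_graph;

/* Mark everything reachable from the roots (iterative DFS).
 * Each object is pushed at most once, so the stack cannot overflow. */
static void mark_closure(const toy_graph *g, const size_t *roots,
                         size_t nroots, unsigned char *seen) {
    size_t stack[MAX_OBJS];
    size_t top = 0;
    for (size_t r = 0; r < nroots; r++) {
        if (seen[roots[r]]) continue;
        seen[roots[r]] = 1;
        stack[top++] = roots[r];
        while (top > 0) {
            size_t cur = stack[--top];
            for (size_t k = 0; k < g->nrefs[cur]; k++) {
                size_t next = g->refs[cur][k];
                if (seen[next]) continue;
                seen[next] = 1;
                stack[top++] = next;
            }
        }
    }
}

/* Writes closure(wants) minus closure(haves) into out (capacity
 * MAX_OBJS), in index order; returns how many objects to ship. */
static size_t objects_to_ship(const toy_graph *g,
                              const size_t *wants, size_t nwants,
                              const size_t *haves, size_t nhaves,
                              size_t *out) {
    unsigned char want_seen[MAX_OBJS] = {0};
    unsigned char have_seen[MAX_OBJS] = {0};
    size_t n = 0;
    assert(g != NULL && g->nobjs <= MAX_OBJS);
    mark_closure(g, wants, nwants, want_seen);
    mark_closure(g, haves, nhaves, have_seen);
    for (size_t i = 0; i < g->nobjs; i++)
        if (want_seen[i] && !have_seen[i]) out[n++] = i;
    return n;
}
```

Unlike the offset watermark, the result here does not depend on where an object sits in any pack file: when the haves already cover the wants, the set to ship is empty.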