带溯源的 K 路报价流合并
Merge K Quote Streams with Lineage
开始编码跨场所报价聚合管道接入 K 条独立行情——每个场所一条——每条都是按 ts 升序的 [ts, payload] 流(场所端解析器以这个形态推过来)。下游消费者(统一 TCA / 回放层)想要的是**一条按时间戳升序的合并流,且每条记录都带上来源 stream_id,让端到端的 lineage 可追溯:当回放在 t=12345 显示一笔报价时,必须能反指出究竟是哪个场所打印的。这条管道还需要审计支线**——每个场所贡献了多少条记录到合并输出——好让 ops 一眼就能识别出某个理应活跃的场所是否静默了(贡献数为 0),无须再次重走整条合并磁带。
朴素的“压平再 sort”是 O(N log N) 且抛弃了“每条流本身已有序”这一结构。标准做法仍是最小堆 K 路归并:把每条流的当前队首以 (ts, stream_index, position) 入堆,弹出最小者输出、再把该流的下一条压进堆。整体 O(N log K)。stream_index 放在元组第二位是稳定决胜的关键:相同 ts 时较小的 stream_index 确定地先输出,且 Python 永远不需要去比较 payload。
请实现 solution(streams: list[list[list]]) -> dict。streams 是 K 条按 ts 升序的报价流,每条记录形如 [ts, payload]。返回 {"merged": [...], "counts_per_stream": [...]},其中 merged 是全局有序的 [stream_id, ts, payload] 三元组列表(注意 stream_id 排在三元组*第一位*——这是溯源标签),counts_per_stream[i] 是 streams[i] 贡献的记录数。
例如 solution([[[1, "A1"], [4, "A2"]], [[2, "B1"], [4, "B2"]], [[3, "C1"], [4, "C2"]]]) 应返回 {"merged": [[0, 1, "A1"], [1, 2, "B1"], [2, 3, "C1"], [0, 4, "A2"], [1, 4, "B2"], [2, 4, "C2"]], "counts_per_stream": [2, 2, 2]}。ts=4 时三条流同时打印,输出顺序为 stream 0、1、2,正合契约要求。
再举一例:solution([[[5, "x"]]])(单流)返回 {"merged": [[0, 5, "x"]], "counts_per_stream": [1]}——注意即便不需要真正归并,记录依然要重塑成 [stream_id, ts, payload] 三元组形态。solution([])(空 streams)返回 {"merged": [], "counts_per_stream": []}。
约束条件
- 0 ≤ K = len(streams) ≤ 100
- 每条流长度 0 ≤ m_i;所有流总记录数 N = Σ m_i ≤ 10^5
- 每条流内部按 ts 升序;同一流内 ts 并列时保持原顺序
- streams[i][j] 形如 `[ts, payload]`,ts 是 [0, 10^9] 的整数,payload 是字符串
- K 条流中允许出现空流 `[]`,必须跳过、不能压进堆
- 返回 `{"merged": list, "counts_per_stream": list[int]}`。`merged` 的每个元素是 `[stream_id, ts, payload]` 三元组,整体按 `ts` 升序;并列时按较小的 `stream_id`(在 `streams` 中较靠前的索引)优先输出。`counts_per_stream[i]` 是 `streams[i]` 贡献到合并输出的记录数(正确归并下永远等于 `len(streams[i])`,但显式返回是为了下游审计)
- K = 0(空 `streams` 列表)→ `{"merged": [], "counts_per_stream": []}`
样例
Case 1 · example three streams with full tie at ts=4
输入: [[[[1,"A1"],[4,"A2"]],[[2,"B1"],[4,"B2"]],[[3,"C1"],[4,"C2"]]]]
期望: {"merged":[[0,1,"A1"],[1,2,"B1"],[2,3,"C1"],[0,4,"A2"],[1,4,"B2"],[2,4,"C2"]],"counts_per_stream":[2,2,2]}
ts=1/2/3 各自按升序流出;ts=4 时三条流同时触发,按 stream index 0 → 1 → 2 决胜,验证 K 路归并的稳定 tie-break。
Case 2 · single non-empty stream is still wrapped with stream_id 0
输入: [[[[5,"x"],[10,"y"],[15,"z"]]]]
期望: {"merged":[[0,5,"x"],[0,10,"y"],[0,15,"z"]],"counts_per_stream":[3]}
单流——不需要真的归并,但每条记录仍要从 [ts, payload] 重塑为 [stream_id, ts, payload],且 counts_per_stream[0] 等于该流长度。
Case 3 · two stream interleave with shared timestamp at end
输入: [[[[1,"a"],[3,"c"],[5,"e1"]],[[2,"b"],[4,"d"],[5,"e2"]]]]
期望: {"merged":[[0,1,"a"],[1,2,"b"],[0,3,"c"],[1,4,"d"],[0,5,"e1"],[1,5,"e2"]],"counts_per_stream":[3,3]}
K=2 退化为双指针归并;ts=5 时 stream 0 索引更小,所以 [0, 5, 'e1'] 先于 [1, 5, 'e2']。
Case 4 · three streams tie at multiple timestamps stable order
输入: [[[[1,"p"],[2,"q"],[3,"r"]],[[1,"s"],[2,"t"],[3,"u"]],[[1,"v"],[2,"w"],[3,"x"]]]]
期望: {"merged":[[0,1,"p"],[1,1,"s"],[2,1,"v"],[0,2,"q"],[1,2,"t"],[2,2,"w"],[0,3,"r"],[1,3,"u"],[2,3,"x"]],"counts_per_stream":[3,3,3]}
每个时间戳都被三条流共享,每个 ts 处必须按 stream index 顺序输出。
最近提交
还没有提交记录。
编码区
实现 solution(...)。本地运行当前支持 Python 可见样例;服务端提交会运行可见样例和隐藏测试。
默认展示公开样例。点击「运行样例」后会在这里显示实际输出;点击「提交评测」会进入隐藏测试。
Case 1 · example three streams with full tie at ts=4
输入: [[[[1,"A1"],[4,"A2"]],[[2,"B1"],[4,"B2"]],[[3,"C1"],[4,"C2"]]]]
期望: {"merged":[[0,1,"A1"],[1,2,"B1"],[2,3,"C1"],[0,4,"A2"],[1,4,"B2"],[2,4,"C2"]],"counts_per_stream":[2,2,2]}
ts=1/2/3 各自按升序流出;ts=4 时三条流同时触发,按 stream index 0 → 1 → 2 决胜,验证 K 路归并的稳定 tie-break。
Case 2 · single non-empty stream is still wrapped with stream_id 0
输入: [[[[5,"x"],[10,"y"],[15,"z"]]]]
期望: {"merged":[[0,5,"x"],[0,10,"y"],[0,15,"z"]],"counts_per_stream":[3]}
单流——不需要真的归并,但每条记录仍要从 [ts, payload] 重塑为 [stream_id, ts, payload],且 counts_per_stream[0] 等于该流长度。
Case 3 · two stream interleave with shared timestamp at end
输入: [[[[1,"a"],[3,"c"],[5,"e1"]],[[2,"b"],[4,"d"],[5,"e2"]]]]
期望: {"merged":[[0,1,"a"],[1,2,"b"],[0,3,"c"],[1,4,"d"],[0,5,"e1"],[1,5,"e2"]],"counts_per_stream":[3,3]}
K=2 退化为双指针归并;ts=5 时 stream 0 索引更小,所以 [0, 5, 'e1'] 先于 [1, 5, 'e2']。
Case 4 · three streams tie at multiple timestamps stable order
输入: [[[[1,"p"],[2,"q"],[3,"r"]],[[1,"s"],[2,"t"],[3,"u"]],[[1,"v"],[2,"w"],[3,"x"]]]]
期望: {"merged":[[0,1,"p"],[1,1,"s"],[2,1,"v"],[0,2,"q"],[1,2,"t"],[2,2,"w"],[0,3,"r"],[1,3,"u"],[2,3,"x"]],"counts_per_stream":[3,3,3]}
每个时间戳都被三条流共享,每个 ts 处必须按 stream index 顺序输出。