Diebold-Mariano Test for Two VaR Forecasts via Pinball-Loss Differential
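The Diebold-Mariano (1995) test is the standard tool for comparing the predictive accuracy of two competing forecasting models. On the VaR research dashboard it serves as the head-to-head comparator: given two candidate VaR forecasting models scored against the same realized PnL series, the DM statistic answers "under a strictly proper scoring rule, is A significantly better than B?" Under the null hypothesis of equal accuracy, DM is asymptotically N(0, 1), so |DM| > 1.96 rejects equal accuracy (two-sided, 95% level). The single-model pinball-loss sibling coding-pinball-loss-var-quantile-backtest scores one model at a time; this routine is its two-model comparison version.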
Implement solution(realized_pnls: list[float], var_forecasts_a: list[float], var_forecasts_b: list[float], alpha: float) -> float, returning the signed DM statistic. Sentinel cases use float('inf') / float('-inf') / float('nan').
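The recipe has three steps (one pinball-loss pass per model, then the summary statistics of the differential; the canonical implementation merges these into two passes over the input):
tau = 1 - alpha # tail probability (e.g. 0.01)
q_a[t] = -var_forecasts_a[t] # model A's tau-quantile of signed PnL
q_b[t] = -var_forecasts_b[t] # model B's tau-quantile of signed PnL
r_a[t] = realized_pnls[t] - q_a[t]
r_b[t] = realized_pnls[t] - q_b[t]
loss_a[t] = tau * r_a[t] if r_a[t] >= 0 else (1 - tau) * (-r_a[t])
loss_b[t] = tau * r_b[t] if r_b[t] >= 0 else (1 - tau) * (-r_b[t])
d[t] = loss_a[t] - loss_b[t] # daily loss differential
mean_d = sum(d[t]) / T
std_d = sqrt( sum( (d[t] - mean_d)^2 ) / (T - 1) ) # sample standard deviation
DM = mean_d * sqrt(T) / std_d
The numerator is the mean loss differential (signed: days on which A wins and days on which B wins both contribute). The denominator is the sample standard deviation of the differential series, which is divided by sqrt(T) to normalize. Equivalently, DM = mean_d / (std_d / sqrt(T)), the mean differential divided by its own standard error. The sign convention is explicit: a positive DM means A has the higher pinball loss over this window (A is worse); a negative DM means A is better.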
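The recipe above can be sketched in Python as follows. This is a minimal sketch rather than the reference solution: the helper name `pinball_loss` and the relative tolerance `1e-12` used to decide that the differential is "constant" are assumptions introduced here (the statement only says `std_d == 0`). The tolerance is there because `tau * residual` in floating point can leave last-bit noise in a differential that is constant on paper.

```python
import math


def pinball_loss(realized: float, var_forecast: float, tau: float) -> float:
    """Canonical lower-tail pinball loss for one day (hypothetical helper)."""
    residual = realized - (-var_forecast)  # signed quantile is q = -VaR
    if residual >= 0:
        return tau * residual              # over-forecast: light weight tau
    return (1.0 - tau) * (-residual)       # tail miss: heavy weight 1 - tau


def solution(realized_pnls, var_forecasts_a, var_forecasts_b, alpha):
    n = len(realized_pnls)
    if n < 2:                              # sample std needs at least 2 days
        return float("nan")
    tau = 1.0 - alpha
    d = [
        pinball_loss(r, va, tau) - pinball_loss(r, vb, tau)
        for r, va, vb in zip(realized_pnls, var_forecasts_a, var_forecasts_b)
    ]
    mean_d = math.fsum(d) / n
    std_d = math.sqrt(math.fsum((x - mean_d) ** 2 for x in d) / (n - 1))
    # ASSUMPTION: treat std_d as zero when it is negligible relative to |d|.
    eps = 1e-12 * max(abs(x) for x in d)
    if std_d <= eps:
        if mean_d > eps:
            return float("inf")            # A uniformly worse
        if mean_d < -eps:
            return float("-inf")           # A uniformly better
        return float("nan")                # 0/0: losses identical
    return mean_d * math.sqrt(n) / std_d
```

With the statement's inputs, `solution([50000.0, -120000.0, 30000.0, -180000.0, -60000.0], [100000.0] * 5, [200000.0] * 5, 0.99)` should come out near 1.2264.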
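Example
solution([50000.0, -120000.0, 30000.0, -180000.0, -60000.0], [100000.0, 100000.0, 100000.0, 100000.0, 100000.0], [200000.0, 200000.0, 200000.0, 200000.0, 200000.0], 0.99) returns approximately 1.2264. This is a 5-day window with alpha = 0.99 (tau = 0.01). Model A uses a constant 100k VaR: q_a = -100000, residuals [150000, -20000, 130000, -80000, 40000], losses [1500, 19800, 1300, 79200, 400] (sum 102200, mean 20440, matching the single-model pinball example). Model B uses a constant 200k VaR: q_b = -200000, residuals [250000, 80000, 230000, 20000, 140000] (an over-forecast every day, so only the light branch is hit), losses [2500, 800, 2300, 200, 1400] (sum 7200, mean 1440). The differential is d = loss_a - loss_b = [-1000, 19000, -1000, 79000, -1000], with mean_d = 19000, sample variance 4.8e9 / 4 = 1.2e9, sample standard deviation ≈ 34641.016, and DM = 19000 * sqrt(5) / 34641.016 ≈ 1.2264. DM is positive (A has the higher mean pinball loss), so A is the worse forecaster over this window. The two tail-miss days dominate A's loss; B over-forecasts but pays only the light penalty.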
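The arithmetic of this example can be replayed step by step with the standard library. The sketch below is illustrative only; the helper name `losses` is hypothetical, and `statistics.stdev` is used because it divides by T - 1, matching the sample standard deviation in the recipe.

```python
import math
import statistics

alpha = 0.99
tau = 1.0 - alpha
realized = [50000.0, -120000.0, 30000.0, -180000.0, -60000.0]


def losses(var_level: float) -> list[float]:
    """Daily pinball losses for a constant VaR level (hypothetical helper)."""
    out = []
    for pnl in realized:
        residual = pnl + var_level  # realized - q, with q = -var_level
        out.append(tau * residual if residual >= 0 else (1.0 - tau) * -residual)
    return out


loss_a = losses(100000.0)   # roughly [1500, 19800, 1300, 79200, 400]
loss_b = losses(200000.0)   # roughly [2500, 800, 2300, 200, 1400]
d = [a - b for a, b in zip(loss_a, loss_b)]

mean_d = statistics.fmean(d)          # about 19000
std_d = statistics.stdev(d)           # sample std, about 34641.016
dm = mean_d * math.sqrt(len(d)) / std_d

print(round(mean_d, 3), round(std_d, 3), round(dm, 4))
```

The intermediate values are only approximately the round numbers quoted in the text, because `1.0 - 0.99` is not exactly 0.01 in binary floating point.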
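Common pitfalls
Read these through before coding:
- **Canonical pinball convention (lower tail).** Both models use the same canonical pinball loss: `residual >= 0` (over-forecast) takes the light weight `tau = 1 - alpha`; `residual < 0` (tail miss) takes the heavy weight `1 - tau = alpha`. At `alpha = 0.99` a tail miss costs 0.99 per dollar and an over-forecast only 0.01 per dollar. Swapping the two branches miscomputes the losses of both A and B, so the differential `d[t]` and the final DM are wrong as well (often with the sign flipped, which declares the wrong winner). Always follow the canonical convention of the pinball sibling.
- **Signed quantile `q = -var_forecasts[t]`.** VaR is given as a **positive loss magnitude** (for example, a 99% 1-day VaR of USD 250,000 is `250000.0`); the tau-quantile of signed PnL is its negative. Using `var_forecasts[t]` directly measures residuals against a positive number where a negative one is required, so every daily residual is off by `2 * var_forecasts[t]`.
- **`tau = 1 - alpha`, not `tau = alpha`.** A VaR at `alpha = 0.99` corresponds to the 1% tail, that is, the `tau = 0.01` quantile of signed PnL. Interchanging `tau` and `(1 - tau)` swaps the weights of the two branches: the canonical V shape with the heavy weight on tail misses becomes one with the heavy weight on over-forecasts.
- **Sample standard deviation (T-1), not population standard deviation (T).** DM uses `std_d = sqrt(sum(...) / (T - 1))`, consistent with the convention of the IR sibling. Dividing by T inflates DM by a factor of `sqrt(T / (T - 1))`; the bias fades as T grows, but for small T it is detectable under the tight tolerance.
- **`sqrt(T)` multiplies the mean.** `DM = mean_d * sqrt(T) / std_d`, not `mean_d / std_d`. The `sqrt(T)` factor is the normalization that makes DM asymptotically `N(0, 1)`. Without it the quantity is unnormalized and the `|DM| > 1.96` rejection threshold no longer applies.
- **Difference, not ratio.** `d[t] = loss_a[t] - loss_b[t]`, not `loss_a[t] / loss_b[t]`. The ratio version sometimes appears in informal write-ups, but it is not asymptotically Gaussian (a ratio of two non-negative numbers is bounded below by 0) and it blows up when `loss_b[t]` is small. The DM test is defined on the difference.
- **Sign convention.** Positive DM: A's mean pinball loss is higher than B's, so A is worse. Negative DM: A is better. Always state the convention when reading a result; reversing it silently reverses the verdict.
- **`std_d == 0` sentinel.** If the differential series is constant, the naive division raises `ZeroDivisionError`. This happens, for example, when `forecasts_a == forecasts_b` element-wise (so `d` is identically 0), or when both models only ever over-forecast: then `loss = tau * residual = tau * (realized + VaR)`, so `d[t] = tau * (VaR_a - VaR_b)`, which is constant whenever the VaR gap is constant. Canonical return values: `mean_d > 0` returns `+inf` (A uniformly worse), `mean_d < 0` returns `-inf` (A uniformly better), and `mean_d == 0` returns `NaN` (the `0/0` degenerate case in which A's and B's losses are identical).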
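Several of these pitfalls can be made concrete with a small experiment. The sketch below uses two hypothetical helpers, `daily_diffs` and `dm_from_diffs`, whose flags deliberately switch on a mistake so that its effect on DM is visible; the inputs are taken from adversarial cases 16 and 19 of this statement. It assumes the differential is not constant, so no sentinel handling is included.

```python
import math
import statistics


def daily_diffs(realized, var_a, var_b, alpha, swap_branches=False):
    """Daily loss differential; swap_branches=True reproduces the branch-swap bug."""
    tau = 1.0 - alpha
    light, heavy = (1.0 - tau, tau) if swap_branches else (tau, 1.0 - tau)

    def loss(pnl, var):
        residual = pnl + var  # realized - q, with q = -var
        return light * residual if residual >= 0 else heavy * -residual

    return [loss(p, a) - loss(p, b) for p, a, b in zip(realized, var_a, var_b)]


def dm_from_diffs(d, sample=True, scale_by_sqrt_t=True):
    """DM from a non-constant differential; the flags switch on common mistakes."""
    mean_d = statistics.fmean(d)
    std_d = statistics.stdev(d) if sample else statistics.pstdev(d)
    factor = math.sqrt(len(d)) if scale_by_sqrt_t else 1.0
    return mean_d * factor / std_d


# Case 16: swapping the pinball branches flips the sign of DM.
args16 = ([-1000, 200], [100, 100], [50, 50], 0.99)
print(dm_from_diffs(daily_diffs(*args16)))                      # about -0.98
print(dm_from_diffs(daily_diffs(*args16, swap_branches=True)))  # about +0.98

# Case 19: population std inflates DM by sqrt(T / (T - 1)).
d19 = daily_diffs([10, -50, 20, -30], [20] * 4, [40] * 4, 0.95)
print(dm_from_diffs(d19), dm_from_diffs(d19, sample=False))
```

Dropping the `sqrt(T)` factor (`scale_by_sqrt_t=False`) shrinks the result by exactly `sqrt(T)`, which is the mistake adversarial case 20 targets.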
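Edge cases
- T == 0 or T == 1: return NaN. The sample standard deviation needs at least 2 observations (T - 1 must be at least 1).
- forecasts_a == forecasts_b element-wise: every day d[t] = 0, so mean_d = 0 and std_d = 0 → NaN (the 0/0 degenerate case).
- Differential constant and non-zero: A is worse, or better, by the same amount every day. std_d = 0; mean_d > 0 returns +inf, mean_d < 0 returns -inf.
- alpha == 0.5: the V is symmetric (tau = 0.5) and both branches give loss[t] = 0.5 * |residual|. DM reduces to a comparison of two half-absolute-error (MAE-type) series around the quantile.
- alpha close to 1.0 (e.g. 0.999): the tail-miss branch charges 0.999 per dollar and the over-forecast branch 0.001 per dollar. A single tail-miss day can dominate the differential.
- A and B alternate winning: the differential series changes sign from day to day, std_d is large relative to |mean_d|, and |DM| is small, so the two models are statistically indistinguishable.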
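The two alpha-related edge cases are easy to check numerically. In the sketch below, `pinball` is a hypothetical helper that takes the residual directly; the residuals are those of model A in case 24.

```python
def pinball(residual: float, tau: float) -> float:
    """Canonical pinball loss for residual = realized - q (hypothetical helper)."""
    return tau * residual if residual >= 0 else (1.0 - tau) * -residual


# alpha == 0.5: both branches collapse to 0.5 * |residual|.
residuals = [8.0, -2.0, 13.0, -7.0]
symmetric = [pinball(r, 0.5) for r in residuals]
half_abs = [0.5 * abs(r) for r in residuals]
print(symmetric == half_abs)  # True

# alpha == 0.999: a miss of 10000 costs about 9990, an over-forecast of the
# same size costs about 10, so one tail-miss day can dominate the window.
tau_extreme = 1.0 - 0.999
print(pinball(-10000.0, tau_extreme), pinball(10000.0, tau_extreme))
```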
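Practical background
Diebold and Mariano (1995, "Comparing Predictive Accuracy", Journal of Business & Economic Statistics) introduced the test as a model-agnostic tool for comparing any two competing forecasts under a loss function of the user's choice. Pinball loss is a strictly proper scoring rule for quantile forecasts (Gneiting and Raftery 2007); plugging it into DM gives the standard test for comparing two VaR models. On the VaR research dashboard, DM is the headline number when two candidate VaR models go head to head over the same backtest window. The Basel traffic light (classification), Kupiec POF (binomial LR on the marginal exceedance rate), Christoffersen (independence and conditional coverage), Lopez-I (magnitude-aware counting), and the single-model pinball loss are all single-model controls. DM is the two-model comparator: it answers "is A significantly better than B?" with a single signed statistic and a clean asymptotic distribution.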
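To read a DM value against the N(0, 1) reference distribution, a two-sided p-value can be computed with the standard library alone. This is a sketch under the asymptotic-normality assumption stated above; the function name `two_sided_p_value` is hypothetical, and for windows as short as the examples in this statement the normal approximation is rough at best.

```python
import math


def two_sided_p_value(dm: float) -> float:
    """P(|Z| >= |dm|) for Z ~ N(0, 1), using the complementary error function."""
    return math.erfc(abs(dm) / math.sqrt(2.0))


def rejects_equal_accuracy(dm: float, level: float = 0.05) -> bool:
    """True when the two-sided test rejects equal accuracy at the given level."""
    return two_sided_p_value(dm) < level


print(two_sided_p_value(1.96))               # close to 0.05
print(rejects_equal_accuracy(1.2264447263))  # False: below the 1.96 threshold
```

The sign of DM still has to be read separately: the p-value only says whether the loss gap is distinguishable from zero, not which model wins.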
Constraints
- 0 ≤ T ≤ 1500, where `T = len(realized_pnls) == len(var_forecasts_a) == len(var_forecasts_b)`
- For all t, |`realized_pnls[t]`| ≤ 1e6 (signed PnL; negative means a loss)
- For all t, 0.0 ≤ `var_forecasts_a[t]`, `var_forecasts_b[t]` ≤ 1e6 (**positive loss magnitudes** at confidence level `alpha`)
- 0.5 ≤ `alpha` < 1.0
- Output: `float`, the signed DM test statistic; `float('inf')` / `float('-inf')` / `float('nan')` are returned according to the std_d = 0 and T < 2 sentinel rules
- Floating-point comparison tolerance: `rel_tol = 1e-9`, `abs_tol = 1e-9` (NaN equals NaN)
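The comparison rule in the last constraint could be implemented along the following lines. This is a guess at the shape of the checker, not its actual code: the names `parse_expected` and `outputs_match` are hypothetical, and the string forms "NaN", "Infinity" and "-Infinity" are those shown in the expected values of the examples.

```python
import math


def parse_expected(value) -> float:
    """Expected values are numbers or the strings "NaN", "Infinity", "-Infinity"."""
    return float(value)  # Python's float() accepts all three spellings


def outputs_match(actual: float, expected: float,
                  rel_tol: float = 1e-9, abs_tol: float = 1e-9) -> bool:
    """Tolerance comparison in which NaN is treated as equal to NaN."""
    if math.isnan(actual) or math.isnan(expected):
        return math.isnan(actual) and math.isnan(expected)
    # math.isclose treats +inf and -inf as close only to themselves.
    return math.isclose(actual, expected, rel_tol=rel_tol, abs_tol=abs_tol)


print(outputs_match(float("nan"), parse_expected("NaN")))         # True
print(outputs_match(float("inf"), parse_expected("-Infinity")))   # False
print(outputs_match(0.97, parse_expected(0.9700000000000001)))    # True
```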
Examples
Case 1 · statement-example: A under-forecasts vs B over-forecasts at alpha=0.99
Input: [[50000,-120000,30000,-180000,-60000],[100000,100000,100000,100000,100000],[200000,200000,200000,200000,200000],0.99]
Expected: 1.2264447262990152
5-day window, alpha=0.99 (tau=0.01). A=100k constant: q_a=-100000, residuals=[150000,-20000,130000,-80000,40000], losses=[1500,19800,1300,79200,400] (sum 102200). B=200k constant: q_b=-200000, residuals=[250000,80000,230000,20000,140000] (all over-forecasts, light penalty), losses=[2500,800,2300,200,1400] (sum 7200). d=[-1000,19000,-1000,79000,-1000], mean_d=19000, sample_var=1.2e9, std~34641.016, DM=19000*sqrt(5)/34641.016~1.2264. Positive DM: A has the higher loss => A is worse.
Case 2 · boundary: empty backtest window T=0
Input: [[],[],[],0.99]
Expected: "NaN"
T=0: the sample standard deviation needs T>=2; return NaN.
Case 3 · boundary: single day T=1
Input: [[-50],[100],[50],0.99]
Expected: "NaN"
T=1: T-1=0, so the sample standard deviation is undefined; return NaN.
Case 4 · boundary: identical forecasts A==B element-wise
Input: [[1,-2,0.5,3,-1.5],[50,50,50,50,50],[50,50,50,50,50],0.95]
Expected: "NaN"
forecasts_a equals forecasts_b element-wise: loss_a==loss_b every day, so d is identically 0; mean_d=0, std_d=0 => 0/0 sentinel: NaN.
Case 5 · boundary: differential constant negative => -inf (A uniformly better)
Input: [[0,5,10,15,20],[10,10,10,10,10],[20,20,20,20,20],0.95]
Expected: "-Infinity"
Both models over-forecast every day, so loss=tau*residual. d[t]=tau*(VaR_a-VaR_b)=0.05*(10-20)=-0.5, constant. std_d=0, mean_d=-0.5<0 => -inf (A uniformly better than B).
Case 6 · boundary: differential constant positive => +inf (A uniformly worse)
Input: [[0,5,10,15,20],[20,20,20,20,20],[10,10,10,10,10],0.95]
Expected: "Infinity"
Same as above with A and B swapped: d[t]=tau*(VaR_a-VaR_b)=+0.5, constant (here VaR_a=20, VaR_b=10, VaR_a-VaR_b=+10). std_d=0, mean_d>0 => +inf (A uniformly worse).
Case 7 · boundary: T=2 minimum sample size
Input: [[10,-10],[50,50],[60,40],0.95]
Expected: 0
T=2, alpha=0.95. q_a=[-50,-50], q_b=[-60,-40]. residual_a=[60,40] (light branch), loss_a=[3,2]; residual_b=[70,30], loss_b=[3.5,1.5]. d=[-0.5,0.5], mean_d=0 => DM=0.0.
Case 8 · boundary: alpha=0.5 symmetric V-loss; mean_d=0
Input: [[-1,1,-2,2],[0,0,0,0],[1,1,1,1],0.5]
Expected: 0
alpha=0.5: symmetric V, loss=0.5*|residual|. q_a=0, q_b=-1. residual_a=[-1,1,-2,2], residual_b=[0,2,-1,3]. loss_a=[0.5,0.5,1.0,1.0], loss_b=[0,1,0.5,1.5]. d=[0.5,-0.5,0.5,-0.5], mean_d=0 => DM=0.0.
Case 9 · typical: A=200k VaR vs B=20k VaR; B misses tails => DM negative
Input: [[-14409,-17290,-11132,70198,-12759,-149735,33232,-26734,-21696,11588],[200000,200000,200000,200000,200000,200000,200000,200000,200000,200000],[20000,20000,20000,20000,20000,20000,20000,20000,20000,20000],0.95]
Expected: -0.3734526840242883
Random PnL ~N(0, 100k), T=10. A uses a 200k VaR (ample coverage); B uses a very small 20k VaR (tail misses on three days). B has the higher loss; mean_d<0; DM is negative (about -0.37), so A comes out ahead, although the magnitude is far below the 1.96 threshold.
Case 10 · typical: A=20k VaR (under-forecasts) vs B=200k VaR; DM positive
Input: [[-14409,-17290,-11132,70198,-12759,-149735,33232,-26734,-21696,11588],[20000,20000,20000,20000,20000,20000,20000,20000,20000,20000],[200000,200000,200000,200000,200000,200000,200000,200000,200000,200000],0.95]
Expected: 0.3734526840242883
Same PnL with the two models swapped: A uses the 20k VaR and misses the tail; B is ample. A has the higher loss; DM is positive (about +0.37), so A is worse, again without reaching the 1.96 threshold.
Case 11 · typical: A and B alternate winning; finite DM
Input: [[100,-100,100,-100,100,-100],[80,80,80,80,80,80],[120,120,120,120,120,120],0.95]
Expected: 1.7888543819998306
alpha=0.95. A=80, B=120. q_a=-80, q_b=-120. residual_a=[180,-20,180,-20,180,-20], loss_a=[9,19,9,19,9,19]; residual_b is positive throughout, loss_b=[11,1,11,1,11,1]. d=[-2,18,-2,18,-2,18], mean_d=8, sample_var=120, std=sqrt(120). DM=8*sqrt(6)/sqrt(120)=8*sqrt(0.05)~1.7889: A is worse overall, but the daily differential is volatile.
Case 12 · typical: alpha=0.99 high confidence two competing models
Input: [[-12794,25572,-90000,-15753,-46501,-10665,55596,21207,-120000,12445,19738,9266],[150000,150000,150000,150000,150000,150000,150000,150000,150000,150000,150000,150000],[80000,80000,80000,80000,80000,80000,80000,80000,80000,80000,80000,80000],0.99]
Expected: -1.0312480912459054
alpha=0.99 (tau=0.01), T=12. A=150k VaR (covers every loss); B=80k VaR (tail misses on days 2 and 8, 0-indexed, where the loss exceeds 80k). q_a=-150000, q_b=-80000. A over-forecasts every day (light penalty 0.01); B over-forecasts on most days but pays the heavy penalty 0.99 on the two large-loss days. B's mean pinball loss is higher than A's; mean_d<0; DM is negative (about -1.03): A is better, but the magnitude is below the 1.96 rejection threshold.
Case 13 · typical: alpha=0.5 reduces to half-MAE differential
Input: [[10,-5,8,-12,3,-7,15,-2],[4,4,4,4,4,4,4,4],[10,10,10,10,10,10,10,10],0.5]
Expected: -2.26260977304409
alpha=0.5: pinball reduces to 0.5*|residual|; DM compares two MAE-type series around the quantile. q_a=-4, q_b=-10. d=[-3,-2,-3,3,-3,0,-3,-3], mean_d=-1.75.
Case 14 · typical: T=3 short window
Input: [[100,-50,25],[60,60,60],[40,40,40],0.9]
Expected: -0.3999999999999999
Short window T=3, alpha=0.9 (tau=0.1). A=60 over-forecasts every day (loss_a=[16,1,8.5]); B=40 misses on day 1 (loss_b=[14,9,6.5]). d=[2,-8,2], mean_d=-4/3.
Case 15 · typical: 20-day daily-style PnL with two slightly different VaR forecasters
Input: [[-97426,30707,80100,-40557,-105818,-4742,38741,88355,-96770,-102929,56974,38427,123203,79097,-7643,-21439,206142,-22604,-62295,-90567],[116575,115150,120284,115596,116902,117419,115301,119639,119405,123424,120191,121403,119998,121624,119573,117782,124977,124957,123402,122078],[178153,177297,177890,175702,182663,179004,183466,178865,184580,183473,175005,177097,184103,179700,184804,178974,175730,181295,182785,177698],0.95]
Expected: -62.1668560137236
20-day window, alpha=0.95. Two VaR models (about 120k and 180k) on noisy daily PnL. Both over-forecast every day, so d[t]=tau*(VaR_a[t]-VaR_b[t]) is nearly constant around -3000 and |DM| is large. Checks DM on a medium-length window.
Case 16 · adversarial: pinball-convention swap flips DM sign
Input: [[-1000,200],[100,100],[50,50],0.99]
Expected: -0.9800000000000001
T=2, alpha=0.99. Canonical lower-tail pinball: residual<0 takes the heavy weight alpha=0.99; residual>=0 takes the light weight 0.01. q_a=-100, q_b=-50. residual_a=[-900,300] => loss_a=[891,3]; residual_b=[-950,250] => loss_b=[940.5,2.5]. d=[-49.5,0.5], mean_d=-24.5. With the two branches swapped, d=[-0.5,49.5] and mean_d=+24.5: the sign of DM flips.
Case 17 · adversarial: signed-quantile q = -VaR (NOT +VaR)
Input: [[10,20,5,30],[100,100,100,100],[50,50,50,50],0.99]
Expected: "Infinity"
Realized PnL is small and positive. With the correct q=-VaR, all residuals are positive (over-forecasts) and both models sit on the light branch. q_a=-100, q_b=-50. residual_a=[110,120,105,130], residual_b=[60,70,55,80]. loss_a=tau*residual_a=[1.1,1.2,1.05,1.3]; loss_b=[0.6,0.7,0.55,0.8]. d=[0.5,0.5,0.5,0.5], std_d=0, mean_d>0 => +inf. An implementation that uses q=+VaR puts every residual on the heavy branch instead (residual_a=[-90,-80,-95,-70], residual_b=[-40,-30,-45,-20]), which gives d=0.99*50=49.5 every day: the per-day losses are wrong, although on these particular inputs the differential is again constant and positive.
Case 18 · adversarial: tau-vs-alpha swap distinguished by asymmetric residual magnitudes
Input: [[10,-200,30],[100,100,100],[200,200,200],0.99]
Expected: 0.9700000000000001
alpha=0.99, T=3. q_a=-100, q_b=-200. residual_a=[110,-100,130], canonical loss_a=[1.1,99,1.3]. residual_b=[210,0,230], loss_b=[2.1,0,2.3]. d=[-1,99,-1], mean_d=97/3. With tau and alpha swapped, loss_a=[108.9,1,128.7], loss_b=[207.9,0,227.7], d=[-99,1,-99], mean_d=-65.667: opposite sign and a different magnitude.
Case 19 · adversarial: sample-stdev (T-1) vs population-stdev (T) distinguishable at T=4
Input: [[10,-50,20,-30],[20,20,20,20],[40,40,40,40],0.95]
Expected: 1.3578057164544433
T=4, alpha=0.95. q_a=-20, q_b=-40. residual_a=[30,-30,40,-10], loss_a=[1.5,28.5,2,9.5]; residual_b=[50,-10,60,10], loss_b=[2.5,9.5,3,0.5]. d=[-1,19,-1,9], mean_d=6.5. sq_dev=275, sample variance 275/3, sample standard deviation~9.5743, DM~1.3578. With the population standard deviation the result is ~1.5679, off by a factor of sqrt(4/3)~1.1547.
Case 20 · adversarial: sqrt(T) factor: DM = mean_d * sqrt(T) / std_d
Input: [[1000,-2000,500,-3000,-1000],[50,50,50,50,50],[100,100,100,100,100],0.99]
Expected: 2.408664913736792
T=5, alpha=0.99. d=[-0.5,49.5,-0.5,49.5,49.5], mean_d=29.5, sample variance 750. Checks the sqrt(T) normalization: an implementation that returns only mean_d/std_d is off by a factor of sqrt(5)~2.236.
Case 21 · adversarial: difference (loss_a - loss_b) vs ratio (loss_a / loss_b)
Input: [[100,-50,80,-120,30],[70,70,70,70,70],[90,90,90,90,90],0.95]
Expected: 0.7499999999999998
5 days, alpha=0.95. DM uses d[t]=loss_a[t]-loss_b[t] (asymptotically Gaussian); here d=[-1,-1,-1,19,-1], mean_d=3. The ratio d[t]=loss_a[t]/loss_b[t] is not N(0,1) and blows up when loss_b[t]~0.
Case 22 · boundary: residual exactly zero for one model on one day
Input: [[-10,5],[10,10],[20,20],0.95]
Expected: "-Infinity"
alpha=0.95. q_a=-10, q_b=-20. residual_a=[0,15] (day 0 sits at the vertex of the V, loss=0); residual_b=[10,25]. loss_a=[0,0.75], loss_b=[0.5,1.25], d=[-0.5,-0.5] constant => std_d=0, mean_d=-0.5<0 => -inf (A uniformly better than B).
Case 23 · boundary: alpha=0.999 extreme tail
Input: [[1000,-50000,500],[40000,40000,40000],[60000,60000,60000],0.999]
Expected: 0.994
alpha=0.999. q_a=-40000, q_b=-60000. residual_a=[41000,-10000,40500], loss_a=[41,9990,40.5]; residual_b=[61000,10000,60500], loss_b=[61,10,60.5]. d=[-20,9980,-20]; the single tail-miss day dominates A's loss; DM is positive (0.994) but, with T=3, below the 1.96 threshold.
Case 24 · boundary: alpha exactly 0.5: loss = 0.5 * |residual|
Input: [[5,-5,10,-10],[3,3,3,3],[7,7,7,7],0.5]
Expected: -0.5222329678670935
alpha=0.5: both branches reduce to 0.5*|residual|. q_a=-3, q_b=-7. residual_a=[8,-2,13,-7], residual_b=[12,2,17,-3]. loss_a=[4,1,6.5,3.5], loss_b=[6,1,8.5,1.5]. d=[-2,0,-2,2], mean_d=-0.5.
Case 25 · boundary: realised all zero, A higher VaR; differential constant => +inf
Input: [[0,0,0,0,0],[100,100,100,100,100],[50,50,50,50,50],0.95]
Expected: "Infinity"
Realized PnL is all 0. q_a=-100, q_b=-50; the residuals are positive and constant (over-forecasts). loss_a=tau*100=5; loss_b=tau*50=2.5; d=2.5 constant. std_d=0, mean_d>0 => +inf (A costs more on the light branch).
Case 26 · large: stress T=100 random-walk PnL with two competing VaR forecasters
Input: [[40423,13801,-39858,26275,22689,-16225,-87548,-20265,35292,-47260,-46955,76931,176,5780,-39436,16692,-31127,-21906,-43605,99972,36557,-24711,110756,137669,218081,29272,-70321,-34949,89730,-145731,-99638,149704,36686,132023,-179430,-30822,-40180,92431,14490,-119036,-56777,-130836,-105086,-181990,-148033,12927,-17935,18043,1659,-63262,94476,-200151,-33345,-51164,-51742,85134,98965,49094,-81115,-40161,16051,39851,-11124,-11598,-43104,57851,-157153,-32028,120087,-77109,257567,-13293,-49910,-53142,-62209,6350,108295,-5477,312,-30043,43414,49074,-3723,93085,-14661,105725,-42075,23008,86505,35901,90403,56411,123711,6661,137819,68577,-239555,164552,-8902,112753],[165049,140090,140902,158348,178181,173033,145142,163063,160743,142327,134098,124703,172735,123430,151122,152252,128768,159356,174949,166568,143356,127588,178496,151026,157777,168137,143175,123723,128161,160107,140972,172265,164410,178826,160130,138228,167467,121695,151295,172284,150883,128242,150007,152727,142140,173606,140405,136305,150378,132196,139323,143890,176296,124822,131826,131970,179716,173449,131650,153877,125106,139022,140469,152261,175762,168541,152051,167165,149038,163834,166617,135411,169967,154677,149167,143749,133449,150075,127566,158087,173896,167432,156549,149668,161747,127339,123146,171699,141710,160137,145246,141520,166217,140582,165223,169229,130841,178634,123747,144884],[220028,239761,200493,223542,212111,245301,197340,205925,239188,209334,246653,193096,213305,196448,224412,247787,209482,228810,205955,216353,224291,204905,231428,220060,244328,206501,247291,200932,191195,229612,234026,218639,192566,249888,198493,242228,243492,203251,241836,217804,243975,243251,236458,207323,224806,248879,209237,213190,231279,193758,205202,200081,214963,215499,226424,219809,199886,192797,221486,235831,245614,240378,208165,241940,218043,206947,195530,216563,222698,206758,201718,246453,194066,246364,214385,237491,196657,224200,235702,216161,211388,219593,201668,202698,204479,192253,244601,194406,245628,190657,229373,222816,192511,246488,233846,242748,216140,207483,201156,196763],0.95]
Expected: -1.2690845780676219
T=100, alpha=0.95. Two competing daily VaR models; the reference DM is computed with the canonical pinball loss and the sample standard deviation. Checks an O(T) implementation and double-precision stability.