0.049 vs 0.051 Decision Cliff
Two backtests differ only slightly: one reports p = 0.049 and the other p = 0.051. Why is it bad practice to call one ‘real’ and the other ‘not real’ purely because one is below 0.05?
打开 →GLOBAL SEARCH
搜索在服务端完成,题目解析与答案不会进入搜索结果。登录后可搜索自己的收藏题单。
找到 30 个结果
中文题目Two backtests differ only slightly: one reports p = 0.049 and the other p = 0.051. Why is it bad practice to call one ‘real’ and the other ‘not real’ purely because one is below 0.05?
打开 →If a feature x is replaced by x+k in a regression that already includes an intercept, what happens to the slope on x and the intercept?
打开 →Someone proposes using yesterday's order-flow imbalance as an instrument for today's imbalance in a return-impact regression. Why is this not automatically a valid instrument in financial data?
打开 →Let $X\sim \mathrm{Binomial}(10,p)$ and consider the estimator $$\delta = \frac{X+1}{12}$$ for $p$. At the parameter value $p=0.2$, compute the bias, variance, and MSE of $\delta$, and compare its MSE with the usual sample proportion $\hat p = X/10$.
打开 →Two candidate rollouts have the same reduced-form impact on PnL: $$E[Y\mid Z=1]-E[Y\mid Z=0]=0.02.$$ For rollout A, the first stage is $0.20$; for rollout B, the first stage is $0.01$. Which rollout creates the weaker IV design, and why?
打开 →A strategy makes +0.04% on 98% of days and loses -2.5% on the remaining 2% of days. What is the unconditional average daily return?
打开 →The second sleeve has both larger alpha and a different risk unit, so the optimal point must balance both effects. Maximize 4x + 6y subject to 4x^2 + 9y^2 = 225.
打开 →Suppose Xbar ~ N(theta, 0.16). A desk uses delta = 0.6Xbar + 0.8. For what values of theta does delta have lower MSE than Xbar?
打开 →A coin is flipped 20 times and lands heads 14 times. Use the normal approximation to test fairness at the 5% two-sided level.
打开 →You observe the diagnostic statement: Both ACF and PACF tail off. What is the correct modeling conclusion?
打开 →You observe the diagnostic statement: (1-0.5L) X_t = (1-0.5L) e_t. What is the correct modeling conclusion?
打开 →Three candidate thresholds on the same classifier yield t=0.3 -> FP=18, FN=4; t=0.5 -> FP=9, FN=7; t=0.7 -> FP=4, FN=14. If one false negative costs 5 units and one false positive costs 1 unit(s), which threshold minimizes expected classification cost over this sample?
打开 →A monthly feature is observed for 60 months and behaves roughly like an AR(1) series with lag-1 autocorrelation $\rho=0.6$. Using the heuristic $n_\text{eff}\approx n(1-\rho)/(1+\rho)$, what is the effective sample size?
打开 →A signal earns +6 bps on 70% of days in calm markets and -10 bps on 30% of days in stressed markets. What is its unconditional average daily edge in bps?
打开 →A live manager panel shows 27 low-leverage funds and 18 high-leverage funds. Survival rates for those groups were 90% and 60%, respectively. Suppose low-leverage funds average 1.2x gross leverage and high-leverage funds average 2.4x gross leverage. What was the average gross lev
打开 →An expanding walk-forward starts with 12 months of training and then advances by 6 months for each of 5 complete test folds. What is the average training-window length used across the 5 folds?
打开 →Again the structural model is $Y=2X+u$ with $E[u\mid X]=0$, but now you observe two noisy proxies: $$W_1=X+\eta_1, \qquad W_2=X+\eta_2,$$ where $\eta_1,\eta_2$ are independent of each other and of $X,u$. Suppose $\operatorname{Var}(X)=4$ and each noise term has variance 1. If yo
打开 →A desk regresses slippage Y on inventory pressure X. Without an urgency control, the OLS slope on X is 0.90. After adding a perfect measure of urgency U, the slope falls to 0.60. Suppose the structural model is Y = beta X + 0.5 U + noise and Var(X)=1. What is Cov(X,U)?
打开 →A PM deck says: ‘The event-study p-value is 0.03, so there is a 97% probability the signal is real.’ What is the statistical mistake?
打开 →Assume each tree has the same squared bias b^2 and prediction noise floor nu, while bagging only changes the variance term according to the equicorrelated-tree formula. Derive the bagged test MSE with B trees.
打开 →Before adding a new return-based feature to your model, what is the first alignment question you should ask?
打开 →Before fitting a meta-model on top of several signals, what is the first data question you should ask?
打开 →Two hyperparameter settings differ in mean CV score by only 0.001, while the estimated standard error is 0.010. What is the first sensible interpretation?
打开 →The current best setting sits at extreme values on both the learning-rate and regularization grids. What should your next search action be?
打开 →A categorical encoder was fit once on all rows and then reused inside cross-validation. What is the immediate correction?
打开 →What should you inspect first before saying a walk-forward result is robust?
打开 →Before combining several signals, what should you check first besides each signal's standalone Sharpe?
打开 →Before you compare ROC and PR curves across models, what dataset property should you check first?
打开 →Two models were validated under different walk-forward schemes. What is the first reason not to compare their average scores naively?
打开 →What should you check first before saying that adding five more signals makes the ensemble diversified?
打开 →