Problem 1653 · Statistics
Let $X\sim \mathrm{Binomial}(10,p)$ and consider the estimator
$$\delta = \frac{X+1}{12}$$
for $p$.
At the parameter value $p=0.2$, compute the bias, variance, and MSE of $\delta$, and compare its MSE with the usual sample proportion $\hat p = X/10$.
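One way to check the arithmetic for this problem is with exact fractions. The sketch below uses only the standard binomial moments $E[X]=np$ and $\mathrm{Var}(X)=np(1-p)$; the variable names are illustrative and not part of the problem statement.

```python
from fractions import Fraction

# Binomial(10, p) moments at p = 0.2
n, p = 10, Fraction(1, 5)
mean_x = n * p                 # E[X] = np
var_x = n * p * (1 - p)        # Var(X) = np(1 - p)

# delta = (X + 1) / 12 is an affine function of X
bias_delta = (mean_x + 1) / 12 - p
var_delta = var_x / 12**2
mse_delta = var_delta + bias_delta**2

# p_hat = X / 10 is unbiased, so its MSE equals its variance
mse_phat = var_x / n**2

print(bias_delta, var_delta, mse_delta, mse_phat)
```

Under these formulas the shrunken estimator trades a small bias for a lower variance, so comparing `mse_delta` with `mse_phat` answers the last part of the question.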
Problem 1661 · Statistics
Suppose Xbar ~ N(theta, 0.16). A desk uses delta = 0.6Xbar + 0.8. For what values of theta does delta have lower MSE than Xbar?
Problem 1845 · Statistics
You observe the diagnostic statement: (1-0.5L) X_t = (1-0.5L) e_t. What is the correct modeling conclusion?
Problem 1658 · Statistics
Two unbiased estimators of the same parameter have variances 9 and 4, and their correlation is 0.5. For T(a) = aT1 + (1-a)T2, what value of a minimizes variance, and what is the resulting minimum variance?
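A minimal sketch of the calculation, assuming the usual formula Var(aT1 + (1-a)T2) = a^2 Var(T1) + (1-a)^2 Var(T2) + 2a(1-a) Cov(T1, T2). The function name `blend_variance` is illustrative only.

```python
from fractions import Fraction

var1, var2 = Fraction(9), Fraction(4)
rho = Fraction(1, 2)
cov = rho * 3 * 2              # correlation times the two standard deviations (3 and 2)

def blend_variance(a):
    """Variance of a*T1 + (1 - a)*T2 for correlated T1, T2."""
    return a**2 * var1 + (1 - a)**2 * var2 + 2 * a * (1 - a) * cov

# Setting the derivative of the quadratic in a to zero gives the minimizer
a_star = (var2 - cov) / (var1 + var2 - 2 * cov)
min_var = blend_variance(a_star)

print(a_star, min_var)
```

Because both estimators are unbiased, every blend is unbiased too, so minimizing variance is the same as minimizing MSE here.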
Problem 1659 · Statistics
An unbiased estimator U has variance 0.06. A regularized estimator R has variance 0.03 and constant bias b. What is the largest absolute bias |b| for which R still has smaller MSE than U?
Problem 1651 · Statistics
A slow benchmark estimator U is unbiased with variance 0.64. A faster proxy P has variance 0.25 but constant bias b. What is the largest absolute bias |b| for which P still has smaller MSE than U?
Problem 2398 · Machine Learning
A regularization change reduces a model's variance term from 0.30 to 0.11 while leaving irreducible noise unchanged. How much extra bias squared could you add before the total MSE stops improving?
Problem 2406 · Machine Learning
At sample size n=60, compare model A with excess error 0.04 + 12/n to model B with excess error 0.16 + 2/n. Which one has smaller excess test error?
Problem 2404 · Machine Learning
A model's variance term is currently 0.30, and irreducible noise is 0.05. If variance scales exactly like 1/n, by what factor must the dataset grow so the variance term falls to 0.05?
Problem 2427 · Machine Learning
A false negative costs 5 and a false positive costs 1. If p is the predicted probability of the positive class, above what threshold should you classify as positive?
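The threshold follows from comparing expected costs: predicting positive risks a false positive with probability 1 - p, and predicting negative risks a false negative with probability p. The sketch below assumes correct decisions cost nothing, which the problem does not state explicitly; `positive_threshold` is a hypothetical helper name.

```python
from fractions import Fraction

def positive_threshold(cost_fn, cost_fp):
    """Probability above which predicting positive has lower expected cost.

    Predict positive when (1 - p) * cost_fp < p * cost_fn,
    i.e. when p > cost_fp / (cost_fp + cost_fn).
    Assumes correct classifications cost zero.
    """
    return Fraction(cost_fp, cost_fp + cost_fn)

threshold = positive_threshold(cost_fn=5, cost_fp=1)
print(threshold)
```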
Problem 1794 · Statistics
If two predictors are exactly identical and the model uses pure Lasso, what modeling pathology should you expect?
Problem 2592 · Machine Learning
Define B_eff by matching the correlated-forest variance sigma^2 [rho + (1-rho)/B] to the variance sigma^2 / B_eff of averaging independent trees. Derive B_eff.
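Equating sigma^2 [rho + (1-rho)/B] with sigma^2 / B_eff and solving gives B_eff = B / (1 + (B-1) rho). The following sketch checks that identity numerically; the function names and the test values are illustrative assumptions.

```python
def forest_variance(sigma2, rho, B):
    """Variance of an average of B equicorrelated trees."""
    return sigma2 * (rho + (1 - rho) / B)

def effective_trees(rho, B):
    """Number of independent trees giving the same variance as B correlated ones."""
    return B / (1 + (B - 1) * rho)

sigma2, rho, B = 2.0, 0.3, 50   # arbitrary illustrative values
lhs = forest_variance(sigma2, rho, B)
rhs = sigma2 / effective_trees(rho, B)
print(lhs, rhs)
```

With rho = 0 the function returns B itself, and for rho > 0 it stays below 1/rho no matter how many trees are added.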
Problem 2629 · Machine Learning
Let m_t = beta m_{t-1} + (1-beta) x_t with m_0=0. Derive m_t as an explicit weighted sum of x_1,...,x_t.
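Unrolling the recursion from m_0 = 0 suggests m_t = (1 - beta) * sum over k = 1..t of beta^(t-k) x_k. A small numerical sketch comparing the recursion with that closed form (function names and sample values are illustrative):

```python
def ema_recursive(xs, beta):
    """Run m_t = beta * m_{t-1} + (1 - beta) * x_t starting from m_0 = 0."""
    m = 0.0
    for x in xs:
        m = beta * m + (1 - beta) * x
    return m

def ema_closed_form(xs, beta):
    """Explicit weighted sum: (1 - beta) * sum_k beta**(t - k) * x_k."""
    t = len(xs)
    return (1 - beta) * sum(beta ** (t - k) * x for k, x in enumerate(xs, start=1))

xs = [1.0, -2.0, 0.5, 3.0, 4.0]   # arbitrary sample inputs
m_rec = ema_recursive(xs, 0.9)
m_sum = ema_closed_form(xs, 0.9)
print(m_rec, m_sum)
```

The weights sum to 1 - beta^t rather than 1, which is the reason a bias-correction factor is often applied for small t.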
Problem 2559 · Machine Learning
A surrogate split agrees with the primary split on 34 of 40 training cases where both features are present. If 12 production cases are missing the primary split feature and are routed by the surrogate, what is the expected number of misroutes?
Problem 2622 · Machine Learning
A gradient vector g has norm ||g|| greater than clip threshold c. Derive the clipped gradient under standard global-norm clipping.
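Norm clipping rescales the gradient so its direction is kept and its norm equals the threshold, giving g * c / ||g|| when ||g|| > c. The sketch below is a plain-Python illustration for a single flat vector, not any particular framework's implementation; `clip_by_norm` is a hypothetical name.

```python
import math

def clip_by_norm(g, c):
    """Return g unchanged if ||g|| <= c, otherwise g scaled to have norm c."""
    norm = math.sqrt(sum(v * v for v in g))
    if norm <= c:
        return list(g)
    scale = c / norm          # same direction, norm shrunk to c
    return [scale * v for v in g]

clipped = clip_by_norm([3.0, 4.0], 1.0)   # input norm is 5
print(clipped)
```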
Problem 2400 · Machine Learning
Each independently trained model has variance 2.4 and negligible bias. How many equally weighted independent fits must you average to bring the variance term below 0.3?
Problem 2407 · Machine Learning
A regularization change raises bias^2 from 0.03 to 0.07 but cuts variance from 0.22 to 0.08. By how much does excess test error improve?
Problem 2579 · Machine Learning
A single tree has variance 6, while an extremely large forest appears to level off at variance 1.8. What pairwise tree correlation rho is implied?
Problem 2573 · Machine Learning
Using the equicorrelated-tree variance formula, derive the prediction variance as the number of trees B tends to infinity.
Problem 1665 · Statistics
A noisy unbiased signal X satisfies X ~ N(theta, 0.25). A fallback benchmark always reports 1.2. For what values of theta does the fixed benchmark have lower MSE than X?
Problem 1777 · Statistics
A standardized lasso fit has absolute score magnitudes (3.8, 2.5, 0.9). What is the smallest lambda that zeroes the weakest feature while leaving the other two still active?
Problem 2633 · Machine Learning
Ignoring learned affine parameters, why does adding the same constant a to every coordinate of a vector leave layer-normalized activations unchanged?
Problem 2422 · Machine Learning
An event occurs (y=1). Forecast A assigns probability 0.9 and forecast B assigns probability 0.7. By how much is B's log loss larger than A's?
Problem 2584 · Machine Learning
Under the equicorrelated-tree model, derive how much the ensemble variance falls when you move from B trees to B+1 trees.
Problem 1660 · Statistics
In dimension p = 4 with unit noise variance, the positive-part James-Stein shrinkage factor is 0.75 for an observed vector x. What value of ||x||^2 produced that factor?
Problem 1657 · Statistics
An estimator A is unbiased for theta and satisfies Var(A) = 0.3 theta^2. A risk team reports delta_c = cA instead. Find the value of c that minimizes MSE, and give the minimum MSE as a multiple of theta^2.
Problem 1652 · Statistics
A desk observes X ~ N(theta, 9) and reports delta_c = cX + (1-c)4. At the specific parameter value theta = 5, what choice of c minimizes MSE, and what is the minimum MSE?
Problem 2399 · Machine Learning
Model A is unbiased with variance 9. Model B has variance 1.44 and fixed bias 0.6. If you blend them as P_w = wA + (1-w)B and treat their errors as independent, what weight w minimizes MSE?
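With independent errors, the blend has bias (1-w) * 0.6 and variance w^2 * 9 + (1-w)^2 * 1.44, so the MSE is a quadratic in w. The sketch below works this through with exact fractions; it reads "errors" as the zero-mean noise parts of A and B, which is an interpretation of the problem statement, and the names are illustrative.

```python
from fractions import Fraction

var_a = Fraction(9)
var_b = Fraction(36, 25)       # 1.44
bias_b = Fraction(3, 5)        # 0.6
mse_b = var_b + bias_b**2      # MSE of model B on its own

def blend_mse(w):
    """MSE of w*A + (1 - w)*B with independent noise terms."""
    variance = w**2 * var_a + (1 - w)**2 * var_b
    bias = (1 - w) * bias_b
    return variance + bias**2

# Minimizing 9 w^2 + mse_b (1 - w)^2 gives w = mse_b / (var_a + mse_b)
w_star = mse_b / (var_a + mse_b)
print(w_star, blend_mse(w_star))
```

The optimal weight has the same form as inverse-variance weighting, with B's full MSE taking the place of its variance.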
Problem 2480 · Machine Learning
Suppose two features x1 and x2 are centered and orthogonal. Derive the OLS coefficients in terms of x1^T y, x2^T y, ||x1||^2, and ||x2||^2.
Problem 2431 · Machine Learning
For pseudo-Huber loss ell(r)=delta^2(sqrt(1+(r/delta)^2)-1), derive d ell / d r.
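Applying the chain rule gives d ell / d r = r / sqrt(1 + (r/delta)^2). A short sketch that compares this candidate derivative with a central finite difference (the step size and test points are arbitrary choices):

```python
import math

def pseudo_huber(r, delta):
    """ell(r) = delta^2 * (sqrt(1 + (r/delta)^2) - 1)."""
    return delta**2 * (math.sqrt(1 + (r / delta) ** 2) - 1)

def pseudo_huber_grad(r, delta):
    """Analytic derivative from the chain rule."""
    return r / math.sqrt(1 + (r / delta) ** 2)

def central_difference(f, r, h=1e-6):
    """Numerical derivative used only as a sanity check."""
    return (f(r + h) - f(r - h)) / (2 * h)

delta = 1.5
for r in (-4.0, -0.3, 0.0, 0.7, 10.0):
    numeric = central_difference(lambda t: pseudo_huber(t, delta), r)
    print(r, pseudo_huber_grad(r, delta), numeric)
```

For small |r| the gradient is close to r, as with squared loss, and for large |r| it approaches delta in magnitude, as with absolute loss.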