Site Search: 锐望实验室

30 results found

Module 2.6.1 · Mathematical and Statistical Skills · Machine Learning Theory

Foundations of Supervised Learning

machine-learning · statistical-learning · supervised-learning · erm · loss-functions · bayes-optimal · bias-variance · generalization

Problem 1653 · Statistics

A Smoothed Bernoulli Estimator vs the Sample Proportion

Let $X\sim \mathrm{Binomial}(10,p)$ and consider the estimator $$\delta = \frac{X+1}{12}$$ for $p$. At the parameter value $p=0.2$, compute the bias, variance, and MSE of $\delta$, and compare its MSE with the usual sample proportion $\hat p = X/10$.
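
The quantities asked for here follow from the Binomial moments $E[X]=np$ and $\mathrm{Var}(X)=np(1-p)$. A minimal sketch of the arithmetic, using exact fractions; the helper name `smoothed_vs_proportion` is hypothetical and not part of the problem statement:

```python
from fractions import Fraction

def smoothed_vs_proportion(n, p, add, denom):
    """Bias, variance, MSE of (X + add) / denom, plus the MSE of X / n,
    for X ~ Binomial(n, p). Hypothetical helper, exact arithmetic."""
    mean_x = n * p
    var_x = n * p * (1 - p)
    bias = (mean_x + add) / denom - p
    var = var_x / denom**2
    mse = var + bias**2
    mse_prop = var_x / n**2  # X / n is unbiased, so its MSE is its variance
    return bias, var, mse, mse_prop

bias, var, mse, mse_prop = smoothed_vs_proportion(10, Fraction(1, 5), 1, 12)
print(bias, var, mse, mse_prop)  # 1/20 1/90 49/3600 2/125
```

At $p=0.2$ the smoothed estimator's MSE, 49/3600 (about 0.0136), is below the sample proportion's 0.016, so the small bias is more than paid for by the variance reduction at this parameter value.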

Problem 1661 · Statistics

Anchor-Based Shrinkage Crossover Interval

Suppose Xbar ~ N(theta, 0.16). A desk uses delta = 0.6Xbar + 0.8. For what values of theta does delta have lower MSE than Xbar?

Problem 1845 · Statistics

ARMA Identification or Simplification 5

You observe the diagnostic statement: (1-0.5L) X_t = (1-0.5L) e_t. What is the correct modeling conclusion?

Problem 1658 · Statistics

Best Blend of Two Correlated Unbiased Signals

Two unbiased estimators of the same parameter have variances 9 and 4, and their correlation is 0.5. For T(a) = aT1 + (1-a)T2, what value of a minimizes variance, and what is the resulting minimum variance?
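
One way to check an answer to this kind of blend question is to minimize $\mathrm{Var}(T(a)) = a^2\sigma_1^2 + (1-a)^2\sigma_2^2 + 2a(1-a)\,\mathrm{Cov}(T_1,T_2)$ in closed form. The sketch below assumes the two variances and the correlation are known exactly; `best_blend` is a hypothetical helper name:

```python
import math

def best_blend(var1, var2, rho):
    """Variance-minimizing weight a on T1 in a*T1 + (1-a)*T2, and the
    resulting variance. Assumes var1 + var2 - 2*cov is nonzero."""
    cov = rho * math.sqrt(var1 * var2)
    denom = var1 + var2 - 2 * cov
    a = (var2 - cov) / denom  # from setting d Var / da = 0
    min_var = a**2 * var1 + (1 - a) ** 2 * var2 + 2 * a * (1 - a) * cov
    return a, min_var

a, min_var = best_blend(9.0, 4.0, 0.5)
print(a, min_var)  # approximately 1/7 and 27/7
```

With variances 9 and 4 and correlation 0.5, the covariance is 3, so the weight comes out as 1/7 and the minimum variance as 27/7 (about 3.86), slightly below the better single estimator's variance of 4.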

Problem 1659 · Statistics

Bias Allowed for a Lower-Variance Regularized Estimate

An unbiased estimator U has variance 0.06. A regularized estimator R has variance 0.03 and constant bias b. What is the largest absolute bias |b| for which R still has smaller MSE than U?

Problem 1651 · Statistics

Bias Budget for a Faster Proxy

A slow benchmark estimator U is unbiased with variance 0.64. A faster proxy P has variance 0.25 but constant bias b. What is the largest absolute bias |b| for which P still has smaller MSE than U?

Problem 2398 · Machine Learning

Bias Budget Implied by a Variance Reduction

A regularization change reduces a model's variance term from 0.30 to 0.11 while leaving irreducible noise unchanged. How much extra bias squared could you add before the total MSE stops improving?

Problem 2406 · Machine Learning

Choose the Better Model at a Given Sample Size

At sample size n=60, compare model A with excess error 0.04 + 12/n to model B with excess error 0.16 + 2/n. Which one has smaller excess test error?

Problem 2404 · Machine Learning

Data Multiplier Needed to Push Variance Below a Noise Floor Fraction

A model's variance term is currently 0.30, and irreducible noise is 0.05. If variance scales exactly like 1/n, by what factor must the dataset grow so the variance term falls to 0.05?

Problem 2427 · Machine Learning

Decision Threshold Under Asymmetric Classification Cost

A false negative costs 5 and a false positive costs 1. If p is the predicted probability of the positive class, above what threshold should you classify as positive?
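
A threshold of this kind can be sketched by comparing expected costs: predicting negative costs `p * cost_fn` in expectation, predicting positive costs `(1 - p) * cost_fp`. The snippet assumes correct decisions cost nothing and that `p` is well calibrated; `cost_threshold` is a hypothetical helper name:

```python
from fractions import Fraction

def cost_threshold(cost_fn, cost_fp):
    """Probability above which predicting positive has lower expected cost.
    Assumes zero cost for correct decisions (an assumption of this sketch)."""
    # Predict positive when p * cost_fn > (1 - p) * cost_fp.
    return Fraction(cost_fp, cost_fp + cost_fn)

threshold = cost_threshold(cost_fn=5, cost_fp=1)
print(threshold)  # 1/6
```

Because a missed positive is five times as costly as a false alarm, the cutoff drops from the symmetric value 1/2 to 1/6.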

Problem 1794 · Statistics

Duplicate Feature Under Pure Lasso

If two predictors are exactly identical and the model uses pure Lasso, what modeling pathology should you expect?

Problem 2592 · Machine Learning

Effective Independent Tree Count 8

Define B_eff by matching the correlated-forest variance sigma^2 [rho + (1-rho)/B] to the variance sigma^2 / B_eff of averaging independent trees. Derive B_eff.
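
The matching condition stated in the problem can be solved directly for B_eff. A short derivation sketch, assuming $\sigma^2 > 0$ and $0 \le \rho \le 1$:

```latex
\sigma^2\left[\rho + \frac{1-\rho}{B}\right] = \frac{\sigma^2}{B_{\mathrm{eff}}}
\quad\Longrightarrow\quad
B_{\mathrm{eff}} = \frac{1}{\rho + \dfrac{1-\rho}{B}} = \frac{B}{1 + (B-1)\rho}.
```

Two limits serve as sanity checks: $\rho = 0$ gives $B_{\mathrm{eff}} = B$, and $B \to \infty$ with $\rho > 0$ gives $B_{\mathrm{eff}} \to 1/\rho$, which corresponds to the variance floor $\rho\sigma^2$.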

Problem 2629 · Machine Learning

EMA From Zero Initialization 6

Let m_t = beta m_{t-1} + (1-beta) x_t with m_0=0. Derive m_t as an explicit weighted sum of x_1,...,x_t.
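
Unrolling the recursion from m_0 = 0 suggests the weighted sum m_t = (1 - beta) * sum over k of beta^(t-k) * x_k. The sketch below compares that candidate closed form against the recursion numerically; both function names are hypothetical:

```python
import math

def ema_recursive(xs, beta):
    """Run m_t = beta * m_{t-1} + (1 - beta) * x_t starting from m_0 = 0."""
    m = 0.0
    for x in xs:
        m = beta * m + (1 - beta) * x
    return m

def ema_closed_form(xs, beta):
    """Candidate explicit form: (1 - beta) * sum_{k=1..t} beta**(t-k) * x_k."""
    t = len(xs)
    return (1 - beta) * sum(beta ** (t - k) * x for k, x in enumerate(xs, start=1))

xs = [1.0, 2.0, 4.0]
print(ema_recursive(xs, 0.9), ema_closed_form(xs, 0.9))  # both about 0.661
```

The weights sum to 1 - beta^t rather than 1, which is why zero-initialized averages are biased toward zero for small t.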

Problem 2559 · Machine Learning

Expected Misroutes From a Surrogate Split

A surrogate split agrees with the primary split on 34 of 40 training cases where both features are present. If 12 production cases are missing the primary split feature and are routed by the surrogate, what is the expected number of misroutes?

Problem 2622 · Machine Learning

Global-Norm Clipping Formula 2

A gradient vector g has norm ||g|| greater than clip threshold c. Derive the clipped gradient under standard global-norm clipping.
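
Under the usual convention, global-norm clipping rescales the whole gradient by c / ||g|| when ||g|| exceeds c, so the direction is kept and the norm becomes exactly c. A dependency-free sketch, where the list-of-lists layout and the name `clip_by_global_norm` are illustrative assumptions:

```python
import math

def clip_by_global_norm(grads, c):
    """Rescale all gradient blocks by one shared factor so the norm of the
    concatenated vector is at most c. `grads` is a list of lists of floats."""
    norm = math.sqrt(sum(v * v for g in grads for v in g))
    if norm <= c:
        return [list(g) for g in grads], norm  # no clipping needed
    scale = c / norm
    return [[v * scale for v in g] for g in grads], norm

clipped, norm = clip_by_global_norm([[3.0], [4.0]], c=1.0)
print(norm, clipped)  # norm 5.0, clipped values close to 0.6 and 0.8
```

The single shared scale factor is what distinguishes this from per-parameter clipping, which can change the gradient's direction.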

Problem 2400 · Machine Learning

How Many Independent Fits to Hit a Variance Target

Each independently trained model has variance 2.4 and negligible bias. How many equally weighted independent fits must you average to bring the variance term below 0.3?

Problem 2407 · Machine Learning

Improvement in Excess Error From a Regularization Move

A regularization change raises bias^2 from 0.03 to 0.07 but cuts variance from 0.22 to 0.08. By how much does excess test error improve?

Problem 2579 · Machine Learning

Infer Tree Correlation From the Variance Floor 23

A single tree has variance 6, while an extremely large forest appears to level off at variance 1.8. What pairwise tree correlation rho is implied?

Problem 2573 · Machine Learning

Infinite-Forest Variance Floor 2

Using the equicorrelated-tree variance formula, derive the prediction variance as the number of trees B tends to infinity.

Problem 1665 · Statistics

Interval Where a Fixed Benchmark Beats a Noisy Unbiased Signal

A noisy unbiased signal X satisfies X ~ N(theta, 0.25). A fallback benchmark always reports 1.2. For what values of theta does the fixed benchmark have lower MSE than X?

Problem 1777 · Statistics

Lasso Threshold Calibration 2

A standardized lasso fit has absolute score magnitudes (3.8, 2.5, 0.9). What is the smallest lambda that zeroes the weakest feature while leaving the other two still active?

Problem 2633 · Machine Learning

Layer-Norm Shift Invariance 8

Ignoring learned affine parameters, why does adding the same constant a to every coordinate of a vector leave layer-normalized activations unchanged?

Problem 2422 · Machine Learning

Log-Loss Gap Between Two Positive Forecasts

An event occurs (y=1). Forecast A assigns probability 0.9 and forecast B assigns probability 0.7. By how much is B's log loss larger than A's?
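
For a realized positive outcome, the log loss of a forecast q is -log(q), so the gap between two forecasts reduces to a log ratio. A minimal check, assuming natural logarithms (the problem does not state the base); `log_loss_gap` is a hypothetical helper name:

```python
import math

def log_loss_gap(q_a, q_b):
    """Log loss of forecast B minus log loss of forecast A when y = 1.
    Uses natural logs, which is an assumption of this sketch."""
    return -math.log(q_b) - (-math.log(q_a))

gap = log_loss_gap(0.9, 0.7)
print(round(gap, 4))  # 0.2513
```

The gap equals log(0.9 / 0.7), so it depends only on the ratio of the two assigned probabilities.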

Problem 2584 · Machine Learning

Marginal Variance Reduction From One More Tree 3

Under the equicorrelated-tree model, derive how much the ensemble variance falls when you move from B trees to B+1 trees.

Problem 1660 · Statistics

Norm Implied by a 25% Positive-Part James-Stein Shrink

In dimension p = 4 with unit noise variance, the positive-part James-Stein shrinkage factor is 0.75 for an observed vector x. What value of ||x||^2 produced that factor?

Problem 1657 · Statistics

Optimal Haircut on a Multiplicative Vol Signal

An estimator A is unbiased for theta and satisfies Var(A) = 0.3 theta^2. A risk team reports delta_c = cA instead. Find the value of c that minimizes MSE, and give the minimum MSE as a multiple of theta^2.

Problem 1652 · Statistics

Optimal Shrink Toward a Desk Anchor

A desk observes X ~ N(theta, 9) and reports delta_c = cX + (1-c)4. At the specific parameter value theta = 5, what choice of c minimizes MSE, and what is the minimum MSE?

Problem 2399 · Machine Learning

Optimal Weight on a Noisy Unbiased Model

Model A is unbiased with variance 9. Model B has variance 1.44 and fixed bias 0.6. If you blend them as P_w = wA + (1-w)B and treat their errors as independent, what weight w minimizes MSE?

Problem 2480 · Machine Learning

Orthogonal Features Give Coordinatewise Coefficients 9

Suppose two features x1 and x2 are centered and orthogonal. Derive the OLS coefficients in terms of x1^T y, x2^T y, ||x1||^2, and ||x2||^2.
