INTERVIEW PREP

Math and Non-Coding Interview Questions

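Covers math, probability, statistics, brain teasers, machine learning, and finance. This page is for filtering and opening individual questions; programming questions use a separate LeetCode-style coding lab.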


Questions: 4169
Domains: 8
Matching current filter: 4169


Non-Coding Interview Questions

Showing 20 of 4169 matching questions


ID | Title | Question | Domain | Difficulty | Type | Progress | Access
2604 | Why Label Noise Is Especially Toxic 13 | Why does boosting often suffer badly when labels are noisy? | Machine learning | Medium | essay | Not attempted | Free
2607 | Why Overly Deep Base Trees Can Cancel Shrinkage Discipline 15 | Why can a very deep base tree undermine the regularizing effect of a small learning rate? | Machine learning | Easy | essay | Not attempted | Free
2608 | Residual After Two Shrunken Updates 24 | A point currently has residual 6. Two boosting rounds hit its region with leaf updates 1.5 and 0.8, using learning rate $\eta = 0.2$ in both rounds. What residual remains after the two rounds? | Machine learning | Medium | numeric | Not attempted | Free
2610 | Scale-Update Invariance Between Eta and Gamma 6 | Why does multiplying every leaf update $\gamma_m$ by $c$ and dividing the learning rate $\eta$ by $c$ leave the final additive score unchanged? | Machine learning | Hard | derivation | Not attempted | Interview subscription
2611 | Why Boosting Parallelizes Worse Than Random Forests 16 | Why is boosting fundamentally harder to parallelize across rounds than random forests? | Machine learning | Easy | essay | Not attempted | Free
2613 | L2-Regularized Region Update 7 | In one boosting region, choose a constant update $\gamma$ to minimize $\sum_{i \in R} (r_i - \gamma)^2 + \lambda \gamma^2$. Let $S = \sum_{i \in R} r_i$ and $n = \lvert R \rvert$. Derive $\gamma$. | Machine learning | Hard | derivation | Not attempted | Interview subscription
2614 | Why the Initial Prediction Matters 18 | Why can the choice of the initial prediction $F_0$ matter for the early trajectory of boosting? | Machine learning | Medium | essay | Not attempted | Interview subscription
2615 | Why Calibration Can Degrade Before Ranking 19 | Why can late-stage boosting sometimes keep ranking examples well while making the predicted scores less well calibrated? | Machine learning | Hard | essay | Not attempted | Interview subscription
2616 | Why Leaf-Wise Growth Can Be Higher Variance 20 | Why can leaf-wise tree growth be more variance-prone than level-wise growth inside a boosting system? | Machine learning | Easy | essay | Not attempted | Free
2617 | Two-Region Two-Round Boosting Calculation 25 | A boosting model starts from $F_0 = 0$ with learning rate $\eta = 0.1$. In round 1, region A gets update $+2$ and region B gets update $-1$. In round 2, region A gets update $-0.5$ and region B gets update $+0.25$. What are the final predictions for a point that always stays in region A and a point that always stays in region B? | Machine learning | Easy | numeric | Not attempted | Free
2618 | Why Many Small Corrections Can Beat One Big Tree 21 | Why can an additive sequence of small boosting steps outperform a single large tree with similar in-sample flexibility? | Machine learning | Medium | essay | Not attempted | Interview subscription
2619 | Why Flat Late-Round Validation Gains Still Suggest Stopping 22 | If the validation gain per boosting round becomes tiny and erratic late in training, why is that often a strong argument for stopping? | Machine learning | Medium | essay | Not attempted | Interview subscription
2620 | A Bound on Total Function Movement 8 | Suppose every boosting round changes any one point's prediction by at most $\eta A$ in absolute value. What upper bound does this imply on the total prediction movement after $M$ rounds? | Machine learning | Hard | derivation | Not attempted | Interview subscription
2621 | Residual Block Gradient 1 | A scalar residual block outputs $y = x + f(x)$. Derive $dy/dx$. | Machine learning | Easy | derivation | Not attempted | Free
2622 | Global-Norm Clipping Formula 2 | A gradient vector $g$ has norm $\lVert g \rVert$ greater than clip threshold $c$. Derive the clipped gradient under standard global-norm clipping. | Machine learning | Easy | derivation | Not attempted | Free
2623 | One Momentum Update 15 | Suppose momentum uses $v_t = \beta v_{t-1} + g_t$ with $\beta = 0.9$, previous velocity $v_{t-1} = 0.5$, and current gradient $g_t = 2$. What is $v_t$? | Machine learning | Medium | numeric | Not attempted | Free
2624 | Momentum as an Unrolled Geometric Sum 3 | If momentum obeys $v_t = \beta v_{t-1} + g_t$, derive $v_t$ in terms of $v_0$ and the past gradients $g_1, \dots, g_t$. | Machine learning | Medium | derivation | Not attempted | Free
2625 | Decoupled Weight Decay Update 4 | Under decoupled weight decay with learning rate $\eta$, decay $\lambda$, parameters $w_t$, and gradient $g_t$, derive $w_{t+1}$. | Machine learning | Hard | derivation | Not attempted | Free
2626 | Global-Norm Clipping Numerically 16 | A gradient vector is $g = (6, 8)$, whose norm is 10. If the clip threshold is 5, what clipped gradient is produced? | Machine learning | Easy | numeric | Not attempted | Free
2627 | Linear Warmup Schedule 5 | A learning rate warms up linearly from 0 to $\eta_{\max}$ over $T$ steps. Derive $\eta_t$ for step $t$ in the warmup phase. | Machine learning | Medium | derivation | Not attempted | Free
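
The four numeric questions in this listing (IDs 2608, 2617, 2623, 2626) come down to short arithmetic, so a small Python sketch may help for checking a hand calculation. This is not an official solution from the question bank. The function names are hypothetical, and the residual calculation for 2608 assumes a squared-error setup in which the residual is target minus prediction, so each shrunken update eta * gamma lowers the residual by the same amount.

```python
import math


def residual_after_updates(residual, leaf_updates, eta):
    """Assumed squared-error setup: each round lowers the residual by eta * gamma."""
    for gamma in leaf_updates:
        residual -= eta * gamma
    return residual


def additive_prediction(f0, leaf_updates, eta):
    """Final additive score: F_0 plus the sum of the shrunken leaf updates."""
    return f0 + eta * sum(leaf_updates)


def momentum_step(v_prev, grad, beta):
    """One momentum update, v_t = beta * v_{t-1} + g_t."""
    return beta * v_prev + grad


def clip_by_global_norm(grad, threshold):
    """Rescale grad to norm `threshold` when its Euclidean norm exceeds it."""
    norm = math.sqrt(sum(g * g for g in grad))
    if norm <= threshold:
        return list(grad)
    scale = threshold / norm
    return [g * scale for g in grad]


# Inputs taken directly from the question text.
r_2608 = residual_after_updates(6.0, [1.5, 0.8], eta=0.2)
pred_a = additive_prediction(0.0, [2.0, -0.5], eta=0.1)
pred_b = additive_prediction(0.0, [-1.0, 0.25], eta=0.1)
v_2623 = momentum_step(0.5, 2.0, beta=0.9)
g_2626 = clip_by_global_norm([6.0, 8.0], 5.0)

print("2608 residual:", round(r_2608, 6))
print("2617 region A, region B:", round(pred_a, 6), round(pred_b, 6))
print("2623 velocity:", round(v_2623, 6))
print("2626 clipped gradient:", g_2626)
```

By hand, the same inputs give a residual of 5.54 for 2608, predictions of 0.15 and -0.075 for 2617, a velocity of 2.45 for 2623, and a clipped gradient of (3, 4) for 2626. The printed values are rounded because floating-point arithmetic can differ from these in the last digits.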