Non-Coding Interview Questions
Showing 20 of 622 matching questions
ID | Title | Question | Domain | Difficulty | Type | Progress | Access
2597 | Weighted Region Update 2 | If observations in a boosting region $R$ carry positive weights $w_i$, derive the constant update $\gamma$ that minimizes $\sum_{i \in R} w_i (r_i - \gamma)^2$. | Machine Learning | Easy | derivation | Not attempted | Free
2598 | Final Prediction After Three Boosting Rounds 23 | A boosting model starts from $F_0(x) = 10$. For one observation, the leaf updates along its path are +1.2, -0.5, and +0.8 across three rounds, with learning rate $\eta = 0.1$ each round. What is the final prediction? | Machine Learning | Medium | numerical | Not attempted | Free
2599 | Why Boosting Mostly Attacks Bias 9 | Why is boosting usually described as a bias-reduction method more than a variance-reduction method? | Machine Learning | Medium | essay | Not attempted | Free
2602 | Why Early Stopping Matters Even if Train Loss Falls 12 | Why can validation performance start to deteriorate even while the training objective of boosting keeps improving? | Machine Learning | Medium | essay | Not attempted | Free
2604 | Why Label Noise Is Especially Toxic 13 | Why does boosting often suffer badly when labels are noisy? | Machine Learning | Medium | essay | Not attempted | Free
2607 | Why Overly Deep Base Trees Can Cancel Shrinkage Discipline 15 | Why can a very deep base tree undermine the regularizing effect of a small learning rate? | Machine Learning | Easy | essay | Not attempted | Free
2608 | Residual After Two Shrunken Updates 24 | A point currently has residual 6. Two boosting rounds hit its region with leaf updates 1.5 and 0.8, using learning rate $\eta = 0.2$ in both rounds. What residual remains after the two rounds? | Machine Learning | Medium | numerical | Not attempted | Free
2610 | Scale-Update Invariance Between Eta and Gamma 6 | Why does multiplying every leaf update $\gamma_m$ by $c$ and dividing the learning rate $\eta$ by $c$ leave the final additive score unchanged? | Machine Learning | Hard | derivation | Not attempted | Interview subscription
2611 | Why Boosting Parallelizes Worse Than Random Forests 16 | Why is boosting fundamentally harder to parallelize across rounds than random forests? | Machine Learning | Easy | essay | Not attempted | Free
2613 | L2-Regularized Region Update 7 | In one boosting region, choose a constant update $\gamma$ to minimize $\sum_{i \in R} (r_i - \gamma)^2 + \lambda \gamma^2$. Let $S = \sum_{i \in R} r_i$ and $n = \lvert R \rvert$. Derive $\gamma$. | Machine Learning | Hard | derivation | Not attempted | Interview subscription
2614 | Why the Initial Prediction Matters 18 | Why can the choice of the initial prediction $F_0$ matter for the early trajectory of boosting? | Machine Learning | Medium | essay | Not attempted | Interview subscription
2615 | Why Calibration Can Degrade Before Ranking 19 | Why can late-stage boosting sometimes keep ranking examples well while making the predicted scores less well calibrated? | Machine Learning | Hard | essay | Not attempted | Interview subscription
2616 | Why Leaf-Wise Growth Can Be Higher Variance 20 | Why can leaf-wise tree growth be more variance-prone than level-wise growth inside a boosting system? | Machine Learning | Easy | essay | Not attempted | Free
2617 | Two-Region Two-Round Boosting Calculation 25 | A boosting model starts from $F_0 = 0$ with learning rate $\eta = 0.1$. In round 1, region A gets update +2 and region B gets update -1. In round 2, region A gets update -0.5 and region B gets update +0.25. What are the final predictions for a point that always stays in region A and a point that always stays in region B? | Machine Learning | Easy | numerical | Not attempted | Free
2618 | Why Many Small Corrections Can Beat One Big Tree 21 | Why can an additive sequence of small boosting steps outperform a single large tree with similar in-sample flexibility? | Machine Learning | Medium | essay | Not attempted | Interview subscription
2619 | Why Flat Late-Round Validation Gains Still Suggest Stopping 22 | If the validation gain per boosting round becomes tiny and erratic late in training, why is that often a strong argument for stopping? | Machine Learning | Medium | essay | Not attempted | Interview subscription
2620 | A Bound on Total Function Movement 8 | Suppose every boosting round changes any one point's prediction by at most $\eta A$ in absolute value. What upper bound does this imply on the total prediction movement after $M$ rounds? | Machine Learning | Hard | derivation | Not attempted | Interview subscription
2621 | Residual Block Gradient 1 | A scalar residual block outputs $y = x + f(x)$. Derive $dy/dx$. | Machine Learning | Easy | derivation | Not attempted | Free
2622 | Global-Norm Clipping Formula 2 | A gradient vector $g$ has norm $\lVert g \rVert$ greater than clip threshold $c$. Derive the clipped gradient under standard global-norm clipping. | Machine Learning | Easy | derivation | Not attempted | Free
2623 | One Momentum Update 15 | Suppose momentum uses $v_t = \beta v_{t-1} + g_t$ with $\beta = 0.9$, previous velocity $v_{t-1} = 0.5$, and current gradient $g_t = 2$. What is $v_t$?  | Machine Learning | Medium | numerical | Not attempted | Free
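
The arithmetic behind the numerical questions (2598, 2608, 2617, 2623) can be sketched in a few lines of Python. This is an illustrative sketch, not an official answer key. It assumes the usual additive form F_M = F_0 + eta * (sum of leaf updates), and for question 2608 it assumes a squared-error residual r = y - F, so each shrunken update lowers the residual by eta * gamma. All function names are hypothetical and chosen only for this example.

```python
def boosted_prediction(f0, leaf_updates, eta):
    """Additive score, assuming F_M = F_0 + eta * sum of leaf updates."""
    return f0 + eta * sum(leaf_updates)


def remaining_residual(residual, leaf_updates, eta):
    """Residual left after shrunken updates, assuming r = y - F."""
    return residual - eta * sum(leaf_updates)


def momentum_step(v_prev, grad, beta):
    """One momentum update in the form stated in question 2623."""
    return beta * v_prev + grad


if __name__ == "__main__":
    # Question 2598: F_0 = 10, updates +1.2, -0.5, +0.8, eta = 0.1
    print("2598:", round(boosted_prediction(10.0, [1.2, -0.5, 0.8], 0.1), 4))
    # Question 2608: residual 6, updates 1.5 and 0.8, eta = 0.2
    print("2608:", round(remaining_residual(6.0, [1.5, 0.8], 0.2), 4))
    # Question 2617: F_0 = 0, eta = 0.1, one update list per region
    print("2617 A:", round(boosted_prediction(0.0, [2.0, -0.5], 0.1), 4))
    print("2617 B:", round(boosted_prediction(0.0, [-1.0, 0.25], 0.1), 4))
    # Question 2623: beta = 0.9, v_{t-1} = 0.5, g_t = 2
    print("2623:", round(momentum_step(0.5, 2.0, 0.9), 4))
```

Under a different residual definition or update convention, the 2608 result would change, so the assumption is worth stating in an interview answer.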