INTERVIEW PREP

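Math and Non-Coding Interview Questions

Covers math, probability, statistics, brain teasers, machine learning, and finance. This page is for filtering questions and opening individual ones; programming questions use a separate LeetCode-style coding lab.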


Questions: 4169
Domains: 8
Current filter: 268

Page 10 of 14

Non-Coding Interview Questions

Showing 20 of 268 matching questions


ID | Question | Domain | Difficulty | Type | Progress | Access
2576 | Why Feature Subsampling Helps When One Predictor Dominates (12): Why can random feature subsampling improve a forest when one very strong predictor would otherwise appear at the top of almost every tree? | Machine learning | Easy | Essay | Not attempted | Free
2578 | Why Tiny max_features Can Raise Bias (14): Why can making max_features too small hurt a random forest even though it lowers correlation? | Machine learning | Medium | Essay | Not attempted | Free
2581 | Why Random-Forest Regression Extrapolates Poorly (16): Why does random-forest regression usually fail to extrapolate a trend far beyond the training range? | Machine learning | Easy | Essay | Not attempted | Free
2591 | Why OOB Can Be Noisy on Small Samples (19): Why can out-of-bag error fluctuate a lot on a small dataset even when the forest itself is reasonably stable? | Machine learning | Easy | Essay | Not attempted | Free
2592 | Effective Independent Tree Count (8): Define $B_{\text{eff}}$ by matching the correlated-forest variance $\sigma^2 [\rho + (1-\rho)/B]$ to the variance $\sigma^2 / B_{\text{eff}}$ of averaging independent trees. Derive $B_{\text{eff}}$. | Machine learning | Easy | Derivation | Not attempted | Free
2593 | Why Averaging Cannot Cure Systematic Label Noise (20): Why can a larger forest fail to repair performance when the training labels themselves are systematically corrupted? | Machine learning | Medium | Essay | Not attempted | Interview subscription
2597 | Weighted Region Update (2): If observations in a boosting region $R$ carry positive weights $w_i$, derive the constant update $\gamma$ that minimizes $\sum_{i \in R} w_i (r_i - \gamma)^2$. | Machine learning | Easy | Derivation | Not attempted | Free
2598 | Final Prediction After Three Boosting Rounds (23): A boosting model starts from $F_0(x) = 10$. For one observation, the leaf updates along its path are +1.2, -0.5, and +0.8 across three rounds, with learning rate $\eta = 0.1$ each round. What is the final prediction? | Machine learning | Medium | Numerical | Not attempted | Free
2604 | Why Label Noise Is Especially Toxic (13): Why does boosting often suffer badly when labels are noisy? | Machine learning | Medium | Essay | Not attempted | Free
2607 | Why Overly Deep Base Trees Can Cancel Shrinkage Discipline (15): Why can a very deep base tree undermine the regularizing effect of a small learning rate? | Machine learning | Easy | Essay | Not attempted | Free
2608 | Residual After Two Shrunken Updates (24): A point currently has residual 6. Two boosting rounds hit its region with leaf updates 1.5 and 0.8, using learning rate $\eta = 0.2$ in both rounds. What residual remains after the two rounds? | Machine learning | Medium | Numerical | Not attempted | Free
2611 | Why Boosting Parallelizes Worse Than Random Forests (16): Why is boosting fundamentally harder to parallelize across rounds than random forests? | Machine learning | Easy | Essay | Not attempted | Free
2614 | Why the Initial Prediction Matters (18): Why can the choice of the initial prediction $F_0$ matter for the early trajectory of boosting? | Machine learning | Medium | Essay | Not attempted | Interview subscription
2616 | Why Leaf-Wise Growth Can Be Higher Variance (20): Why can leaf-wise tree growth be more variance-prone than level-wise growth inside a boosting system? | Machine learning | Easy | Essay | Not attempted | Free
2617 | Two-Region Two-Round Boosting Calculation (25): A boosting model starts from $F_0 = 0$ with learning rate $\eta = 0.1$. In round 1, region A gets update +2 and region B gets update -1. In round 2, region A gets update -0.5 and region B gets update +0.25. What are the final predictions for a point that always stays in region A and a point that always stays in region B? | Machine learning | Easy | Numerical | Not attempted | Free
2619 | Why Flat Late-Round Validation Gains Still Suggest Stopping (22): If the validation gain per boosting round becomes tiny and erratic late in training, why is that often a strong argument for stopping? | Machine learning | Medium | Essay | Not attempted | Interview subscription
2641 | Why Clipping Helps Exploding but Not Vanishing Gradients (23): Why is gradient clipping a natural remedy for exploding gradients but not for vanishing gradients? | Machine learning | Easy | Essay | Not attempted | Free
2642 | BatchNorm Running Mean Update (13): A BatchNorm layer updates its running mean by $\mu_{\text{new}} = m\,\mu_{\text{old}} + (1-m)\,\mu_{\text{batch}}$. What does this formula mean operationally? | Machine learning | Easy | Derivation | Not attempted | Free
2643 | Clipping Plus Weight Decay on a Vector (25): A parameter vector is $w_t = (3, 4)$. Its gradient is $g = (6, 8)$, whose norm is 10. Apply global-norm clipping with threshold 5, then a decoupled weight-decay step with learning rate $\eta = 0.1$ and $\lambda = 0.1$. What is the new parameter vector? | Machine learning | Medium | Numerical | Not attempted | Interview subscription
2644 | Why LayerNorm Is Attractive in Sequence and Online Settings (24): Why is LayerNorm often preferred over BatchNorm in sequence models or online inference settings? | Machine learning | Medium | Essay | Not attempted | Interview subscription
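
The numerical questions in this list (2598, 2608, 2617, 2643) come down to short arithmetic, which can be checked with a minimal Python sketch. This is one possible reading, not an official answer key. The helper names `boosted_prediction`, `remaining_residual`, and `clip_then_decay` are hypothetical. The sketch assumes that boosting adds `eta * update` to the running prediction each round, that the residual is `y - F(x)`, and that "decoupled weight decay" means a plain gradient step plus a separate `eta * lam * w` shrinkage term; the questions themselves do not spell out these conventions.

```python
import math


def boosted_prediction(f0, leaf_updates, eta):
    """Additive boosting (assumed): F_M = F_0 + eta * sum of leaf updates."""
    return f0 + eta * sum(leaf_updates)


def remaining_residual(residual, leaf_updates, eta):
    """Assumes residual = y - F(x), so each round lowers it by eta * update."""
    return residual - eta * sum(leaf_updates)


def clip_then_decay(w, g, threshold, eta, lam):
    """Global-norm clipping, then an assumed decoupled weight-decay step:
    w_new = w - eta * g_clipped - eta * lam * w.
    """
    norm = math.sqrt(sum(gi * gi for gi in g))
    # Rescale the gradient only when its norm exceeds the threshold.
    scale = min(1.0, threshold / norm) if norm > 0 else 1.0
    g_clipped = [scale * gi for gi in g]
    return [wi - eta * gc - eta * lam * wi for wi, gc in zip(w, g_clipped)]


# Question 2598: F_0 = 10, updates +1.2, -0.5, +0.8, eta = 0.1
q2598 = boosted_prediction(10.0, [1.2, -0.5, 0.8], 0.1)

# Question 2608: residual 6, updates 1.5 and 0.8, eta = 0.2
q2608 = remaining_residual(6.0, [1.5, 0.8], 0.2)

# Question 2617: F_0 = 0, eta = 0.1, one path per region
q2617_a = boosted_prediction(0.0, [2.0, -0.5], 0.1)
q2617_b = boosted_prediction(0.0, [-1.0, 0.25], 0.1)

# Question 2643: w = (3, 4), g = (6, 8), threshold 5, eta = 0.1, lambda = 0.1
q2643 = clip_then_decay([3.0, 4.0], [6.0, 8.0], 5.0, 0.1, 0.1)

print(round(q2598, 4))                  # 10.15
print(round(q2608, 4))                  # 5.54
print(round(q2617_a, 4), round(q2617_b, 4))  # 0.15 -0.075
print([round(x, 4) for x in q2643])     # [2.67, 3.56]
```

Under these assumptions the clipped gradient in question 2643 is (3, 4), so the update scales the whole vector by 1 - 0.1 - 0.01 = 0.89. A different weight-decay convention (for example, decaying by `lam * w` without the learning-rate factor) would give a different result.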