INTERVIEW PREP

Math and Non-Coding Interview Questions

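Covers math, probability, statistics, brain teasers, machine learning, and finance. This page handles filtering and opening individual questions; programming questions use a separate LeetCode-style coding lab.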

Questions: 4169
Domains: 8
Current filter: 91


Non-Coding Interview Questions

Showing 11 of 91 matching questions

2622. Global-Norm Clipping Formula
A gradient vector g has norm ||g|| greater than the clip threshold c. Derive the clipped gradient under standard global-norm clipping.
Machine Learning | Easy | derivation | Not attempted | Free

2628. Why Residual Connections Help Train Deep Nets
Why do residual connections often make very deep networks easier to optimize?
Machine Learning | Medium | essay | Not attempted | Free

2629. EMA From Zero Initialization
Let m_t = beta * m_{t-1} + (1 - beta) * x_t with m_0 = 0. Derive m_t as an explicit weighted sum of x_1, ..., x_t.
Machine Learning | Medium | derivation | Not attempted | Free

2633. Layer-Norm Shift Invariance
Ignoring learned affine parameters, why does adding the same constant a to every coordinate of a vector leave layer-normalized activations unchanged?
Machine Learning | Medium | derivation | Not attempted | Free

2645. Why Global-Norm Clipping Preserves Direction
Why does global-norm clipping change the magnitude of a gradient vector but not its direction whenever clipping is active?
Machine Learning | Hard | derivation | Not attempted | Interview subscription

2655. Why Expanding Windows Can Beat Rolling Windows Under Sparse Data
Why might an expanding-window CV design be preferable to a rolling-window design when the series is short and drift is present but not violent?
Machine Learning | Hard | essay | Not attempted | Interview subscription

2665. Why Tiny Folds Can Exaggerate Regularization
Why can a very small training fold make heavily regularized models look better than they would on the full training set?
Machine Learning | Hard | essay | Not attempted | Interview subscription

2666. Why Outer-Fold Disagreement Is Informative
If different outer folds in nested CV keep selecting different hyperparameters, what does that usually say about the learning problem?
Machine Learning | Easy | essay | Not attempted | Free

2680. Why Low R-Squared Can Still Be Valuable Yet Hard to Verify
Why can a signal with tiny explanatory power still be economically useful, while also being unusually hard to validate convincingly?
Machine Learning | Hard | essay | Not attempted | Interview subscription

2683. Why Long Training Windows Can Learn the Wrong World
Why can adding more historical years lower estimation variance and yet make a finance model worse?
Machine Learning | Medium | essay | Not attempted | Interview subscription

2684. Why Short Windows Adapt but Also Whipsaw
Why does a short rolling window often react faster to new regimes while simultaneously making parameter estimates much less stable?
Machine Learning | Hard | essay | Not attempted | Interview subscription
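
As a numerical sanity check for the derivation questions 2622, 2645 and 2629 listed above, here is a minimal Python sketch. It is not the site's reference answer. The function names (clip_by_global_norm, ema_recursive, ema_closed_form) are hypothetical, and it assumes the usual textbook definitions: when ||g|| > c the gradient is rescaled by c / ||g||, and the EMA recursion unrolls to m_t = (1 - beta) * sum over i = 1..t of beta^(t - i) * x_i.

```python
import math


def clip_by_global_norm(g, c):
    """Rescale g so its L2 norm is at most c (assumed standard definition)."""
    norm = math.sqrt(sum(x * x for x in g))
    if norm <= c or norm == 0.0:
        # Clipping is inactive: return the gradient unchanged.
        return list(g)
    scale = c / norm  # One scalar for every coordinate, so direction is kept.
    return [scale * x for x in g]


def ema_recursive(xs, beta):
    """Run m_t = beta * m_{t-1} + (1 - beta) * x_t starting from m_0 = 0."""
    m = 0.0
    for x in xs:
        m = beta * m + (1.0 - beta) * x
    return m


def ema_closed_form(xs, beta):
    """Evaluate (1 - beta) * sum_i beta^(t - i) * x_i for i = 1..t."""
    t = len(xs)
    return (1.0 - beta) * sum(
        beta ** (t - i) * x for i, x in enumerate(xs, start=1)
    )


if __name__ == "__main__":
    clipped = clip_by_global_norm([3.0, 4.0], 1.0)
    print("clipped gradient:", clipped)
    print("recursive EMA:  ", ema_recursive([1.0, 2.0, 3.0], 0.9))
    print("closed-form EMA:", ema_closed_form([1.0, 2.0, 3.0], 0.9))
```

Because every coordinate is multiplied by the same positive scalar, the ratio between coordinates is unchanged, which is the intuition behind question 2645. The two EMA functions should agree up to floating-point error for any input sequence.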