INTERVIEW PREP

数学与非代码面试题

覆盖数学、概率、统计、脑筋急转弯、机器学习和金融。这里负责筛选和进入单题;编程题使用独立的 LeetCode 式 coding lab。

题目
4169
领域
8
当前筛选
453

21 / 23

非代码面试题

显示 20 / 453 道匹配题目

答题状态:未尝试未正确已正确
4295Weight Decay Shrinkage 5An optimizer uses the proximal L1 shrinkage step sign(w)*max(|w| - tau, 0). If the pre-step weight is w = 0.7 and tau = 0.2, what weight remains after shrinkage?机器学习简单数值题未尝试面试订阅4296Dropout Noise Level 1Keep eta = 0.1, gradient g = 0.3, and current weight w = 2.0. In the decoupled update w new = (1 - eta*lambda)w - eta*g, lambda rises from 0.05 to 0.10. By how much does the updated weight decrease relative to the old lambda case?机器学习中等数值题未尝试面试订阅4297Dropout Noise Level 2A unit has activation 2.0 before standard dropout, meaning dropped units become 0 and kept units stay at 2.0. If keep probability falls from 0.8 to 0.5, what happens to the expected post-dropout activation?机器学习中等数值题未尝试面试订阅4298Dropout Noise Level 3A 5-class model uses label smoothing with epsilon distributed uniformly across all classes. If epsilon rises from 0.1 to 0.3, by how much does the true-class target change?机器学习中等数值题未尝试面试订阅4300Dropout Noise Level 5A proximal L1 step uses sign(w)*max(|w| - tau, 0). If the pre-step weight is 0.6, what output do you get when tau rises from 0.2 to 0.5?机器学习中等数值题未尝试面试订阅4301Label Smoothing Loss 1Mixup combines one-hot labels for class 1 and class 4 in a 4-class problem with lambda = 0.3 on class 1's example. What mixed target vector is produced?机器学习中等数值题未尝试面试订阅4302Label Smoothing Loss 2A network uses stochastic depth on 12 residual blocks, each with survival probability 0.75 during training. How many blocks are active on average in a training pass?机器学习中等数值题未尝试面试订阅4303Label Smoothing Loss 3A layer has 400 weights. DropConnect keeps each weight independently with probability 0.9 during training. How many weights are active on average in one forward pass?机器学习中等数值题未尝试面试订阅4304Label Smoothing Loss 4A unit has activation a = 3 before inverted dropout with keep probability q = 0.75. During training the output is either 0 or a/q. What is the variance of that post-dropout output?机器学习中等数值题未尝试面试订阅4306Sparse Weights Blow UpA wide MLP on 8k tabular rows drives training AUC to 0.99 while validation AUC stalls at 0.76. Feature semantics do not support label-preserving augmentation, and the largest weights sit on sparse one-hot inputs. Which regularization control should you try first?机器学习中等essay未尝试面试订阅4307Validation Peak Then DriftTraining loss keeps improving every epoch, but validation Sharpe peaks around epoch 11 and then gradually drifts lower. You are not changing architecture or dataset. What regularization move is most justified?机器学习中等essay未尝试面试订阅4308Noisy Labels And OverconfidenceA classifier already has good accuracy, but on borderline names it assigns 99% probability too often and the labels are believed to contain small noise. Which regularization change best targets that failure mode?机器学习中等essay未尝试面试订阅4309Co-Adapted Hidden UnitsTwo hidden layers memorize pairs of co-occurring signals. In-sample metrics look great, but when one signal in the pair shifts slightly out of sample, performance collapses. Which control is most naturally aimed at reducing this co-adaptation?机器学习中等essay未尝试面试订阅4310Safe Invariance AvailableYou are training on a small image-like signal dataset where small translations and mirror flips preserve the label by construction. The network fits the training set too easily. What regularization lever should move to the front of the queue?机器学习中等essay未尝试面试订阅4311Before Raising DropoutYou are tempted to raise dropout from 0.2 to 0.6 after one mediocre run. What is the first diagnostic question you should answer before doing that?机器学习中等essay未尝试面试订阅4312Before Adding AugmentationA teammate proposes aggressive data augmentation as a universal fix. What is the first check you should make before accepting that plan?机器学习中等essay未尝试面试订阅4313When Weight Decay Starts HurtingPerformance falls as you increase weight decay. Before concluding that regularization is bad, what structural question should you ask about the signal?机器学习中等essay未尝试面试订阅4314Before Trusting Early StoppingYour validation metric is noisy day to day. Before treating the first local peak as the stopping point, what should you calibrate?机器学习中等essay未尝试面试订阅4315Regularization Is Not IsolatedIn an overparameterized network, why is it a mistake to discuss regularization strength without also looking at optimizer and data pipeline choices?机器学习中等essay未尝试面试订阅4316Attention Score CountA Transformer layer processes L=256 tokens with H=8 heads. Ignoring the value dimension, how many raw attention score entries are formed across all heads?机器学习简单数值题未尝试面试订阅