INTERVIEW PREP

Math and Non-Coding Interview Questions

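Covers math, probability, statistics, brain teasers, machine learning, and finance. This page is for filtering and opening individual questions; programming questions use a separate LeetCode-style coding lab.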

Questions: 4169
Domains: 8
Current filter: 89

Page 4 of 5

Non-Coding Interview Questions

Showing 20 of 89 matching questions

Each entry lists: ID, question, domain, difficulty, type, progress, access.

2457. PCA Fit Once Before Cross-Validation
A notebook computes PCA on the full feature matrix and then feeds the resulting components into every cross-validation fold. Why is that not a harmless speed optimization?
Domain: Machine learning. Difficulty: Easy. Type: Essay. Progress: Not attempted. Access: Free.

2458. Choosing Early Stopping by the Test Curve
A team trains one model, plots test loss by boosting round, and reports the round with the best test value. Why is the final test score no longer a valid final check?
Domain: Machine learning. Difficulty: Medium. Type: Essay. Progress: Not attempted. Access: Interview subscription.

2459. Using Revised Index Membership in Historical Filtering
A backtest filters the universe using current index membership and then evaluates historical predictions on that restricted universe. Why is this also a train/test discipline problem?
Domain: Machine learning. Difficulty: Hard. Type: Essay. Progress: Not attempted. Access: Interview subscription.

2460. Validation Used Until One Model Wins by Luck
Two candidate models are close. A researcher keeps slightly changing seeds and preprocessing until one model wins on the same validation slice. Why should the apparent win be discounted?
Domain: Machine learning. Difficulty: Hard. Type: Essay. Progress: Not attempted. Access: Interview subscription.

2461. Learning Rare-Category Merges From Future Features
No labels are used, but the preprocessing step decides which rare sectors to merge by looking at category frequencies on the full dataset. Why can that still make the evaluation optimistic?
Domain: Machine learning. Difficulty: Easy. Type: Essay. Progress: Not attempted. Access: Free.

2462. Peer Average Features That Include Held-Out Targets
A feature for each bond is the average realized default rate of bonds from the same issuer-year bucket, computed over the full sample. Why is this worse than ordinary scaling leakage?
Domain: Machine learning. Difficulty: Medium. Type: Essay. Progress: Not attempted. Access: Interview subscription.

2463. Reusing the Test Set After Debugging
A model is evaluated on test, a bug is found, the code is fixed, and the same test set is used again to verify the fix and choose among two corrected versions. Why is that second use no longer a clean test?
Domain: Machine learning. Difficulty: Medium. Type: Essay. Progress: Not attempted. Access: Interview subscription.

2464. No Test Labels Touched Is Not Enough
Someone argues there was no leakage because the code never accessed test labels. Give the core reason this defense can fail in real ML pipelines.
Domain: Machine learning. Difficulty: Hard. Type: Essay. Progress: Not attempted. Access: Interview subscription.

2465. Why Nested Validation Exists
If the same validation set is repeatedly used for model family choice, feature engineering, and threshold tuning, why is a second outer holdout or nested procedure conceptually necessary?
Domain: Machine learning. Difficulty: Hard. Type: Essay. Progress: Not attempted. Access: Interview subscription.

2466. What to Audit in a Leakage Review
You are auditing a pipeline for leakage. Beyond checking the split line in the final dataframe, what is the highest-value thing to inspect in the code path?
Domain: Machine learning. Difficulty: Easy. Type: Essay. Progress: Not attempted. Access: Free.

2467. Unsupervised Preprocessing Can Still Distort Evaluation
Why can fitting an unsupervised step like PCA or quantile normalization on all rows still make the final reported test error too optimistic?
Domain: Machine learning. Difficulty: Easy. Type: Essay. Progress: Not attempted. Access: Free.

2468. Group Leakage Inflates Confidence Too
Why does entity overlap across train and test typically make confidence intervals and model-stability assessments look better than they really are?
Domain: Machine learning. Difficulty: Medium. Type: Essay. Progress: Not attempted. Access: Interview subscription.

2469. Why Point-in-Time Feature Stores Matter
A team says they can avoid leakage by using the latest vendor table everywhere because the values are more accurate. What core point about deployment reality are they missing?
Domain: Machine learning. Difficulty: Medium. Type: Essay. Progress: Not attempted. Access: Free.

2470. Rare Category Thresholding After Seeing Test Composition
Suppose you choose the minimum frequency for keeping a category only after inspecting how many rare categories appear in the test set. Why is that already a contaminated design choice?
Domain: Machine learning. Difficulty: Hard. Type: Essay. Progress: Not attempted. Access: Interview subscription.

4141. Generative Threshold from Equal-Variance Gaussians 1
A discriminative model was trained at class prior P(Y=1)=0.5 and outputs posterior probability 0.7 for a case x. Overnight the base rate shifts to P(Y=1)=0.2, while the class-conditional evidence for x is assumed unchanged. What posterior probability should you use after this pure prior shift?
Domain: Machine learning. Difficulty: Medium. Type: Numeric. Progress: Not attempted. Access: Interview subscription.

4146. Naive Bayes Posterior 1
A generative regime model assigns posterior probability P(trend|x)=0.7 to the trend regime. If the next-day expected payoff is 12 bps in trend and -4 bps in mean reversion, what conditional expected payoff E[r|x] does the model imply?
Domain: Machine learning. Difficulty: Medium. Type: Numeric. Progress: Not attempted. Access: Interview subscription.

4148. Naive Bayes Posterior 3
A generative regime model assigns posterior probability P(trend|x)=0.6 to the trend regime. If the next-day expected payoff is 0.015 return units in trend and -0.01 return units in mean reversion, what conditional expected payoff E[r|x] does the model imply?
Domain: Machine learning. Difficulty: Medium. Type: Numeric. Progress: Not attempted. Access: Interview subscription.

4149. Naive Bayes Posterior 4
A generative regime model assigns posterior probability P(trend|x)=0.4 to the trend regime. If the next-day expected payoff is 3 return units in trend and 1 return unit in mean reversion, what conditional expected payoff E[r|x] does the model imply?
Domain: Machine learning. Difficulty: Medium. Type: Numeric. Progress: Not attempted. Access: Interview subscription.

4150. Naive Bayes Posterior 5
A generative regime model assigns posterior probability P(trend|x)=0.8 to the trend regime. If the next-day expected payoff is -2 bps in trend and 5 bps in mean reversion, what conditional expected payoff E[r|x] does the model imply?
Domain: Machine learning. Difficulty: Medium. Type: Numeric. Progress: Not attempted. Access: Interview subscription.

4151. Generative Classification with a Missing Feature 1
A two-feature naive Bayes model was trained generatively, but at prediction time X2 is missing. Prior P(Y=1)=0.5, P(X1=1|Y=1)=0.8, P(X1=1|Y=0)=0.3, P(X2=1|Y=1)=0.75, P(X2=1|Y=0)=0.4. You only observe X1=1. What posterior P(Y=1|X1) should the generative model use?
Domain: Machine learning. Difficulty: Medium. Type: Numeric. Progress: Not attempted. Access: Interview subscription.
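
The numeric questions listed above (4141, 4146 to 4150, and 4151) each reduce to a short Bayes-rule or expectation calculation. The sketch below is one way to work them, using only the numbers stated in the prompts. The function names are hypothetical, and the printed values are computed here rather than taken from the site's answer key.

```python
def prior_shift_posterior(p_old, prior_old, prior_new):
    """Re-weight a posterior for a new class prior, holding P(x | y) fixed."""
    # Posterior odds = likelihood ratio * prior odds, so dividing out the old
    # prior odds recovers the likelihood ratio P(x | Y=1) / P(x | Y=0).
    likelihood_ratio = (p_old / (1 - p_old)) / (prior_old / (1 - prior_old))
    new_odds = likelihood_ratio * prior_new / (1 - prior_new)
    return new_odds / (1 + new_odds)


def expected_payoff(p_trend, payoff_trend, payoff_reversion):
    """E[r | x] in a two-regime model: the posterior-weighted average payoff."""
    return p_trend * payoff_trend + (1 - p_trend) * payoff_reversion


def posterior_with_missing_feature(prior, p_x1_given_y1, p_x1_given_y0):
    """P(Y=1 | X1=1) in naive Bayes when X2 is unobserved.

    Under conditional independence, the unobserved X2 sums to 1 within each
    class, so its likelihood terms drop out of the posterior.
    """
    numerator = prior * p_x1_given_y1
    return numerator / (numerator + (1 - prior) * p_x1_given_y0)


if __name__ == "__main__":
    print(round(prior_shift_posterior(0.7, 0.5, 0.2), 4))          # 4141: 0.3684
    print(round(expected_payoff(0.7, 12, -4), 4))                  # 4146: 7.2 (bps)
    print(round(expected_payoff(0.6, 0.015, -0.01), 4))            # 4148: 0.005
    print(round(expected_payoff(0.4, 3, 1), 4))                    # 4149: 1.8
    print(round(expected_payoff(0.8, -2, 5), 4))                   # 4150: -0.6 (bps)
    print(round(posterior_with_missing_feature(0.5, 0.8, 0.3), 4)) # 4151: 0.7273
```

For 4141 the exact value is 7/19, and for 4151 it is 8/11. The X2 likelihoods given in 4151 do not enter the calculation because X2 is marginalized out.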