Supervised Learning Fundamentals
machine-learning · statistical-learning · supervised-learning · erm · loss-functions · bayes-optimal · bias-variance · generalization
GLOBAL SEARCH
Found 25 results
Chinese-language questions: machine-learning · statistical-learning · supervised-learning · erm · loss-functions · bayes-optimal · bias-variance · generalization
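Bias-variance decomposition and generalization. Hook: Monday's factor review. A factor researcher at a private fund in Shanghai received a post-mortem email from risk control on Monday. He had been running a cross-sectional regression on CSI 300 constituents with 6 Barra style factors to predict next-day excess returns, with in-sample [formula], and the model manager found it "not sexy enough." A week later he expanded from 6 factors to 36, layering on 28 industry dummy variables, a past-30-day momentum quantile, and several high-frequency microstructure features, and in-sample [formula] jumped...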
A teammate proposes aggressive data augmentation as a universal fix. What is the first check you should make before accepting that plan?
You are tempted to raise dropout from 0.2 to 0.6 after one mediocre run. What is the first diagnostic question you should answer before doing that?
Your validation metric is noisy day to day. Before treating the first local peak as the stopping point, what should you calibrate?
Two hidden layers memorize pairs of co-occurring signals. In-sample metrics look great, but when one signal in the pair shifts slightly out of sample, performance collapses. Which control is most naturally aimed at reducing this co-adaptation?
Keep eta = 0.1, gradient g = 0.3, and current weight w = 2.0. In the decoupled update w_new = (1 - eta*lambda)w - eta*g, lambda rises from 0.05 to 0.10. By how much does the updated weight decrease relative to the old lambda case?
A unit has activation 2.0 before standard dropout, meaning dropped units become 0 and kept units stay at 2.0. If keep probability falls from 0.8 to 0.5, what happens to the expected post-dropout activation?
A 5-class model uses label smoothing with epsilon distributed uniformly across all classes. If epsilon rises from 0.1 to 0.3, by how much does the true-class target change?
A proximal L1 step uses sign(w)*max(|w| - tau, 0). If the pre-step weight is 0.6, what output do you get when tau rises from 0.2 to 0.5?
A classifier already has good accuracy, but on borderline names it assigns 99% probability too often and the labels are believed to contain small noise. Which regularization change best targets that failure mode?
In an overparameterized network, why is it a mistake to discuss regularization strength without also looking at optimizer and data pipeline choices?
You are training on a small image-like signal dataset where small translations and mirror flips preserve the label by construction. The network fits the training set too easily. What regularization lever should move to the front of the queue?
A wide MLP on 8k tabular rows drives training AUC to 0.99 while validation AUC stalls at 0.76. Feature semantics do not support label-preserving augmentation, and the largest weights sit on sparse one-hot inputs. Which regularization control should you try first?
Training loss keeps improving every epoch, but validation Sharpe peaks around epoch 11 and then gradually drifts lower. You are not changing architecture or dataset. What regularization move is most justified?
A hidden unit has pre-dropout activation 3.2. You apply inverted dropout with keep probability 0.8. If the unit is kept on this training pass, what value is forwarded after dropout?
A 4-class classifier uses label smoothing with epsilon = 0.2, distributing epsilon uniformly across all 4 classes including the true class. If class 3 is the correct label, what smoothed target vector do you train on?
A parameter has current value w = 2.0 and gradient g = 0.3. Using a decoupled weight-decay update w_new = (1 - eta*lambda) w - eta*g with eta = 0.1 and lambda = 0.05, what is the updated weight after one step?
A layer weight vector is w = (3, 4), so its norm is 5. You enforce max-norm regularization with cap c = 4 by rescaling only when the norm exceeds c. What vector is stored after clipping?
An optimizer uses the proximal L1 shrinkage step sign(w)*max(|w| - tau, 0). If the pre-step weight is w = 0.7 and tau = 0.2, what weight remains after shrinkage?
Performance falls as you increase weight decay. Before concluding that regularization is bad, what structural question should you ask about the signal?
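Several of the quantitative questions above rest on one-line update rules. A minimal Python sketch of those rules, for readers who want to plug in the numbers themselves, might look like the following. The function names are hypothetical and not taken from any library. The inverted-dropout rule assumes the usual convention of rescaling kept units by 1/keep_prob, and the label-smoothing rule assumes a zero-based class index, so "class 3" in a 1-based question would be index 2.

```python
import math


def decoupled_weight_decay_step(w, g, eta, lam):
    """Decoupled update as written in the questions: (1 - eta*lam)*w - eta*g."""
    return (1.0 - eta * lam) * w - eta * g


def expected_standard_dropout(a, keep_prob):
    """Expected activation when dropped units become 0 and kept units stay at a."""
    return keep_prob * a


def inverted_dropout_kept(a, keep_prob):
    """Value forwarded for a kept unit, assuming rescaling by 1/keep_prob."""
    return a / keep_prob


def smooth_labels(num_classes, true_idx, eps):
    """Spread eps uniformly over all classes, including the true class.

    true_idx is zero-based (an assumption of this sketch).
    """
    base = eps / num_classes
    return [base + (1.0 - eps if k == true_idx else 0.0)
            for k in range(num_classes)]


def max_norm_clip(w, c):
    """Rescale w to norm c only when its Euclidean norm exceeds c."""
    norm = math.sqrt(sum(x * x for x in w))
    if norm <= c:
        return list(w)
    return [x * c / norm for x in w]


def soft_threshold(w, tau):
    """Proximal L1 step: sign(w) * max(|w| - tau, 0)."""
    return math.copysign(max(abs(w) - tau, 0.0), w)


if __name__ == "__main__":
    # Inputs taken from the questions above; results are printed, not asserted.
    print(decoupled_weight_decay_step(w=2.0, g=0.3, eta=0.1, lam=0.05))
    print(inverted_dropout_kept(a=3.2, keep_prob=0.8))
    print(smooth_labels(num_classes=4, true_idx=2, eps=0.2))
    print(max_norm_clip([3.0, 4.0], c=4.0))
    print(soft_threshold(0.7, tau=0.2))
```

Each function takes plain floats or lists so the arithmetic stays visible. In a real training loop these steps would normally come from the framework's optimizer and layers instead of hand-written helpers.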
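Classification loss and logistic regression. Hook: a binary outperformance signal. The factor researcher at a private fund in Shanghai has refit the previous section's 5 factor loadings on the full CSI 300 sample, and now the PM turns the question around: "Don't predict next month's excess return; just give me the probability that this stock beats the CSI 300 next month." The target variable shrinks from the continuous [formula] to the binary [formula], and this signal will directly drive a long/short overlay...
Regularization and model selection. Hook: a "flip" incident. You run the CSI 300 stock-selection strategy at a private fund in Shanghai. Last week, following Lesson 3, you used ordinary least squares (OLS) to regress next-period excess returns on 5 Barra style factors (value, quality, momentum, size, low volatility) and obtained a set of [formula]. This week you moved the estimation window forward by 5 trading days and reran it, and the value factor loading went from [formula] ...
Linear regression as the supervised-learning baseline. Hook: the OLS question at Tuesday's morning meeting. At Tuesday's morning meeting you report last week's factor attribution to the PM of a leading private fund. Using cross-sectional data on CSI 300 constituents over the past 60 trading days, you ran ordinary least squares (OLS) on 5 Barra style factors (size, value, momentum, quality, low volatility); this is...
The statistical learning framework: loss, risk, and empirical risk minimization. Opening scene (Hook): should next month's signal simply be reused? A quantitative researcher at a private fund in Shanghai has compiled the past three years of monthly excess returns for CSI 300 constituents into a table: each row is one stock in one month, [formula], where [formula] is the month-end factor vector (size, value, momentum, low volatility) and [formula] is next month's excess return. She plans to pick a predictor from this sample of roughly ten thousand rows...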