GLOBAL SEARCH

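Search courses, modules, questions, and saved question lists

Search runs on the server; question explanations and answers never appear in search results. Log in to search your own saved question lists.

25 results found

Chinese-language questions
Module 2.6.2 · Mathematical and Statistical Skills · Machine Learning Theory

Tree Models and Kernel Methods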

machine-learning · tree-based-methods · decision-tree · cart · impurity · pruning · bagging · random-forest

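Course · Tree Models and Kernel Methods · Machine Learning Theory

Bagging and Random Forests

Friday, midday session. A CN private fund managing 5 billion drops a CSI 300 alpha dataset on your desk: 30 features, with the daily next-day excess return as the label. The depth-15 CART tree from the previous lesson reached 100% in-sample directional accuracy but only 51% out of sample, barely better than a coin flip, with a Sharpe ratio close to zero. You replace it with the average of 500 deep trees trained independently on bootstrap samples, and out-of-sample accuracy jumps to 57%. This jump, ...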

Question 2566 · Machine Learning

Choose the Weakest-Link Node to Prune 24

Node A would have leaf error 12 if pruned, while its current subtree has error 7 and 3 leaves. Node B would have leaf error 9 if pruned, while its current subtree has error 6 and 2 leaves. Which node is the weaker link and should be pruned first under cost-complexity pruning?

Question 2568 · Machine Learning

Compare Penalized Tree Options 25

A parent node left uncut has SSE 70. A 2-leaf split gives total SSE 44. A 3-leaf subtree gives total SSE 36. If the complexity penalty is 10 per extra leaf relative to the uncut node, which option has the lowest penalized objective?

Question 2554 · Machine Learning

Cost-Sensitive Leaf Label 21

A leaf contains 7 positives and 13 negatives. Predicting negative incurs false-negative cost 4 on each hidden positive, while predicting positive incurs false-positive cost 1 on each hidden negative. Which class should the leaf predict?
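The comparison behind a cost-sensitive leaf label can be sketched as below. This is a minimal illustration, not the platform's reference solution: the function name `leaf_label` and the sample counts are hypothetical and deliberately differ from the question, and ties are assumed to go to the negative class.

```python
def leaf_label(n_pos, n_neg, fn_cost, fp_cost):
    """Return the label with the lower total misclassification cost in a leaf."""
    # Predicting negative turns every positive in the leaf into a false negative.
    cost_if_negative = n_pos * fn_cost
    # Predicting positive turns every negative in the leaf into a false positive.
    cost_if_positive = n_neg * fp_cost
    # Assumption: a tie is resolved in favour of "negative".
    return "positive" if cost_if_positive < cost_if_negative else "negative"


# Hypothetical counts, not the ones used in the question.
print(leaf_label(n_pos=2, n_neg=10, fn_cost=3, fp_cost=1))
```

The majority class is not automatically the cheaper prediction: once the false-negative cost is large enough, a minority of positives can still outweigh the negatives.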

Question 2559 · Machine Learning

Expected Misroutes From a Surrogate Split

A surrogate split agrees with the primary split on 34 of 40 training cases where both features are present. If 12 production cases are missing the primary split feature and are routed by the surrogate, what is the expected number of misroutes?
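One way to frame this kind of calculation is sketched below. It assumes the surrogate disagrees with the primary split in production at the same rate as on the training cases, which the question implies but does not guarantee in practice. The function name `expected_misroutes` and the sample numbers are hypothetical and differ from the question.

```python
def expected_misroutes(agree, total, n_missing):
    """Expected number of cases the surrogate sends to the wrong side.

    agree     -- training cases where surrogate and primary split agree
    total     -- training cases where both features are present
    n_missing -- production cases routed by the surrogate
    """
    if total <= 0 or not 0 <= agree <= total:
        raise ValueError("need 0 <= agree <= total and total > 0")
    # Disagreement rate (total - agree) / total, scaled by the routed cases.
    return n_missing * (total - agree) / total


# Hypothetical numbers, not the ones used in the question.
print(expected_misroutes(agree=45, total=50, n_missing=20))
```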

Question 2556 · Machine Learning

Grouped Values and Feasible Thresholds 22

A sorted feature has five distinct-value blocks of sizes [3, 5, 2, 4, 6], and splits are allowed only between distinct-value blocks. If each child leaf must contain at least 6 observations, how many legal thresholds exist?

Question 2553 · Machine Learning

Maximum Balanced Depth Numerically 20

A tree starts with 96 observations at the root and every split is perfectly balanced. If each leaf must contain at least 12 observations, what is the maximum possible depth?

Question 2547 · Machine Learning

Numeric Weakest-Link Alpha 16

A node has leaf error 18 if pruned into a single leaf. Its current subtree has training error 10 and 3 leaves. What is the weakest-link alpha for pruning this subtree?
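In standard CART cost-complexity pruning, the weakest-link alpha of an internal node is the increase in error from collapsing it, divided by the number of leaves removed: (leaf error − subtree error) / (leaves − 1). The sketch below applies that formula. The function name `weakest_link_alpha` and the node values are hypothetical and differ from the questions on this page.

```python
def weakest_link_alpha(leaf_error, subtree_error, n_leaves):
    """Alpha at which collapsing the subtree into one leaf breaks even."""
    if n_leaves < 2:
        raise ValueError("a prunable subtree needs at least 2 leaves")
    # Error increase from pruning, per leaf removed.
    return (leaf_error - subtree_error) / (n_leaves - 1)


# Hypothetical nodes, not the ones used in the questions.
nodes = {
    "X": weakest_link_alpha(leaf_error=20, subtree_error=14, n_leaves=4),
    "Y": weakest_link_alpha(leaf_error=15, subtree_error=10, n_leaves=2),
}
# The node with the smallest alpha is the weakest link and is pruned first.
weakest = min(nodes, key=nodes.get)
print(nodes, weakest)
```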

Question 2550 · Machine Learning

Optimal Leaf Label Under Asymmetric Trading Costs

A classification leaf contains 6 positive cases and 14 negative cases. Predicting positive costs 1 per false positive, while predicting negative costs 4 per false negative. Which class should the leaf predict to minimize expected leaf loss?

Question 2570 · Machine Learning

Surrogate Split Agreement Rate 8

A primary split is missing for some rows, so a surrogate split is trained on the M rows where the primary feature is observed. If it sends A of those rows to the same side as the primary split, what is its agreement rate?

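Course · Tree Models and Kernel Methods · Machine Learning Theory

Decision Trees: CART, Impurity Criteria, and Pruning

Monday morning session, 9:20. You take over an alpha model left behind by a departed colleague: a depth-15 CART (Classification and Regression Tree) tree trained on a three-year daily panel of CSI 300 constituents. The features are 12 variables including momentum, value, quality, low volatility, 5-day return, 20-day volatility, and turnover, and the target is the direction (up/down) of the next day's excess return. In-sample training accuracy 1...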

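Course · Tree Models and Kernel Methods · Machine Learning Theory

Kernel Methods and Support Vector Machines

Monday, one hour before the open, you are sitting in the research room of a mid-sized private fund in Shanghai. The research manager pushes a CSV across the desk: the 300 constituents of the CSI 300, each with a 15-dimensional factor vector (PE, PB, 12-month momentum, 20-day volatility, turnover, analyst upgrade ratio), essentially a lightweight factor model input table. The label [formula] indicates next month's outperform /... relative to the index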

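Course · Tree Models and Kernel Methods · Machine Learning Theory

Gradient Boosting and XGBoost / LightGBM

A factor researcher at a Shanghai private fund finishes training the 500-tree random forest from the previous section. Out-of-sample accuracy on CSI 300 + CSI 500 is 57%, 6 points above the single deep tree's 51%. She changes max features from sqrt(p) to p/3 and raises the tree count to 2000, and accuracy sits unmoved at 57.2%: bagging's variance dividend has been fully used up. At the factor review meeting the PM says just one thing: "Variance has hit its floor, ...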
