Question 2555 · Machine Learning
Three candidate splits on the same node have Gini gains 0.18, 0.16, and 0.11, with smaller-child sizes 3, 4, and 7 respectively. If the minimum allowed leaf size is 4, which split is actually chosen?
Question 2566 · Machine Learning
Node A would have leaf error 12 if pruned, while its current subtree has error 7 and 3 leaves. Node B would have leaf error 9 if pruned, while its current subtree has error 6 and 2 leaves. Which node is the weaker link and should be pruned first under cost-complexity pruning?
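One way to check the comparison is sketched below. It assumes the usual cost-complexity definition, in which a node's weakest-link value is the error increase from pruning divided by the number of leaves removed. The function name `weakest_link_alpha` is hypothetical and chosen only for illustration.

```python
def weakest_link_alpha(leaf_error, subtree_error, n_leaves):
    """Error increase per leaf removed when a subtree collapses to one leaf.

    Assumes the standard form (R(t) - R(T_t)) / (|T_t| - 1).
    """
    if n_leaves < 2:
        raise ValueError("a subtree needs at least 2 leaves to be pruned")
    return (leaf_error - subtree_error) / (n_leaves - 1)


# Numbers taken from the question text.
alpha_a = weakest_link_alpha(leaf_error=12, subtree_error=7, n_leaves=3)
alpha_b = weakest_link_alpha(leaf_error=9, subtree_error=6, n_leaves=2)

# The node with the smaller alpha is the weaker link and is pruned first.
weakest = "A" if alpha_a < alpha_b else "B"
print(alpha_a, alpha_b, weakest)
```

Under that assumption the node with the smaller value is the one whose subtree buys the least error reduction per leaf.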
Question 2568 · Machine Learning
A parent node left uncut has SSE 70. A 2-leaf split gives total SSE 44. A 3-leaf subtree gives total SSE 36. If the complexity penalty is 10 per extra leaf relative to the uncut node, which option has the lowest penalized objective?
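The comparison can be sketched as follows, assuming the penalized objective is SSE plus the penalty times the number of leaves beyond the single uncut node. The helper name `penalized_sse` and the option labels are hypothetical.

```python
def penalized_sse(sse, n_leaves, penalty_per_extra_leaf):
    """SSE plus a charge for each leaf beyond the single uncut node."""
    return sse + penalty_per_extra_leaf * (n_leaves - 1)


# (SSE, number of leaves) for each option in the question.
options = {"uncut": (70, 1), "2-leaf": (44, 2), "3-leaf": (36, 3)}

scores = {
    name: penalized_sse(sse, n_leaves, penalty_per_extra_leaf=10)
    for name, (sse, n_leaves) in options.items()
}
best = min(scores, key=scores.get)
print(scores, best)
```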
Question 2554 · Machine Learning
A leaf contains 7 positives and 13 negatives. Predicting negative incurs false-negative cost 4 on each hidden positive, while predicting positive incurs false-positive cost 1 on each hidden negative. Which class should the leaf predict?
Question 2559 · Machine Learning
A surrogate split agrees with the primary split on 34 of 40 training cases where both features are present. If 12 production cases are missing the primary split feature and are routed by the surrogate, what is the expected number of misroutes?
Question 2560 · Machine Learning
If every sample weight in a node is multiplied by the same constant c>0, how does each candidate split's weighted impurity decrease change?
Question 2556 · Machine Learning
A sorted feature has five distinct-value blocks of sizes [3, 5, 2, 4, 6], and splits are allowed only between distinct-value blocks. If each child leaf must contain at least 6 observations, how many legal thresholds exist?
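A minimal sketch of the counting rule, assuming a threshold is legal when it falls on a boundary between two adjacent blocks and both resulting children meet the minimum size. The function name `count_legal_thresholds` is hypothetical.

```python
from itertools import accumulate


def count_legal_thresholds(block_sizes, min_leaf):
    """Count block boundaries where both children have at least min_leaf rows."""
    total = sum(block_sizes)
    # Left-child size at each boundary; the last cumulative sum is not a split.
    left_sizes = list(accumulate(block_sizes))[:-1]
    return sum(
        1
        for left in left_sizes
        if left >= min_leaf and total - left >= min_leaf
    )


n_legal = count_legal_thresholds([3, 5, 2, 4, 6], min_leaf=6)
print(n_legal)
```

Passing a list of ones treats every observation as its own distinct value, so the same function also counts split positions when there are no ties.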
Question 2546 · Machine Learning
A sorted feature has 31 observations, and each child leaf must contain at least 6 observations. How many legal split positions are there?
Question 2553 · Machine Learning
A tree starts with 96 observations at the root and every split is perfectly balanced. If each leaf must contain at least 12 observations, what is the maximum possible depth?
Question 2547 · Machine Learning
A node has leaf error 18 if pruned into a single leaf. Its current subtree has training error 10 and 3 leaves. What is the weakest-link alpha for pruning this subtree?
Question 2550 · Machine Learning
A classification leaf contains 6 positive cases and 14 negative cases. Predicting positive costs 1 per false positive, while predicting negative costs 4 per false negative. Which class should the leaf predict to minimize expected leaf loss?
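The decision can be sketched by totaling the cost of each possible label over the cases in the leaf. The function name `leaf_prediction` is hypothetical, and breaking ties toward "negative" is an assumption made here, not something the question specifies.

```python
def leaf_prediction(n_pos, n_neg, fp_cost, fn_cost):
    """Return the cheaper label and the total cost of each choice."""
    cost_if_positive = n_neg * fp_cost  # every negative becomes a false positive
    cost_if_negative = n_pos * fn_cost  # every positive becomes a false negative
    # Assumption: ties go to "negative".
    label = "positive" if cost_if_positive < cost_if_negative else "negative"
    return label, cost_if_positive, cost_if_negative


# Numbers taken from the question text.
label, cost_pos, cost_neg = leaf_prediction(n_pos=6, n_neg=14, fp_cost=1, fn_cost=4)
print(label, cost_pos, cost_neg)
```

The majority class and the cost-minimizing class can differ whenever the two error costs are unequal.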
Question 2549 · Machine Learning
A regression leaf has SSE 260. Splitting it would reduce child SSE to 230. If the complexity penalty is 12 per extra leaf, should you keep the split?
Question 2570 · Machine Learning
A primary split is missing for some rows, so a surrogate split is trained on the M rows where the primary feature is observed. If it sends A of those rows to the same side as the primary split, what is its agreement rate?
Question 2564 · Machine Learning
A stump has validation loss 30. Splitting it into two leaves lowers validation loss to 22 but adds an instability penalty lambda per extra leaf. For what largest lambda is the split still preferred?
Question 2565 · Machine Learning
Replacing a single leaf by a 3-leaf subtree reduces validation loss by 4.5. If the complexity charge is alpha = 1.2 per extra leaf, should you keep the subtree?
Question 2552 · Machine Learning
Split A originally has gain 1.20 and split B has gain 1.05. After one row is corrected, A loses 0.10 gain while B gains 0.08. Which split is now best?
Question 2569 · Machine Learning
Why can a decision tree need many small rectangles to approximate a simple diagonal boundary?
Question 2551 · Machine Learning
Why can an aggressive pre-pruning rule reject a first split that looks weak locally even though it would unlock a much better second-level structure?
Question 2557 · Machine Learning
Why are deep decision trees often called unstable learners?
Question 2567 · Machine Learning
Why can two root splits with almost identical immediate gain still lead to very different final trees?