1967 · Math · Easy · derivation · short
Positive Softplus Tilt 22
Problem
The linear reward is strong relative to the saturation penalty, so the optimum should be positive. The desk maximizes K(x) = 3 x - 4 ln(1+e^x). What x is optimal?
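One way to approach this, offered as a sketch rather than an official solution: differentiating gives K'(x) = 3 - 4 e^x / (1 + e^x), so the stationarity condition is e^x / (1 + e^x) = 3/4, i.e. e^x = 3 and x = ln 3 ≈ 1.0986. Since K''(x) = -4 e^x / (1 + e^x)^2 is negative everywhere, K is concave and this stationary point is the unique maximizer. The short Python check below compares the closed form against a bisection root of K'; the function names `K` and `K_prime` and the bracket [-10, 10] are illustrative choices, not part of the problem.

```python
import math


def K(x: float) -> float:
    """Objective from the problem: K(x) = 3x - 4 ln(1 + e^x)."""
    return 3.0 * x - 4.0 * math.log1p(math.exp(x))


def K_prime(x: float) -> float:
    """Derivative K'(x) = 3 - 4 * sigmoid(x), with sigmoid(x) = 1 / (1 + e^-x)."""
    return 3.0 - 4.0 / (1.0 + math.exp(-x))


# Closed form: K'(x) = 0  =>  sigmoid(x) = 3/4  =>  e^x = 3.
x_star = math.log(3.0)

# Numerical cross-check by bisection on K'. K' is strictly decreasing,
# positive at -10 and negative at 10 (assumed bracket, chosen for illustration).
lo, hi = -10.0, 10.0
for _ in range(200):
    mid = 0.5 * (lo + hi)
    if K_prime(mid) > 0.0:
        lo = mid  # root lies to the right of mid
    else:
        hi = mid  # root lies at or to the left of mid
x_numeric = 0.5 * (lo + hi)

print(f"closed form x* = {x_star:.6f}")
print(f"bisection   x* = {x_numeric:.6f}")
```

The optimum is positive, consistent with the problem's hint that the linear reward (slope 3) outweighs the saturation penalty for small x but cannot exceed its limiting slope of 4.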
Your answer