Non-coding interview questions
Showing 6 of 26 matching questions
| ID | Question | Domain | Difficulty | Type | Progress | Access |
|---|---|---|---|---|---|---|
| 2637 | ReLU Local Derivative 10: For $\mathrm{ReLU}(z)=\max(0,z)$, what derivative does backprop use when $z>0$ and when $z<0$? | Machine learning | Medium | derivation | Not attempted | Free |
| 2638 | Residual Gradient Numerically 19: A scalar residual block has $y=x+f(x)$ with $f(x)=3x^2$. What is $dy/dx$ at $x=1$? | Machine learning | Medium | numerical | Not attempted | Free |
| 2639 | Steady-State Momentum Under a Constant Gradient 11: If $v_t=\beta v_{t-1}+g$ with constant gradient $g$ and $\lvert\beta\rvert<1$, what constant value does $v_t$ converge to? | Machine learning | Hard | derivation | Not attempted | Free |
| 2640 | Cosine Decay Schedule 12: A learning rate decays from $\eta_{\max}$ to $\eta_{\min}$ over $T$ steps using cosine annealing. What is $\eta_t$ at step $t$? | Machine learning | Hard | derivation | Not attempted | Free |
| 2642 | BatchNorm Running Mean Update 13: A BatchNorm layer updates its running mean by $\mu_{\text{new}}=m\,\mu_{\text{old}}+(1-m)\,\mu_{\text{batch}}$. What does this formula mean operationally? | Machine learning | Easy | derivation | Not attempted | Free |
| 2643 | Clipping Plus Weight Decay on a Vector 25: A parameter vector is $w_t=(3,4)$. Its gradient is $g=(6,8)$, whose norm is 10. Apply global-norm clipping with threshold 5, then a decoupled weight-decay step with learning rate $\eta=0.1$ and $\lambda=0.1$. What is the new parameter vector? | Machine learning | Medium | numerical | Not attempted | Interview subscription |
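The numerical and formula-based questions above can be checked with a short script. The following is a minimal sketch, not a reference solution: the function names are hypothetical, the cosine schedule assumes the common form $\eta_t=\eta_{\min}+\tfrac12(\eta_{\max}-\eta_{\min})(1+\cos(\pi t/T))$, and the last function assumes a plain SGD step with decoupled weight decay written as $w \leftarrow w-\eta\,\hat g-\eta\lambda w$ (other conventions apply $\lambda$ without the factor $\eta$, which would change the result).

```python
import math


def residual_grad(x):
    # Question 2638: y = x + f(x) with f(x) = 3x^2, so dy/dx = 1 + 6x.
    return 1.0 + 6.0 * x


def momentum_limit(beta, g, steps=2000):
    # Question 2639: iterate v_t = beta * v_{t-1} + g starting from v_0 = 0.
    # For |beta| < 1 the iterates approach the fixed point g / (1 - beta).
    v = 0.0
    for _ in range(steps):
        v = beta * v + g
    return v


def cosine_lr(t, T, eta_max, eta_min):
    # Question 2640: assumed standard cosine annealing form (see lead-in).
    return eta_min + 0.5 * (eta_max - eta_min) * (1.0 + math.cos(math.pi * t / T))


def clip_then_decay(w, g, max_norm, eta, lam):
    # Question 2643: global-norm clipping, then an SGD step with decoupled
    # weight decay. The update rule is an assumption stated in the lead-in.
    norm = math.sqrt(sum(gi * gi for gi in g))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    g_clipped = [gi * scale for gi in g]
    return [wi - eta * gi - eta * lam * wi for wi, gi in zip(w, g_clipped)]


if __name__ == "__main__":
    print(residual_grad(1.0))
    print(momentum_limit(0.9, 1.0))
    print(cosine_lr(50, 100, 0.1, 0.001))
    print(clip_then_decay([3.0, 4.0], [6.0, 8.0], 5.0, 0.1, 0.1))
```

Under these assumptions, the clipped gradient in question 2643 is $(3,4)$ and the updated vector is approximately $(2.67, 3.56)$; the residual gradient in question 2638 is $1+6\cdot 1=7$.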