INTERVIEW PREP

Math and Non-Coding Interview Questions

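Covers math, probability, statistics, brain teasers, machine learning, and finance. This page handles filtering and opening individual questions; coding problems use a separate LeetCode-style coding lab.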

Questions: 4169
Domains: 8
Current filter: 622


Non-Coding Interview Questions

Showing 20 of 622 matching questions

4321. Streaming Order-Flow Motifs
You need millisecond-latency prediction from a live order-flow stream. Most of the useful structure comes from local motifs over the most recent 20-40 events, and the model must update online without waiting for a block. Which architecture family should be your first baseline?
Machine Learning | Medium | essay

4322. Online Stateful Sequence
A model must process an indefinite event stream one tick at a time and maintain a compact evolving hidden state that can be updated without revisiting past inputs. Which architecture family is most naturally aligned with that requirement?
Machine Learning | Medium | essay

4323. Long Offline Cross-Reference
You are building an offline model over 4000-token documents where answers often depend on matching phrases across distant sections. Latency is less important than capturing those long-range interactions. Which architecture should dominate the shortlist?
Machine Learning | Medium | essay

4324. Small Data With Local Stationarity
You have limited labeled data, and the target depends on local translation-equivariant patterns in a 2D signal map. Which architecture family usually brings the strongest built-in inductive bias?
Machine Learning | Medium | essay

4325. Rare But Crucial Global Links
A sequence problem has mostly local structure, but a small fraction of labels flips because of interactions between positions hundreds of steps apart. Missing those interactions is very costly. Which architecture family should you favor?
Machine Learning | Medium | essay

4326. Length-Doubling Cost Shock
A local CNN with window size 7 scales like 7L interactions, while a Transformer attention block scales like L^2 score pairs. If L doubles from 256 to 512, by what factor does each interaction count grow, and which architecture hits the sharper scaling wall?
Machine Learning | Medium | essay

4327. CNN Depth For Longer Horizon
A stride-1 CNN uses kernel size 3 and no dilation. To cover a dependency horizon of 9 steps you need 4 layers. If the required horizon rises to 41 steps, how many layers are needed, and what does that imply about the architecture pressure?
Machine Learning | Medium | essay

4328. Small-Data Regime Shift
Suppose the task stays strongly local and translation-equivariant, but your labeled dataset shrinks by a factor of 10. Which architecture becomes more attractive, and why does the shift in data regime matter?
Machine Learning | Medium | essay

4329. Latency Budget Relaxation
A task was originally fully online, making recurrence or causal convolution preferable. If the deployment changes to offline batch scoring with the whole sequence available, which architecture family gains the most from that relaxation?
Machine Learning | Medium | essay

4330. From Local To Global Task Structure
A prediction problem used to depend on short motifs, but after a product change the label now depends on matching information from the first and last quarter of each sequence. Which architecture family should move up the ranking?
Machine Learning | Medium | essay

4331. What To Quantify First
Before you choose between a CNN, RNN, and Transformer for a new sequence task, what two structural quantities should you quantify first?
Machine Learning | Medium | essay

4332. Before Picking Transformer By Default
A teammate wants to start with a Transformer because it won the last benchmark. What is the first counter-question you should ask?
Machine Learning | Medium | essay

4333. Before Discarding RNNs
Why should you hesitate before ruling out RNNs entirely in a trading-system pipeline?
Machine Learning | Medium | essay

4334. Before Using CNN
What is the first structural property you should verify before leaning on a CNN as your main architecture?
Machine Learning | Medium | essay

4335. Hybrid Thinking
If you suspect the task has both strong local motifs and occasional long-range dependencies, what should be your first decomposition step before arguing about model family?
Machine Learning | Medium | essay

4336. Why CNN Can Win
Why can a modest CNN beat a larger Transformer on a small-data task whose label depends mainly on short local patterns?
Machine Learning | Medium | essay

4337. Why RNN Still Matters
Why might an RNN still be the practical choice for a production event-stream model even if Transformers benchmark better offline?
Machine Learning | Medium | essay

4338. When Attention Earns Its Cost
What kind of task structure makes the quadratic cost of attention worth paying?
Machine Learning | Medium | essay

4339. Why Architecture Mismatch Hurts
Why can architecture mismatch dominate parameter count when performance is poor?
Machine Learning | Medium | essay

4340. Hybrid Versus Pure
When is it more sensible to consider a hybrid architecture instead of insisting on a pure CNN, pure RNN, or pure Transformer?
Machine Learning | Medium | essay
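
Questions 4326 and 4327 both rest on simple counting, and the arithmetic can be sketched in a few lines of Python. This is only an illustrative sketch, not a reference answer: the function names are hypothetical, the interaction counts are the 7L and L^2 figures quoted in question 4326, and the receptive-field rule 1 + n(k - 1) is the usual one for a stack of n stride-1, undilated convolutions with kernel size k, which matches the "4 layers for 9 steps" figure in question 4327.

```python
import math


def local_conv_interactions(seq_len: int, window: int = 7) -> int:
    """Interaction count for a local CNN: one window of size `window` per position."""
    return window * seq_len


def attention_score_pairs(seq_len: int) -> int:
    """Score pairs in one full self-attention block: every position against every position."""
    return seq_len ** 2


def growth_factor(count_fn, old_len: int, new_len: int) -> float:
    """Ratio of interaction counts when the sequence length changes."""
    return count_fn(new_len) / count_fn(old_len)


def receptive_field(num_layers: int, kernel_size: int = 3) -> int:
    """Receptive field of stacked stride-1, undilated convolutions (assumed rule)."""
    return 1 + num_layers * (kernel_size - 1)


def layers_for_horizon(horizon: int, kernel_size: int = 3) -> int:
    """Smallest number of layers whose receptive field covers `horizon` steps."""
    return math.ceil((horizon - 1) / (kernel_size - 1))


if __name__ == "__main__":
    # Question 4326: L doubles from 256 to 512.
    print("CNN growth:", growth_factor(local_conv_interactions, 256, 512))
    print("Attention growth:", growth_factor(attention_score_pairs, 256, 512))
    # Question 4327: horizons of 9 and 41 steps with kernel size 3.
    print("Layers for horizon 9:", layers_for_horizon(9))
    print("Layers for horizon 41:", layers_for_horizon(41))
```

Under these assumptions the linear count doubles while the quadratic count quadruples, and the required depth grows linearly with the horizon, which is the pressure that dilation, pooling, or attention is usually brought in to relieve.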