Problem 1642 · Statistics
A market regime model has three states: calm, trending, and dislocated, with probabilities $(p_1,p_2,p_3)$. Over 100 days, the observed counts are 20 calm days, 30 trending days, and 50 dislocated days.
Find the MLE of $(p_1,p_2,p_3)$.
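A minimal sketch of the computation, assuming the 100 days are independent draws from one multinomial distribution. Maximizing $\sum_j n_j \log p_j$ subject to $\sum_j p_j = 1$ gives $\hat p_j = n_j/n$. The variable names are illustrative, not part of the problem.

```python
# Multinomial MLE: each state's estimated probability is its observed share.
counts = {"calm": 20, "trending": 30, "dislocated": 50}
n = sum(counts.values())  # 100 days in total

p_hat = {state: c / n for state, c in counts.items()}
print(p_hat)
```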
Problem 1649 · Statistics
Suppose trade-duration observations are modeled as Gamma with known shape $k=3$ and unknown scale $\theta$. Under this parameterization,
$$E[X]=k\theta.$$
If the sample mean is $12$, find the MLE of $\theta$.
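One way to check the answer, assuming i.i.d. observations: with the shape $k$ known, the score equation $\sum_i x_i/\theta^2 - nk/\theta = 0$ gives $\hat\theta = \bar x / k$. The sketch below evaluates that closed form; `sample_mean` and `theta_hat` are illustrative names.

```python
# Gamma(k, theta) with known shape k: the MLE of the scale is mean / shape.
k = 3.0
sample_mean = 12.0

theta_hat = sample_mean / k
print(theta_hat)  # fitted scale; the fitted mean k * theta_hat equals the sample mean
```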
Problem 1641 · Statistics
A strategy is repeatedly tried until the first profitable fill. Let $X$ be the number of attempts until the first success, with support $1,2,\ldots$, and model $X\sim \mathrm{Geometric}(p)$.
If the sample mean from many independent episodes is $4$, find the MLE of $p$. Under the
Problem 1645 · Statistics
Suppose short-horizon pricing errors are modeled as i.i.d. Laplace$(\mu,b)$ with known scale $b=2$, so the density is proportional to $e^{-|x-\mu|/2}$. The observed sample is
$$-1,\;0,\;2,\;2,\;3,\;5,\;7.$$
Find the MLE of $\mu$.
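For a Laplace location parameter with known scale, maximizing the likelihood is the same as minimizing $\sum_i |x_i-\mu|$, which the sample median does. The sketch below computes the median and cross-checks it against a coarse grid search. The grid and the helper `neg_log_lik` are assumptions made for illustration only.

```python
import statistics

sample = [-1, 0, 2, 2, 3, 5, 7]
b = 2.0  # known Laplace scale from the problem statement

def neg_log_lik(mu):
    """Negative log-likelihood in mu, up to an additive constant."""
    return sum(abs(x - mu) for x in sample) / b

# With an odd sample size the minimizer is the unique sample median.
mu_hat = statistics.median(sample)

# Sanity check: the best point on a grid from -2.0 to 8.0 in steps of 0.1.
grid = [i / 10 for i in range(-20, 81)]
grid_best = min(grid, key=neg_log_lik)
print(mu_hat, grid_best)
```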
Problem 1643 · Statistics
Suppose large execution slippage magnitudes are modeled as Pareto with known scale $x_m=1$ and unknown tail index $\alpha$, so the density is
$$f(x)=\alpha x^{-\alpha-1}, \qquad x\ge 1.$$
If $n=8$ observations satisfy
$$\sum_{i=1}^8 \log X_i = 12,$$
find the MLE of $\alpha$. Then
Problem 1637 · Statistics
During a 40-minute observation window, a venue records 120 child-order arrivals. Model the arrivals as a homogeneous Poisson process with intensity $\lambda$ arrivals per minute.
Find the MLE of $\lambda$, and estimate the probability of seeing zero arrivals in the next minute u
Problem 1640 · Statistics
Five i.i.d. observations are modeled as $\mathrm{Uniform}(0,\theta)$. The sample maximum is $7.4$.
Find the MLE of $\theta$, and then estimate the median of the fitted distribution.
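A short sketch of the reasoning: the likelihood is $\theta^{-n}$ for $\theta \ge \max_i x_i$ and zero otherwise, so it is maximized at the smallest admissible value, the sample maximum. By invariance of the MLE, the fitted median of $\mathrm{Uniform}(0,\theta)$ is $\hat\theta/2$. Variable names are illustrative.

```python
# Uniform(0, theta): the likelihood theta**(-n) decreases in theta,
# so the MLE is the smallest theta consistent with the data.
sample_max = 7.4

theta_hat = sample_max
median_hat = theta_hat / 2  # median of Uniform(0, theta) is theta / 2
print(theta_hat, median_hat)
```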
Problem 1644 · Statistics
Suppose execution delays are modeled as Weibull with known shape $k=2$ and unknown scale $\lambda$, with density
$$f(x)=\frac{2x}{\lambda^2}e^{-(x/\lambda)^2}, \qquad x>0.$$
If $n=10$ observations satisfy
$$\sum_{i=1}^{10} X_i^2 = 90,$$
find the MLE of $\lambda$.
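As a hedged worked check: the log-likelihood in $\lambda$ is $-2n\log\lambda - \sum_i x_i^2/\lambda^2$ plus terms free of $\lambda$. Setting the derivative to zero gives $\hat\lambda^2 = \frac{1}{n}\sum_i x_i^2$. The sketch evaluates this with the given summary statistic.

```python
import math

n = 10
sum_sq = 90.0  # given: sum of X_i**2

# Solving -2n/lam + 2*sum_sq/lam**3 = 0 for lam > 0.
lam_hat = math.sqrt(sum_sq / n)
print(lam_hat)
```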
Problem 1638 · Statistics
Ten independent waiting times between mid-price changes sum to 25 seconds. Model each waiting time as $\mathrm{Exp}(\lambda)$.
Find the MLE of $\lambda$, and under the fitted model compute the median waiting time.
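The sketch below assumes $\mathrm{Exp}(\lambda)$ denotes the rate parameterization, with density $\lambda e^{-\lambda x}$. The problem does not spell this out. Under that assumption $\hat\lambda = n/\sum_i x_i$, and the fitted median is $\log 2/\hat\lambda$.

```python
import math

n = 10
total_time = 25.0  # seconds, sum of the ten waiting times

# Rate parameterization assumed: density lam * exp(-lam * x).
lam_hat = n / total_time
median_hat = math.log(2) / lam_hat  # solves 1 - exp(-lam * m) = 0.5
print(lam_hat, round(median_hat, 4))
```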
Problem 1650 · Statistics
Suppose $X_1,\dots,X_{25}\sim N(\mu,4)$ i.i.d., and the sample mean is $\bar X=1.2$.
Find the MLE of $\mu$, and then use invariance to estimate $e^{\mu}$.
Problem 1647 · Statistics
Suppose positive holding-period multipliers are modeled as lognormal: if $X\sim \mathrm{Lognormal}(\mu,\sigma^2)$ then $\log X\sim N(\mu,\sigma^2)$. For a sample of size 12, you are given
$$\overline{\log X} = 0.3, \qquad \sum_{i=1}^{12}(\log X_i-0.3)^2 = 10.8.$$
Find the MLEs of
Problem 1636 · Statistics
A binary trading signal was profitable on 44 of the last 80 trading days. Model each day as an independent Bernoulli$(p)$ outcome.
Find the maximum likelihood estimator of $p$, and then estimate the probability that the next 3 days are all profitable under the fitted model.
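A minimal sketch under the stated independence assumption: the Bernoulli MLE is the sample proportion, and by invariance the fitted probability of three profitable days in a row is $\hat p^3$.

```python
wins, days = 44, 80

p_hat = wins / days            # sample proportion
p_three_in_a_row = p_hat ** 3  # independence across days, plug-in estimate
print(p_hat, round(p_three_in_a_row, 6))
```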
Problem 1646 · Statistics
A venue studies the time to the next spread-widening event. Eight observation windows are each followed for up to 5 seconds. In total, 5 windows contain an event before 5 seconds and 3 windows are right-censored at 5 seconds. The total observed exposure time across all 8 windows
Problem 1648 · Statistics
Suppose observations satisfy
$$Y_i = \beta X_i + \varepsilon_i, \qquad \varepsilon_i\stackrel{iid}{\sim}N(0,\sigma^2),$$
with no intercept and known Gaussian errors. You are told that
$$\sum X_iY_i = 48, \qquad \sum X_i^2 = 16.$$
Find the MLE of $\beta$.
Problem 1639 · Statistics
Suppose $X_1,\dots,X_9$ are modeled as i.i.d. $N(\mu,\sigma^2)$. From the sample you know that
$$\bar X = 5, \qquad \sum_{i=1}^9 (X_i-\bar X)^2 = 18.$$
Find the MLEs of $\mu$ and $\sigma^2$.
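A sketch of the arithmetic: the Gaussian MLEs are $\hat\mu = \bar X$ and $\hat\sigma^2 = \frac{1}{n}\sum_i (X_i-\bar X)^2$. The unbiased estimator, computed alongside for contrast only, divides by $n-1$ instead.

```python
n = 9
x_bar = 5.0
sum_sq_dev = 18.0  # sum of (X_i - x_bar)**2

mu_hat = x_bar
sigma2_mle = sum_sq_dev / n             # MLE divides by n
sigma2_unbiased = sum_sq_dev / (n - 1)  # not the MLE; shown for comparison
print(mu_hat, sigma2_mle, sigma2_unbiased)
```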
Problem 3237 · Statistics
A desk observes only 4 new defaults for a rare event. Using a strong historical Beta prior, the Bayesian posterior mean default rate is much lower than the sample proportion, while the frequentist MLE equals the sample proportion exactly. Explain why these two answers can legitim
Problem 2521 · Machine Learning
For an intercept-only logistic model with n_1 positives and n_0 negatives, what fitted probability p_hat maximizes the log-likelihood?
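To illustrate, the log-likelihood here is $n_1\log p + n_0\log(1-p)$, which is maximized at $\hat p = n_1/(n_1+n_0)$, so the fitted intercept is $\log(n_1/n_0)$. The counts below are hypothetical, chosen only to make the sketch runnable, and the grid search is just a numerical cross-check.

```python
import math

# Hypothetical counts, not taken from the problem.
n_1, n_0 = 30, 70

def log_lik(p):
    return n_1 * math.log(p) + n_0 * math.log(1 - p)

p_hat = n_1 / (n_1 + n_0)         # closed-form maximizer
intercept_hat = math.log(n_1 / n_0)  # logit of p_hat

# Cross-check on a grid of probabilities strictly inside (0, 1).
grid = [i / 1000 for i in range(1, 1000)]
grid_best = max(grid, key=log_lik)
print(p_hat, grid_best)
```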
Course: Parameter Estimation and Hypothesis Testing · Statistical Inference
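A quant researcher at a Shanghai private fund lays the two candidate estimators from the previous lesson side by side: one is the unbiased sample variance [formula] (denominator $n-1$), the other is the maximum likelihood estimation (MLE) version of the variance [formula] (denominator $n$). Intuition tells him that "unbiased" sounds more trustworthy, but when it actually comes to plugging a number into a volatility model, what he needs is a clear, comparable yardstick for "good" versus "bad": one that can tell him, for [formula] ...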
Module 2.2.1 · Mathematics and Statistics Skills · Statistical Inference
MLE · Hypothesis Testing · Confidence Intervals · Bootstrap
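Course: Parameter Estimation and Hypothesis Testing · Statistical Inference
On Monday morning, a quant researcher at a Shanghai private fund piles the CSI 300 intraday log returns from the past 200 trading days onto his screen, preparing to estimate an "annualized volatility" for a new daily-frequency stock index futures strategy. He knows that the true distribution parameters of returns can never be observed; all he has is a string of samples. The question therefore changes shape: which number squeezed out of these 200 values deserves to be called "the estimate of volatility"? A colleague who makes markets on the 50ETF options desk needs to estimate the per-second order arrival rate from the past week's trade frequency ...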
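Course: Parameter Estimation and Hypothesis Testing · Statistical Inference
A quant researcher at a private fund puts on screen the daily return series produced by a new risk-control process over 60 trading days: the sample mean is 12 bp above the control group, with a sample standard deviation of 35 bp. The portfolio manager cares about only one question: is this 12 bp a real effect of the process change, or noise that happened to shake out of 60 numbers? Translating "happened to" into mathematics is the tool this lesson delivers: under an explicit probability model, sort "real effect" and "chance" into the rejection region and the acceptance region, and give ...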
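Course: Parameter Estimation and Hypothesis Testing · Statistical Inference
On Monday morning, a quant researcher at a private fund has to attach a one-line disclaimer to the "average daily excess return" in the weekly LP report. The point estimate is [formula], the sample standard deviation is [formula], and the sample size is [formula]. Marketing presses her: "Is this 5.2 accurate? Can you give me a range?" She cannot answer "the true value has a 95% probability of falling in some interval" (the lesson later shows this is a linguistic trap), but she can give a confidence interval (CI), and ...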
Problem 2457 · Machine Learning
A notebook computes PCA on the full feature matrix and then feeds the resulting components into every cross-validation fold. Why is that not a harmless speed optimization?
Problem 4188 · Machine Learning
Why is the dummy-variable trap more than just a harmless coding oversight?
Problem 2388 · Mathematics
Why can a rare-event payoff have an unstable Monte Carlo estimate even when most simulated paths look harmless?
Problem 2708 · Machine Learning
Why should changing the tradable universe be counted as another research branch rather than as harmless context?
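Course: SciPy and Statistical Tools · Python Data and Quantitative Analysis
It is ten o'clock on Monday morning and you are sitting at the research desk of a mid-sized private fund. The tear sheet that closed out 3.2.2 finished running last night, and among the intermediate artifacts written to disk is the line returns = (closes['510300.SH'].pct_change().dropna()).to_numpy(): an np.ndarray of length 252, the CSI 300 ETF (510300.SH) in 2024...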
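Course: Regression and Generalized Linear Models · Statistical Inference
Two researchers at a Shanghai quant private fund get stuck on the same kind of tool on the same morning. Xiao Zhao is building a timing signal for "will it beat the CSI 300 tomorrow", with a binary 0/1 label; Xiao Li is estimating "the number of order arrivals in the next minute" from 50ETF options market-making data, where the response is a non-negative integer [formula]. The ordinary least squares (OLS) of the module's first three lessons is of no use for either task: OLS assumes by default that the response follows a normal distribution (Gaussian...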
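Course: Supervised Learning Fundamentals · Machine Learning Theory
Linear regression as the baseline for supervised learning. Hook: the OLS question at Tuesday's morning meeting. At Tuesday's morning meeting, you report last week's factor attribution to the PM of a leading private fund. Using cross-sectional data on CSI 300 constituents over the past 60 trading days, you ran an ordinary least squares (OLS) regression on 5 Barra style factors (size, value, momentum, quality, low volatility), which is...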