Value Losses¶
This is a collection of loss functions that may be used for learning a value function. They are ordinary loss functions known from supervised learning.

- mse – Ordinary mean-squared error loss function.
- huber – Huber loss function.
- logloss – Logistic loss function for binary classification, where the target is y_true = \(y\in\{0,1\}\) and the model output is a probability y_pred = \(\hat{y}\in[0,1]\).
- logloss_sign – Logistic loss function specific to the case in which the target is a sign \(y\in\{-1,1\}\) and the model output is a logit \(\hat{z}\in\mathbb{R}\).
- quantile_huber – Quantile Huber loss function.
Object Reference¶
- coax.value_losses.mse(y_true, y_pred, w=None)[source]¶
Ordinary mean-squared error loss function.
\[L\ =\ \frac12(\hat{y} - y)^2\]
- Parameters:
y_true (ndarray) – The target \(y\in\mathbb{R}\).
y_pred (ndarray) – The predicted output \(\hat{y}\in\mathbb{R}\).
w (ndarray, optional) – Sample weights.
- Returns:
loss (scalar ndarray) – The loss averaged over the batch.
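The formula above can be sketched in plain NumPy (an illustrative re-implementation for reference, not the coax source, which operates on JAX arrays):

```python
import numpy as np

def mse(y_true, y_pred, w=None):
    # L = 1/2 (y_pred - y_true)^2, averaged over the batch
    loss = 0.5 * np.square(y_pred - y_true)
    if w is not None:
        loss = loss * w  # optional per-sample weights
    return loss.mean()
```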
- coax.value_losses.huber(y_true, y_pred, w=None, delta=1.0)[source]¶
Huber loss function.
\[\begin{split}L\ =\ \left\{\begin{matrix} \frac12(\hat{y} - y)^2 &\quad:\ |\hat{y} - y|\leq\delta \\ \delta\,|\hat{y} - y| - \frac{\delta^2}{2} &\quad:\ |\hat{y} - y| > \delta \end{matrix}\right.\end{split}\]
- Parameters:
y_true (ndarray) – The target \(y\in\mathbb{R}\).
y_pred (ndarray) – The predicted output \(\hat{y}\in\mathbb{R}\).
w (ndarray, optional) – Sample weights.
delta (float, optional) – The scale of the quadratic-to-linear transition.
- Returns:
loss (scalar ndarray) – The loss averaged over the batch.
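A minimal NumPy sketch of the piecewise definition above (note the two branches meet at \(|\hat{y}-y|=\delta\), where both equal \(\delta^2/2\); not the coax implementation itself):

```python
import numpy as np

def huber(y_true, y_pred, w=None, delta=1.0):
    # quadratic inside |residual| <= delta, linear outside
    residual = np.abs(y_pred - y_true)
    quadratic = 0.5 * np.square(residual)
    linear = delta * residual - 0.5 * delta ** 2
    loss = np.where(residual <= delta, quadratic, linear)
    if w is not None:
        loss = loss * w  # optional per-sample weights
    return loss.mean()
```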
- coax.value_losses.logloss(y_true, y_pred, w=None)[source]¶
Logistic loss function for binary classification, y_true = \(y\in\{0,1\}\) and the model output is a probability y_pred = \(\hat{y}\in[0,1]\):
\[L\ =\ -y\log(\hat{y}) - (1 - y)\log(1 - \hat{y})\]
- Parameters:
y_true (ndarray) – The binary target, encoded as \(y\in\{0,1\}\).
y_pred ((ndarray of) float) – The predicted output, represented by a probability \(\hat{y}\in[0,1]\).
w (ndarray, optional) – Sample weights.
- Returns:
loss (scalar ndarray) – The loss averaged over the batch.
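The cross-entropy formula above translates directly to NumPy (an illustrative sketch, not the coax source; it assumes \(\hat{y}\) is strictly inside \((0,1)\) so the logs are finite):

```python
import numpy as np

def logloss(y_true, y_pred, w=None):
    # L = -y log(yhat) - (1 - y) log(1 - yhat)
    loss = -y_true * np.log(y_pred) - (1 - y_true) * np.log(1 - y_pred)
    if w is not None:
        loss = loss * w  # optional per-sample weights
    return loss.mean()
```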
- coax.value_losses.logloss_sign(y_true_sign, logits, w=None)[source]¶
Logistic loss function specific to the case in which the target is a sign \(y\in\{-1,1\}\) and the model output is a logit \(\hat{z}\in\mathbb{R}\).
\[L\ =\ \log(1 + \exp(-y\,\hat{z}))\]
This version tends to be more numerically stable than the generic implementation, because it avoids having to map the predicted logit to a probability.
- Parameters:
y_true_sign (ndarray) – The binary target, encoded as \(y=\pm1\).
logits (ndarray) – The predicted output, represented by a logit \(\hat{z}\in\mathbb{R}\).
w (ndarray, optional) – Sample weights.
- Returns:
loss (scalar ndarray) – The loss averaged over the batch.
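The stability claim above can be illustrated with a NumPy sketch (not the coax source): computing \(\log(1+\exp(-y\hat{z}))\) via `np.logaddexp` never forms the intermediate \(\exp\), so large logits do not overflow.

```python
import numpy as np

def logloss_sign(y_true_sign, logits, w=None):
    # L = log(1 + exp(-y * z)), computed as logaddexp(0, -y * z)
    # to stay finite even for very large |z|
    loss = np.logaddexp(0.0, -y_true_sign * logits)
    if w is not None:
        loss = loss * w  # optional per-sample weights
    return loss.mean()
```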
- coax.value_losses.quantile_huber(y_true, y_pred, quantiles, w=None, delta=1.0)[source]¶
Quantile Huber loss function.
\[\begin{split}\delta_{ij} &= y_j - \hat{y}_i\\ \rho^\kappa_\tau(\delta_{ij}) &= |\tau - \mathbb{I}{\{ \delta_{ij} < 0 \}}| \ \frac{\mathcal{L}_\kappa(\delta_{ij})}{\kappa},\ \quad \text{with}\\ \mathcal{L}_\kappa(\delta_{ij}) &= \begin{cases} \frac{1}{2} \delta_{ij}^2,\quad \ &\text{if } |\delta_{ij}| \le \kappa\\ \kappa (|\delta_{ij}| - \frac{1}{2}\kappa),\quad \ &\text{otherwise} \end{cases}\end{split}\]
- Parameters:
y_true (ndarray) – The target \(y\in\mathbb{R}^{n}\).
y_pred (ndarray) – The predicted output \(\hat{y}\in\mathbb{R}^{n}\).
quantiles (ndarray) – The quantiles of the prediction \(\tau\in\mathbb{R}^{n}\).
w (ndarray, optional) – Sample weights.
delta (float, optional) – The scale \(\kappa\) of the quadratic-to-linear transition.
- Returns:
loss (scalar ndarray) – The loss averaged over the batch.
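The pairwise structure above can be sketched for a single sample with \(n\) predicted quantiles (an illustrative NumPy re-implementation; the exact batch handling and reduction in coax may differ):

```python
import numpy as np

def quantile_huber(y_true, y_pred, quantiles, delta=1.0):
    # pairwise residuals: delta_ij = y_j - yhat_i
    d = y_true[None, :] - y_pred[:, None]
    abs_d = np.abs(d)
    # Huber kernel L_kappa with kappa = delta
    huber = np.where(abs_d <= delta,
                     0.5 * np.square(d),
                     delta * (abs_d - 0.5 * delta))
    # asymmetric quantile weight |tau_i - 1{delta_ij < 0}|
    # tilts the loss so each yhat_i tracks its own quantile tau_i
    rho = np.abs(quantiles[:, None] - (d < 0.0)) * huber / delta
    return rho.mean()  # reduction convention is an assumption here
```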