Complex quantitative models are used extensively in actuarial and financial risk management applications, as well as in wider fields such as environmental risk modelling (Tsanakas and Millossovich 2016; Borgonovo and Plischke 2016; Silvana M. Pesenti, Millossovich, and Tsanakas 2019). The complexity of such models (high dimensionality of inputs; non-linear relationships) motivates the performance of sensitivity analyses, with the aim of providing insight into the ways that model inputs interact and impact upon the model output.
When model inputs are subject to uncertainty, global sensitivity methods are often used, considering the full space of (randomly generated) multivariate scenarios, which represent possible configurations of the model input vector. The particular task of ranking the importance of different model inputs leads to the use of sensitivity measures, which assign a score to each model input. A rich literature on global sensitivity analysis exists, with variance decomposition methods being particularly prominent; see Saltelli et al. (2008) and Borgonovo and Plischke (2016) for wide-ranging reviews. The R package sensitivity (Iooss, Janon, and Pujol 2019) implements a wide range of sensitivity analysis approaches and measures.
We introduce an alternative approach to sensitivity analysis called Scenario Weights for Importance Measurement (SWIM) and present the R package implementing it (Silvana M. Pesenti et al. 2020). The aim of this paper is to provide an accessible introduction to the concepts underlying SWIM and a vignette demonstrating how the package is used. SWIM quantifies how distorting a particular model component (which could be a model input, output, or an intermediate quantity) impacts all other model components. The SWIM approach can be summarised as follows:
The starting point is a table of simulated scenarios, each column containing realisations of a different model component. This table forms the baseline model as well as the dataset on which SWIM bases its calculations.
A stress is defined as a particular modification of a model component (or group of components). This could relate to a change in moments, probabilities of events of interest, or risk measures, such as Value-at-Risk or Expected Shortfall (e.g. McNeil, Frey, and Embrechts (2015)).
SWIM calculates a set of scenario weights, acting upon the simulated scenarios and thus modifying the relative probabilities of scenarios occurring. Scenario weights are derived such that the defined stress on model components is fulfilled, while keeping the distortion to the baseline model to a minimum, as quantified by the Kullback-Leibler divergence (relative entropy).
Given the calculated scenario weights, the impact of the stress on the distributions of all model components is worked out and sensitivity measures, useful for ranking model components, are evaluated.
A key benefit of SWIM is that it provides a sensitivity analysis framework that is economical both computationally and in terms of the information needed to perform the analysis. Specifically, sensitivity analysis is performed using only one set of simulated scenarios. No further simulations are needed, thus eliminating the need for repeated evaluation of the model, which could be numerically expensive. Furthermore, the user of SWIM needs to know neither the explicit form of the joint distribution of model components nor the exact form of functional relations between them. Hence, SWIM is appropriate for the analysis of black box models, thus having a wide scope of applications.
The SWIM approach is largely based on Silvana M. Pesenti, Millossovich, and Tsanakas (2019) and uses theoretical results on risk measures and sensitivity measures developed in that paper. An early sensitivity analysis approach based on scenario weighting was proposed by Beckman and McKay (1987). The Kullback-Leibler divergence has been used extensively in the financial risk management literature – papers that are conceptually close to SWIM include Weber (2007); Breuer and Csiszár (2013); and Cambou and Filipović (2017). Some foundational results related to the minimisation of the Kullback-Leibler divergence are provided in Csiszár (1975).
The SWIM package can be installed from CRAN or through GitHub:
# directly from CRAN
install.packages("SWIM")
# and the development version from GitHub
devtools::install_github("spesenti/SWIM")
Section @ref(Sec:Intro) provides an introduction to SWIM, illustrating key concepts and basic functionalities of the package on a simple example. Section @ref(Sec:Scope) contains technical background on the optimisations that underlie the SWIM package implementation. Furthermore, Section @ref(Sec:Scope) includes a brief reference guide, providing an overview of implemented R functions, objects, and graphical/analysis tools. Finally, a detailed case study of a credit risk portfolio is presented in Section @ref(Sec:CreditModel). Through this case study, advanced capabilities of SWIM for sensitivity analysis are demonstrated.
The purpose of SWIM is to enable sensitivity analysis of models implemented in a Monte Carlo simulation framework, by distorting (stressing) some of the models’ components and monitoring the resulting impact on quantities of interest. To clarify this idea and explain how SWIM works, we first define the terms used. By a model, we mean a set of n (typically simulated) realisations from a vector of random variables (X1, …, Xd), along with scenario weights W assigned to individual realisations, as shown in the table below. Hence each of the columns 1 to d corresponds to a random variable, called a model component, while each row corresponds to a scenario, that is, a state of the world.
$X_1$ | $X_2$ | $\cdots$ | $X_d$ | $W$ |
---|---|---|---|---|
$x_{11}$ | $x_{21}$ | $\cdots$ | $x_{d1}$ | $w_1$ |
$x_{12}$ | $x_{22}$ | $\cdots$ | $x_{d2}$ | $w_2$ |
$\vdots$ | $\vdots$ | $\ddots$ | $\vdots$ | $\vdots$ |
$x_{1n}$ | $x_{2n}$ | $\cdots$ | $x_{dn}$ | $w_n$ |
Each scenario has a scenario weight, shown in the last column, such that scenario i has probability $\frac{w_i}{n}$ of occurring. Scenario weights are always greater than or equal to zero and have an average of 1. When all scenario weights are equal to 1, such that the probability of each scenario is $\frac 1 n$ (the standard Monte Carlo framework), we call the model a baseline model – consequently, weights of a baseline model will never be explicitly mentioned. When scenario weights are not identically equal to 1, such that some scenarios are weighted more than others, we say that we have a stressed model.
The scenario weights make the joint distribution of model components under the stressed model different, compared to the baseline model. For example, under the baseline model, the expected value of X1 and the cumulative distribution function of X1, at threshold t, are respectively given by:
$$ E(X_1)=\frac 1 n \sum_{i=1}^nx_{1i},\quad F_{X_1}(t)= P(X_1\leq t)=\frac 1 n \sum_{i=1}^n \mathbf 1 _{x_{1i}\leq t}, $$
where $\mathbf 1_{x_{1i}\leq t}=1$ if $x_{1i}\leq t$ and 0 otherwise. For a stressed model with scenario weights W, the expected value EW and cumulative distribution function FW become:
$$ E^W(X_1)=\frac 1 n \sum_{i=1}^n w_i x_{1i},\quad F_{X_1}^W(t)=P^W(X_1\leq t)=\frac 1 n \sum_{i=1}^n w_i \mathbf 1 _{x_{1i}\leq t}. $$
Similar expressions can be derived for more involved quantities, such as higher (joint) moments and quantiles.
The logic of stressing a model with SWIM then proceeds as follows. An analyst or modeller is supplied with a baseline model, in the form of a matrix of equiprobable simulated scenarios of model components. The modeller wants to investigate the impact of a change in the distribution of, say, X1. To this effect, she chooses a stress on the distribution of X1, for example requiring that EW(X1) = m; we then say that she is stressing X1 and, by extension, the model. Subsequently, SWIM calculates the scenario weights such that the stress is fulfilled and the distortion to the baseline model induced by the stress is as small as possible; specifically, the Kullback-Leibler divergence (or relative entropy) between the baseline and stressed models is minimised. (See Section @ref(Rfunctions) for more detail on the different types of possible stresses and the corresponding optimisation problems.) Once scenario weights are obtained, they can be used to determine the stressed distribution of any model component or function of model components. For example, for scenario weights W obtained through a stress on X1, we may calculate
$$ E^W(X_2)=\frac 1 n\sum_{i=1}^n w_i x_{2i},\quad E^W(X_1^2+X_2^2)=\frac 1 n \sum_{i=1}^n w_i \left(x_{1i}^2+ x_{2i}^2 \right). $$
Through this process, the modeller can monitor the impact of the stress on X1 on any other random variable of interest. It is notable that this approach does not necessitate generating new simulations from a stochastic model. As the SWIM approach requires a single set of simulated scenarios (the baseline model) it offers a clear computational benefit.
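To make the weighted quantities above concrete, the following base-R sketch (independent of the SWIM package, with toy variable names chosen purely for illustration) computes a stressed mean, a stressed cdf value and a stressed joint moment directly from realisations and scenario weights.
# toy realisations of two model components and scenario weights averaging 1
set.seed(1)
x1 <- rnorm(10)
x2 <- rnorm(10)
w <- rexp(10)
w <- w / mean(w)              # normalise so the weights have average 1
mean(w * x1)                  # stressed mean E^W(X1)
mean(w * (x1 <= 0))           # stressed cdf F^W_{X1}(0)
mean(w * (x1^2 + x2^2))       # stressed joint moment E^W(X1^2 + X2^2)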
Here, through an example, we illustrate the basic concepts and usage of SWIM for sensitivity analysis. More advanced usage of SWIM and options for constructing stresses are demonstrated in Sections @ref(Sec:Scope) and @ref(Sec:CreditModel). We consider a simple portfolio model, with the portfolio loss defined by Y = Z1 + Z2 + Z3. The random variables Z1, Z2, Z3 represent normally distributed losses, with Z1 ∼ N(100, 40²) and Z2 ∼ Z3 ∼ N(100, 20²). Z1 and Z2 are correlated, while Z3 is independent of (Z1, Z2). Our purpose in this example is to investigate how a stress on the loss Z1 impacts the overall portfolio loss Y. First, we simulate data from the random vector (Z1, Z2, Z3, Y), forming our baseline model.
set.seed(0)
# number of simulated scenarios
n.sim <- 10 ^ 5
# correlation between Z1 and Z2
r <- 0.5
# simulation of Z1 and Z2
# constructed as a combination of independent standard normals U1, U2
U1 <- rnorm(n.sim)
U2 <- rnorm(n.sim)
Z1 <- 100 + 40 * U1
Z2 <- 100 + 20 * (r * U1 + sqrt(1 - r ^ 2) * U2)
# simulation of Z3
Z3 <- rnorm(n.sim, 100, 20)
# portfolio loss Y
Y <- Z1 + Z2 + Z3
# data of baseline model
dat <- data.frame(Z1, Z2, Z3, Y)
Now we introduce a stress to our baseline model. For our first stress, we require that the mean of Z1 is increased from 100 to 110. This is done using the stress function, which generates as output the SWIM object str.mean. This object stores the stressed model, i.e. the realisations of the model components and the scenario weights. In the function call, the argument k = 1 indicates that the stress is applied on the first column of dat, that is, on the realisations of the random variable Z1.
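A call of the following form generates the stressed model and the summary output shown below (a sketch; new_means holds the required stressed mean and base = TRUE includes the baseline model in the summary).
# stress the mean of Z1 (column k = 1) from 100 to 110
str.mean <- stress(type = "mean", x = dat, k = 1, new_means = 110)
# compare the baseline and stressed models
summary(str.mean, base = TRUE)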
## cols required_moment achieved_moment abs_error rel_error
## 1 1 110 110 1e-08 9.2e-11
## $base
## Z1 Z2 Z3 Y
## mean 1.0e+02 99.9404 99.9843 299.9811
## sd 4.0e+01 19.9969 19.9818 56.6386
## skewness -6.1e-04 0.0012 -0.0025 -0.0023
## ex kurtosis -1.1e-02 -0.0089 -0.0125 -0.0093
## 1st Qu. 7.3e+01 86.4745 86.4816 261.6121
## Median 1.0e+02 99.9866 100.0091 300.0548
## 3rd Qu. 1.3e+02 113.3957 113.4934 338.2670
##
## $`stress 1`
## Z1 Z2 Z3 Y
## mean 110.0000 102.4437 99.9828 312.4265
## sd 40.0331 19.9953 19.9761 56.6170
## skewness -0.0024 -0.0015 -0.0049 -0.0037
## ex kurtosis -0.0050 -0.0031 -0.0154 -0.0011
## 1st Qu. 82.9984 88.9771 86.4815 274.2200
## Median 110.0759 102.4810 99.9954 312.5039
## 3rd Qu. 136.9310 115.8744 113.5019 350.6120
The summary function, applied to the SWIM object str.mean, shows how the distributional characteristics of all random variables change from the baseline to the stressed model. In particular, we see that the mean of Z1 changes to its required value, while the mean of Y also increases. Furthermore, there is a small impact on Z2, due to its positive correlation to Z1.
Beyond considering the standard statistics evaluated via the summary function, stressed probability distributions can be plotted. In Figure @ref(fig:example1-cdfs-mean) we show the impact of the stress on the cumulative distribution functions (cdf) of Z1 and Y. It is seen how the stressed cdfs are lower than the original (baseline) ones. Loosely speaking, this demonstrates that the stress has increased (in a stochastic sense) both random variables Z1 and Y. While the stress was on Z1, the impact on the distribution of the portfolio loss Y is clearly visible.
# refer to variable of interest by name...
plot_cdf(str.mean, xCol = "Z1", base = TRUE)
# ... or column number
plot_cdf(str.mean, xCol = 4, base = TRUE)
The scenario weights, given their central role, can be extracted from a SWIM object. In Figure @ref(fig:example1-weights-mean), the scenario weights from str.mean are plotted against realisations from Z1 and Y respectively. It is seen how the weights are increasing in the realisations from Z1. This is a consequence of the weights’ derivation via a stress on the model component Z1; the increasing pattern shows that those scenarios for which Z1 is largest are assigned a higher weight. The relation between scenario weights and Y is still increasing (reflecting that high outcomes of Y tend to receive higher weights), but no longer deterministic (showing that Y is not completely driven by changes in Z1).
# parameter n specifies the number of scenario weights plotted
plot_weights(str.mean, xCol = "Z1", n = 1000)
# specifying the limits of the x-axis
plot_weights(str.mean, xCol = "Y", x_limits = c(90, 550), n = 1000)
The stress to the mean of Z1 did not impact the volatility of either Z1 or Y, as can be seen by the practically unchanged standard deviations in the output of summary(str.mean). Thus, we introduce an alternative stress that keeps the mean of Z1 fixed at 100, but increases its standard deviation from 40 to 50. This new stress is seen to impact the standard deviation of the portfolio loss Y.
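A call of the following form produces this second stressed model and the output shown below (a sketch; new_means and new_sd hold the required mean and standard deviation of Z1).
# keep the mean of Z1 at 100 and increase its standard deviation to 50
str.sd <- stress(type = "mean sd", x = dat, k = 1, new_means = 100, new_sd = 50)
summary(str.sd)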
## cols required_moment achieved_moment abs_error rel_error
## 1 1 100 100 3.0e-07 3.0e-09
## 2 1 12500 12500 5.2e-05 4.2e-09
## $`stress 1`
## Z1 Z2 Z3 Y
## mean 100.0000 99.941 99.9782 299.9187
## sd 50.0002 21.349 19.9799 67.9230
## skewness -0.0027 0.007 -0.0034 0.0049
## ex kurtosis -0.0556 -0.033 -0.0061 -0.0426
## 1st Qu. 66.0964 85.495 86.4822 253.7496
## Median 100.1290 99.974 100.0455 299.9766
## 3rd Qu. 133.7733 114.301 113.4701 345.9159
Furthermore, in Figure @ref(fig:example1-cdfs-sd), we compare the baseline and stressed cdfs of Z1 and Y, under the new stress on Z1. The crossing of the probability distributions reflects the increase in volatility.
The different way in which a stress on the standard deviation of Z1 impacts on the model, compared to a stress on the mean, is reflected by the scenario weights. Figure @ref(fig:example1-weights-sd) shows the pattern of the scenario weights and how, when stressing standard deviations, higher weight is placed on scenarios where Z1 is extreme, either much lower or much higher than its mean of 100.
Finally, we note that not all stresses one may wish to apply are feasible. Assume, for example, that we want to increase the mean of Z1 from 100 to 300, which exceeds the maximum realisation of Z1 in the baseline model. Then, clearly, no set of scenario weights can be found that yields the required mean for Z1; consequently an error message is produced.
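For instance, a call along the following lines triggers the error message shown below, while the last line prints the largest simulated value of Z1 (a sketch of how such an infeasible stress would be specified).
# an infeasible stress: the required mean exceeds the largest simulated Z1
stress(type = "mean", x = dat, k = 1, new_means = 300)
# largest realisation of Z1 in the baseline model (rounded)
round(max(Z1))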
## Error in stress_moment(x = x, f = means, k = as.list(k), m = new_means, :
##   Values in m must be in the range of f(x)
## [1] 273
We briefly introduce key concepts, using slightly more technical language compared to Section @ref(Sec:Intro). A model consists of a random vector of model components X = (X1, …, Xd) and a probability measure; we denote the probability measure of a baseline model by P and that of a stressed model by PW, where $W= \frac{dP^W}{dP}$, satisfying E(W) = 1 and W ≥ 0, is a Radon-Nikodym derivative. In a Monte Carlo simulation context, the probability space is discrete with n states Ω = {ω1, …, ωn}, each of which corresponds to a simulated scenario. To reconcile this formulation with the notation of Section @ref(Sec:Intro), we denote, for i = 1, …, n, j = 1, …, d, the realisations Xj(ωi) := xji and W(ωi) := wi; the latter are the scenario weights. Under the baseline model, each scenario has the same probability P(ωi) = 1/n, while under a stressed model it is PW(ωi) = W(ωi)/n = wi/n.
The stressed model thus arises from a change of measure from P to PW, which entails the application of scenario weights w1, …, wn on individual simulations. SWIM calculates scenario weights such that model components fulfil specific stresses, while the distortion to the baseline model is as small as possible when measured by the Kullback-Leibler divergence (or relative entropy). Mathematically, a stressed model is derived by solving
$$ \min_{W} \; E(W \log W), \quad \text{subject to } E(W)=1, \; W \geq 0, \text{ and constraints on } \mathbf{X} \text{ and } W \text{ that specify the stress}, $$
where $E(W \log W)$ is the Kullback-Leibler divergence of PW from P.
In what follows, we denote by a superscript W quantities of interest under the stressed model, such as FW, EW for the probability distribution and expectation under the stressed model, respectively. We refer to Silvana M. Pesenti, Millossovich, and Tsanakas (2019) and references therein for further mathematical details and derivations of solutions to @ref(eq:optimisation).
Table @ref(tab:Rfunstress) provides a collection of all implemented types of stresses in the SWIM package. The precise constraints of @ref(eq:optimisation) are explained below.
R function | Stress | type | Reference |
---|---|---|---|
stress | wrapper for the stress_type functions | | Sec. @ref(Rstress) |
stress_user | user defined scenario weights | user | Sec. @ref(Sec:User) |
stress_prob | probabilities of disjoint intervals | prob | Eq. @ref(eq:optimisationprob) |
stress_mean | means | mean | Eq. @ref(eq:optimisationmean) |
stress_mean_sd | means and standard deviations | mean sd | Eq. @ref(eq:optimisationmeansd) |
stress_moment | moments (of functions) | moment | Eq. @ref(eq:optimisationmoment) |
stress_VaR | VaR risk measure (quantile) | VaR | Eq. @ref(eq:optimisationVaR) |
stress_VaR_ES | VaR and ES risk measures | VaR ES | Eq. @ref(eq:optimisationVaRES) |
The solutions to the optimisations @ref(eq:optimisationprob) and @ref(eq:optimisationVaR) are worked out fully analytically (Silvana M. Pesenti, Millossovich, and Tsanakas 2019), whereas problems @ref(eq:optimisationmean), @ref(eq:optimisationmeansd), @ref(eq:optimisationmoment) and @ref(eq:optimisationVaRES) require some root-finding. Specifically, problems @ref(eq:optimisationmean), @ref(eq:optimisationmeansd) and @ref(eq:optimisationmoment) rely on the package nleqslv, whereas @ref(eq:optimisationVaRES) uses the uniroot function.
The stress function and the SWIM object

The stress function is a wrapper for the stress_type functions, where stress(type = "type", ) and stress_type are equivalent. The stress function solves optimisation @ref(eq:optimisation) for constraints specified through type and returns a SWIM object, that is, a list including the elements shown in Table @ref(tab:SWIMobject):
R output | Description |
---|---|
x | realisations of the model |
new_weights | scenario weights |
type | type of stress |
specs | details about the stress |
The data frame containing the realisations of the baseline model, x in the above table, can be extracted from a SWIM object using get_data. Similarly, get_weights and get_weightsfun provide the scenario weights, respectively the functions that, when applied to x, generate the scenario weights. The details of the applied stress can be obtained using get_specs.
Stressing probabilities of disjoint intervals allows defining stresses by altering the probabilities of events pertaining to a model component. The scenario weights are calculated via stress_prob, or equivalently stress(type = "prob", ), and the disjoint intervals are specified through the lower and upper arguments, the endpoints of the intervals. Specifically, for a model component Xj, disjoint intervals B1, …, BK with P(Xj ∈ Bk) > 0 for all k = 1, …, K, and α1, …, αK > 0 such that α1 + … + αK ≤ 1, stress_prob solves @ref(eq:optimisation) with the constraints PW(Xj ∈ Bk) = αk, k = 1, …, K.
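As an illustration, the following sketch stresses the probability of a single interval for Z1 in the example of Section @ref(Sec:Intro); lower and upper are the interval endpoints discussed above, while the name of the argument holding the stressed probabilities (alpha below) is an assumption to be checked against the package documentation.
# increase the probability that Z1 lies in (160, 200] to 10%
str.prob <- stress(type = "prob", x = dat, k = 1, lower = 160, upper = 200, alpha = 0.1)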
The functions stress_mean, stress_mean_sd and stress_moment provide stressed models with moment constraints. The function stress_mean returns a stressed model that fulfils constraints on the first moments of model components. Specifically, stress_mean solves @ref(eq:optimisation) with the constraints EW(Xj) = mj, j ∈ J, where J is a subset of {1, …, d}.
The values mj are specified in the stress_mean function through the argument new_means. The stress_mean_sd function allows the simultaneous stressing of the means and standard deviations of model components. Specifically, stress_mean_sd solves @ref(eq:optimisation) with the constraints that, for j ∈ J, the mean and standard deviation of Xj under the stressed model equal mj and sj respectively, where J is a subset of {1, …, d}. The values mj and sj are specified in the stress_mean_sd function through the arguments new_means and new_sd respectively. The functions stress_mean and stress_mean_sd are special cases of the general stress_moment function, which allows for stressed models with constraints on functions of the (joint) moments of model components. Specifically, for k = 1, …, K, subsets Jk of {1, …, d} and functions fk: ℝ|Jk| → ℝ, stress_moment solves @ref(eq:optimisation) with the constraints EW(fk(XJk)) = mk, k = 1, …, K, where XJk is the subvector of model components with indices in Jk.
Note that stress_moment not only allows the definition of constraints on higher moments of model components, but also the construction of constraints that apply to multiple model components simultaneously. For example, the stress EW(XhXl) = mk is achieved by setting fk(xh, xl) = xhxl in @ref(eq:optimisationmoment) above. The functions stress_mean, stress_mean_sd and stress_moment can be applied to multiple model components and are the only stress functions whose scenario weights are calculated via numerical optimisation, using the nleqslv package. Thus, depending on the choice of constraints, existence or uniqueness of a stressed model is not guaranteed.
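As a sketch of such a joint-moment stress, continuing the example of Section @ref(Sec:Intro), the following call requires the cross moment EW(Z1Z2) to equal an illustrative target of 10800 (the baseline value of E(Z1Z2) is roughly 10400); the list structure of f and k mirrors the stress_moment usage shown in Section @ref(Sec:CreditModel).
# stress the cross moment E(Z1 * Z2) to an illustrative target value
str.cross <- stress_moment(x = dat, f = list(function(x) x[1] * x[2]),
                           k = list(c(1, 2)), m = 10800)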
The functions stress_VaR and stress_VaR_ES provide stressed models under which a model component fulfils a stress on the risk measures Value-at-Risk (VaR) and/or Expected Shortfall (ES). The VaR at level 0 < α < 1 of a random variable Z with distribution F is defined as its left-inverse evaluated at α, that is, VaRα(Z) = F−1(α) = inf {y ∈ ℝ | F(y) ≥ α}. The ES at level 0 < α < 1 of a random variable Z is given by $$\text{ES}_\alpha(Z) =\frac{1}{1-\alpha}\int_\alpha^1 \text{VaR}_u(Z) \mathrm{d}u.$$ The details of the constraints that stress_VaR and stress_VaR_ES solve are as follows:
For 0 < α < 1 and q, s such that q < s, stress_VaR solves @ref(eq:optimisation) with the constraint PW(Xj ≤ q) = α, that is, the VaR of Xj at level α under the stressed model equals q, while stress_VaR_ES solves @ref(eq:optimisation) with the constraints that the VaR and the ES of Xj at level α under the stressed model equal q and s respectively.
Note that, since SWIM works with discrete distributions, the exact required VaR may not be achievable. In that case, stress_VaR will return scenario weights inducing the largest quantile in the dataset that is smaller than or equal to the required VaR (i.e. q); this guarantees that PW(Xj ≤ q) = α.
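For example, a 10% increase in the 90% VaR of Z1 from the example of Section @ref(Sec:Intro) could be specified as follows (a sketch; alpha is the VaR level and q_ratio the proportional increase).
# increase the 90% VaR of Z1 by 10%
str.var <- stress(type = "VaR", x = dat, k = 1, alpha = 0.9, q_ratio = 1.1)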
The option type = "user" allows the generation of a SWIM object with scenario weights defined by the user. The scenario weights can be provided directly via the new_weights argument or through a list of functions, new_weightsfun, that, applied to the data x, generate the scenario weights.
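A minimal sketch of a user-defined stress, reusing the data of Section @ref(Sec:Intro): the weights below are increasing in Z1 and normalised to average 1, so that scenarios with large Z1 receive more weight.
# user-defined scenario weights, increasing in Z1 and averaging 1
w.user <- rank(dat$Z1)
w.user <- w.user / mean(w.user)
str.user <- stress(type = "user", x = dat, new_weights = w.user)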
Table @ref(tab:Ranalysis) provides a complete list of all implemented R functions in SWIM for analysing stressed models, which are described below in detail.
R function | Analysis of stressed models |
---|---|
summary | summary statistics |
cdf | cumulative distribution function |
quantile_stressed | quantile function |
VaR_stressed | VaR |
ES_stressed | ES |
sensitivity | sensitivity measures |
importance_rank | importance ranks |
plot_cdf | plots cumulative distribution functions |
plot_quantile | plots quantile functions |
plot_weights | plots scenario weights |
plot_hist | plots histograms |
plot_sensitivity | plots sensitivity measures |
The SWIM package contains functions to compare the distribution of model components under different (stressed) models. The function summary is a method for an object of class SWIM and provides summary statistics of the baseline and stressed models. If the SWIM object contains more than one set of scenario weights, each corresponding to one stressed model, the summary function returns for each set of scenario weights a list containing the elements shown in Table @ref(tab:summary).
R output | Description |
---|---|
mean | sample mean |
sd | sample standard deviation |
skewness | sample skewness |
ex kurtosis | sample excess kurtosis |
1st Qu. | 25% quantile |
Median | median, 50% quantile |
3rd Qu. | 75% quantile |
The empirical distribution function of model components under a stressed model2 can be calculated using the cdf function of the SWIM package, applied to a SWIM object. To calculate sample quantiles of stressed model components, the function quantile_stressed can be used. The functions VaR_stressed and ES_stressed provide the stressed VaR and ES of model components, which are of particular interest for stressed models resulting from constraints on risk measures, see Section @ref(Sec:RiskMeasures). (While quantile_stressed works very similarly to the base R function quantile, VaR_stressed provides better capabilities for comparing different models and model components.)
Implemented visualisations of distribution functions are plot_cdf, for plotting empirical distribution functions, and plot_hist, for plotting histograms of model components under different (stressed) models.
Comparisons of the baseline and stressed models, and of how model components change under different models, are typically performed via sensitivity measures. The SWIM package contains the sensitivity function, which calculates sensitivity measures of stressed models and model components. The implemented sensitivity measures, summarised in the table below, are the Wasserstein, Kolmogorov and Gamma sensitivity measures; see Silvana M. Pesenti, Millossovich, and Tsanakas (2016), Silvana M. Pesenti, Millossovich, and Tsanakas (2019), and Emmer, Kratz, and Tasche (2015).
Metric | Definition |
---|---|
Wasserstein | $\int \lvert F_X^W(x) - F_X(x) \rvert \, dx$ |
Kolmogorov | $\sup_x \lvert F_X^W(x) - F_X(x) \rvert$ |
Gamma | $\frac{E^W(X) - E(X)}{c}$, for a normalisation $c$ |
The Gamma sensitivity measure is normalised such that it takes values between -1 and 1, with higher positive (negative) values corresponding to a larger positive (negative) impact of the stress on the particular model component. The sensitivity measures can be plotted using plot_sensitivity. The function importance_rank returns the ranks of model components according to the chosen sensitivity measure.
The credit model in this section is a conditionally binomial loan portfolio model including systematic and specific portfolio risk. We refer to Appendix @ref(AppendixCM) for details about the model and the generation of the simulated data. A key variable of interest is the total aggregate portfolio loss L = L1 + L2 + L3, where L1, L2, L3 are homogeneous subportfolios on a comparable scale (say, thousands of $). The dataset contains 100,000 simulations of the portfolio L, the subportfolios L1, L2, L3 as well as the random default probabilities within each subportfolio, H1, H2, H3. These default probabilities represent the systematic risk within each subportfolio, while their dependence structure represents a systematic risk effect between the subportfolios.
The simulated data of the credit risk portfolio are included in the SWIM package and can be accessed via data("credit_data"). A snippet of the dataset looks as follows:
## L L1 L2 L3 H1 H2 H3
## [1,] 692 0 346.9 345 1.24e-04 0.00780 0.0294
## [2,] 1006 60 515.6 430 1.16e-03 0.01085 0.0316
## [3,] 1661 0 806.2 855 5.24e-04 0.01490 0.0662
## [4,] 1708 0 937.5 770 2.58e-04 0.02063 0.0646
## [5,] 807 0 46.9 760 8.06e-05 0.00128 0.0632
## [6,] 1159 20 393.8 745 2.73e-04 0.00934 0.0721
In this section, following a reverse sensitivity approach, we study the effects that stresses on (the tail of) the aggregate portfolio loss L have on the three subportfolios; thus assessing their comparative importance. First, we impose a 20% increase on the VaR at level 90% of the portfolio loss.
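A call of the following form performs this stress; it is a sketch consistent with the arguments discussed below, and stores the result in the SWIM object stress.credit used throughout the remainder of this section.
# 20% increase in the 90% VaR of the aggregate portfolio loss L
stress.credit <- stress(type = "VaR", x = credit_data, k = "L", alpha = 0.9, q_ratio = 1.2)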
## Stressed VaR specified was 2174.25 , stressed VaR achieved is 2173.75
The 20% increase was specified by setting the q_ratio argument to 1.2 – alternatively, the argument q can be set to the actual value of the stressed VaR. Using the function VaR_stressed, we can quantify how tail quantiles of the aggregate portfolio loss change when moving from the baseline to the stressed model. We observe that the increase in the VaR of the portfolio loss changes its tail quantiles more broadly; thus the stress on VaR also induces an increase in ES. The implemented functions VaR_stressed and ES_stressed calculate VaR and ES respectively; the argument alpha specifies the levels of VaR and ES, while the stressed model under which the risk measures are calculated can be chosen using wCol (by default equal to 1).
VaR_stressed(object = stress.credit, alpha = c(0.75, 0.9, 0.95, 0.99), xCol = "L", wCol = 1, base = TRUE)
## L base L
## 75% 1506 1399
## 90% 2174 1812
## 95% 2426 2085
## 99% 2997 2671
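The corresponding stressed ES, reported below, can be obtained with a call analogous to VaR_stressed, for instance (a sketch):
ES_stressed(object = stress.credit, alpha = 0.9, xCol = "L", wCol = 1, base = TRUE)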
## L base L
## 90% 2535 2191
As a second stress, we consider, in addition to the 20% increase in the VaR0.9, an increase in ES0.9 of the portfolio loss L. When stressing VaR and ES together via stress_VaR_ES, both VaR and ES need to be stressed at the same level, here alpha = 0.9. We observe that when stressing the VaR alone, ES increases to 2535. For the second stress we want a greater impact on ES, thus we require that the stressed ES be equal to 3500. This can be achieved by specifying the argument s, which is the stressed value of ES (rather than s_ratio, the proportional increase).
stress.credit <- stress(type = "VaR ES", x = stress.credit, k = "L", alpha = 0.9, q_ratio = 1.2, s = 3500)
## Stressed VaR specified was 2174.25 , stressed VaR achieved is 2173.75
When applying the stress function or one of its alternative versions to a SWIM object rather than to a data frame (via x = stress.credit in the example above), the result will be a new SWIM object with the new stress “appended” to existing stresses. This is convenient when large datasets are involved, as the stress function returns an object containing the original simulated data and the scenario weights. Note, however, that this only works if the underlying data are exactly the same.
The summary function provides a statistical summary of the stressed models. Choosing base = TRUE compares the stressed models with the baseline model.
## $base
## L L1 L2 L3 H1 H2 H3
## mean 1102.914 19.96 454.04 628.912 0.000401 0.00968 0.0503
## sd 526.536 28.19 310.99 319.713 0.000400 0.00649 0.0252
## skewness 0.942 2.10 1.31 0.945 1.969569 1.30836 0.9501
## ex kurtosis 1.326 6.21 2.52 1.256 5.616080 2.49803 1.2709
## 1st Qu. 718.750 0.00 225.00 395.000 0.000115 0.00490 0.0318
## Median 1020.625 0.00 384.38 580.000 0.000279 0.00829 0.0464
## 3rd Qu. 1398.750 20.00 609.38 810.000 0.000555 0.01296 0.0643
##
## $`stress 1`
## L L1 L2 L3 H1 H2 H3
## mean 1193.39 20.83 501.10 671.46 0.000417 0.01066 0.0536
## sd 623.48 29.09 363.57 361.21 0.000415 0.00756 0.0285
## skewness 1.01 2.09 1.36 1.02 1.973367 1.35077 1.0284
## ex kurtosis 0.94 6.14 2.24 1.22 5.630325 2.23363 1.2383
## 1st Qu. 739.38 0.00 234.38 405.00 0.000120 0.00512 0.0328
## Median 1065.62 20.00 412.50 605.00 0.000290 0.00878 0.0483
## 3rd Qu. 1505.62 40.00 675.00 865.00 0.000578 0.01422 0.0688
##
## $`stress 2`
## L L1 L2 L3 H1 H2 H3
## mean 1289.90 21.70 558.27 709.93 0.000437 0.01180 0.0566
## sd 875.89 30.57 507.78 447.30 0.000448 0.01045 0.0351
## skewness 1.90 2.17 2.10 1.57 2.090457 2.10131 1.5385
## ex kurtosis 3.67 6.74 4.79 2.80 6.203613 4.97016 2.6143
## 1st Qu. 739.38 0.00 234.38 405.00 0.000123 0.00512 0.0328
## Median 1065.62 20.00 412.50 605.00 0.000297 0.00879 0.0484
## 3rd Qu. 1505.62 40.00 675.00 875.00 0.000594 0.01439 0.0697
Note that stress 1 is the summary output corresponding to the 20% increase in the VaR, while stress 2 corresponds to the stress in both VaR and ES. The information on individual stresses can be recovered through the get_specs function, and the actual scenario weights using get_weights. Since the SWIM object stress.credit contains two stresses, the scenario weights returned by get_weights form a data frame consisting of two columns, corresponding to stress 1 and to stress 2, respectively.
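For instance, the stress specifications reported below can be retrieved as follows, and the scenario weights extracted for further use (a sketch):
get_specs(stress.credit)
# scenario weights of both stresses, returned as a two-column data frame
w.credit <- get_weights(stress.credit)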
## type k alpha q s
## stress 1 VaR L 0.9 2173.75 <NA>
## stress 2 VaR ES L 0.9 2173.75 3500
Next, we produce a scatter plot of the scenario weights against the portfolio loss L. As the number of scenario weights is large, we plot only a subset of the data points. This can be achieved via the parameter n of the function plot_weights, which has a default of n = 5000.
plot_weights(stress.credit, xCol = "L", wCol = 1, n = 2000)
# parameter `wCol` specifies the stress whose scenario weights are plotted
plot_weights(stress.credit, xCol = "L", wCol = 2, n = 7000)
It is seen in Figure @ref(fig:credit-weights) that the weights generated to stress VaR, and VaR and ES together, follow different patterns to the weights used to stress means and standard deviations, as shown in Section @ref(Sec:Intro). Recall that SWIM calculates the scenario weights such that under the stressed model the given constraints are fulfilled. Thus, an increase in the VaR and/or ES of the portfolio loss L results in large positive realisations of L being assigned higher weight. On the other hand, when the standard deviation is stressed, scenario weights are calculated that inflate the probabilities of both large positive and negative values.
The change in the distributions of the portfolio and subportfolio losses, when moving from the baseline to the stressed models, can be visualised through the functions plot_hist and plot_cdf. The following figure displays the histogram of the aggregate portfolio loss under the baseline and the two stressed models. It is seen how stressing VaR and ES has a higher impact on the right tail of L, compared to stressing VaR only. This is consistent with the tail-sensitive nature of the Expected Shortfall risk measure (McNeil, Frey, and Embrechts 2015).
The arguments xCol and wCol (with default to plot all stresses) define the columns of the data and of the scenario weights, respectively, that are used for plotting. Next, we analyse the impact that stressing the aggregate loss L has on the subportfolios L1, L2, L3.
Again, we use the functions plot_hist and plot_cdf for visual comparison, but this time placing the distribution plots and histograms of subportfolio losses next to each other via the function ggarrange (from the package ggpubr). The plots obtained from plot_hist and plot_cdf can be further customised by specifying the argument displ = FALSE, as then the graphical functions plot_hist and plot_cdf return data frames compatible with the package ggplot2.
pL1.cdf <- plot_cdf(object = stress.credit, xCol = 2, wCol = "all", base = TRUE)
pL2.cdf <- plot_cdf(object = stress.credit, xCol = 3, wCol = "all", base = TRUE)
pL3.cdf <- plot_cdf(object = stress.credit, xCol = 4, wCol = "all", base = TRUE)
pL1.hist <- plot_hist(object = stress.credit, xCol = 2, wCol = "all", base = TRUE)
pL2.hist <- plot_hist(object = stress.credit, xCol = 3, wCol = "all", base = TRUE)
pL3.hist <- plot_hist(object = stress.credit, xCol = 4, wCol = "all", base = TRUE)
ggarrange(pL1.cdf, pL1.hist, pL2.cdf, pL2.hist, pL3.cdf, pL3.hist, ncol = 2, nrow = 3,
common.legend = TRUE)
It is seen from both the distribution plots and the histograms in Figures @ref(fig:CM-histL) and @ref(fig:CM-plot1) that the stresses have no substantial impact on L1, while L2 and L3 are more affected, indicating a higher sensitivity. The higher impact on the tails of stress 2 (on both VaR and ES) is also visible. Sensitivity measures quantifying these effects are introduced in the following subsection.
The impact of the stressed models on the model components can be quantified through sensitivity measures. The function sensitivity includes the Kolmogorov distance, the Wasserstein distance, and the sensitivity measure Gamma; the choice of measure is made via the argument type. We refer to Section @ref(Sec:analysis) for the definitions of those sensitivity measures. The Kolmogorov distance is useful for comparing different stressed models. Calculating the Kolmogorov distance, we observe that stress 2 produces a larger Kolmogorov distance compared to stress 1, which reflects the additional stress on the ES for the stressed model stress 2.
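A call of the following form produces the Kolmogorov distances shown below (a sketch; wCol = "all" is assumed to select all stresses, in line with the plotting functions):
sensitivity(object = stress.credit, xCol = "L", wCol = "all", type = "Kolmogorov")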
## stress type L
## 1 stress 1 Kolmogorov 0.0607
## 2 stress 2 Kolmogorov 0.0748
We now rank the sensitivities of model components by the measure Gamma, for each stressed model. Consistently with what the distribution plots showed, L2 is the most sensitive subportfolio, followed by L3 and L1. The respective default probabilities H1, H2, H3 are similarly ranked.
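A call along these lines yields the Gamma sensitivities shown below (a sketch; columns 2 to 7 of the credit data contain L1, L2, L3, H1, H2, H3):
sensitivity(object = stress.credit, xCol = c(2:7), wCol = "all", type = "Gamma")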
## stress type L1 L2 L3 H1 H2 H3
## 1 stress 1 Gamma 0.150 0.819 0.772 0.196 0.811 0.767
## 2 stress 2 Gamma 0.113 0.734 0.639 0.171 0.708 0.636
Using the sensitivity function, we can analyse whether the sensitivity of the joint subportfolio L1 + L3 exceeds the sensitivity of the (most sensitive) subportfolio L2. This can be accomplished by specifying, through the argument f, a list of functions applicable to the columns k of the dataset. By setting xCol = NULL, only the transformed data is considered. The sensitivity measure of functions of columns of the data is particularly useful when high-dimensional models are considered, as it provides a way to compare the sensitivity of blocks of model components.
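A sketch of such a call is given below: the function f adds the columns of L1 and L3 (columns 2 and 4 of the data), and only the first stress is evaluated; the exact argument structure of f and k is assumed to mirror that of stress_moment and should be checked against the package documentation.
sensitivity(object = stress.credit, type = "Gamma", wCol = 1, xCol = NULL,
            f = list(function(x) x[1] + x[2]),  # joint subportfolio L1 + L3
            k = list(c(2, 4)))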
## stress type f1
## 1 stress 1 Gamma 0.783
We observe that the sensitivity of L1 + L3 is larger than the sensitivity of either L1 or L3, reflecting the positive dependence structure of the credit risk portfolio. Nonetheless, subportfolio L2 not only has the largest sensitivity compared to L1 and L3, but also a higher sensitivity than the combined subportfolios L1 + L3.
The importance_rank function, which has the same structure as the sensitivity function, returns the ranks of the sensitivity measures. This function is particularly useful when several risk factors are involved.
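For instance, the ranking shown below could be obtained via (a sketch):
importance_rank(object = stress.credit, xCol = c(2:7), wCol = 1, type = "Gamma")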
## stress type L1 L2 L3 H1 H2 H3
## 1 stress 1 Gamma 6 1 3 5 2 4
From the preceding analysis, it transpires that the subportfolios L2 and L3 are, in that order, most responsible for the stress in the portfolio loss, under both stresses considered. Furthermore, most of the sensitivity seems to be attributable to the systematic risk components H2 and H3, reflected by their high values of the Gamma measure. To investigate this, we perform another stress, resulting once again in a 20% increase in VaR(L), but this time fixing some elements of the distribution of H2. Specifically, in addition to the 20% increase in VaR(L), we fix the mean and the 75% quantile of H2 to the same values as in the baseline model. This set of constraints is implemented via the function stress_moment.
# 90% VaR of L under the baseline model
VaR.L <- quantile(x = credit_data[, "L"], prob = 0.9, type = 1)
# 75th quantile of H2 under the baseline model
q.H2 <- quantile(x = credit_data[, "H2"], prob = 0.75, type = 1)
# columns to be stressed (L, H2, H2)
k.stressH2 = list(1, 6, 6)
# functions to be applied to columns
f.stressH2 <- list(
# indicator function for L, for stress on VaR
function(x)1 * (x <= VaR.L * 1.2),
# mean of H2
function(x)x,
# indicator function for 75th quantile of H2
function(x)1 * (x <= q.H2))
# constraint values: probability 0.9 for the stressed VaR of L,
# unchanged mean of H2, and probability 0.75 for its 75th quantile
m.stressH2 = c(0.9, mean(credit_data[, "H2"]), 0.75)
stress.credit <- stress_moment(x = stress.credit, f = f.stressH2, k = k.stressH2,
m = m.stressH2)
## cols required_moment achieved_moment abs_error rel_error
## 1 1 0.90000 0.90000 7.62e-09 8.47e-09
## 2 6 0.00968 0.00968 7.59e-11 7.84e-09
## 3 6 0.75000 0.75000 3.80e-10 5.06e-10
Using the summary function, we verify that the distribution of H2 under the new stress has unchanged mean and 75th quantile. Then we compare the sensitivities of the subportfolio losses under all three stresses applied.
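The following sketch outlines how the output below could be produced; the use of xCol and wCol inside summary is an assumption and should be checked against the package documentation.
# summary of H2 under the baseline model and the third stress
summary(stress.credit, base = TRUE, xCol = "H2", wCol = 3)
# Gamma sensitivities of the subportfolio losses under all three stresses
sensitivity(object = stress.credit, xCol = c(2:4), wCol = "all", type = "Gamma")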
## $base
## H2
## mean 0.00968
## sd 0.00649
## skewness 1.30836
## ex kurtosis 2.49803
## 1st Qu. 0.00490
## Median 0.00829
## 3rd Qu. 0.01296
##
## $`stress 3`
## H2
## mean 0.00968
## sd 0.00706
## skewness 1.39137
## ex kurtosis 2.26516
## 1st Qu. 0.00453
## Median 0.00786
## 3rd Qu. 0.01296
## stress type L1 L2 L3
## 1 stress 1 Gamma 0.1501 0.8195 0.772
## 2 stress 2 Gamma 0.1131 0.7336 0.639
## 3 stress 3 Gamma 0.0102 0.0203 0.366
It is seen that, by fixing part of the distribution of H2, the importance ranking of the subportfolios changes, with L2 now being significantly less sensitive than L3. This confirms, in the credit risk model, the dominance of the systematic risk reflected in the randomness of default probabilities.
Up to now, we have considered the impact of stressing the aggregate portfolio loss on subportfolios. Now, following a forward sensitivity approach, we consider the opposite situation: stressing the subportfolio losses and monitoring the impact on the aggregate portfolio loss L. First, we impose a stress requiring a simultaneous 20% increase in the 90th quantile of the losses in subportfolios L2 and L3. Note that the functions stress_VaR (and stress_VaR_ES) allow stressing the VaR and/or the ES of only one model component. Thus, to induce a stress on the 90th quantiles of L2 and L3, we use the function stress_moment and interpret the quantile constraints as moment constraints, via EW(1L2 ≤ VaRW(L2)) = 0.9 and EW(1L3 ≤ VaRW(L3)) = 0.9, respectively, where VaRW = VaR ⋅ 1.2 denotes the VaR in the stressed model.
# VaR of L2 and L3, respectively
VaR.L2 <- quantile(x = credit_data[, "L2"], prob = 0.9, type = 1)
VaR.L3 <- quantile(x = credit_data[, "L3"], prob = 0.9, type = 1)
#stressing VaR of L2 and L3
f.stress <- list(function(x)1 * (x <= VaR.L2 * 1.2),
function(x)1 * (x <= VaR.L3 * 1.2))
stress.credit.L2L3 <- stress_moment(x = credit_data, f = f.stress, k = list(3, 4),
m = c(0.9, 0.9))
## cols required_moment achieved_moment abs_error rel_error
## 1 3 0.9 0.9 -1.13e-09 -1.26e-09
## 2 4 0.9 0.9 -1.70e-10 -1.89e-10
#impact on portfolio tail
VaR_stressed(stress.credit.L2L3, alpha = c(0.75, 0.9, 0.95, 0.99), xCol = "L",
base = TRUE)
## L base L
## 75% 1556 1399
## 90% 2086 1812
## 95% 2423 2085
## 99% 3072 2671
It is seen how the stressing of subportfolios L2 and L3 has a substantial impact on the portfolio loss. Given the importance of dependence for the distribution of the aggregate loss of the portfolio, we strengthen this stress further by additionally requiring that the frequency of joint high losses from L2 and L3 is increased. Specifically, we require the joint exceedance probability to be PW(L2 > VaRW(L2), L3 > VaRW(L3)) = 0.06, almost double the corresponding probability in the last stressed model, which was equal to 0.0308.
# probability of joint exceedance under the baseline model
mean(1 * (credit_data[, "L2"] > VaR.L2 * 1.2) * (credit_data[, "L3"] > VaR.L3 * 1.2))
## [1] 0.00865
# probability of joint exceedance under the stressed model
mean(get_weights(stress.credit.L2L3) * (credit_data[, "L2"] > VaR.L2 * 1.2) * (credit_data[, "L3"] > VaR.L3 * 1.2))
## [1] 0.0308
# additionally stress joint exceedance probability of L2 and L3
f.stress.joint <- c(f.stress, function(x)1 * (x[1] > VaR.L2 * 1.2) * (x[2] > VaR.L3 * 1.2))
stress.credit.L2L3 <- stress_moment(x = stress.credit.L2L3, f = f.stress.joint, k = list(3,4,c(3, 4)), m = c(0.9, 0.9, 0.06))
## cols required_moment achieved_moment abs_error rel_error
## 1 3 0.90 0.90 1.96e-11 2.18e-11
## 2 4 0.90 0.90 2.72e-11 3.03e-11
## 3 c(3, 4) 0.06 0.06 -2.55e-10 -4.26e-09
We analyse the impact that the stresses on the tails of the subportfolios L2 and L3 have on the aggregate portfolio loss L. For this, we plot in Figure @ref(fig:CM-joint-stress-VaR-2-effect) the quantile function of the aggregate portfolio loss under the baseline model (blue), under the stress on the tails of L2 and L3 (red), and under the additional stress on the joint tail of L2 and L3 (green).
The results and the plots indicate that the additional stress on joint exceedances of the subportfolios increases the tail quantiles of L even further, demonstrating the importance of (tail-)dependence in portfolio risk management.
The SWIM package enables users to perform flexible and efficient sensitivity analyses of simulation models, using the method of stressing model components by re-weighting simulated scenarios. Multiple possibilities were demonstrated, from prioritising risk factors via reverse stress testing, to evaluating the impact on a portfolio distribution of increasing the probability of subportfolios’ joint exceedances. The implemented analysis and visualisation tools help users derive insights into their models and perform formal comparisons of the importance of model components. Since SWIM does not require re-simulation from the model, these sensitivity analyses have a low computational cost; moreover, they can be performed on black-box models.
Future work includes enhancing analysis tools, for example by including a plot_weights function that enables the seamless visualisation and comparison of scenario weights arising from different stresses, as well as functions that will make it easier to extract distributional characteristics of stressed models – e.g. a stressed_cor function that enables the monitoring of correlation changes when models are stressed. Furthermore, we consider including alternative ways of calculating scenario weights, such as minimising the χ² divergence instead of the Kullback-Leibler divergence.
This vignette was written in R (R Core Team 2019) using the packages SWIM (Silvana M. Pesenti et al. 2020) and bookdown (Xie 2019).
The credit risk portfolio of Section @ref(Sec:CreditModel) is based on the conditionally binomial credit model described in Section 11.2 of McNeil, Frey, and Embrechts (2015), which belongs to the family of mixture models. Specifically, we consider a portfolio that consists of three homogeneous subportfolios and denote the aggregate portfolio loss by L = L1 + L2 + L3, with L1, L2, L3 the losses of each subportfolio, given by
$$ L_i = e_i \cdot \mathrm{LGD}_i \cdot M_i, \quad i = 1, 2, 3, $$
where ei and Mi are the exposure and number of defaults of the ith subportfolio, respectively, and LGDi is the loss given default of subportfolio i.
Mi is binomially distributed, conditional on Hi, a random common default probability; specifically, Mi|Hi ∼ Binomial(mi, Hi), where mi is the subportfolio size. The His follow Beta distributions with parameters chosen so as to match given overall unconditional default probabilities pi and default correlations ρi, that is, the correlation between (the indicators of) two default events within a subportfolio; see McNeil, Frey, and Embrechts (2015). The dependence structure of (H1, H2, H3) is modelled via a Gaussian copula with correlation matrix
$$ \begin{pmatrix} 1 & 0.3 & 0.1 \\ 0.3 & 1 & 0.4 \\ 0.1 & 0.4 & 1 \end{pmatrix}. $$
Table @ref(tab:credit) summarises the parameter values used in the simulation.
$i$ | $m_i$ | $e_i$ | $p_i$ | $\rho_i$ | $\mathrm{LGD}_i$ |
---|---|---|---|---|---|
1 | 2500 | 80 | 0.0004 | 0.00040 | 0.250 |
2 | 5000 | 25 | 0.0097 | 0.00440 | 0.375 |
3 | 2500 | 10 | 0.0503 | 0.01328 | 0.500 |
set.seed(1)
library(copula)
nsim <- 100000
# counterparties subportfolio 1, 2 and 3
m1 <- 2500
m2 <- 5000
m3 <- 2500
# prob of default for subportfolios 1, 2 and 3
p1 <- 0.0004
p2 <- 0.0097
p3 <- 0.0503
# correlation between default probabilities
rho1 <- 0.0004
rho2 <- 0.0044
rho3 <- 0.01328
# exposures
e1 <- 80
e2 <- 25
e3 <- 10
# loss given default
LGD1 <- 0.25
LGD2 <- 0.375
LGD3 <- 0.5
# beta parameters: matching subportfolios default probabilities and correlation
alpha1 <- p1 * (1 / rho1 - 1)
beta1 <- alpha1 * (1 / p1 - 1)
alpha2 <- p2 * (1 / rho2 - 1)
beta2 <- alpha2 * (1 / p2 - 1)
alpha3 <- p3 * (1 / rho3 - 1)
beta3 <- alpha3 * (1 / p3 - 1)
# correlations between subportfolios
cor12 <- 0.3
cor13 <- 0.1
cor23 <- 0.4
# Gaussian copula structure
myCop <- normalCopula(param = c(cor12, cor13, cor23), dim = 3, dispstr = "un")
# multivariate beta with given copula
myMvd <- copula::mvdc(copula = myCop,
margins = c("beta", "beta", "beta"),
paramMargins = list(list(alpha1, beta1),
list(alpha2, beta2),
list(alpha3, beta3)))
# simulation from the chosen copula
H <- copula::rMvdc(nsim, myMvd)
# simulate the number of defaults per subportfolio (binomial distributions)
M1 <- rbinom(n = nsim, size = m1, prob = H[, 1])
M2 <- rbinom(n = nsim, size = m2, prob = H[, 2])
M3 <- rbinom(n = nsim, size = m3, prob = H[, 3])
# total loss per subportfolio
L1 <- M1 * e1 * LGD1
L2 <- M2 * e2 * LGD2
L3 <- M3 * e3 * LGD3
# aggregate portfolio loss
L <- L1 + L2 + L3
# the credit data included in SWIM
credit_data <- cbind(L, L1, L2, L3, H)
colnames(credit_data) <- c("L", "L1", "L2", "L3", "H1", "H2", "H3")
Correspondence to Silvana Pesenti, Department of Statistical Sciences, University of Toronto, Canada. [email protected]↩︎
Note that the functions implementing the empirical cdf and the quantile, ecdf and quantile, will not return the empirical distribution function or the quantile function under a stressed model.↩︎