Information

This R Markdown document was created as part of a group assignment for SMM068 at Bayes Business School, City St George's, University of London in Term 2 2025-26.

Bonus Interactive Dashboard Available

This static report is accompanied by a live R Shiny dashboard allowing one to use a selection of 16 US stocks in different 5 portfolio combinations.

Launch Interactive Dashboard

1 Downloading the data

  • Information resource used: Yahoo finance
  • URL: https://finance.yahoo.com (accessed 2026-01-28)
  • Name and ticker code for stocks (in alphabetical order):
    • AIG (American International Group, Inc)
    • GS (The Goldman Sachs Group, Inc.)
    • JNJ (Johnson & Johnson)
    • MSFT (Microsoft)
    • NEE (NextEra Energy, Inc)
  • Date range: 1 January 2023 – 31 December 2025
  • Price label used: Adjusted closing price (to account for corporate actions such as splits/dividends)
  • Rationale: The five stocks were chosen to provide diversification across industries (technology, healthcare, utilities/renewables, insurance/financial services, and investment banking) from large, U.S. listed, liquid companies with continuous daily price histories.

2 Portfolio optimisation

Note: Average daily returns were converted to annualised units and a standard 252 trading day multiplier was chosen for consistency (i.e., \(z_{ann} = z_{daily} \times 252\)).

\[ \begin{array}{ll} \hline \textbf{Quantity} & \textbf{Expression} \\ \hline \text{Portfolio return } & R_p = \sum_{i=1}^{N} R_i \\[6pt] \text{Expected portfolio return} &\overline{R}_p = \mathbb{E}[R_p] = \sum_{i=1}^{N} x_i \, \overline{R}_i \\[6pt] \text{Standard deviation of portfolio return } & \sigma_p^2 = \sum_{i=1}^{N} x_i^2 \sigma_i^2 + 2 \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} x_i x_j \rho_{ij} \sigma_i \sigma_j \\[6pt] \text{Constraint 1} & \sum_{i=1}^{N} x_i = 1 \\[6pt] \text{Constraint 2} & \sum_{i=1}^{N}x_i\overline{R}_i = E \\[6pt] \hline \end{array} \]

2.1 Stock statistics

Required statistics for the 5 selected stocks were calculated in preparation for mean-variance portfolio optimisation. Daily returns were calculated from the adjusted closing prices. The expected returns and variance-covariance matrix were estimated using the sample mean and covariance of these daily returns, scaled by 252 to annualise them.

Expected annual returns

Stock Expected returns (annualised)
AIG 15.38%
GS 37.48%
JNJ 9.56%
MSFT 27.05%
NEE 5.21%

Annualised covariance matrix

AIG GS JNJ MSFT NEE
AIG 0.056 520 0.028 239 0.009 219 0.009 264 0.014 049
GS 0.028 239 0.073 831 0.005 420 0.021 318 0.011 019
JNJ 0.009 219 0.005 420 0.029 607 -0.003 187 0.012 903
MSFT 0.009 264 0.021 318 -0.003 187 0.053 846 0.002 834
NEE 0.014 049 0.011 019 0.012 903 0.002 834 0.074 092

Annualised standard deviations

Stock Standard deviation (annualised)
AIG 23.77%
GS 27.17%
JNJ 17.21%
MSFT 23.20%
NEE 27.22%

Correlation coefficient matrix

Note: the correlation coefficient is not used in the efficient frontier calculation. We include it here for posterity.

AIG GS JNJ MSFT NEE
AIG 1.000 000 0.437 155 0.225 358 0.167 925 0.217 094
GS 0.437 155 1.000 000 0.115 927 0.338 098 0.148 980
JNJ 0.225 358 0.115 927 1.000 000 -0.079 815 0.275 497
MSFT 0.167 925 0.338 098 -0.079 815 1.000 000 0.044 868
NEE 0.217 094 0.148 980 0.275 497 0.044 868 1.000 000

2.2 Efficient frontier formed by investing in the 5 stocks

To find portfolio weights \(w\) that minimise the portfolio variance while yielding the target mean return \(E\) invovles solving the following optimisation problem: \[\begin{align*} \min &\quad {w}^T {D} {w} \\ \text{s.t.} &\quad {1}^T {w} = 1 \\ &\quad \overline{{R}}^T {w} = E \\ &\quad {w} \geq {0} \quad \text{(No short selling)} \end{align*}\]

The solve.QP function was used in R to solve this quadratic programming problem, structuring our constraints into matrices accordingly. We therefore set:

\[ {d} = {0}, \quad {A} = \begin{bmatrix} {1}^T \\ \overline{{R}}^T \\ {I}_5 \\ \end{bmatrix}, \quad {b_0} = \begin{bmatrix} 1 \\ E \\ {0}_5 \\ \end{bmatrix} \] as arguments to the solve.QP function.

AIG GS JNJ MSFT NEE Global minimum variance portfolio 0% 10% 20% 30% 40% 0% 10% 20% 30% Risk (annualised standard deviation) Expected return (annualised) Individual Assets: AIG GS JNJ MSFT NEE Mean-variance efficient frontier (no short selling)

Figure 2.1: Mean-variance efficient frontier assuming no short selling for the five stocks described in Q1 based upon data from 2023-2025. The bold green line represents the efficient frontier, containing portfolios that offer the highest expected return for a given level of risk. The dashed grey line shows inefficient portfolios, which have lower returns for the same amount of risk. The global minimum variance portfolio is highlighted.

The mean-variance efficient frontier generated by solving this for a range of target returns is shown in Figure 2.1. The global minimum variance (GMV) portfolio is marked on the frontier and represents the portfolio with the lowest possible risk given the no-short-selling constraint.

The GMV portfolio yields an expected annualised return of 15.814% with an annualised risk (standard deviation) of 12.523%. These results are in agreement (to one decimal place) with the values calculated in the Excel implementation, which is described in more detail in the appendix (B.1). Minor differences in the final asset weights are not unexpected due to the distinct numerical optimisation algorithms and convergence tolerances utilised by R and Excel.

The optimal asset allocations required to achieve this portfolio are detailed below:

Stock Portfolio Weight
AIG 11.10%
GS 3.99%
JNJ 47.87%
MSFT 27.94%
NEE 9.09%

2.3 Points to raise with investor

Context: An individual investor would like to invest in the 5 stocks and intends to use the efficient frontier that you computed above. List, in approximate descending order of importance, ten points that you will raise with the investor to help her construct her investment portfolio.

Points we would raise with the investor about the portfolio constructed using mean-variance portfolio theory (MPT) are as follows:

Investor specific points

  1. Establish her objectives, single-period time horizon, and liquidity needs as MPT is a single-step framework and liquidating large positions can be costly or force her to sell during a downturn.

  2. Assess her true attitude to risk (risk capacity & tolerance) and confirm baseline risk profile (i.e., does she exhibit risk aversion and non-satiation & her financial capacity to absorb losses).

  3. Evaluate her human capital and outside investments. Check her holistic wealth, if her income is tied to same sector as stocks, and establish no conflicts of interest or insider trading.1

Portfolio selection / strategy

  1. Outline option to add risk-free asset in addition to portfolio. Describe blending portfolio with a risk-free asset (T-bills) to achieve a superior risk-adjusted profile on the capital allocation line.

  2. Select optimal portfolio using indifference curves. Explain that any portfolio below curve is sub-optimal and use utility theory to find her optimal portfolio - maximising expected utility.

  3. Explicitly outline impact of constraints used in portfolio construction. No short-selling can produce situations where one is overexposed to one or two stocks.

Model limitations

  1. Highlight estimation error risk. MPT assumes future returns match 2023-2025 history and all metrics are known, but these are difficult to forecast with small errors distorting optimal weights.

  2. Warn about variance as an incomplete measure of risk (treats unexpected gains identically to crashes). Review downside specific metrics (i.e., VaR2) for probability of severe losses.

  3. Limitation with only 5 U.S. stocks means exposed to region-specific risk. Warn her about correlation breakdown during severe market downturns (all stock correlations converge to 1).

  4. The model used requires periodic review and ignores transaction costs and taxes. Emphasise need to monitor inflation, currency, and reinvestment risk to protect her purchasing power.

3 Asset risk theory

Consider 2 assets with random returns. Asset 1 has a Normally distributed return \(R_{1}\) with mean 6% and variance 2.44% and asset 2 has a random return \(R_{2}\) (where w.p. means “with probability”).

\[ R_{1} \sim \mathcal{N}(\mu=0.06, \sigma^{2}=0.0244), \quad R_{2} \sim \begin{cases} \mathcal{N}(\mu=0.0, \, \sigma^{2}=0.1^{2}) \quad &\text{w.p. } 0.8\\ \mathcal{N}(\mu=0.3, \, \sigma^{2}=0.1^{2}) &\text{w.p. } 0.2 \end{cases} \]

3.1 Calculate \(\mathbb{E}[R_{2}]\) and \(\mathrm{Var}(R_{2})\)

\[ \begin{array}{ll} \hline \textbf{Quantity} & \textbf{Expression} \\ \hline \text{Distribution of }R_{2} & \begin{cases} X\mid Z =1 &\sim \mathcal{N}(\mu=0.0, \, \sigma^{2}=0.1^{2}) \quad \text{w.p. } 0.8\\ X\mid Z =2 &\sim \mathcal{N}(\mu=0.3, \, \sigma^{2}=0.1^{2}) \quad \text{w.p. } 0.2\\[2pt] \end{cases} \\[8pt] \hline \text{Expectation} & \mathbb{E}[R_{2}] = \sum_{z=1}^2 \mathbb{P}(Z=z) \times \mu_z \\[6pt] \text{Law of total variance} & \text{Var}(R_{2}) = \mathbb{E}\big[\text{Var}(X\mid Z)\big] + \text{Var}\big(\mathbb{E}[X\mid Z]\big) \\[6pt] \text{Expectation of variance} & \mathbb{E}\big[\text{Var}(X\mid Z)\big] = \sum_{z=1}^2 \mathbb{P}(Z=z) \times \sigma_z^2 \\[6pt] \text{Variance of expectation} & \text{Var}\big(\mathbb{E}[X\mid Z]\big) = \sum_{z=1}^2 \mathbb{P}(Z=z) \times \left(\mu_z - \mathbb{E}[R_{2}]\right)^2 \\[6pt] \hline \end{array} \]

\[\begin{align*} \mathbb{E}[R_{2}] & = (0 \times 0.8) + (0.3 \times 0.2) = 0.06 \\ \mathbb{E}[\text{Var}(X\mid Z)] & = p_{1} \sigma_{1}^{2} + p_{2} \sigma_{2}^{2} \\ & = (0.8)(0.1^2) + (0.2)(0.1^2) = 0.01 \\ \text{Var}(\mathbb{E}[X\mid Z]) & = (0.8 \times (0 - 0.06)^2) + (0.2 \times (0.3 - 0.06)^2) \\ & = 0.00288 + 0.01152 = 0.0144 \\ \text{Var}(R_{2}) & = 0.01 + 0.0144 = 0.0244 \end{align*}\]

Answers: \(\mathbb{E}[R_{2}] =\) 0.06, \(\text{Var}(R_{2})=\) 0.0244

Point of note: The first two moments about the mean of asset one and asset two (the mean and the variance) are the same.

3.2 Use R to estimate \(\mathbb{E}[R_{2}]\) and \(\mathrm{Var}(R_{2})\)

The mean and variance of the returns generated by Asset 2 were calculated using two computational methods: Monte Carlo simulation (stochastic) and numerical integration (deterministic). These results are shown, along with the theoretical values derived in 3.1, are shown in Table 3.1.

Table 3.1: Table 3.2: Comparison of expected value and variance across methods.
Method Mean Variance
Monte Carlo (\(10^7\) runs) 0.060005 0.024404
Numerical Integration 0.060000 0.024400
Theoretical (Part 3A) 0.060000 0.024400

Method 1: Monte Carlo Simulation

Monte Carlo simulation was used to estimate the mean and variance by generating a large number of random samples from the mixture distribution.An indicator \(Z\sim \text{Bernoulli}(0.2)\) was used to choose the mixture component, and values for \(R_{2}\mid Z\) were generated from a normal distribution with standard deviation \(0.1\) and mean \(0\) (if \(Z=0\)) or \(0.3\) (if \(Z=1\)). The sample mean and variance of the simulated values provide estimates for \(\mathbb{E}[R_{2}]\) and \(\mathrm{Var}(R_{2})\) (\(\widehat{\mathbb{E}[R_{2}]}\) and \(\widehat{\mathrm{Var}(R_{2})}\) respectively).

Using 10,000,000 simulations, one obtains \(\widehat{\mathbb{E}[R_{2}]}=\) 0.0600048… and \(\widehat{\mathrm{Var}(R_{2})}=\) 0.0244044…, or 0.060005 and 0.024404 to 5 significant figures. These results are in agreement with the theoretical values of 0.06 and 0.0244 calculated in 3.1. As shown in Table 3.3, the Monte Carlo estimates converge to the theoretical values as the number of simulations increases.

The 95% confidence intervals (CI) in Table 3.3 were computed from the simulated sample moments at each simulation size. For the mean, a large-sample normal approximation was used: \[ \widehat{\mathbb{E}[R_{2}]} \pm z_{0.975} \widehat{SE}(\widehat{\mathbb{E}[R_{2}]}) ,\qquad \widehat{SE}(\widehat{\mathbb{E}[R_{2}]}) = \sqrt{ \frac{\widehat{\text{Var}}(R_{2})}{n} } \] where \(z_{0.975}=\Phi^{-1}(0.975)\) and \(\widehat{\text{Var}}(R_2)\) is the sample variance computed from the n simulated values.

For the variance, a chi-squared based CI is reported: \[ \left[ \frac{(n-1)\widehat{\text{Var}}(R_{2})}{\chi^{2}_{0.975, \,n-1}}, \frac{(n-1)\widehat{\text{Var}{}}(R_{2})}{\chi^{2}_{0.025,\,n-1}} \right] \] where \(\chi^{2}_{\alpha, \,\nu}\) denotes the \(\alpha\) quantile of a chi-squared distribution with \(\nu=n-1\) degrees of freedom. This chi-square interval is exact for normally distributed samples and is used here as a standard large-sample approximation for assessing Monte Carlo convergence.

Table 3.3: Table 3.4: Monte Carlo convergence for \(\mathbb{E}[R_2]\) and \(\mathrm{Var}(R_2)\) with 95 percent confidence intervals (CI). See text for more information about the calculations used for these intervals.
Simulations Mean 95% CI (mean) Variance 95% CI (var)
\(10^{3}\) 0.063667 [0.05410, 0.07323] 0.023824 [0.02187, 0.02606]
\(10^{4}\) 0.059733 [0.05670, 0.06277] 0.024004 [0.02335, 0.02468]
\(10^{5}\) 0.059787 [0.05882, 0.06075] 0.024312 [0.02410, 0.02453]
\(10^{6}\) 0.060036 [0.05973, 0.06034] 0.024395 [0.02433, 0.02446]
\(10^{7}\) 0.060005 [0.05991, 0.06010] 0.024404 [0.02438, 0.02443]

Method 2: Numerical Integration

The mean and variance were also calculated by integrating the probability density function directly using the integrate function in R. \(\mathbb{E}[R_{2}]\) was calculated using \(\mathbb{E}[R_{2}] = \int_{-\infty}^{\infty} x f_{R_2}(x) \, dx\) followed by \(\mathbb{E}[R_{2}^2] = \int_{-\infty}^{\infty} x^2 f_{R_2}(x) \, dx\). The variance was calculated using \(\text{Var}(R_{2}) = \mathbb{E}[R_{2}^2] - (\mathbb{E}[R_{2}])^2\).

3.3 Shortfall probabilities for asset 1

\[ \begin{array}{ll} \hline \textbf{Quantity} & \textbf{Expression} \\ \hline \text{Distribution of } R_1 & R_1 \sim \mathcal{N}(\mu = 0.06,\ \sigma^2 = 0.0244) \\[6pt] \text{Expected return} & \mathbb{E}[R_1] = 0.06 \\[6pt] \text{Standard deviation} & \sigma = \sqrt{0.0244} \approx 0.156205 \\[6pt] \text{Shortfall probability} & \mathbb{P}[R_1 < L] = \int_{-\infty}^{L} f_{R_1}(x)\, \mathrm{d}x \\[6pt] \hline \end{array} \]

3.3.1 Assuming a benchmark return of 0%

\[\begin{align*} \mathbb{P}[R_1 < 0] &= \int_{-\infty}^{0} f_R(x)\mathrm{d}x \\ &= \Phi\left( \frac{0 - 0.06}{0.156205} \right) \\ &= \Phi\left(-0.3841\right) \\ &= 0.3504482... \end{align*}\]

Answer: 35.045% (3dp)

3.3.2 Assuming a benchmark return of -10%

\[\begin{align*} \mathbb{P}[R_1 < -0.1] &= \int_{-\infty}^{-0.1} f_R(x)\mathrm{d}x \\ &= \Phi \left( \frac{-0.1 - 0.06}{0.156205} \right) \\ &= \Phi \left(-1.0243\right)\\ &=0.1528479... \end{align*}\]

Answer: 15.285% (3dp)

3.4 Repeat part Q3c above for asset 2

\[ \begin{array}{ll} \hline \textbf{Quantity} & \textbf{Expression} \\ \hline \text{Distribution of }R_{2} & \begin{cases} X\mid Z =1 &\sim \mathcal{N}(\mu=0.0, \, \sigma^{2}=0.1^{2}) \quad \text{w.p. } 0.8\\ X\mid Z =2 &\sim \mathcal{N}(\mu=0.3, \, \sigma^{2}=0.1^{2}) \quad \text{w.p. } 0.2 \end{cases} \\[8pt] \text{Shortfall probability} & \mathbb{P}[R_2 < L] = 0.8 \int_{-\infty}^{L} f_{\mathcal{N}(0,\,0.1^2)}(x)\, \mathrm{d}x + 0.2 \int_{-\infty}^{L} f_{\mathcal{N}(0.3,\,0.1^2)}(x)\, \mathrm{d}x \\[6pt] \hline \end{array} \]

3.4.1 Assuming a benchmark return of 0%,

\[\begin{align*} \mathbb{P}[R_2 < 0] &= 0.8 \int_{-\infty}^{0} f_{\mathcal{N}(0,\,0.1^2)}(x)\, \mathrm{d}x + 0.2 \int_{-\infty}^{0} f_{\mathcal{N}(0.3,\,0.1^2)}(x)\, \mathrm{d}x \\ &= 0.8 \times \Phi \left( \frac{0 - 0}{0.1} \right) + 0.2 \times \Phi \left( \frac{0 - 0.3}{0.1} \right) \\ &= 0.8 \times \Phi \left(0\right) + 0.2 \times \Phi \left(-3\right) \\ &=0.4002699... \end{align*}\]

Answer: 40.027% (3dp)

3.4.2 Assuming a benchmark return of -10%,

\[\begin{align*} \mathbb{P}[R_2 < -0.1] &= 0.8 \int_{-\infty}^{-0.1} f_{\mathcal{N}(0,\,0.1^2)}(x)\, \mathrm{d}x + 0.2 \int_{-\infty}^{-0.1} f_{\mathcal{N}(0.3,\,0.1^2)}(x)\, \mathrm{d}x \\ &= 0.8 \times \Phi \left( \frac{-0.1 - 0}{0.1} \right) + 0.2 \times \Phi \left( \frac{-0.1 - 0.3}{0.1} \right) \\ &= 0.8 \times \Phi \left(-1\right) + 0.2 \times \Phi \left(-4\right) \\ &=0.1269305... \end{align*}\]

Answer: 12.693% (3dp)

Comments

The results from part 3.3 and 3.4 are summarised and expanded upon in Table 3.5.

While Asset 2 has a higher probability of moderate losses (returns between roughly −5.77% and 0%) Asset 1 carries a higher probability of severe drawdowns (any loss < -5.77%) and consequently, despite both assets having the same variance, Asset 1 has a greater risk of catastrophic loss.

Table 3.5: Table 3.6: Shortfall probabilities for asset 1 and 2. Asset 2 has a higher probability of any loss but Asset 1 has a larger tail probability below a shortfall threshold of -5.77%.
Shortfall threshold \(R_1\) (%) \(R_2\) (%)
0.0000 35.0 40.0
-0.0500 24.1 24.7
-0.0577 22.6 22.6
-0.1000 15.3 12.7
-0.1500 8.94 5.34
-0.2000 4.80 1.82
-0.2500 2.36 0.497
-0.5000 0.0169 2.29e-05

3.5 Use R to plot the probability density functions of returns for assets 1 and 2 on the same axes

The probability density functions of returns for assets 1 and 2 are shown in Figure 3.1.

0 1 2 3 -0.6 -0.3 0.0 0.3 0.6 Density R1 R2 Probability density functions of returns x = -0.1502 x = 0.0704 x = 0.2930 Higher probability density (R2) Higher probability density (R1) -0.5 0.0 0.5 -0.6 -0.3 0.0 0.3 0.6 Return Density Difference between probability density functions (Asset 2 - Asset 1)

Figure 3.1: Top: Probability density functions for the returns generated by Asset 1 (\(R_1\), solid green) and Asset 2 (\(R_2\), dotted orange). Vertical dotted lines indicate the crossover points where the densities are equal (returns of approximately -15%, 7% and 29%). Bottom: The density difference (\(R_2−R_1\)). Positive values indicate regions where Asset 2 has a higher probability of occurrence, while negative values indicate regions where Asset 1 is more probable.

Asset 1 dominates the intermediate zone between 7% and 29%. However, this comes at a significant cost as Asset 1 carries a higher probability of substantive losses of -15% or worse compared to Asset 2. Asset 2 exhibits a higher probability density for returns between -15% and 7% and the upside tail region above 29%. The ‘barbell’ profile of Asset 2 results in a higher utility theory for all typical risk-averse investors3, providing a lower risk of large losses and the exclusive potential for large windfalls, strictly dominating Asset 1 in the tails.

3.6 Use R to plot the cumulative density functions of returns for assets 1 and 2 on the same axes

The probability density functions of returns for assets 1 and 2 are shown in Figure 3.2.

x = -0.0577 x = 0.1982 0.00 0.25 0.50 0.75 1.00 -0.6 -0.3 0.0 0.3 0.6 Probability R1 R2 Cumulative distribution functions Higher cumulative probability (R2) Higher cumulative probability (R1) 0.00 0.05 -0.6 -0.3 0.0 0.3 0.6 Return Difference Cumulative distribution function difference (Asset 2 - Asset 1)

Figure 3.2: Top: Cumulative distribution functions for the returns generated by assets 1 and 2 with crossover points marked by dotted vertical lines (equal shortfall probabilities). Bottom: The difference in cumulative distribution function (\(R_2−R_1\)), where a positive value shows a higher shortfall probability for \(R_2\) at that \(x\).

The intersection of the cumulative distribution functions at x \(\approx\) 5.77% illustrates that neither asset exhibits first-order stochastic dominance over the other. To the left of this crossover, the curve for Asset 1 lies above Asset 2, indicating that Asset 1 has a higher probability of generating severe losses (returns worse than −5.77%). To the right of the crossover, Asset 2’s curve rises above Asset 1, indicating that Asset 2 is more likely to generate small to moderate losses. While Asset 2 is more likely to result in a return near zero, Asset 1 carries the distinct tail risk of catastrophic drawdown.

3.7 Based on your answers to parts 3.13.6 above, comment on the risk involved in asset 2, compared to asset 1

Assets 1 & 2 possess a different risk profile, despite their identical means and variances. Figure 3.1 reveals that Asset 1 is symmetric and Asset 2 is a positively skewed mixture with a higher probability of small losses (e.g., falling short of a 0% or −5% benchmark). However, Asset 1 dominates the extreme left tail (returns < −5.8%), and carries higher probability of severe losses (Figure 3.2). Asset 1 is a riskier holding for a risk-averse investor due to its greater exposure to deep downside events4.

Appendices

A Bonus exploration: R Shiny dashboard

To explore this project further (and in an optimistic pursuit of extra credit) we constructed an app to allow for an additional exploration of the work described in 1 & 2 using different portfolios.

As R Markdown was used to write up this project, one natural approach would have been simply to update the five stocks and render a new R Markdown document for each portfolio. While this would work in principle, it is not fully satisfactory: the document takes time to render and each iteration would require manual narrative adjustments. To address this, we developed an interactive R Shiny dashboard, hosted on https://www.shinyapps.io/, which provides a dynamic interface for the construction of a portfolio under the framework described in this report. The dashboard is available via:

Figure A.1 shows the dashboard start page. Currently sixteen US stocks are available to select, along with the range of dates one wishes to use to obtain the stock statistics. The plotly library is used to allow interactivity for both stock performance figures.

The R Shiny dashboard start page. A selection of stocks are downloaded when the app is first loaded to avoid repeated API requests and throttling restrictions.

Figure A.1: The R Shiny dashboard start page. A selection of stocks are downloaded when the app is first loaded to avoid repeated API requests and throttling restrictions.

The statistics tab (not shown here) mirrors 2.1, showing the expected returns, standard deviation and the covariance matrix for selected securities.

 The Efficient Frontier tab in the R Shiny dashboard showing the portfolio weights associated with a given portfolio on the efficient frontier, along with the expected return and risk. The subjective risk tolerance can change using a slider from 0 to 100, updating both the metrics, portfolio weights, and the visual representation showing the location of the portfolio on the efficient frontier.

Figure A.2: The Efficient Frontier tab in the R Shiny dashboard showing the portfolio weights associated with a given portfolio on the efficient frontier, along with the expected return and risk. The subjective risk tolerance can change using a slider from 0 to 100, updating both the metrics, portfolio weights, and the visual representation showing the location of the portfolio on the efficient frontier.

The interactive efficient frontier element of the app is shown in Figure A.2. The risk tolerance slider represents the amount of risk an investor is willing to accept in nominal units of 0 to 100. Changing this slider automatically updates the portfolio weights, the risk and expected return, and the location of the portfolio on the efficient frontier.

Future development of this app could expand upon the present implementation which strictly mirrors the project’s core analysis (a 5-asset portfolio with a no-short-selling constraint). The more alluring areas to develop are incorporating flexible portfolio sizes, allowing for short selling in the model, and including the option to select and use a risk-free asset in the portfolio. Future development could also expand the universe of available assets and allow users to input dynamic Yahoo Finance tickers, though this would require optimising the data-fetching architecture to avoid API rate limits.

B Supplemental methodology

B.1 Spreadsheet implementation (Excel)

This subsection outlines the structure of the accompanying Excel workbook, using screenshots to demonstrate the layout and contents of each sheet.

B.1.1 About sheet: workbook guide and navigation

The information users are greeted with when opening the spreadsheet. The 'about' sheet is split into three parts: (1) key information, (2) a workbook map with hyperlinks, and (3) version control.

Figure B.1: The information users are greeted with when opening the spreadsheet. The ‘about’ sheet is split into three parts: (1) key information, (2) a workbook map with hyperlinks, and (3) version control.

As shown in Figure B.1, this sheet contains:

  1. A key information block that documents the module name, assessment title, lecturer, submission files, and group members, providing immediate context when the file is opened.
  2. A workbook map (or navigation table) that contains a clickable link to each worksheet:
    • Q1: raw imported price data
    • Q1-Fig: figures showing returns for chosen stocks
    • Q2a: returns + annualised statistics
    • Q2b: efficient frontier + global minimum variance (estimated using 3 separate methods)
  3. Version control information, including a lightweight audit trail with version number and last updated timestamp.

B.1.2 Q1 sheet: raw adjusted close prices

Overview of sheet Q1.

Figure B.2: Overview of sheet Q1.

The stock data are downloaded to CSV using an R script and imported into Excel. This sheet (Figure B.2) contains:

  1. A hyperlink that returns the user to the starting sheet. This is present across all future sheets.
  2. An imported data table containing the daily adjusted close prices for the five stocks, with the first column as the trading date.
  3. Additional information about the stocks, including the data source, URL, ticker list, date range, and price label.

B.1.3 Q1-fig sheet: price charts and rebased series

Overview of sheet Q1-fig.

Figure B.3: Overview of sheet Q1-fig.

As shown in Figure B.3, the Q1-fig sheet builds upon Q1 and contains:

  1. A figure showing the adjusted close prices over 3 years for each stock.
  2. The rebased adjusted close over the same period.
  3. The numerical rebased series used to generate the rebased chart.

B.1.4 Q2a sheet: daily returns and statistics

Overview of sheet Q2a.

Figure B.4: Overview of sheet Q2a.

The Q2a sheet shown in Figure B.4 mirrors 2.1 and contains:

  1. Daily arithmetic returns computed from adjusted close prices in Q1.
  2. Required statistics for mean-variance portfolio optimisation, including
    • expected returns (daily and annualised)
    • variance–covariance matrix (annualised)
    • standard deviation for each stock (annualised)
    • correlation matrix
  3. Additional information about the statistics calculated.
  4. Figures showing the daily returns for each stock.

B.1.5 Q2b sheet: efficient frontier and global minimum variance

This sheet computes the global minimum-variance portfolio and plots the efficient frontier using three approaches:

  1. Excel Solver & VBA (GRG Nonlinear)
  2. Monte Carlo (N=5000)
  3. Sequential grid (~41,000 weights)
Partial view of sheet Q2b. The full data for the Solver and VBA, and all of the Monte Carlo and sequential grid data are too large to display via screenshot. They are located to the east of the worksheet.

Figure B.5: Partial view of sheet Q2b. The full data for the Solver and VBA, and all of the Monte Carlo and sequential grid data are too large to display via screenshot. They are located to the east of the worksheet.

Figure B.5 shows key parts of Q2b, including:

  1. The stock statistics required for the mean-variance optimisation, including the annualised expected returns vector and the annualised variance–covariance matrix (imported/referenced from Q2a). These quantities are used throughout the sheet to compute portfolio expected return and portfolio risk (standard deviation).

  2. A GMV portfolio summary for each method, including the expected return and risk (standard deviation) under the three approaches (Solver & VBA, Monte Carlo, and sequential grid), along with the corresponding portfolio weights for each method. All methods enforce full investment (\(\sum_i w_i = 1\)) and non-negativity (\(w_i \ge 0\), i.e., no short-selling).

  3. Figures showing the efficient frontier generated under each method. The sheet also includes a comparison figure overlaying the efficient frontiers to visualise agreement across methods (only the Solver & VBA individual plot is shown in Figure B.5).

  4. The start of the Solver & VBA efficient frontier calculation table containing the target return grid and the corresponding optimal weights obtained by automating Solver via VBA. The full Solver & VBA table, as well as the full Monte Carlo and sequential-grid portfolio tables, are too large to display in the screenshot and are located to the east of the visible worksheet area.

  5. Additional information about the efficient frontier calculations.

Additional comments

The Solver interface and the manual settings used to compute a single efficient portfolio (for a given target expected return) are shown in Figure B.6. This approach required one to manually update the target return constraint and re-run Solver for each point on the efficient frontier, which is impractical for generating a dense set of frontier points.

An automated Solver procedure was implemented using VBA to loop over a grid of target returns, run Solver programmatically, and output the resulting optimal portfolio weights. The VBA implementation is reproduced below.

(Authors’ note: VBA macros cannot be embedded within the submitted .xlsx file format so the code is documented in this appendix rather than stored inside the workbook.)

The location of the solver interface in Microsoft Excel and the settings used to manually calculate the portfolio weights for a given expected return (minimising standard deviation).

Figure B.6: The location of the solver interface in Microsoft Excel and the settings used to manually calculate the portfolio weights for a given expected return (minimising standard deviation).

' Solver VBA code
Option Explicit
Dim r As Long
Dim targetCell As String, changeCells As String
Dim sumCell As String, portRetCell As String, targetRetCell As String
    
Sub EfficientFrontier_Run()

    Application.ScreenUpdating = False
    Application.EnableEvents = False
    Application.Calculation = xlCalculationManual
    
    For r = 10 To 209
        
        targetCell = "$s$" & r
        changeCells = "$l$" & r & ":$p$" & r
        sumCell = "$q$" & r
        portRetCell = "$r$" & r
        targetRetCell = "$k$" & r
       
        Range(changeCells).ClearContents
        Application.Run "Solver.xlam!SolverReset"
        
        Application.Run "Solver.xlam!SolverOk", targetCell, 2, 0, changeCells
        
        Application.Run "Solver.xlam!SolverAdd", sumCell, 2, "1"
        
        Application.Run "Solver.xlam!SolverAdd", portRetCell, 2, targetRetCell
        
        Application.Run "Solver.xlam!SolverAdd", changeCells, 3, "0"
        
        Application.Run "Solver.xlam!SolverSolve", True
        Application.Run "Solver.xlam!SolverFinish", 1
        
    Next r
    
    Application.Calculation = xlCalculationAutomatic
    Application.EnableEvents = True
    Application.ScreenUpdating = True
    
End Sub

Additionally, the Monte Carlo simulation requires random portfolio weights. To ensure results were reproducible, VBA was used to generate the random numbers using a fixed seed before constructing and normalising the random weights. The VBA code used for this is shown below.

' Monte Carlo VBA code
Dim r As Long, c As Long
Sub InitRandomWeights()
    Randomize 123   'set.seed 123
    For r = 10 To 5004
        For c = Range("w1").Column To Range("aa1").Column
            Cells(r, c).Value = Rnd
        Next c
    Next r
End Sub

B.2 Examining 3.7 through the lens of utility

Here we evaluate an investors decision between the two assets using the three standard utility theory models covered in SMM068: Quadratic, Logarithmic, and Power.

While the probability density functions and cumulative distribution functions visualised above describe the statistical properties of the returns, they do not prescribe a decision. To choose between Asset 1 and Asset 2, we must map these returns to an investor’s satisfaction using utility functions. As a rational investors goal is always to maximise expected utility, an asset that results in a higher utility score will be more desirable (Bodie, Kane, and Marcus 2024, ch. 6).

Quadratic utility

Quadratic utility assumes the investor cares only about the expected return and the variance. It penalises volatility regardless of whether that volatility comes from upside or downside moves. The function is defined as:

\[ U(R) = R - \frac{k}{2}R^{2} \]

where \(k\) represents risk aversion.

\[\begin{align*} U(R) &= R - \frac{k}{2}R^{2} \\ \mathbb{E}\big[U(R)\big] &= \mathbb{E}\left[ R - \frac{k}{2}R^{2} \right] \\ &= \mathbb{E}\left[ R \right] - \frac{k}{2}\mathbb{E}\left[R^{2} \right] \\ &= \mathbb{E}\left[ R \right] - \frac{k}{2}\left(\text{Var}(R) - (\mathbb{E}[R])^2 \right) \end{align*}\]

In 3 where \(\mathbb{E}[R_1] = \mathbb{E}[R_2] = 0.06\) and \(\text{Var}(R_1) = \text{Var}(R_2) = 0.0244\), the quadratic utility function will be identical for asset 1 and asset 2.

Logarithmic and power utility

Logarithmic utility (\(\gamma = 1\)) is the baseline for rational growth (often associated with the Kelly Criterion). It assumes wealth cannot drop to zero, and risk aversion adjusts naturally with wealth:

\[ U(R)=\ln(1+R) \]

A power utility function allows one to fine-tune the investor’s risk appetite using the coefficient of relative risk aversion (\(\gamma\)):

\[ U(R) = \frac{(1+R)^{1-\gamma}}{1-\gamma} \]

where \(\gamma>1\) represents one being risk averse and \(\gamma<1\) represents one being more risk seeking. Cases where \(\gamma \leq 0\) are not considered here.

We can investigate how changing \(\gamma\) (representing increasing and decreasing risk aversion) affects an investor’s asset preference.

Comparative analysis

Table B.1 evaluates the expected decision made by an investor using exact integral calculations across our three utility frameworks. With the exception of investors with a quadratic utility function (where both assets result in the same expected utility), asset 2 provides a higher expected utility than asset 1 and therefore would be selected by all investors considered here. The expected utility benefits of asset 2, relative to asset 1, increase as one becomes more risk averse.

The range of \(\gamma\) values in Table B.1 were chosen based upon values previously used in the literature (Clare et al. 2016; Owadally, Jang, and Clare 2021).

Table B.1: Table B.2: Comparison of expected utility scores for returns generated by asset 1 (\(R_1\)) and asset 2 (\(R_2\)).
Model Assumption R1 R2 Decision
Quadratic \(k = 2\) 0.03200 0.03200 Indifferent
Logarithmic \(\gamma = 1\) 0.04703 0.04781 Choose Asset 2
Power \(\gamma = 3\) -0.47773 -0.47290 Choose Asset 2
Power \(\gamma = 5\) -0.25451 -0.24100 Choose Asset 2
Power \(\gamma = 10\) -0.26290 -0.15062 Choose Asset 2

Visualising the decision boundary (power utility function)

The difference in expected utility \((\mathbb{E}\big[U(R_2)\big] - \mathbb{E}\big[U(R_1)\big])\) for assets 1 and 2 is shown in Figure B.7 as a function of an investors risk aversion. The expected utility of asset 2, relative to asset 1, increases as a function of the coefficient of risk aversion (for \(0>\gamma \geq10\)).

Prefer Asset 2 0.00 0.03 0.06 0.09 0.0 2.5 5.0 7.5 10.0 Risk aversion coefficient ( γ ) Utility advantage ( E [ U ( R 2 ) ] E [ U ( R 1 ) ] ) Comparing expected utility across varying levels of risk aversion Expected utility advantage

Figure B.7: Expected utility advantage as a function of risk aversion

C Reproducibility, accessibility & AI declaration

C.1 Reproducibility & accessibility

An accessible HTML version of this report is available via a public GitHub page:

This report was created in R Markdown. The source code is open source and is available via:

Changes made to this document were tracked using Git and are also available via the same repository (https://github.com/ytterbiu/smm068-AS-assignment01-g08/commits/main/). Please note that this commit history reflects updates made to the project repository and is not a comprehensive timeline of all contributions by group members.

C.2 Generative AI Declaration

Generative AI (GenAI) tools were used throughout this project to assist with project planning, methodological validation, and debugging. In practice, these tools functioned as an additional technical team member, providing immediate feedback, troubleshooting code, and challenging statistical assumptions.

C.2.1 Tools Used

Google Gemini 3 Pro (Deep Research), ChatGPT 5.2

C.2.2 Prompts used

The GenAI tools were provided with the ‘coursework_CM2_250120.pdf’ file and extracts from our R code.

A non-exhaustive list of prompts used is shown below. We opted to include at least one example of each type of prompt used, rather than a list of all prompts. For example, if we entered the prompts ‘How can we put two ggplots side by side?’ and ‘How to remove grey background on a ggplot’, we would include only the first of these in the list below, as both relate to plotting issues (& are similar to ‘querying’ with a traditional search engine).

  • “…the top of both graphs in r are slightly different. (Pasting R code.) Can this be fixed for the plot area?”
  • “Help phrasing this in a simple and accessible way - remove any jargon & identify the key points”
  • “How can this error be tracked down in an rmd file? ! Text line contains an invalid character. l.1 ^@^@…”

C.2.3 Use of GenAI outputs & subsequent revisions

The GenAI tools served as a “consultant” member of the group. Notably, the AI output:

  • Helped debug plotting issues.
  • Commented on the structure of R code snippets and the report structure.

C.2.4 Changes made:

All AI suggestions were critically reviewed and debated by the group before implementation. For example:

  • While the AI suggested code for complex layouts, the specific aesthetic choices and annotations were manually adjusted to ensure clarity.
  • Where the AI proposed specific interpretations, these were cross-referenced with academic literature and statistics textbooks.
  • The final R code for implementation was written and debugged by the group with AI snippets being treated as drafts rather than final solutions.

While GenAI was treated as a collaborator, the final analysis, statistical interpretation, and the written content of this report were generated and verified by the (human) group members.

Bodie, Zvi, Alex Kane, and Alan J. Marcus. 2024. Investments ISE. 13th ed. The McGraw Hill Series in Finance, Insurance, and Real Estate. McGraw Hill. https://www.mheducation.co.uk/investments-ise-9781266085963-emea-group.
Clare, Andrew, James Seaton, Peter N. Smith, and Stephen Thomas. 2016. “The Trend Is Our Friend: Risk Parity, Momentum and Trend Following in Global Asset Allocation.” Journal of Behavioral and Experimental Finance 9: 63–80. https://doi.org/10.1016/j.jbef.2016.01.002.
Owadally, Iqbal, Chul Jang, and Andrew Clare. 2021. “Optimal Investment for a Retirement Plan with Deferred Annuities.” Insurance: Mathematics and Economics 98: 51–62. https://doi.org/10.1016/j.insmatheco.2021.02.001.

  1. We are implicitly assuming that this investor does not have any environmental, social, or governance (ESG) preferences given that she has already accepted the portfolio we have constructed, otherwise we would also ask about this here.↩︎

  2. Value at Risk↩︎

  3. Described in more detail in the appendix (B.2).↩︎

  4. This is demonstrated in the appendix (B.2 and Table B.1) where we show that a constant relative risk aversion (CRRA) utility function with \(\gamma > 0\) mathematically confirms that for risk-averse investors asset 1 results in a lower expected utility that asset 2.↩︎