Last Updated: 06/03/2020
Bootstrapping is any test or metric that uses random sampling with replacement. In time series, block bootstrapping is commonly used to capture structural dependency. Three block bootstrapping methods are introduced.
We use bootstrapping for univariate time series forecasting.
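The page does not name the three block bootstrapping methods, so the following is only a minimal sketch of one common variant, the moving block bootstrap. The function name `moving_block_bootstrap`, its parameters, and the synthetic `history` series are assumptions made for illustration and are not taken from the tool.

```python
import numpy as np

def moving_block_bootstrap(returns, block_size, horizon, rng):
    """Build one simulated path by concatenating randomly chosen
    overlapping blocks of the historical series (sampled with replacement)."""
    returns = np.asarray(returns, dtype=float)
    n = len(returns)
    if not 1 <= block_size <= n:
        raise ValueError("block_size must be between 1 and len(returns)")
    n_blocks = -(-horizon // block_size)  # ceiling division
    # Each start index defines one block: returns[start : start + block_size].
    starts = rng.integers(0, n - block_size + 1, size=n_blocks)
    blocks = [returns[s:s + block_size] for s in starts]
    # Trim the last block so the path has exactly `horizon` points.
    return np.concatenate(blocks)[:horizon]

rng = np.random.default_rng(0)
history = rng.normal(0.0005, 0.01, size=250)  # synthetic stand-in for real returns
path = moving_block_bootstrap(history, block_size=5, horizon=12, rng=rng)
```

Keeping consecutive observations together inside each block is what preserves short-range dependence; a block size of 1 reduces this to the ordinary bootstrap.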
We can fetch the data in 2 possible ways:
The data consists of values (prices, indexes, etc.) and their corresponding timestamps.
Bootstrapping works with stationary data, and price values are not stationary, so it is necessary to convert prices into returns.
The data is automatically converted from its original values into returns.
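The page does not say whether simple or log returns are used. As a sketch, assuming simple (percentage) returns, the conversion could look like this; `prices_to_returns` is a hypothetical helper, not the tool's own function.

```python
import numpy as np

def prices_to_returns(prices):
    """Simple returns: r_t = p_t / p_{t-1} - 1 (one element shorter than prices)."""
    prices = np.asarray(prices, dtype=float)
    return prices[1:] / prices[:-1] - 1.0

prices = np.array([100.0, 102.0, 99.96, 104.958])
returns = prices_to_returns(prices)  # approximately [0.02, -0.02, 0.05]
```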
This section also provides functionality for clipping the data between start_date and end_date.
Users can then choose one of the three bootstrapping methods above.
Training: Start and End date can be used to clip the dataset as needed.
Forecasting Horizon: Forecast period specifies how far into the future you want to forecast, determined by Start and End date.
Model Parameters[3]
The model comprises three components:
The graph of price, which is converted back from returns, shows that most simulations capture the upward trend of the stock. This is because the historical data shows an upward trend, and a bootstrapping method that resamples historical data reproduces that trend.
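Converting simulated returns back into prices can be sketched as a cumulative product starting from the last realized price. This assumes simple returns, and `returns_to_prices` and `last_price` are illustrative names, not the tool's API.

```python
import numpy as np

def returns_to_prices(simulated_returns, last_price):
    """Compound simple returns forward from the last realized price.
    Works on one path (1-D) or many paths (2-D, one path per row)."""
    simulated_returns = np.asarray(simulated_returns, dtype=float)
    return last_price * np.cumprod(1.0 + simulated_returns, axis=-1)

sims = np.array([[0.01, 0.02, -0.01],
                 [-0.02, 0.00, 0.03]])
price_paths = returns_to_prices(sims, last_price=100.0)
# row 0 is approximately [101.0, 103.02, 101.9898]
# row 1 is approximately [98.0, 98.0, 100.94]
```

Because historical returns with a positive average are resampled, the compounded paths tend to drift upward as well.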
This section analyzes the simulation data and is divided into the following subsections:
Simulations: specifies whether to apply the following analysis to returns or prices.
Take a look at some randomly selected simulations. The simulated results are quite different from each other, which implies that the simulations may capture many different scenarios. The disadvantage is that, with such a large number of simulated sequences, it is hard to draw general insights.
It is recommended to plot the histogram of the first time point. If it is distributed around the latest realized value (prices/index), then at least we can say that simulated data is not unrealistic.
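This sanity check can be sketched numerically without a plotting library. The example below uses synthetic data and plain resampling with replacement for the first step; `last_price` and the variable names are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(3)
history = rng.normal(0.0, 0.01, size=500)   # synthetic historical returns
last_price = 100.0                          # latest realized value (assumed)

# First simulated time point across 2000 simulations.
first_step_returns = rng.choice(history, size=2000, replace=True)
first_step_prices = last_price * (1.0 + first_step_returns)

# Histogram counts and bin edges; these could be passed to any plotting tool.
counts, edges = np.histogram(first_step_prices, bins=20)

# How far the centre of the histogram sits from the last realized value.
center = np.median(first_step_prices)
relative_gap = abs(center - last_price) / last_price
```

A small `relative_gap` suggests the first simulated step is distributed around the latest realized value.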
The middle line is the mean value of all simulations, while the filled area is determined by the standard deviation of the simulations at each time step. Because bootstrapping is by nature a random copy of historical data, the expected value and variance of the simulations stay consistent over time.
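The mean line and the band can be computed per time step across all simulations. The sketch below uses synthetic returns and, for brevity, resampling of single observations rather than blocks; all names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
history = rng.normal(0.0, 0.01, size=500)  # synthetic historical returns

# 1000 simulated return paths of 20 steps each (one path per row).
simulations = rng.choice(history, size=(1000, 20), replace=True)

mean_path = simulations.mean(axis=0)   # middle line
std_path = simulations.std(axis=0)     # half-width of the filled area
lower, upper = mean_path - std_path, mean_path + std_path
```

Since every step is drawn from the same historical sample, `mean_path` and `std_path` stay roughly flat across the horizon when the analysis is applied to returns.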
By default, we choose 5 clusters using L2 K-Means clustering to extract different scenarios. Apart from cluster 3, the clusters are almost identical in variance and mean value, and even cluster 3 is only slightly more volatile. The reason is that all 5 clusters follow the same distribution and show similar behaviour.
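For readers unfamiliar with L2 K-Means on whole sequences, here is a minimal sketch of Lloyd's algorithm in which each simulated path is treated as one point. It is not the tool's implementation; `kmeans_l2` and the synthetic `simulations` array are assumptions for illustration.

```python
import numpy as np

def kmeans_l2(paths, k, rng, n_iter=50):
    """Plain Lloyd's algorithm with Euclidean (L2) distance.
    Each row of `paths` is one simulated sequence."""
    paths = np.asarray(paths, dtype=float)
    # Initialise centers with k distinct randomly chosen paths.
    centers = paths[rng.choice(len(paths), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared L2 distance from every path to every center: shape (n, k).
        dists = ((paths[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        new_centers = centers.copy()
        for j in range(k):
            members = paths[labels == j]
            if len(members):  # keep the old center if a cluster empties
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

rng = np.random.default_rng(2)
simulations = rng.normal(0.0, 0.01, size=(300, 20))  # synthetic return paths
labels, centers = kmeans_l2(simulations, k=5, rng=rng)
```

When all paths come from the same distribution, as in this synthetic example, the cluster centers end up looking much alike, which matches the observation above.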
By default, we choose 5 clusters using Hierarchical Clustering (KL Divergence Affinity) to extract different scenarios. The same conclusion as for K-Means applies here. The only difference is that Hierarchical Clustering tends to isolate a single outlier instead of a group of outliers.
Clustering lets us focus on the major patterns in the simulations. K-Means seems able to identify only outliers, that is, almost impossible situations, whereas KLD Hierarchical Clustering separates scenarios by volatility. The resulting clusters imply that a stable or slightly less stable future is most likely.
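The page does not describe how the KL divergence affinity is computed. One plausible reading, shown here purely as an assumption, is to compare the empirical distributions of two simulated paths using a symmetrised KL divergence over shared histogram bins; `symmetric_kl`, `bins`, and `eps` are hypothetical.

```python
import numpy as np

def symmetric_kl(a, b, bins=10, eps=1e-9):
    """Symmetrised KL divergence between the histograms of two sequences."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Use a common range so both histograms share the same bins.
    lo, hi = min(a.min(), b.min()), max(a.max(), b.max())
    p, _ = np.histogram(a, bins=bins, range=(lo, hi))
    q, _ = np.histogram(b, bins=bins, range=(lo, hi))
    # Smooth with eps to avoid log(0), then normalise to probabilities.
    p = (p + eps) / (p + eps).sum()
    q = (q + eps) / (q + eps).sum()
    return float(np.sum(p * np.log(p / q)) + np.sum(q * np.log(q / p)))

rng = np.random.default_rng(4)
calm_a = rng.normal(0.0, 0.01, size=200)
calm_b = rng.normal(0.0, 0.01, size=200)
volatile = rng.normal(0.0, 0.03, size=200)

d_similar = symmetric_kl(calm_a, calm_b)
d_different = symmetric_kl(calm_a, volatile)
```

Under this reading, paths with different volatility are far apart even when their means agree, which would explain why this affinity separates scenarios by volatility.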