Q&A for Time Series Analysis/Forecasting
Mohsin Raza
Posted on May 13, 2021
1. What is Time series analysis?
A time series is a sequence of observations taken at specified, usually equally spaced, time intervals. Analyzing the series helps us predict future values from previously observed ones. In the simplest (univariate) case there are only 2 variables: time & the variable we want to forecast.
2. Why & where is Time Series used?
Time series data can be analyzed in order to extract meaningful statistics and other characteristics. It’s used in at least 4 scenarios:
a) Business Forecasting
b) Understand past behavior
c) Plan the future
d) Evaluate current accomplishment
3. When shouldn’t we use Time Series Analysis?
We don’t need to apply Time Series analysis in at least the following two cases:
a) The dependent variable (y), which is supposed to vary with time, is constant. E.g. y = f(x) = 4 is a line parallel to the x-axis (time), so its value will always remain the same.
b) The dependent variable (y) follows a known mathematical function, e.g. sin(x), log(x), polynomials, etc. We can then get the value at any time directly from the function itself, so there is no need for forecasting.
4. What are the components of the Time Series?
There are four components:
a) Trend: Upward or downward movement of the data over a long period of time. E.g. appreciation of the Dollar vs the Rupee.
b) Seasonality: Variations that repeat over a fixed, known period. E.g. ice cream sales increase every summer.
c) Noise or Irregularity: Spikes & troughs at random intervals.
d) Cyclicity: Rises & falls that recur over long stretches of time (often several years, e.g. business cycles) but, unlike seasonality, without a fixed period.
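To make the components concrete, here is a minimal sketch that builds a synthetic series by adding a trend, a seasonal pattern & random noise. The function name `make_series` and all parameter values (slope, period, amplitude, noise level) are assumptions chosen for illustration, and the cyclic component is left out to keep the sketch short.

```python
import math
import random


def make_series(n=48, slope=0.5, period=12, amplitude=10.0, noise_sd=1.0, seed=0):
    """Build an additive series: value = trend + seasonality + noise.

    All defaults are illustrative assumptions, not values from real data.
    """
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    # Trend: steady upward movement over time.
    trend = [slope * t for t in range(n)]
    # Seasonality: a pattern that repeats every `period` steps.
    seasonal = [amplitude * math.sin(2 * math.pi * t / period) for t in range(n)]
    # Noise: random, irregular fluctuations.
    noise = [rng.gauss(0.0, noise_sd) for _ in range(n)]
    series = [tr + s + e for tr, s, e in zip(trend, seasonal, noise)]
    return trend, seasonal, noise, series


trend, seasonal, noise, series = make_series()
```

Plotting `series` next to `trend` & `seasonal` shows how the parts add up to the observed values.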
5. What is Stationarity?
Many statistical models for Time Series (ARMA, for example) assume that the series is stationary, which means that over different time periods,
a) It should have a constant mean.
b) It should have constant variance or standard deviation.
c) Auto-covariance should depend only on the lag between two observations, not on the time at which they are observed.
6. Why does Time Series(TS) need to be stationary?
It is because of the following reasons:
a) If a TS has a particular behavior over a time interval, then there’s a high probability that over a different interval, it will have the same behavior, provided TS is stationary. This helps in forecasting accurately.
b) Theories & mathematical formulas are more mature & easier to apply for a TS that is stationary.
7. Tests to check if a series is stationary or not.
There are 2 common ways to check for Stationarity of a TS:
a) Rolling Statistics → Plot the moving average or moving standard deviation to see if it varies with time. It's a visual technique.
b) ADF Test (written ADCF in some tutorials) → The Augmented Dickey-Fuller test gives us various values that help in identifying stationarity. The null hypothesis says that the TS is non-stationary (it has a unit root). The output comprises a test statistic & critical values at several significance levels.
If the test statistic is less than (i.e. more negative than) a critical value, we can reject the null hypothesis at that level & say that the series is stationary. The ADF test also gives us a p-value. A low p-value (commonly below 0.05) is evidence against the null hypothesis, so lower values of p point towards stationarity.
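The rolling-statistics check can be sketched in plain Python as below. The helper `rolling` and the two toy series are assumptions made up for this illustration. A real analysis would normally plot the results rather than inspect the numbers.

```python
from statistics import mean, pstdev


def rolling(values, window, func):
    """Apply `func` to every full window of `window` consecutive values."""
    return [
        func(values[i - window + 1 : i + 1])
        for i in range(window - 1, len(values))
    ]


# Toy data (assumed for illustration): one series with a trend, one without.
trending = [float(t) for t in range(20)]
flat = [5.0, 6.0] * 10

rolling_mean_trending = rolling(trending, 4, mean)  # keeps rising: not stationary
rolling_mean_flat = rolling(flat, 4, mean)          # stays constant
rolling_std_flat = rolling(flat, 4, pstdev)         # stays constant
```

If the rolling mean or rolling standard deviation drifts with time, as it does for `trending`, the series is unlikely to be stationary.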
8. What is the ARIMA model?
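ARIMA (Auto-Regressive Integrated Moving Average) combines 2 models, AR (Auto-Regressive) & MA (Moving Average), with a differencing step (the I, for Integrated).
It has 3 hyperparameters: P (number of auto-regressive lags), d (order of differencing) & Q (number of moving-average lags), which respectively come from the AR, I & MA components. They are usually written in lowercase as p, d, q. The AR part models the current value as a function of the values in previous time periods. The MA part models the current value as a function of past forecast errors. The I part differences the series d times to make it stationary before the AR & MA parts are fitted.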
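The differencing step behind the d parameter can be illustrated with a short sketch. The function name `difference` is an assumption for this example; it only shows the idea of subtracting each value from the next one, not how any particular ARIMA library implements it.

```python
def difference(values, order=1):
    """Difference a series `order` times: y[t] = x[t] - x[t-1], repeated."""
    out = list(values)
    for _ in range(order):
        out = [b - a for a, b in zip(out, out[1:])]
    return out


linear = [2, 4, 6, 8, 10]      # linear trend
quadratic = [1, 4, 9, 16, 25]  # quadratic trend

once = difference(linear, order=1)      # constant after one difference
twice = difference(quadratic, order=2)  # constant after two differences
```

A linear trend disappears after one difference (d=1) and a quadratic trend after two (d=2). Each difference shortens the series by one value.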
9. How to find the value of P & Q for ARIMA?
We take the help of ACF (Auto Correlation Function) & PACF (Partial Auto Correlation Function) plots. As a rule of thumb, we check at which lag on the x-axis the plotted line drops to 0 on the y-axis (in practice, falls inside the confidence band around 0) for the 1st time.
From PACF(at y=0), get P
From ACF(at y=0), get Q
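As a rough illustration of reading a lag off the ACF, the sketch below computes the sample autocorrelation by hand and returns the first lag whose value falls inside an approximate 95% band of ±1.96/√n around 0. The function names and the toy alternating series are assumptions for this example. PACF is not computed here, and real projects usually rely on a statistics library for both plots.

```python
import math


def acf(values, max_lag):
    """Sample autocorrelation for lags 0..max_lag."""
    n = len(values)
    m = sum(values) / n
    dev = [v - m for v in values]
    denom = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        for k in range(max_lag + 1)
    ]


def first_lag_inside_band(correlations, n):
    """First lag > 0 whose autocorrelation lies inside +/- 1.96 / sqrt(n)."""
    band = 1.96 / math.sqrt(n)
    for lag, r in enumerate(correlations):
        if lag > 0 and abs(r) < band:
            return lag
    return None  # never dropped inside the band within max_lag


# Toy series (assumed for illustration): strictly alternating values.
series = [1.0, -1.0] * 10
correlations = acf(series, max_lag=15)
cutoff = first_lag_inside_band(correlations, len(series))
```

The same reading, applied to the PACF values, gives the candidate for P.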
10. What is the ADF (ADCF) test?
In statistics and econometrics, an augmented Dickey-Fuller test (ADF) tests the null hypothesis that a unit root is present in a time series sample. The alternative hypothesis is different depending on which version of the test is used but is usually stationarity or trend-stationarity. It is an augmented version of the Dickey-Fuller test for a larger and more complicated set of time series models.
The augmented Dickey-Fuller (ADF) statistic, used in the test, is a negative number. The more negative it is, the stronger the rejection of the hypothesis that there is a unit root at some level of confidence.
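The p-value (0<=p<=1) should be as low as possible; a value below the chosen significance level (commonly 0.05) lets us reject the null hypothesis. The test statistic should be lower (more negative) than the critical values at the chosen significance levels.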
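The decision rule can be written down as a small helper. This sketch does not run the ADF test itself; it only compares a test statistic against critical values. The function name `adf_decision` and the numbers passed to it are illustrative assumptions, not the output of a real test.

```python
def adf_decision(test_statistic, critical_values):
    """For each significance level, report whether the unit-root null is rejected.

    The null is rejected when the test statistic is more negative than
    the critical value for that level.
    """
    return {
        level: test_statistic < critical
        for level, critical in critical_values.items()
    }


# Illustrative inputs only (assumed, not computed from any data set).
example_critical_values = {"1%": -3.43, "5%": -2.86, "10%": -2.57}
decision = adf_decision(-3.1, example_critical_values)
```

Here the statistic of -3.1 is more negative than the 5% & 10% critical values but not the 1% value, so the null hypothesis would be rejected at the 5% & 10% levels only.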
11. What is Exponential Smoothing?
Exponential smoothing is a rule of thumb technique for smoothing time series data using the exponential window function. Whereas in the simple moving average the past observations are weighted equally, exponential functions are used to assign exponentially decreasing weights over time. It is an easily learned and easily applied procedure for making some determination based on prior assumptions by the user, such as seasonality. Exponential smoothing is often used for the analysis of time-series data.
The raw data sequence is often represented by $x_t$ beginning at time $t = 0$, and the output of the exponential smoothing algorithm is commonly written as $s_t$, which may be regarded as the best estimate of what the next value of $x$ will be. When the sequence of observations begins at time $t = 0$, the simplest form of exponential smoothing is given by the formulas:
$s_0 = x_0$
$s_t = \alpha x_t + (1 - \alpha) s_{t-1}, \quad t > 0$
where $\alpha$ is the smoothing factor, and $0 < \alpha < 1$.
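The two formulas translate almost directly into code. The sketch below is a minimal implementation of simple exponential smoothing; the function name and the sample values are assumptions for illustration.

```python
def exponential_smoothing(x, alpha):
    """Simple exponential smoothing: s[0] = x[0], s[t] = a*x[t] + (1-a)*s[t-1]."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must satisfy 0 < alpha < 1")
    if not x:
        return []
    s = [x[0]]  # s0 = x0
    for value in x[1:]:
        s.append(alpha * value + (1 - alpha) * s[-1])
    return s


smoothed = exponential_smoothing([10, 20, 30], alpha=0.5)
```

With alpha = 0.5 each smoothed value is the average of the new observation and the previous smoothed value. A larger alpha reacts faster to recent observations, a smaller alpha smooths more heavily.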
12. What is Exponential decay?
A quantity is subject to exponential decay if it decreases at a rate proportional to its current value. Symbolically, this process can be expressed by the following differential equation, where N is the quantity and λ (lambda) is a positive rate called the exponential decay constant:
$\frac{dN}{dt} = -\lambda N$
The solution to this equation is:
$N(t) = N_0 e^{-\lambda t}$
where N(t) is the quantity at time t, and $N_0 = N(0)$ is the initial quantity, i.e. the quantity at time t = 0.
Half-Life: the time required for the decaying quantity to fall to one-half of its initial value. It is denoted by $t_{1/2}$. The half-life can be written in terms of the decay constant as:
$t_{1/2} = \ln(2) / \lambda$
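A quick numerical check of the half-life formula, as a sketch. The function names and the example values (initial quantity 100, decay constant 0.1) are assumptions chosen for illustration.

```python
import math


def decay(n0, lam, t):
    """Quantity remaining at time t: N(t) = N0 * exp(-lambda * t)."""
    return n0 * math.exp(-lam * t)


def half_life(lam):
    """Time for the quantity to halve: ln(2) / lambda."""
    return math.log(2) / lam


t_half = half_life(0.1)
remaining = decay(100.0, 0.1, t_half)  # half of the initial 100
```

Evaluating the decay at one half-life leaves half of the initial quantity, and at two half-lives leaves a quarter.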
Thanks for reading my article. Please share your thoughts.