How to build Forecasting using ML/DL algorithms
Iona Ekonomi – Senior Solutions Architect
Introduction Forecasting Techniques
Amazon Forecasting
Amazon SageMaker
simplify what you need to know to deploy the Anaplan on AWS solution
The AWS ML Stack
Broadest and most complete set of Machine Learning capabilities
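[Diagram: the AWS ML stack in three layers. AI services include Amazon Fraud Detector and Contact Lens for Amazon Connect; ML services are Amazon SageMaker (Ground Truth, Model Monitor, SageMaker Studio IDE); ML frameworks & infrastructure include Deep Learning AMIs & Containers, GPUs & CPUs, Inferentia, and FPGA]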
1995 2000 2007 2010 2015 2019
Forecasting at Amazon.com
Using machine learning to solve complex forecasting problems
Use of Machine Learning
High price variability, slow-moving products, regional vs. national demand, new products, highly seasonal products
Traditional statistical methods; use of deep learning
15x improvement in accuracy
forecasting API
Amazon Forecast
The technology that powers the world’s largest ecommerce business
Get started with the console or API
Point Amazon Forecast to your data stored in Amazon Simple Storage Service (Amazon S3)
Automatically train your custom ML model
Let Amazon Forecast auto-select the best one for your data through AutoML
Generate accurate forecasts
Retrieve forecasts through the console or private API
Historical data: sales, call volume, inventory, resource demand
Related data: price, promotions, weather data, custom events
Item metadata: color, city, country, category, author, album name
Built-in dataset (holiday, weekends)
Amazon Forecast
forecasting API
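Load data, inspect data, identify features, select hyperparameters, train models using multiple algorithms, optimize models, select the most accurate model from multiple algorithms, host models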
Amazon Forecast
Behind the scenes
Fully managed by Amazon Forecast
Historical data: sales, call volume, inventory, resource demand
Related data: price, promotions, weather data, custom events
Item metadata: color, city, country, category, author, album name
Create dataset
Create predictor (train,
inference, metrics)
Create Forecast
Create Forecast export
Query Forecast
Amazon Forecast
Datasets used for forecasting
Target time-series dataset: use of historical data to predict future values. The primary variable to predict, with its historical values (demand, sales)
Item metadata (non-time-varying): use of related attributes and categorical data. Categorical data that provide more context about items (color, city, channel)
Related time-series dataset: use of known time-varying data specific to your business. Time-varying related features that may impact the target value (price, promotion, weather)
Amazon Forecast
• Custom model trained on your data.
• A forecast horizon: how far ahead you want to predict, also called the prediction length.
• The maximum forecast horizon is the lesser of 500 time steps or 1/3 of the TARGET_TIME_SERIES dataset length.
• Evaluation parameters: how to split a dataset into training and test datasets using backtesting.
• Then either choose the algorithm manually or set it to Auto, where Amazon Forecast tries all algorithms and chooses the best one.
• AutoML optimizes the average of the weighted P10, P50 and P90 quantile losses, and returns the algorithm with the lowest value.
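The horizon limit and the AutoML selection rule above can be sketched in a few lines of plain Python. This is an illustration only, not the service's internal code: the function names are hypothetical, the 1/3 limit is rounded down as an assumption, and the weighted quantile loss uses the common pinball-loss form normalized by the sum of absolute actuals.

```python
def max_forecast_horizon(series_length):
    """Lesser of 500 time steps or 1/3 of the series length (rounded down, an assumption)."""
    return min(500, series_length // 3)


def weighted_quantile_loss(actuals, predictions, tau):
    """Weighted quantile loss at quantile tau, with 0 < tau < 1."""
    pinball = sum(
        tau * max(y - q, 0.0) + (1.0 - tau) * max(q - y, 0.0)
        for y, q in zip(actuals, predictions)
    )
    return 2.0 * pinball / sum(abs(y) for y in actuals)


def average_wql(actuals, quantile_forecasts):
    """Average loss over quantiles, e.g. {0.1: [...], 0.5: [...], 0.9: [...]}."""
    losses = [
        weighted_quantile_loss(actuals, preds, tau)
        for tau, preds in quantile_forecasts.items()
    ]
    return sum(losses) / len(losses)


def pick_best_algorithm(actuals, candidates):
    """Return the candidate name with the lowest average weighted quantile loss."""
    return min(candidates, key=lambda name: average_wql(actuals, candidates[name]))
```

Here `candidates` maps a hypothetical algorithm name to its P10/P50/P90 forecasts on a backtest window; the candidate with the lowest average loss is returned.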
• A forecast is your model deployed to production on the AWS cloud, fully managed by AWS and scaled to match your demand. All you need to do is call this endpoint with Query Forecast to get the results.
• Call the CreateForecast operation to create a forecast.
• During forecast creation, Amazon Forecast trains a model on the entire dataset before hosting the model and doing inference.
• This operation creates a forecast for every item (item_id) in the dataset group that was used to train the predictor.
• After a forecast is created, you can query the forecast or export it to your Amazon Simple Storage Service (Amazon S3) bucket.
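A minimal sketch of the create-then-query flow with the AWS SDK for Python (boto3), assuming configured AWS credentials and an existing predictor. The ARNs, names, and helper functions are hypothetical placeholders. Forecast creation is asynchronous, so a real script would wait until the forecast is active before querying it.

```python
def create_forecast(predictor_arn, forecast_name, quantiles=("0.1", "0.5", "0.9")):
    """Call CreateForecast for an already trained predictor and return the forecast ARN."""
    import boto3  # imported lazily so the parsing helper below works without AWS access

    client = boto3.client("forecast")
    response = client.create_forecast(
        ForecastName=forecast_name,
        PredictorArn=predictor_arn,
        ForecastTypes=list(quantiles),
    )
    return response["ForecastArn"]


def query_forecast(forecast_arn, item_id):
    """Call QueryForecast for one item_id and return the raw response."""
    import boto3

    client = boto3.client("forecastquery")
    return client.query_forecast(ForecastArn=forecast_arn, Filters={"item_id": item_id})


def predictions_by_quantile(response):
    """Turn a QueryForecast response into {quantile: [(timestamp, value), ...]}."""
    predictions = response["Forecast"]["Predictions"]
    return {
        quantile: [(point["Timestamp"], point["Value"]) for point in points]
        for quantile, points in predictions.items()
    }
```

Only `predictions_by_quantile` is pure Python; the other two functions make network calls and incur AWS charges.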
forecasting API
Visualize the distribution of forecasted values
View probabilistic forecasts at any quantile in the console
Retrieve forecasts through your private API
Export forecasts to .csv
Amazon Forecast
Handles tricky forecasting scenarios
Missing values
Cold start
(new product introduction)
Irregular seasonality
Product discontinuation
Highly spiky data
Sensitivity analysis
(future price change)
Amazon Forecast
Amazon Forecast Algorithms
ARIMA: auto-regressive integrated moving average
• The last step in moving from ARMA to ARIMA is a differencing step, called integration: ARIMA(p,d,q).
• So we do it in two stages:
• First apply differencing (order d)
• Then fit ARMA(p,q)
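The differencing stage can be illustrated with a small pure-Python sketch. The helper names are hypothetical, and fitting the ARMA(p,q) part is left out; the point is only that differencing of order d removes a polynomial trend and that the step can be undone ("integrated") afterwards.

```python
def difference(series, d=1):
    """Apply first-order differencing d times."""
    values = list(series)
    for _ in range(d):
        values = [later - earlier for earlier, later in zip(values, values[1:])]
    return values


def integrate(diffs, initial):
    """Undo one level of differencing, given the first value of the original series."""
    restored = [initial]
    for delta in diffs:
        restored.append(restored[-1] + delta)
    return restored
```

For a quadratic series such as 0, 1, 4, 9, 16, 25, one round of differencing leaves a linear trend and a second round leaves a constant, which is the kind of stationary input ARMA expects.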
Linear regression
• Linear regression attempts to model the relationship between two variables by fitting a linear equation to observed data.
• We can use linear regression in forecasting: y = mx + b.
• The slope of the line is m (the coefficient), and b is the intercept (the value of y when x = 0).
Capture the trend and seasonality
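As a small illustration of y = mx + b, the following sketch fits the slope and intercept by ordinary least squares and extrapolates one step ahead. The function names are hypothetical and the time index is simply 0, 1, 2, and so on.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = m*x + b; returns (m, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    b = mean_y - m * mean_x
    return m, b


def predict(m, b, x):
    """Evaluate the fitted line at x."""
    return m * x + b
```

Once m and b are known, forecasting a future period means evaluating the line at a time index beyond the observed range.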
Autocorrelation
• Sometimes called serial correlation.
• Find the correlation between the series and its past values to improve the forecast.
• Correlation between pairs of values at a certain lag.
• Lag-1 autocorrelation: Yt and Yt-1
• Lag-2 autocorrelation: Yt and Yt-2
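A short sketch of the lag-k sample autocorrelation, using the common estimator that divides the lagged covariance by the overall variance of the series. The function name is a placeholder.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation between Y_t and Y_(t-lag)."""
    n = len(series)
    mean = sum(series) / n
    variance_sum = sum((y - mean) ** 2 for y in series)
    covariance_sum = sum(
        (series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n)
    )
    return covariance_sum / variance_sum
```

A series that flips sign every step has a strongly negative lag-1 autocorrelation and a positive lag-2 autocorrelation.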
Autoregression (AR model), AR(p)
• Captures autocorrelation in a series in a regression-type model and uses it to improve short-term forecasts.
• The key concept is the order p.
• Only works for short-term forecasts.
Autoregression Moving Average (ARMA), ARMA(p,q)
• It requires stationarity: no trend or seasonality.
• So we can apply it in stages:
1. Capture the trend using regression.
2. Apply an AR model to capture autocorrelation and forecast the next error [Moving Average].
3. Combine the two to get the improved forecast.
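The staged approach above can be sketched as follows, under simplifying assumptions: the trend is a straight line fitted by least squares, the residuals are modelled with an AR(1) term only, and the moving-average term is omitted for brevity. All function names are hypothetical.

```python
def fit_line(xs, ys):
    """Least squares fit of y = m*x + b; returns (m, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    return m, mean_y - m * mean_x


def fit_ar1(residuals):
    """Least squares coefficient phi for r_t = phi * r_(t-1)."""
    numerator = sum(prev * cur for prev, cur in zip(residuals, residuals[1:]))
    denominator = sum(prev * prev for prev in residuals[:-1])
    return numerator / denominator if denominator else 0.0


def forecast_trend_plus_ar1(series, steps):
    """Stage 1: linear trend. Stage 2: AR(1) on residuals. Stage 3: add them."""
    times = list(range(len(series)))
    m, b = fit_line(times, series)
    residuals = [y - (m * t + b) for t, y in zip(times, series)]
    phi = fit_ar1(residuals)
    forecasts = []
    residual = residuals[-1]
    for step in range(1, steps + 1):
        residual = phi * residual  # AR(1) forecast of the next residual
        forecasts.append(m * (len(series) - 1 + step) + b + residual)
    return forecasts
```

Because the AR(1) residual forecast decays toward zero when |phi| < 1, the correction matters mostly for the first few steps, which matches the short-term nature of AR forecasts.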
ETS (ExponenTial Smoothing)
Error, Trend, Seasonality
Statistical algorithm that uses exponential smoothing.
Exponential smoothing forecasting: the prediction is a weighted sum of past observations, with weights that decrease exponentially for older observations.
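Simple exponential smoothing shows the idea in its smallest form. In this sketch the smoothing constant `alpha` and the function names are illustrative choices; the level is initialised with the first observation, which is one common convention among several.

```python
def simple_exponential_smoothing(series, alpha):
    """Return the smoothed level after the last observation (the one-step forecast)."""
    level = series[0]
    for y in series[1:]:
        level = alpha * y + (1 - alpha) * level
    return level


def observation_weights(alpha, count):
    """Weights given to the most recent `count` observations, newest first."""
    return [alpha * (1 - alpha) ** age for age in range(count)]
```

With `alpha = 0.5` the newest observation gets weight 0.5, the one before it 0.25, then 0.125, and so on.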
Smoothing
Create an approximating function that attempts to capture important patterns in the data, while leaving out noise or other fine-scale structures and rapid phenomena.
Holt's exponential smoothing
Smoothing using differencing:
• Helps when the dataset has trends and seasonality.
• Differencing means taking the difference between two values of the series.
• Lag means how far apart these two values are. For example, lag = 1 means y(t) - y(t-1), which helps to remove trend.
• Lag-M differencing, y(t) - y(t-M), is useful for removing seasonality with M seasons.
Holt's exponential smoothing (double exponential smoothing):
• For series with trend but no seasonality.
• $F_{t+k} = L_t + k\,T_t$ (L is the level, T is the trend)
• $T_t = \beta\,(L_t - L_{t-1}) + (1 - \beta)\,T_{t-1}$
• $0 \le \beta \le 1$ controls how fast we update the trend
• $\beta$ is the trend constant
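A minimal sketch of Holt's method follows. The slide gives the trend update and the forecast equation; the level update used here, L_t = alpha * Y_t + (1 - alpha) * (L_{t-1} + T_{t-1}), is the standard companion equation and is an assumption on my part, as are the simple initial values and the function name.

```python
def holt_forecast(series, alpha, beta, k):
    """Double exponential smoothing; returns the k-step-ahead forecast F_(t+k)."""
    level = series[0]
    trend = series[1] - series[0]  # simple initial trend estimate
    for y in series[1:]:
        previous_level = level
        level = alpha * y + (1 - alpha) * (previous_level + trend)
        trend = beta * (level - previous_level) + (1 - beta) * trend
    return level + k * trend
```

On a perfectly linear series the level and trend track the data exactly, so the forecast continues the line.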
Winters' exponential smoothing (triple exponential smoothing)
• For series with trend and seasonality.
• $F_{t+k} = L_t + k\,T_t + S_{t+k-M}$
• $S_t = \gamma\,(Y_t - L_t) + (1 - \gamma)\,S_{t-M}$ (Y is the observed value, M is the number of seasons)
• $0 \le \gamma \le 1$ controls how fast we update the seasonality
• Forecast = most recent estimated level + trend + seasonality.
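The additive form of triple exponential smoothing can be sketched as below. The initialisation from the first two seasons is one simple choice among many, the function name is hypothetical, and the series is assumed to contain at least two full seasons of length M.

```python
def holt_winters_additive(series, m, alpha, beta, gamma, k):
    """Additive triple exponential smoothing; returns the forecast k steps ahead (k <= m)."""
    # Initial level, trend, and seasonal terms from the first two seasons.
    level = sum(series[:m]) / m
    trend = (sum(series[m:2 * m]) - sum(series[:m])) / (m * m)
    seasonals = [series[i] - level for i in range(m)]

    for t in range(m, len(series)):
        y = series[t]
        previous_seasonal = seasonals[t % m]  # S_(t-M)
        previous_level = level
        level = alpha * (y - previous_seasonal) + (1 - alpha) * (previous_level + trend)
        trend = beta * (level - previous_level) + (1 - beta) * trend
        seasonals[t % m] = gamma * (y - level) + (1 - gamma) * previous_seasonal

    last_index = len(series) - 1
    return level + k * trend + seasonals[(last_index + k) % m]
```

The forecast is the most recent level, plus k times the trend, plus the latest seasonal term for the season that the target period falls in.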
NPTS: non-parametric time series
[Figure: A Typical Time Series in Large Inventories]
Parametric
• Fixed number of parameters.
• Computationally faster, but makes stronger assumptions about the data.
• A common example of a parametric algorithm is linear regression.
• We try to find y = mx + b, then we throw the data away and use the equation in the future to find y.
Non-parametric
• Uses a flexible number of parameters, which grows as it learns from more data.
• Computationally slower.
• Examples are k-nearest neighbours and kernel regression.
• We keep the data and always come back and consult it to find the right prediction.
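To make the non-parametric side concrete, here is a small k-nearest-neighbours regression sketch. It is a generic illustration of "keep the data and consult it at prediction time", not the NPTS algorithm itself; the function name and the one-dimensional inputs are assumptions.

```python
def knn_predict(train_x, train_y, query, k=3):
    """Average the targets of the k training points closest to the query."""
    by_distance = sorted(zip(train_x, train_y), key=lambda pair: abs(pair[0] - query))
    nearest = by_distance[:k]
    return sum(y for _, y in nearest) / len(nearest)
```

Unlike a fitted line, nothing is summarised into coefficients: every prediction scans the stored training data, which is why the approach is slower but makes fewer assumptions.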
Prophet
Additive regression model with Gaussian likelihood
Can find trend, seasonality, cyclical, and holiday effects
• Structural time series model developed by Facebook; it was open-sourced in Feb 2017.
• Uses a very flexible regression model (somewhat like curve-fitting).
• Builds the model by finding a best smooth line, which is the sum of:
• Overall growth trend
• Yearly seasonality
• Weekly seasonality
• Holiday effects (Christmas, New Year, etc.)
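The sum of components listed above is often written as a single additive equation. The notation below is a sketch: g is the growth trend, the two s terms are the yearly and weekly seasonalities, h is the holiday effect, and the error term is assumed Gaussian, in line with the "Gaussian likelihood" mentioned on the slide.

```latex
y(t) = g(t) + s_{\text{yearly}}(t) + s_{\text{weekly}}(t) + h(t) + \varepsilon_t,
\qquad \varepsilon_t \sim \mathcal{N}(0, \sigma^2)
```

Each component is fitted as a smooth function of time, so the forecast at a future date is the sum of the extrapolated components.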
DeepAR+
Supervised learning algorithm based on autoregressive RNNs that can produce both point and probabilistic forecasts.
Based on LSTM networks
Global model that can use related time series and attributes
[Diagram: forecast built from autoregressive history and covariates]
Neural networks are good at leveraging long history to learn its influence on future points, and they can
handle high-dimensionality in the inputs (that is, they can handle many related-items).
Missing data
• The DeepAR+ forecasting algorithm has been used internally in Amazon for mission-critical decisions.
• Classical forecasting techniques such as ARIMA and ETS fit one model to an individual time series. However, in many situations, a set of related time series have been or can be collected.
• DeepAR+ can train a model over such a set of related time series for additional insights and increased predictive power.
• Requires minimal feature engineering and can produce forecasts that are either point (amount sold was X) or probabilistic (amount sold was between X and Y with Z probability).
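The difference between a point forecast and a probabilistic forecast can be illustrated with sampled outcomes. This sketch assumes the model's output for one future time step is available as a list of sample values; that list and the function names are hypothetical and do not reflect the actual DeepAR+ interface.

```python
import math


def empirical_quantile(samples, q):
    """Nearest-rank empirical quantile of the samples, for 0 < q <= 1."""
    ordered = sorted(samples)
    rank = math.ceil(round(q * len(ordered), 9))  # rounding guards against float noise
    index = min(len(ordered) - 1, max(0, rank - 1))
    return ordered[index]


def summarize_forecast(samples):
    """Point forecast (mean) plus P10, P50, and P90 quantiles."""
    return {
        "mean": sum(samples) / len(samples),
        "p10": empirical_quantile(samples, 0.1),
        "p50": empirical_quantile(samples, 0.5),
        "p90": empirical_quantile(samples, 0.9),
    }
```

The mean is the "amount sold was X" answer, while the P10 to P90 range gives an "amount sold was between X and Y with 80% probability" answer.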
Feature Engineering, Custom Feature
Amazon SageMaker
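1. Build: fast & accurate data labeling; built-in, high-performance algorithms & notebooks
2. Train: one-click training and tuning; model optimization
3. Deploy: one-click deployment; fully managed hosting with auto-scaling and elastic inference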
Build, train, and deploy ML models at scale
Amazon SageMaker Notebooks
• Jupyter notebooks
• Support JupyterLab
• Multiple built-in kernels
• Install external libraries
• Install external kernels
• Integrate with Git
• Sample notebooks
Launch training: one click in the console, or using the API/SDK
Amazon SageMaker training
• Amazon SageMaker built-in algorithms: 17 built-in high-performance algorithms
• Supported frameworks: custom script on supported frameworks (Apache MXNet, TensorFlow, Scikit-learn, PyTorch, Chainer)
• BYO algorithm and framework: Docker containers with your own algorithms and frameworks
• AWS Marketplace algorithms: third-party algorithms and models
Amazon SageMaker training
AWS Europe (Milan) Region
On April 28th, AWS expanded its global footprint with the opening of the AWS Infrastructure Region in Italy. The new
AWS Europe (Milan) Region brings advanced cloud technologies that enable opportunities for innovation,
entrepreneurship, and digital transformation. For additional information about services and characteristics of an AWS
Region, you can check the website: aws.amazon.com/local/italy/milan/
  • 1. 1© 2020 Amazon Web Services, Inc. or its affiliates. All rights reserved | How to build Forecasting using ML/DL algorithms Iona Ekonomi – Senior Solutions Architect
  • 2. Agenda: Introduction; Forecasting Techniques; Amazon Forecasting; Amazon SageMaker; simplify what you need to know to deploy the Anaplan on AWS solution quickly
  • 3. The AWS ML Stack: broadest and most complete set of Machine Learning capabilities. AI SERVICES (vision, speech, text, search, chatbots, personalization, forecasting, fraud, development, contact centers): Amazon Rekognition, Amazon Polly, Amazon Transcribe (+Medical), Amazon Comprehend (+Medical), Amazon Translate, Amazon Lex, Amazon Personalize, Amazon Forecast, Amazon Fraud Detector, Amazon CodeGuru, Amazon Textract, Amazon Kendra, Contact Lens for Amazon Connect. ML SERVICES (Amazon SageMaker): Ground Truth, Augmented AI, SageMaker Neo, built-in algorithms, SageMaker Notebooks, SageMaker Experiments, model tuning, SageMaker Debugger, SageMaker Autopilot, model hosting, SageMaker Model Monitor, SageMaker Studio IDE. ML FRAMEWORKS & INFRASTRUCTURE: Deep Learning AMIs & Containers, GPUs & CPUs, Elastic Inference, Inferentia, FPGA.
  • 4. Forecasting at Amazon.com: using machine learning to solve complex forecasting problems. Timeline: 1995, 2000, 2007, 2010, 2015, 2019. Traditional statistical methods; use of machine learning; use of deep learning. High price variability, slow-moving products, regional vs. national demand, new products, highly seasonal products. 15x improvement in accuracy.
  • 6. Amazon Forecast (private forecasting API): the technology that powers the world's largest ecommerce business. Get started with the console or API. Point Amazon Forecast to your data stored in Amazon Simple Storage Service (Amazon S3). Automatically train your custom ML model. Let Amazon Forecast auto-select the best one for your data through AutoML. Generate accurate forecasts. Retrieve forecasts through the console or private API. Historical data: sales, call volume, inventory, resource demand. Related data: price, promotions, weather data, custom events. Item metadata: color, city, country, category, author, album name. Built-in dataset (holiday, weekends).
  • 7. Amazon Forecast behind the scenes (customized forecasting API), fully managed by Amazon Forecast: load data, inspect data, identify features, select hyperparameters, train models using multiple algorithms, optimize models, select the most accurate model from multiple algorithms, host models. Historical data: sales, call volume, inventory, resource demand. Related data: price, promotions, weather data, custom events. Item metadata: color, city, country, category, author, album name. Create dataset; create predictor (train, inference, metrics); create forecast; create forecast export; query forecast.
  • 8. Datasets used for forecasting. Target time-series dataset: use of historical data to predict future values; the primary variable to predict with its historical values (demand, sales). Item metadata (non-time-varying): use of related attributes and categorical data; categorical data that provide more context about items (color, city, channel). Related time-series dataset: use of known time-varying data specific to your business; time-varying related features that may impact the target value (price, promotion, weather).
  • 9. Predictor • Custom model trained on your data. • A forecast horizon: how far ahead you want to predict, also called the prediction length. • The maximum forecast horizon is the lesser of 500 time steps or 1/3 of the TARGET_TIME_SERIES dataset length. • Evaluation parameters: how to split a dataset into training and test datasets using backtesting. • Then either choose the algorithm manually or set it to Auto, where Amazon Forecast tries all algorithms and chooses the best one. • AutoML optimizes the average of the weighted P10, P50 and P90 quantile losses, and returns the algorithm with the lowest value.
  • 10. Forecast (private forecasting API) • A forecast is your model deployed to production on the AWS cloud, fully managed by AWS and scaled to match your demand. All you need to do is call this endpoint with Query Forecast to get the results. • Call the CreateForecast operation to create a forecast. • During forecast creation, Amazon Forecast trains a model on the entire dataset before hosting the model and doing inference. • This operation creates a forecast for every item (item_id) in the dataset group that was used to train the predictor. • After a forecast is created, you can query the forecast or export it to your Amazon Simple Storage Service (Amazon S3) bucket.
  • 11. Amazon Forecast: visualize the distribution of forecasted values. View probabilistic forecasts at any quantile in the console. Retrieve forecasts through your private API. Export forecasts to .csv.
  • 12. Amazon Forecast handles tricky forecasting scenarios: missing values, cold start (new product introduction), irregular seasonality, product discontinuation, highly spiky data, sensitivity analysis (future price change).
  • 13. Amazon Forecast Algorithms
  • 14. AMAZON FORECAST - ARIMA: auto-regressive integrated moving average • The last step in moving from ARMA to ARIMA is a differencing step, called integration: ARIMA(p,d,q). • So we do it in two stages: • First apply differencing (order d) • Then fit ARMA(p,q)
  • 15. AMAZON FORECAST - ARIMA: Linear regression • Linear regression attempts to model the relationship between two variables by fitting a linear equation to observed data. • We can use linear regression in forecasting: y = mx + b. • The slope of the line is m (the coefficient), and b is the intercept (the value of y when x = 0). Capture the trend and seasonality
  • 16. AMAZON FORECAST - ARIMA: Autocorrelation • Sometimes called serial correlation. • Find the correlation between the series and its past values to improve the forecast. • Correlation between pairs of values at a certain lag. • Lag-1 autocorrelation: Yt and Yt-1 • Lag-2 autocorrelation: Yt and Yt-2
  • 17. AMAZON FORECAST - ARIMA: Autoregression (AR model), AR(p) • Captures autocorrelation in a series in a regression-type model and uses it to improve short-term forecasts. • The key concept is the order p. • Only works for short-term forecasts. Autoregression Moving Average (ARMA), ARMA(p,q) • It requires stationarity: no trend or seasonality. • So we can apply it in stages: 1. Capture the trend using regression. 2. Apply an AR model to capture autocorrelation and forecast the next error [Moving Average]. 3. Combine the two to get the improved forecast.
  • 18. AMAZON FORECAST - ETS: ETS (ExponenTial Smoothing); Error, Trend, Seasonality. Statistical algorithm that uses exponential smoothing. Exponential smoothing forecasting: the prediction is a weighted sum of past observations, with weights that decrease exponentially for older observations.
  • 19. AMAZON FORECAST - ETS: Smoothing. Create an approximating function that attempts to capture important patterns in the data, while leaving out noise or other fine-scale structures and rapid phenomena.
  • 20. AMAZON FORECAST – ETS: Holt's exponential smoothing. Smoothing using differencing: • Helps when the dataset has trends and seasonality. • Differencing means taking the difference between two values of the series. • Lag means how far apart these two values are. For example, lag = 1 means y(t) - y(t-1), which helps to remove trend. • Lag-M differencing, y(t) - y(t-M), is useful for removing seasonality with M seasons. Holt's exponential smoothing (double exponential smoothing): • For series with trend but no seasonality. • $F_{t+k} = L_t + k\,T_t$ (L is the level, T is the trend) • $T_t = \beta\,(L_t - L_{t-1}) + (1 - \beta)\,T_{t-1}$ • $0 \le \beta \le 1$ controls how fast we update the trend • $\beta$ is the trend constant
  • 21. AMAZON FORECAST – ETS: Winters' exponential smoothing (triple exponential smoothing) • For series with trend and seasonality. • $F_{t+k} = L_t + k\,T_t + S_{t+k-M}$ • $S_t = \gamma\,(Y_t - L_t) + (1 - \gamma)\,S_{t-M}$ (Y is the observed value, M is the number of seasons) • $0 \le \gamma \le 1$ controls how fast we update the seasonality • Forecast = most recent estimated level + trend + seasonality.
  • 22. AMAZON FORECAST – NPTS: non-parametric time series. [Figure: A Typical Time Series in Large Inventories; x-axis from Jan 2014 to Apr 2016, y-axis from 0 to 10]
  • 23. AMAZON FORECAST – NPTS. Parametric • Fixed number of parameters. • Computationally faster, but makes stronger assumptions about the data. • A common example of a parametric algorithm is linear regression. • We try to find y = mx + b, then we throw the data away and use the equation in the future to find y. Non-parametric • Uses a flexible number of parameters, which grows as it learns from more data. • Computationally slower. • Examples are k-nearest neighbours and kernel regression. • We keep the data and always come back and consult it to find the right prediction.
  • 24. AMAZON FORECAST – Prophet: additive regression model with Gaussian likelihood. Can find trend, seasonality, cyclical, and holiday effects.
  • 25. AMAZON FORECAST – Prophet • Structural time series model developed by Facebook; it was open-sourced in Feb 2017. • Uses a very flexible regression model (somewhat like curve-fitting). • Builds the model by finding a best smooth line, which is the sum of: • Overall growth trend • Yearly seasonality • Weekly seasonality • Holiday effects (Christmas, New Year, etc.)
  • 26. AMAZON FORECAST – DEEPAR+: supervised learning algorithm based on autoregressive RNNs that can produce both point and probabilistic forecasts. Based on LSTM networks. Global model that can use related time series and attributes.
  • 27. AMAZON FORECAST – DEEPAR+ [Diagram: forecast built from autoregressive history and covariates] Neural networks are good at leveraging long history to learn its influence on future points, and they can handle high-dimensionality in the inputs (that is, they can handle many related-items).
  • 28. AMAZON FORECAST – DEEPAR+ Missing data
  • 29. AMAZON FORECAST – DEEPAR+ • The DeepAR+ forecasting algorithm has been used internally in Amazon for mission-critical decisions. • Classical forecasting techniques such as ARIMA and ETS fit one model to an individual time series. However, in many situations, a set of related time series have been or can be collected. • DeepAR+ can train a model over such a set of related time series for additional insights and increased predictive power. • Requires minimal feature engineering and can produce forecasts that are either point (amount sold was X) or probabilistic (amount sold was between X and Y with Z probability).
  • 30. AMAZON FORECAST – DEEPAR+ Feature Engineering, Custom Feature
  • 31. Amazon SageMaker: build, train, and deploy ML models at scale. 1. Build: fast & accurate data labeling; built-in, high-performance algorithms & notebooks. 2. Train: one-click training and tuning; model optimization. 3. Deploy: one-click deployment; fully managed hosting with auto-scaling and elastic inference.
  • 32. Amazon SageMaker Notebooks • Jupyter notebooks • Support JupyterLab • Multiple built-in kernels • Install external libraries • Install external kernels • Integrate with Git • Sample notebooks
  • 33. Amazon SageMaker training. Launch training: one click in the console, or using the API/SDK.
  • 34. Amazon SageMaker training • Amazon SageMaker built-in algorithms: 17 built-in high-performance algorithms • Supported frameworks: custom script on supported frameworks (Apache MXNet, TensorFlow, Scikit-learn, PyTorch, Chainer) • BYO algorithm and framework: Docker containers with your own algorithms and frameworks • AWS Marketplace algorithms: third-party algorithms and models. In each case Amazon SageMaker handles orchestration of data and models.
  • 46. Thanks!