Coca cola demand forecast (models)
Moderators: EViews Gareth, EViews Moderator
-
- Posts: 24
- Joined: Tue Jun 13, 2017 12:04 pm
Coca cola demand forecast (models)
Hello,
I have a database of 84 different products (sku's from a Coca-Cola bottler). The database consists of the demand for each product from 01/01/2014 to 04/30/2017 day per day.
What types of models could you use to predict next day demand? I have used moving averages and exponential smoothing (although I do not know how to use exponential smoothing well). I also have some regressions with exogenous variables (for example, a dummy variable for the corresponding day of the week, or a variable such as the number of months to which the demand corresponds).
What types of model could you use? Could they leave some type schedules?
I have a programming that detects me which is the best model with the sample quasivariance, I have 15 models and the best so far is the moving average of order 6, but I would like to find a better one.
Thank you very much in advance,
I have a database of 84 different products (sku's from a Coca-Cola bottler). The database consists of the demand for each product from 01/01/2014 to 04/30/2017 day per day.
What types of models could you use to predict next day demand? I have used moving averages and exponential smoothing (although I do not know how to use exponential smoothing well). I also have some regressions with exogenous variables (for example, a dummy variable for the corresponding day of the week, or a variable such as the number of months to which the demand corresponds).
What types of model could you use? Could they leave some type schedules?
I have a programming that detects me which is the best model with the sample quasivariance, I have 15 models and the best so far is the moving average of order 6, but I would like to find a better one.
Thank you very much in advance,
-
- Non-normality and collinearity are NOT problems!
- Posts: 3775
- Joined: Wed Sep 17, 2008 2:25 pm
Re: Coca cola demand forecast (models)
Is this data you can post?
-
- Posts: 24
- Joined: Tue Jun 13, 2017 12:04 pm
Re: Coca cola demand forecast (models)
startz wrote:Is this data you can post?
- Attachments
-
- para pregunta del foro.WF1
- (786.16 KiB) Downloaded 282 times
-
- Posts: 24
- Joined: Tue Jun 13, 2017 12:04 pm
Re: Coca cola demand forecast (models)
karakilamaravilla wrote:startz wrote:Is this data you can post?
salida_del_mes : Corresponds to the day of the month that there was sale (it is different from 30 or 31 always smaller since it is not sold on Sundays or holidays).
Lunes : Corresponds 1 if it is Monday, 0 if it is another day
Martes : Corresponds 1 if it is Tuesday, 0 if it is another day
Miercoles : Corresponds 1 if it is Wednesday, 0 if it is another day
Jueves : Corresponds 1 if it is Thursday, 0 if it is another day
Viernes : Corresponds 1 if it is Friday, 0 if it is another day
sku{!i} i=1 to 84: Corresponds To the demand of the different products
corr: 1 1025
Thanks
-
- Non-normality and collinearity are NOT problems!
- Posts: 3775
- Joined: Wed Sep 17, 2008 2:25 pm
Re: Coca cola demand forecast (models)
I messed around a little with sku1, just as an example.
You can get some improvement in forecasting by including a lagged value of the sku and also a time trend. I am pretty sure you could also get improvement by including month dummies, but you'd need to create a dated workfile for that.
You can get some improvement in forecasting by including a lagged value of the sku and also a time trend. I am pretty sure you could also get improvement by including month dummies, but you'd need to create a dated workfile for that.
-
- Posts: 24
- Joined: Tue Jun 13, 2017 12:04 pm
Re: Coca cola demand forecast (models)
startz wrote:I messed around a little with sku1, just as an example.
You can get some improvement in forecasting by including a lagged value of the sku and also a time trend. I am pretty sure you could also get improvement by including month dummies, but you'd need to create a dated workfile for that.
Thank you startz,
In the database there is the series month:
Mes : month in the year
I have lost the day that corresponds to each "corr", if I get it you could help me to learn to work with dates?
One important thing is that each sku has a different behavior, there are many who have similar behavior, but there are some that escape those behaviors.
What I do is to evaluate for each sku many models and see which is the best for each sku.
-
- Non-normality and collinearity are NOT problems!
- Posts: 3775
- Joined: Wed Sep 17, 2008 2:25 pm
Re: Coca cola demand forecast (models)
Since you already have mes, maybe you don't need to change to a dated workfile. Try adding
to your regressions. It will probably make a lot of difference for some skus and not for others.
There is no single definition of "best." You need to ask what purpose the forecast will be used for. (You should also think about whether the forecasts are for one day ahead versus longer.)
However, the EViews Help system does have an extensive discussion of several of the commonly used forecast evaluation criteria.
Code: Select all
@expand(mes,@droplast)
to your regressions. It will probably make a lot of difference for some skus and not for others.
There is no single definition of "best." You need to ask what purpose the forecast will be used for. (You should also think about whether the forecasts are for one day ahead versus longer.)
However, the EViews Help system does have an extensive discussion of several of the commonly used forecast evaluation criteria.
-
- Non-normality and collinearity are NOT problems!
- Posts: 3775
- Joined: Wed Sep 17, 2008 2:25 pm
Re: Coca cola demand forecast (models)
By the way, this is an interesting example. Perhaps it's something I could use in my undergraduate econometrics class. If you wouldn't mind, can you say a little more about where the data is from and what the objective of the forecast is?
-
- Posts: 24
- Joined: Tue Jun 13, 2017 12:04 pm
Re: Coca cola demand forecast (models)
startz wrote:By the way, this is an interesting example. Perhaps it's something I could use in my undergraduate econometrics class. If you wouldn't mind, can you say a little more about where the data is from and what the objective of the forecast is?
The data come from the sales made in the region of Valparaíso Chile.
The interest is not in finally predicting the next day's demand.
The objective is as follows: You have a matrix (that I built) ("_dias_piso") which delivers for the period of 2017 day by day the number of days that the current stock can support the sales made in the future, ( This matrix is constructed with the stock series of each sku and accumulated demand). Then the matrix "_dias_piso" are the actual days in which the stock of a certain day can withstand the future demands.
I constructed matrices for each sku and each day of a forecast ("dias_piso") using the series of sku {! I} and different models, I compare it with the sample quasivariance.
I am not a programmer but the code I can do is something like this:
smpl @all
matrix(114, 84) _dias_piso
for !j = 1 to 84
for !i = 1 to 114
%demanda_acum = @str(!i) + " " + @str(114)
series pisos = @cumsum(demandasku{!j}, %demanda_acum) <= @elem(stocksku{!j}, !i)
_dias_piso(!i, !j) = @sum(pisos)
next
next
for !j= 2 to 6
for !i =1 to 84
group _SKU
_SKU.add sku{!i}
group _STOCKSKU
_STOCKSKU.add STOCKSKU{!i}
for !k=7 to 12
group _Modelo{!k}
equation Reg_Simple_eq1_sku{!i}.ls sku{!i} c corr
forecast sku{!i}_7
equation Reg_Simple_eq_sku{!i}.ls sku{!i} c corr corr^2 @expand(mes, @droplast) lunes miercoles viernes
forecast sku{!i}_8
equation AR1_eq_sku{!i}.ls sku{!i} c corr sku{!i}(-1)
forecast sku{!i}_9
equation AR2_eq_sku{!i}.ls sku{!i} c corr sku{!i}(-1) sku{!i}(-2)
forecast sku{!i}_10
equation AR3_eq_sku{!i}.ls sku{!i} c corr sku{!i}(-1) sku{!i}(-2) sku{!i}(-3) @expand(mes, @droplast) lunes martes miercoles jueves viernes
forecast sku{!i}_11
equation Salida_eq_sku{!i}.ls sku{!i} c @expand(mes, @droplast) lunes martes miercoles jueves viernes
forecast sku{!i}_12
_Modelo{!k}.add sku{!i}_{!k}
next
for !k=17 to 17
group _modelo{!k}
equation xxxx{!i}.ls sku{!i} corr corr^2 corr^3 @expand(mes, @droplast)
forecast sku{!i}_17
_modelo{!k}.add sku{!i}_{!k}
next
group _Modelo{!j}
series Moving_Average_{!j}sku{!i} =( @movsum(sku{!i}, {!j}) ) / {!j}
_Modelo{!j}.add Moving_Average_{!j}sku{!i}
for !l=13 to 16
group _Modelo{!l}
smooth(s) sku{!i} SE13_sku{!i} [4]
smooth(d) sku{!i} SE14_sku{!i} [4]
smooth(n) sku{!i} SE15_sku{!i} [4]
smooth(a) sku{!i} SE16_sku{!i} [4]
_Modelo{!l}.add SE{!l}_sku{!i}
next
next
next
'-----------------------------------------------------------------------------------------------------------------------------------------------------
for !m=2 to 17
matrix(97, 84) _matrix!m
for !c =1 to 84
for !r =1 to 97
_matrix!m(!r, !c) = _stocksku (!c) (!r) / _modelo!m (!c) (!r + 911)
next
next
next
'-----------------------------------------------------------------------------------------------------------------------------------------------------
matrix(97, 84) _diaspiso
for !c=1 to 84
for !r=1 to 97
_diaspiso(!r, !c) = _dias_piso(!r, !c)
next
next
'-----------------------------------------------------------------------------------------------------------------------------------------------------
for !m=2 to 17
matrix(97,84) _error!m
_error!m = _diaspiso - _matrix!m
next
'----------------------------------------------------------------------------------------------------------------------------------------------------
for !m=2 to 17
matrix _t!m= @transpose(@cmean(_error!m))
matrix(97, 84) _ep!m
for !c=1 to 84
for !r=1 to 97
_ep!m(!r,!c) = _t!m(1, !c)
next
next
next
'--------------------------------------------------------------------------------------------------------------------------------------------------
for !m=2 to 17
matrix(97, 84) _x!m
matrix(97, 84) _z!m
_x!m = _error!m - _ep!m
for !c=1 to 84
for !r=1 to 97
matrix _z!m(!r, !c) = _x!m(!r, !c) * _x!m(!r, !c)
next
next
next
for !m=2 to 17
matrix(1, 84) _w!m
matrix(16, 84) _evaluador
matrix _w!m = @transpose(@csum(_z!m)/96)
for !c=1 to 84
_evaluador(!m-1, !c) = _w!m(1, !c)
next
next
matrix __best_model = @cimin(_evaluador)
matrix __resultados = @sort(@uniquevals(__best_model),"a")
matrix(__resultados.@rows, 2) __resultados
for !i = 1 to __resultados.@rows
!iden = 0
for !c = 1 to 84
if __best_model(!c, 1) = __resultados(!i, 1) then
!iden = 1 + !iden
endif
next
__resultados(!i, 2) = !iden
next
-
- Posts: 24
- Joined: Tue Jun 13, 2017 12:04 pm
Re: Coca cola demand forecast (models)
startz wrote:By the way, this is an interesting example. Perhaps it's something I could use in my undergraduate econometrics class. If you wouldn't mind, can you say a little more about where the data is from and what the objective of the forecast is?
This attachment also has the series:
1. Stock of each sku per day (2017)
-
- Posts: 24
- Joined: Tue Jun 13, 2017 12:04 pm
Re: Coca cola demand forecast (models)
karakilamaravilla wrote:startz wrote:By the way, this is an interesting example. Perhaps it's something I could use in my undergraduate econometrics class. If you wouldn't mind, can you say a little more about where the data is from and what the objective of the forecast is?
This attachment also has the series:
1. Stock of each sku per day (2017)
- Attachments
-
- Días Piso Reales.xlsx
- (140.57 KiB) Downloaded 278 times
-
- Fe ddaethom, fe welon, fe amcangyfrifon
- Posts: 13319
- Joined: Tue Sep 16, 2008 5:38 pm
Re: Coca cola demand forecast (models)
Before delving too deeply into this, I'll point out a couple of things.
First you might want to structure your workfile as a dated workfile. Going on the information you gave (it starts in 2014, the series "mes" is month, and "salida_del_mes" is day of the month), you can use the following code to turn your workfile dated:
Note that this then contradicts your first post (where you said the data ends 4/30/2017 - it actually ends 5/17/2017).
It also creates an irregular workfile, since, I presume, some days were holidays etc...
You might want to check the data in the workfile though. A quick look at, say, SKU14 is weird. There are lots of 1s and 0s, and also lots of negative numbers (negative demand?). The data don't appear to match up with the data in the later Excel file you posted.
First you might want to structure your workfile as a dated workfile. Going on the information you gave (it starts in 2014, the series "mes" is month, and "salida_del_mes" is day of the month), you can use the following code to turn your workfile dated:
Code: Select all
wfopen "para pregunta del foro.wf1"
series year = 2014
smpl @first+1 @last
year = @recode(mes<mes(-1), year(-1)+1, year(-1))
smpl @all
series date = @dateval(@str(year)+"/"+@str(mes)+"/"+@str(salida_del_mes), "YYYY/MM/DD")
pagestruct @date(date)
Note that this then contradicts your first post (where you said the data ends 4/30/2017 - it actually ends 5/17/2017).
It also creates an irregular workfile, since, I presume, some days were holidays etc...
You might want to check the data in the workfile though. A quick look at, say, SKU14 is weird. There are lots of 1s and 0s, and also lots of negative numbers (negative demand?). The data don't appear to match up with the data in the later Excel file you posted.
Follow us on Twitter @IHSEViews
-
- Non-normality and collinearity are NOT problems!
- Posts: 3775
- Joined: Wed Sep 17, 2008 2:25 pm
Re: Coca cola demand forecast (models)
One small addition to Gareth's points. I think the "day of the month" variable" is just a sequence number within the month. In other words, "1" might sometimes be the first observation but the second day of the month. Since you know the day of the week, you could probably manually insert dates, although I imagine it would take an hour or so.
-
- Posts: 24
- Joined: Tue Jun 13, 2017 12:04 pm
Re: Coca cola demand forecast (models)
EViews Gareth wrote:Before delving too deeply into this, I'll point out a couple of things.
First you might want to structure your workfile as a dated workfile. Going on the information you gave (it starts in 2014, the series "mes" is month, and "salida_del_mes" is day of the month), you can use the following code to turn your workfile dated:Code: Select all
wfopen "para pregunta del foro.wf1"
series year = 2014
smpl @first+1 @last
year = @recode(mes<mes(-1), year(-1)+1, year(-1))
smpl @all
series date = @dateval(@str(year)+"/"+@str(mes)+"/"+@str(salida_del_mes), "YYYY/MM/DD")
pagestruct @date(date)
Note that this then contradicts your first post (where you said the data ends 4/30/2017 - it actually ends 5/17/2017).
It also creates an irregular workfile, since, I presume, some days were holidays etc...
You might want to check the data in the workfile though. A quick look at, say, SKU14 is weird. There are lots of 1s and 0s, and also lots of negative numbers (negative demand?). The data don't appear to match up with the data in the later Excel file you posted.
Dear,
The database is made up from the dates 02/01/2014 to 05/20/2017. I left attached the file containing the dates. There are days that are not in the database because they are Sundays or holidays.
Salida_del_mes is not the day of the month, but rather is the correlative of the month corresponding to the days that exist in the database.
Responding to Gareth, and going deeper into the database, demand is composed of two things, the first one: sales actually made, and the second: the quantification of unsatisfied demand. Anyway the data that interests is the one that is in the database, called "demand".
Another thing, is that negative demand is given as they correspond to credit notes, which is not relevant to the analysis, since only occurs in a few SKU's.
I will also share the description of each SKU and the category of products to which it refers, in the Coca Cola business, we have different formats for drinks, different flavors, and some beverages (SKU's) are returnable (Ref. Pet bottles and bottles Of Glass: that have demand to the low) and some are disposable (PET, that have demand to the rise).
The objective of the estimation is not to have a better estimate, it is to minimize the quasivariation. That is observed in my calculations in the matrix _evaluador, where you have the quasivariaza of what I want to calculate for each SKU and for each model that is proposed.
A very interesting analysis would be to be able to program something that can minimize for each SKU the quasivariance sample with unique models for each SKU. At the moment I evaluate some models and compare which is the one that is most successful in the matrix _results.
If you have any idea how to improve these models, it would be very useful. I share again my code that I managed to generate with the help of this forum.
Code: Select all
smpl @all
matrix(114, 84) _dias_piso
for !j = 1 to 84
for !i = 1 to 114
%demanda_acum = @str(!i) + " " + @str(114)
series pisos = @cumsum(demandasku{!j}, %demanda_acum) <= @elem(stocksku{!j}, !i)
_dias_piso(!i, !j) = @sum(pisos)
next
next
for !i =1 to 84
group _SKU
_SKU.add sku{!i}
group _STOCKSKU
_STOCKSKU.add STOCKSKU{!i}
group _Modelo1
series Moving_Average_6sku{!i} =( @movsum(sku{!i}, 6) ) / 6
_Modelo1.add Moving_Average_6sku{!i}
for !l=2 to 5
group _Modelo{!l}
smooth(s) sku{!i} SE2_sku{!i}
smooth(d) sku{!i} SE3_sku{!i}
smooth(n) sku{!i} SE4_sku{!i}
smooth(a) sku{!i} SE5_sku{!i}
_Modelo{!l}.add SE{!l}_sku{!i}
next
for !k=6 to 7
group _Modelo{!k}
equation Reg1_sku{!i}.ls sku{!i} c corr corr^2 corr^3 @expand(mes, @droplast) SE4_sku{!i}
forecast sku{!i}_6
equation Reg2_sku{!i}.ls sku{!i} c corr corr^2 corr^3 sku{!i}(-1) sku{!i}(-2) sku{!i}(-3) @expand(mes, @droplast) lunes martes miercoles jueves viernes
forecast sku{!i}_7
_Modelo{!k}.add sku{!i}_{!k}
next
next
'----------------------------------------------------------------------------------------------------------------------------------------------------
for !m=1 to 7
matrix(97, 84) _matrix!m
matrix(97, 84) _diaspiso
matrix(97,84) _error!m
matrix(97, 84) _ep!m
matrix(97, 84) _x!m
matrix(97, 84) _z!m
matrix(1, 84) _w!m
matrix(7, 84) _evaluador
for !c =1 to 84
for !r =1 to 97
_matrix!m(!r, !c) = _stocksku (!c) (!r) / _modelo!m (!c) (!r + 911)
_diaspiso(!r, !c) = _dias_piso(!r, !c)
next
next
_error!m = _diaspiso - _matrix!m
matrix _t!m= @transpose(@cmean(_error!m))
for !c=1 to 84
for !r=1 to 97
_ep!m(!r,!c) = _t!m(1, !c)
_x!m = _error!m - _ep!m
matrix _z!m(!r, !c) = _x!m(!r, !c) * _x!m(!r, !c)
matrix _w!m = @transpose(@csum(_z!m)/96)
next
_evaluador(!m, !c) = _w!m(1, !c)
next
next
matrix __best_model = @cimin(_evaluador)
matrix __resultados = @sort(@uniquevals(__best_model),"a")
matrix(__resultados.@rows, 2) __resultados
for !i = 1 to __resultados.@rows
!iden = 0
for !c = 1 to 84
if __best_model(!c, 1) = __resultados(!i, 1) then
!iden = 1 + !iden
endif
next
__resultados(!i, 2) = !iden
next
- Attachments
-
- SKU'S DESCRIPCIÓN.xlsx
- (11.61 KiB) Downloaded 262 times
-
- DATE.xlsx
- (136.09 KiB) Downloaded 263 times
Return to “Econometric Discussions”
Who is online
Users browsing this forum: No registered users and 28 guests