Coca cola demand forecast (models)

karakilamaravilla · Postby **karakilamaravilla** » Sat Jun 17, 2017 10:09 am

Hello,

I have a database of 84 different products (sku's from a Coca-Cola bottler). The database consists of the demand for each product from 01/01/2014 to 04/30/2017 day per day.

What types of models could you use to predict next day demand? I have used moving averages and exponential smoothing (although I do not know how to use exponential smoothing well). I also have some regressions with exogenous variables (for example, a dummy variable for the corresponding day of the week, or a variable such as the number of months to which the demand corresponds).

What types of model could you use? Could they leave some type schedules?
I have a programming that detects me which is the best model with the sample quasivariance, I have 15 models and the best so far is the moving average of order 6, but I would like to find a better one.

Thank you very much in advance,

startz · Postby **startz** » Sat Jun 17, 2017 11:53 am

Is this data you can post?

karakilamaravilla · Postby **karakilamaravilla** » Sat Jun 17, 2017 12:58 pm

startz wrote:Is this data you can post?

karakilamaravilla · Postby **karakilamaravilla** » Sat Jun 17, 2017 1:07 pm

karakilamaravilla wrote:
startz wrote:Is this data you can post?

salida_del_mes : Corresponds to the day of the month that there was sale (it is different from 30 or 31 always smaller since it is not sold on Sundays or holidays).

Lunes : Corresponds 1 if it is Monday, 0 if it is another day
Martes : Corresponds 1 if it is Tuesday, 0 if it is another day
Miercoles : Corresponds 1 if it is Wednesday, 0 if it is another day
Jueves : Corresponds 1 if it is Thursday, 0 if it is another day
Viernes : Corresponds 1 if it is Friday, 0 if it is another day

sku{!i} i=1 to 84: Corresponds To the demand of the different products

corr: 1 1025

Thanks

startz · Postby **startz** » Sat Jun 17, 2017 1:23 pm

I messed around a little with sku1, just as an example.

You can get some improvement in forecasting by including a lagged value of the sku and also a time trend. I am pretty sure you could also get improvement by including month dummies, but you'd need to create a dated workfile for that.

karakilamaravilla · Postby **karakilamaravilla** » Sat Jun 17, 2017 1:38 pm

startz wrote:I messed around a little with sku1, just as an example.

You can get some improvement in forecasting by including a lagged value of the sku and also a time trend. I am pretty sure you could also get improvement by including month dummies, but you'd need to create a dated workfile for that.

Thank you startz,

In the database there is the series month:

Mes : month in the year

I have lost the day that corresponds to each "corr", if I get it you could help me to learn to work with dates?

One important thing is that each sku has a different behavior, there are many who have similar behavior, but there are some that escape those behaviors.

What I do is to evaluate for each sku many models and see which is the best for each sku.

startz · Postby **startz** » Sat Jun 17, 2017 1:47 pm

Since you already have mes, maybe you don't need to change to a dated workfile. Try adding

Code: Select all

@expand(mes,@droplast)

to your regressions. It will probably make a lot of difference for some skus and not for others.

There is no single definition of "best." You need to ask what purpose the forecast will be used for. (You should also think about whether the forecasts are for one day ahead versus longer.)

However, the EViews Help system does have an extensive discussion of several of the commonly used forecast evaluation criteria.

startz · Postby **startz** » Sat Jun 17, 2017 1:49 pm

By the way, this is an interesting example. Perhaps it's something I could use in my undergraduate econometrics class. If you wouldn't mind, can you say a little more about where the data is from and what the objective of the forecast is?

karakilamaravilla · Postby **karakilamaravilla** » Sat Jun 17, 2017 3:12 pm

startz wrote:By the way, this is an interesting example. Perhaps it's something I could use in my undergraduate econometrics class. If you wouldn't mind, can you say a little more about where the data is from and what the objective of the forecast is?

The data come from the sales made in the region of Valparaíso Chile.
The interest is not in finally predicting the next day's demand.
The objective is as follows: You have a matrix (that I built) ("_dias_piso") which delivers for the period of 2017 day by day the number of days that the current stock can support the sales made in the future, ( This matrix is constructed with the stock series of each sku and accumulated demand). Then the matrix "_dias_piso" are the actual days in which the stock of a certain day can withstand the future demands.

I constructed matrices for each sku and each day of a forecast ("dias_piso") using the series of sku {! I} and different models, I compare it with the sample quasivariance.

I am not a programmer but the code I can do is something like this:

smpl @all

matrix(114, 84) _dias_piso

for !j = 1 to 84

for !i = 1 to 114

%demanda_acum = @str(!i) + " " + @str(114)
series pisos = @cumsum(demandasku{!j}, %demanda_acum) <= @elem(stocksku{!j}, !i)
_dias_piso(!i, !j) = @sum(pisos)

next

next

for !j= 2 to 6
for !i =1 to 84

group _SKU
_SKU.add sku{!i}
group _STOCKSKU
_STOCKSKU.add STOCKSKU{!i}

for !k=7 to 12

group _Modelo{!k}

equation Reg_Simple_eq1_sku{!i}.ls sku{!i} c corr
forecast sku{!i}_7
equation Reg_Simple_eq_sku{!i}.ls sku{!i} c corr corr^2 @expand(mes, @droplast) lunes miercoles viernes
forecast sku{!i}_8
equation AR1_eq_sku{!i}.ls sku{!i} c corr sku{!i}(-1)
forecast sku{!i}_9
equation AR2_eq_sku{!i}.ls sku{!i} c corr sku{!i}(-1) sku{!i}(-2)
forecast sku{!i}_10
equation AR3_eq_sku{!i}.ls sku{!i} c corr sku{!i}(-1) sku{!i}(-2) sku{!i}(-3) @expand(mes, @droplast) lunes martes miercoles jueves viernes
forecast sku{!i}_11
equation Salida_eq_sku{!i}.ls sku{!i} c @expand(mes, @droplast) lunes martes miercoles jueves viernes
forecast sku{!i}_12

_Modelo{!k}.add sku{!i}_{!k}

next

for !k=17 to 17

group _modelo{!k}

equation xxxx{!i}.ls sku{!i} corr corr^2 corr^3 @expand(mes, @droplast)
forecast sku{!i}_17

_modelo{!k}.add sku{!i}_{!k}

next

group _Modelo{!j}
series Moving_Average_{!j}sku{!i} =( @movsum(sku{!i}, {!j}) ) / {!j}
_Modelo{!j}.add Moving_Average_{!j}sku{!i}

for !l=13 to 16

group _Modelo{!l}
smooth(s) sku{!i} SE13_sku{!i} [4]
smooth(d) sku{!i} SE14_sku{!i} [4]
smooth(n) sku{!i} SE15_sku{!i} [4]
smooth(a) sku{!i} SE16_sku{!i} [4]
_Modelo{!l}.add SE{!l}_sku{!i}

next

next

next
'-----------------------------------------------------------------------------------------------------------------------------------------------------

for !m=2 to 17
matrix(97, 84) _matrix!m
for !c =1 to 84
for !r =1 to 97
_matrix!m(!r, !c) = _stocksku (!c) (!r) / _modelo!m (!c) (!r + 911)
next
next
next

'-----------------------------------------------------------------------------------------------------------------------------------------------------

matrix(97, 84) _diaspiso
for !c=1 to 84
for !r=1 to 97
_diaspiso(!r, !c) = _dias_piso(!r, !c)
next
next

'-----------------------------------------------------------------------------------------------------------------------------------------------------

for !m=2 to 17
matrix(97,84) _error!m
_error!m = _diaspiso - _matrix!m
next

'----------------------------------------------------------------------------------------------------------------------------------------------------

for !m=2 to 17
matrix _t!m= @transpose(@cmean(_error!m))
matrix(97, 84) _ep!m
for !c=1 to 84
for !r=1 to 97
_ep!m(!r,!c) = _t!m(1, !c)
next
next
next

'--------------------------------------------------------------------------------------------------------------------------------------------------

for !m=2 to 17
matrix(97, 84) _x!m
matrix(97, 84) _z!m
_x!m = _error!m - _ep!m
for !c=1 to 84
for !r=1 to 97
matrix _z!m(!r, !c) = _x!m(!r, !c) * _x!m(!r, !c)
next
next
next

for !m=2 to 17
matrix(1, 84) _w!m
matrix(16, 84) _evaluador
matrix _w!m = @transpose(@csum(_z!m)/96)
for !c=1 to 84
_evaluador(!m-1, !c) = _w!m(1, !c)
next
next

matrix __best_model = @cimin(_evaluador)

matrix __resultados = @sort(@uniquevals(__best_model),"a")
matrix(__resultados.@rows, 2) __resultados

for !i = 1 to __resultados.@rows
!iden = 0
for !c = 1 to 84
if __best_model(!c, 1) = __resultados(!i, 1) then
!iden = 1 + !iden
endif
next
__resultados(!i, 2) = !iden
next

karakilamaravilla · Postby **karakilamaravilla** » Sat Jun 17, 2017 3:15 pm

startz wrote:By the way, this is an interesting example. Perhaps it's something I could use in my undergraduate econometrics class. If you wouldn't mind, can you say a little more about where the data is from and what the objective of the forecast is?

This attachment also has the series:

1. Stock of each sku per day (2017)

karakilamaravilla · Postby **karakilamaravilla** » Sat Jun 17, 2017 3:17 pm

karakilamaravilla wrote:
startz wrote:By the way, this is an interesting example. Perhaps it's something I could use in my undergraduate econometrics class. If you wouldn't mind, can you say a little more about where the data is from and what the objective of the forecast is?

This attachment also has the series:

1. Stock of each sku per day (2017)

startz · Postby **startz** » Sat Jun 17, 2017 3:33 pm

Thanks.

EViews Gareth · Postby **EViews Gareth** » Sun Jun 18, 2017 10:40 am

Before delving too deeply into this, I'll point out a couple of things.

First you might want to structure your workfile as a dated workfile. Going on the information you gave (it starts in 2014, the series "mes" is month, and "salida_del_mes" is day of the month), you can use the following code to turn your workfile dated:

Code: Select all

wfopen "para pregunta del foro.wf1"
series year = 2014
smpl @first+1 @last
year = @recode(mes<mes(-1), year(-1)+1, year(-1))
smpl @all
series date = @dateval(@str(year)+"/"+@str(mes)+"/"+@str(salida_del_mes), "YYYY/MM/DD")
pagestruct @date(date)

Note that this then contradicts your first post (where you said the data ends 4/30/2017 - it actually ends 5/17/2017).

It also creates an irregular workfile, since, I presume, some days were holidays etc...

You might want to check the data in the workfile though. A quick look at, say, SKU14 is weird. There are lots of 1s and 0s, and also lots of negative numbers (negative demand?). The data don't appear to match up with the data in the later Excel file you posted.

startz · Postby **startz** » Sun Jun 18, 2017 11:35 am

One small addition to Gareth's points. I think the "day of the month" variable" is just a sequence number within the month. In other words, "1" might sometimes be the first observation but the second day of the month. Since you know the day of the week, you could probably manually insert dates, although I imagine it would take an hour or so.

karakilamaravilla · Postby **karakilamaravilla** » Tue Jun 20, 2017 9:18 am

EViews Gareth wrote:Before delving too deeply into this, I'll point out a couple of things.

First you might want to structure your workfile as a dated workfile. Going on the information you gave (it starts in 2014, the series "mes" is month, and "salida_del_mes" is day of the month), you can use the following code to turn your workfile dated:
Code: Select all
wfopen "para pregunta del foro.wf1" series year = 2014 smpl @first+1 @last year = @recode(mes<mes(-1), year(-1)+1, year(-1)) smpl @all series date = @dateval(@str(year)+"/"+@str(mes)+"/"+@str(salida_del_mes), "YYYY/MM/DD") pagestruct @date(date)

Note that this then contradicts your first post (where you said the data ends 4/30/2017 - it actually ends 5/17/2017).

It also creates an irregular workfile, since, I presume, some days were holidays etc...

You might want to check the data in the workfile though. A quick look at, say, SKU14 is weird. There are lots of 1s and 0s, and also lots of negative numbers (negative demand?). The data don't appear to match up with the data in the later Excel file you posted.

Dear,

The database is made up from the dates 02/01/2014 to 05/20/2017. I left attached the file containing the dates. There are days that are not in the database because they are Sundays or holidays.

Salida_del_mes is not the day of the month, but rather is the correlative of the month corresponding to the days that exist in the database.

Responding to Gareth, and going deeper into the database, demand is composed of two things, the first one: sales actually made, and the second: the quantification of unsatisfied demand. Anyway the data that interests is the one that is in the database, called "demand".

Another thing, is that negative demand is given as they correspond to credit notes, which is not relevant to the analysis, since only occurs in a few SKU's.

I will also share the description of each SKU and the category of products to which it refers, in the Coca Cola business, we have different formats for drinks, different flavors, and some beverages (SKU's) are returnable (Ref. Pet bottles and bottles Of Glass: that have demand to the low) and some are disposable (PET, that have demand to the rise).

The objective of the estimation is not to have a better estimate, it is to minimize the quasivariation. That is observed in my calculations in the matrix _evaluador, where you have the quasivariaza of what I want to calculate for each SKU and for each model that is proposed.

A very interesting analysis would be to be able to program something that can minimize for each SKU the quasivariance sample with unique models for each SKU. At the moment I evaluate some models and compare which is the one that is most successful in the matrix _results.

If you have any idea how to improve these models, it would be very useful. I share again my code that I managed to generate with the help of this forum.

Code: Select all


smpl @all
      
      
matrix(114, 84) _dias_piso
      
for !j = 1 to 84
      
   for !i = 1 to 114
      
       %demanda_acum = @str(!i) + " " + @str(114)
         series pisos = @cumsum(demandasku{!j}, %demanda_acum) <= @elem(stocksku{!j}, !i)
         _dias_piso(!i, !j) = @sum(pisos)
      
   next   
      
next
      
      

for  !i =1 to 84
      
   group _SKU
   _SKU.add sku{!i}
   group _STOCKSKU
   _STOCKSKU.add STOCKSKU{!i}
   group _Modelo1
   series Moving_Average_6sku{!i} =( @movsum(sku{!i}, 6) ) / 6
   _Modelo1.add Moving_Average_6sku{!i}

   for !l=2 to 5
      
      group _Modelo{!l}
      smooth(s)  sku{!i} SE2_sku{!i} 
      smooth(d)  sku{!i} SE3_sku{!i} 
      smooth(n)  sku{!i} SE4_sku{!i} 
      smooth(a)  sku{!i} SE5_sku{!i} 
      _Modelo{!l}.add SE{!l}_sku{!i}
      
   next
      
   for !k=6 to 7
      
      group _Modelo{!k}
         
      equation Reg1_sku{!i}.ls sku{!i}  c corr corr^2 corr^3  @expand(mes, @droplast) SE4_sku{!i}
      forecast  sku{!i}_6

      equation Reg2_sku{!i}.ls sku{!i} c corr  corr^2 corr^3 sku{!i}(-1) sku{!i}(-2) sku{!i}(-3) @expand(mes, @droplast) lunes martes miercoles jueves viernes
      forecast  sku{!i}_7 
         
      _Modelo{!k}.add sku{!i}_{!k}
            
   next      
next      

'----------------------------------------------------------------------------------------------------------------------------------------------------
for !m=1 to 7
   matrix(97, 84) _matrix!m
   matrix(97, 84) _diaspiso
   matrix(97,84) _error!m
   matrix(97, 84) _ep!m
   matrix(97, 84) _x!m
   matrix(97, 84) _z!m
   matrix(1, 84)  _w!m
   matrix(7, 84) _evaluador
   for !c =1 to 84
      for !r =1 to 97
         _matrix!m(!r, !c) = _stocksku (!c) (!r) / _modelo!m (!c) (!r + 911)   
         _diaspiso(!r, !c)  = _dias_piso(!r, !c)
      next
   next
   _error!m = _diaspiso - _matrix!m
   matrix _t!m= @transpose(@cmean(_error!m))
   for !c=1 to 84
      for !r=1 to 97
         _ep!m(!r,!c) = _t!m(1, !c)
         _x!m =  _error!m - _ep!m
         matrix _z!m(!r, !c) = _x!m(!r, !c) * _x!m(!r, !c)
         matrix _w!m = @transpose(@csum(_z!m)/96)
      next
      _evaluador(!m, !c) = _w!m(1, !c)
   next
next

matrix __best_model = @cimin(_evaluador)

matrix __resultados = @sort(@uniquevals(__best_model),"a")
matrix(__resultados.@rows, 2) __resultados

for !i = 1 to __resultados.@rows
   !iden = 0
   for !c = 1 to 84
      if __best_model(!c, 1) = __resultados(!i, 1) then
         !iden = 1 + !iden
      endif
   next
   __resultados(!i, 2) = !iden
next

EViews.com

Coca cola demand forecast (models)

Coca cola demand forecast (models)

Re: Coca cola demand forecast (models)

Re: Coca cola demand forecast (models)

Re: Coca cola demand forecast (models)

Re: Coca cola demand forecast (models)

Re: Coca cola demand forecast (models)

Re: Coca cola demand forecast (models)

Re: Coca cola demand forecast (models)

Re: Coca cola demand forecast (models)

Re: Coca cola demand forecast (models)

Re: Coca cola demand forecast (models)

Re: Coca cola demand forecast (models)

Re: Coca cola demand forecast (models)

Re: Coca cola demand forecast (models)

Re: Coca cola demand forecast (models)

Who is online