exact sample size for various series
Posted: Wed Mar 15, 2017 7:34 am
Dear all,
I have an Excel file with 5000 time series. The series do not all cover the same period, so the file contains many N/A entries that pad each series out to the full sample. I import the data as a matrix and then convert it to series. (Problem 1: the N/As are converted to zeros, which is a problem because the variable does not actually take the value zero; it is simply unavailable.)
The goal is to run 5000 regressions, which obviously have different start and end dates (Problem 2). I need to know the start and end dates because, ultimately, I need the length of each series so that I can estimate the regression on one part of the sample and use the rest for an out-of-sample forecast.
I wrote a short piece of code to identify those dates, but I cannot map the resulting variable to the dates.
Maybe I need to rethink the structure of the program.
My code is below.
Any ideas are more than welcome.
Thanks for your time,
Veni
matrix eqpremd=@subextract(eqprem,1,2,@rows(eqprem),@columns(eqprem)) 'removes the first column, which contains the dates
mtos(eqpremd,eq_prem) 'converts the matrix to series
'check the availability of the raw data (all_hf_returns)
matrix(@rows(all_hf_returns),@columns(all_hf_returns)-1) test
for !i=2 to @columns(all_hf_returns)
  for !j=1 to @rows(all_hf_returns)
    test(!j,!i-1) = (all_hf_returns(!j,!i) <> na)
  next
next
'Run the regressions and save the results. As written, this does not account for the different sample sizes
matrix(@columns(eq_prem),1) r2 'one row per regression
matrix(5,@columns(eq_prem)) pvals 'one column per regression, one row per coefficient
for !i=1 to eq_prem.@count
  %sn = eq_prem.@seriesname(!i)
  equation eq01.ls {%sn} c infl(-1) tms(-1) dfy(-1) dfr(-1)
  r2(!i)=eq01.@rbar2 'note: this stores the adjusted R-squared
  colplace(pvals,eq01.@pvals,!i)
next