if you look at a VAR(p) model, its VEC representation has p-1 lags (p is the number of lags in the VAR). so if you first estimate a VAR(1) model, this means that Z(t)=a+b*Z(t-1)+e(t), where Z(t) is the vector of the variables you are working with, a is the vector of constants, b is the coefficient matrix and e(t) is the vector of errors.
the VEC representation of VAR(1) is deltaZ(t)=c+d*Z(t-1)+u(t), where deltaZ(t)=Z(t)-Z(t-1) is the vector of differenced values (subtract Z(t-1) from both sides of the VAR(1) and you get c=a, d=b-I, u(t)=e(t)). the lag setting for the VEC model refers to the number of lagged values of deltaZ(t) on the right hand side, and the VEC representation of VAR(1) has none of them, so you have to put a zero there.
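a small numeric check of the VAR(1) to VEC rewrite, in case it helps. this is only a sketch: the numbers are made up, the helper names (mat_vec, var_to_vec_matrix, vec_lag_order) are mine and not from any package, and it assumes the VEC matrix is obtained by subtracting Z(t-1) from both sides, so d = b - I with the same constant and error.

```python
# Pure-Python check that a VAR(1) and its VEC form give the same Z(t).
# All numbers and helper names are illustrative assumptions.

def mat_vec(m, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]

def var_to_vec_matrix(b):
    """Return d = b - I, the matrix on Z(t-1) in the VEC form of a VAR(1)."""
    n = len(b)
    return [[b[i][j] - (1.0 if i == j else 0.0) for j in range(n)]
            for i in range(n)]

def vec_lag_order(p):
    """Number of lagged differences in the VEC form of a VAR(p)."""
    if p < 1:
        raise ValueError("VAR order p must be at least 1")
    return p - 1

a = [0.5, -0.2]                 # constants
b = [[0.9, 0.1], [0.2, 0.7]]    # VAR(1) coefficient matrix
z_prev = [1.0, 2.0]             # Z(t-1)
e = [0.05, -0.03]               # error term e(t)

# VAR(1): Z(t) = a + b*Z(t-1) + e(t)
z_var = [a_i + bz_i + e_i
         for a_i, bz_i, e_i in zip(a, mat_vec(b, z_prev), e)]

# VEC form: deltaZ(t) = a + d*Z(t-1) + e(t), then Z(t) = Z(t-1) + deltaZ(t)
d = var_to_vec_matrix(b)
delta_z = [a_i + dz_i + e_i
           for a_i, dz_i, e_i in zip(a, mat_vec(d, z_prev), e)]
z_vec = [zp + dz for zp, dz in zip(z_prev, delta_z)]

print(z_var)             # Z(t) from the VAR(1)
print(z_vec)             # Z(t) rebuilt from the VEC form
print(vec_lag_order(1))  # lagged differences to request for the VEC
```

both forms give the same Z(t) up to floating-point rounding, and there is no lagged deltaZ term anywhere on the right hand side, which is why the VEC lag setting is zero for a VAR(1). in statsmodels, for instance, that setting is the k_ar_diff argument of VECM; other packages name it differently.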
edit: and you don't difference the data for the VAR, you use them in levels. do everything up to the Johansen test. if the rank of the d matrix is full, all of the variables are stationary and you just do the VAR procedure. if the rank is reduced (greater than zero but not full), the variables are not stationary but cointegration exists and you do the VEC procedure. if the rank is zero, the variables are not stationary and there is no cointegration, so you have to difference the data and do the VAR procedure on the differenced data.
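the three cases can be written down as a tiny decision rule. again only a sketch: choose_procedure is a hypothetical helper of mine, and it assumes you already have the cointegration rank of the d matrix from a Johansen test run in whatever package you use.

```python
# Map the Johansen rank of the d matrix to the procedure described above.
# choose_procedure is a hypothetical helper, not part of any library.

def choose_procedure(rank, n_vars):
    """Pick the procedure from the cointegration rank and variable count."""
    if not 0 <= rank <= n_vars:
        raise ValueError("rank must be between 0 and n_vars")
    if rank == n_vars:
        # full rank: all variables are stationary
        return "VAR on levels"
    if rank == 0:
        # zero rank: non-stationary, no cointegration
        return "VAR on differenced data"
    # reduced rank: non-stationary but cointegrated
    return "VEC"

for r in range(4):
    print(r, choose_procedure(r, 3))
```

with three variables, ranks 1 and 2 both lead to the VEC procedure; only the two extremes send you back to a VAR.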