I have never built a calibration kit that broadband, but I can tell you what I know from narrower-bandwidth kits.
I designed my standards using Momentum. I use 3 standards.
Knowing fmax = 60 GHz and fmin = 10 GHz, calculate fo = sqrt(fmax * fmin), which is about 24.5 GHz here.
Calculate a line length called dcal, which is lambda/4 at fmin.
Calculate another line length called dline, which is lambda/4 at fo.
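A minimal Python sketch of these three numbers, assuming a non-dispersive line described by a single effective permittivity. The name eps_eff and its value 6.0 are placeholders for illustration only, not from this post; take the real value from your Momentum simulation or a line calculator.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum, m/s


def quarter_wave_length(f_hz, eps_eff):
    """Physical length (m) of a lambda/4 line at f_hz on a line with effective permittivity eps_eff."""
    return C0 / (4.0 * f_hz * math.sqrt(eps_eff))


fmin = 10e9    # Hz
fmax = 60e9    # Hz
eps_eff = 6.0  # ASSUMPTION: placeholder value, replace with your substrate's effective permittivity

fo = math.sqrt(fmin * fmax)                # geometric center frequency
dcal = quarter_wave_length(fmin, eps_eff)  # lambda/4 at fmin
dline = quarter_wave_length(fo, eps_eff)   # lambda/4 at fo

print(f"fo    = {fo / 1e9:.2f} GHz")
print(f"dcal  = {dcal * 1e3:.3f} mm")
print(f"dline = {dline * 1e3:.3f} mm")
```

Only fo is independent of the substrate; dcal and dline scale with 1/sqrt(eps_eff), so the printed lengths are only as good as the permittivity you put in.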
Then check whether the kit is feasible or the bandwidth is too wide:
L (cm) = 15 / (fmin + fmax), with both frequencies in GHz
Phase (deg) = 12 * f (GHz) * L (cm)
Both of these should be fulfilled:
Phase(fmin) > 20 deg
Phase(fmax) < 160 deg
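The check is easy to script. The sketch below uses only the two formulas above; the function name trl_phase_check is made up for illustration. The factor 12 is 360/30 with c taken as 30 cm*GHz, so L here is an electrical (air-equivalent) length.

```python
def trl_phase_check(fmin_ghz, fmax_ghz, lo_deg=20.0, hi_deg=160.0):
    """Return (L in cm, phase at fmin, phase at fmax, feasible?) for a single-line band."""
    length_cm = 15.0 / (fmin_ghz + fmax_ghz)  # lambda/4 at the arithmetic center frequency
    phase_min = 12.0 * fmin_ghz * length_cm   # degrees at the lower band edge
    phase_max = 12.0 * fmax_ghz * length_cm   # degrees at the upper band edge
    feasible = phase_min > lo_deg and phase_max < hi_deg
    return length_cm, phase_min, phase_max, feasible


length_cm, p_min, p_max, ok = trl_phase_check(10.0, 60.0)
print(f"L = {length_cm:.4f} cm, phase(fmin) = {p_min:.1f} deg, "
      f"phase(fmax) = {p_max:.1f} deg, feasible = {ok}")
```

For 10 to 60 GHz this gives roughly 25.7 deg and 154.3 deg, so the band fits. With these formulas the two phases always add up to 180 deg, which means both limits reduce to fmax/fmin < 8.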
Then calculate the standards:
Thru: 2 * dcal
Reflect: dcal
Line: 2 * dcal + dline. Calculate the delay of this line, which is the delay of the dline section. You can get it directly in ADS with the delay function, or from the propagation equation.
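As a rough sketch of the propagation-equation route, the snippet below computes the three lengths and the line delay as delay = dline * sqrt(eps_eff) / c. It assumes a non-dispersive line; the function name trl_standards and the eps_eff value of 6.0 are hypothetical placeholders, not from this post.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum, m/s


def trl_standards(fmin_hz, fmax_hz, eps_eff):
    """Lengths (m) of the three standards and the delay (s) of the dline section."""
    v = C0 / math.sqrt(eps_eff)          # phase velocity on the line
    fo = math.sqrt(fmin_hz * fmax_hz)    # geometric center frequency
    dcal = v / (4.0 * fmin_hz)           # lambda/4 at fmin
    dline = v / (4.0 * fo)               # lambda/4 at fo
    return {
        "thru_m": 2.0 * dcal,
        "reflect_m": dcal,
        "line_m": 2.0 * dcal + dline,
        "line_delay_s": dline / v,       # delay of the extra dline section only
    }


# ASSUMPTION: eps_eff = 6.0 is a placeholder, use your own value.
std = trl_standards(10e9, 60e9, eps_eff=6.0)
for name, value in std.items():
    print(f"{name}: {value:.6e}")
```

Because dline is a quarter wavelength at fo, the delay works out to 1/(4 * fo) whatever the permittivity, about 10.2 ps for 10 to 60 GHz. That makes a handy sanity check against the ADS result.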
Agilent has good application notes that explain this well.
In your VNA, enter the 3 standards:
Thru: frequency range
Open (= Reflect): frequency range
Line: frequency range and delay
The test fixture that holds your DUT should have a length of dcal.