I think the project you are talking about is quite big, so you need to break it down properly first. First, you need real-time video capture from the camera, which will be one module in the FPGA. Then you need a gesture detection module (algorithmic), and finally some kind of output module to present the detected gesture along with the captured input video. So in my view, first tailor your requirements and scope; only then will you be able to search for the right software for simulation or design. As TrickyDiicky said, the gesture detection work will be a bit algorithmic, and MATLAB generally provides functions for this; some of its examples even use those functions to develop the algorithms. So please keep this perspective in mind in your search.
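To make the three-module split concrete, here is a rough software sketch of how the stages could be prototyped before touching the FPGA. Everything in it is my own assumption for illustration: the function names (`capture_frames`, `detect_gesture`, `render_output`) are made up, the "camera" just yields synthetic 4x4 frames, and the detector is a trivial brightest-pixel comparison, not a real gesture algorithm.

```python
from typing import Iterator, List

Frame = List[List[int]]  # grayscale image as rows of 0-255 pixel values


def capture_frames() -> Iterator[Frame]:
    """Capture module stand-in (hypothetical): yields synthetic frames, not camera data."""
    for step in range(4):
        frame = [[0] * 4 for _ in range(4)]
        frame[1][step] = 255  # one bright pixel moving left to right
        yield frame


def bright_column(frame: Frame) -> int:
    """Return the column index of the brightest pixel in the frame."""
    best_col, best_val = 0, -1
    for row in frame:
        for col, val in enumerate(row):
            if val > best_val:
                best_col, best_val = col, val
    return best_col


def detect_gesture(prev: Frame, curr: Frame) -> str:
    """Detection module stand-in: compare brightest-pixel columns of two frames."""
    delta = bright_column(curr) - bright_column(prev)
    if delta > 0:
        return "swipe_right"
    if delta < 0:
        return "swipe_left"
    return "none"


def render_output(frame_index: int, gesture: str) -> str:
    """Output module stand-in: format the result as text instead of driving a display."""
    return f"frame {frame_index}: {gesture}"


def run_pipeline() -> List[str]:
    """Chain capture -> detection -> output, one frame pair at a time."""
    results = []
    prev = None
    for i, frame in enumerate(capture_frames()):
        if prev is not None:
            results.append(render_output(i, detect_gesture(prev, frame)))
        prev = frame
    return results


if __name__ == "__main__":
    for line in run_pipeline():
        print(line)
```

The point is only that each stage has a narrow interface (frames in, label out, text out), so you can scope and replace each module separately, whether in MATLAB, in simulation, or in HDL.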
Good Luck