try synchronizing the encoded sizes/pointers instead of the control inputs.
for example, assume you are looking for the read side logic. you know your read pointer is 0000. you synchronize the write pointer to your clock domain and get 0000 -- you are empty. at some later time, a single write occurs, you measure either 0000, or 0001 depending on how close the writes are to the read clock edge. Either way, on the next cycle you will see 0001 and declare not empty.
likewise, if your read pointer is 0001, and the write pointer estimate is 0010, then you aren't empty. if a write and read occur, you will either get a 0010 or a 0011 for the write pointer, and a 0010 for the read pointer -- you will either declare the fifo as empty for one cycle, then correctly declare it is not empty on the next, or you will catch the change to the write pointer and declare "not empty".
This ends up giving a situation where the fifo can declare empty when the fifo is not empty (and then correct 1 cycle later). However, you never declare the fifo not empty when the fifo is empty.
The gray coding is used to ensure you see either write pointer, or write pointer -1, but never any other value due to multiple bits changing. (assuming you still constrain the paths for skew)
The above arguments can be made for the full flag as well.
edit -- and the above also allows for massive differences in clock frequency, eg if the write clock were 8x faster than the read clock.