@dmitryl
Off the top of my head, buffering addresses two main needs: fixing high fanout and fixing weak drive strength.
Cloning also helps with high fanout, but ideally it should be more placement-aware or area-aware. For example, if a high-fanout driver has loads at two opposite ends of the placeable area, placing one clone near each group of loads addresses the high fanout and also helps congestion, timing, and routing.
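To make the two-ended case concrete, here is a rough Python sketch. It is only an illustration, not how any particular P&R tool does it: the coordinates are made up, the helper names (`split_loads`, `star_wirelength`, etc.) are hypothetical, and star-style Manhattan wirelength from driver to each load is a crude stand-in for real routing and timing cost.

```python
from statistics import mean

def manhattan(a, b):
    """Manhattan distance between two (x, y) points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def centroid(points):
    """Average position of a group of loads."""
    return (mean(p[0] for p in points), mean(p[1] for p in points))

def split_loads(loads):
    """Split loads into two groups at the largest gap along the
    axis (x or y) with the bigger spread. Hypothetical heuristic."""
    xs = [p[0] for p in loads]
    ys = [p[1] for p in loads]
    axis = 0 if (max(xs) - min(xs)) >= (max(ys) - min(ys)) else 1
    ordered = sorted(loads, key=lambda p: p[axis])
    gaps = [(ordered[i + 1][axis] - ordered[i][axis], i)
            for i in range(len(ordered) - 1)]
    _, cut = max(gaps)
    return ordered[:cut + 1], ordered[cut + 1:]

def star_wirelength(driver, loads):
    """Sum of driver-to-load distances (crude wirelength proxy)."""
    return sum(manhattan(driver, p) for p in loads)

# Made-up load positions: two clusters at opposite corners.
loads = [(0, 0), (1, 2), (2, 1), (98, 99), (99, 97), (100, 100)]

# Option 1: one driver placed at the centroid of all loads.
single_driver = centroid(loads)
wl_single = star_wirelength(single_driver, loads)

# Option 2: two clones, one at the centroid of each cluster.
group_a, group_b = split_loads(loads)
clone_a, clone_b = centroid(group_a), centroid(group_b)
wl_cloned = (star_wirelength(clone_a, group_a)
             + star_wirelength(clone_b, group_b))

print("fanout per driver:", len(loads), "->", len(group_a), "and", len(group_b))
print("wirelength proxy, single driver:", wl_single)
print("wirelength proxy, two clones:", wl_cloned)
```

With loads clustered like this, each clone sees half the fanout and the wirelength proxy drops sharply. The sketch ignores the extra load the clones put on the upstream net and the area they cost, which is exactly why the cloning decision should be placement-aware.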
The most common case is clock-gate (CG) cloning / decloning, which helps meet CG setup times.
Hope this helps.