For your example of a capacitor, the answer is simple in my mind. An FIR lacks long-term "memory": the resulting output (say, the voltage on the plates) depends only on the input at the present time (the current) and some limited number of inputs from the past.
An IIR has "memory" or some "internal" variable that is a result of the "accumulation" of past inputs (all of them from time zero). In your example the internal variable is the charge on the plates. So if we send an impulse of current into a capacitor it will physically accumulate a charge on the plates and the capacitor will (in an ideal world) maintain a voltage proportional to the accumulated charge forever. The response is not finite in duration.
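To make the contrast concrete, here is a rough discrete-time sketch. It is only an illustration: the time step `dt`, the capacitance `C`, the three-tap FIR, and the function names are all values and names I made up for the example. The ideal capacitor is modeled as a running sum of charge, v[n] = v[n-1] + i[n]*dt/C, and both systems are fed the same one-sample current impulse.

```python
def ideal_capacitor(current, dt=1.0, C=1.0):
    """IIR: v[n] = v[n-1] + i[n]*dt/C. The voltage v is the internal state
    that accumulates every past input (dt and C are illustrative values)."""
    v, out = 0.0, []
    for i in current:
        v += i * dt / C
        out.append(v)
    return out


def fir(current, taps):
    """FIR: the output depends only on the last len(taps) inputs."""
    out = []
    for n in range(len(current)):
        acc = 0.0
        for k, h in enumerate(taps):
            if n - k >= 0:
                acc += h * current[n - k]
        out.append(acc)
    return out


# One-sample impulse of current followed by nothing.
impulse = [1.0] + [0.0] * 9

cap_v = ideal_capacitor(impulse)            # holds its value indefinitely
fir_v = fir(impulse, taps=[1.0, 1.0, 1.0])  # returns to zero after 3 samples

print(cap_v)
print(fir_v)
```

The capacitor output stays at the same voltage for as long as you care to simulate, while the FIR output drops back to zero once the impulse has passed out of its three-sample window.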
In the practical world, the charge drains off the plates over time (whether the charges travel through the air or through the dielectric in the capacitor). The capacitor "leaks" the accumulated charge. I suppose you could approximate this with an FIR, since the response eventually becomes negligible, but modeling it as an ideal capacitor (IIR) in parallel with a resistor is more convenient.
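As a minimal sketch of the leaky case, assume the capacitor sits in parallel with a resistor `R`, so that dv/dt = -v/(RC) + i/C, and discretize with a simple forward-Euler step. The values of `dt`, `R`, and `C` and the function name are made up for illustration; this is one possible discretization, not the only one.

```python
def leaky_capacitor(current, dt=1.0, R=10.0, C=1.0):
    """Forward-Euler model of a capacitor in parallel with a resistor:
    v[n] = (1 - dt/(R*C)) * v[n-1] + i[n]*dt/C  (illustrative values)."""
    a = 1.0 - dt / (R * C)  # fraction of the voltage kept each step
    v, out = 0.0, []
    for i in current:
        v = a * v + i * dt / C
        out.append(v)
    return out


# One-sample impulse of current followed by nothing.
impulse = [1.0] + [0.0] * 49
leaky_v = leaky_capacitor(impulse)

print(leaky_v[:5])   # decays geometrically from the initial value
print(leaky_v[-1])   # small, but still not zero
```

The response shrinks by the same factor every step, so it gets arbitrarily small without ever ending at a definite sample. That is why a long enough FIR can approximate it, while the one-state recursive (IIR) form describes it with far less bookkeeping.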
-jonathan