Yes, interesting discussion.
Regarding albbg's point of view:
There are several ways to define the step function; they differ only in the value taken at the point of discontinuity. To avoid confusion, let me use the notation u(t) and take t0=0.
It is clear that u(t)=0 for t<0 and u(t)=1 for t>0. But what about t=0?
Consider these alternative forms:
(a) u(0)=0
(b) u(0)=1
(c) u(0)=1/2
Fourier's theorem says that when a function has this type of (jump) discontinuity, its Fourier series or transform converges at the discontinuity point (though not uniformly near it) to the mean of the left and right limits, so form (c) is the step we find in Fourier analysis.
Nevertheless, the difference between any two of these 3 forms is a function of zero energy (it is nonzero only at the single point t=0). So, from an engineering point of view, the 3 forms are equivalent (they are indistinguishable).
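As a quick numerical illustration of the Fourier statement, here is a minimal sketch. Since the step itself is not periodic, I assume a period-2*pi square wave (0 on (-pi,0), 1 on (0,pi)) as a stand-in with the same jump at t=0; the function name partial_sum is just a hypothetical helper.

```python
import math

def partial_sum(t, n_terms):
    """Partial Fourier sum of a period-2*pi square wave (assumed stand-in
    for the step): 0 on (-pi, 0) and 1 on (0, pi)."""
    total = 0.5  # DC term = mean of the two levels
    for k in range(n_terms):
        n = 2 * k + 1  # only odd harmonics are present
        total += (2.0 / math.pi) * math.sin(n * t) / n
    return total

at_jump = partial_sum(0.0, 200)         # every sine term vanishes at t=0
left_of_jump = partial_sum(-1.0, 2000)  # approaches 0
right_of_jump = partial_sum(1.0, 2000)  # approaches 1
print(at_jump, left_of_jump, right_of_jump)
```

Whatever the number of terms, the sum at the jump is the mean of the two levels, i.e. form (c), while away from the jump it approaches 0 or 1.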
A problem with the ramp r(t) is that it is not differentiable at t=0, because the limits of the difference quotient at left and right are not the same. If, when calculating its derivative, you take the limit at left (i.e. the derivative at left) you find the step in form (a); the limit at right gives the step in form (b); and a "centered" limit lim(eps->0){[r(t0+eps/2)-r(t0-eps/2)]/eps} gives (c).
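The three limits can be checked numerically. This is only a sketch: I assume a unit-slope ramp r(t)=t for t>0, a small fixed eps in place of the limit, and a centered difference that takes half a step eps/2 on each side and divides by the full step eps. The helper names are mine.

```python
def ramp(t):
    """Unit-slope ramp starting at t=0 (assumed form of r(t))."""
    return t if t > 0 else 0.0

def left_diff(f, t0, eps):
    # Difference quotient using only the past: derivative at left
    return (f(t0) - f(t0 - eps)) / eps

def right_diff(f, t0, eps):
    # Difference quotient using only the future: derivative at right
    return (f(t0 + eps) - f(t0)) / eps

def centered_diff(f, t0, eps):
    # Half a step on each side, divided by the full step eps
    return (f(t0 + eps / 2) - f(t0 - eps / 2)) / eps

eps = 1e-6
results = (
    left_diff(ramp, 0.0, eps),      # form (a)
    right_diff(ramp, 0.0, eps),     # form (b)
    centered_diff(ramp, 0.0, eps),  # form (c)
)
print(results)
```

For any t0 different from 0 and eps small enough, the three quotients agree, which is why the choice only matters at the corner of the ramp.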
A "strictly mathematical" differentiator would not give any output at t=0 because the "strictly mathematical derivative" does not exist at that point.
But a "physical world differentiator" must give an output at any time. All three of the mathematical models above for representing it (derivative at left, at right or centered) are "practically" the same.
This subject can bring us to a different field, with other rather philosophical derivations: To what extent can mathematical models represent physical or "real-world" systems? But let's leave this for another discussion.
If you find it hard to conceive a transfer function V/V that is an ideal differentiator, consider a transfer function I/V (i.e. the output variable is current while the input variable is voltage). Such an ideal differentiator is an ideal capacitor. It is causal, stable and nice.
When a ramp voltage starting at t=0 is applied to an ideal capacitor, the current is a step.
A related V/V transfer function is H(s)=s/(s+1). It is sometimes called a "nonideal differentiator" and can be realized as a highpass RC cell (the R converts I into V). When we solve this circuit, we have no problem treating the C as a differentiator, I=C*dV/dt.
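To see the RC cell at work, here is a rough time-domain sketch. The assumptions are mine: R=C=1 (so that H(s)=s/(s+1)), a unit-slope ramp at the input, and a simple forward-Euler integration with a small step dt. For that input the analytic output is 1-exp(-t), i.e. a "smoothed" step.

```python
import math

def highpass_ramp_response(t_end, dt=1e-3, r=1.0, c=1.0, slope=1.0):
    """Forward-Euler simulation of a series-C, shunt-R highpass cell
    driven by a ramp (all parameter values are illustrative assumptions)."""
    v_c = 0.0  # capacitor voltage, initially discharged
    t = 0.0
    times, outputs = [], []
    steps = int(round(t_end / dt))
    for _ in range(steps + 1):
        v_in = slope * t if t > 0 else 0.0
        i = (v_in - v_c) / r       # current through R and C
        times.append(t)
        outputs.append(r * i)      # the R converts I into V
        v_c += dt * i / c          # capacitor law: I = C*dV/dt
        t += dt
    return times, outputs

times, outputs = highpass_ramp_response(5.0)
simulated = outputs[-1]
expected = 1.0 - math.exp(-times[-1])
print(simulated, expected)
```

The output settles to the slope of the ramp (times RC), which is what the ideal differentiator would give at once; the exponential is the price of making it a V/V circuit.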
Regarding LvW's observations:
I would separate this discussion (causality of differentiators) from the transient and steady state problem. Transient behaviour characterizes systems that have memory (including "memory from the future" for non-causal systems!).
A differentiator has no memory. Its output does not depend on the past (nor on the future), but just on the rate of change of the input at the same instant.
One could consider that it "looks" into the past (or future) by a time eps (I would say this is why albbg considered it non-causal), but eps->0, so eps is not really a memory but something needed in order to define the "rate of change".
Usually, the memory of a linear system is characterized by its impulse response. What is the impulse response of a differentiator? It is a monstrous thing called a "doublet" which, like the Dirac delta, is 0 for any t different from 0. The impulse response of two cascaded differentiators? A triplet... and so on. [Recall that the Dirac delta and its derivatives are not functions in the standard sense, so we have to be cautious with them.]
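A discrete caricature may help to picture these objects. The sketch below is only a sampled stand-in under my own assumptions: the doublet is replaced by the two-sample kernel [1/dt, -1/dt] (a backward difference), and the test signal is t^2. Convolving the kernel with itself gives the discrete counterpart of the triplet.

```python
def convolve(x, h):
    """Plain discrete convolution of two finite sequences."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y

dt = 0.01
doublet = [1.0 / dt, -1.0 / dt]       # sampled stand-in for the doublet
triplet = convolve(doublet, doublet)  # two cascaded differentiators

samples = [(k * dt) ** 2 for k in range(200)]  # x(t) = t^2
first = convolve(samples, doublet)    # roughly 2*t away from the edges
second = convolve(samples, triplet)   # roughly 2 away from the edges

print(triplet, first[100], second[100])
```

As dt shrinks, the kernel gets taller and narrower: its "memory" spans only dt, which is the discrete version of the eps->0 remark above.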
Happy New Year to all!
Regards
Z