If you can assume that there is audio signal above some amplitude whenever it's not muted, then you can do it fairly easily with an envelope detector, followed by a comparator if you need a clean logic-level output.
An envelope detector can be as simple as a series diode followed by a parallel RC to ground. You choose the R and C (the time constant is R times C) for the desired frequency range and response speed. If the signal can be low-level compared to a diode's forward voltage, then you could use an active "ideal rectifier" (precision rectifier) circuit instead of the bare diode, usually made with an opamp, so that it will rectify down to about zero volts.
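As a rough guide for picking R and C, here is a small sketch that computes the time constant and how far the held voltage droops between peaks of the lowest frequency you care about. The component values (100 k, 1 uF) and the 20 Hz lower limit are only illustrative assumptions, and the diode is treated as ideal.

```python
import math

def rc_time_constant(r_ohms, c_farads):
    """Time constant of the parallel RC, in seconds."""
    return r_ohms * c_farads

def droop_fraction(r_ohms, c_farads, freq_hz):
    """Fraction of the peak voltage lost while the capacitor discharges
    through R for one full period of freq_hz (half-wave rectifier,
    ideal diode assumed)."""
    tau = rc_time_constant(r_ohms, c_farads)
    period = 1.0 / freq_hz
    return 1.0 - math.exp(-period / tau)

# Illustrative values only: 100 k and 1 uF, lowest audio frequency 20 Hz.
tau = rc_time_constant(100e3, 1e-6)
droop = droop_fraction(100e3, 1e-6, 20.0)
print(f"tau = {tau:.3f} s, droop per 20 Hz cycle = {droop:.1%}")
```

A larger RC gives less ripple and a longer hold, at the cost of a slower reaction when the audio really does stop.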
Basically, the envelope detector will output about zero volts when there is no audio, and a voltage above zero whenever there is audio. You might want to choose the R and C to make it "slow", so that it holds the voltage up across short gaps in the audio. But note that any long gap with no audio signal will look like muting is on. I don't see any way around that, if the audio output is all you have access to.
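To get a feel for that behavior before building anything, here is a minimal numerical sketch of an ideal diode plus RC, followed by a comparator. It is not a circuit simulation: the sample rate, the 50 ms time constant, the 0.1 V threshold and the 1 kHz test tone are all assumed values picked for illustration.

```python
import math

def envelope_detect(samples, fs, tau, v_diode=0.0):
    """Ideal diode + parallel RC: the output follows the input upward
    immediately and decays exponentially with time constant tau."""
    decay = math.exp(-1.0 / (fs * tau))
    env = 0.0
    out = []
    for x in samples:
        env = max(x - v_diode, env * decay)
        out.append(env)
    return out

def comparator(envelope, threshold):
    """True where the envelope is below the threshold (looks muted)."""
    return [v < threshold for v in envelope]

fs = 8000                      # sample rate in Hz (assumed)
n_tone = fs // 2               # 0.5 s of 1 kHz tone
n_silence = fs // 2            # followed by 0.5 s of silence
signal = [math.sin(2 * math.pi * 1000 * n / fs) for n in range(n_tone)]
signal += [0.0] * n_silence

env = envelope_detect(signal, fs, tau=0.05)
muted = comparator(env, threshold=0.1)

for t in (0.40, 0.55, 0.90):
    i = int(t * fs)
    print(f"t = {t:.2f} s  envelope = {env[i]:.3f}  muted = {muted[i]}")
```

With these values the output should still read "not muted" shortly after the tone stops, which is the hold effect, and flip to "muted" once the capacitor has discharged below the threshold.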
It would be a good idea to simulate it in LTspice, or something like that, so you can adjust the R and C easily, and so you can figure out how to get the output level you need. LTspice can use WAV files of actual audio as inputs (and outputs), which might be handy for testing in this case.
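If you don't have a suitable recording handy, a synthetic test file is easy to make. The sketch below uses only Python's standard library to write a 16-bit mono WAV containing a tone burst followed by silence; the file name, tone frequency, level and durations are arbitrary assumptions.

```python
import math
import os
import struct
import tempfile
import wave

def write_test_wav(path, fs=44100, tone_hz=1000.0, tone_s=1.0,
                   silence_s=1.0, amplitude=0.5):
    """Write a mono 16-bit WAV: tone_s seconds of sine, then silence."""
    n_tone = int(tone_s * fs)
    n_silence = int(silence_s * fs)
    frames = bytearray()
    for n in range(n_tone):
        value = amplitude * math.sin(2 * math.pi * tone_hz * n / fs)
        frames += struct.pack("<h", int(round(value * 32767)))
    frames += struct.pack("<h", 0) * n_silence
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)    # mono
        wav.setsampwidth(2)    # 16-bit samples
        wav.setframerate(fs)
        wav.writeframes(bytes(frames))
    return n_tone + n_silence

# Hypothetical output location; change it to wherever the simulator can read it.
path = os.path.join(tempfile.gettempdir(), "mute_test.wav")
total_frames = write_test_wav(path)
print(f"wrote {total_frames} frames to {path}")
```

The silent second half lets you see how long the detector takes to report "muted" after the audio stops.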