Just cascade three serial shift registers together and use the SPI, or just use a couple of pins to bit-bang the 3 bytes out. You could use some 273 or 274 type latches (or the 3-state 37X types) as serial registers if you connect the output of one stage to the input of the next stage. With the SPI and some CS pins (three pins to control a 1-of-8 74138 decoder or a 74259), you can operate 8 different devices off the same serial bus.
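For what it's worth, here is a rough sketch of the bit-bang approach in C. It is a host-side model, not real PIC code: the variables `chain`, `data_pin` and `select_pins` and all the function names are made up for illustration and stand in for port pins and the external chips. It assumes MSB-first shifting on the rising clock edge and active-low decoder outputs like a 74138; on real hardware you would replace the bodies of `set_data()` and `pulse_clock()` with writes to whatever port bits you wired up.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for hardware. On a real PIC these would be port bits. */
uint32_t chain;        /* models three cascaded 8-bit shift registers (24 bits) */
uint8_t  data_pin;     /* level currently on the serial data line */
uint8_t  select_pins;  /* three address lines feeding a 74138-style decoder */

/* Put one bit on the data line. */
void set_data(uint8_t bit)
{
    data_pin = bit & 1u;
}

/* One clock pulse: every stage takes the value of the stage before it,
   and the first stage takes the data line. */
void pulse_clock(void)
{
    chain = ((chain << 1) | data_pin) & 0xFFFFFFu;
}

/* Bit-bang one byte, most significant bit first. */
void shift_out_byte(uint8_t b)
{
    for (int i = 7; i >= 0; i--) {
        set_data((uint8_t)((b >> i) & 1u));
        pulse_clock();
    }
}

/* Send three bytes. The first byte sent ends up in the register
   farthest from the microcontroller. */
void shift_out_3(uint8_t far_byte, uint8_t mid_byte, uint8_t near_byte)
{
    shift_out_byte(far_byte);
    shift_out_byte(mid_byte);
    shift_out_byte(near_byte);
}

/* Drive the three address pins to pick one of 8 devices. */
void select_device(uint8_t n)
{
    assert(n < 8);
    select_pins = n;
}

/* Model of the decoder outputs: active low, so only the selected line is 0. */
uint8_t decoder_outputs(void)
{
    return (uint8_t)~(1u << select_pins);
}
```

In this model, calling `shift_out_3(0x12, 0x34, 0x56)` leaves `chain` holding `0x123456`, and `select_device(3)` makes `decoder_outputs()` return `0xF7`, i.e. only line 3 pulled low.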
To do the serial stuff in hardware (SPI-style, using the serial port's synchronous mode) use the 16F628 or 627. They are essentially upgraded F84s with an async/sync serial port, two analog comparators, a 16-bit timer with Capture/Compare, and another 8-bit timer plus 10-bit PWM. And they are cheaper to buy than the F84. But the '84 can bit-bang the bytes out serially in its sleep, so that still isn't a problem.