I would like to see a Text-To-Speech product: one that receives text from another computer via UART and outputs audio through a Class D amplifier. Perhaps it could also power down the amplifier after several seconds of silence, in case the amplifier's idle state draws significant power.
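To make the power-down idea concrete, here is a rough sketch of the idle-timeout logic in Python. Everything hardware-related is an assumption on my part: `set_amp_enabled` is a hypothetical callback that would drive the amplifier's shutdown/enable pin, and the 5-second default is just a placeholder for "several seconds". The clock is injectable because MicroPython-style firmware would likely use a tick counter instead of `time.monotonic`.

```python
import time


class AmpIdleTimer:
    """Switches an amplifier off after a period with no audio output.

    `set_amp_enabled` is a hypothetical callback (True = on, False = off);
    on real hardware it would toggle the amplifier's shutdown pin.
    """

    def __init__(self, set_amp_enabled, idle_timeout_s=5.0, clock=time.monotonic):
        self._set_amp_enabled = set_amp_enabled
        self._idle_timeout_s = idle_timeout_s  # assumed value, tune to taste
        self._clock = clock                    # injectable for testing/porting
        self._last_audio = None
        self.enabled = False

    def on_audio(self):
        """Call just before speech audio is played; wakes the amp if needed."""
        self._last_audio = self._clock()
        if not self.enabled:
            self._set_amp_enabled(True)
            self.enabled = True

    def poll(self):
        """Call regularly from the main loop; powers down once idle long enough."""
        if self.enabled and self._clock() - self._last_audio >= self._idle_timeout_s:
            self._set_amp_enabled(False)
            self.enabled = False
```

In the main loop you would call `on_audio()` each time text arrives over the UART and is about to be spoken, and `poll()` on every pass. A real amplifier may also need a short settling delay after being re-enabled to avoid clipping the first syllable; that is not modelled here.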
If that product is successful, you could always try implementing the reverse, Speech-To-Text, though that requires more processing power.
Various online services already have these features, but the MaixCube would operate completely offline.