Well, as far as I know, the board performs echo cancellation and voice extraction by subtracting the sound it is playing from the sound picked up by the mics (i.e. it subtracts output from input).
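To illustrate the "subtract output from input" idea, here is a rough sketch in Python. I don't know what the board's firmware actually does, so this is not its algorithm. I'm assuming a standard NLMS adaptive filter as a stand-in, and the echo path, signal levels, and function names are all made up for the demo.

```python
import math
import random

def nlms_echo_cancel(mic, ref, taps=8, mu=0.2, eps=1e-8):
    """Subtract an adaptively filtered copy of the playback signal (ref)
    from the microphone signal (mic). Returns the residual signal."""
    w = [0.0] * taps       # running estimate of the speaker-to-mic echo path
    hist = [0.0] * taps    # most recent playback samples, newest first
    out = []
    for m, r in zip(mic, ref):
        hist = [r] + hist[:-1]
        echo_est = sum(wi * xi for wi, xi in zip(w, hist))
        e = m - echo_est   # input minus estimated echo of the output
        norm = sum(x * x for x in hist) + eps
        # NLMS update: nudge the echo-path estimate to shrink the residual
        w = [wi + mu * e * xi / norm for wi, xi in zip(w, hist)]
        out.append(e)
    return out

def simulate(n=4000, seed=0):
    """Build a toy scene: random playback, a made-up echo path, a quiet 'voice'."""
    rng = random.Random(seed)
    ref = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    # Hypothetical echo path: playback arrives at the mics delayed and attenuated.
    echo = [
        (0.6 * ref[i - 2] if i >= 2 else 0.0) + (0.3 * ref[i - 3] if i >= 3 else 0.0)
        for i in range(n)
    ]
    voice = [0.1 * math.sin(2 * math.pi * i / 50) for i in range(n)]
    mic = [e + v for e, v in zip(echo, voice)]
    cleaned = nlms_echo_cancel(mic, ref)
    return mic, voice, cleaned

if __name__ == "__main__":
    mic, voice, cleaned = simulate()
    tail = slice(-1000, None)
    before = sum((m - v) ** 2 for m, v in zip(mic[tail], voice[tail]))
    after = sum((c - v) ** 2 for c, v in zip(cleaned[tail], voice[tail]))
    print("echo energy before:", before)
    print("echo energy after: ", after)
```

The point is that the canceller needs the playback signal as a reference. Without `ref` it has nothing to subtract.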
I use a different sound card for sound output, but I also copy the same audio to the board with the PulseAudio module-combine-sink so that it has something to subtract.
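For reference, a combined sink like this can be set up roughly as follows. This is just a sketch: the sink names are placeholders (the real ones show up in `pactl list short sinks`), and the helper function names are my own.

```python
import subprocess

def combine_sink_command(slaves, sink_name="combined"):
    """Build the pactl command that loads module-combine-sink.

    slaves: names of the sinks that should all receive the same audio.
    """
    if not slaves:
        raise ValueError("need at least one slave sink")
    return [
        "pactl", "load-module", "module-combine-sink",
        f"sink_name={sink_name}",
        "slaves=" + ",".join(slaves),
    ]

def load_combine_sink(slaves, sink_name="combined"):
    """Actually run the command (requires a running PulseAudio server)."""
    subprocess.run(combine_sink_command(slaves, sink_name), check=True)

if __name__ == "__main__":
    # Placeholder sink names, not real device names.
    cmd = combine_sink_command(["main_sound_card_sink", "mic_board_sink"])
    print(" ".join(cmd))
```

After loading, the new combined sink has to be selected as the output so that playback goes to both the main card and the board.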