Greetings!
I have two questions about the ReSpeaker Lite 2-mic USB device.
1. Microphone spacing. Is the physical/relative position of the microphones on the board (e.g. the distance between them) used as a constant in the AEC algorithm? If we solder compatible external microphones to the board and set the distance between them to 20 cm, will the AEC still work?
2. Speaker-to-microphone position. I read somewhere that it is best to place the speaker 1-2 cm away from the board. Which distance is assumed in the AEC calculations, and does a distance of 1-1.5 meters between speaker and microphones affect the AEC?
We are running a test with the speaker 1 meter from the board. The board is rotated so that one microphone faces the speaker and the other faces the opposite direction.
The Xiao ReSpeaker merges far-field audio capture with embedded AI-based speech recognition, making it a neat platform for voice-driven projects—especially where offline or privacy-friendly operation is essential.
So the way I understand it is: yes, the measurement matters. If you watch the video of it in action, the reason it can pick the audio out of the background is that mic spacing, together with the board design that gives it a depth of field too; that is the magic that makes the AEC work well.
I’m sure one of the Seeed’rs will respond, or e-mail tech support.
Any microphones you solder on need to be identical, and all else equal, if it is to work at all.
The highlights are:
1. Dual Microphone Array for Far-Field Voice Capture
(a) Microphone Array Basics
A microphone array (with two or more mics) allows a device to:
Capture sound from multiple directions simultaneously.
Differentiate noise from the user’s voice more effectively than a single mic can.
Perform audio beamforming or direction-of-arrival (DoA) estimation to “focus” on the speaker’s voice, even in a noisy environment.
(b) What “Far-Field” Means
Far-field generally refers to distances beyond ~1 meter, where a device needs to pick out voices or sound sources from ambient noise or reverberations.
By using two (or more) microphones, the device can apply phase/time differences between the signals to locate and enhance the desired source (e.g., a user speaking across the room).
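As a toy illustration of that phase/time-difference idea, here is a minimal sketch in plain Python that estimates the time difference of arrival (TDOA) between two mics by cross-correlation and converts it to a direction angle. The 45 mm spacing and 16 kHz sample rate are placeholder assumptions for the example, not the ReSpeaker Lite's actual specs:

```python
import math

FS = 16000          # sample rate in Hz (assumed)
MIC_DIST = 0.045    # mic spacing in meters (placeholder, not the real board value)
SPEED = 343.0       # speed of sound in m/s

def tdoa_samples(sig_a, sig_b, max_lag):
    """Return the lag (in samples) by which sig_b trails sig_a,
    found by brute-force cross-correlation over +/- max_lag."""
    best_lag, best_corr = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        corr = sum(sig_a[i] * sig_b[i + lag]
                   for i in range(max_lag, len(sig_a) - max_lag))
        if corr > best_corr:
            best_lag, best_corr = lag, corr
    return best_lag

def doa_degrees(lag):
    """Convert a sample lag to a direction-of-arrival angle (0 deg = broadside)."""
    delay = lag / FS
    s = max(-1.0, min(1.0, delay * SPEED / MIC_DIST))
    return math.degrees(math.asin(s))

# Synthetic check: the same pulse arrives 2 samples later at mic B.
mic_a = [0.0] * 64
mic_a[20] = 1.0
mic_b = [0.0] * 64
mic_b[22] = 1.0

lag = tdoa_samples(mic_a, mic_b, max_lag=5)
print(lag, round(doa_degrees(lag), 1))
```

A real DoA estimator would use something like GCC-PHAT with sub-sample interpolation, but the principle is the same: the inter-mic delay encodes the source direction.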
(c) Noise Reduction & Echo Cancellation
Dual mic arrays can implement Acoustic Echo Cancellation (AEC) and Noise Suppression.
This technology helps filter out room echo, reduce background noise, and isolate the user’s speech—critical for better speech recognition and voice commands.
Result: Even if you’re speaking from several feet away (or with background noise), the dual microphones improve speech clarity, enabling the device to pick up commands reliably.
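The AEC part can be sketched with a minimal normalized-LMS (NLMS) canceller: the device knows the reference signal it is playing out, adaptively models the speaker-to-mic path with a short filter, and subtracts the predicted echo from the mic signal. This is an illustrative plain-Python sketch, not the actual XMOS/ReSpeaker pipeline:

```python
import random

def nlms_echo_cancel(ref, mic, taps=8, mu=0.5, eps=1e-8):
    """Normalized LMS: model the speaker-to-mic echo path from the known
    playback reference `ref` and subtract the predicted echo from `mic`.
    Returns the echo-reduced (residual) signal."""
    w = [0.0] * taps                      # adaptive filter weights
    out = []
    for n in range(len(mic)):
        # Most recent `taps` reference samples (zero-padded at the start).
        x = [ref[n - k] if n - k >= 0 else 0.0 for k in range(taps)]
        y = sum(wk * xk for wk, xk in zip(w, x))   # predicted echo
        e = mic[n] - y                             # residual after cancellation
        norm = eps + sum(xk * xk for xk in x)
        w = [wk + mu * e * xk / norm for wk, xk in zip(w, x)]
        out.append(e)
    return out

# Synthetic check: the mic hears the reference attenuated (0.6x)
# and delayed by one sample; the canceller should drive the residual down.
random.seed(0)
ref = [random.uniform(-1, 1) for _ in range(2000)]
mic = [0.6 * (ref[n - 1] if n >= 1 else 0.0) for n in range(len(ref))]
residual = nlms_echo_cancel(ref, mic)
print(max(abs(e) for e in residual[-500:]))   # converges toward zero
```

Real cancellers layer double-talk detection, delay estimation, and non-linear post-processing on top of this core adaptive loop, which is why speaker placement and a clean reference channel matter so much.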
2. Onboard AI ASR Algorithms
(a) Automatic Speech Recognition (ASR) on the Device
Traditionally, speech recognition might require sending audio to a powerful server or cloud service (e.g., Google Cloud, Amazon Alexa) for processing.
Onboard AI-based ASR means the Xiao ReSpeaker can process audio locally, without needing a constant network connection or server round-trip.
(b) Advantages of Onboard ASR
Privacy: Audio data never leaves the device, so sensitive information isn’t sent over the internet.
Latency: Responses can be faster because you don’t wait on network or cloud servers.
Offline Operation: The device can function in offline environments—great for industrial or remote use cases where no Wi-Fi/cellular is available.
Reduced Cloud Costs: Fewer cloud API calls or subscriptions needed for ASR.
Better Integration: The device can run specialized or custom wake-word and voice commands that might be more complex or application-specific than typical cloud-based solutions allow.
(c) Potential Use Cases
Smart Home: Voice commands to control lights, appliances, or other IoT devices.
Kiosks or Appliances: Provide voice interfaces in store displays or commercial machines, where connectivity might be limited.
Robotics: A robot that must parse voice commands locally without relying on external servers.
Edge AI: Real-time, local speech processing for industrial or field deployments.
HTH
GL PJ
You may be able to scale everything up, but the drive power must be raised too; the SPL is factored in.
Hello!
Thank you. That all sounds like the marketing material we already know.
But the problem is that the device cannot cut its own played speech from the microphone input.
The setup is the following:
the firmware ffva_ua_v2.0.6_output_proc0_ref0.bin, which I found here on the forum
USB connection to a Linux machine
an active speaker connected to the 3.5 mm output jack
play/record with aplay/arecord
The result:
I can hear that it tries to process the audio, but it is still possible to recognize the speech played on the speaker in the mic recording. More to the point, speech-to-text algorithms can still recognize it too.
So it is not clear to me: is this how it is supposed to be, or should the ReSpeaker cut its own played speech from the microphone input? Cutting music and noise from the input is only half of the magic; the other half is cutting its own played speech too. That is what makes the device fully usable.
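If you want a number instead of judging by ear, one rough way to quantify the cancellation is an ERLE-style ratio between the mic energy with processing off and with processing on. The `erle_db` helper below is my own sketch, not part of any ReSpeaker tooling; it assumes you have both captures loaded as sample lists:

```python
import math

def rms(x):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(v * v for v in x) / len(x))

def erle_db(mic_raw, mic_aec):
    """Echo return loss enhancement: how much the processed output is
    attenuated relative to the unprocessed echo, in dB (higher = better)."""
    return 20 * math.log10(rms(mic_raw) / max(rms(mic_aec), 1e-12))

# Toy numbers: echo at 0.5 RMS reduced to 0.05 RMS -> 20 dB of cancellation.
raw = [0.5, -0.5] * 100
aec = [0.05, -0.05] * 100
print(round(erle_db(raw, aec), 1))
```

The higher the dB figure, the less of its own playback the device is leaking back into the recording; a few dB of residual like you describe would still leave the speech intelligible to an STT engine.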
Thank you for the answers!
P.S.
With respeaker_lite_usb_dfu_firmware_v2.0.7.bin from the documentation, it looks like AEC is not enabled at all.