Audio Output Quality is 16 kHz on v2

I just received my ReSpeaker Mic Array v2 and the voice recognition is amazing.

However, I noticed that the audio output is very dull / low quality.



I tested the output speaker and cable with my smartphone (same music) and it was better.



From my analysis, I found that the hardware only supports 24-bit 16 kHz as output.



For acceptable audio output quality, 48 kHz would be good. I guess that, due to UAC 1.0 and the DSP algorithms, the rate has to be limited at some point, but 16 kHz makes for really poor output.



Issue also created here: https://github.com/respeaker/usb_4_mic_array/issues/10

Hi there

The max sample rate is 16 kHz. We suggest updating the firmware to 1_channel_firmware.bin by following the instructions at http://wiki.seeedstudio.com/ReSpeaker_Mic_Array_v2.0/#update-firmware and the output audio will then be processed audio for ASR. Thanks.



Bill

I already updated my Mic Array to the 1 channel firmware.

So does this mean there is no plan to raise the audio output quality / sample rates?



This would mean the mic array would always provide very poor audio output (even on good speakers).



The WM8960 audio chip on the Mic Array can handle these sample rates: 8, 11.025, 12, 16, 22.05, 24, 32, 44.1, 48 kHz

The XVF3000 chip can handle sample rates up to 48 kHz.





Just for comparison:

ReSpeaker Mic Array v2 quality - 16 kHz / 24 bit

CD quality - 44.1 kHz / 16 bit

DVD quality - 48 kHz / 16 bit

Hi-Res Audio quality - 96 kHz / 24 bit

Studio quality - 192 kHz / 24 bit
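
To put these numbers in perspective: a sampled signal can only represent frequencies up to half its sample rate (the Nyquist limit), so 16 kHz playback tops out at 8 kHz. The short Python sketch below computes that limit and the raw PCM bitrate for each format in the list. The stereo channel count and the `FORMATS` table are assumptions made purely for comparison.

```python
# Nyquist limit and raw PCM bitrate for the formats listed above.
# Assumption: 2 channels (stereo) for every format, for comparison only.

FORMATS = [
    ("ReSpeaker Mic Array v2", 16_000, 24),
    ("CD", 44_100, 16),
    ("DVD", 48_000, 16),
    ("Hi-Res Audio", 96_000, 24),
    ("Studio", 192_000, 24),
]

def nyquist_hz(sample_rate_hz):
    """Highest frequency a given sample rate can represent."""
    return sample_rate_hz / 2

def raw_bitrate_kbps(sample_rate_hz, bit_depth, channels=2):
    """Uncompressed PCM bitrate in kbit/s."""
    return sample_rate_hz * bit_depth * channels / 1000

for name, rate, bits in FORMATS:
    print(f"{name:24s} up to {nyquist_hz(rate) / 1000:5.2f} kHz, "
          f"{raw_bitrate_kbps(rate, bits):7.1f} kbit/s")
```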





UPDATE: Bill, I think you misunderstood me. I’m talking about the audio output (speaker), not the audio input (microphones).

Hi there, I talked with the software engineer, and the 16 kHz limit is related to the algorithm. Currently we have no plan to update the algorithm. Thanks for understanding.



Bill

Hi, is there an update on this? Is a true 48 kHz sample rate supported on the audio output? I would be fine with a firmware that provides proper 48 kHz playback without any algorithm processing. Is that possible?

Hey!



I’ve tested the 48kHz firmwares (both 1 channel and 6 channels):

  • The 6-channel version just outputs saturated sound (not usable).
  • The 1-channel version outputs at a 48 kHz rate (clear audio), but with strange “sh” artifacts that result in poor audio quality.

Is this something that can be improved?

Are you guys at Seeed still working on the ReSpeaker Mic Array v2?

Hi there~,



I just tested with 48k_1_channel_firmware.bin and 48k_6_channels_firmware.bin. Both of them work well.


  1. Please try to burn the firmware once again.
  2. If the issue is not solved, please use Audacity to record the audio files and post them here. Thanks.

Hey!



I tested the latest 48 kHz firmwares and did not hear any changes.



Here is the file I’m playing (loop-test.wav) and the record from the playback channel of the ReSpeaker (mic.wav) with the 48khz_6channels_firmware.

Please download the files here (too big for the forum quota): https://wetransfer.com/downloads/ce1c7dbb71f48565f5b2413d16785a6820190701150000/da19f88edf3f680a2ad313c251775cb620190701150000/3d4116



I’m on a Raspberry Pi 3B+ with the latest Raspbian Stretch OS (fully updated).



Here is the verbose output of aplay -Dplughw:ArrayUAC10 loop-test.wav -vv:

Playing WAVE 'loop-test.wav' : Signed 24 bit Little Endian in 3bytes, Rate 48000 Hz, Stereo
Plug PCM: Hardware PCM card 1 'ReSpeaker 4 Mic Array (UAC1.0)' device 0 subdevice 0
Its setup is:
  stream       : PLAYBACK
  access       : RW_INTERLEAVED
  format       : S24_3LE
  subformat    : STD
  channels     : 2
  rate         : 48000
  exact rate   : 48000 (48000/1)
  msbits       : 24
  buffer_size  : 24000
  period_size  : 6000
  period_time  : 125000
  tstamp_mode  : NONE
  tstamp_type  : MONOTONIC
  period_step  : 1
  avail_min    : 6000
  period_event : 0
  start_threshold  : 24000
  stop_threshold   : 24000
  silence_threshold: 0
  silence_size : 0
  boundary     : 1572864000
  appl_ptr     : 0
  hw_ptr       : 0



Hope this helps. Tell me if you need more info.



Thanks for your help :slight_smile:

Hi there~,



Here is my hardware connection and audacity result.


  1. First, I run the command below to play music from the terminal.
  2. Then I use Audacity to record the audio. You can say something, then stop the recording later on.
  3. You can select channel 0 in solo mode to play back the processed audio. Thanks.

aplay -D plughw:1,0 -f cd loop-test.wav



[img]https://github.com/SeeedDocument/forum_doc/raw/master/image/48khz-%20mic-array-loop-back-audacity.png[/img]

[url]https://github.com/SeeedDocument/forum_doc/raw/master/reg/48khz_6channel_loop_test_files.zip[/url]

Hi!



Thank you for your feedback.



After more investigation, I think I found the cause of the saturated audio playback with the 48Khz 6 channels firmware.



The culprit seems to be the USB bandwidth.



The saturated audio playback only happens when I capture the 6 channels mic input during playback.



Either the cable or the Raspberry Pi USB bandwidth cannot cope with the 6-channel capture and the 2-channel playback together at 48 kHz. Since your 6-channel Audacity file is correct and without saturation, the real culprit seems to be the USB cable.



Can you confirm this by checking the output of cat /proc/asound/ArrayUAC10/stream0 on your setup, to see whether you have “full speed” or “high speed”?

Here is my output: SEEED ReSpeaker 4 Mic Array (UAC1.0) at usb-3f980000.usb-1.1.2, full speed : USB Audio



There are, however, strange audio playback artifacts on specific song lyrics (saturation on “sh” and “ss” sounds, as in “share” and “strong”) that I do not hear when playing the same audio file on Windows or with the 16 kHz firmwares.



The AEC with the 48 kHz firmwares is also a lot worse (almost nonexistent).



Is it something that can be improved?



Thank you again for your help.

Hi there~



Here is my output. I am checking with software team about saturation on “sh” and “ss” sounds. thanks.

[code]pi@raspberrypi:~ $ cat /proc/asound/ArrayUAC10/stream0
SEEED ReSpeaker 4 Mic Array (UAC1.0) at usb-3f980000.usb-1.2, full speed : USB Audio

Playback:
Status: Stop
Interface 1
Altset 1
Format: S24_3LE
Channels: 2
Endpoint: 1 OUT (ASYNC)
Rates: 48000, 48000, 48000

Capture:
Status: Stop
Interface 2
Altset 1
Format: S16_LE
Channels: 6
Endpoint: 2 IN (ASYNC)
Rates: 48000, 48000, 48000
pi@raspberrypi:~ $[/code]
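
For anyone who wants to compare setups programmatically, the listing above can be turned into a small dictionary. This is only a minimal sketch: it assumes the file has exactly the layout quoted in this thread (one header line, then one Playback and one Capture section), and `parse_stream_info` is a hypothetical helper name, not part of ALSA.

```python
# Minimal parser for the text of /proc/asound/<card>/stream0.
# Assumption: the layout matches the output quoted in this thread, with a
# single alternate setting per direction (later values would overwrite).

SAMPLE = """\
SEEED ReSpeaker 4 Mic Array (UAC1.0) at usb-3f980000.usb-1.2, full speed : USB Audio

Playback:
Status: Stop
Interface 1
Altset 1
Format: S24_3LE
Channels: 2
Endpoint: 1 OUT (ASYNC)
Rates: 48000, 48000, 48000

Capture:
Status: Stop
Interface 2
Altset 1
Format: S16_LE
Channels: 6
Endpoint: 2 IN (ASYNC)
Rates: 48000, 48000, 48000
"""

def parse_stream_info(text):
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    header = lines[0]
    # The bus speed sits between the last comma and the ":" separator.
    speed = header.rsplit(",", 1)[1].split(":")[0].strip()
    info = {"speed": speed}
    section = None
    for ln in lines[1:]:
        if ln in ("Playback:", "Capture:"):
            section = ln[:-1].lower()
            info[section] = {}
        elif section and ":" in ln:
            key, value = ln.split(":", 1)
            info[section][key.strip().lower()] = value.strip()
    return info

info = parse_stream_info(SAMPLE)
print(info["speed"], info["playback"]["format"], info["capture"]["channels"])
```

On the Pi itself, the same function could be fed the contents of /proc/asound/ArrayUAC10/stream0 instead of `SAMPLE`.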

Hi!



Using the 48k 6 channels firmware, the audio playback quality is degraded as soon as I start capturing.
[code]
SEEED ReSpeaker 4 Mic Array (UAC1.0) at usb-3f980000.usb-1.1.3, full speed : USB Audio

Playback:
Status: Stop
Interface 1
Altset 1
Format: S24_3LE
Channels: 2
Endpoint: 1 OUT (ASYNC)
Rates: 48000, 48000, 48000

Capture:
Status: Stop
Interface 2
Altset 1
Format: S16_LE
Channels: 6
Endpoint: 2 IN (ASYNC)
Rates: 48000, 48000, 48000
[/code]


Using usbtop, I can confirm that the USB bandwidth is nowhere near its max capacity (< 1MB/s).
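
For reference, the raw payload of the two streams can be estimated from the formats reported in the stream0 output. The sketch below is a back-of-the-envelope calculation, not a measurement: it ignores USB protocol overhead, and it assumes the nominal 12 Mbit/s full-speed signalling rate as the ceiling.

```python
# Rough payload estimate for simultaneous playback and capture at 48 kHz.
# Assumptions: formats taken from the stream0 output; USB protocol
# overhead is ignored; 12 Mbit/s is the nominal full-speed rate.

FULL_SPEED_BYTES_PER_S = 12_000_000 // 8

def stream_bytes_per_s(rate_hz, channels, bytes_per_sample):
    return rate_hz * channels * bytes_per_sample

playback = stream_bytes_per_s(48_000, 2, 3)  # S24_3LE, 2 channels
capture = stream_bytes_per_s(48_000, 6, 2)   # S16_LE, 6 channels
total = playback + capture

print(f"playback: {playback} B/s")
print(f"capture:  {capture} B/s")
print(f"total:    {total} B/s "
      f"({100 * total / FULL_SPEED_BYTES_PER_S:.1f}% of nominal full speed)")
```

Under these assumptions the two streams add up to 864,000 bytes/s, which is consistent with a reading below 1 MB/s, although it is more than half of the nominal full-speed rate of 1.5 MB/s.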



Your setup is identical to mine but I am not able to get a clear audio playback during capture.



Here is what I do :



Session 1:
[code]aplay loop-test.wav -Dplughw:CARD=ArrayUAC10 -vv[/code]
Stdout:
[code]
Playing WAVE 'loop-test.wav' : Signed 24 bit Little Endian in 3bytes, Rate 48000 Hz, Stereo
Plug PCM: Hardware PCM card 1 'ReSpeaker 4 Mic Array (UAC1.0)' device 0 subdevice 0
Its setup is:
  stream       : PLAYBACK
  access       : RW_INTERLEAVED
  format       : S24_3LE
  subformat    : STD
  channels     : 2
  rate         : 48000
  exact rate   : 48000 (48000/1)
  msbits       : 24
  buffer_size  : 24000
  period_size  : 6000
  period_time  : 125000
  tstamp_mode  : NONE
  tstamp_type  : MONOTONIC
  period_step  : 1
  avail_min    : 6000
  period_event : 0
  start_threshold  : 24000
  stop_threshold   : 24000
  silence_threshold: 0
  silence_size : 0
  boundary     : 1572864000
  appl_ptr     : 0
  hw_ptr       : 0
[/code]

Audio is clear. Then I start the capture in another session.



Session 2:
[code]arecord -Dhw:CARD=ArrayUAC10 mic.wav -c 6 -r 48000 -f S16_LE -vv[/code]
Stdout:
[code]
Recording WAVE 'mic.wav' : Signed 16 bit Little Endian, Rate 48000 Hz, Channels 6
Hardware PCM card 1 'ReSpeaker 4 Mic Array (UAC1.0)' device 0 subdevice 0
Its setup is:
  stream       : CAPTURE
  access       : RW_INTERLEAVED
  format       : S16_LE
  subformat    : STD
  channels     : 6
  rate         : 48000
  exact rate   : 48000 (48000/1)
  msbits       : 16
  buffer_size  : 24000
  period_size  : 6000
  period_time  : 125000
  tstamp_mode  : NONE
  tstamp_type  : MONOTONIC
  period_step  : 1
  avail_min    : 6000
  period_event : 0
  start_threshold  : 1
  stop_threshold   : 24000
  silence_threshold: 0
  silence_size : 0
  boundary     : 1572864000
  appl_ptr     : 0
  hw_ptr       : 0
[/code]

Audio is degraded as soon as arecord starts capturing.
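
The two sessions can also be driven from one script, which makes the reproduction easier to hand to someone else. The sketch below reuses the exact aplay and arecord arguments from the sessions above; the 5-second delay, the helper names and the `reproduce` function are assumptions for illustration, and the block itself only prints the command lines.

```python
# Sketch: start playback, then start capture a few seconds later.
# Assumptions: aplay/arecord are installed, the card is named ArrayUAC10,
# and loop-test.wav is in the current directory. Helper names are made up.
import subprocess
import time

CARD = "ArrayUAC10"

def playback_cmd(wav_path):
    return ["aplay", wav_path, f"-Dplughw:CARD={CARD}", "-vv"]

def capture_cmd(out_path):
    return ["arecord", f"-Dhw:CARD={CARD}", out_path,
            "-c", "6", "-r", "48000", "-f", "S16_LE", "-vv"]

def reproduce(wav_path="loop-test.wav", out_path="mic.wav", delay_s=5.0):
    player = subprocess.Popen(playback_cmd(wav_path))
    time.sleep(delay_s)        # playback runs alone for delay_s seconds
    recorder = subprocess.Popen(capture_cmd(out_path))
    player.wait()              # playback continues while capturing
    recorder.terminate()       # arecord keeps running until it is stopped
    recorder.wait()

# Print the command lines; call reproduce() on the Pi to actually run them.
print(" ".join(playback_cmd("loop-test.wav")))
print(" ".join(capture_cmd("mic.wav")))
```

Calling `reproduce()` on the Raspberry Pi should give a few seconds of playback alone, followed by playback with capture running, so the two conditions can be compared by ear in one run.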



See the Audacity capture: https://we.tl/t-xi2h2gJpZZ



I have no clue why… Hope this helps your software engineers identify the cause. Tell me if you need more info.



Thanks again for your help.

Hi there~



The software team looked at the audio performance degradation at 48 kHz. We do not have a good solution for this issue. Thanks for understanding.

Hey!

Thank you for your feedback.



I understand completely that it can be challenging and you guys have already made an awesome product. This audio quality issue is the only thing preventing it from being perfect.



Instead of fixing the audio degradation issue in the 48 kHz 6-channel firmware, do you think you could improve the AEC of the 48 kHz 1-channel firmware, which is really bad compared to the 16 kHz one (which is excellent), and fix the audio artifacts? The 48 kHz 1-channel firmware does not have this audio degradation issue (only strange audio artifacts, which may be related). The AEC is almost nonexistent, though.



Is using 44100 Hz instead of 48000 Hz possible? Using 16bits instead of 24bits? Would it help?



Thanks for your help :slight_smile: Really appreciate your support

Hi, I wanted to ask something different. For AEC we need to use the ReSpeaker as both source and sink. Both of them have a 16 kHz sampling rate, but the sink has 24-bit depth while the source has 16-bit depth.

Do you think the difference in bit depth will cause problems from the AEC perspective? I am trying to equalize them, with no success so far :slight_smile:
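
On the format side, 24-bit and 16-bit PCM carry the same signal at different resolutions, so two recordings can be brought to a common depth in software before comparing them. The sketch below converts packed S24_3LE bytes to S16_LE by dropping the least significant byte of each sample. It only illustrates the format difference and says nothing about how the firmware handles its AEC reference internally; the function name is hypothetical.

```python
# Convert packed 24-bit little-endian PCM (S24_3LE) to 16-bit (S16_LE)
# by discarding the lowest 8 bits of every sample (simple truncation).

def s24_3le_to_s16_le(data: bytes) -> bytes:
    if len(data) % 3:
        raise ValueError("S24_3LE data must be a multiple of 3 bytes")
    out = bytearray()
    for i in range(0, len(data), 3):
        # Bytes are little-endian: [low, mid, high]. Keeping mid and high
        # is equivalent to an arithmetic right shift by 8 bits.
        out += data[i + 1:i + 3]
    return bytes(out)

# Two example samples: 0x123456 and -1.
packed = (0x123456).to_bytes(3, "little") + (-1).to_bytes(3, "little", signed=True)
converted = s24_3le_to_s16_le(packed)
first = int.from_bytes(converted[0:2], "little", signed=True)
second = int.from_bytes(converted[2:4], "little", signed=True)
print(hex(first), second)
```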