Noise suppression with librespeaker on ReSpeaker Core v2

Noise Suppression doesn’t work



Hi!
I've tried again using a fast SD card (SanDisk 128 GB V30 U1 A2) but still get the same error.
What if I create an image of my ReSpeaker Core v2 and send it to you for testing?
How can I send you such a big image?

Thanks!

Hi Cinettoa,



You can upload your image to Google Drive. But there is one more thing to confirm before that: please check that you are using the latest image, downloaded from here: https://v2.fangcloud.com/share/7395fd138a1cab496fd4792fe5?folder_id=188000311814&lang=en



Thanks.

Hi,

I confirm that I am using respeaker-debian-9-lxqt-sd-20180801-4gb.img, which should be the latest.

Tomorrow I will upload the image of my SD card and send you the Google Drive link.

Hi, I’ve sent you the google drive link via private messaging.

Please let me know your findings

Hi Jerry,

Do you have any feedback from your testing?

I tried to compile and link the code, but there are references to header files that do not exist in the librespeaker-dev include files, e.g. “vep_doa_kws_node.h”.



If I download the missing header files from searching on the net, it compiles but does not link:



/tmp/ccaZ9AuP.o: In function `main':

main_vep_nr_test.cc:(.text+0x482): undefined reference to `respeaker::VepAecBeamformingNode::Create(int, bool)'

main_vep_nr_test.cc:(.text+0x504): undefined reference to `respeaker::VepDoaKwsNode::Create(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int, bool, bool)'

collect2: error: ld returned 1 exit status



librespeaker Version: 2.1.1-build181119

Hi,

I’ve seen the new release 2.1.1 and I’ve tried to compile the new examples. It seems that the librespeaker-dev package doesn’t install the header files. So I’ve downloaded them (from librespeaker website one by one with copy and paste…)

then put respeaker.h in /usr/include/respeaker/

and all the other headers in /usr/include/respeaker/chain_nodes .



Then you need to modify the examples, because they contain the wrong path to the various hotword models (the librespeaker package does install those models, but the examples point to the wrong location).



Then you can compile with g++ as usual.



However, I would like to say that it is pretty disappointing that nobody from ReSpeaker has been following this forum for a month.

Hi Cinettoa,



I apologize for my late reply. I am very sorry about this.



Does this new librespeaker2.1.1 work on your board?

For the header files problems, I think you have to remove “/usr/local/lib/pkgconfig/respeaker.pc” manually.



Sincere apology,

Jerry

Hi Jerry,

Thanks for your reply.



I’ve just finished testing the new librespeaker 2.1.1 and it seems the issue with the NS module is fixed.



Thanks for your effort to improve the product.



Now I have to find out the way to send the output to Snips or to another ASR but I guess I have to try out the respeakerd code for this.



If you have any additional suggestion you are more than welcome!

What do you think about capturing audio via the ALSA API and redirecting the processed audio stream with ALSA aloop? Then Snips or other ASRs could capture the processed audio stream easily.

Hi Jerry,



That could be good. The more general the better, so that it can adapt to future ASR APIs.



Are you going to develop a new respeaker release for this?

Yes, this is what we are doing.

Sounds good! Do you have a timeline for the new release?

Hi,



The new release still needs some time, but I can send you a test version. You can download it from this link:

https://v2.fangcloud.com/share/11a20219f73655959b42aa57f8?lang=en



Install the lib:

tar -jxvf librespeaker.tar.bz2

sudo dpkg -i librespeaker_2.1.2-build181212_armhf.deb

sudo dpkg -i librespeaker-dev_2.1.2-build181212_armhf.deb



Compile the example:

g++ alsa_aloop_test.cc -o alsa_aloop_test -lrespeaker -lsndfile -fPIC -std=c++11 -fpermissive -I/usr/include/respeaker/ -DWEBRTC_LINUX -DWEBRTC_POSIX -DWEBRTC_NS_FLOAT -DWEBRTC_APM_DEBUG_DUMP=0 -DWEBRTC_INTELLIGIBILITY_ENHANCER=0



/** @example alsa_aloop_test.cc
 *
 * This is an example that shows how to redirect the processed audio stream into a specific ALSA device (Loopback PCM).
 * To run this example, you have to run ‘sudo modprobe snd-aloop’ first. Also make sure “pulseaudio” is not running; then
 * you can use “arecord -Dhw:Loopback,1,0 -c 1 -r 16000 -f S16_LE loop_test.wav” to record the processed audio stream.
 * Furthermore, you can set up a third-party voice assistant to capture voice from “hw:Loopback,1,0” and run the assistant directly.
 */



Note that the header files are located in /usr/include/respeaker, and you can find how to use aloop_node in “aloop_output_node.h”. The other APIs are the same as in the previous version.

The config files are in /usr/share/respeaker.

Feel free to ask me if you have any questions.

Thanks,

Jerry

Hi,



Thanks for this new aloop node.

This is very useful.



I got it to work with your example.

But if you let it run for some time, you eventually get a long list of “underrun” errors like the following, and it stops recording at that point.



(145514ms)DEBUG – Try to recovery from underrun [aloop_output_node.cc:277]

(145986ms)DEBUG – Try to recovery from underrun [aloop_output_node.cc:277]





Any idea on how to fix this?

Thanks

“underrun” means the aloop node did not send the audio data to the loopback buffer on time.

If you got a long list of these, it was probably because the previous nodes did not send the audio data to the aloop node. I think something made the vep node get stuck at that time. You can monitor the CPU usage with “htop”, and you will find that 2 threads of this example rise to 100% before the “underrun” happens. We are trying to avoid this stall, and the example will still work after an “underrun” has happened.
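As a rough illustration of what “on time” means here, the sketch below works through the block timing using figures from the log quoted later in this thread (sample rate 48000, chunk size 1920, MaxBlockDelayTime of 6 * 40ms). It assumes the chunk size is counted in frames at that sample rate, which the log does not state explicitly. The function names are made up for this example and are not part of librespeaker.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not a librespeaker API): duration of one audio block
// in milliseconds, given its length in frames and the sample rate in Hz.
std::int64_t BlockDurationMs(std::int64_t frames, std::int64_t sample_rate_hz) {
    assert(sample_rate_hz > 0);  // guard against division by zero
    return frames * 1000 / sample_rate_hz;
}

// Hypothetical helper: total delay that can be absorbed when up to
// `max_blocks` blocks of `block_ms` milliseconds each may be queued.
std::int64_t DelayBudgetMs(std::int64_t max_blocks, std::int64_t block_ms) {
    return max_blocks * block_ms;
}
```

With the logged values, BlockDurationMs(1920, 48000) gives 40 and DelayBudgetMs(6, 40) gives 240, which matches the “6 * 40ms = 240ms” line in the log. If the previous nodes stall for longer than that, the loopback buffer presumably runs dry and an underrun is reported.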



Thanks,

Jerry

Thanks for the reply.

Using htop helps the monitoring indeed.



I also tried to adapt the example: I don’t need wake-word detection, but I do want clean audio while capturing it live, before sending it to an ASR API.

So I linked the nodes: alsa -> hybrid(agc) -> hybrid(ns) -> hybrid(vad) -> vep_1beam -> selector -> aloop

Based on my tests, this seems to be the best combination, but maybe you can recommend a better one based on your experience.

I used the latest version of the library, 2.1.2, and could not build successfully with the HybridNode Create function, hence the multiple individual hybrid nodes:

“… undefined reference to `respeaker::HybridNode::Create(bool, int, int, int, bool, int, bool)'”



In any case, after a few minutes the program starts printing the underrun messages and never stops.

I can still record, but many blocks are missing and the result is no longer very good.



Would you have any suggestions to improve my use cases?

Let me know if you need more details too.



Thank you very much for your help,

Fred

Hi,



Thanks for the feedback.

Let me explain these nodes in detail:

(1) hybrid_node: this node provides the NS, AGC and VAD algorithms from the WebRTC library, not from Alango, so we recommend using the vep_node instead. What is more, in your case you can enable NS, AGC and VAD in the same instance, for example:
hybrid.reset(HybridNode::Create((bool)enable_ns, (int)ns_level, (int)agc_type, (int)agc_level, (bool)enable_vad, (int)vad_sensitivity, true));
The more nodes you have, the more latency you get.

(2) vep_node: this node provides the NS, BF and AEC algorithms from Alango, and these algorithms are better than the ones in hybrid_node.

(3) snowboy_1b_node: this node provides not only KWS but also AGC and VAD. You can call respeaker->GetVad() to get the VAD status (see pulse_snowboy_1b_multi_hotword_vad_test.cc for more detail).



Back to your case, I still recommend using the example I provided: alsa->vep->snowboy->aloop. You can just ignore the KWS part.

Can you also send me a debug log from your program from when the endless “underrun” happens?



Thanks,

Jerry.

Thanks

That was very useful information which helped me a lot.



So I went back to your example but still have the “underrun” issue.

I just realized that it starts when I use aplay (or play) to play the recording from another SSH session.



Here is the use case:

  • In a first SSH session, I run your example as follows:

    $ sudo ./alsa_aloop_test -t CIRCULAR_6MIC_7BEAM -g 5

    using snowboy kws

    AGC = -5

    (48ms)INFO – Sample rate is 48000 [alsa_collector_node.cc:337]

    (48ms)DEBUG – Buffer time max is 341334 [alsa_collector_node.cc:355]

    (196ms)DEBUG – Chunk size is 1920 [alsa_collector_node.cc:396]

    (196ms)INFO – Finish setting Alsa hardware params. [alsa_collector_node.cc:399]

    (206ms)DEBUG – VepAecBeamformingNode read microphone config file successfully. [vep_aec_beamforming_node.cc:181]

    (207ms)DEBUG – VEP: I need 604248 bytes of memory in memory region 1 to work.

    [vep_aec_beamforming_node.cc:207]

    (207ms)DEBUG – VEP: I need 0 bytes of memory in memory region 2 to work.

    [vep_aec_beamforming_node.cc:207]

    (211ms)DEBUG – VepAecBeamformingNode input: channels 8 rate 16000, output: channels 3 rate 16000 [vep_aec_beamforming_node.cc:340]

    (212ms)INFO – VepAecBeamformingNode thread started. [vep_aec_beamforming_node.cc:342]

    (238ms)INFO – Snowboy1bDoaKwsNode enabled AGC. [snowboy_1b_doa_kws_node.cc:276]

    (238ms)DEBUG – Snowboy1bDoaKwsNode input: channels 3 rate 16000, output: channels 1 rate 16000 [snowboy_1b_doa_kws_node.cc:291]

    (238ms)INFO – Snowboy1bDoaKwsNode thread started. [snowboy_1b_doa_kws_node.cc:293]

    (241ms)INFO – Finish setting Aloop [aloop_output_node.cc:225]

    (242ms)INFO – MaxBlockDelayTime is actually set to: 6 * 40ms = 240ms. [aloop_output_node.cc:343]

    num channels: 1, rate: 16000

    collector: 1, vep_1beam: 0, snowboy_kws: 0, aloop: 0

    collector: 2, vep_1beam: 0, snowboy_kws: 0, aloop: 0




  • In a second SSH session, I record as follows:

    $ sox -t alsa hw:Loopback,1,0 -t wavpcm -c 1 -b 16 -r 16000 -e signed-integer --endian little - silence 1 0.1 1% 1 1.0 3% > aloop_test.wav



    Everything works fine, so I play the recording back to listen to it:

    $ aplay aloop_test.wav


  • Then, in the first session, the following underrun message starts printing non-stop and I cannot record any more:

    (17988ms)DEBUG – Try to recovery from underrun [aloop_output_node.cc:277]

    (18473ms)DEBUG – Try to recovery from underrun [aloop_output_node.cc:277]

    collector: 3, vep_1beam: 0, snowboy_kws: 0, aloop: 0

    (18960ms)DEBUG – Try to recovery from underrun [aloop_output_node.cc:277]

    collector: 0, vep_1beam: 0, snowboy_kws: 0, aloop: 0

    (19432ms)DEBUG – Try to recovery from underrun [aloop_output_node.cc:277]

    collector: 1, vep_1beam: 0, snowboy_kws: 0, aloop: 0

    (19907ms)DEBUG – Try to recovery from underrun [aloop_output_node.cc:277]

    collector: 0, vep_1beam: 0, snowboy_kws: 0, aloop: 0

    (20391ms)DEBUG – Try to recovery from underrun [aloop_output_node.cc:277]

    (20867ms)DEBUG – Try to recovery from underrun [aloop_output_node.cc:277]

    collector: 1, vep_1beam: 1, snowboy_kws: 0, aloop: 0

    (21341ms)DEBUG – Try to recovery from underrun [aloop_output_node.cc:277]

    collector: 0, vep_1beam: 0, snowboy_kws: 0, aloop: 0

    (21821ms)DEBUG – Try to recovery from underrun [aloop_output_node.cc:277]

    collector: 1, vep_1beam: 0, snowboy_kws: 0, aloop: 0

    (22296ms)DEBUG – Try to recovery from underrun [aloop_output_node.cc:277]

    collector: 0, vep_1beam: 0, snowboy_kws: 0, aloop: 0
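To make the pattern in a log like this easier to see, the timestamps can be pulled out of the underrun lines and the gaps between them measured. This is only a sketch: it assumes each line starts with a “(<number>ms)” prefix as shown above and matches on the message text as it is printed in this log. The function names are invented for this example and are not part of librespeaker.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Returns the timestamp in ms of an underrun log line, or -1 if the line
// is not an underrun message or has no "(<digits>ms)" prefix.
long ParseUnderrunMs(const std::string& line) {
    if (line.find("Try to recovery from underrun") == std::string::npos) return -1;
    const std::size_t open = line.find('(');
    if (open == std::string::npos) return -1;
    const std::size_t close = line.find("ms)", open);
    if (close == std::string::npos) return -1;
    const std::string digits = line.substr(open + 1, close - open - 1);
    if (digits.empty() ||
        digits.find_first_not_of("0123456789") != std::string::npos) {
        return -1;
    }
    return std::stol(digits);
}

// Gaps in ms between consecutive underrun messages; other lines are skipped.
std::vector<long> UnderrunGapsMs(const std::vector<std::string>& lines) {
    std::vector<long> gaps;
    long prev = -1;
    for (const std::string& line : lines) {
        const long t = ParseUnderrunMs(line);
        if (t < 0) continue;
        if (prev >= 0) gaps.push_back(t - prev);
        prev = t;
    }
    return gaps;
}

// Integer mean of the gaps; requires at least one gap.
long MeanGapMs(const std::vector<long>& gaps) {
    assert(!gaps.empty());
    long sum = 0;
    for (long g : gaps) sum += g;
    return sum / static_cast<long>(gaps.size());
}
```

Fed the lines above, this reports gaps of roughly 470 to 490 ms between messages, so the recovery attempt repeats about twice per second instead of clearing after one try.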



    Am I doing anything wrong?



    Thanks,

    Fred