Voice Activity Detection (VAD) on ReSpeaker_4-Mic_Linear_Array

I was able to run voice activity detection using the ReSpeaker_4_Mic_Array_for_Raspberry_Pi (circular array) and the code provided in the mic_array repository.



Is there any available code to run voice activity detection with the linear mic array? Using the code for the circular mic array did not produce nice results.

Hi there~



Please follow the instructions below.



DOA without Keywords



Step 1. Set up the dependencies

[code]
sudo apt-get install portaudio19-dev
sudo pip install pyaudio
sudo pip install webrtcvad
sudo apt-get install python-numpy
sudo pip install pyusb
[/code]

Step 2. Run the vad_doa.py

[code]
cd ~
git clone https://github.com/respeaker/mic_array.git
cd mic_array
nano vad_doa.py   # change to the code below, then run
python vad_doa.py
[/code]

Here is the vad_doa.py code with the DOA part disabled.

[code]
import sys
import webrtcvad
import numpy as np
from mic_array import MicArray
#from pixel_ring import pixel_ring

RATE = 16000
CHANNELS = 8
VAD_FRAMES = 10     # ms
#DOA_FRAMES = 200   # ms


def main():
    vad = webrtcvad.Vad(3)  # 3 = most aggressive filtering of non-speech

    speech_count = 0
    chunks = []
    #doa_chunks = int(DOA_FRAMES / VAD_FRAMES)

    try:
        # Integer division keeps the chunk size an int on Python 3 as well
        with MicArray(RATE, CHANNELS, RATE * VAD_FRAMES // 1000) as mic:
            for chunk in mic.read_chunks():
                # Use single channel audio to detect voice activity:
                # take every CHANNELS-th sample of the interleaved chunk
                if vad.is_speech(chunk[0::CHANNELS].tobytes(), RATE):
                    speech_count += 1
                    sys.stdout.write('1')
                else:
                    sys.stdout.write('0')

                sys.stdout.flush()

                # DOA is disabled, so chunks are not collected
                #chunks.append(chunk)
                #if len(chunks) == doa_chunks:
                    #if speech_count > (doa_chunks / 2):
                        #frames = np.concatenate(chunks)
                        #direction = mic.get_direction(frames)
                        #pixel_ring.set_direction(direction)
                        #print('\n{}'.format(int(direction)))

                    #speech_count = 0
                    #chunks = []

    except KeyboardInterrupt:
        pass

    # pixel_ring is not imported while DOA is disabled
    #pixel_ring.off()


if __name__ == '__main__':
    main()
[/code]

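Run the VAD Python code. The ALSA and JACK warnings at startup can be seen in the output below; the 1s and 0s that follow are the per-frame VAD decisions.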

pi@raspberrypi:~/mic_array $ python vad_doa.py
Expression 'alsa_snd_pcm_hw_params_set_period_size_near( pcm, hwParams, &alsaPeriodFrames, &dir )' failed in 'src/hostapi/alsa/pa_linux_alsa.c', line: 924
Expression 'alsa_snd_pcm_hw_params_set_period_size_near( pcm, hwParams, &alsaPeriodFrames, &dir )' failed in 'src/hostapi/alsa/pa_linux_alsa.c', line: 924
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.front
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.rear
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.center_lfe
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.side
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.surround21
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.surround21
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.surround40
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.surround41
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.surround50
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.surround51
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.surround71
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.iec958
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.iec958
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.iec958
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.hdmi
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.hdmi
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.modem
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.modem
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.phoneline
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.phoneline
ALSA lib confmisc.c:1281:(snd_func_refer) Unable to find definition 'defaults.bluealsa.device'
ALSA lib conf.c:4528:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
ALSA lib conf.c:4996:(snd_config_expand) Args evaluate error: No such file or directory
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM bluealsa
ALSA lib confmisc.c:1281:(snd_func_refer) Unable to find definition 'defaults.bluealsa.device'
ALSA lib conf.c:4528:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
ALSA lib conf.c:4996:(snd_config_expand) Args evaluate error: No such file or directory
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM bluealsa
ALSA lib pcm_dmix.c:990:(snd_pcm_dmix_open) The dmix plugin supports only playback stream
Cannot connect to server socket err = No such file or directory
Cannot connect to server request channel
jack server is not running or cannot be started
JackShmReadWritePtr::~JackShmReadWritePtr - Init not done for -1, skipping unlock
JackShmReadWritePtr::~JackShmReadWritePtr - Init not done for -1, skipping unlock
(0, 'bcm2835 ALSA: IEC958/HDMI (hw:0,1)', 0L, 2L)
(1, 'seeed-8mic-voicecard: - (hw:1,0)', 8L, 8L)
Use seeed-8mic-voicecard: - (hw:1,0)
11111110000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001111111111100000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000111111100000000000000000000000000000000000000000001111111100000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000111111110000000000000000000000000001111111111111111110111111111100000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001111111111111111000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000pi@raspberrypi:~/mic_array $

Thanks for the response, but my problem was less about the ability to produce data and more about how the results don’t seem meaningful. At best this implementation of VAD is more like noise detection using a threshold, but has no processing / filtering for actual speech. In addition, DOA should only show 180 degrees for a linear mic array, instead of 360 degrees, but clearly the code is intended for the circular mic array.



My question was more about a way to tune the code to match the hardware rather than to get the provided code to run.
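On the VAD side, one common way to tune the raw per-frame 0/1 stream is to smooth it over a short window, so that isolated bursts are ignored and short pauses inside a sentence do not end the speech segment. The following is only a minimal sketch of that idea. The function name `smooth_vad` and the window and threshold values are assumptions for illustration, not part of the mic_array repository, and they would need tuning for a given room and microphone.

```python
from collections import deque


def smooth_vad(flags, window=10, on_count=8, off_count=2):
    """Turn per-frame VAD flags into a steadier speech / non-speech state.

    flags: iterable of booleans, one per frame (e.g. vad.is_speech results).
    The state switches on when at least on_count of the last `window`
    frames are voiced, and switches off when at most off_count are voiced.
    All three values are illustrative assumptions.
    """
    recent = deque(maxlen=window)
    speaking = False
    out = []
    for flag in flags:
        recent.append(bool(flag))
        if len(recent) == window:
            voiced = sum(recent)
            if not speaking and voiced >= on_count:
                speaking = True
            elif speaking and voiced <= off_count:
                speaking = False
        out.append(speaking)
    return out
```

With 10 ms frames, the default window covers 100 ms, so a burst of three voiced frames never triggers the speech state, while a sustained run of voiced frames does after roughly 80 ms.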





Some more details:



I am testing speech detection in a wall mounted application using a few different products from ReSpeaker. The ReSpeaker Mic Array v2.0 includes a chip from XMOS that has algorithms built in for VAD and DOA. These algorithms seem to perform better than the software algorithms as discussed here, but the circular array doesn’t necessarily make sense in a wall mounted application so I wanted to test the accuracy of the software algorithms using the ReSpeaker 4-Mic Linear Array Kit for Raspberry Pi. All provided algorithms seem geared towards use for ReSpeaker 4 Mic Array for Raspberry Pi (pi-hat circular array). I would expect that the software algorithms would need to take into account the mic orientation, mic separation, etc – but haven’t seen a different code base that takes this into account.
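To illustrate what "taking mic separation into account" could look like for a linear array, here is a rough sketch that estimates the delay between the two outermost microphones with GCC-PHAT and converts it to an angle between 0° and 180°. It is not code from the mic_array repository. The function names, the channel indices of the outer microphones (`first`, `last`) and the `mic_distance` argument are assumptions; the actual spacing and channel order have to be taken from the hardware documentation.

```python
import numpy as np

SOUND_SPEED = 343.0  # m/s, approximate value for air at room temperature


def gcc_phat(sig, ref, fs, max_tau, interp=16):
    """Delay of sig relative to ref in seconds, estimated with GCC-PHAT."""
    n = len(sig) + len(ref)                 # zero-pad to avoid circular wrap
    cross = np.fft.rfft(sig, n=n) * np.conj(np.fft.rfft(ref, n=n))
    cross /= np.abs(cross) + 1e-15          # PHAT weighting: keep phase only
    cc = np.fft.irfft(cross, n=interp * n)  # upsampled cross-correlation
    max_shift = max(1, min(int(interp * fs * max_tau), interp * n // 2))
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    shift = np.argmax(np.abs(cc)) - max_shift
    return shift / float(interp * fs)


def linear_doa(frames, channels, fs, mic_distance, first=0, last=3):
    """Angle in degrees, in [0, 180], from interleaved multi-channel samples.

    mic_distance is the spacing in metres between the `first` and `last`
    microphones (an assumption the caller must supply). 90 means the
    source is straight in front of (or behind) the array; 0 means it lies
    along the array axis on the side of the `last` microphone.
    """
    frames = np.asarray(frames)
    a = frames[first::channels].astype(np.float64)
    b = frames[last::channels].astype(np.float64)
    max_tau = mic_distance / SOUND_SPEED    # largest physically possible delay
    tau = gcc_phat(a, b, fs, max_tau)
    cos_theta = np.clip(tau / max_tau, -1.0, 1.0)
    return float(np.degrees(np.arccos(cos_theta)))
```

Because a single line of microphones cannot tell front from back, the result is limited to 180° by construction, which matches what one would expect from a wall-mounted linear array.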



Let me know if anybody has tested the ReSpeaker 4-Mic Linear Array Kit for Raspberry Pi and has produced promising results.

A bit late to the party.



The XMOS chip can work with a linear array, but the board would have to be designed and laid out. Seeed can provide this service if you commit to ordering and paying for the customization; there is a minimum price for this kind of service. If you want to know more, you can send an email to me at seth.welday@seeed.cc.



The algorithms you get for VAD on the Pi Shields, if they are present, are based on software.

The algorithms on the XMOS are hardware based. I don't have a solution, but I wanted to clarify that.

I am pushing this to the software team to ask them to look at this again. There should be a DOA program for the linear array; I'm pretty sure I saw it working before. However, have you tested whether a voice at about 15° also generates 345°? If so, then it may just return either x° or (360° - x°). If this is the case, it's easy to solve on your end, though obviously not ideal.
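If the output does turn out to be mirrored like that, the workaround on the user side could be as small as the following sketch. The helper name `fold_to_half_plane` is made up for illustration, and it assumes that 0° and 180° lie along the axis of the linear array.

```python
def fold_to_half_plane(angle_deg):
    """Map a 0-360 degree reading onto 0-180 by mirroring across the array axis."""
    angle = angle_deg % 360.0
    return angle if angle <= 180.0 else 360.0 - angle
```

With this, readings of 15° and 345° both come out as 15°.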