I was able to run voice activity detection using the ReSpeaker_4_Mic_Array_for_Raspberry_Pi (circular array) and the code provided in the mic_array repository.
Is there any code available for running voice activity detection with the linear mic array? The code for the circular mic array did not produce good results.
Hi there~
Please follow the instructions below.
DOA without Keywords
Step 1. Install the dependencies
[code]
sudo apt-get install portaudio19-dev
sudo pip install pyaudio
sudo pip install webrtcvad
sudo apt-get install python-numpy
sudo pip install pyusb
[/code]
Step 2. Edit and run vad_doa.py
[code]
cd ~
git clone https://github.com/respeaker/mic_array.git
cd mic_array
nano vad_doa.py
# Replace the file contents with the code below, then run: python vad_doa.py
[/code]
Here is the vad_doa.py code with the DOA part disabled.
[code]
import sys
import webrtcvad
import numpy as np  # only needed if the DOA block below is re-enabled
from mic_array import MicArray
# from pixel_ring import pixel_ring

RATE = 16000        # sample rate in Hz
CHANNELS = 8        # channel count of the sound card
VAD_FRAMES = 10     # ms
# DOA_FRAMES = 200  # ms


def main():
    vad = webrtcvad.Vad(3)  # 3 is the most aggressive VAD mode

    speech_count = 0
    # chunks = []
    # doa_chunks = int(DOA_FRAMES / VAD_FRAMES)

    try:
        # Integer division keeps the chunk size an int on Python 3 too
        with MicArray(RATE, CHANNELS, RATE * VAD_FRAMES // 1000) as mic:
            for chunk in mic.read_chunks():
                # Use single channel audio to detect voice activity:
                # the samples are interleaved, so take every CHANNELS-th one
                if vad.is_speech(chunk[0::CHANNELS].tobytes(), RATE):
                    speech_count += 1
                    sys.stdout.write('1')
                else:
                    sys.stdout.write('0')

                sys.stdout.flush()

                # DOA is disabled; also skip buffering chunks, otherwise
                # the list grows without bound
                # chunks.append(chunk)
                # if len(chunks) == doa_chunks:
                #     if speech_count > (doa_chunks / 2):
                #         frames = np.concatenate(chunks)
                #         direction = mic.get_direction(frames)
                #         pixel_ring.set_direction(direction)
                #         print('\n{}'.format(int(direction)))
                #
                #     speech_count = 0
                #     chunks = []

    except KeyboardInterrupt:
        pass

    # pixel_ring is not imported above, so leave this disabled as well
    # pixel_ring.off()


if __name__ == '__main__':
    main()
[/code]
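A note on the frame size and the "single channel" comment in the script: webrtcvad only accepts 10, 20 or 30 ms frames of 16-bit mono PCM, so the multi-channel chunk has to be reduced to one channel first. The sketch below shows the arithmetic with plain Python lists. It assumes the samples arrive interleaved (ch0, ch1, ..., ch7, ch0, ...), and the helper names are made up for illustration; they are not part of mic_array.

```python
RATE = 16000     # Hz
CHANNELS = 8
VAD_FRAMES = 10  # ms


def frame_samples(rate_hz, frame_ms):
    """Number of samples per channel in one frame."""
    return rate_hz * frame_ms // 1000


def take_channel(interleaved, channels, index=0):
    """Pick one channel out of an interleaved sample sequence."""
    return interleaved[index::channels]


n = frame_samples(RATE, VAD_FRAMES)

# Fake chunk for illustration: every sample stores its own channel number.
chunk = [ch for _ in range(n) for ch in range(CHANNELS)]

mono = take_channel(chunk, CHANNELS, 0)
print(n, len(chunk), len(mono))
```

Passing the whole interleaved chunk to the VAD would hand it 8 times as many samples as one 10 ms mono frame, which is not a frame length it supports.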
Run the VAD script. The output looks like this:
</s>pi@raspberrypi:~/mic_array $ python vad_doa.py
Expression 'alsa_snd_pcm_hw_params_set_period_size_near( pcm, hwParams, &alsaPeriodFrames, &dir )' failed in 'src/hostapi/alsa/pa_linux_alsa.c', line: 924
Expression 'alsa_snd_pcm_hw_params_set_period_size_near( pcm, hwParams, &alsaPeriodFrames, &dir )' failed in 'src/hostapi/alsa/pa_linux_alsa.c', line: 924
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.front
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.rear
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.center_lfe
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.side
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.surround21
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.surround21
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.surround40
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.surround41
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.surround50
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.surround51
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.surround71
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.iec958
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.iec958
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.iec958
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.hdmi
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.hdmi
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.modem
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.modem
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.phoneline
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.phoneline
ALSA lib confmisc.c:1281:(snd_func_refer) Unable to find definition 'defaults.bluealsa.device'
ALSA lib conf.c:4528:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
ALSA lib conf.c:4996:(snd_config_expand) Args evaluate error: No such file or directory
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM bluealsa
ALSA lib confmisc.c:1281:(snd_func_refer) Unable to find definition 'defaults.bluealsa.device'
ALSA lib conf.c:4528:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
ALSA lib conf.c:4996:(snd_config_expand) Args evaluate error: No such file or directory
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM bluealsa
ALSA lib pcm_dmix.c:990:(snd_pcm_dmix_open) The dmix plugin supports only playback stream
Cannot connect to server socket err = No such file or directory
Cannot connect to server request channel
jack server is not running or cannot be started
JackShmReadWritePtr::~JackShmReadWritePtr - Init not done for -1, skipping unlock
JackShmReadWritePtr::~JackShmReadWritePtr - Init not done for -1, skipping unlock
(0, 'bcm2835 ALSA: IEC958/HDMI (hw:0,1)', 0L, 2L)
(1, 'seeed-8mic-voicecard: - (hw:1,0)', 8L, 8L)
Use seeed-8mic-voicecard: - (hw:1,0)
11111110000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001111111111100000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000111111100000000000000000000000000000000000000000001111111100000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000111111110000000000000000000000000001111111111111111110111111111100000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001111111111111111000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000pi@raspberrypi:~/mic_array $<e>
Thanks for the response, but my problem is less about getting the code to produce data and more about the results not seeming meaningful. At best, this VAD implementation behaves like threshold-based noise detection, with no processing or filtering for actual speech. In addition, DOA for a linear mic array should only span 180 degrees, not 360 degrees, but the code is clearly intended for the circular mic array.
My question was more about a way to tune the code to match the hardware rather than to get the provided code to run.
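To make the tuning question concrete: one common way to make a per-frame 0/1 VAD stream less jumpy is a sliding-window vote, where speech is only reported once enough of the recent frames were flagged. This is a rough sketch of that idea, not something from the mic_array repository. The function name, the window of 10 frames and the minimum of 6 speech frames are all assumptions to be tuned.

```python
from collections import deque


def smooth_vad(decisions, window=10, min_speech=6):
    """Yield 1 when at least min_speech of the last `window` frames were speech.

    decisions: iterable of raw per-frame VAD results (truthy = speech).
    """
    recent = deque(maxlen=window)  # drops the oldest frame automatically
    for d in decisions:
        recent.append(1 if d else 0)
        yield 1 if sum(recent) >= min_speech else 0


# A single stray "speech" frame in silence is suppressed...
raw = [0] * 20 + [1] + [0] * 20
print(''.join(str(x) for x in smooth_vad(raw)))

# ...while a sustained run switches on after min_speech frames.
print(''.join(str(x) for x in smooth_vad([1] * 10)))
```

With 10 ms frames this adds roughly 60 ms of latency before speech is reported, which is the price for ignoring short clicks and bumps.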
Some more details:
I am testing speech detection in a wall-mounted application using a few different ReSpeaker products. The ReSpeaker Mic Array v2.0 includes an XMOS chip with built-in VAD and DOA algorithms. These seem to perform better than the software algorithms discussed here, but a circular array doesn’t necessarily make sense in a wall-mounted application, so I wanted to test the accuracy of the software algorithms using the ReSpeaker 4-Mic Linear Array Kit for Raspberry Pi. All of the provided algorithms seem geared toward the ReSpeaker 4 Mic Array for Raspberry Pi (the circular pi-hat array). I would expect the software algorithms to need to account for mic orientation, mic separation, and so on, but I haven’t seen a code base that does.
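To illustrate what I mean by taking the geometry into account: for two mics on a line, the far-field relation between the time difference of arrival (TDOA) and the angle is cos(theta) = c * tau / d, which can only ever give 0 to 180 degrees. Below is a minimal sketch of that conversion only. The function name is mine, the 0.05 m spacing is a placeholder rather than the real spacing of the linear kit, and estimating tau itself (for example by cross-correlating two channels) is not shown.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, approximate value at room temperature


def linear_doa_degrees(tdoa_s, spacing_m, c=SPEED_OF_SOUND):
    """Angle between the array axis and the source, in degrees (0 to 180).

    tdoa_s:    time difference of arrival between the two mics, in seconds
    spacing_m: distance between the two mics, in metres
    """
    ratio = c * tdoa_s / spacing_m
    # Noise can push the ratio slightly outside [-1, 1]; clamp before acos.
    ratio = max(-1.0, min(1.0, ratio))
    return math.degrees(math.acos(ratio))


SPACING = 0.05  # placeholder spacing in metres, NOT the real board value

print(linear_doa_degrees(0.0, SPACING))                        # broadside
print(linear_doa_degrees(SPACING / SPEED_OF_SOUND, SPACING))   # along the axis
print(linear_doa_degrees(-SPACING / SPEED_OF_SOUND, SPACING))  # opposite end
```

The point is that the mic separation appears directly in the formula, so code tuned for the circular board's geometry cannot simply be reused for the linear one.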
Let me know if anybody has tested the ReSpeaker 4-Mic Linear Array Kit for Raspberry Pi and has produced promising results.
A bit late to the party.
The XMOS chip can work with a linear array, but the board would have to be designed and laid out. Seeed can provide this service if you commit to ordering and paying for the customization; there is a minimum price for this kind of service. If you want to know more, you can send me an email at [email protected].
The VAD algorithms you get on the Pi shields, where they are present, run in software.
The algorithms on the XMOS run in hardware. I don’t have a solution, but I wanted to clarify that.
I am pushing this to the software team and asking them to look at it again. There should be a DOA program for the linear array; I’m pretty sure I saw it working before. In the meantime, have you tested whether a voice at about 15° also produces 345°? If so, the code may simply be returning either x° or (360° - x°). In that case it is easy to work around on your end, though obviously not ideal.
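If the readings do turn out to be mirrored like that, the workaround could be as small as the sketch below. It assumes x° and (360° - x°) really are mirror images of the same direction, and the function name is just for illustration.

```python
def fold_to_half_plane(angle_deg):
    """Map a 0-360 degree reading onto 0-180 by mirroring the upper half."""
    a = angle_deg % 360.0
    return a if a <= 180.0 else 360.0 - a


for reading in (15, 345, 180, 200):
    print(reading, '->', fold_to_half_plane(reading))
```

You would call it on the value returned by mic.get_direction(frames) before printing or using it.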