Detecting specific sounds, not voices

I’m currently waiting for my hardware, but I thought I’d open this thread now since I can’t find much documentation on the subject.

I want to track the direction of loud sounds with the “ReSpeaker 6 Microphone Circular Array Add On Board for Raspberry Pi”. I want to classify each sound first and, if it meets the criteria set for the situation, stream an audio clip covering the 30 seconds before the sound and the 30 seconds after it.

What is the best way to approach this?
The environment is basically a workshop where nothing is allowed to be dropped. There will be a background noise floor that needs to be ignored.
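
To make the question more concrete, here is a rough sketch of the buffering side as I currently picture it. It is untested against real hardware and does not use any ReSpeaker API: frames are plain lists of float samples, and the names `EventClipper`, `ratio` and `alpha` are made up for illustration. A simple RMS threshold above a running noise-floor estimate stands in for the real classifier, which I have not chosen yet.

```python
from collections import deque
import math


def rms(frame):
    """Root-mean-square level of one frame of samples."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))


class EventClipper:
    """Keeps a ring buffer of recent frames; once a loud frame is seen,
    returns pre-roll + trigger frame + post-roll as one clip.
    Assumes post_frames >= 1."""

    def __init__(self, pre_frames, post_frames, ratio=4.0, alpha=0.05):
        self.history = deque(maxlen=pre_frames)  # pre-roll ring buffer
        self.post_frames = post_frames
        self.ratio = ratio    # assumed: "loud" means ratio x the noise floor
        self.alpha = alpha    # assumed: smoothing factor for the floor estimate
        self.floor = None     # running noise-floor estimate (RMS)
        self.clip = None      # clip being recorded, or None when idle
        self.remaining = 0    # post-roll frames still to collect

    def push(self, frame):
        """Feed one frame. Returns a finished clip (list of frames) or None."""
        if self.clip is not None:
            # Recording: keep collecting post-roll frames.
            self.clip.append(frame)
            self.remaining -= 1
            if self.remaining == 0:
                done, self.clip = self.clip, None
                return done
            return None

        level = rms(frame)
        if self.floor is None:
            self.floor = level  # seed the estimate with the first frame

        if level > self.ratio * max(self.floor, 1e-6):
            # Trigger: a real classifier would be consulted here instead.
            self.clip = list(self.history) + [frame]
            self.remaining = self.post_frames
            self.history.clear()
            return None

        # Quiet frame: update the noise floor and remember the frame.
        self.floor = (1 - self.alpha) * self.floor + self.alpha * level
        self.history.append(frame)
        return None


if __name__ == "__main__":
    quiet, loud = [0.01] * 4, [1.0] * 4
    clipper = EventClipper(pre_frames=3, post_frames=2)
    for f in [quiet] * 5 + [loud] + [quiet] * 3:
        clip = clipper.push(f)
        if clip is not None:
            print("clip with", len(clip), "frames")
```

For the real thing, `pre_frames` and `post_frames` would be sized to 30 seconds each. Assuming, for example, a 16 kHz stream read in 1024-sample frames, that is about 469 frames per side. Whether this kind of threshold is good enough, or whether the classification should happen before the buffering decision, is exactly what I am unsure about.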