Bluefruit scanner BLE Callback hanging after 5 minutes

Hi All

I have a XIAO nRF acting as a BLE scanner using the Arduino Bluefruit library, picking up sensor data from multiple other XIAO nRFs.

All works fine for 4-5 minutes, then the BLE scan_callback stops firing.
Bluefruit.Scanner.setRxCallback(scan_callback);

I have a watch function that checks every 10 seconds that the callback counter is still being incremented; if it is not, I try to restart the Bluefruit scanner service.

A call to Bluefruit.Scanner.resume() doesn’t work, and calling Bluefruit.Scanner.stop() then Bluefruit.Scanner.start(0) also doesn’t restart the callbacks.

Doing the full Bluefruit.begin… set up again causes a device reboot.

I could update the watch function to force a reboot, but that feels a bit like going straight for the hammer.

Any ideas on why a callback would stop getting called?

Thanks
Dave

edit: possibly related to the following, although they used Arduino BLE not Bluefruit

Hi there,
Are you sure it’s NOT running out of RAM used by the SD when switching peripherals?
Have you tried just one-to-one for a longer time?
Without seeing the code, it’s a guess. Do you call BLE.end anywhere?
HTH
GL :slight_smile: PJ

Hi there

SD=SoftDevice. I am not connecting to the other devices, I am in Observer mode so reading the advertisements in ACTIVE scan.

Does Bluefruit have a BLE.end command? I have used the Bluefruit.Scanner.stop() command to no effect.

More to the point: have you had issues with long-running scans on the XIAO?

The code is a bit big to post, happy to DM it to you to run locally if you would like to have a look.

Dave

Hi there,
Yes, I have seen these issues pop up occasionally when only scanning, though. It seems others have had it stop unexpectedly before as well.
BLE.end or Advertise.stop? Not sure about Bluefruit; I know ArduinoBLE has it.
How is it powered? Start with that being the first look.
Are you looking at the Manufacturer.Data changing in the scans to pass the data?
maybe post up the BLE SETUP part only?
HTH
GL :slight_smile: PJ :v:

Thanks,

Not sure if I am running out of memory or if it is a timing issue. I haven’t managed to get my jlink debug running yet, but that is a story for another post.

Powered via USB for testing; it has a regulated 5 V supply in the field and is not battery powered.

Setup calls the startScanner function which contains the standard Bluefruit settings for an Observer.

The callback needs to run as fast as possible, so all it does is move the received packet into a queue for later processing, increment the counter, then return.

The loop first checks whether there are any new scans from the callback to process, dequeues one, then processes it.
Secondly, we check those processed scans to link each scan and its response together as a single object; some sensors store data in the scan, others store it in the response. The packet is then forwarded to the packetCheck function to decode the Manufacturer Data depending on the sensor type.

Lastly, the loop checks every 10 seconds whether the callback count has increased; there are multiple sensors, so we should be catching multiple scans each second.
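
The 10-second check can be sketched with elapsed-time arithmetic instead of testing `millis() % 10000 == 0`, which only fires if loop() happens to run during that exact millisecond and can be skipped whenever the queue branches are busy. This is a minimal sketch, not the code from this project: `ScanWatchdog` is a hypothetical helper (not part of Bluefruit), and the timestamp is passed in so it can be exercised off-device.

```cpp
#include <cstdint>

// Hypothetical helper, not part of the Bluefruit library.
// Reports a stall when the callback counter has not moved for a full period.
class ScanWatchdog {
public:
  explicit ScanWatchdog(uint32_t periodMs) : periodMs_(periodMs) {}

  // Call from loop() with millis() and the current callback counter.
  // Returns true only when a full period has elapsed with no new callbacks.
  bool stalled(uint32_t nowMs, uint32_t callbackCount) {
    // Unsigned subtraction stays correct when millis() wraps around.
    if (nowMs - lastCheckMs_ < periodMs_) {
      return false;
    }
    lastCheckMs_ = nowMs;
    bool noProgress = (callbackCount == lastCount_);
    lastCount_ = callbackCount;
    return noProgress;
  }

private:
  uint32_t periodMs_;
  uint32_t lastCheckMs_ = 0;
  uint32_t lastCount_ = 0;
};
```

In the sketch this would be called as `if (watchdog.stalled(millis(), callbackCount)) { ... }` on every pass through loop(), whichever branch ran before it.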

There are other files managing decoding the different sensor types, global settings, and serializing to JSON to send to the data logger.


void setup()
{
#ifdef DEBUG_LOCAL
  Serial.begin(115200);
#endif
  Serial1.begin(57600); // Serial1 to send data to the Logger ESP32
  //   delay(10); // for nrf52840 with native usb

  DEBUG_PRINTLN("BLE Scan to Serial");
  DEBUG_PRINTLN("------------------------------------\n");
  startScanner();
}

void loop()
{

  if (callbackQueueItemCount() > 0)
  {
    // DEBUG_PRINT("Callback Queue Depth: ");
    // DEBUG_PRINT(callbackQueueItemCount());
    // DEBUG_PRINTLN();
    ble_gap_evt_adv_report_t report = callbackDequeue();
    scan_callback_process(&report);
    return;
  }

  // uint32_t startTime = micros();
  // Process packets from the queue
  // only process if there are 2 or more packets in queue
  if (queueItemCount() >= 2)
  {


    BlePacket sensorBLEPacket;

    BlePacket packet = dequeuePacket();
    BlePacket packet2 = dequeuePacket();
    {
      if (packet.ScanResponse == 0 && packet2.ScanResponse == 0)
      { // process packet and return packet2 to the start of the queue
        // DEBUG_PRINT("Scan & Scan   ");

        if (packet.deviceNameLength > 0)
        {
          sensorBLEPacket = packet;
        }

        queueInsertStart(packet2);
        memset(&packet2, 0, sizeof(packet2));
      }
      else if (packet.ScanResponse == 0 && packet2.ScanResponse == 1)
      { // merge packet and packet2, then process
        // DEBUG_PRINT("Scan & Response  ");
        if (packet.AddressString == packet2.AddressString)
        {
          if (packet.deviceNameLength > 0 || packet2.deviceNameLength > 0) // Or
          {
            sensorBLEPacket = mergePacket(packet, packet2);
          }
          else
          {
            sensorBLEPacket = emptyPacket; // Clear the packet
          }
        }
        else
        {
          sensorBLEPacket = packet;
        }
      }
      else
      { // drop packet and return packet2 to the start of the queue
        sensorBLEPacket = emptyPacket; // Clear the packet
      }

      if (sensorBLEPacket.deviceNameLength > 0)
      {
        packetCheck(sensorBLEPacket);
      }

      memset(&packet, 0, sizeof(packet));
      memset(&packet2, 0, sizeof(packet2));
      memset(&sensorBLEPacket, 0, sizeof(sensorBLEPacket));
    }
  }
  else
  {
    uint32_t checkTime = millis();
    if (checkTime == lastCheckTime)
    {
      return;
    }
    lastCheckTime = checkTime;
    if (checkTime % 10000 == 0) // Every 10 seconds
    {
      DEBUG_PRINT("Total Time :");
      DEBUG_PRINT(millis());
      DEBUG_PRINT("  Queue Depth: ");
      DEBUG_PRINT(queueItemCount());
      DEBUG_PRINT("  Callback Queue Depth: ");
      DEBUG_PRINT(callbackQueueItemCount());
      DEBUG_PRINT("  Callback Count: ");
      DEBUG_PRINT(callbackCount);
      DEBUG_PRINT("  Scanner Running: ");
      DEBUG_PRINT(Bluefruit.Scanner.isRunning());
      DEBUG_PRINTLN();
      if (callbackLastCount == callbackCount)
      {
        DEBUG_PRINTLN("*** Scanner Stopped - Restarting ***");
        startScanner();
      }
      callbackLastCount = callbackCount;
    }
  }
}

void startScanner()
{
  // Initialize Bluefruit with maximum connections as Peripheral = 0, Central = 1
  // SRAM usage required by SoftDevice will increase dramatically with number of connections
  Bluefruit.begin(0, 1);
  Bluefruit.setTxPower(0); // Check bluefruit.h for supported values

  /* Set the device name */
  Bluefruit.setName("Bluefruit52");

  /* Set the LED interval for blinky pattern on BLUE LED */
  Bluefruit.setConnLedInterval(250); // Only for testing
  // Bluefruit.setConnLedInterval(false);  // Turn off in Prod

  /* Start Central Scanning
   * - Enable auto scan if disconnected
   * - Filter out packet with a min rssi
   * - Interval = 100 ms, window = 100 ms
   * - Use active scan (used to retrieve the optional scan response adv packet)
   * - Start(0) = will scan forever since no timeout is given
   */
  Bluefruit.Scanner.setRxCallback(scan_callback);
  Bluefruit.Scanner.restartOnDisconnect(true);
  // Bluefruit.Scanner.filterMSD(0xFFFF);          // Filter on Manufacturer Specific Data
  // Bluefruit.Scanner.filterRssi(-80);
  // Bluefruit.Scanner.setInterval(160, 80); // in units of 0.625 ms
  Bluefruit.Scanner.setIntervalMS(100, 100);
  Bluefruit.Scanner.useActiveScan(true); // Request scan response data
  Bluefruit.Scanner.start(0);            // 0 = Don't stop scanning after n seconds

  DEBUG_PRINTLN("Scanning ...");
}

void scan_callback(ble_gap_evt_adv_report_t *report)
{
  // Add the packet to the queue
  callbackQueueInsert(report);

  callbackCount++;

  Bluefruit.Scanner.resume();
}


void callbackQueueInsert(ble_gap_evt_adv_report_t *report)
{
  // Check if the queue is full
  if ((callbackQueueEnd + 1) % MAX_QUEUE_SIZE == callbackQueueStart)
  {
    DEBUG_PRINT("Callback Queue Max Depth: ");
    DEBUG_PRINT(callbackQueueItemCount());
    DEBUG_PRINTLN();
    // Queue is full, handle overflow (e.g., discard oldest packet)
    callbackDequeue();
  }

  // Add the packet to the end of the queue
  callbackQueue[callbackQueueEnd] = *report;
  callbackQueueEnd = (callbackQueueEnd + 1) % MAX_QUEUE_SIZE;
}
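
One thing worth ruling out in the queue code above: both scan_callback (through callbackDequeue() on overflow) and loop() move callbackQueueStart. If the callback runs in a different task or interrupt context from loop(), which is an assumption here and not something confirmed in this thread, the two can race on the indices. Below is a minimal sketch of a single-producer/single-consumer ring buffer in which each side writes only its own index. `SpscQueue` and its method names are hypothetical, and unlike the code above it drops the newest item when full instead of evicting the oldest.

```cpp
#include <atomic>
#include <cstddef>

// Hypothetical single-producer/single-consumer ring buffer.
// The producer (scan callback) writes only tail_; the consumer (loop) writes only head_.
template <typename T, size_t N>
class SpscQueue {
public:
  // Producer side. Returns false (the new item is dropped) when the queue is full.
  bool push(const T &item) {
    size_t tail = tail_.load(std::memory_order_relaxed);
    size_t next = (tail + 1) % N;
    if (next == head_.load(std::memory_order_acquire)) {
      return false; // full: one slot stays empty to tell full from empty
    }
    buf_[tail] = item;
    tail_.store(next, std::memory_order_release);
    return true;
  }

  // Consumer side. Returns false when the queue is empty.
  bool pop(T &out) {
    size_t head = head_.load(std::memory_order_relaxed);
    if (head == tail_.load(std::memory_order_acquire)) {
      return false;
    }
    out = buf_[head];
    head_.store((head + 1) % N, std::memory_order_release);
    return true;
  }

private:
  T buf_[N];
  std::atomic<size_t> head_{0};
  std::atomic<size_t> tail_{0};
};
```

With this shape the callback would only call `push(*report)` and loop() would only call `pop(report)`, so neither side ever touches the other's index. Usable capacity is N - 1 items.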

Hi there,
OK, the flow sounds fine. I would look at the queue first: specify a size and make it small on purpose, then see if the scan stalls earlier. Then make it twice the size.
What transmit power level is it set at?
So with an active scan, you request additional info from the AP; any AP can respond if you have more than one. Do you have a BLE sniffer available? (An nRF BLE dongle and Wireshark make it easy to see whether it’s the BLE stalling or the MCU locking up.)
I think it’s the former. Is the responder ready to give data after an active scan request?
Try it with a passive scan, don’t take any data, and see if it lasts longer than 10 minutes.
HTH
GL :slight_smile: PJ
You may want to start with an active scan, then switch to a passive scan?
I use this code to test reading the Manufacturer Data in the scans, which changes every 30 seconds or so, with nRF Connect for Desktop and a $9 Nordic dongle. YMMV :v:
It repeats it out of the serial port also.

#include <ArduinoBLE.h>

//#define APP_COMPANY_IDENTIFIER 0x0059
#define BUTTON_PIN 1
#define MAX_ADVERTISEMENTS 10
#define ADVERTISEMENT_INTERVAL 30000 // 30 seconds in milliseconds

BLEAdvertisingData advertisingData;
BLEAdvertisingData scanResponseData;
bool advertisingEnabled = true;
int advertisementNumber = 1;

void setup() {
  Serial.begin(9600);
  while (!Serial)
    delay(100); // for nrf52840 with native usb
  delay(1000);
  Serial.println("Bluetooth advertising Example");
  Serial.println("------------------------------\n");
  pinMode(BUTTON_PIN, INPUT_PULLUP);

  // Initialize BLE
  if (!BLE.begin()) {
    Serial.println("Failed to initialize BLE!");
    while (1);
  }

  // Set device name
  BLE.setLocalName("MyDevice");

  // Set initial manufacturer data
  uint8_t initialManufacturerData[] = {0x4E, 0x65, 0x77, 0x01}; // "New1" in Hex
  advertisingData.setManufacturerData(initialManufacturerData, sizeof(initialManufacturerData));

  // Start advertising
  BLE.setAdvertisingData(advertisingData);
  BLE.advertise();
  Serial.println("Advertising started...");
}

void loop() {
  static unsigned long lastUpdateTime = 0;
  static int counter = 1;
  // Check if the button is pressed to stop advertising
  if (advertisingEnabled && digitalRead(BUTTON_PIN) == LOW) {
    advertisingEnabled = false;
    BLE.stopAdvertise(); // Stop advertising
    Serial.println("Advertising stopped by button press");
  }
  // Update the payload every 30 seconds while advertising is enabled
  if (advertisingEnabled && millis() - lastUpdateTime >= ADVERTISEMENT_INTERVAL) {
    counter++;
    advertisementNumber++;
    if (counter > MAX_ADVERTISEMENTS) {
      counter = 1; // Wrap around after MAX_ADVERTISEMENTS updates
    }
    updateManufacturerData(counter);
    lastUpdateTime = millis();
    Serial.print("Updated manufacturer data to: New");
    Serial.println(counter);
  }

  delay(100);
}

void updateManufacturerData(int counter) {
  uint8_t manufacturerData[] = {
    0x4E, 0x65, 0x77, // "New" in ASCII
    static_cast<uint8_t>('0' + (counter % 10)) // Last digit of counter as ASCII
  };

  // Set manufacturer data for advertisingData object
  advertisingData.setManufacturerData(manufacturerData, sizeof(manufacturerData));
  BLE.setAdvertisingData(advertisingData);
  BLE.advertise();
}

Thanks for the feedback.

The queue defaults to a max of 10, but I have never seen more than 2 in there at a time; the loop runs much faster than the callbacks, so it clears quickly. I have set it to MAX=3 for testing.

I have commented out the packetCheck function, so none of the post processing will run, to check the callback is not getting bogged down with the other processing.

      if (sensorBLEPacket.deviceNameLength > 0)
      {
        // packetCheck(sensorBLEPacket);
      }

It lasted 10 minutes but then still hung, so the issue is somewhere in the code I shared.

Power is set to Bluefruit.setTxPower(0);

I have the nrf dongle so I will load up Wireshark to sniff the BLE traffic.

I think that the BLE is stalling first; then, left long enough, the MCU hangs, the USB port hangs and the blue LED stops flashing.


Hi there,
Yes, I agree. You’re close, you’ll get it.
You may want to change it so the callback just sets a flag and the loop does the post-processing, polling for the flag. Should be plenty fast enough though.
Good Stuff
GL :slight_smile: PJ :v:

Also try setting the TX power to 4.

Found it, and it was not what I was expecting. Running for 3 hours + and still going.

In the Loop I was cleaning up the packet values using memset() once finished.

      if (sensorBLEPacket.deviceNameLength > 0)
      {
        packetCheck(sensorBLEPacket);
      }

      memset(&packet, 0, sizeof(packet));
      memset(&packet2, 0, sizeof(packet2));
      memset(&sensorBLEPacket, 0, sizeof(sensorBLEPacket));

Replacing memset() with a copy of an empty packet value solved the problem.

      packet = emptyPacket;          // Clear the packet
      packet2 = emptyPacket;         // Clear the packet
      sensorBLEPacket = emptyPacket; // Clear the packet

A better programmer than me will need to explain why this worked, but it does.
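
One possible explanation, assuming BlePacket holds a non-trivial member such as an Arduino String (the AddressString comparison in the loop suggests it might): memset() over such an object overwrites the member's internal pointer and length without running its destructor. That is undefined behaviour and can leak or corrupt heap memory, whereas assigning emptyPacket goes through the member's own assignment operator. Below is a host-side sketch with `std::string` standing in for `String`; `Packet` and `clearPacket` are hypothetical names, not the types from this project.

```cpp
#include <string>
#include <type_traits>

// Hypothetical stand-in for BlePacket; std::string plays the role of Arduino String.
struct Packet {
  std::string address;
  int deviceNameLength = 0;
  int scanResponse = 0;
};

// memset() is only valid on trivially copyable types. Packet is not one,
// because std::string owns heap memory through an internal pointer.
static_assert(!std::is_trivially_copyable<Packet>::value,
              "Packet owns heap memory, so memset() on it is undefined behaviour");

// Safe reset: assignment lets each member release or reuse its own storage.
inline void clearPacket(Packet &p) {
  p = Packet{};
}
```

If the real struct contains only plain integers and fixed char arrays, memset() is fine and this would not explain the difference.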

Thanks for your help.

I will create a new post explaining my J-Link issues; it would be great if you have any thoughts on that.

Thanks
Dave

Would you believe it… about 2 minutes after my last post it hung.

It ran for 3:18 hours so that is a win…

Hi there,
No Kidding…LOL man sometimes this stuff… YIKES :face_with_peeking_eye:
Well, this is a great project for some PlatformIO with its superior debugging.
If you have J-Link then that advances the cause easily. I have several posts on J-Link etc.
Check those on here, lots of screenshots from J-Link Commander, et al.
Considering the deep :nerd_face: meaning of memset:

The memset() function fills a block of memory with a particular byte value (most often 0).
It sets every byte of the destination to the given value. The value is passed as an int and converted to unsigned char before it is written. This function generally increases the readability of the program.
I think the answer may be in the highlight, I ponder
HTH
GL :slight_smile: PJ

Found the issue.

Using the Arduino “String” data type can be problematic for longer running programs as it can cause heap fragmentation.

This looks to have been my issue when generating the JSON output, where each result was dynamically sized.

When using the String datatype my device would hang after 3 hours; removing the String datatype and replacing it with fixed-size char arrays resulted in the device running for 36+ hours.

A good explanation can be found here.
