Bus faults with xiao nrf54l15?

Has anyone had issues with bus faults on the xiao nrf54l15?

I was able to get my matter application working on the xiao nrf54l15, however, I was blocked for a chunk of time with bus faults. I happened to turn off the debug config and then my matter app was stable. I was then able to reproduce the bus faults with blinky and debug config enabled. I’m not sure if it is related to HW or SW. It seems to be more prevalent when I have debug config turned on. Right now, blinky with the config below is stable again, so it seems intermittent. I’m about to dive in with the debugger, unless someone already knows something about this.

Blinky config:

  • nrf connect and toolchain 3.1.0 (I plan to try 3.1.1 today)
  • board file from: GitHub platform-seeedboards/zephyr/boards/arm/xiao_nrf54l15
  • blinky zephyr sample

prj:

CONFIG_GPIO=y

# Other settings
CONFIG_MPU_STACK_GUARD=y
CONFIG_RESET_ON_FATAL_ERROR=n

CONFIG_ASSERT=y

CONFIG_DEBUG=y
CONFIG_DEBUG_INFO=y
CONFIG_DEBUG_OPTIMIZATIONS=y
CONFIG_DEBUG_THREAD_INFO=y
CONFIG_THREAD_NAME=y

CONFIG_LOG_MODE_IMMEDIATE=y
CONFIG_LOG=y
CONFIG_LOG_DEFAULT_LEVEL=4

Hi there,

So, I’m not getting any build issues on 3.1.1. On 3.1.0 I was having cryptic errors, which seem to have gone away or been fixed.
Note: I did remove the other toolchains (2.8.0, 2.9.1).

HTH
GL :slight_smile: PJ :v:

I moved forward to 3.1.1 and I still reproduce the issue. I also tested on two boards and built using two different systems and can still reproduce the issue. This mostly rules out a one-off xiao board issue or a system build issue.

My simple recipe is below if anyone has some bandwidth to try to reproduce. More background below that.

Config:

  • Plain (non-Sense) xiao nrf54l15
  • nrf connect and toolchain 3.1.0 or 3.1.1

Steps

  • Create blinky app
  • Build and run with all defaults → should work
  • Add the debug config below to prj.conf → Should stop working
  • Create an overlay file with the snippet below → should work again

prj.conf snippet

CONFIG_MPU_STACK_GUARD=y
CONFIG_RESET_ON_FATAL_ERROR=n

CONFIG_ASSERT=y

CONFIG_DEBUG=y
CONFIG_DEBUG_INFO=y
CONFIG_DEBUG_OPTIMIZATIONS=y
CONFIG_DEBUG_THREAD_INFO=y
CONFIG_THREAD_NAME=y

CONFIG_LOG_MODE_IMMEDIATE=y
CONFIG_LOG=y
CONFIG_LOG_DEFAULT_LEVEL=4

xiao_nrf54l15_nrf54l15_cpuapp.overlay content

&uart21 {
    status = "disabled";
};

An interesting clue is that I can “fix” the problem by simply disabling uart21 in the .overlay file. I discovered this because my matter app suddenly started working after I had aligned the resulting xiao zephyr.dts to match the zephyr.dts produced by the nrf54l15dk, which I assume is a solid baseline (it actually ran on my xiao). I then backed out changes one by one until only the disabling of uart21 solved the issue.

Debugging with gdb seems to show memory corruption. Since adding and subtracting debug config and disabling/enabling uart21 change the issue, I’m guessing memory is moved into a convenient configuration. I’m leery to rely on that as a “fix”.

I’m using the plain (non-Sense) version of the xiao, so that could be part of the issue. I noticed the board files provided by Seeed are set up for the Sense version. I disabled/deleted the non-applicable HW in my .overlay, but that didn’t make a difference.

I’m not making good headway getting to the root with the debugger. I might try to binary search my way through startup or step my way through it with the debugger, maybe starting with UART driver initialization.

I have a raytac board arriving tomorrow. It will be interesting to see if I can reproduce the problem there.

As I type I am wondering if enabling extra debug functionality somehow makes use of uart21 if it is enabled. Even then it shouldn’t cause a crash.


Warning. There might be some risk trying this. Loading firmware seems a bit flaky in general. I often run into problems where openocd won’t flash firmware. J-link seems to be more consistent. I wonder if memory corruption might be the root of some of the flashing problems reported in the forum.

Hi there,

So I ran this past my rep’s engineer and confirmed it with the AI.
The general consensus is that logging is notorious for faults (check the forum as well).
You’re likely seeing a debug+logging-induced timing/stack issue, not flaky HW. Several things in that prj.conf can push Zephyr over the edge on nRF54L15, especially CONFIG_LOG_MODE_IMMEDIATE=y, a high log level, and small/default stacks. Immediate logging runs in the caller’s context (often an ISR), which is notorious for tripping faults on Nordic targets when combined with heavy debug builds. (Zephyr Project Documentation, Nordic Semiconductor Docs)

What I’d try (in this order)

  1. Go deferred logging
  • CONFIG_LOG=y
  • CONFIG_LOG_MODE_IMMEDIATE=n
  • CONFIG_LOG_PROCESS_THREAD=y
  • CONFIG_LOG_PROCESS_THREAD_SLEEP_MS=10
  • Drop the global level to INFO while testing: CONFIG_LOG_DEFAULT_LEVEL=3
    Rationale: avoids ISR-context logging and reduces pressure. (Zephyr Project Documentation)
  2. Bump stacks (debug builds use more stack)
  • CONFIG_MAIN_STACK_SIZE=2048
  • CONFIG_ISR_STACK_SIZE=2048
  • CONFIG_IDLE_STACK_SIZE=512
  • Keep: CONFIG_STACK_SENTINEL=y (or your CONFIG_MPU_STACK_GUARD=y, but note the alignment/size constraints and that the guard can fault sooner with tight stacks). (Zephyr Project Documentation)
  3. Trim “heavy” debug toggles
  • Temporarily disable CONFIG_DEBUG_THREAD_INFO, CONFIG_DEBUG_OPTIMIZATIONS, and maybe even CONFIG_ASSERT just to confirm sensitivity, then re-enable them one by one.
  4. Console backend sanity
  • If you’re using RTT: CONFIG_USE_SEGGER_RTT=y, CONFIG_LOG_BACKEND_RTT=y.
  • If UART: keep RTT off to avoid two backends fighting.
  5. Re-test on NCS 3.1.1
  • 3.1.1 does fix a handful of 3.1.0 paper cuts; several folks report better stability on 3.1.1 with the Xiao/54L15. (Your forum thread lines up with that trajectory.) (Seeed Studio Forum)
  6. If still flaky, A/B the guard
  • Try without CONFIG_MPU_STACK_GUARD to see if it’s just catching undersized stacks (then restore it after tuning). Guarded stacks on Cortex-M require stricter alignment/size and can surface marginal configs as “bus faults.” (Zephyr Project Documentation)
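
For the stack-size step, one option is to measure usage instead of guessing. The fragment below is a minimal sketch, assuming Zephyr's thread analyzer is available in your NCS version; the 5-second interval is an arbitrary choice for illustration, not something from this thread. It periodically prints per-thread stack usage so you can see how close blinky's threads get to their limits with the debug config on.

```
# Sketch: periodic stack usage report (assumes the thread analyzer subsystem is present)
CONFIG_THREAD_ANALYZER=y
CONFIG_THREAD_ANALYZER_USE_PRINTK=y
CONFIG_THREAD_ANALYZER_AUTO=y
# Report interval in seconds (arbitrary example value)
CONFIG_THREAD_ANALYZER_AUTO_INTERVAL=5
# Keeps the report readable by showing thread names
CONFIG_THREAD_NAME=y
```

If the report shows plenty of headroom on every stack, undersized stacks become a less likely explanation and the uart21 clue deserves more weight.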

Why I think it’s this

  • The user can reproduce only with DEBUG enabled, and even Blinky trips it. That screams context/timing/stack rather than GPIO.
  • Immediate logging in ISR context has a track record of crashes on Nordic targets when debug is on and stacks are tight.

Try something along these lines…

# Core
CONFIG_GPIO=y

# Logging (deferred)
CONFIG_LOG=y
CONFIG_LOG_DEFAULT_LEVEL=3
CONFIG_LOG_MODE_IMMEDIATE=n
CONFIG_LOG_PROCESS_THREAD=y
CONFIG_LOG_PROCESS_THREAD_SLEEP_MS=10
CONFIG_LOG_BACKEND_RTT=y
CONFIG_USE_SEGGER_RTT=y

# Debug (start light)
# CONFIG_ASSERT=y
# CONFIG_DEBUG=y
# CONFIG_DEBUG_INFO=y
# CONFIG_DEBUG_OPTIMIZATIONS=y
# CONFIG_DEBUG_THREAD_INFO=y
# CONFIG_THREAD_NAME=y

# Stacks
CONFIG_MAIN_STACK_SIZE=2048
CONFIG_ISR_STACK_SIZE=2048
CONFIG_IDLE_STACK_SIZE=512
CONFIG_STACK_SENTINEL=y
# Or: CONFIG_MPU_STACK_GUARD=y  (swap in after stack sizes are proven)

If you want a smoking gun

Enable the fault dump and read CFSR/BFAR to confirm a BusFault from log/ISR pressure, then re-enable pieces until it returns. (See the standard Cortex-M33 fault documentation on Arm Developer.)
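
To make the CFSR reading concrete, here is a small host-side sketch in C that decodes the BusFault bits of a CFSR value copied from a fault dump or from gdb. The bit positions are the architectural Cortex-M ones; the function names (`cfsr_busfault_flags`, `cfsr_bfar_valid`, `cfsr_print`) are made up for this example and are not part of Zephyr or the nRF Connect SDK.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* BusFault Status Register (BFSR) bits as they appear inside the 32-bit CFSR. */
struct cfsr_flag {
    uint32_t mask;
    const char *name;
};

static const struct cfsr_flag bus_flags[] = {
    { 1u << 8,  "IBUSERR" },     /* bus error on instruction fetch */
    { 1u << 9,  "PRECISERR" },   /* precise data bus error */
    { 1u << 10, "IMPRECISERR" }, /* imprecise data bus error, e.g. a buffered write */
    { 1u << 11, "UNSTKERR" },    /* bus error while unstacking on exception return */
    { 1u << 12, "STKERR" },      /* bus error while stacking on exception entry */
    { 1u << 13, "LSPERR" },      /* bus error during lazy FP state preservation */
    { 1u << 15, "BFARVALID" },   /* BFAR holds the faulting address */
};

/* Collect the names of the BusFault bits set in cfsr; returns how many were stored. */
size_t cfsr_busfault_flags(uint32_t cfsr, const char *out[], size_t max)
{
    size_t n = 0;

    assert(out != NULL);
    for (size_t i = 0; i < sizeof(bus_flags) / sizeof(bus_flags[0]); i++) {
        if ((cfsr & bus_flags[i].mask) != 0 && n < max) {
            out[n++] = bus_flags[i].name;
        }
    }
    return n;
}

/* Nonzero when BFAR can be trusted for this fault. */
int cfsr_bfar_valid(uint32_t cfsr)
{
    return (int)((cfsr >> 15) & 1u);
}

/* Print a one-line summary of a CFSR/BFAR pair copied from a fault dump. */
void cfsr_print(uint32_t cfsr, uint32_t bfar)
{
    const char *names[8];
    size_t n = cfsr_busfault_flags(cfsr, names, 8);

    printf("CFSR=0x%08lx:", (unsigned long)cfsr);
    for (size_t i = 0; i < n; i++) {
        printf(" %s", names[i]);
    }
    if (cfsr_bfar_valid(cfsr)) {
        printf(" (BFAR=0x%08lx)", (unsigned long)bfar);
    }
    printf("\n");
}
```

Call `cfsr_print()` from a small `main()` with the two values from your dump. A PRECISERR with BFARVALID set means BFAR points at the address that faulted, which you can compare against the peripheral and RAM maps. An IMPRECISERR means the faulting access happened some instructions earlier, so the stacked PC is only a rough hint.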

One more practical check

Make sure the Seeed Xiao nRF54L15 board files you use match your NCS version; stale board defs against 3.1.x can cause oddities. Folks on the Seeed forum reported better luck moving from 3.0.x→3.1.1 with the Xiao. (Seeed Studio Forum)

Bottom line: flip to deferred logging, grow stacks, dial back debug flags, and try 3.1.1. That combo should stop the bus faults on the Xiao 54L15.

HTH
GL :slight_smile: PJ :v:

Thanks @PJ_Glasso for the excellent info!

What you describe seems to align with what I was seeing. There was a point where turning CONFIG_ASSERT on/off would make the problem appear/disappear.

This issue seems unfortunate. It is natural to turn on debug config when you have an issue like a bus fault. The bus faults from this issue might make it tricky to debug real bus faults that look similar, and could send some people on a wild goose chase like my own.

Hi there,

I agree :100: with you, and awesome legwork you are doing on this. :+1:
The bus fault is sometimes the hardest to ferret out. The debugging also IMO isn’t there yet. It works, but it seems slow and a little clunky to me. The nRF52840 was way smoother, even in VScode and PLIO.

I bet all cash money there will be another BIG release of the SDK to tweak and fix a few performance issues users are reporting on the Dev academy support forum.

I get the sense, even from the Dev team at Nordic, that this was pushed hard and Support is just getting into stride. I do like the focus they have on getting it out and revising to get it right. Seeedineers IMO should do the same.

Keep rolling :grin: :+1:

HTH
GL :slight_smile: PJ :v:

I say this as it feels like the Zephyr side is ahead of the SDK side or something :stuck_out_tongue_winking_eye:

I was able to reproduce what might be the same issue using a near default version of the template app in nrf connect. See the discussion here: XIAO nrf54l15 Matter App Recipe. (Didn’t publish that info here since it seemed like a nice recipe, and I didn’t want it to get lost in this discussion.)

I have a raytac board arriving today. I’m going to see if I can reproduce it there. After that I might put on my big boy boots and dive in with the debugger again.


Looks like @PJ_Glasso found an explanation for the problem. See info here: XIAO nrf54l15 Matter App Recipe

The issue appears to be with uart21 being enabled. Including the following in your .overlay file should prevent the issue.

&uart21 {
    status = "disabled";
};

I have been running a day or so with uart21 disabled and haven’t seen the issue.

Hi there,

Fuzzing awesome, thanks for reporting back that it is running. I posed the situation to them and got some suggestions on the pin relocation.

First: keep UART21, but move it off SWO (advanced)

  • If you truly need uart21, remap it to non-SWO pins and ensure the board pinctrl + node reflect that; and/or disable SWO in your debug setup. Example (pick free pins that don’t clash):
&pinctrl {
	uart21_alt_default: uart21_alt_default {
		group1 {
			psels = <NRF_PSEL(UART_TX, 2, 4)>,  /* example: P2.4 */
			        <NRF_PSEL(UART_RX, 2, 2)>;  /* example: P2.2 */
			bias-pull-up;
		};
	};
};

&uart21 {
	status = "okay";
	current-speed = <115200>;
	pinctrl-0 = <&uart21_alt_default>;
	pinctrl-names = "default";
};

…and make sure SWO/trace isn’t enabled on your probe.

With debug/logging on, use deferred logging and give stacks room; that prevents unrelated timing/ISR-logging crashes from muddying the waters:

CONFIG_LOG=y
CONFIG_LOG_MODE_IMMEDIATE=n
CONFIG_LOG_PROCESS_THREAD=y
CONFIG_LOG_DEFAULT_LEVEL=3
CONFIG_MAIN_STACK_SIZE=2048
CONFIG_ISR_STACK_SIZE=2048

(That’s the other failure mode you already suspected.) :+1:
On this board, uart21 uses P2.7 (SWO) by default. Disabling uart21 or switching your console to uart20 is the right move—and it’s what I’d standardize in the forum recipe.
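
For the "switch your console to uart20" option, a minimal overlay sketch might look like the following. It assumes the board's devicetree already defines a usable uart20 node with its pins assigned, which I have not verified against the Seeed board files, so treat the node name and baud rate as placeholders to check against your generated zephyr.dts.

```
/* Sketch only: assumes the board devicetree provides a working uart20 node. */
/ {
	chosen {
		zephyr,console = &uart20;
		zephyr,shell-uart = &uart20;
	};
};

&uart20 {
	status = "okay";
	current-speed = <115200>;
};

/* Same workaround as earlier in the thread. */
&uart21 {
	status = "disabled";
};
```

After building, it is worth opening the generated zephyr.dts to confirm that the chosen console really points at uart20 and that uart21 ends up disabled.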

This has lots of potential. I hope they get this and a few other issues sorted out quickly so they don’t lose the momentum (software lag). The Xiao line is big enough for a dedicated software Seeedineer IMO. The project developer seems to provide all the support, which is probably why it is coming out slowly. I’m patient and confident. :yum:

HTH
GL :slight_smile: PJ :v:

They are looped in on ALL the finds too! :+1: