
Robust Design Patterns - Part 4 - Stack Protection and Monitoring

Introduction

In an embedded program, each thread is allocated a fixed amount of stack space at compile time. If a thread overflows its stack (through deep recursion, large local arrays, or unexpected call chains), it silently corrupts adjacent memory, producing crashes that are hard to reproduce and diagnose.

Zephyr provides three complementary mechanisms to detect and prevent stack overflows:

| Mechanism | Detection point | Cost | Kconfig |
|---|---|---|---|
| Stack canaries (STACK_SENTINEL) | Every context switch | Small (~1 µs/switch) | CONFIG_STACK_SENTINEL |
| Hardware MPU guard (HW_STACK_PROTECTION) | Any write past the stack bottom | ~Zero runtime | CONFIG_HW_STACK_PROTECTION |
| Stack watermark checker (INIT_STACKS + THREAD_ANALYZER) | Periodic sampling | Background thread only | CONFIG_INIT_STACKS + CONFIG_THREAD_ANALYZER |

This codelab adds all three to the CarSystem application and shows how to make each one intervene with a concrete example.

What you’ll build

  • A Kconfig with three boolean knobs — APP_STACK_SENTINEL, APP_HW_STACK_PROTECTION, and APP_CHECKER — that enable each mechanism independently.
  • A deliberately overflowing helper function to trigger the canary and MPU guard at will.
  • A StackChecker background thread that logs the stack high-water mark of every running thread once per minute.

What you’ll learn

  • The difference between detection (canaries, watermark) and prevention (MPU guard) and when each matters.
  • Why CONFIG_INIT_STACKS is the prerequisite for any meaningful watermark measurement.
  • How to place a StackChecker object outside the APP_DATA partition so it does not inflate the MPU-protected region used by CarSystem.

What you’ll need

Re-enable the task watchdog before this codelab

Part 2 instructs you to disable CONFIG_APP_TASK_WDT before adding userspace. Part 3 re-enables it via the proxy. By the time you start this codelab your prj_user_mode.conf should already contain CONFIG_APP_TASK_WDT=y and the proxy thread should be running. If you skipped Part 3, disable the task watchdog again (CONFIG_APP_TASK_WDT=n) before proceeding — otherwise the proxy thread is missing and the thread pool count below will be wrong.

Update zpp_lib first

If you have not already done so in Part 2 or Part 3, check out version v1.0 of the library:

cd deps/zpp_lib && git checkout tags/v1.0
Failing to do so can cause assertion failures inside try_get_for() / try_put_for() when the queue is empty or full.


Concepts: Three Layers of Stack Safety

1 — Stack canaries (sentinel)

At thread creation Zephyr writes a magic 4-byte value (sentinel) at the very bottom of the stack. At every context switch the kernel reads it back and panics if it has been overwritten:

High address  ┌───────────────┐  ← stack top (initial SP)
              │   thread data │
              │       ↓       │  grows downward
              │               │
              │    (unused)   │
              │               │
              │    sentinel   │  ← 4-byte magic value
Low address   └───────────────┘  ← stack_bottom

This mechanism has the following characteristics:

  • Latency: the overflow is detected at the next context switch, not at the moment of the overflow. Memory between the sentinel and the overflow site may already be corrupted.

  • Portability: works on any architecture — no MPU required.
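
The sentinel idea can be sketched on a host machine as follows. This is a simplified simulation, not Zephyr's implementation: the names `FakeStack`, `kSentinel`, `grow_to` and `sentinel_intact` are hypothetical, the magic value is an assumption, and the "stack" is just a byte array whose lowest four bytes hold the sentinel.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical magic value; the constant Zephyr actually uses may differ.
constexpr std::uint32_t kSentinel = 0xF0F0F0F0U;

// Simulated thread stack: index 0 is the lowest address (stack bottom).
struct FakeStack {
  std::array<std::uint8_t, 64> bytes{};

  // Done once at "thread creation": place the sentinel at the very bottom.
  void init() { std::memcpy(bytes.data(), &kSentinel, sizeof(kSentinel)); }

  // Simulate the stack growing downward from the top until `used` bytes
  // are occupied. Nothing is checked here, just as in a real overflow.
  void grow_to(std::size_t used) {
    assert(used <= bytes.size());
    std::memset(bytes.data() + (bytes.size() - used), 0x55, used);
  }

  // Done at every "context switch": has the sentinel survived?
  bool sentinel_intact() const {
    std::uint32_t value = 0;
    std::memcpy(&value, bytes.data(), sizeof(value));
    return value == kSentinel;
  }
};
```

In this model, growing the stack to 60 bytes leaves the sentinel intact, while growing it to 61 bytes overwrites one sentinel byte. The damage is noticed only when `sentinel_intact()` is next called, which mirrors the detection latency described above.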

2 — ARMv8-M hardware stack limit (PSPLIM) — “always active” on nRF5340

Cortex-M33 (ARMv8-M) has a dedicated Process Stack Pointer Limit register (PSPLIM). Zephyr programs it per-thread unconditionally — no Kconfig knob controls it. When SP drops below the limit, the hardware raises a UsageFault (STKOF) during the next interrupt-entry stacking. This happens in silicon, before any software check runs:

stackOverflow() recurses → SP falls below PSPLIM
    │
    ├── SysTick (or any IRQ) fires
    │
    ├── Cortex-M33 hardware tries to PUSH the exception frame
    │   but SP < PSPLIM → STKOF bit set
    │
    └── UsageFault → "Stack overflow (context area not valid)"
        PC = 0x00000000 (the push itself failed — no valid return address)

This mechanism has the following characteristics:

  • Latency: fires during interrupt entry — before STACK_SENTINEL gets a chance to run its software check at context-switch time.

  • Requirement: ARMv8-M core (e.g. Cortex-M33 / nRF5340) — automatic, always enabled.

Note

You will see later that this can actually be disabled.

3 — Hardware MPU guard (HW_STACK_PROTECTION)

The ARM Cortex-M MPU is programmed to place a no-access guard region immediately below each thread’s stack. The first write that crosses the stack boundary raises a MemManage fault before any data is corrupted:

High address  ┌───────────────┐  ← stack top
              │   thread data │
              │       ↓       │
Low address   ├───────────────┤  ← stack bottom (SP limit)
              │  MPU no-access│  ← guard region (typically 32 B)
              │    region     │
              └───────────────┘

This mechanism has the following characteristics:

  • Latency: immediate — the fault fires on the overflowing instruction, before SP crosses the PSPLIM limit.

  • Requirement: ARCH_HAS_STACK_PROTECTION (nRF5340 / Cortex-M33: yes).

4 — Stack watermark checker

CONFIG_INIT_STACKS fills every stack with a known byte pattern (0xAA) at creation. As the stack grows, it overwrites those bytes. At any later point the highest watermark can be computed by scanning from the bottom upwards for the first non-0xAA byte.
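
The scan described above can be sketched in portable C++. This is an illustration of the technique rather than Zephyr's code: `stack_high_water_mark` and `kFillPattern` are hypothetical names, and the stack is modelled as a byte vector whose index 0 is the lowest address.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::uint8_t kFillPattern = 0xAA;  // fill byte named in the text

// Returns the high-water mark in bytes for a downward-growing stack.
// stack[0] is the lowest address (stack bottom).
std::size_t stack_high_water_mark(const std::vector<std::uint8_t>& stack) {
  assert(!stack.empty());
  std::size_t unused = 0;
  // Count untouched bytes from the bottom up; stop at the first byte that
  // no longer holds the fill pattern.
  while (unused < stack.size() && stack[unused] == kFillPattern) {
    ++unused;
  }
  return stack.size() - unused;
}
```

Because the scan stops at the first overwritten byte, a pattern byte that happens to sit higher up in the used region does not affect the result. Without the initial fill, the bottom of the stack would hold arbitrary values and the scan would stop almost immediately, reporting a near-full stack.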

CONFIG_THREAD_ANALYZER provides thread_analyzer_run(), a callback-based API that iterates every thread and reports stack_used / stack_size. The StackChecker background thread calls this once per minute and logs the result:

--- stack watermark report ---
  Engine               312 /  512 B used ( 60%)
  Display              248 /  512 B used ( 48%)
  Sensor               176 /  512 B used ( 34%)
  StackChecker          96 /  512 B used ( 18%)
------------------------------

This mechanism has the following characteristic:

  • Latency: periodic — shows the historical maximum, not the current depth. Useful for right-sizing stacks before production.

Step 1 — Kconfig

1.0 — Thread pool sizing

Before adding any new source files, verify that CONFIG_ZPP_THREAD_POOL_SIZE is large enough. Every zpp_lib::Thread object consumes one slot from this pool. Count all threads in the system:

| Thread | Condition |
|---|---|
| Main thread | always |
| 4 periodic task threads | CONFIG_PERIODIC_TASKS=y |
| WDT feeder thread | CONFIG_APP_WATCHDOG=y + CONFIG_WDT_FEEDER_ZPP_THREAD=y |
| Task WDT proxy thread | CONFIG_APP_TASK_WDT=y + CONFIG_USERSPACE=y |
| StackChecker thread | CONFIG_APP_CHECKER=y (this codelab) |

With all features enabled that is at least 8 threads. Add headroom and set:

# prj.conf
CONFIG_ZPP_THREAD_POOL_SIZE=10
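
The count can also be written down as a compile-time check. The sketch below is not part of zpp_lib: all constant names are hypothetical, and `kThreadPoolSize` stands in for the value of CONFIG_ZPP_THREAD_POOL_SIZE.

```cpp
#include <cstddef>

// Thread counts from the list above (names are hypothetical).
constexpr std::size_t kMainThreads = 1;
constexpr std::size_t kPeriodicTaskThreads = 4;
constexpr std::size_t kWdtFeederThreads = 1;
constexpr std::size_t kTaskWdtProxyThreads = 1;
constexpr std::size_t kStackCheckerThreads = 1;

constexpr std::size_t kRequiredThreads =
    kMainThreads + kPeriodicTaskThreads + kWdtFeederThreads +
    kTaskWdtProxyThreads + kStackCheckerThreads;

// Stand-in for CONFIG_ZPP_THREAD_POOL_SIZE.
constexpr std::size_t kThreadPoolSize = 10;

// Fails the build instead of asserting at run time if the pool is too small.
static_assert(kRequiredThreads <= kThreadPoolSize,
              "thread pool too small for the configured threads");
```

With every feature enabled the sum is 8, which leaves two spare slots in a pool of 10.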

Pool exhaustion crashes silently

If the pool is full when zpp_lib::Thread::start() is called, the library asserts and the system halts. The assertion message reports pool exhaustion but does not name the offending thread. Note that zpp_lib::Thread currently supports at most 10 threads; declaring more triggers an assertion.

1.1 — Kconfig symbols

Four new boolean options are added to the project Kconfig: three for the protection mechanisms and one for the deliberate overflow test used in Step 2. Each of the three protection options selects the underlying Zephyr symbols, so the caller only needs to set one flag per mechanism:

config APP_CHECKER
    bool "Enable periodic stack watermark checker"
    select INIT_STACKS
    select THREAD_ANALYZER
    select THREAD_STACK_INFO
    select THREAD_NAME
    default n
    help
      Enables a low-priority background thread that logs the stack high-water
      mark for every running thread once per minute.  CONFIG_INIT_STACKS fills
      each stack with a known pattern at creation time, which is what makes the
      watermark measurement possible.  CONFIG_THREAD_ANALYZER provides the
      iteration callback and CONFIG_THREAD_STACK_INFO exposes the stack
      boundaries through the thread struct.

config APP_STACK_SENTINEL
    bool "Enable software stack sentinel overflow detection"
    select STACK_SENTINEL
    default n
    help
      Writes a magic sentinel value at the bottom of every thread stack at
      creation time and verifies it at every context switch.  A corrupted
      sentinel triggers a fatal error.  Fully portable: no MPU or compiler
      support required.  Adds a small overhead to every context switch.

config APP_HW_STACK_PROTECTION
    bool "Enable hardware MPU stack overflow protection"
    select HW_STACK_PROTECTION
    depends on ARCH_HAS_STACK_PROTECTION
    default n
    help
      Configures the ARM MPU to place a no-access guard region below each
      thread stack.  Any write past the stack bottom raises a MemManage fault
      instead of silently corrupting memory.  Requires Cortex-M33 / nRF5340
      (ARCH_HAS_STACK_PROTECTION).  Near-zero runtime overhead once configured.

config APP_STACK_OVERFLOW_TEST
    bool "Deliberately overflow a task stack for testing"
    default n
    help
      When enabled, the periodic task whose taskIndex equals
      kOverflowTaskIndex (0 by default, arbitrarily chosen) triggers
      unbounded recursion once its loop counter reaches
      kOverflowTriggerIteration.  Change kOverflowTaskIndex to overflow a
      different task.  Use with APP_STACK_SENTINEL or
      APP_HW_STACK_PROTECTION to observe the detection mechanism.  Never
      enable in production.

Enable whichever mechanism you want in prj.conf:

# Must fit all threads, including StackChecker
CONFIG_ZPP_THREAD_POOL_SIZE=10
# Software canary
CONFIG_APP_STACK_SENTINEL=y
# Hardware MPU guard
CONFIG_APP_HW_STACK_PROTECTION=y
# Watermark background thread
CONFIG_APP_CHECKER=y

For userspace builds (prj_user_mode.conf), add the same APP_CHECKER line — the thread pool size is inherited from prj.conf automatically:

# prj_user_mode.conf (append)
CONFIG_APP_CHECKER=y

StackChecker always runs in supervisor mode

The StackChecker thread is started from main() before any user thread drops to unprivileged mode. It is created without the userMode=true flag, so it stays in supervisor mode regardless of CONFIG_USERSPACE. This means:

  • No domain partition grants are needed for the StackChecker thread.
  • It can call thread_analyzer_run() freely — this is a regular C function that requires no syscall wrapper.
  • Its stack lives in plain .bss (see Step 3.4) so it never inflates app_partition.

Mechanisms are independent

You can combine all three simultaneously. For production builds the MPU guard (APP_HW_STACK_PROTECTION) is the strongest: it prevents corruption. The watermark checker (APP_CHECKER) is complementary: it tells you how much headroom you still have.


Step 2 — Triggering a Stack Overflow (Canary Example)

To see each mechanism intervene, we need a function that deliberately blows the stack. The simplest approach is unbounded recursion:

// Defined in car_system.cpp
static constexpr uint8_t  kOverflowPadSize          = 64U;    // bytes per frame to accelerate overflow
static constexpr uint8_t  kOverflowTaskIndex        = 0U;    // which periodic task triggers the test
static constexpr uint32_t kOverflowTriggerIteration = 3000U; // loop iteration at which overflow fires

[[noreturn]] static void stackOverflow(uint32_t depth) {
  [[maybe_unused]] volatile uint8_t pad[kOverflowPadSize] = {};  // force stack growth; volatile prevents optimisation
  LOG_INF("depth=%u", depth);
  stackOverflow(depth + 1);   // tail-call prevented by volatile above
}

Call it inside one task’s loop with a Kconfig guard so it only compiles in when explicitly requested:

#if CONFIG_APP_STACK_OVERFLOW_TEST && !CONFIG_USERSPACE
  // kOverflowTaskIndex / kOverflowTriggerIteration are named constants.
  if (taskIndex == kOverflowTaskIndex) {
    static uint32_t overflow_iteration = 0;
    overflow_iteration++;
    if (overflow_iteration == kOverflowTriggerIteration) {
      LOG_WRN("Deliberately overflowing stack of task %u — expect a fatal error!", kOverflowTaskIndex);
      stackOverflow(0);
    }
  }
#endif  // CONFIG_APP_STACK_OVERFLOW_TEST && !CONFIG_USERSPACE

Supervisor mode only

The overflow test must not be enabled together with CONFIG_USERSPACE. When tasks run in user mode the MPU guard region fires before the sentinel is checked, producing a misleading Data Access Violation instead of the expected sentinel or MPU stack-overflow fault. The && !CONFIG_USERSPACE guard in the code prevents compilation in userspace builds.

Why static for overflow_iteration?

task_method() is called once per thread from a lambda; the stack frame is never re-entered. The counter must survive across loop iterations, so it is declared static. Using kOverflowTaskIndex == 0 picks the first task arbitrarily — change the constant to any index from 0 to 3 to overflow a different task’s stack.

The APP_STACK_OVERFLOW_TEST symbol was added to Kconfig in Step 1.1 above. Enable it in prj.conf alongside the detection mechanism you want to test:

CONFIG_APP_STACK_OVERFLOW_TEST=y
# Or use CONFIG_APP_HW_STACK_PROTECTION=y instead
CONFIG_APP_STACK_SENTINEL=y

2.1 — With APP_STACK_SENTINEL

CONFIG_APP_STACK_SENTINEL=y
CONFIG_APP_STACK_OVERFLOW_TEST=y

Actual output on nRF5340 (Cortex-M33 / ARMv8-M):

W: Deliberately overflowing stack of task 0 — expect a fatal error!
I: depth=0
I: depth=1
...
I: depth=N
E: ***** USAGE FAULT *****
E:   Stack overflow (context area not valid)
E: >>> ZEPHYR FATAL ERROR 2: Stack overflow on CPU 0
E: Current thread: 0x200... (Engine)

Why PSPLIM fires instead of the sentinel on Cortex-M33

You might expect STACK SENTINEL VIOLATED, but on ARMv8-M cores Zephyr programs the PSPLIM register per-thread unconditionally (see Concept 2 above). The sequence is:

  1. stackOverflow() recurses until SP drops below PSPLIM.
  2. The next SysTick (or any IRQ) fires and the CPU tries to push the exception frame onto the stack.
  3. The hardware detects SP < PSPLIM in silicon and raises a UsageFault (STKOF) — PC = 0x00000000 because the push itself failed.
  4. Zephyr’s fault handler prints Stack overflow (context area not valid) and halts.

The sentinel’s software check runs inside z_arm_context_switch(), which is reached only after a successful interrupt-entry stacking. Because step 3 aborts that stacking, the sentinel check never executes.

On ARMv7-M cores (Cortex-M3/M4, which have no PSPLIM), STACK_SENTINEL would be the first line of defence and you would see STACK SENTINEL VIOLATED.

Forcing the sentinel to fire on Cortex-M33: __set_PSPLIM(0)

The CMSIS intrinsic __set_PSPLIM(0U) writes 0 to the PSPLIM register, disabling the hardware limit for the current thread’s timeslice. With PSPLIM cleared the stack can grow unchecked until the sentinel value is overwritten, and the software check fires at the next context switch:

if (overflow_iteration == kOverflowTriggerIteration) {
    LOG_WRN("Deliberately overflowing stack ...");
#if CONFIG_APP_STACK_SENTINEL && !CONFIG_APP_HW_STACK_PROTECTION && CONFIG_STACK_CANARIES_ALL
    // Disable the ARMv8-M hardware stack limit so the software sentinel fires
    // instead of the PSPLIM UsageFault. Only meaningful when the sentinel is
    // active, the MPU guard is off, and compiler stack canaries are enabled.
    __set_PSPLIM(0U);
#endif
    stackOverflow(0);
}

The three required flags in prj.conf:

# Sentinel must be active so that something detects the overflow
CONFIG_APP_STACK_SENTINEL=y
# MPU guard must be off, otherwise it would fire before the sentinel
CONFIG_APP_HW_STACK_PROTECTION=n
# Compiler canaries harden every frame; the sentinel catches the escape
CONFIG_STACK_CANARIES_ALL=y

Expected output after adding this line:

W: Deliberately overflowing stack of task 0 — expect a fatal error!
I: depth=0
I: depth=1
...
I: depth=29
I: depth=3   ← corrupted log line: stack has already overwritten the log buffer
E: r0/a1:  0x...  r1/a2:  0x...  ...
E: >>> ZEPHYR FATAL ERROR 2: Stack overflow on CPU 0
E: Fault during interrupt handling

FATAL ERROR 2 is K_ERR_STACK_CHK_FAIL — that is the sentinel detection. There is no STACK SENTINEL VIOLATED banner on Cortex-M33 Zephyr builds; the fault handler goes straight to the register dump and fatal error code.

Two side effects visible in the output:

  • Corrupted log line — by depth 30 the stack has grown past its bottom and overwritten adjacent memory including the logging subsystem’s buffer, so the depth number is garbled. This is the “detection is not prevention” problem made concrete.
  • Fault during interrupt handling — the sentinel check runs inside the SysTick/context-switch ISR. If the stack corruption is severe enough to also corrupt the IRQ frame, a secondary fault fires within the handler.

Three important constraints:

  • Privileged mode only: __set_PSPLIM faults in unprivileged (user-mode) code; the && !CONFIG_USERSPACE guard already ensures this.
  • Scoped to one timeslice: Zephyr reprograms PSPLIM from the thread struct at every context switch, so other threads are completely unaffected.
  • Does not prevent corruption: memory between the former PSPLIM limit and the sentinel value can be overwritten before the sentinel check fires.

Detection is not prevention

Between the actual overflow and the fault that reports it, some memory below the stack bottom may already have been corrupted by the recursion. The fault guarantees that the overflow is caught, not that no damage was done. The corrupted depth=3 log line above is direct evidence: memory was already overwritten before the sentinel check ran.

2.2 — With APP_HW_STACK_PROTECTION

CONFIG_APP_HW_STACK_PROTECTION=y
CONFIG_APP_STACK_OVERFLOW_TEST=y

Expected output (MPU fires on the overflowing write instruction):

depth=0
depth=1
...
depth=N
E: ***** MPU FAULT *****
E: Data Access Violation
E: MMFAR Address: 0x200...  ← address just below the stack bottom
E: Current thread: 0x200... (Engine)

The fault fires on the exact instruction that crosses the stack boundary — no adjacent memory is corrupted.


Step 3 — Adding the StackChecker

The StackChecker is a zpp_lib background thread that wakes up every 60 seconds, calls thread_analyzer_run(), and logs the watermark for every thread.

3.1 — Create the source files

Create both files directly under car_system/src/:

car_system/src/stack_checker.hpp
car_system/src/stack_checker.cpp

The project CMakeLists.txt uses file(GLOB_RECURSE APP_SOURCES ... *.cpp), so stack_checker.cpp is picked up automatically — no manual CMakeLists.txt edit is needed.

3.2 — stack_checker.hpp

#pragma once

#include <zephyr/kernel.h>
#include <atomic>

#include "zpp_include/non_copyable.hpp"
#include "zpp_include/thread.hpp"
#include "zpp_include/zephyr_result.hpp"

namespace car_system {

class StackChecker : private zpp_lib::NonCopyable<StackChecker> {
 public:
  StackChecker();
  ~StackChecker() = default;

  [[nodiscard]] zpp_lib::ZephyrResult start();
  void stop();   // signal stop; returns immediately
  void join();   // block until the thread has exited

 private:
  void checker_loop();

  zpp_lib::Thread _thread{zpp_lib::PreemptableThreadPriority::PriorityVeryLow, "StackChecker"};
  std::atomic<bool> _running{false};
  // k_sem used as a stop signal: stop() gives it, checker_loop() takes it
  // with a 60-second timeout so it wakes for a report or immediately on stop.
  struct k_sem _stopSem;
};

}  // namespace car_system

3.3 — stack_checker.cpp

#include "stack_checker.hpp"

#include <zephyr/debug/thread_analyzer.h>
#include <zephyr/logging/log.h>

LOG_MODULE_DECLARE(car_system, CONFIG_APP_LOG_LEVEL);

static constexpr uint32_t kCheckIntervalSeconds = 60U;
static constexpr size_t   kSemMaxCount          = 1U;
static constexpr size_t   kSemInitCount         = 0U;
static constexpr uint32_t kPctScale             = 100U;

static void on_thread_info(struct thread_analyzer_info* info) {
  unsigned int pct = (info->stack_used * kPctScale) / info->stack_size;
  LOG_INF("  %-20s %4zu / %4zu B used (%3u%%)",
          info->name, info->stack_used, info->stack_size, pct);
}

namespace car_system {

StackChecker::StackChecker() {
  k_sem_init(&_stopSem, kSemInitCount, kSemMaxCount);
}

zpp_lib::ZephyrResult StackChecker::start() {
  _running.store(true);
  auto res = _thread.start([this]() { checker_loop(); });
  if (!res) {
    _running.store(false);
    LOG_ERR("StackChecker: cannot start thread: %d", static_cast<int>(res.error()));
    __ASSERT(false, "StackChecker: thread start failed");
  }
  return res;
}

void StackChecker::stop() {
  _running.store(false);
  k_sem_give(&_stopSem);  // wake immediately if sleeping
}

void StackChecker::join() {
  auto res = _thread.join();
  if (!res) {
    LOG_ERR("StackChecker: cannot join thread: %d", static_cast<int>(res.error()));
  }
}

void StackChecker::checker_loop() {
  LOG_INF("StackChecker: started (report every %u s)", kCheckIntervalSeconds);

  while (_running.load()) {
    // Sleep for one interval, or wake immediately when stop() gives the sem.
    k_sem_take(&_stopSem, K_SECONDS(kCheckIntervalSeconds));

    if (!_running.load()) {
      break;
    }

    LOG_INF("--- stack watermark report ---");
    thread_analyzer_run(on_thread_info, 0);
    LOG_INF("------------------------------");
  }

  LOG_INF("StackChecker: exiting");
}

}  // namespace car_system

Key design choices:

| Choice | Reason |
|---|---|
| PriorityVeryLow | The checker must never preempt real-time tasks |
| k_sem with 60 s timeout | One API call covers both the wait and the early-exit signal |
| std::atomic<bool> _running | Safely shared between the caller (stop) and the thread (loop condition) |
| thread_analyzer_run(on_thread_info, 0) | Iterates all threads; the callback is called once per thread |
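
The "one call covers both the wait and the early exit" idea is not specific to Zephyr. As a rough host-side analogue, the same behaviour can be sketched with the C++ standard library. `StopSignal` and its member names are hypothetical, and unlike the semaphore-plus-flag pair above, this version folds the stop flag into the signal itself.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical host-side analogue of the k_sem stop signal.
class StopSignal {
 public:
  // Called by the controlling thread: request a stop and wake the waiter.
  void request_stop() {
    {
      std::lock_guard<std::mutex> lock(_mutex);
      _stopRequested = true;
    }
    _cv.notify_one();
  }

  // Called by the worker: sleep for `interval` or until a stop is requested.
  // Returns true if a stop was requested, false if the interval elapsed.
  bool wait_for_stop(std::chrono::milliseconds interval) {
    std::unique_lock<std::mutex> lock(_mutex);
    return _cv.wait_for(lock, interval, [this] { return _stopRequested; });
  }

 private:
  std::mutex _mutex;
  std::condition_variable _cv;
  bool _stopRequested = false;
};
```

A worker loop would call `wait_for_stop()` once per iteration, produce its report when the call returns false, and exit when it returns true.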

3.4 — Integrate in main.cpp

Do NOT add StackChecker as a member of CarSystem

CarSystem is declared APP_DATA static car_system::CarSystem carSystem — it lives in the MPU-protected app_partition. Adding a StackChecker member (which contains a zpp_lib::Thread with a Mutex, an Event object and a string) inflates sizeof(CarSystem) past the MPU-aligned partition boundary, causing an immediate MPU FAULT (Data Access Violation) on the first thread access.

The correct placement is plain .bss in main.cpp — no APP_DATA tag.

// main.cpp

#include "car_system.hpp"
#if CONFIG_APP_CHECKER
#include "stack_checker.hpp"
#endif

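#if CONFIG_USERSPACE
// Userspace builds: CarSystem lives in the MPU-protected app_partition.
APP_DATA static car_system::CarSystem carSystem;
#else
static car_system::CarSystem carSystem;
#endif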

#if CONFIG_APP_CHECKER
// StackChecker lives in plain .bss (not APP_DATA) — it runs in supervisor mode
// and must not inflate the MPU-protected app_partition that CarSystem occupies.
static car_system::StackChecker stackChecker;
#endif

int main() {
  // ... (userspace init, watchdog init, etc.) ...

#if CONFIG_APP_CHECKER
  {
    auto checkerRes = stackChecker.start();
    if (!checkerRes) {
      LOG_ERR("Cannot start StackChecker: %d", static_cast<int>(checkerRes.error()));
    }
  }
#endif

  auto res = carSystem.start();  // blocks until shutdown

#if CONFIG_APP_CHECKER
  stackChecker.stop();
  stackChecker.join();
#endif

  if (!res) {
    LOG_ERR("Could not start the car system: %d", static_cast<int>(res.error()));
    k_oops();
  }
  return 0;
}

The lifecycle is:

main()
  │
  ├─ stackChecker.start()   → StackChecker thread spawned (supervisor, PriorityVeryLow)
  │
  ├─ carSystem.start()      → blocks; all CarSystem threads run
  │       │
  │       │  (every 60 s)
  │       ├─ StackChecker wakes, logs watermarks, goes back to sleep
  │       │
  │  (carSystem.start() returns, e.g. on shutdown signal)
  │
  ├─ stackChecker.stop()    → sets _running=false, gives semaphore
  ├─ stackChecker.join()    → waits for thread to exit
  └─ return 0

3.5 — Build and verify

west build -b nrf5340dk/nrf5340/cpuapp car_system --pristine

With CONFIG_APP_CHECKER=y, after 60 seconds you should see:

I: --- stack watermark report ---
I:   Rain                  300 / 1024 B used ( 29%)
I:   Tire                  300 / 1024 B used ( 29%)
I:   Display               300 / 1024 B used ( 29%)
I:   Engine                348 / 1024 B used ( 33%)
I:   BackgroundWQ          244 / 1024 B used ( 23%)
I:   DS                    404 / 1024 B used ( 39%)
I:   SporadicGT            380 / 1024 B used ( 37%)
I:   StackChecker          516 / 1024 B used ( 50%)
I:   wdt_feeder            244 / 1024 B used ( 23%)
I:   sysworkq              148 / 1024 B used ( 14%)
I:   idle                   92 /  320 B used ( 28%)
I:   main                 1796 / 4096 B used ( 43%)
 ISR0                : STACK: unused 1827 usage 221 / 2048 (10 %)

Adjusting stack sizes

If any thread exceeds ~80 %, increase its stack size in the thread declaration and rebuild. The watermark is the historical maximum since the last reset, so values measured after a busy period are the most representative.
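
The 80 % rule of thumb is easy to automate. The helpers below are a minimal sketch with hypothetical names (`stack_usage_pct`, `needs_larger_stack`, `kWarnThresholdPct`); they use the same integer arithmetic as the percentage column in the report.

```cpp
#include <cassert>
#include <cstddef>

constexpr unsigned kWarnThresholdPct = 80U;  // the "~80 %" guideline above

// Percentage of the stack in use, rounded down.
// Precondition: size > 0 and used <= size.
unsigned stack_usage_pct(std::size_t used, std::size_t size) {
  assert(size > 0 && used <= size);
  return static_cast<unsigned>((used * 100U) / size);
}

// True if the thread's stack should be enlarged before shipping.
bool needs_larger_stack(std::size_t used, std::size_t size) {
  return stack_usage_pct(used, size) > kWarnThresholdPct;
}
```

Applied to the report above, StackChecker (516 of 1024 bytes) comes out at 50 % and main (1796 of 4096 bytes) at 43 %, so neither needs a larger stack.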


Step 4 — Build Matrix (combining all three)

The three mechanisms can be exercised independently with a build-matrix script. A minimal set of scenarios:

| Scenario | Extra conf | What to observe |
|---|---|---|
| Baseline (no protection) | prj.conf only | Stack overflow corrupts silently |
| Canary only | CONFIG_APP_STACK_SENTINEL=y | Sentinel violation at next context switch (on nRF5340 the PSPLIM UsageFault fires first) |
| MPU guard only | CONFIG_APP_HW_STACK_PROTECTION=y | MemManage fault on the overflowing instruction |
| Watermark checker | CONFIG_APP_CHECKER=y | 60 s periodic log of all thread stack usage |
| All three | CONFIG_APP_STACK_SENTINEL=y + CONFIG_APP_HW_STACK_PROTECTION=y + CONFIG_APP_CHECKER=y | MPU fault fires first; watermark shows pre-fault usage |

Summary

| Mechanism | Kconfig knob | Detection moment | Prevents corruption? | Overhead |
|---|---|---|---|---|
| PSPLIM hardware limit | (always on, ARMv8-M) | Interrupt-entry stacking after SP < limit | No | Zero (in silicon) |
| Stack canary (STACK_SENTINEL) | APP_STACK_SENTINEL | Next context switch (after interrupt stacking) | No | ~1 µs/switch |
| MPU guard (HW_STACK_PROTECTION) | APP_HW_STACK_PROTECTION | Overflowing write instruction (before PSPLIM) | Yes | ~Zero |
| Watermark checker (THREAD_ANALYZER) | APP_CHECKER | Periodic (60 s) | No | Background thread |

Priority order on nRF5340

When multiple mechanisms are active simultaneously, the one that fires earliest in the overflow timeline wins:

  1. MPU guard — fires on the first write past the stack bottom (deepest recursion frame, before SP reaches PSPLIM).
  2. PSPLIM — fires during the next interrupt-entry stacking once SP has dropped below the limit.
  3. Sentinel — would fire at the next context switch, but on Cortex-M33 PSPLIM always pre-empts it.

In practice on nRF5340: with APP_HW_STACK_PROTECTION=y you see an MPU fault; with only APP_STACK_SENTINEL=y you see the PSPLIM UsageFault.

Use the MPU guard in production for real protection. Use the canary as a portable fallback on platforms without an MPU. Use the watermark checker during development to right-size stacks before shipping.

Questions

  1. Why must CONFIG_INIT_STACKS be enabled for the watermark measurement to work? What would thread_analyzer_run() report without it?
  2. The canary is checked at every context switch. Name a scenario where a stack overflow could corrupt data without the canary ever detecting it.
  3. Why is StackChecker placed in plain .bss in main.cpp rather than as a member of CarSystem, even though it logically belongs to the system?
  4. What is the worst-case delay between a stack overflow and the MPU guard firing? Between a stack overflow and the canary firing?

Going beyond / References