
Robust Design Patterns - Part 2 - Userspace Isolation

Introduction

In this codelab, we address the problem of unresponsive applications. An application is unresponsive when the program running on an MCU becomes unavailable for one of several reasons (e.g. gobbler task, deadlock, …). So far, every thread in our application runs in supervisor mode, the CPU’s most privileged level. Any thread can read or write any memory address, access any peripheral, and call any kernel function. That is convenient for development, but it means a single bug in one thread can corrupt data used by another, silently overwrite kernel structures, or poke hardware registers it has no business touching.

Zephyr RTOS offers userspace — an optional feature that leverages the CPU’s Memory Protection Unit (MPU) to restrict what each thread can access at the hardware level. A userspace thread can only touch memory regions it has been explicitly granted access to; any other access triggers an immediate CPU fault instead of silent corruption.

What you’ll build

In this codelab, you will create a ProdConsumer subsystem (a producer thread and a consumer thread exchanging SensorData via shared memory) and progressively increase the use of userspace protection:

  1. Step 1 — Single shared domain (app_partition): both threads run in user mode and share the same global memory partition.
  2. Step 2 — Dedicated per-thread domains: the producer and consumer each get their own private memory partition, with only the exchanged data placed in a shared partition accessible to both.

What you’ll learn

  • What supervisor mode vs. user mode means on an ARM Cortex-M33 with MPU.
  • The core Zephyr RTOS abstractions: memory partitions, memory domains, and kernel object permissions.
  • How to move an existing subsystem into userspace incrementally — first with a single partition, then with fine-grained per-thread isolation.
  • The practical constraints: what breaks when you flip the switch, and how to fix it.
  • How to verify that isolation actually works (a rogue write faults instead of corrupting).

What you’ll need

Disable task watchdog subsystem

Make sure you have disabled the task watchdog subsystem prior to starting this codelab. Failing to do so will result in crashes.

Concepts: why Userspace?

The problem with supervisor mode

In a typical bare-metal or RTOS application every thread runs at the CPU’s highest privilege level. This means:

Capability                     | Risk
-------------------------------|----------------------------------------------------------
Read/write any RAM address     | One thread’s bug silently corrupts another’s data
Access any peripheral register | Accidental write to a timer or UART can brick the system
Call any kernel API            | A buffer overflow can overwrite scheduler structures

In safety-critical projects this is unacceptable: you need fault containment — the guarantee that one component’s failure cannot propagate to another.

How the MPU enforces isolation

ARM Cortex-M33 (the core in the nRF5340) includes a Memory Protection Unit (MPU) — a hardware block that sits between the CPU and the bus. Before every memory access, the MPU checks:

  1. Is the address within a region the current thread is allowed to access?
  2. Does the access type (read / write / execute) match the region’s permissions?

If the check fails, the CPU raises a MemManage fault — an immediate, synchronous exception. The offending instruction never completes.

┌──────────────┐         ┌─────────┐         ┌──────────────┐
│   CPU core   │──addr──▶│   MPU   │──ok──▶  │  Bus / SRAM  │
│ (thread A)   │         │ regions │         │              │
└──────────────┘         │ check   │         └──────────────┘
                         │         │
                         │ FAULT ◀─┘  (if no matching region)

The key insight: the MPU is reconfigured on every context switch. When the scheduler switches from Thread A to Thread B, it loads Thread B’s region table into the MPU. Thread B literally cannot see Thread A’s private memory — it is as if it does not exist.
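The two checks and the per-thread region table can be sketched as a small host-side model. This is only an illustration under stated assumptions: `Region`, `Access`, and `access_ok()` are hypothetical names, not Zephyr or CMSIS APIs, and the addresses are arbitrary.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical host-side model of an MPU region table (not a Zephyr API).
enum Access : uint8_t { kRead = 1, kWrite = 2, kExecute = 4 };

struct Region {
  uint32_t base;    // first address covered by the region
  uint32_t size;    // region length in bytes
  uint8_t allowed;  // bitwise OR of Access values
};

// Returns true if some region covers addr and permits the requested access.
// A false result stands for the MemManage fault described above.
inline bool access_ok(const std::vector<Region>& table, uint32_t addr,
                      Access wanted) {
  for (const Region& r : table) {
    const bool covered = addr >= r.base && (addr - r.base) < r.size;
    if (covered && (r.allowed & wanted) != 0) {
      return true;
    }
  }
  return false;
}

// A context switch is modelled by passing a different table to access_ok().
inline void demo_mpu_check() {
  const std::vector<Region> thread_a = {{0x20000000u, 0x400u, kRead | kWrite}};
  const std::vector<Region> thread_b = {{0x20000400u, 0x400u, kRead}};

  assert(access_ok(thread_a, 0x20000010u, kWrite));   // A writes its own RAM
  assert(!access_ok(thread_b, 0x20000010u, kRead));   // B cannot see A's RAM
  assert(!access_ok(thread_b, 0x20000410u, kWrite));  // region is read-only
}
```

The real hardware performs this lookup in parallel for every access; the loop here only mirrors the decision, not the timing.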

Zephyr’s memory model

Zephyr RTOS introduces three abstractions that map onto the MPU:

Abstraction | What it is | Analogy
------------|------------|--------
Memory Partition (k_mem_partition) | A named, contiguous memory region with fixed permissions | A room in a building
Memory Domain (k_mem_domain) | A set of partitions that a thread (or group of threads) can access | A key card that opens specific rooms
Kernel Object Permissions | Per-object ACL granting a thread the right to use a specific kernel object (mutex, semaphore, queue, …) | A signed permission slip for a specific resource

A thread running in user mode can only access:

  • The partitions in its memory domain
  • The kernel objects it has been explicitly granted access to

Everything else triggers a fault.

Supervisor mode is always available

Code running in ISR context always executes in supervisor mode. Threads only run in user mode if they were created with the K_USER option (or the equivalent zpp_lib flag); all other threads remain in supervisor mode regardless of their priority. Both cooperative and preemptive threads can be user mode threads — the privilege level is determined by the creation flag, not the priority class (ref).

Step 0 — ProdConsumer in Supervisor Mode

Before introducing userspace, you first need a working ProdConsumer subsystem running entirely in supervisor mode. Two threads — a producer and a consumer — exchange SensorData through shared memory using a mutex for exclusive access and a semaphore for signalling. We will do so by adding this subsystem to CarSystem.

┌─────────────────────────────────────────────────────────────┐
│                   Supervisor Mode (all RAM visible)         │
│                                                             │
│   Producer thread              Consumer thread              │
│   ┌────────────────┐          ┌────────────────┐            │
│   │ fill SensorData│──mutex──▶│ read SensorData│            │
│   │ release sema   │──sema───▶│ acquire sema   │            │
│   └────────────────┘          └────────────────┘            │
│                                                             │
│                   _sharedData (class member)                │
└─────────────────────────────────────────────────────────────┘

0.1 — Add the Kconfig option

Create a PROD_CONSUMER option in the project’s Kconfig so the subsystem can be compiled in or out:

config PROD_CONSUMER
  bool "Enable producer-consumer userspace demo"
  default n
  help
    Enable a producer-consumer subsystem where a producer thread writes
    sensor data to a shared variable (protected by a mutex) and signals
    a consumer thread via a semaphore. Works in both supervisor and
    userspace modes.

Enable it in prj.conf:

CONFIG_PROD_CONSUMER=y

0.2 — Define the header (prod_consumer.hpp)

The ProdConsumer class owns two threads, a barrier for synchronized startup, a semaphore for producer → consumer signalling, and a mutex for protecting the shared data:

#pragma once

#if CONFIG_PROD_CONSUMER

#include <atomic>
#include <chrono>
#include <cstdint>

#include "zpp_include/barrier.hpp"
#include "zpp_include/mutex.hpp"
#include "zpp_include/semaphore.hpp"
#include "zpp_include/non_copyable.hpp"
#include "zpp_include/thread.hpp"
#include "zpp_include/zephyr_result.hpp"

namespace car_system {

using std::literals::chrono_literals::operator""ms;

// Data structure shared between producer and consumer
struct SensorData {
  uint32_t sequence_number;
  std::chrono::milliseconds timestamp;
  uint32_t sensor_value;
};

class ProdConsumer : private zpp_lib::NonCopyable<ProdConsumer> {
 public:
  ProdConsumer();
  ~ProdConsumer() = default;

  [[nodiscard]] zpp_lib::ZephyrResult start();
  [[nodiscard]] zpp_lib::ZephyrResult join();
  void stop();

 private:
  void producer_task();
  void consumer_task();

  static constexpr uint8_t NBR_OF_TASKS = 2;
  static constexpr uint8_t PRODUCER_INDEX = 0;
  static constexpr uint8_t CONSUMER_INDEX = 1;

  zpp_lib::Thread _threads[NBR_OF_TASKS];

  // Barrier to synchronize both threads at startup
  zpp_lib::Barrier _barrier{NBR_OF_TASKS};

  // Semaphore to signal consumer that data is ready (initial 0, max 2).
  // max=2 reserves one extra count so stop() can always unblock the consumer
  // even if the producer has already posted a pending item.
  zpp_lib::Semaphore _syncSemaphore{0, 2};

  // Mutex to protect access to shared data
  zpp_lib::Mutex _dataMutex;

  // Shared data between producer and consumer
  SensorData _sharedData{};

  // Stop flag, set by stop() and polled by both tasks
  std::atomic<bool> _stopFlag{false};

  // Sequence counter for producer data
  std::atomic<uint32_t> _sequenceCounter{0};

  // Task periods
  static constexpr std::chrono::milliseconds kProducerPeriod = 500ms;
  static constexpr std::chrono::milliseconds kConsumerPeriod = 500ms;

  // Simulated computation times
  static constexpr std::chrono::milliseconds kProducerComputation = 50ms;
  static constexpr std::chrono::milliseconds kConsumerComputation = 100ms;

  // Scaling factor applied to sequence number to produce sensor value
  static constexpr uint32_t kSensorValueScale = 42;
};

}  // namespace car_system

#endif  // CONFIG_PROD_CONSUMER

Why max = 2 on the semaphore?

The stop() method releases the semaphore to unblock the consumer from its acquire(). If the producer has already posted a pending item (count = 1), stop() would fail with max = 1. Setting max = 2 guarantees that stop() can always wake the consumer.
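The argument can be replayed with a tiny counter model. This is a sketch under one assumption: `release()` reports failure when the count is already at its limit. The class `BoundedCount` is hypothetical, and the real zpp_lib::Semaphore may signal this case differently.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of a counting semaphore with an upper limit.
// Assumption: release() fails when the count has reached the limit.
class BoundedCount {
 public:
  BoundedCount(uint32_t initial, uint32_t limit)
      : _count(initial), _limit(limit) {}

  bool release() {
    if (_count == _limit) {
      return false;  // no room left: this wake-up would be lost
    }
    ++_count;
    return true;
  }

  bool try_acquire() {
    if (_count == 0) {
      return false;
    }
    --_count;
    return true;
  }

 private:
  uint32_t _count;
  uint32_t _limit;
};

// Replays "producer posts, then stop() posts" before the consumer has run.
// Returns true only if both posts were accepted.
inline bool stop_post_succeeds(uint32_t limit) {
  BoundedCount sem(0, limit);
  const bool produced = sem.release();  // producer: data ready
  const bool stopped = sem.release();   // stop(): wake the consumer
  return produced && stopped;
}
```

With a limit of 1 the second post is rejected, while a limit of 2 accepts both, which is the case the header comment describes.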

0.3 — Implement the class (prod_consumer.cpp)

Constructor — set thread priorities

#if CONFIG_PROD_CONSUMER

#include "prod_consumer.hpp"

#include <zephyr/logging/log.h>
#include "zpp_include/this_thread.hpp"
#include "zpp_include/time.hpp"

LOG_MODULE_DECLARE(car_system, CONFIG_APP_LOG_LEVEL);

namespace car_system {

ProdConsumer::ProdConsumer()
    : _threads{
          zpp_lib::Thread(
              zpp_lib::PreemptableThreadPriority::PriorityLow, "Producer"),
          zpp_lib::Thread(
              zpp_lib::PreemptableThreadPriority::PriorityVeryLow, "Consumer")}
{
}

The producer runs at a higher priority than the consumer so that fresh data is always produced before the consumer wakes up.

start() — launch both threads

zpp_lib::ZephyrResult ProdConsumer::start() {
  zpp_lib::ZephyrResult res;

  res = _threads[PRODUCER_INDEX].start([this]() { producer_task(); });
  if (!res) {
    LOG_ERR("ProdConsumer: Cannot start Producer thread: %d",
            static_cast<int>(res.error()));
    return res;
  }

  res = _threads[CONSUMER_INDEX].start([this]() { consumer_task(); });
  if (!res) {
    LOG_ERR("ProdConsumer: Cannot start Consumer thread: %d",
            static_cast<int>(res.error()));
    return res;
  }

  return res;
}

join() and stop()

zpp_lib::ZephyrResult ProdConsumer::join() {
  zpp_lib::ZephyrResult res;

  for (uint8_t i = 0; i < NBR_OF_TASKS; i++) {
    res = _threads[i].join();
    if (!res) {
      LOG_ERR("ProdConsumer: Cannot join thread %d: %d", i,
              static_cast<int>(res.error()));
      return res;
    }
  }

  return res;
}

void ProdConsumer::stop() {
  _stopFlag.store(true);

  // Release semaphore to unblock the consumer from its indefinite acquire()
  auto res = _syncSemaphore.release();
  if (!res) {
    LOG_ERR("ProdConsumer stop: semaphore release failed");
  }
}

producer_task() — periodic data production

void ProdConsumer::producer_task() {
  std::chrono::milliseconds nextPeriod =
      std::chrono::duration_cast<std::chrono::milliseconds>(_barrier.wait());

  LOG_INF("ProdConsumer Producer starting at %lld ms", nextPeriod.count());

  while (!_stopFlag) {
    // Simulate work
    zpp_lib::ThisThread::busy_wait(kProducerComputation);

    // Prepare data
    SensorData data;
    data.sequence_number = _sequenceCounter.fetch_add(1);
    data.timestamp = std::chrono::duration_cast<std::chrono::milliseconds>(
        zpp_lib::Time::get_uptime());
    data.sensor_value = data.sequence_number * kSensorValueScale;

    LOG_INF("ProdConsumer Producer: writing data (seq=%u, value=%u)",
            data.sequence_number, data.sensor_value);

    // Write data under mutex protection
    auto lockRes = _dataMutex.lock();
    if (!lockRes) {
      LOG_ERR("ProdConsumer Producer: mutex lock failed");
    } else {
      _sharedData = data;
      auto unlockRes = _dataMutex.unlock();
      if (!unlockRes) {
        LOG_ERR("ProdConsumer Producer: mutex unlock failed");
      }
    }

    // Signal the consumer that data is ready
    auto semRes = _syncSemaphore.release();
    if (!semRes) {
      LOG_ERR("ProdConsumer Producer: semaphore release failed");
    }

    nextPeriod += kProducerPeriod;
    zpp_lib::ThisThread::sleep_until(nextPeriod);
  }
}

consumer_task() — wait, read, process

void ProdConsumer::consumer_task() {
  std::chrono::milliseconds nextPeriod =
      std::chrono::duration_cast<std::chrono::milliseconds>(_barrier.wait());

  LOG_INF("ProdConsumer Consumer starting at %lld ms", nextPeriod.count());

  while (true) {
    // Block until the producer signals that new data is ready
    auto semRes = _syncSemaphore.acquire();
    if (!semRes) {
      LOG_ERR("ProdConsumer Consumer: semaphore acquire failed");
      continue;
    }

    // stop() released the semaphore and set _stopFlag
    if (_stopFlag.load()) {
      break;
    }

    // Read shared data under mutex protection
    SensorData receivedData;
    auto lockRes = _dataMutex.lock();
    if (!lockRes) {
      LOG_ERR("ProdConsumer Consumer: mutex lock failed");
      continue;
    }

    receivedData = _sharedData;

    auto unlockRes = _dataMutex.unlock();
    if (!unlockRes) {
      LOG_ERR("ProdConsumer Consumer: mutex unlock failed");
    }

    LOG_INF(
        "ProdConsumer Consumer: received data (seq=%u, timestamp=%lld ms, value=%u)",
        receivedData.sequence_number, receivedData.timestamp.count(),
        receivedData.sensor_value);

    // Simulate consuming the data
    zpp_lib::ThisThread::busy_wait(kConsumerComputation);

    nextPeriod += kConsumerPeriod;
    zpp_lib::ThisThread::sleep_until(nextPeriod);
  }
}

}  // namespace car_system

#endif  // CONFIG_PROD_CONSUMER

0.4 — Integrate into CarSystem

Add a ProdConsumer member to CarSystem:

// car_system.hpp
#if CONFIG_PROD_CONSUMER
#include "prod_consumer.hpp"
#endif

class CarSystem : private zpp_lib::NonCopyable<CarSystem> {
  // ...existing members...
#if CONFIG_PROD_CONSUMER
  ProdConsumer _prodConsumer;
#endif
};

In CarSystem::start(), start and join the subsystem:

#if CONFIG_PROD_CONSUMER
  res = _prodConsumer.start();
  if (!res) {
    LOG_ERR("Cannot start ProdConsumer: %d", static_cast<int>(res.error()));
    return res;
  }
#endif

  // ...existing thread joins...

#if CONFIG_PROD_CONSUMER
  res = _prodConsumer.join();
  if (!res) {
    LOG_ERR("Cannot join ProdConsumer: %d", static_cast<int>(res.error()));
    return res;
  }
#endif

In CarSystem::stop():

#if CONFIG_PROD_CONSUMER
  _prodConsumer.stop();
#endif

0.5 — Build and verify

west build -b nrf5340dk/nrf5340/cpuapp car_system

You should see log output like:

ProdConsumer Producer starting at 42 ms
ProdConsumer Consumer starting at 42 ms
ProdConsumer Producer: writing data (seq=0, value=0)
ProdConsumer Consumer: received data (seq=0, timestamp=92 ms, value=0)
ProdConsumer Producer: writing data (seq=1, value=42)
ProdConsumer Consumer: received data (seq=1, timestamp=592 ms, value=42)

At this point everything works, but there is no memory protection whatsoever. A bug in the consumer could overwrite the producer’s stack, corrupt kernel data structures or poke random peripheral registers — and the system would silently continue with a corrupted state.

Questions

  1. Why does the consumer use _syncSemaphore.acquire() (blocking) instead of polling _stopFlag in a loop?
  2. What would happen if you removed the mutex and let both threads access _sharedData without synchronization? Would the system crash immediately?
  3. The barrier is initialized with NBR_OF_TASKS = 2, meaning only the producer and consumer synchronize on it. Why don’t we include CarSystem’s periodic tasks in this barrier?

Step 1 — Userspace with a Shared Domain

With the supervisor-mode ProdConsumer working (Step 0), you can now flip the userspace switch. In this step, every application thread (including the producer and consumer) belongs to one common memory domain (app_domain) containing:

  • z_libc_partition — the C library’s internal globals
  • zpp_lib_partition — the zpp_lib library globals
  • app_partition — the application’s own globals

Both the producer and the consumer see the same set of memory. There is no isolation between them yet, but they are isolated from the kernel and from peripherals they have not been granted access to.

┌──────────────────────────────────────────┐
│              app_domain                  │
│  ┌───────────┐ ┌──────────┐ ┌─────────┐  │
│  │ z_libc    │ │ zpp_lib  │ │   app   │  │
│  │ partition │ │ partition│ │partition│  │
│  └───────────┘ └──────────┘ └─────────┘  │
│                                          │
│  Producer thread ◀──────▶ Consumer thread│
│          (both share everything)         │
└──────────────────────────────────────────┘

1.1 — Enable userspace in the build

Userspace is activated via an overlay configuration file that is passed at build time with -DEXTRA_CONF_FILE.

Create (or update) prj_user_mode.conf:

# enable user mode
CONFIG_USERSPACE=y
CONFIG_LOG_MODE_MINIMAL=y
# ProdConsumer adds 1 semaphore + 1 barrier (default pool size of 2 is not enough)
CONFIG_ZPP_SEMAPHORE_POOL_SIZE=4
# Enable INFO-level app logging so ProdConsumer output is visible
CONFIG_APP_LOG_LEVEL_INF=y

Pool sizes

Userspace objects (semaphores, mutexes, barriers, …) are allocated from static pools whose sizes must be configured at build time. If you get a boot-time assertion such as __ASSERT zpp semaphore pool exhausted, increase the pool size in the conf file.
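To see why the pool size has to cover every object the application creates, consider a minimal fixed-capacity pool. This is a hypothetical illustration, not the zpp_lib implementation: `StaticPool` and `FakeSemaphore` are made-up names, and the sketch returns nullptr where the library is described as asserting.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical fixed-capacity pool (illustration only, not zpp_lib code).
template <typename T, std::size_t N>
class StaticPool {
 public:
  // Hands out the next free slot, or nullptr once all N slots are taken.
  T* allocate() {
    if (_used == N) {
      return nullptr;
    }
    return &_slots[_used++];
  }

  std::size_t remaining() const { return N - _used; }

 private:
  std::array<T, N> _slots{};
  std::size_t _used = 0;
};

struct FakeSemaphore {
  unsigned count = 0;
};

// Two pre-existing users plus one new object exhaust a pool of two.
inline bool third_allocation_fits(std::size_t /*unused*/ = 0) {
  StaticPool<FakeSemaphore, 2> pool;
  pool.allocate();
  pool.allocate();
  return pool.allocate() != nullptr;
}
```

Because the storage is a compile-time array, the only remedy for exhaustion is a larger capacity at build time, which is what the Kconfig option controls.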

1.2 — Set up the memory domain

The domain must be created from supervisor context — typically early in main(), before any user thread is started.

userspace/init_domain.hpp:

#pragma once

#if CONFIG_USERSPACE
#include <zephyr/app_memory/app_memdomain.h>
#include <cstdint>

namespace car_system {

#define APP_DATA K_APP_DMEM(app_partition)

extern struct k_mem_domain app_domain;
extern void init_domain();

}  // namespace car_system
#else
#define APP_DATA
#endif

userspace/init_domain.cpp:

#if CONFIG_USERSPACE

#include "init_domain.hpp"
#include <zephyr/logging/log.h>
#include <zephyr/sys/libc-hooks.h>

LOG_MODULE_DECLARE(car_system, CONFIG_APP_LOG_LEVEL);

K_APPMEM_PARTITION_DEFINE(zpp_lib_partition);
K_APPMEM_PARTITION_DEFINE(app_partition);

namespace car_system {

struct k_mem_domain app_domain;

void init_domain() {
  struct k_mem_partition* partitions[] = {
#if Z_LIBC_PARTITION_EXISTS
      &z_libc_partition,
#endif
      &zpp_lib_partition,
      &app_partition};

  auto ret = k_mem_domain_init(&app_domain, ARRAY_SIZE(partitions), partitions);
  __ASSERT(ret == 0, "k_mem_domain_init failed %d", ret);
  ARG_UNUSED(ret);

  k_mem_domain_add_thread(&app_domain, k_current_get());
}

}  // namespace car_system

#endif

And in main():

#if CONFIG_USERSPACE
  car_system::init_domain();
#endif

K_APPMEM_PARTITION_DEFINE

This macro asks the linker to collect every global variable tagged with K_APP_DMEM(partition) or K_APP_BMEM(partition) into a dedicated, MPU-aligned section. At runtime the partition’s start and size fields point to that section. You never allocate memory manually — the build system does it for you.

The two tagging macros differ only in which linker section they target:

  • K_APP_DMEM(partition) — places an initialized variable (.data) into the partition. Use this for globals that have a non-zero initial value, e.g. K_APP_DMEM(app_partition) int counter = 42;.
  • K_APP_BMEM(partition) — places a zero-initialized variable (.bss) into the partition. Use this for globals whose initial value is zero, e.g. K_APP_BMEM(app_partition) int counter;.

The distinction mirrors the standard .data vs. .bss split: .data variables consume space in the binary image (their initial values must be stored in flash), while .bss variables only consume RAM and are zero-filled at boot.
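The idea of collecting tagged globals into one section with known bounds can be tried on a desktop machine. The sketch below is an analogy only: it assumes GCC or Clang on an ELF target with a GNU-compatible linker, and `DEMO_PART` and `demo_partition` are made-up names. It puts both variables into a single section and does not reproduce the .data/.bss split or Zephyr's MPU alignment.

```cpp
#include <cassert>
#include <cstdint>

// Host-side analogy of a partition tag (hypothetical macro, not Zephyr's).
#define DEMO_PART __attribute__((section("demo_partition")))

namespace section_demo {
DEMO_PART int counter = 42;     // non-zero initial value
DEMO_PART int error_count = 0;  // zero initial value
int untagged = 7;               // stays in the default data section
}  // namespace section_demo

// GNU-compatible linkers synthesize these bounds for any section whose name
// is a valid C identifier.
extern "C" {
extern char __start_demo_partition[];
extern char __stop_demo_partition[];
}

// True if p lies inside the collected section, i.e. inside the "partition".
inline bool in_demo_partition(const void* p) {
  const auto addr = reinterpret_cast<std::uintptr_t>(p);
  const auto lo = reinterpret_cast<std::uintptr_t>(&__start_demo_partition[0]);
  const auto hi = reinterpret_cast<std::uintptr_t>(&__stop_demo_partition[0]);
  return addr >= lo && addr < hi;
}
```

The start and stop symbols play the role that the partition's start and size fields play at runtime: they describe a range the programmer never laid out by hand.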

1.3 — Adapt the ProdConsumer for userspace

Three things change compared to supervisor mode:

a) Thread creation flag

Threads that should drop to user mode must be created with the user-mode flag (the third bool parameter in zpp_lib::Thread):

ProdConsumer::ProdConsumer()
#if CONFIG_USERSPACE
    : _threads{
          zpp_lib::Thread(
              zpp_lib::PreemptableThreadPriority::PriorityLow, "Producer", true),
          zpp_lib::Thread(
              zpp_lib::PreemptableThreadPriority::PriorityVeryLow, "Consumer", true)}
#else
    : _threads{
          zpp_lib::Thread(
              zpp_lib::PreemptableThreadPriority::PriorityLow, "Producer"),
          zpp_lib::Thread(
              zpp_lib::PreemptableThreadPriority::PriorityVeryLow, "Consumer")}
#endif

The true flag means “after thread_start(), drop this thread to unprivileged mode”. The thread begins executing in supervisor mode (so that start() can set up grants), but transitions to user mode before the thread function body runs.

b) Shared data placement

In supervisor mode, _sharedData lives as a private class member — any thread can access any address. In user mode, the variable must be placed in a partition that all accessing threads share.

First, guard _sharedData in the header so it only exists in supervisor mode:

  // prod_consumer.hpp — inside the class
#if !CONFIG_USERSPACE
  // Shared data between producer and consumer (class member in non-userspace mode)
  SensorData _sharedData{};
#endif

Then, in prod_consumer.cpp, declare a partition-tagged global that replaces the class member when userspace is enabled (place this at file scope, before the namespace):

#if CONFIG_USERSPACE
#include "userspace/init_domain.hpp"
APP_DATA car_system::SensorData gProdConsumerSharedData = {};
#endif

The APP_DATA macro expands to K_APP_DMEM(app_partition), which tells the linker to place this variable in the app_partition section. Both threads are in app_domain which includes app_partition, so both can read and write it.

Finally, update producer_task() and consumer_task() to use the global when in userspace mode. In the producer, the write changes to:

    // Write data under mutex protection
    auto lockRes = _dataMutex.lock();
    if (!lockRes) {
      LOG_ERR("ProdConsumer Producer: mutex lock failed");
    } else {
#if CONFIG_USERSPACE
      gProdConsumerSharedData = data;
#else
      _sharedData = data;
#endif
      auto unlockRes = _dataMutex.unlock();
      if (!unlockRes) {
        LOG_ERR("ProdConsumer Producer: mutex unlock failed");
      }
    }

In the consumer, the read changes to:

#if CONFIG_USERSPACE
    receivedData = gProdConsumerSharedData;
#else
    receivedData = _sharedData;
#endif

Class members and userspace

A class member variable lives wherever the class instance is allocated. If the instance is on the stack or in an untagged global, user threads cannot access it. That is why shared data moves to a partition-tagged global.

c) Kernel object grants

User threads cannot use a kernel object (mutex, semaphore, barrier, …) unless they have been explicitly granted access. This is done from supervisor context, typically in start() before the threads begin running:

#if CONFIG_USERSPACE
  _barrier.grant_access(_threads[PRODUCER_INDEX].get_tid());
  _barrier.grant_access(_threads[CONSUMER_INDEX].get_tid());
  _syncSemaphore.grant_access(_threads[PRODUCER_INDEX].get_tid());
  _syncSemaphore.grant_access(_threads[CONSUMER_INDEX].get_tid());
  _dataMutex.grant_access(_threads[PRODUCER_INDEX].get_tid());
  _dataMutex.grant_access(_threads[CONSUMER_INDEX].get_tid());
#endif

Without these grants, the first syscall from user mode would trigger a kernel oops — the kernel detects an unauthorized operation and terminates the thread.
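The permission check can be pictured as one bit per thread stored alongside each kernel object. The following is a hypothetical host-side model: `ObjectAcl`, `SyscallResult`, and `user_lock()` are illustrative names, and the sketch returns an error value where the kernel would raise an oops.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

constexpr std::size_t kMaxThreads = 8;  // arbitrary size for the sketch

// Hypothetical kernel-object record: one permission bit per thread index.
struct ObjectAcl {
  std::bitset<kMaxThreads> allowed;

  void grant(std::size_t thread_index) { allowed.set(thread_index); }
};

enum class SyscallResult { kOk, kAccessDenied };

// Models the check at the syscall boundary: a user thread may operate on an
// object only if its bit is set in that object's permission mask.
inline SyscallResult user_lock(const ObjectAcl& mutex_acl,
                               std::size_t thread_index) {
  return mutex_acl.allowed.test(thread_index) ? SyscallResult::kOk
                                              : SyscallResult::kAccessDenied;
}

inline void demo_grants() {
  constexpr std::size_t kProducer = 0;
  constexpr std::size_t kConsumer = 1;

  ObjectAcl data_mutex;
  data_mutex.grant(kProducer);  // the consumer grant was forgotten

  assert(user_lock(data_mutex, kProducer) == SyscallResult::kOk);
  assert(user_lock(data_mutex, kConsumer) == SyscallResult::kAccessDenied);
}
```

Grants are per object and per thread, which is why the listing above needs one call for each combination of the three objects and two threads.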

1.4 — Declaring the CarSystem instance in the partition

Because the CarSystem object (and its ProdConsumer member) must be accessible from user threads, its global instance must reside in app_partition:

#if CONFIG_USERSPACE
APP_DATA static car_system::CarSystem carSystem;
#endif

In non-userspace mode, it can stay as a local variable in main():

#if !defined(CONFIG_USERSPACE)
  car_system::CarSystem carSystem;
#endif

1.5 — Build and verify

Build with the overlay:

west build -b nrf5340dk/nrf5340/cpuapp car_system \
    -- -DEXTRA_CONF_FILE="prj_user_mode.conf"

You should see the producer and consumer exchanging data exactly as before. The difference is invisible at the log level — but the MPU is now enforcing access rules.

Questions

  1. What would happen if you removed the grant_access() call for the mutex? Try it and describe the resulting error.
  2. What would happen if gProdConsumerSharedData was declared without the APP_DATA tag? Where would it end up, and what fault would the consumer see?
  3. Why is CONFIG_LOG_MODE_MINIMAL=y needed in prj_user_mode.conf?

Solution
  1. The thread would trigger a kernel oops (syscall access denied) on the first _dataMutex.lock() call. The kernel checks per-object permissions on every syscall from user mode and terminates the thread if the grant is missing.

  2. Without APP_DATA, the variable would be placed in the default .bss / .data section, which is not part of any user-accessible partition. The first read or write from user mode would trigger a MemManage fault (MPU violation), causing a system reset or thread termination.

  3. The full deferred logging subsystem uses kernel objects internally (message queues, work queues) that are not automatically granted to user threads. Minimal logging avoids those internal objects by printing synchronously. Without it, the first LOG_INF from user mode would trigger a syscall access fault.

Step 2 — Dedicated Memory Partitions

The first step proved that isolation from the kernel works. But the producer and consumer still share app_partition — the producer can read and write the consumer’s private variables, and vice versa. In a real system with trust boundaries, this is not enough.

In this step, each thread gets its own private partition plus a small shared partition for the data they actually need to exchange:

┌─────────────────────────────────────────────────────────────┐
│                     producer_domain                         │
│  ┌────────┐ ┌────────┐ ┌────────┐ ┌──────────┐ ┌─────────┐  │
│  │z_libc  │ │zpp_lib │ │  app   │ │producer  │ │pc_shared│  │
│  │        │ │        │ │        │ │partition │ │partition│  │
│  └────────┘ └────────┘ └────────┘ └──────────┘ └─────────┘  │
└─────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────┐
│                     consumer_domain                         │
│  ┌────────┐ ┌────────┐ ┌────────┐ ┌──────────┐ ┌─────────┐  │
│  │z_libc  │ │zpp_lib │ │  app   │ │consumer  │ │pc_shared│  │
│  │        │ │        │ │        │ │partition │ │partition│  │
│  └────────┘ └────────┘ └────────┘ └──────────┘ └─────────┘  │
└─────────────────────────────────────────────────────────────┘

  • The producer can access producer_partition (private) and pc_shared_partition (the exchanged SensorData), but not consumer_partition.
  • The consumer can access consumer_partition (private) and pc_shared_partition, but not producer_partition.
  • Both still need zpp_lib_partition, app_partition, and z_libc_partition for library code and common application globals.

A write from the producer into consumer_partition would trigger a MemManage fault — true isolation.
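The access rules in the bullets can be expressed as set membership. This is a minimal sketch that treats a domain as a set of partition names. `Domain`, `make_domain()`, and `can_access()` are hypothetical helpers; the real kernel works on address ranges, not names.

```cpp
#include <cassert>
#include <set>
#include <string>

// A domain modelled as the set of partition names it contains.
using Domain = std::set<std::string>;

// Builds a domain with the common partitions, one private partition and the
// shared exchange partition, mirroring the diagrams above.
inline Domain make_domain(const std::string& private_partition) {
  return Domain{"z_libc_partition", "zpp_lib_partition", "app_partition",
                private_partition, "pc_shared_partition"};
}

// A thread may touch a partition only if its domain contains it.
inline bool can_access(const Domain& domain, const std::string& partition) {
  return domain.count(partition) != 0;
}

inline void demo_domains() {
  const Domain producer_domain = make_domain("producer_partition");
  const Domain consumer_domain = make_domain("consumer_partition");

  assert(can_access(producer_domain, "pc_shared_partition"));
  assert(can_access(consumer_domain, "pc_shared_partition"));
  assert(!can_access(producer_domain, "consumer_partition"));  // would fault
  assert(!can_access(consumer_domain, "producer_partition"));  // would fault
}
```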

2.1 — Add the Kconfig option

config PROD_CONSUMER_PARTITIONS
  bool "Use separate memory partitions for producer and consumer"
  depends on PROD_CONSUMER && USERSPACE
  default n
  help
    When enabled, the producer and consumer each get their own memory
    partition (producer_partition, consumer_partition) plus a shared
    partition (pc_shared_partition) for the data exchanged between them.
    When disabled, both threads share app_partition.

Note the depends on PROD_CONSUMER && USERSPACE — dedicated partitions only make sense when userspace is already enabled.

2.2 — Define partition macros and the init function

userspace/prod_consumer_partitions.hpp:

#pragma once

#if CONFIG_PROD_CONSUMER_PARTITIONS

#include <zephyr/app_memory/app_memdomain.h>

// Macros for tagging globals into the correct partition
#define PRODUCER_DATA K_APP_DMEM(producer_partition)
#define PRODUCER_BSS  K_APP_BMEM(producer_partition)
#define CONSUMER_DATA K_APP_DMEM(consumer_partition)
#define CONSUMER_BSS  K_APP_BMEM(consumer_partition)
#define PC_SHARED_DATA K_APP_DMEM(pc_shared_partition)
#define PC_SHARED_BSS  K_APP_BMEM(pc_shared_partition)

namespace car_system {

void init_prod_consumer_domains(k_tid_t producer_tid, k_tid_t consumer_tid);

}  // namespace car_system

#endif

2.3 — Implement the domain setup

userspace/prod_consumer_partitions.cpp:

#if CONFIG_PROD_CONSUMER_PARTITIONS

#include "prod_consumer_partitions.hpp"
#include <zephyr/logging/log.h>
#include <zephyr/sys/libc-hooks.h>
#include "init_domain.hpp"

LOG_MODULE_DECLARE(car_system, CONFIG_APP_LOG_LEVEL);

extern struct k_mem_partition zpp_lib_partition;
extern struct k_mem_partition app_partition;

// Define the three partitions
K_APPMEM_PARTITION_DEFINE(producer_partition);
K_APPMEM_PARTITION_DEFINE(consumer_partition);
K_APPMEM_PARTITION_DEFINE(pc_shared_partition);

// Each partition must contain at least one variable so the linker gives it
// a non-zero size — Zephyr rejects zero-sized partitions in k_mem_domain_init.
PRODUCER_DATA volatile uint8_t producer_placeholder;
CONSUMER_DATA volatile uint8_t consumer_placeholder;

namespace car_system {

static struct k_mem_domain producer_domain;
static struct k_mem_domain consumer_domain;

void init_prod_consumer_domains(k_tid_t producer_tid, k_tid_t consumer_tid) {
  int ret;

  // Producer domain: zpp_lib + app + producer's own + shared + libc
  struct k_mem_partition* producer_parts[] = {
#if Z_LIBC_PARTITION_EXISTS
      &z_libc_partition,
#endif
      &zpp_lib_partition,
      &app_partition,
      &producer_partition,
      &pc_shared_partition};

  ret = k_mem_domain_init(&producer_domain, ARRAY_SIZE(producer_parts),
                          producer_parts);
  __ASSERT(ret == 0, "producer k_mem_domain_init failed %d", ret);
  ARG_UNUSED(ret);
  k_mem_domain_add_thread(&producer_domain, producer_tid);

  // Consumer domain: zpp_lib + app + consumer's own + shared + libc
  struct k_mem_partition* consumer_parts[] = {
#if Z_LIBC_PARTITION_EXISTS
      &z_libc_partition,
#endif
      &zpp_lib_partition,
      &app_partition,
      &consumer_partition,
      &pc_shared_partition};

  ret = k_mem_domain_init(&consumer_domain, ARRAY_SIZE(consumer_parts),
                          consumer_parts);
  __ASSERT(ret == 0, "consumer k_mem_domain_init failed %d", ret);
  ARG_UNUSED(ret);
  k_mem_domain_add_thread(&consumer_domain, consumer_tid);
}

}  // namespace car_system

#endif

Placeholder variables

The linker will give a partition zero size if no variable is tagged into it. k_mem_domain_init() rejects zero-sized partitions with an assertion failure. The producer_placeholder / consumer_placeholder variables exist solely to ensure the partitions are non-empty. In a real application, these would be replaced by actual per-thread globals.
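The rule can be modelled as a validation pass over the partition array. The sketch below is hypothetical and mirrors only the zero-size rule stated above. `PartitionDesc` and `check_partitions()` are illustrative names, the addresses and sizes are arbitrary, and the function returns an error code where the text describes an assertion.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a partition descriptor.
struct PartitionDesc {
  std::uintptr_t start;
  std::size_t size;
};

// Rejects the whole set if any partition is missing or empty.
inline int check_partitions(const PartitionDesc* const parts[],
                            std::size_t count) {
  for (std::size_t i = 0; i < count; ++i) {
    if (parts[i] == nullptr || parts[i]->size == 0) {
      return -EINVAL;
    }
  }
  return 0;
}

inline void demo_placeholder() {
  const PartitionDesc shared{0x20002000u, 64};
  const PartitionDesc empty_private{0x20002040u, 0};   // nothing tagged into it
  const PartitionDesc padded_private{0x20002040u, 32}; // holds a placeholder

  const PartitionDesc* const without_placeholder[] = {&shared, &empty_private};
  const PartitionDesc* const with_placeholder[] = {&shared, &padded_private};

  assert(check_partitions(without_placeholder, 2) == -EINVAL);
  assert(check_partitions(with_placeholder, 2) == 0);
}
```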

Why still include app_partition?

The CarSystem object, logging buffers, and other common application globals live in app_partition. Both threads need read access to those. If you removed app_partition from the domain, the first access to any APP_DATA-tagged variable would fault.

2.4 — Place shared data in pc_shared_partition

In prod_consumer.cpp, the shared data placement now depends on the configuration:

#if CONFIG_PROD_CONSUMER_PARTITIONS
#include "userspace/prod_consumer_partitions.hpp"
PC_SHARED_DATA car_system::SensorData gProdConsumerSharedData = {};
#elif CONFIG_USERSPACE
#include "userspace/init_domain.hpp"
APP_DATA car_system::SensorData gProdConsumerSharedData = {};
#endif

When CONFIG_PROD_CONSUMER_PARTITIONS is enabled, gProdConsumerSharedData lives in pc_shared_partition, which is part of both domains and therefore accessible to both threads. Any thread-private data tagged with PRODUCER_DATA or CONSUMER_DATA is accessible only to the thread whose domain includes the matching partition.
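The access rule can be pictured with a small host-side model in plain C++. This is a sketch for intuition only: Partition, Domain, and allows() are hypothetical names, not Zephyr types, and any addresses fed to it are made up. On the target the same decision is taken by the MPU in hardware.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins, for illustration only (not Zephyr types).
struct Partition {
  std::uintptr_t start;
  std::size_t size;

  // True if addr lies inside [start, start + size).
  bool contains(std::uintptr_t addr) const {
    return addr >= start && addr - start < size;
  }
};

struct Domain {
  std::vector<const Partition*> parts;

  // True if any partition of the domain covers the address. On the target,
  // a "false" here corresponds to an MPU fault for a user-mode thread.
  bool allows(std::uintptr_t addr) const {
    for (const Partition* p : parts) {
      if (p->contains(addr)) {
        return true;
      }
    }
    return false;
  }
};
```

With a producer domain built from {app, producer, shared} and a consumer domain built from {app, consumer, shared}, an address inside the shared partition is allowed in both domains, while an address inside the consumer partition is allowed only in the consumer domain.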

2.5 — Wire domain initialization into start()

In ProdConsumer::start(), after granting kernel object access and before the threads drop to user mode, call the domain setup:

#if CONFIG_USERSPACE
  // Grant kernel object access (same as Step 1)
  _barrier.grant_access(_threads[PRODUCER_INDEX].get_tid());
  _barrier.grant_access(_threads[CONSUMER_INDEX].get_tid());
  _syncSemaphore.grant_access(_threads[PRODUCER_INDEX].get_tid());
  _syncSemaphore.grant_access(_threads[CONSUMER_INDEX].get_tid());
  _dataMutex.grant_access(_threads[PRODUCER_INDEX].get_tid());
  _dataMutex.grant_access(_threads[CONSUMER_INDEX].get_tid());

#if CONFIG_PROD_CONSUMER_PARTITIONS
  // Move each thread into its own memory domain
  init_prod_consumer_domains(_threads[PRODUCER_INDEX].get_tid(),
                             _threads[CONSUMER_INDEX].get_tid());
#endif
#endif

Order matters

init_prod_consumer_domains() calls k_mem_domain_add_thread(), which removes the thread from its current domain (app_domain) and adds it to the new one. This must happen after init_domain() has run (so the common partitions already exist) and before the threads start executing in user mode.
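The "move" behaviour described above can be modelled on the host in a few lines of plain C++. Thread, Domain, and model_add_thread are hypothetical stand-ins, not the kernel's data structures; the sketch only illustrates that a thread belongs to one domain at a time.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-ins for illustration; these are not Zephyr types.
struct Domain;

struct Thread {
  Domain* domain = nullptr;  // a thread belongs to at most one domain here
};

struct Domain {
  std::vector<Thread*> members;
};

// Models the move semantics: the thread leaves its current domain (if any)
// before joining the new one.
inline void model_add_thread(Domain& target, Thread& thread) {
  if (thread.domain == &target) {
    return;  // already a member, nothing to do
  }
  if (thread.domain != nullptr) {
    auto& old = thread.domain->members;
    old.erase(std::remove(old.begin(), old.end(), &thread), old.end());
  }
  target.members.push_back(&thread);
  thread.domain = &target;
}
```

In this model, adding the producer thread first to an application-wide domain and then to its dedicated domain leaves the first domain without that member, which mirrors why the per-thread setup has to run after the common domain has been set up.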

2.6 — Activate in prj_user_mode.conf

# Enable separate memory partitions for producer/consumer
CONFIG_PROD_CONSUMER_PARTITIONS=y

Build and flash as before:

west build -b nrf5340dk/nrf5340/cpuapp car_system \
    -- -DEXTRA_CONF_FILE="prj_user_mode.conf"

The output should be identical to Step 1. The difference is purely in the fault isolation: the producer and consumer can no longer corrupt each other’s private data.

Questions

  1. Draw the MPU region map for the producer thread showing which partitions are accessible and which are not.
  2. What happens if the producer tries to write to a variable tagged with CONSUMER_DATA? At what level is the fault caught (hardware or software)?
  3. In the diagram above, both domains include app_partition. Could you remove it from one of them? What would break?
  4. Why are producer_domain and consumer_domain declared as static local variables in prod_consumer_partitions.cpp rather than as globals? What would change if they were global?

Solution

  1. Producer thread MPU regions:

    Region  Partition            Access
    0       z_libc_partition     RW
    1       zpp_lib_partition    RW
    2       app_partition        RW
    3       producer_partition   RW
    4       pc_shared_partition  RW
    -       consumer_partition   FAULT
    -       kernel memory        FAULT
    -       peripheral MMIO      FAULT (unless granted)
  2. A write to a CONSUMER_DATA variable triggers a MemManage fault. The hardware MPU catches it before the write completes, and the CPU vectors to the fault handler, which typically logs the offending address and terminates the thread (or resets the system). No software check is needed: the MPU validates every access in hardware as part of the memory transaction.

  3. Removing app_partition would cause a hardware fault on the first access to any APP_DATA-tagged variable — for example the CarSystem object itself, the log buffers, or any common application global. Both threads need read (and sometimes write) access to shared application state.

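  4. The application code touches the k_mem_domain structs only during initialization (k_mem_domain_init + k_mem_domain_add_thread), but the kernel keeps the pointer it received and continues to reference the struct afterwards, so the structs must have static storage duration. Both a file-scope static and a global satisfy that. Making them static is good practice because it limits visibility to prod_consumer_partitions.cpp; making them global would also work, since the kernel does not care about the linkage of the struct, only about the pointer it received during init.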

Summary

Aspect                Step 0: supervisor          Step 1: shared domain  Step 2: per-thread domains
Thread privilege      Supervisor                  User                   User
Memory isolation      None                        Threads vs. kernel     Threads vs. kernel and vs. each other
Shared data location  Class member (_sharedData)  APP_DATA global        PC_SHARED_DATA global
Private data safety   Trust-based                 Trust-based            MPU-enforced
Kernel object access  Implicit                    Explicit grants        Explicit grants
Fault on bad access   Silent corruption           MemManage fault        MemManage fault

The progression from supervisor → shared domain → per-thread domains is a practical pattern for introducing userspace incrementally:

  1. Get the application working in supervisor mode first (Step 0).
  2. Flip the userspace switch and fix what breaks: missing kernel object grants and data placed outside any accessible partition (Step 1).
  3. When isolation between threads matters, split into per-thread domains (Step 2).

Going beyond / References