
Robust Design Patterns - Part 2 - Userspace Isolation

Introduction

In this codelab, we address the problem of unresponsive applications. An application is unresponsive when a program installed on an MCU becomes unavailable for any of several reasons (e.g. a gobbler task or a deadlock). So far every thread in our application runs in supervisor mode, the CPU’s most privileged level. Any thread can read or write any memory address, access any peripheral, and call any kernel function. That is convenient for development, but it means a single bug in one thread can corrupt data used by another, silently overwrite kernel structures, or poke hardware registers it has no business touching.

Zephyr RTOS offers userspace — an optional feature that leverages the CPU’s Memory Protection Unit (MPU) to restrict what each thread can access at the hardware level. A userspace thread can only touch memory regions it has been explicitly granted access to; any other access triggers an immediate CPU fault instead of silent corruption.

What you’ll build

In this codelab, you will turn two existing periodic tasks — Engine (producer) and Display (consumer) — into a producer-consumer pair exchanging SensorData via shared memory, and progressively increase the use of userspace protection:

  1. Step 0 — Producer/Consumer in supervisor mode: both threads run in supervisor mode and share data through a common variable.
  2. Step 1 — Single shared domain (app_partition): both threads run in user mode and share the same global memory partition.
  3. Step 2 — Dedicated per-thread domains: Engine and Display each get their own private memory partition, with only the exchanged data placed in a shared partition accessible to both.

This approach reuses the existing CarSystem periodic tasks. Engine produces sensor data at the end of its 50 ms computation cycle, and Display consumes it using a non-blocking semaphore check within its 125 ms cycle.

What you’ll learn

  • What supervisor mode vs. user mode means on an ARM Cortex-M33 with MPU.
  • The core Zephyr RTOS abstractions: memory partitions, memory domains, and kernel object permissions.
  • How to add producer-consumer communication between existing periodic threads without creating new ones.
  • How to move the subsystem into userspace incrementally — first with a single partition, then with fine-grained per-thread isolation.
  • The practical constraints: what breaks when you flip the switch, and how to fix it.
  • How to verify that isolation actually works (a rogue write faults instead of corrupting).

What you’ll need

Disable task watchdog subsystem

Make sure you have disabled the task watchdog subsystem prior to starting this codelab. Failing to do so will result in crashes.

Update zpp_lib

Update zpp_lib to the latest version: go to deps/zpp_lib and run git checkout tags/v1.0 to get the latest version (v1.0). Failing to do so will result in compilation errors. You should also modify your west.yml file. If you prefer to perform a west update after modifying the west.yml file, do NOT forget to reapply the patch to the Zephyr RTOS library with git cherry-pick 4a3c5ed1f6f3ae3351fe48f11382c0a94e0aea02.

Concepts: why Userspace?

The problem with supervisor mode

In a typical bare-metal or RTOS application every thread runs at the CPU’s highest privilege level. This means:

| Capability | Risk |
| --- | --- |
| Read/write any RAM address | One thread’s bug silently corrupts another’s data |
| Access any peripheral register | Accidental write to a timer or UART can brick the system |
| Call any kernel API | A buffer overflow can overwrite scheduler structures |

In safety-critical projects this is unacceptable: you need fault containment — the guarantee that one component’s failure cannot propagate to another.

How the MPU enforces isolation

ARM Cortex-M33 (the core in the nRF5340) includes a Memory Protection Unit (MPU) — a hardware block that sits between the CPU and the bus. Before every memory access, the MPU checks:

  1. Is the address within a region the current thread is allowed to access?
  2. Does the access type (read / write / execute) match the region’s permissions?

If the check fails, the CPU raises a MemManage fault — an immediate, synchronous exception. The offending instruction never completes.

┌──────────────┐         ┌─────────┐         ┌──────────────┐
│   CPU core   │──addr──▶│   MPU   │──ok──▶  │  Bus / SRAM  │
│ (thread A)   │         │ regions │         │              │
└──────────────┘         │ check   │         └──────────────┘
                         │         │
                         │ FAULT ◀─┘  (if no matching region)

The key insight: the MPU is reconfigured on every context switch. When the scheduler switches from Thread A to Thread B, it loads Thread B’s region table into the MPU. Thread B literally cannot see Thread A’s private memory — it is as if it did not exist.
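The two checks and the per-thread region table can be modelled on a host PC. The sketch below is only a model under simplifying assumptions: Region, RegionTable, and mpu_allows() are hypothetical names, and on the real target the regions are programmed by the Zephyr kernel, not by application code.

```cpp
#include <cstdint>
#include <vector>

// Host-side model only: it mimics the decision, not the MPU registers.
enum Access : std::uint8_t { kRead = 1, kWrite = 2, kExecute = 4 };

struct Region {
  std::uint32_t base;  // first address covered by the region
  std::uint32_t size;  // length in bytes
  std::uint8_t perms;  // bitmask of Access values
};

// The region table the scheduler would load for one thread.
using RegionTable = std::vector<Region>;

// Returns true if the access is allowed, false where the hardware would
// raise a MemManage fault.
bool mpu_allows(const RegionTable& table, std::uint32_t addr, Access type) {
  for (const Region& r : table) {
    const bool in_range = addr >= r.base && addr - r.base < r.size;
    if (in_range && (r.perms & type) != 0) {
      return true;
    }
  }
  return false;  // no matching region, or wrong access type
}
```

Switching from Thread A to Thread B corresponds to calling mpu_allows() with a different RegionTable: an address that is valid under Thread A’s table is simply rejected under Thread B’s.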

Zephyr’s memory model

Zephyr RTOS introduces three abstractions that map onto the MPU:

| Abstraction | What it is | Analogy |
| --- | --- | --- |
| Memory Partition (k_mem_partition) | A named, contiguous memory region with fixed permissions | A room in a building |
| Memory Domain (k_mem_domain) | A set of partitions that a thread (or group of threads) can access | A key card that opens specific rooms |
| Kernel Object Permissions | Per-object ACL (Access Control List) granting a thread the right to use a specific kernel object (mutex, semaphore, queue, …) | A signed permission slip for a specific resource |

A thread running in user mode can only access:

  • The partitions in its memory domain
  • The kernel objects it has been explicitly granted access to

Everything else triggers a fault.

Supervisor mode is always available

Code running in ISR context always executes in supervisor mode. Threads only run in user mode if they were created with the K_USER option (or the equivalent zpp_lib flag); all other threads remain in supervisor mode regardless of their priority. Both cooperative and preemptive threads can be user mode threads — the privilege level is determined by the creation flag, not the priority class (see https://docs.zephyrproject.org/latest/kernel/services/threads/index.html#thread-options).

Step 0 — Producer/Consumer in Supervisor Mode

In this codelab we integrate the producer-consumer roles directly into the existing CarSystem periodic tasks:

  • Engine (50 ms period, High priority) → producer: writes SensorData at the end of each computation cycle.
  • Display (125 ms period, AboveNormal priority) → consumer: attempts to read SensorData using a non-blocking semaphore check.

Because Engine runs at 50 ms while Display runs at 125 ms, Engine produces data roughly 2-3 times per Display cycle. Display picks up the latest sample each time it wakes.

┌──────────────────────────────────────────────────────────────────┐
│                   Supervisor Mode (all RAM visible)              │
│                                                                  │
│    Engine thread (50ms)             Display thread (125ms)       │
│   ┌─────────────────────┐          ┌──────────────────────┐      │
│   │ computation (10ms)  │          │ computation (15ms)   │      │
│   │ fill SensorData     │──mutex──▶│ try_acquire sema     │      │
│   │ release semaphore   │──sema───▶│ read SensorData      │      │
│   └─────────────────────┘          └──────────────────────┘      │
│                                                                  │
│                     _pcSharedData (class member)                 │
└──────────────────────────────────────────────────────────────────┘
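The "2-3 samples per Display cycle" figure follows directly from the two periods. The sketch below counts idealized Engine activations per Display window on a host PC. The function name samples_per_display_window() is hypothetical, and activations are assumed to fall on exact multiples of the period, starting at t = 0.

```cpp
#include <vector>

// Counts Engine activations (every engine_ms, starting at t = 0) that fall
// inside each consecutive Display window
// [k * display_ms, (k + 1) * display_ms).
std::vector<int> samples_per_display_window(int engine_ms, int display_ms,
                                            int windows) {
  std::vector<int> counts(windows, 0);
  const int horizon_ms = display_ms * windows;
  for (int t = 0; t < horizon_ms; t += engine_ms) {
    counts[t / display_ms] += 1;
  }
  return counts;
}
```

With engine_ms = 50 and display_ms = 125, four windows give the counts 3, 2, 3, 2: the pattern repeats every 250 ms, the least common multiple of the two periods.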

0.1 — Add the Kconfig option

Create a PROD_CONSUMER_INTEGRATED option in the project’s Kconfig so the feature can be compiled in or out:

config PROD_CONSUMER_INTEGRATED
  bool "Engine/Display producer-consumer demo"
  depends on PERIODIC_TASKS
  default n
  help
    When enabled, the Engine periodic task acts as a producer and the
    Display periodic task acts as a consumer. Engine writes SensorData
    every period and Display reads it, demonstrating producer-consumer
    with existing threads rather than dedicated ones. Works in both
    supervisor and userspace modes.

Enable it in prj.conf:

CONFIG_PERIODIC_TASKS=y
CONFIG_PROD_CONSUMER_INTEGRATED=y

0.2 — Add members to CarSystem (car_system.hpp)

The producer-consumer primitives live directly in CarSystem. Add the SensorData struct and the required members:

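// car_system.hpp: add the includes
#if CONFIG_PROD_CONSUMER_INTEGRATED
#include <atomic>   // std::atomic (_pcSequenceCounter)
#include <chrono>   // std::chrono::milliseconds (SensorData::timestamp)
#include <cstdint>  // uint32_t

#include "zpp_include/mutex.hpp"
#endif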

Define SensorData before the class (inside the car_system namespace):

#if CONFIG_PROD_CONSUMER_INTEGRATED
// Data structure shared between Engine (producer) and Display (consumer)
struct SensorData {
  uint32_t sequence_number;
  std::chrono::milliseconds timestamp;
  uint32_t sensor_value;
};
#endif  // CONFIG_PROD_CONSUMER_INTEGRATED

Add the following private members at the end of the CarSystem class:

#if CONFIG_PROD_CONSUMER_INTEGRATED
  // Semaphore initial count
  static constexpr int kPcSemaphoreInitialCount = 0;
  // Max semaphore count: accommodates Engine's higher frequency (50ms) vs Display (125ms)
  static constexpr int kPcSemaphoreMaxCount = 5;
  // Semaphore for Engine (producer) → Display (consumer) signalling.
  zpp_lib::Semaphore _pcSemaphore{kPcSemaphoreInitialCount, kPcSemaphoreMaxCount};
  // Mutex to protect access to the shared sensor data
  zpp_lib::Mutex _pcMutex;
  // Sequence counter for produced data
  std::atomic<uint32_t> _pcSequenceCounter{0};
#if !CONFIG_USERSPACE
  // Shared data between Engine and Display (class member in supervisor mode)
  SensorData _pcSharedData{};
#endif  // !CONFIG_USERSPACE
  // Scaling factor applied to sequence number to produce sensor value
  static constexpr uint32_t kSensorValueScale = 42;
#endif  // CONFIG_PROD_CONSUMER_INTEGRATED

Workqueue must operate in kernel mode under Zephyr RTOS

Running a k_work_q in user mode is not supported under Zephyr RTOS. Therefore, the thread that executes the k_work_queue_run function must be created as a kernel thread. The same applies to the use of zpp_lib::WorkQueue.

Tracing and user mode are incompatible under Zephyr RTOS

A thread created in user mode cannot invoke SystemView primitives, such as SEGGER_SYSVIEW_MarkStart() and SEGGER_SYSVIEW_MarkStop(). Doing so results in an MPU fault, because these functions access protected memory without invoking system calls. Therefore, tracing a user thread is not possible.

Why max = 5 on the semaphore?

Engine produces data every 50 ms while Display only consumes every 125 ms. Between two Display wakes, Engine may have posted 2-3 items. The semaphore max = 5 provides headroom so that release() does not fail if Display is temporarily delayed. Display uses try_acquire() (non-blocking) to pick up a pending item without disrupting its own periodic schedule.
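The token arithmetic can be checked on a host PC. The sketch below is a simplified model, not zpp_lib code: BoundedCounter and simulate() are hypothetical names, and activations are idealized to exact multiples of the periods. It assumes that Display calls try_acquire() once per cycle (as in the task_method() logic of step 0.3) and that release() leaves the count unchanged at the maximum, as Zephyr's k_sem_give() does.

```cpp
#include <algorithm>

// Host-side model of a counting semaphore with a maximum count.
struct BoundedCounter {
  int count;
  int max;
  bool release() {  // returns false if the count was already at max
    if (count == max) return false;
    ++count;
    return true;
  }
  bool try_acquire() {  // non-blocking: false if nothing is pending
    if (count == 0) return false;
    --count;
    return true;
  }
};

struct SimResult {
  int peak_count;       // highest semaphore count observed
  int saturated_posts;  // releases that found the count already at max
  int consumed;         // successful try_acquire() calls
};

// Engine posts every engine_ms, Display tries once every display_ms.
// On a tie, Engine runs first (it has the higher priority).
SimResult simulate(int engine_ms, int display_ms, int max_count,
                   int horizon_ms) {
  BoundedCounter sem{0, max_count};
  SimResult result{0, 0, 0};
  for (int t = 0; t < horizon_ms; ++t) {
    if (t % engine_ms == 0) {
      if (!sem.release()) ++result.saturated_posts;
      result.peak_count = std::max(result.peak_count, sem.count);
    }
    if (t % display_ms == 0 && sem.try_acquire()) ++result.consumed;
  }
  return result;
}
```

Under these assumptions simulate(50, 125, 5, 250) reports a peak count of 3 with no saturated post, while simulate(50, 125, 5, 1000) reaches the maximum of 5 and sees 7 of the 20 posts arrive at a full semaphore. How zpp_lib::Semaphore::release() reports that case is not modelled here.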

0.3 — Add producer/consumer logic to task_method() (car_system.cpp)

The key insight: Engine and Display already run in task_method(). We add the producer/consumer logic after the regular computation, but still within the same periodic invocation.

In car_system.cpp, add the following block inside task_method(), after the test recorder stop and before the period calculation:

#if CONFIG_PROD_CONSUMER_INTEGRATED
    // Engine = producer: write sensor data after computation
    if (taskIndex == kEngineTaskIndex) {
      SensorData data;
      data.sequence_number = _pcSequenceCounter.fetch_add(1);
      data.timestamp = std::chrono::duration_cast<std::chrono::milliseconds>(
          zpp_lib::Time::get_uptime());
      data.sensor_value = data.sequence_number * kSensorValueScale;

      auto lockRes = _pcMutex.lock();
      if (!lockRes) {
        LOG_ERR("Engine producer: mutex lock failed");
      } else {
        _pcSharedData = data;
        auto unlockRes = _pcMutex.unlock();
        if (!unlockRes) {
          LOG_ERR("Engine producer: mutex unlock failed");
        }
      }

      auto semRes = _pcSemaphore.release();
      if (!semRes) {
        LOG_ERR("Engine producer: semaphore release failed");
      }

      LOG_INF("Engine produced data (seq=%u, value=%u)",
              data.sequence_number, data.sensor_value);
    }

    // Display = consumer: try to read sensor data (non-blocking)
    if (taskIndex == kDisplayTaskIndex) {
      auto semRes = _pcSemaphore.try_acquire();
      if (semRes) {
        SensorData receivedData;
        auto lockRes = _pcMutex.lock();
        if (!lockRes) {
          LOG_ERR("Display consumer: mutex lock failed");
        } else {
          receivedData = _pcSharedData;
          auto unlockRes = _pcMutex.unlock();
          if (!unlockRes) {
            LOG_ERR("Display consumer: mutex unlock failed");
          }

          LOG_INF("Display consumed data (seq=%u, timestamp=%lld ms, value=%u)",
                  receivedData.sequence_number,
                  static_cast<long long>(receivedData.timestamp.count()),
                  receivedData.sensor_value);
        }
      }
    }
#endif  // CONFIG_PROD_CONSUMER_INTEGRATED

Why try_acquire() instead of acquire()?

Display is a periodic task with its own deadline. A blocking acquire() would stall Display until Engine produces — violating Display’s period contract. The non-blocking try_acquire() simply checks whether data is available: if yes, it consumes it; if not, Display continues to its next period unchanged.
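The same pattern can be tried on a host PC with the standard library only. The sketch below is an analogue, not the CarSystem code: LatestSampleSlot, publish(), and try_consume() are hypothetical names. A plain token counter stands in for the zpp_lib semaphore, and a std::mutex protects the single shared slot.

```cpp
#include <cstdint>
#include <mutex>

struct Sample {
  std::uint32_t sequence_number;
  std::uint32_t sensor_value;
};

// Host-side analogue of the Engine/Display exchange: a token count plays the
// role of the semaphore and a std::mutex protects the single shared slot.
class LatestSampleSlot {
 public:
  explicit LatestSampleSlot(int max_tokens) : max_tokens_(max_tokens) {}

  // Producer side: overwrite the slot, then signal that data is available.
  void publish(const Sample& sample) {
    std::lock_guard<std::mutex> lock(mutex_);
    slot_ = sample;
    if (tokens_ < max_tokens_) ++tokens_;
  }

  // Consumer side: never waits for data. Returns false when no token is
  // pending, so the caller can carry on with its own periodic work.
  bool try_consume(Sample& out) {
    std::lock_guard<std::mutex> lock(mutex_);
    if (tokens_ == 0) return false;
    --tokens_;
    out = slot_;
    return true;
  }

 private:
  std::mutex mutex_;
  int max_tokens_;
  int tokens_ = 0;
  Sample slot_{};
};
```

Because the slot holds only the most recent sample, a consumer that finds a token always reads the latest value, and a second pending token yields that same sample again. A blocking variant would have to wait inside try_consume() when no token is pending, which is exactly what a periodic task cannot afford.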

0.4 — Build and verify

west build -b nrf5340dk/nrf5340/cpuapp car_system --pristine

You should see log output like:

Engine produced data (seq=0, value=0)
Engine produced data (seq=1, value=42)
Display consumed data (seq=1, timestamp=92 ms, value=42)
Engine produced data (seq=2, value=84)
Engine produced data (seq=3, value=126)
Display consumed data (seq=3, timestamp=192 ms, value=126)

Notice that Display does not consume every sample — it picks up the latest available one, which is the expected behavior given the period mismatch.

At this point everything works, but there is no memory protection whatsoever. A bug in Display could overwrite Engine’s stack, corrupt kernel data structures or poke random peripheral registers — and the system would silently continue with a corrupted state.

Questions

  1. Why does Display use a non-blocking try_acquire() while the standalone ProdConsumer consumer uses a blocking acquire()?
  2. How many Engine samples does Display miss between two consecutive Display invocations? Is this a problem?
  3. What would happen if you removed the mutex and let both threads access _pcSharedData without synchronization? Would the system crash immediately?

Step 1 — Userspace with a Shared Domain

With the supervisor-mode integrated producer/consumer working (Step 0), you can now flip the userspace switch. In this step, every application thread (including Engine and Display) belongs to one common memory domain (app_domain) containing:

  • z_libc_partition — the C library’s internal globals
  • zpp_lib_partition — the zpp_lib library globals
  • app_partition — the application’s own globals

Both Engine and Display see the same set of memory. There is no isolation between them yet, but they are isolated from the kernel and from peripherals they have not been granted access to.

┌──────────────────────────────────────────┐
│              app_domain                  │
│  ┌───────────┐ ┌──────────┐ ┌─────────┐  │
│  │ z_libc    │ │ zpp_lib  │ │   app   │  │
│  │ partition │ │ partition│ │partition│  │
│  └───────────┘ └──────────┘ └─────────┘  │
│                                          │
│  Engine thread ◀────────▶ Display thread │
│          (both share everything)         │
└──────────────────────────────────────────┘

1.1 — Enable userspace in the build

Userspace is activated via an overlay configuration file that is passed at build time with --extra-conf.

Create prj_user_mode.conf in your car_system directory:

# enable user mode
CONFIG_USERSPACE=y
CONFIG_LOG_MODE_MINIMAL=y
# Integrated P/C adds 1 semaphore + 1 mutex (default pool size may not be enough)
CONFIG_ZPP_SEMAPHORE_POOL_SIZE=4
# Enable INFO-level app logging so producer/consumer output is visible
CONFIG_APP_LOG_LEVEL_INF=y

Pool sizes

Userspace objects (semaphores, mutexes, barriers, …) are allocated from static pools whose sizes must be configured at build time. If you get a boot-time assertion such as __ASSERT zpp semaphore pool exhausted, increase the pool size in the conf file.
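To see why a pool that is too small only shows up at run time, consider a generic fixed-capacity pool. The sketch below is illustrative only: StaticPool is a hypothetical class, and the actual zpp_lib pool implementation is not shown in this codelab.

```cpp
#include <array>
#include <cstddef>

// Host-side model of a fixed-size object pool whose capacity is a
// compile-time constant, similar in spirit to a Kconfig-sized pool.
template <typename T, std::size_t Capacity>
class StaticPool {
 public:
  // Returns a pointer to the next free slot, or nullptr once exhausted.
  T* allocate() {
    if (used_ == Capacity) return nullptr;
    return &slots_[used_++];
  }

  std::size_t used() const { return used_; }

 private:
  std::array<T, Capacity> slots_{};
  std::size_t used_ = 0;
};
```

A StaticPool<int, 2> hands out two slots and then returns nullptr on the third request. A library built on such a pool can only detect the shortage when the extra object is constructed, which matches the boot-time assertion described above.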

1.2 — Set up the memory domain

The domain must be created from supervisor context — typically early in main(), before any user thread is started.

userspace/init_domain.hpp:

#pragma once

#if CONFIG_USERSPACE

// zephyr
#include <zephyr/app_memory/app_memdomain.h>

// std
#include <cstdint>

namespace car_system {

#define APP_DATA K_APP_DMEM(app_partition)

extern struct k_mem_domain app_domain;
extern void init_domain();

}  // namespace car_system
#else  // CONFIG_USERSPACE
#define APP_DATA
#endif  // CONFIG_USERSPACE

userspace/init_domain.cpp:

#if CONFIG_USERSPACE

#include "init_domain.hpp"

// zephyr
#include <zephyr/logging/log.h>
#include <zephyr/sys/libc-hooks.h>

// local

LOG_MODULE_DECLARE(car_system, CONFIG_APP_LOG_LEVEL);

/* Define zpp_lib_partition, where all globals for the zpp_lib library will be routed.
 * The partition starting address and size are populated by build system
 * and linker magic.
 */
K_APPMEM_PARTITION_DEFINE(zpp_lib_partition);

/* Define app_partition, where all globals for this app will be routed.
 * The partition starting address and size are populated by build system
 * and linker magic.
 */
K_APPMEM_PARTITION_DEFINE(app_partition);

namespace car_system {

/* Memory domain for application, set up and installed in main() */
struct k_mem_domain app_domain;

void init_domain() {
  LOG_INF("ZPP_LIB partition: %p %zu",
          (void*)zpp_lib_partition.start,
          (size_t)zpp_lib_partition.size);
  LOG_INF(
      "APP partition: %p %zu", (void*)app_partition.start, (size_t)app_partition.size);
#ifdef Z_LIBC_PARTITION_EXISTS
  LOG_INF("libc partition: %p %zu",
          (void*)z_libc_partition.start,
          (size_t)z_libc_partition.size);
#endif  // Z_LIBC_PARTITION_EXISTS

  /* Initialize a memory domain with the specified partitions
   * and add this thread to this domain. The domain needs the
   * zpp_lib partition, the application partition, and the common
   * libc partition if it exists.
   */
  struct k_mem_partition* partitions[] = {
#if Z_LIBC_PARTITION_EXISTS
      &z_libc_partition,
#endif  // Z_LIBC_PARTITION_EXISTS
      &zpp_lib_partition,
      &app_partition};

  auto ret = k_mem_domain_init(&app_domain, ARRAY_SIZE(partitions), partitions);
  __ASSERT(ret == 0, "k_mem_domain_init failed %d", ret);
  ARG_UNUSED(ret);

  k_mem_domain_add_thread(&app_domain, k_current_get());
}

}  // namespace car_system

#endif  // CONFIG_USERSPACE

And in main():

#if CONFIG_USERSPACE
  car_system::init_domain();
#endif

K_APPMEM_PARTITION_DEFINE

This macro asks the linker to collect every global variable tagged with K_APP_DMEM(partition) or K_APP_BMEM(partition) into a dedicated, MPU-aligned section. At runtime the partition’s start and size fields point to that section. You never allocate memory manually — the build system does it for you.

The two tagging macros differ only in which linker section they target:

  • K_APP_DMEM(partition) — places an initialized variable (.data) into the partition. Use this for globals that have a non-zero initial value, e.g. K_APP_DMEM(app_partition) int counter = 42;.
  • K_APP_BMEM(partition) — places a zero-initialized variable (.bss) into the partition. Use this for globals whose initial value is zero, e.g. K_APP_BMEM(app_partition) int counter;.

The distinction mirrors the standard .data vs. .bss split: .data variables consume space in the binary image (their initial values must be stored in flash), while .bss variables only consume RAM and are zero-filled at boot.

1.3 — Adapt the CarSystem for userspace

Three things change compared to supervisor mode:

a) Thread creation flag

The periodic task threads that should drop to user mode must be created with the user-mode flag (the third bool parameter in zpp_lib::Thread). This is already the case in the existing CarSystem constructor when CONFIG_USERSPACE is enabled:

CarSystem::CarSystem()
#if CONFIG_PERIODIC_TASKS && CONFIG_USERSPACE
    : _threads{zpp_lib::Thread(_taskInfos[kEngineTaskIndex]._priority,
                               _taskInfos[kEngineTaskIndex]._szTaskName,
                               true),
               zpp_lib::Thread(_taskInfos[kDisplayTaskIndex]._priority,
                               _taskInfos[kDisplayTaskIndex]._szTaskName,
                               true),
               // ... Tire and Rain also with true ...
              }

The true flag means “after thread_start(), drop this thread to unprivileged mode”. The thread begins executing in supervisor mode (so that start() can set up grants), but transitions to user mode before the thread function body runs.

b) Shared data placement

In supervisor mode, _pcSharedData lives as a private class member — any thread can access any address. In user mode, the variable must be placed in a partition that all accessing threads share.

The member is already guarded so it only exists in supervisor mode:

  // car_system.hpp — inside the class
#if !CONFIG_USERSPACE
  SensorData _pcSharedData{};
#endif

In car_system.cpp, declare a partition-tagged global that replaces the class member when userspace is enabled (place this at file scope):

#if CONFIG_PROD_CONSUMER_INTEGRATED
#if CONFIG_USERSPACE
#include "userspace/init_domain.hpp"
APP_DATA car_system::SensorData gIntegratedSharedData = {};
#endif
#endif

The APP_DATA macro expands to K_APP_DMEM(app_partition), which tells the linker to place this variable in the app_partition section. Both threads are in app_domain which includes app_partition, so both can read and write it.

Then update the producer and consumer code in task_method() to use the global when in userspace mode. In the producer (Engine), the write changes to:

      auto lockRes = _pcMutex.lock();
      if (!lockRes) {
        LOG_ERR("Engine producer: mutex lock failed");
      } else {
#if CONFIG_USERSPACE
        gIntegratedSharedData = data;
#else
        _pcSharedData = data;
#endif
        auto unlockRes = _pcMutex.unlock();
        // ...
      }

In the consumer (Display), the read changes to:

#if CONFIG_USERSPACE
          receivedData = gIntegratedSharedData;
#else
          receivedData = _pcSharedData;
#endif

Class members and userspace

A class member variable lives wherever the class instance is allocated. If the instance is on the stack or in an untagged global, user threads cannot access it. That is why shared data moves to a partition-tagged global.
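This can be checked on a host PC: the bytes of a member always lie inside the storage of the object that owns it, so the placement of the instance decides whether the member is reachable. The types Owner and Payload and the helper member_inside_instance() below are hypothetical and only serve the illustration.

```cpp
#include <cstddef>
#include <cstdint>

struct Payload {
  std::uint32_t sequence_number;
  std::uint32_t sensor_value;
};

struct Owner {
  int unrelated;
  Payload shared;  // a member has no storage of its own
};

// True if the bytes of `member` lie entirely inside the storage of `owner`.
bool member_inside_instance(const Owner& owner, const Payload& member) {
  const auto begin = reinterpret_cast<std::uintptr_t>(&owner);
  const auto end = begin + sizeof(Owner);
  const auto addr = reinterpret_cast<std::uintptr_t>(&member);
  return addr >= begin && addr + sizeof(Payload) <= end;
}
```

Whether the Owner instance is a global or a local variable, its shared member passes the check, while a standalone Payload object does not. Only a standalone global can be tagged into a partition independently of the object that uses it.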

c) Kernel object grants

User threads cannot use a kernel object (mutex, semaphore, barrier, …) unless they have been explicitly granted access. For the integrated variant, add grants for the producer-consumer primitives in CarSystem::start(), alongside the existing barrier grants:

#if CONFIG_USERSPACE
  // grant access to the barrier for each thread
  for (uint8_t taskIndex = 0; taskIndex < kNbrOfPeriodicTasks; taskIndex++) {
    k_tid_t tid = _threads[taskIndex].get_tid();
    _barrier.grant_access(tid);
  }
#if CONFIG_PROD_CONSUMER_INTEGRATED
  // Grant Engine and Display access to prod/consumer sync primitives
  _pcSemaphore.grant_access(_threads[kEngineTaskIndex].get_tid());
  _pcSemaphore.grant_access(_threads[kDisplayTaskIndex].get_tid());
  _pcMutex.grant_access(_threads[kEngineTaskIndex].get_tid());
  _pcMutex.grant_access(_threads[kDisplayTaskIndex].get_tid());
#endif
#endif

Without these grants, the first syscall from user mode would trigger a kernel oops — the kernel detects an unauthorized operation and terminates the thread.
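The grant mechanism behaves like a per-object access list. The sketch below models it on a host PC under simplifying assumptions: GuardedObject and ThreadId are hypothetical stand-ins, and a false return value takes the place of the kernel oops.

```cpp
#include <set>

using ThreadId = int;  // stand-in for k_tid_t in this host-side model

// Models one kernel object together with its per-thread access list.
class GuardedObject {
 public:
  void grant_access(ThreadId tid) { allowed_.insert(tid); }

  // Models the permission check made when a user thread issues a syscall on
  // the object: returns false where the real kernel would raise an oops.
  bool use_from(ThreadId tid) const { return allowed_.count(tid) != 0; }

 private:
  std::set<ThreadId> allowed_;
};
```

Granting access to the Engine and Display thread IDs makes use_from() succeed for those two threads only. Any other thread, for instance the Tire task, is rejected until it receives its own grant.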

1.4 — Declaring the CarSystem instance in the partition

Because the CarSystem object must be accessible from user threads, its global instance must reside in app_partition:

#if CONFIG_USERSPACE
APP_DATA static car_system::CarSystem carSystem;
#endif

In non-userspace mode, it can stay as a local variable in main():

#if !defined(CONFIG_USERSPACE)
  car_system::CarSystem carSystem;
#endif

1.5 — Build and verify

Build with the overlay:

west build -b nrf5340dk/nrf5340/cpuapp car_system --pristine \
    --extra-conf="prj_user_mode.conf"

You should see Engine and Display exchanging data exactly as before. The difference is invisible at the log level — but the MPU is now enforcing access rules.

Questions

  1. What would happen if you removed the grant_access() call for the mutex? Try it and describe the resulting error.
  2. What would happen if gIntegratedSharedData was declared without the APP_DATA tag? Where would it end up, and what fault would Display see?
  3. Why is CONFIG_LOG_MODE_MINIMAL=y used in prj_user_mode.conf?

Step 2 — Dedicated Memory Partitions

Step 1 proved that isolation from the kernel works. But Engine and Display still share app_partition: Engine can read and write Display’s private variables, and vice versa. In a real system with trust boundaries, this is not enough.

In this step, each thread gets its own private partition plus a small shared partition for the data they actually need to exchange:

┌─────────────────────────────────────────────────────────────┐
│                     Engine_domain (producer)                │
│  ┌────────┐ ┌────────┐ ┌────────┐ ┌──────────┐ ┌─────────┐  │
│  │z_libc  │ │zpp_lib │ │  app   │ │producer  │ │pc_shared│  │
│  │        │ │        │ │        │ │partition │ │partition│  │
│  └────────┘ └────────┘ └────────┘ └──────────┘ └─────────┘  │
└─────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────┐
│                     Display_domain (consumer)               │
│  ┌────────┐ ┌────────┐ ┌────────┐ ┌──────────┐ ┌─────────┐  │
│  │z_libc  │ │zpp_lib │ │  app   │ │consumer  │ │pc_shared│  │
│  │        │ │        │ │        │ │partition │ │partition│  │
│  └────────┘ └────────┘ └────────┘ └──────────┘ └─────────┘  │
└─────────────────────────────────────────────────────────────┘

  • Engine can access producer_partition (private) and pc_shared_partition (the exchanged SensorData), but not consumer_partition.
  • Display can access consumer_partition (private) and pc_shared_partition, but not producer_partition.
  • Both still need zpp_lib_partition, app_partition, and z_libc_partition for library code and common application globals.

A write from Engine into consumer_partition would trigger a MemManage fault — true isolation.
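The access rules listed above can be written down as a small host-side model before touching the real domain setup. In this sketch a domain is reduced to a set of partition names. Domain, make_engine_domain(), make_display_domain(), and can_access() are hypothetical helpers, not Zephyr APIs.

```cpp
#include <set>
#include <string>

// Host-side model: a memory domain is the set of partitions that its
// threads may touch.
using Domain = std::set<std::string>;

Domain make_engine_domain() {
  return {"z_libc", "zpp_lib", "app", "producer", "pc_shared"};
}

Domain make_display_domain() {
  return {"z_libc", "zpp_lib", "app", "consumer", "pc_shared"};
}

// True if a thread in `domain` may access data placed in `partition`.
bool can_access(const Domain& domain, const std::string& partition) {
  return domain.count(partition) != 0;
}
```

In this model can_access(make_engine_domain(), "consumer") is false, which corresponds to the MemManage fault on the target, while both domains return true for "pc_shared".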

2.1 — Add the Kconfig option

config PROD_CONSUMER_PARTITIONS
  bool "Use separate memory partitions for producer and consumer"
  depends on (PROD_CONSUMER || PROD_CONSUMER_INTEGRATED) && USERSPACE
  default n
  help
    When enabled, the producer and consumer each get their own memory
    partition (producer_partition, consumer_partition) plus a shared
    partition (pc_shared_partition) for the data exchanged between them.
    When disabled, both threads share app_partition.

Note the depends on (PROD_CONSUMER || PROD_CONSUMER_INTEGRATED) && USERSPACE — dedicated partitions work with both the standalone and integrated variants, and only make sense when userspace is already enabled.

2.2 — Define partition macros and the init function

userspace/prod_consumer_partitions.hpp:

#pragma once

#if CONFIG_PROD_CONSUMER_PARTITIONS

#include <zephyr/app_memory/app_memdomain.h>

// Macros for tagging globals into the correct partition
#define PRODUCER_DATA K_APP_DMEM(producer_partition)
#define PRODUCER_BSS  K_APP_BMEM(producer_partition)
#define CONSUMER_DATA K_APP_DMEM(consumer_partition)
#define CONSUMER_BSS  K_APP_BMEM(consumer_partition)
#define PC_SHARED_DATA K_APP_DMEM(pc_shared_partition)
#define PC_SHARED_BSS  K_APP_BMEM(pc_shared_partition)

namespace car_system {

void init_prod_consumer_domains(k_tid_t producer_tid, k_tid_t consumer_tid);

}  // namespace car_system

#endif

2.3 — Implement the domain setup

userspace/prod_consumer_partitions.cpp:

#if CONFIG_PROD_CONSUMER_PARTITIONS

#include "prod_consumer_partitions.hpp"
#include <zephyr/logging/log.h>
#include <zephyr/sys/libc-hooks.h>
#include "init_domain.hpp"

LOG_MODULE_DECLARE(car_system, CONFIG_APP_LOG_LEVEL);

extern struct k_mem_partition zpp_lib_partition;
extern struct k_mem_partition app_partition;

// Define the three partitions
K_APPMEM_PARTITION_DEFINE(producer_partition);
K_APPMEM_PARTITION_DEFINE(consumer_partition);
K_APPMEM_PARTITION_DEFINE(pc_shared_partition);

// Each partition must contain at least one variable so the linker gives it
// a non-zero size — Zephyr rejects zero-sized partitions in k_mem_domain_init.
PRODUCER_DATA volatile uint8_t producer_placeholder;
CONSUMER_DATA volatile uint8_t consumer_placeholder;

namespace car_system {

static struct k_mem_domain producer_domain;
static struct k_mem_domain consumer_domain;

void init_prod_consumer_domains(k_tid_t producer_tid, k_tid_t consumer_tid) {
  int ret;

  // Producer domain: zpp_lib + app + producer's own + shared + libc
  struct k_mem_partition* producer_parts[] = {
#if Z_LIBC_PARTITION_EXISTS
      &z_libc_partition,
#endif
      &zpp_lib_partition,
      &app_partition,
      &producer_partition,
      &pc_shared_partition};

  ret = k_mem_domain_init(&producer_domain, ARRAY_SIZE(producer_parts),
                          producer_parts);
  __ASSERT(ret == 0, "producer k_mem_domain_init failed %d", ret);
  ARG_UNUSED(ret);
  k_mem_domain_add_thread(&producer_domain, producer_tid);

  // Consumer domain: zpp_lib + app + consumer's own + shared + libc
  struct k_mem_partition* consumer_parts[] = {
#if Z_LIBC_PARTITION_EXISTS
      &z_libc_partition,
#endif
      &zpp_lib_partition,
      &app_partition,
      &consumer_partition,
      &pc_shared_partition};

  ret = k_mem_domain_init(&consumer_domain, ARRAY_SIZE(consumer_parts),
                          consumer_parts);
  __ASSERT(ret == 0, "consumer k_mem_domain_init failed %d", ret);
  ARG_UNUSED(ret);
  k_mem_domain_add_thread(&consumer_domain, consumer_tid);
}

}  // namespace car_system

#endif

Note that init_prod_consumer_domains() is the same function as in the standalone variant — it takes two generic thread IDs. The only difference is which threads we pass: Engine and Display instead of the dedicated Producer and Consumer threads.

Placeholder variables

The linker will give a partition zero size if no variable is tagged into it. k_mem_domain_init() rejects zero-sized partitions with an assertion failure. The producer_placeholder / consumer_placeholder variables exist solely to ensure the partitions are non-empty. In a real application, these would be replaced by actual per-thread globals.

Why still include app_partition?

The CarSystem object, logging buffers, and other common application globals live in app_partition. Both threads need read access to those. If you removed app_partition from the domain, the first access to any APP_DATA-tagged variable would fault.

2.4 — Place shared data in pc_shared_partition

In car_system.cpp, the shared data placement now depends on the configuration:

#if CONFIG_PROD_CONSUMER_INTEGRATED
#if CONFIG_PROD_CONSUMER_PARTITIONS
#include "userspace/prod_consumer_partitions.hpp"
PC_SHARED_DATA car_system::SensorData gIntegratedSharedData = {};
#elif CONFIG_USERSPACE
#include "userspace/init_domain.hpp"
APP_DATA car_system::SensorData gIntegratedSharedData = {};
#endif
#endif

When PROD_CONSUMER_PARTITIONS is enabled, gIntegratedSharedData lives in pc_shared_partition — accessible to both Engine and Display. Any thread-private data tagged with PRODUCER_DATA or CONSUMER_DATA would only be accessible to its respective thread.

2.5 — Wire domain initialization into start()

In CarSystem::start(), after granting kernel object access and before the threads drop to user mode, call the domain setup. This is added alongside the existing grants:

#if CONFIG_USERSPACE
  // grant access to the barrier for each thread
  for (uint8_t taskIndex = 0; taskIndex < kNbrOfPeriodicTasks; taskIndex++) {
    k_tid_t tid = _threads[taskIndex].get_tid();
    _barrier.grant_access(tid);
  }
#if CONFIG_PROD_CONSUMER_INTEGRATED
  // Grant Engine and Display access to prod/consumer sync primitives
  _pcSemaphore.grant_access(_threads[kEngineTaskIndex].get_tid());
  _pcSemaphore.grant_access(_threads[kDisplayTaskIndex].get_tid());
  _pcMutex.grant_access(_threads[kEngineTaskIndex].get_tid());
  _pcMutex.grant_access(_threads[kDisplayTaskIndex].get_tid());
#if CONFIG_PROD_CONSUMER_PARTITIONS
  // Move Engine into producer domain, Display into consumer domain
  init_prod_consumer_domains(_threads[kEngineTaskIndex].get_tid(),
                             _threads[kDisplayTaskIndex].get_tid());
#endif  // CONFIG_PROD_CONSUMER_PARTITIONS
#endif  // CONFIG_PROD_CONSUMER_INTEGRATED
#endif  // CONFIG_USERSPACE

Order matters

init_prod_consumer_domains() calls k_mem_domain_add_thread(), which removes the thread from its current domain (app_domain) and adds it to the new one. This must happen after init_domain() has run (so the common partitions already exist) and before the threads start executing in user mode.
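The "move" behaviour described above can be pictured with a tiny host-side model. This is a sketch under stated assumptions, not Zephyr code: MembershipModel, add_thread() and domain_of() are hypothetical names, and the integer thread IDs are placeholders. The only property it captures is that a thread belongs to one domain at a time, so a second add replaces the first membership.

```cpp
#include <cassert>
#include <map>
#include <string>

// Host-side model only: each thread ID maps to exactly one domain name.
class MembershipModel {
 public:
  // Mimics the behaviour described for k_mem_domain_add_thread():
  // joining a domain replaces the previous membership.
  void add_thread(int tid, const std::string& domain) {
    _domainOf[tid] = domain;
  }

  // Returns the current domain of a thread, or an empty string if unknown.
  std::string domain_of(int tid) const {
    const auto it = _domainOf.find(tid);
    return it == _domainOf.end() ? std::string() : it->second;
  }

 private:
  std::map<int, std::string> _domainOf;
};

// Usage sketch: the IDs 0 and 1 stand in for the Engine and Display threads.
void membership_example() {
  const int engine = 0;
  const int display = 1;
  MembershipModel model;

  // After init_domain(): both threads are in the common domain.
  model.add_thread(engine, "app_domain");
  model.add_thread(display, "app_domain");

  // After init_prod_consumer_domains(): each thread has moved.
  model.add_thread(engine, "producer_domain");
  model.add_thread(display, "consumer_domain");

  assert(model.domain_of(engine) == "producer_domain");
  assert(model.domain_of(display) == "consumer_domain");
}
```

Because the new membership replaces the old one, the per-thread domains have to contain every partition the thread still needs, which is why app_partition is listed in both.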

2.6 — Activate in prj_user_mode.conf

# Enable separate memory partitions for Engine/Display
CONFIG_PROD_CONSUMER_PARTITIONS=y

Build and flash as before:

west build -b nrf5340dk/nrf5340/cpuapp car_system --pristine \
    --extra-conf="prj_user_mode.conf"

The output should be identical to Step 1. The difference is purely in the fault isolation: Engine and Display can no longer corrupt each other’s private data.

Questions

  1. Draw the MPU region map for the Engine thread showing which partitions are accessible and which are not.
  2. What happens if Engine tries to write to a variable tagged with CONSUMER_DATA? At what level is the fault caught (hardware or software)?
  3. In the diagram above, both domains include app_partition. Could you remove it from one of them? What would break?
  4. Why are producer_domain and consumer_domain declared as static local variables in prod_consumer_partitions.cpp rather than as globals? What would change if they were global?
  5. Tire and Rain threads are not moved into custom domains. What memory can they access?

Comparison: Standalone vs. Integrated

| Aspect | Standalone ProdConsumer | Integrated Engine/Display |
| --- | --- | --- |
| Extra threads | 2 (Producer + Consumer) | 0 (reuses existing tasks) |
| Thread periods | 500 ms / 500 ms | 50 ms (Engine) / 125 ms (Display) |
| Consumer wait | acquire() (blocking) | try_acquire() (non-blocking) |
| Sample delivery | Every sample consumed | Latest value only; some samples skipped |
| Code location | prod_consumer.hpp/cpp | Inline in car_system.hpp/cpp |
| Kconfig | CONFIG_PROD_CONSUMER | CONFIG_PROD_CONSUMER_INTEGRATED |
| Userspace partitions | Same prod_consumer_partitions.* | Same prod_consumer_partitions.* |

The integrated variant is more representative of a real embedded system where dedicated producer-consumer threads are a luxury — tasks have existing responsibilities and the data exchange is an additional concern layered on top.
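To get a feeling for the "latest value" delivery in the integrated variant, here is a minimal host-side simulation. It is a sketch, not the codelab's implementation: DeliveryStats and simulate_latest_value() are hypothetical names, the semaphore is assumed to be binary, and when both tasks are due in the same millisecond the producer is assumed to run first.

```cpp
#include <cassert>
#include <cstdint>

struct DeliveryStats {
  std::uint32_t produced;         // samples written by the producer
  std::uint32_t consumed;         // successful non-blocking acquires
  std::uint32_t lastConsumedSeq;  // sequence number of the last sample read
};

// Simulates durationMs milliseconds in 1 ms steps.
// Assumptions (not from the codelab): binary semaphore, producer runs
// before the consumer when both are due in the same millisecond.
DeliveryStats simulate_latest_value(std::uint32_t durationMs,
                                    std::uint32_t producerPeriodMs,
                                    std::uint32_t consumerPeriodMs) {
  assert(producerPeriodMs > 0 && consumerPeriodMs > 0);
  DeliveryStats stats{0, 0, 0};
  std::uint32_t sharedSeq = 0;  // stands in for the shared SensorData
  bool semaphoreGiven = false;  // binary semaphore state

  for (std::uint32_t t = 1; t <= durationMs; ++t) {
    if (t % producerPeriodMs == 0) {
      sharedSeq = ++stats.produced;  // overwrite the previous sample
      semaphoreGiven = true;         // a second give changes nothing
    }
    if (t % consumerPeriodMs == 0 && semaphoreGiven) {
      semaphoreGiven = false;  // the non-blocking acquire succeeds
      stats.lastConsumedSeq = sharedSeq;
      ++stats.consumed;
    }
  }
  return stats;
}
```

With the integrated periods, simulate_latest_value(1000, 50, 125) reports 20 produced samples and 8 consumed ones, so Display sees only the most recent value at each of its checks. With the standalone periods, simulate_latest_value(1000, 500, 500) consumes both produced samples.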

Summary

| Aspect | Step 0: supervisor | Step 1: shared domain | Step 2: per-thread domains |
| --- | --- | --- | --- |
| Thread privilege | Supervisor | User | User |
| Memory isolation | None | Threads vs. kernel | Threads vs. kernel and vs. each other |
| Shared data location | Class member (_pcSharedData) | APP_DATA global | PC_SHARED_DATA global |
| Private data safety | Trust-based | Trust-based | MPU-enforced |
| Kernel object access | Implicit | Explicit grants | Explicit grants |
| Fault on bad access | Silent corruption | MemManage fault | MemManage fault |

The progression from supervisor → shared domain → per-thread domains is a practical pattern for introducing userspace incrementally:

  1. Get the application working in supervisor mode first (Step 0).
  2. Flip the userspace switch and fix the resulting build errors and runtime faults (missing grants, data placement) (Step 1).
  3. When isolation between threads matters, split into per-thread domains (Step 2).

Going beyond / References