Robust Design Patterns - Part 2 - Userspace Isolation
Introduction
In this codelab, we address the problem of unresponsive applications. An application is unresponsive when a program installed on an MCU becomes unavailable for some reason (e.g. a gobbler task, a deadlock, …). So far, every thread in our application runs in supervisor mode, the CPU’s most privileged level. Any thread can read or write any memory address, access any peripheral, and call any kernel function. That is convenient for development, but it means a single bug in one thread can corrupt data used by another, silently overwrite kernel structures, or poke hardware registers it has no business touching.
Zephyr RTOS offers userspace — an optional feature that leverages the CPU’s Memory Protection Unit (MPU) to restrict what each thread can access at the hardware level. A userspace thread can only touch memory regions it has been explicitly granted access to; any other access triggers an immediate CPU fault instead of silent corruption.
What you’ll build
In this codelab, you will turn two existing periodic tasks — Engine
(producer) and Display (consumer) — into a producer-consumer pair
exchanging SensorData via shared memory, and progressively increase the use
of userspace protection:
- Step 0 — Producer/Consumer in supervisor mode: both threads run in supervisor mode and share data through a common variable.
- Step 1 — Single shared domain (app_partition): both threads run in user mode and share the same global memory partition.
- Step 2 — Dedicated per-thread domains: Engine and Display each get their own private memory partition, with only the exchanged data placed in a shared partition accessible to both.
This approach reuses the existing CarSystem periodic tasks. Engine
produces sensor data at the end of its 50 ms computation cycle, and Display
consumes it using a non-blocking semaphore check within its 125 ms cycle.
What you’ll learn
- What supervisor mode vs. user mode means on an ARM Cortex-M33 with MPU.
- The core Zephyr RTOS abstractions: memory partitions, memory domains, and kernel object permissions.
- How to add producer-consumer communication between existing periodic threads without creating new ones.
- How to move the subsystem into userspace incrementally — first with a single partition, then with fine-grained per-thread isolation.
- The practical constraints: what breaks when you flip the switch, and how to fix it.
- How to verify that isolation actually works (a rogue write faults instead of corrupting).
What you’ll need
- You need to have finished the Digging into Zephyr RTOS codelab.
- You need to have completed the Scheduling of periodic tasks codelab.
Disable task watchdog subsystem
Make sure you have disabled the task watchdog subsystem prior to starting this codelab. Failing to do so will result in crashes.
Update zpp_lib
Update zpp_lib to the latest version: go to deps/zpp_lib and
run `git checkout tags/v1.0` to get the latest version (v1.0). Failing to do so will
result in compilation errors.
You should also modify your west.yml file. If you prefer to perform a west update
after modifying the west.yml file, do NOT forget to reapply the patch to the
Zephyr RTOS library with git cherry-pick 4a3c5ed1f6f3ae3351fe48f11382c0a94e0aea02.
Concepts: why Userspace?
The problem with supervisor mode
In a typical bare-metal or RTOS application every thread runs at the CPU’s highest privilege level. This means:
| Capability | Risk |
|---|---|
| Read/write any RAM address | One thread’s bug silently corrupts another’s data |
| Access any peripheral register | Accidental write to a timer or UART can brick the system |
| Call any kernel API | A buffer overflow can overwrite scheduler structures |
In safety-critical projects this is unacceptable: you need fault containment — the guarantee that one component’s failure cannot propagate to another.
How the MPU enforces isolation
ARM Cortex-M33 (the core in the nRF5340) includes a Memory Protection Unit (MPU) — a hardware block that sits between the CPU and the bus. Before every memory access, the MPU checks:
- Is the address within a region the current thread is allowed to access?
- Does the access type (read / write / execute) match the region’s permissions?
If the check fails, the CPU raises a MemManage fault — an immediate, synchronous exception. The offending instruction never completes.
┌──────────────┐         ┌─────────┐        ┌──────────────┐
│   CPU core   │──addr──▶│   MPU   │──ok──▶ │  Bus / SRAM  │
│  (thread A)  │         │ regions │        │              │
└──────────────┘         │  check  │        └──────────────┘
                         │         │
                         │ FAULT ◀─┘ (if no matching region)
The key insight: the MPU is reconfigured on every context switch. When the scheduler switches from Thread A to Thread B, it loads Thread B’s region table into the MPU. Thread B literally cannot see Thread A’s private memory — it is as if it did not exist.
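The two checks can be pictured with a small host-side model. This is only a sketch under simplifying assumptions: the Region struct, mpu_allows() and the two region tables are hypothetical names invented for illustration, the addresses are made up, and a real MPU performs the check in hardware rather than in a loop. Passing a different table to mpu_allows() plays the role of the context switch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Host-side model only: a real MPU performs this check in hardware.
enum class Access { Read, Write, Execute };

struct Region {
    std::uintptr_t base;  // first address covered by the region
    std::size_t size;     // length in bytes
    bool readable;
    bool writable;
    bool executable;
};

// Returns true if a region covers the address and permits the access type.
// A false result corresponds to the fault described in the text.
bool mpu_allows(const std::vector<Region>& regions, std::uintptr_t addr, Access access) {
    for (const Region& r : regions) {
        assert(r.size > 0);  // this model expects non-empty regions
        const bool in_range = addr >= r.base && (addr - r.base) < r.size;
        if (!in_range) {
            continue;
        }
        switch (access) {
            case Access::Read:    return r.readable;
            case Access::Write:   return r.writable;
            case Access::Execute: return r.executable;
        }
    }
    return false;  // no matching region: fault
}

// Hypothetical per-thread region tables; the addresses are invented.
std::vector<Region> thread_a_regions() {
    return {{0x20000000u, 0x1000u, true, true, false}};
}

std::vector<Region> thread_b_regions() {
    return {{0x20001000u, 0x1000u, true, true, false}};
}
```

With these tables, a write to 0x20000010 is allowed when checked against thread A’s table and rejected against thread B’s, which is the model’s equivalent of the fault.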
Zephyr’s memory model
Zephyr RTOS introduces three abstractions that map onto the MPU:
| Abstraction | What it is | Analogy |
|---|---|---|
| Memory Partition (k_mem_partition) | A named, contiguous memory region with fixed permissions | A room in a building |
| Memory Domain (k_mem_domain) | A set of partitions that a thread (or group of threads) can access | A key card that opens specific rooms |
| Kernel Object Permissions | Per-object ACL (Access Control List) granting a thread the right to use a specific kernel object (mutex, semaphore, queue, …) | A signed permission slip for a specific resource |
A thread running in user mode can only access:
- The partitions in its memory domain
- The kernel objects it has been explicitly granted access to
Everything else triggers a fault.
Supervisor mode is always available
Code running in ISR context always executes in supervisor mode. Threads
only run in user mode if they were created with the K_USER option (or
the equivalent zpp_lib flag); all other threads remain in supervisor
mode regardless of their priority. Both cooperative and preemptive
threads can be user mode threads — the privilege level is determined by
the creation flag, not the priority class
(see https://docs.zephyrproject.org/latest/kernel/services/threads/index.html#thread-options).
Step 0 — Producer/Consumer in Supervisor Mode
In this codelab we integrate the producer-consumer roles directly into the
existing CarSystem periodic tasks:
- Engine (50 ms period, High priority) → producer: writes SensorData at the end of each computation cycle.
- Display (125 ms period, AboveNormal priority) → consumer: attempts to read SensorData using a non-blocking semaphore check.
Because Engine runs at 50 ms while Display runs at 125 ms, Engine produces data roughly 2-3 times per Display cycle. Display picks up the latest sample each time it wakes.
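The "2-3 times" figure follows from integer arithmetic on the two periods. The sketch below assumes an idealized timeline in which Engine posts exactly at multiples of 50 ms and Display wakes exactly at multiples of 125 ms; computation time and jitter are ignored, and the function names are illustrative only.

```cpp
#include <cstdint>

// Periods taken from the text; computation time and jitter are ignored.
constexpr std::uint32_t kEnginePeriodMs = 50;
constexpr std::uint32_t kDisplayPeriodMs = 125;

// Number of Engine posts that fall in the window (t0, t1], assuming Engine
// posts at every multiple of its period (an idealization for illustration).
constexpr std::uint32_t engine_posts_between(std::uint32_t t0_ms, std::uint32_t t1_ms) {
    return t1_ms / kEnginePeriodMs - t0_ms / kEnginePeriodMs;
}

// Number of Engine posts seen during the k-th Display cycle (k starts at 1).
constexpr std::uint32_t posts_in_display_cycle(std::uint32_t k) {
    return engine_posts_between((k - 1) * kDisplayPeriodMs, k * kDisplayPeriodMs);
}

static_assert(posts_in_display_cycle(1) == 2, "first Display cycle sees 2 posts");
static_assert(posts_in_display_cycle(2) == 3, "second Display cycle sees 3 posts");
```

Under these assumptions the count alternates: 2 posts in odd-numbered Display cycles and 3 in even-numbered ones, for an average of 2.5.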
┌──────────────────────────────────────────────────────────────────┐
│ Supervisor Mode (all RAM visible) │
│ │
│ Engine thread (50ms) Display thread (125ms) │
│ ┌─────────────────────┐ ┌──────────────────────┐ │
│ │ computation (10ms) │ │ computation (15ms) │ │
│ │ fill SensorData │──mutex──▶│ try_acquire sema │ │
│ │ release semaphore │──sema───▶│ read SensorData │ │
│ └─────────────────────┘ └──────────────────────┘ │
│ │
│ _pcSharedData (class member) │
└──────────────────────────────────────────────────────────────────┘
0.1 — Add the Kconfig option
Create a PROD_CONSUMER_INTEGRATED option in the project’s Kconfig so the
feature can be compiled in or out:
config PROD_CONSUMER_INTEGRATED
bool "Engine/Display producer-consumer demo"
depends on PERIODIC_TASKS
default n
help
When enabled, the Engine periodic task acts as a producer and the
Display periodic task acts as a consumer. Engine writes SensorData
every period and Display reads it, demonstrating producer-consumer
with existing threads rather than dedicated ones. Works in both
supervisor and userspace modes.
Enable it in prj.conf:
CONFIG_PERIODIC_TASKS=y
CONFIG_PROD_CONSUMER_INTEGRATED=y
0.2 — Add members to CarSystem (car_system.hpp)
The producer-consumer primitives live directly in CarSystem. Add the
SensorData struct and the required members:
// car_system.hpp — add the include
#if CONFIG_PROD_CONSUMER_INTEGRATED
#include "zpp_include/mutex.hpp"
#endif
Define SensorData before the class (inside the car_system namespace):
#if CONFIG_PROD_CONSUMER_INTEGRATED
// Data structure shared between Engine (producer) and Display (consumer)
struct SensorData {
uint32_t sequence_number;
std::chrono::milliseconds timestamp;
uint32_t sensor_value;
};
#endif // CONFIG_PROD_CONSUMER_INTEGRATED
Add the following private members at the end of the CarSystem class:
#if CONFIG_PROD_CONSUMER_INTEGRATED
// Semaphore initial count
static constexpr int kPcSemaphoreInitialCount = 0;
// Max semaphore count: accommodates Engine's higher frequency (50ms) vs Display (125ms)
static constexpr int kPcSemaphoreMaxCount = 5;
// Semaphore for Engine (producer) → Display (consumer) signalling.
zpp_lib::Semaphore _pcSemaphore{kPcSemaphoreInitialCount, kPcSemaphoreMaxCount};
// Mutex to protect access to the shared sensor data
zpp_lib::Mutex _pcMutex;
// Sequence counter for produced data
std::atomic<uint32_t> _pcSequenceCounter{0};
#if !CONFIG_USERSPACE
// Shared data between Engine and Display (class member in supervisor mode)
SensorData _pcSharedData{};
#endif // !CONFIG_USERSPACE
// Scaling factor applied to sequence number to produce sensor value
static constexpr uint32_t kSensorValueScale = 42;
#endif // CONFIG_PROD_CONSUMER_INTEGRATED
!!! failure “Workqueue must operate in kernel mode under Zephyr RTOS”
Running a k_work_q in user mode is not supported under Zephyr RTOS.
Therefore, the thread that executes the k_work_queue_run function
must be created as a kernel thread. The same applies to the use of zpp_lib::WorkQueue.
!!! failure “Tracing and user mode are incompatible under Zephyr RTOS”
A thread created in user mode cannot invoke SystemView primitives, such as
SEGGER_SYSVIEW_MarkStart() and SEGGER_SYSVIEW_MarkStop(). Doing so results in an MPU fault,
because these functions access protected memory without going through system calls.
Therefore, tracing a user thread is not possible.
Why max = 5 on the semaphore?
Engine produces data every 50 ms while Display only consumes every 125 ms.
Between two Display wakes, Engine may have posted 2-3 items. The semaphore
max = 5 provides headroom for a Display cycle that is temporarily delayed.
Note that the code below calls try_acquire() only once per Display cycle, so
the semaphore acts as a "data available" flag rather than as an exact count
of unread samples. The non-blocking try_acquire() lets Display check for
data without disrupting its own periodic schedule.
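The effect of the rate mismatch on the semaphore count can be explored with a host-side simulation. Everything in this sketch is an assumption made for illustration: BoundedCounter is a stand-in and not zpp_lib::Semaphore, release() is assumed to report false once the count is at its maximum, and the timeline is idealized (Engine posts at multiples of 50 ms, Display polls once at multiples of 125 ms, Engine first when both coincide).

```cpp
#include <cassert>
#include <cstdint>

// Host-side stand-in for a bounded counting semaphore. This is NOT
// zpp_lib::Semaphore; what release() reports at the maximum is an assumption.
class BoundedCounter {
public:
    BoundedCounter(std::uint32_t initial, std::uint32_t max) : count_(initial), max_(max) {
        assert(initial <= max);
    }
    // Returns false when the count is already at its maximum (token dropped).
    bool release() {
        if (count_ == max_) { return false; }
        ++count_;
        return true;
    }
    // Non-blocking: takes one token if one is available.
    bool try_acquire() {
        if (count_ == 0) { return false; }
        --count_;
        return true;
    }
    std::uint32_t count() const { return count_; }

private:
    std::uint32_t count_;
    std::uint32_t max_;
};

struct SimResult {
    std::uint32_t final_count;
    std::uint32_t dropped_releases;
    std::uint32_t consumed;
};

// Idealized timeline: Engine releases every 50 ms, Display polls once every
// 125 ms. When both fall on the same millisecond, Engine goes first.
SimResult simulate(std::uint32_t duration_ms) {
    BoundedCounter sem(0, 5);
    SimResult result{0, 0, 0};
    for (std::uint32_t t = 1; t <= duration_ms; ++t) {
        if (t % 50 == 0 && !sem.release()) { ++result.dropped_releases; }
        if (t % 125 == 0 && sem.try_acquire()) { ++result.consumed; }
    }
    result.final_count = sem.count();
    return result;
}
```

In this model, one simulated second ends with 8 samples consumed, 8 releases refused at the maximum, and a count of 4. Because Display takes at most one token per cycle while Engine posts 2-3, the count reaches 5 after 350 ms and stays close to it; the first refused release occurs at 450 ms. How the real semaphore behaves at its maximum should be checked against the zpp_lib sources.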
0.3 — Add producer/consumer logic to task_method() (car_system.cpp)
The key insight: Engine and Display already run in task_method(). We add
the producer/consumer logic after the regular computation, but still
within the same periodic invocation.
In car_system.cpp, add the following block inside task_method(), after the
test recorder stop and before the period calculation:
#if CONFIG_PROD_CONSUMER_INTEGRATED
// Engine = producer: write sensor data after computation
if (taskIndex == kEngineTaskIndex) {
SensorData data;
data.sequence_number = _pcSequenceCounter.fetch_add(1);
data.timestamp = std::chrono::duration_cast<std::chrono::milliseconds>(
zpp_lib::Time::get_uptime());
data.sensor_value = data.sequence_number * kSensorValueScale;
auto lockRes = _pcMutex.lock();
if (!lockRes) {
LOG_ERR("Engine producer: mutex lock failed");
} else {
_pcSharedData = data;
auto unlockRes = _pcMutex.unlock();
if (!unlockRes) {
LOG_ERR("Engine producer: mutex unlock failed");
}
}
auto semRes = _pcSemaphore.release();
if (!semRes) {
LOG_ERR("Engine producer: semaphore release failed");
}
LOG_INF("Engine produced data (seq=%u, value=%u)",
data.sequence_number, data.sensor_value);
}
// Display = consumer: try to read sensor data (non-blocking)
if (taskIndex == kDisplayTaskIndex) {
auto semRes = _pcSemaphore.try_acquire();
if (semRes) {
SensorData receivedData;
auto lockRes = _pcMutex.lock();
if (!lockRes) {
LOG_ERR("Display consumer: mutex lock failed");
} else {
receivedData = _pcSharedData;
auto unlockRes = _pcMutex.unlock();
if (!unlockRes) {
LOG_ERR("Display consumer: mutex unlock failed");
}
LOG_INF("Display consumed data (seq=%u, timestamp=%lld ms, value=%u)",
receivedData.sequence_number, receivedData.timestamp.count(),
receivedData.sensor_value);
}
}
}
#endif // CONFIG_PROD_CONSUMER_INTEGRATED
Why try_acquire() instead of acquire()?
Display is a periodic task with its own deadline. A blocking acquire()
would stall Display until Engine produces — violating Display’s period
contract. The non-blocking try_acquire() simply checks whether data is
available: if yes, it consumes it; if not, Display continues to its next
period unchanged.
0.4 — Build and verify
west build -b nrf5340dk/nrf5340/cpuapp car_system --pristine
You should see log output like:
Engine produced data (seq=0, value=0)
Engine produced data (seq=1, value=42)
Display consumed data (seq=1, timestamp=92 ms, value=42)
Engine produced data (seq=2, value=84)
Engine produced data (seq=3, value=126)
Display consumed data (seq=3, timestamp=192 ms, value=126)
Notice that Display does not consume every sample — it picks up the latest available one, which is the expected behavior given the period mismatch.
At this point everything works, but there is no memory protection whatsoever. A bug in Display could overwrite Engine’s stack, corrupt kernel data structures or poke random peripheral registers — and the system would silently continue with a corrupted state.
Questions
- Why does Display use a non-blocking try_acquire() while the standalone ProdConsumer consumer uses a blocking acquire()?
- How many Engine samples does Display miss between two consecutive Display invocations? Is this a problem?
- What would happen if you removed the mutex and let both threads access _pcSharedData without synchronization? Would the system crash immediately?
Step 1 — Userspace with a Shared Domain
With the supervisor-mode integrated producer/consumer working (Step 0), you can
now flip the userspace switch. In this step, every application thread (including
Engine and Display) belongs to one common memory domain (app_domain)
containing:
- z_libc_partition: the C library’s internal globals
- zpp_lib_partition: the zpp_lib library globals
- app_partition: the application’s own globals
Both Engine and Display see the same set of memory. There is no isolation between them yet, but they are isolated from the kernel and from peripherals they have not been granted access to.
┌──────────────────────────────────────────┐
│ app_domain │
│ ┌───────────┐ ┌──────────┐ ┌─────────┐ │
│ │ z_libc │ │ zpp_lib │ │ app │ │
│ │ partition │ │ partition│ │partition│ │
│ └───────────┘ └──────────┘ └─────────┘ │
│ │
│ Engine thread ◀────────▶ Display thread │
│ (both share everything) │
└──────────────────────────────────────────┘
1.1 — Enable userspace in the build
Userspace is activated via an overlay configuration file that is passed at
build time with --extra-conf.
Create prj_user_mode.conf in your car_system:
# enable user mode
CONFIG_USERSPACE=y
CONFIG_LOG_MODE_MINIMAL=y
# Integrated P/C adds 1 semaphore + 1 mutex (default pool size may not be enough)
CONFIG_ZPP_SEMAPHORE_POOL_SIZE=4
# Enable INFO-level app logging so producer/consumer output is visible
CONFIG_APP_LOG_LEVEL_INF=y
Pool sizes
Userspace objects (semaphores, mutexes, barriers, …) are allocated from
static pools whose sizes must be configured at build time. If you get a
boot-time assertion such as __ASSERT zpp semaphore pool exhausted, increase
the pool size in the conf file.
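The idea of a fixed-size pool can be sketched on the host as follows. StaticPool, FakeSemaphore and allocate_or_assert() are hypothetical names for this illustration; the actual pool implementation in zpp_lib is not shown in this codelab and may differ.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical fixed-capacity pool: all storage is reserved at build time,
// so the capacity N cannot grow at run time.
template <typename T, std::size_t N>
class StaticPool {
public:
    // Returns a pointer to an unused slot, or nullptr when all N slots are taken.
    T* allocate() {
        if (used_ == N) { return nullptr; }
        return &slots_[used_++];
    }
    std::size_t used() const { return used_; }
    static constexpr std::size_t capacity() { return N; }

private:
    std::array<T, N> slots_{};
    std::size_t used_ = 0;
};

// Placeholder object type standing in for a pooled kernel object.
struct FakeSemaphore {
    int count = 0;
};

// Mirrors the failure mode described above: running out of slots trips an
// assertion instead of returning a usable object.
template <typename T, std::size_t N>
T& allocate_or_assert(StaticPool<T, N>& pool) {
    T* slot = pool.allocate();
    assert(slot != nullptr && "pool exhausted: increase the pool size");
    return *slot;
}
```

A pool declared as StaticPool<FakeSemaphore, 2> hands out two slots and then returns nullptr on the third request, which is the situation the pool-size option is meant to prevent.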
1.2 — Set up the memory domain
The domain must be created from supervisor context — typically early in
main(), before any user thread is started.
userspace/init_domain.hpp:
#pragma once
#if CONFIG_USERSPACE
// zephyr
#include <zephyr/app_memory/app_memdomain.h>
// std
#include <cstdint>
namespace car_system {
#define APP_DATA K_APP_DMEM(app_partition)
extern struct k_mem_domain app_domain;
extern void init_domain();
} // namespace car_system
#else // CONFIG_USERSPACE
#define APP_DATA
#endif // CONFIG_USERSPACE
userspace/init_domain.cpp:
#if CONFIG_USERSPACE
#include "init_domain.hpp"
// zephyr
#include <zephyr/logging/log.h>
#include <zephyr/sys/libc-hooks.h>
// local
LOG_MODULE_DECLARE(car_system, CONFIG_APP_LOG_LEVEL);
/* Define zpp_lib_partition, where all globals for the zpp_lib library will be routed.
* The partition starting address and size are populated by build system
* and linker magic.
*/
K_APPMEM_PARTITION_DEFINE(zpp_lib_partition);
/* Define app_partition, where all globals for this app will be routed.
* The partition starting address and size are populated by build system
* and linker magic.
*/
K_APPMEM_PARTITION_DEFINE(app_partition);
namespace car_system {
/* Memory domain for application, set up and installed in main() */
struct k_mem_domain app_domain;
void init_domain() {
LOG_INF("ZPP_LIB partition: %p %zu",
(void*)zpp_lib_partition.start,
(size_t)zpp_lib_partition.size);
LOG_INF(
"APP partition: %p %zu", (void*)app_partition.start, (size_t)app_partition.size);
#ifdef Z_LIBC_PARTITION_EXISTS
LOG_INF("libc partition: %p %zu",
(void*)z_libc_partition.start,
(size_t)z_libc_partition.size);
#endif // Z_LIBC_PARTITION_EXISTS
/* Initialize a memory domain with the specified partitions
* and add this thread to this domain. We need access to our own
* partition, the shared partition, and any common libc partition
* if it exists.
*/
struct k_mem_partition* partitions[] = {
#if Z_LIBC_PARTITION_EXISTS
&z_libc_partition,
#endif // Z_LIBC_PARTITION_EXISTS
&zpp_lib_partition,
&app_partition};
auto ret = k_mem_domain_init(&app_domain, ARRAY_SIZE(partitions), partitions);
__ASSERT(ret == 0, "k_mem_domain_init failed %d", ret);
ARG_UNUSED(ret);
k_mem_domain_add_thread(&app_domain, k_current_get());
}
} // namespace car_system
#endif // CONFIG_USERSPACE
And in main():
#if CONFIG_USERSPACE
car_system::init_domain();
#endif
K_APPMEM_PARTITION_DEFINE
This macro asks the linker to collect every global variable tagged with
K_APP_DMEM(partition) or K_APP_BMEM(partition) into a dedicated, MPU-aligned
section. At runtime the partition’s start and size fields point to that
section. You never allocate memory manually — the build system does it for
you.
The two tagging macros differ only in which linker section they target:
- K_APP_DMEM(partition): places an initialized variable (.data) into the partition. Use this for globals that have a non-zero initial value, e.g. K_APP_DMEM(app_partition) int counter = 42;.
- K_APP_BMEM(partition): places a zero-initialized variable (.bss) into the partition. Use this for globals whose initial value is zero, e.g. K_APP_BMEM(app_partition) int counter;.
The distinction mirrors the standard .data vs. .bss split: .data
variables consume space in the binary image (their initial values must be
stored in flash), while .bss variables only consume RAM and are
zero-filled at boot.
1.3 — Adapt the CarSystem for userspace
Three things change compared to supervisor mode:
a) Thread creation flag
The periodic task threads that should drop to user mode must be created with the
user-mode flag (the third bool parameter in zpp_lib::Thread). This is
already the case in the existing CarSystem constructor when CONFIG_USERSPACE
is enabled:
CarSystem::CarSystem()
#if CONFIG_PERIODIC_TASKS && CONFIG_USERSPACE
: _threads{zpp_lib::Thread(_taskInfos[kEngineTaskIndex]._priority,
_taskInfos[kEngineTaskIndex]._szTaskName,
true),
zpp_lib::Thread(_taskInfos[kDisplayTaskIndex]._priority,
_taskInfos[kDisplayTaskIndex]._szTaskName,
true),
// ... Tire and Rain also with true ...
}
The true flag means “after thread_start(), drop this thread to
unprivileged mode”. The thread begins executing in supervisor mode (so that
start() can set up grants), but transitions to user mode before the thread
function body runs.
b) Shared data placement
In supervisor mode, _pcSharedData lives as a private class member — any
thread can access any address. In user mode, the variable must be placed in
a partition that all accessing threads share.
The member is already guarded so it only exists in supervisor mode:
// car_system.hpp — inside the class
#if !CONFIG_USERSPACE
SensorData _pcSharedData{};
#endif
In car_system.cpp, declare a partition-tagged global that replaces the class
member when userspace is enabled (place this at file scope):
#if CONFIG_PROD_CONSUMER_INTEGRATED
#if CONFIG_USERSPACE
#include "userspace/init_domain.hpp"
APP_DATA car_system::SensorData gIntegratedSharedData = {};
#endif
#endif
The APP_DATA macro expands to K_APP_DMEM(app_partition), which tells the
linker to place this variable in the app_partition section. Both threads are
in app_domain which includes app_partition, so both can read and write it.
Then update the producer and consumer code in task_method() to use the global
when in userspace mode. In the producer (Engine), the write changes to:
auto lockRes = _pcMutex.lock();
if (!lockRes) {
LOG_ERR("Engine producer: mutex lock failed");
} else {
#if CONFIG_USERSPACE
gIntegratedSharedData = data;
#else
_pcSharedData = data;
#endif
auto unlockRes = _pcMutex.unlock();
// ...
}
In the consumer (Display), the read changes to:
#if CONFIG_USERSPACE
receivedData = gIntegratedSharedData;
#else
receivedData = _pcSharedData;
#endif
Class members and userspace
A class member variable lives wherever the class instance is allocated.
If the instance is on the stack or in an untagged global, user threads
cannot access it. That is why shared data moves to a partition-tagged
global.
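The sketch below models this on the host with plain address arithmetic. Partition, in_partition(), Holder and the 64-byte buffer are illustrative assumptions; on the target, the partition bounds come from the linker and the check is done by the MPU, not by software.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

// Host-side model of a partition as an address range.
struct Partition {
    std::uintptr_t start;
    std::size_t size;
};

// True if the object [ptr, ptr + len) lies entirely inside the partition.
bool in_partition(const Partition& p, const void* ptr, std::size_t len) {
    assert(ptr != nullptr);
    const std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(ptr);
    return addr >= p.start && len <= p.size && (addr - p.start) <= (p.size - len);
}

// A class with one member: the member's address depends entirely on where
// the enclosing instance is placed.
struct Holder {
    int shared_value = 0;
};

// Buffer standing in for the memory that a partition tag would select.
alignas(Holder) unsigned char g_partition_storage[64];
static_assert(sizeof(Holder) <= sizeof(g_partition_storage), "buffer too small");

Partition demo_partition() {
    return {reinterpret_cast<std::uintptr_t>(g_partition_storage),
            sizeof(g_partition_storage)};
}

// Constructs a Holder inside the partition buffer (placement new).
Holder* make_holder_in_partition() {
    return new (g_partition_storage) Holder();
}
```

A Holder built with make_holder_in_partition() has its shared_value inside demo_partition(), whereas the same member of a Holder declared as a local variable lies outside it. The member itself is identical in both cases; only the location of the instance differs.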
c) Kernel object grants
User threads cannot use a kernel object (mutex, semaphore, barrier, …) unless
they have been explicitly granted access. For the integrated variant, add
grants for the producer-consumer primitives in CarSystem::start(), alongside
the existing barrier grants:
#if CONFIG_USERSPACE
// grant access to the barrier for each thread
for (uint8_t taskIndex = 0; taskIndex < kNbrOfPeriodicTasks; taskIndex++) {
k_tid_t tid = _threads[taskIndex].get_tid();
_barrier.grant_access(tid);
}
#if CONFIG_PROD_CONSUMER_INTEGRATED
// Grant Engine and Display access to prod/consumer sync primitives
_pcSemaphore.grant_access(_threads[kEngineTaskIndex].get_tid());
_pcSemaphore.grant_access(_threads[kDisplayTaskIndex].get_tid());
_pcMutex.grant_access(_threads[kEngineTaskIndex].get_tid());
_pcMutex.grant_access(_threads[kDisplayTaskIndex].get_tid());
#endif
#endif
Without these grants, the first syscall from user mode would trigger a kernel oops — the kernel detects an unauthorized operation and terminates the thread.
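Conceptually, each kernel object carries a list of threads allowed to use it, and the kernel consults that list on every system call. The following host-side sketch models this with a bitset; KernelObjectModel, syscall_allowed(), the thread indices and the limit of 8 threads are assumptions for illustration and do not reflect Zephyr’s internal data structures.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

// Arbitrary limit chosen for this sketch.
constexpr std::size_t kMaxThreads = 8;

// Simplified host-side model of per-object permissions.
class KernelObjectModel {
public:
    // Supervisor-side operation: allow the given thread to use this object.
    void grant_access(std::size_t thread_index) {
        assert(thread_index < kMaxThreads);
        permitted_.set(thread_index);
    }
    // Models the check made when a user-mode thread issues a system call
    // on this object.
    bool syscall_allowed(std::size_t thread_index) const {
        return thread_index < kMaxThreads && permitted_.test(thread_index);
    }

private:
    std::bitset<kMaxThreads> permitted_;
};

// Hypothetical thread indices, one per periodic task.
enum : std::size_t { kEngine = 0, kDisplay = 1, kTire = 2, kRain = 3 };
```

After granting kEngine and kDisplay on a modelled mutex, syscall_allowed() is true for those two indices and false for kTire and kRain, matching the grants in the listing above, where only Engine and Display receive access to the producer-consumer primitives.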
1.4 — Declaring the CarSystem instance in the partition
Because the CarSystem object must be accessible from user threads, its global
instance must reside in app_partition:
#if CONFIG_USERSPACE
APP_DATA static car_system::CarSystem carSystem;
#endif
In non-userspace mode, it can stay as a local variable in main():
#if !defined(CONFIG_USERSPACE)
car_system::CarSystem carSystem;
#endif
1.5 — Build and verify
Build with the overlay:
west build -b nrf5340dk/nrf5340/cpuapp car_system --pristine \
--extra-conf="prj_user_mode.conf"
You should see Engine and Display exchanging data exactly as before. The difference is invisible at the log level — but the MPU is now enforcing access rules.
Questions
- What would happen if you removed the grant_access() call for the mutex? Try it and describe the resulting error.
- What would happen if gIntegratedSharedData was declared without the APP_DATA tag? Where would it end up, and what fault would Display see?
- Why is CONFIG_LOG_MODE_MINIMAL=y used in prj_user_mode.conf?
Step 2 — Dedicated Memory Partitions
The first step proved that isolation from the kernel works. But Engine and
Display still share app_partition — Engine can read and write Display’s
private variables, and vice versa. In a real system with trust boundaries,
this is not enough.
In this step, each thread gets its own private partition plus a small shared partition for the data they actually need to exchange:
┌─────────────────────────────────────────────────────────────┐
│ Engine_domain (producer) │
│ ┌────────┐ ┌────────┐ ┌────────┐ ┌──────────┐ ┌─────────┐ │
│ │z_libc │ │zpp_lib │ │ app │ │producer │ │pc_shared│ │
│ │ │ │ │ │ │ │partition │ │partition│ │
│ └────────┘ └────────┘ └────────┘ └──────────┘ └─────────┘ │
└─────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│ Display_domain (consumer) │
│ ┌────────┐ ┌────────┐ ┌────────┐ ┌──────────┐ ┌─────────┐ │
│ │z_libc │ │zpp_lib │ │ app │ │consumer │ │pc_shared│ │
│ │ │ │ │ │ │ │partition │ │partition│ │
│ └────────┘ └────────┘ └────────┘ └──────────┘ └─────────┘ │
└─────────────────────────────────────────────────────────────┘
- Engine can access producer_partition (private) and pc_shared_partition (the exchanged SensorData), but not consumer_partition.
- Display can access consumer_partition (private) and pc_shared_partition, but not producer_partition.
- Both still need zpp_lib_partition, app_partition, and z_libc_partition for library code and common application globals.
A write from Engine into consumer_partition would trigger a
MemManage fault — true isolation.
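The access rules above can be checked with a small host-side model in which a domain is simply the set of partition names it contains. The names DomainModel, can_access() and shared_between() are invented for this sketch; the partition lists are copied from the two diagrams.

```cpp
#include <cassert>
#include <set>
#include <string>

// Host-side model: a domain is the set of partition names it contains.
using DomainModel = std::set<std::string>;

// Partition lists copied from the two domain diagrams.
DomainModel engine_domain() {
    return {"z_libc", "zpp_lib", "app", "producer", "pc_shared"};
}

DomainModel display_domain() {
    return {"z_libc", "zpp_lib", "app", "consumer", "pc_shared"};
}

// A user thread may touch a partition only if its domain contains it.
bool can_access(const DomainModel& domain, const std::string& partition) {
    assert(!partition.empty());
    return domain.count(partition) != 0;
}

// Partitions reachable from both domains, i.e. memory the two threads share.
DomainModel shared_between(const DomainModel& a, const DomainModel& b) {
    DomainModel common;
    for (const std::string& name : a) {
        if (b.count(name) != 0) { common.insert(name); }
    }
    return common;
}
```

shared_between() returns four partitions for the two domains: z_libc, zpp_lib, app and pc_shared. This makes explicit that only data tagged into producer_partition or consumer_partition is private; anything left in app_partition remains reachable from both threads.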
2.1 — Add the Kconfig option
config PROD_CONSUMER_PARTITIONS
bool "Use separate memory partitions for producer and consumer"
depends on (PROD_CONSUMER || PROD_CONSUMER_INTEGRATED) && USERSPACE
default n
help
When enabled, the producer and consumer each get their own memory
partition (producer_partition, consumer_partition) plus a shared
partition (pc_shared_partition) for the data exchanged between them.
When disabled, both threads share app_partition.
Note the depends on (PROD_CONSUMER || PROD_CONSUMER_INTEGRATED) && USERSPACE
— dedicated partitions work with both the standalone and integrated variants,
and only make sense when userspace is already enabled.
2.2 — Define partition macros and the init function
userspace/prod_consumer_partitions.hpp:
#pragma once
#if CONFIG_PROD_CONSUMER_PARTITIONS
#include <zephyr/app_memory/app_memdomain.h>
// Macros for tagging globals into the correct partition
#define PRODUCER_DATA K_APP_DMEM(producer_partition)
#define PRODUCER_BSS K_APP_BMEM(producer_partition)
#define CONSUMER_DATA K_APP_DMEM(consumer_partition)
#define CONSUMER_BSS K_APP_BMEM(consumer_partition)
#define PC_SHARED_DATA K_APP_DMEM(pc_shared_partition)
#define PC_SHARED_BSS K_APP_BMEM(pc_shared_partition)
namespace car_system {
void init_prod_consumer_domains(k_tid_t producer_tid, k_tid_t consumer_tid);
} // namespace car_system
#endif
2.3 — Implement the domain setup
userspace/prod_consumer_partitions.cpp:
#if CONFIG_PROD_CONSUMER_PARTITIONS
#include "prod_consumer_partitions.hpp"
#include <zephyr/logging/log.h>
#include <zephyr/sys/libc-hooks.h>
#include "init_domain.hpp"
LOG_MODULE_DECLARE(car_system, CONFIG_APP_LOG_LEVEL);
extern struct k_mem_partition zpp_lib_partition;
extern struct k_mem_partition app_partition;
// Define the three partitions
K_APPMEM_PARTITION_DEFINE(producer_partition);
K_APPMEM_PARTITION_DEFINE(consumer_partition);
K_APPMEM_PARTITION_DEFINE(pc_shared_partition);
// Each partition must contain at least one variable so the linker gives it
// a non-zero size — Zephyr rejects zero-sized partitions in k_mem_domain_init.
PRODUCER_DATA volatile uint8_t producer_placeholder;
CONSUMER_DATA volatile uint8_t consumer_placeholder;
namespace car_system {
static struct k_mem_domain producer_domain;
static struct k_mem_domain consumer_domain;
void init_prod_consumer_domains(k_tid_t producer_tid, k_tid_t consumer_tid) {
int ret;
// Producer domain: zpp_lib + app + producer's own + shared + libc
struct k_mem_partition* producer_parts[] = {
#if Z_LIBC_PARTITION_EXISTS
&z_libc_partition,
#endif
&zpp_lib_partition,
&app_partition,
&producer_partition,
&pc_shared_partition};
ret = k_mem_domain_init(&producer_domain, ARRAY_SIZE(producer_parts),
producer_parts);
__ASSERT(ret == 0, "producer k_mem_domain_init failed %d", ret);
ARG_UNUSED(ret);
k_mem_domain_add_thread(&producer_domain, producer_tid);
// Consumer domain: zpp_lib + app + consumer's own + shared + libc
struct k_mem_partition* consumer_parts[] = {
#if Z_LIBC_PARTITION_EXISTS
&z_libc_partition,
#endif
&zpp_lib_partition,
&app_partition,
&consumer_partition,
&pc_shared_partition};
ret = k_mem_domain_init(&consumer_domain, ARRAY_SIZE(consumer_parts),
consumer_parts);
__ASSERT(ret == 0, "consumer k_mem_domain_init failed %d", ret);
ARG_UNUSED(ret);
k_mem_domain_add_thread(&consumer_domain, consumer_tid);
}
} // namespace car_system
#endif
Note that init_prod_consumer_domains() is the same function as in the
standalone variant — it takes two generic thread IDs. The only difference is
which threads we pass: Engine and Display instead of the dedicated Producer and
Consumer threads.
Placeholder variables
The linker will give a partition zero size if no variable is tagged into
it. k_mem_domain_init() rejects zero-sized partitions with an assertion
failure. The producer_placeholder / consumer_placeholder variables exist
solely to ensure the partitions are non-empty. In a real application, these
would be replaced by actual per-thread globals.
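A minimal host-side sketch of the zero-size rule, assuming partitions are described by a start address and a size. PartitionModel, first_empty_partition() and domain_init_ok() are hypothetical helpers; the real k_mem_domain_init() performs additional checks that are not modelled here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Host-side description of a partition: where it starts and how large it is.
struct PartitionModel {
    std::uintptr_t start;
    std::size_t size;
};

// Returns the index of the first zero-sized partition, or -1 if every
// partition has a non-zero size.
int first_empty_partition(const std::vector<PartitionModel>& parts) {
    for (std::size_t i = 0; i < parts.size(); ++i) {
        if (parts[i].size == 0) { return static_cast<int>(i); }
    }
    return -1;
}

// Models only the zero-size rule: the whole domain is refused if any
// partition in the list is empty.
bool domain_init_ok(const std::vector<PartitionModel>& parts) {
    return first_empty_partition(parts) < 0;
}
```

A list such as {{0x1000, 32}, {0x2000, 0}} is refused because its second entry is empty, while giving that entry a size of 1 (the effect of a one-byte placeholder) makes the list acceptable. A check such as assert(domain_init_ok(parts)) then plays the role of the assertion in the listing above.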
Why still include app_partition?
The CarSystem object, logging buffers, and other common application globals
live in app_partition. Both threads need read access to those. If you
removed app_partition from the domain, the first access to any
APP_DATA-tagged variable would fault.
2.4 — Place shared data in pc_shared_partition
In car_system.cpp, the shared data placement now depends on the configuration:
#if CONFIG_PROD_CONSUMER_INTEGRATED
#if CONFIG_PROD_CONSUMER_PARTITIONS
#include "userspace/prod_consumer_partitions.hpp"
PC_SHARED_DATA car_system::SensorData gIntegratedSharedData = {};
#elif CONFIG_USERSPACE
#include "userspace/init_domain.hpp"
APP_DATA car_system::SensorData gIntegratedSharedData = {};
#endif
#endif
When PROD_CONSUMER_PARTITIONS is enabled, gIntegratedSharedData lives in
pc_shared_partition — accessible to both Engine and Display. Any thread-private
data tagged with PRODUCER_DATA or CONSUMER_DATA would only be accessible to
its respective thread.
2.5 — Wire domain initialization into start()
In CarSystem::start(), after granting kernel object access and before the
threads drop to user mode, call the domain setup. This is added alongside the
existing grants:
#if CONFIG_USERSPACE
// grant access to the barrier for each thread
for (uint8_t taskIndex = 0; taskIndex < kNbrOfPeriodicTasks; taskIndex++) {
k_tid_t tid = _threads[taskIndex].get_tid();
_barrier.grant_access(tid);
}
#if CONFIG_PROD_CONSUMER_INTEGRATED
// Grant Engine and Display access to prod/consumer sync primitives
_pcSemaphore.grant_access(_threads[kEngineTaskIndex].get_tid());
_pcSemaphore.grant_access(_threads[kDisplayTaskIndex].get_tid());
_pcMutex.grant_access(_threads[kEngineTaskIndex].get_tid());
_pcMutex.grant_access(_threads[kDisplayTaskIndex].get_tid());
#if CONFIG_PROD_CONSUMER_PARTITIONS
// Move Engine into producer domain, Display into consumer domain
init_prod_consumer_domains(_threads[kEngineTaskIndex].get_tid(),
_threads[kDisplayTaskIndex].get_tid());
#endif // CONFIG_PROD_CONSUMER_PARTITIONS
#endif // CONFIG_PROD_CONSUMER_INTEGRATED
#endif // CONFIG_USERSPACE
Order matters
init_prod_consumer_domains() calls k_mem_domain_add_thread(), which
removes the thread from its current domain (app_domain) and adds it to
the new one. This must happen after init_domain() has run (so the
common partitions already exist) and before the threads start executing
in user mode.
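The "one domain per thread" rule can be modelled on the host as a map from thread name to domain name. DomainRegistry is a hypothetical class for illustration; it only mirrors the replace-on-add behaviour described above.

```cpp
#include <cassert>
#include <map>
#include <string>

// Host-side model: each thread belongs to exactly one domain at a time.
class DomainRegistry {
public:
    // Adds the thread to `domain`. Any previous membership is replaced,
    // mirroring "remove from the current domain, add to the new one".
    void add_thread(const std::string& thread, const std::string& domain) {
        membership_[thread] = domain;
    }
    // Returns the domain name, or an empty string for an unknown thread.
    std::string domain_of(const std::string& thread) const {
        const auto it = membership_.find(thread);
        return it == membership_.end() ? std::string() : it->second;
    }

private:
    std::map<std::string, std::string> membership_;
};
```

If Engine and Tire are first added to app_domain and Engine is then added to producer_domain, domain_of("Engine") yields producer_domain while Tire stays in app_domain. Threads that are never moved keep whatever domain they had before.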
2.6 — Activate in prj_user_mode.conf
# Enable separate memory partitions for Engine/Display
CONFIG_PROD_CONSUMER_PARTITIONS=y
Build and flash as before:
west build -b nrf5340dk/nrf5340/cpuapp car_system --pristine \
--extra-conf="prj_user_mode.conf"
The output should be identical to Step 1. The difference is purely in the fault isolation: Engine and Display can no longer corrupt each other’s private data.
Questions
- Draw the MPU region map for the Engine thread showing which partitions are accessible and which are not.
- What happens if Engine tries to write to a variable tagged with CONSUMER_DATA? At what level is the fault caught (hardware or software)?
- In the diagram above, both domains include app_partition. Could you remove it from one of them? What would break?
- Why are producer_domain and consumer_domain declared static (file-local) in prod_consumer_partitions.cpp rather than as globals with external linkage? What would change if they were global?
- Tire and Rain threads are not moved into custom domains. What memory can they access?
Comparison: Standalone vs. Integrated
| Aspect | Standalone ProdConsumer | Integrated Engine/Display |
|---|---|---|
| Extra threads | 2 (Producer + Consumer) | 0 (reuses existing tasks) |
| Thread periods | 500 ms / 500 ms | 50 ms (Engine) / 125 ms (Display) |
| Consumer wait | acquire() (blocking) | try_acquire() (non-blocking) |
| Sample delivery | Every sample consumed | Latest value; some samples skipped |
| Code location | prod_consumer.hpp/cpp | Inline in car_system.hpp/cpp |
| Kconfig | CONFIG_PROD_CONSUMER | CONFIG_PROD_CONSUMER_INTEGRATED |
| Userspace partitions | Same prod_consumer_partitions.* | Same prod_consumer_partitions.* |
The integrated variant is more representative of a real embedded system where dedicated producer-consumer threads are a luxury — tasks have existing responsibilities and the data exchange is an additional concern layered on top.
Summary
| Aspect | Step 0: supervisor | Step 1: shared domain | Step 2: per-thread domains |
|---|---|---|---|
| Thread privilege | Supervisor | User | User |
| Memory isolation | None | Threads vs. kernel | Threads vs. kernel and vs. each other |
| Shared data location | Class member (_pcSharedData) | APP_DATA global | PC_SHARED_DATA global |
| Private data safety | Trust-based | Trust-based | MPU-enforced |
| Kernel object access | Implicit | Explicit grants | Explicit grants |
| Fault on bad access | Silent corruption | MemManage fault | MemManage fault |
The progression from supervisor → shared domain → per-thread domains is a practical pattern for introducing userspace incrementally:
- Get the application working in supervisor mode first (Step 0).
- Flip the userspace switch and fix what breaks: missing kernel object grants and misplaced data (Step 1).
- When isolation between threads matters, split into per-thread domains (Step 2).