Robust Design Patterns - Part 2 - Userspace Isolation
Introduction
In this codelab, we address the problem of unresponsive applications. An application is unresponsive when a program installed on an MCU becomes unavailable for some reason (e.g. gobbler task, deadlock, …). So far, every thread in our application runs in supervisor mode, the CPU’s most privileged level. Any thread can read or write any memory address, access any peripheral, and call any kernel function. That is convenient for development, but it means a single bug in one thread can corrupt data used by another, silently overwrite kernel structures, or poke hardware registers it has no business touching.
Zephyr RTOS offers userspace — an optional feature that leverages the CPU’s Memory Protection Unit (MPU) to restrict what each thread can access at the hardware level. A userspace thread can only touch memory regions it has been explicitly granted access to; any other access triggers an immediate CPU fault instead of silent corruption.
What you’ll build
In this codelab, you will create a ProdConsumer subsystem (a producer
thread and a consumer thread exchanging SensorData via shared memory) and
progressively increase the use of userspace protection:
- Step 1 — Single shared domain (app_partition): both threads run in user mode and share the same global memory partition.
- Step 2 — Dedicated per-thread domains: the producer and consumer each get their own private memory partition, with only the exchanged data placed in a shared partition accessible to both.
What you’ll learn
- What supervisor mode vs. user mode means on an ARM Cortex-M33 with MPU.
- The core Zephyr RTOS abstractions: memory partitions, memory domains, and kernel object permissions.
- How to move an existing subsystem into userspace incrementally — first with a single partition, then with fine-grained per-thread isolation.
- The practical constraints: what breaks when you flip the switch, and how to fix it.
- How to verify that isolation actually works (a rogue write faults instead of corrupting).
What you’ll need
- You need to have finished the Digging into Zephyr RTOS codelab.
- You need to have completed the Scheduling of periodic tasks codelab.
- You need to have completed the codelabs related to scheduling (part 1 and part 2).
Disable task watchdog subsystem
Make sure you have disabled the task watchdog subsystem prior to starting this codelab. Failing to do so will result in crashes.
Concepts: why Userspace?
The problem with supervisor mode
In a typical bare-metal or RTOS application every thread runs at the CPU’s highest privilege level. This means:
| Capability | Risk |
|---|---|
| Read/write any RAM address | One thread’s bug silently corrupts another’s data |
| Access any peripheral register | Accidental write to a timer or UART can brick the system |
| Call any kernel API | A buffer overflow can overwrite scheduler structures |
In safety-critical projects this is unacceptable: you need fault containment — the guarantee that one component’s failure cannot propagate to another.
How the MPU enforces isolation
ARM Cortex-M33 (the core in the nRF5340) includes a Memory Protection Unit (MPU) — a hardware block that sits between the CPU and the bus. Before every memory access, the MPU checks:
- Is the address within a region the current thread is allowed to access?
- Does the access type (read / write / execute) match the region’s permissions?
If the check fails, the CPU raises a MemManage fault — an immediate, synchronous exception. The offending instruction never completes.
┌──────────────┐ ┌─────────┐ ┌──────────────┐
│ CPU core │──addr──▶│ MPU │──ok──▶ │ Bus / SRAM │
│ (thread A) │ │ regions │ │ │
└──────────────┘ │ check │ └──────────────┘
│ │
│ FAULT ◀─┘ (if no matching region)
The key insight: the MPU is reconfigured on every context switch. When the scheduler switches from Thread A to Thread B, it loads Thread B’s region table into the MPU. Thread B literally cannot see Thread A’s private memory — it is as if it does not exist.
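The check described above can be modelled on a host machine. The sketch below is plain C++ with no Zephyr or CMSIS calls; the `Region` layout, the addresses and the two per-thread tables are assumptions made up for illustration, not the real MPU register format.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Host-side model only: this is not the Cortex-M33 MPU register interface.
enum Access : std::uint8_t { kRead = 1, kWrite = 2, kExecute = 4 };

struct Region {
  std::uint32_t base;    // first address covered by the region
  std::uint32_t size;    // number of bytes covered
  std::uint8_t allowed;  // bitwise OR of Access flags
};

// Returns true when [addr, addr + len) lies fully inside one region whose
// permissions include the requested access type.
// A false result models the MemManage fault.
inline bool access_allowed(const Region* table, std::size_t count,
                           std::uint32_t addr, std::uint32_t len,
                           Access type) {
  assert(len > 0);
  for (std::size_t i = 0; i < count; ++i) {
    const Region& r = table[i];
    const bool inside = addr >= r.base && len <= r.size &&
                        addr - r.base <= r.size - len;
    if (inside && (r.allowed & type) != 0) {
      return true;
    }
  }
  return false;
}

// Hypothetical region tables, one per thread. On a context switch the
// scheduler would load the incoming thread's table.
constexpr Region kThreadA[] = {
    {0x20000000u, 0x1000u, kRead | kWrite},    // thread A private RAM
    {0x00000000u, 0x8000u, kRead | kExecute},  // code in flash
};
constexpr Region kThreadB[] = {
    {0x20001000u, 0x1000u, kRead | kWrite},    // thread B private RAM
    {0x00000000u, 0x8000u, kRead | kExecute},  // code in flash
};
```

With these tables, a write to 0x20000010 is accepted for thread A and rejected for thread B, which is the "as if it does not exist" effect of swapping the region table on a context switch.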
Zephyr’s memory model
Zephyr RTOS introduces three abstractions that map onto the MPU:
| Abstraction | What it is | Analogy |
|---|---|---|
| Memory Partition (k_mem_partition) | A named, contiguous memory region with fixed permissions | A room in a building |
| Memory Domain (k_mem_domain) | A set of partitions that a thread (or group of threads) can access | A key card that opens specific rooms |
| Kernel Object Permissions | Per-object ACL granting a thread the right to use a specific kernel object (mutex, semaphore, queue, …) | A signed permission slip for a specific resource |
A thread running in user mode can only access:
- The partitions in its memory domain
- The kernel objects it has been explicitly granted access to
Everything else triggers a fault.
Supervisor mode is always available
Code running in ISR context always executes in supervisor mode. Threads
only run in user mode if they were created with the K_USER option (or
the equivalent zpp_lib flag); all other threads remain in supervisor
mode regardless of their priority. Both cooperative and preemptive
threads can be user mode threads — the privilege level is determined by
the creation flag, not the priority class
(ref).
Step 0 — ProdConsumer in Supervisor Mode
Before introducing userspace, you first need a working ProdConsumer subsystem
running entirely in supervisor mode. Two threads — a producer and a
consumer — exchange SensorData through shared memory using a mutex for
exclusive access and a semaphore for signalling. We will do so by adding this
subsystem to CarSystem.
┌─────────────────────────────────────────────────────────────┐
│ Supervisor Mode (all RAM visible) │
│ │
│ Producer thread Consumer thread │
│ ┌────────────────┐ ┌────────────────┐ │
│ │ fill SensorData│──mutex──▶│ read SensorData│ │
│ │ release sema │──sema───▶│ acquire sema │ │
│ └────────────────┘ └────────────────┘ │
│ │
│ _sharedData (class member) │
└─────────────────────────────────────────────────────────────┘
0.1 — Add the Kconfig option
Create a PROD_CONSUMER option in the project’s Kconfig so the subsystem can
be compiled in or out:
config PROD_CONSUMER
bool "Enable producer-consumer userspace demo"
default n
help
Enable a producer-consumer subsystem where a producer thread writes
sensor data to a shared variable (protected by a mutex) and signals
a consumer thread via a semaphore. Works in both supervisor and
userspace modes.
Enable it in prj.conf:
CONFIG_PROD_CONSUMER=y
0.2 — Define the header (prod_consumer.hpp)
The ProdConsumer class owns two threads, a barrier for synchronized startup,
a semaphore for producer → consumer signalling, and a mutex for protecting
the shared data:
#pragma once
#if CONFIG_PROD_CONSUMER
#include <atomic>
#include <chrono>
#include <cstdint>
#include "zpp_include/barrier.hpp"
#include "zpp_include/mutex.hpp"
#include "zpp_include/semaphore.hpp"
#include "zpp_include/non_copyable.hpp"
#include "zpp_include/thread.hpp"
#include "zpp_include/zephyr_result.hpp"
namespace car_system {
using std::literals::chrono_literals::operator""ms;
// Data structure shared between producer and consumer
struct SensorData {
uint32_t sequence_number;
std::chrono::milliseconds timestamp;
uint32_t sensor_value;
};
class ProdConsumer : private zpp_lib::NonCopyable<ProdConsumer> {
public:
ProdConsumer();
~ProdConsumer() = default;
[[nodiscard]] zpp_lib::ZephyrResult start();
[[nodiscard]] zpp_lib::ZephyrResult join();
void stop();
private:
void producer_task();
void consumer_task();
static constexpr uint8_t NBR_OF_TASKS = 2;
static constexpr uint8_t PRODUCER_INDEX = 0;
static constexpr uint8_t CONSUMER_INDEX = 1;
zpp_lib::Thread _threads[NBR_OF_TASKS];
// Barrier to synchronize both threads at startup
zpp_lib::Barrier _barrier{NBR_OF_TASKS};
// Semaphore to signal consumer that data is ready (initial 0, max 2).
// max=2 reserves one extra count so stop() can always unblock the consumer
// even if the producer has already posted a pending item.
zpp_lib::Semaphore _syncSemaphore{0, 2};
// Mutex to protect access to shared data
zpp_lib::Mutex _dataMutex;
// Shared data between producer and consumer
SensorData _sharedData{};
// Stop flag
volatile std::atomic<bool> _stopFlag{false};
// Sequence counter for producer data
std::atomic<uint32_t> _sequenceCounter{0};
// Task periods
static constexpr std::chrono::milliseconds kProducerPeriod = 500ms;
static constexpr std::chrono::milliseconds kConsumerPeriod = 500ms;
// Simulated computation times
static constexpr std::chrono::milliseconds kProducerComputation = 50ms;
static constexpr std::chrono::milliseconds kConsumerComputation = 100ms;
// Scaling factor applied to sequence number to produce sensor value
static constexpr uint32_t kSensorValueScale = 42;
};
} // namespace car_system
#endif // CONFIG_PROD_CONSUMER
Why max = 2 on the semaphore?
The stop() method releases the semaphore to unblock the consumer from its
acquire(). If the producer has already posted a pending item (count = 1),
stop() would fail with max = 1. Setting max = 2 guarantees that
stop() can always wake the consumer.
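The argument can be checked with a small host-side model. `BoundedCount` below is a hypothetical stand-in that only tracks the counter; it is not the zpp_lib::Semaphore implementation, and it assumes, as the text above does, that a release at the limit reports failure.

```cpp
#include <cstdint>

// Host-side model of a bounded counting semaphore (count only, no blocking).
class BoundedCount {
 public:
  BoundedCount(std::uint32_t initial, std::uint32_t max)
      : count_(initial), max_(max) {}

  // Assumed behaviour: a release at the limit fails instead of incrementing.
  bool release() {
    if (count_ >= max_) {
      return false;
    }
    ++count_;
    return true;
  }

  std::uint32_t count() const { return count_; }

 private:
  std::uint32_t count_;
  std::uint32_t max_;
};

// Scenario: the producer has posted one item that the consumer has not yet
// taken, and then stop() tries to post as well.
inline bool stop_can_signal(std::uint32_t max) {
  BoundedCount sem{0, max};
  sem.release();         // producer: data ready, count becomes 1
  return sem.release();  // stop(): needs one more free slot
}
```

In this model `stop_can_signal(1)` is false and `stop_can_signal(2)` is true, which is the reason for the extra count.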
0.3 — Implement the class (prod_consumer.cpp)
Constructor — set thread priorities
#if CONFIG_PROD_CONSUMER
#include "prod_consumer.hpp"
#include <zephyr/logging/log.h>
#include "zpp_include/this_thread.hpp"
#include "zpp_include/time.hpp"
LOG_MODULE_DECLARE(car_system, CONFIG_APP_LOG_LEVEL);
namespace car_system {
ProdConsumer::ProdConsumer()
: _threads{
zpp_lib::Thread(
zpp_lib::PreemptableThreadPriority::PriorityLow, "Producer"),
zpp_lib::Thread(
zpp_lib::PreemptableThreadPriority::PriorityVeryLow, "Consumer")}
{
}
The producer runs at a higher priority than the consumer so that fresh data is always produced before the consumer wakes up.
start() — launch both threads
zpp_lib::ZephyrResult ProdConsumer::start() {
zpp_lib::ZephyrResult res;
res = _threads[PRODUCER_INDEX].start([this]() { producer_task(); });
if (!res) {
LOG_ERR("ProdConsumer: Cannot start Producer thread: %d",
static_cast<int>(res.error()));
return res;
}
res = _threads[CONSUMER_INDEX].start([this]() { consumer_task(); });
if (!res) {
LOG_ERR("ProdConsumer: Cannot start Consumer thread: %d",
static_cast<int>(res.error()));
return res;
}
return res;
}
join() and stop()
zpp_lib::ZephyrResult ProdConsumer::join() {
zpp_lib::ZephyrResult res;
for (uint8_t i = 0; i < NBR_OF_TASKS; i++) {
res = _threads[i].join();
if (!res) {
LOG_ERR("ProdConsumer: Cannot join thread %d: %d", i,
static_cast<int>(res.error()));
return res;
}
}
return res;
}
void ProdConsumer::stop() {
_stopFlag.store(true);
// Release semaphore to unblock the consumer from its indefinite acquire()
auto res = _syncSemaphore.release();
if (!res) {
LOG_ERR("ProdConsumer stop: semaphore release failed");
}
}
producer_task() — periodic data production
void ProdConsumer::producer_task() {
std::chrono::milliseconds nextPeriod =
std::chrono::duration_cast<std::chrono::milliseconds>(_barrier.wait());
LOG_INF("ProdConsumer Producer starting at %lld ms", nextPeriod.count());
while (!_stopFlag) {
// Simulate work
zpp_lib::ThisThread::busy_wait(kProducerComputation);
// Prepare data
SensorData data;
data.sequence_number = _sequenceCounter.fetch_add(1);
data.timestamp = std::chrono::duration_cast<std::chrono::milliseconds>(
zpp_lib::Time::get_uptime());
data.sensor_value = data.sequence_number * kSensorValueScale;
LOG_INF("ProdConsumer Producer: writing data (seq=%u, value=%u)",
data.sequence_number, data.sensor_value);
// Write data under mutex protection
auto lockRes = _dataMutex.lock();
if (!lockRes) {
LOG_ERR("ProdConsumer Producer: mutex lock failed");
} else {
_sharedData = data;
auto unlockRes = _dataMutex.unlock();
if (!unlockRes) {
LOG_ERR("ProdConsumer Producer: mutex unlock failed");
}
}
// Signal the consumer that data is ready
auto semRes = _syncSemaphore.release();
if (!semRes) {
LOG_ERR("ProdConsumer Producer: semaphore release failed");
}
nextPeriod += kProducerPeriod;
zpp_lib::ThisThread::sleep_until(nextPeriod);
}
}
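The producer advances an absolute deadline (nextPeriod += kProducerPeriod) rather than sleeping for a fixed duration after its work. The following host-side sketch compares the two approaches using plain std::chrono arithmetic; the function names are made up for illustration and no Zephyr or zpp_lib call is involved.

```cpp
#include <chrono>

// Absolute deadlines: after k iterations the wake-up time is
// start + k * period, regardless of how long each iteration's work took.
inline std::chrono::milliseconds wake_absolute(std::chrono::milliseconds start,
                                               std::chrono::milliseconds period,
                                               int iterations) {
  std::chrono::milliseconds next = start;
  for (int i = 0; i < iterations; ++i) {
    next += period;  // same update as nextPeriod += kProducerPeriod
  }
  return next;
}

// Relative sleeps: each iteration does `work` and then sleeps for `period`,
// so the work time is added to every cycle and the schedule drifts.
inline std::chrono::milliseconds wake_relative(std::chrono::milliseconds start,
                                               std::chrono::milliseconds period,
                                               std::chrono::milliseconds work,
                                               int iterations) {
  std::chrono::milliseconds now = start;
  for (int i = 0; i < iterations; ++i) {
    now += work;
    now += period;
  }
  return now;
}
```

Starting at 42 ms with a 500 ms period and 50 ms of work, ten iterations end at 5042 ms with absolute deadlines and at 5542 ms with relative sleeps.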
consumer_task() — wait, read, process
void ProdConsumer::consumer_task() {
std::chrono::milliseconds nextPeriod =
std::chrono::duration_cast<std::chrono::milliseconds>(_barrier.wait());
LOG_INF("ProdConsumer Consumer starting at %lld ms", nextPeriod.count());
while (true) {
// Block until the producer signals that new data is ready
auto semRes = _syncSemaphore.acquire();
if (!semRes) {
LOG_ERR("ProdConsumer Consumer: semaphore acquire failed");
continue;
}
// stop() released the semaphore and set _stopFlag
if (_stopFlag.load()) {
break;
}
// Read shared data under mutex protection
SensorData receivedData;
auto lockRes = _dataMutex.lock();
if (!lockRes) {
LOG_ERR("ProdConsumer Consumer: mutex lock failed");
continue;
}
receivedData = _sharedData;
auto unlockRes = _dataMutex.unlock();
if (!unlockRes) {
LOG_ERR("ProdConsumer Consumer: mutex unlock failed");
}
LOG_INF(
"ProdConsumer Consumer: received data (seq=%u, timestamp=%lld ms, value=%u)",
receivedData.sequence_number, receivedData.timestamp.count(),
receivedData.sensor_value);
// Simulate consuming the data
zpp_lib::ThisThread::busy_wait(kConsumerComputation);
nextPeriod += kConsumerPeriod;
zpp_lib::ThisThread::sleep_until(nextPeriod);
}
}
} // namespace car_system
#endif // CONFIG_PROD_CONSUMER
0.4 — Integrate into CarSystem
Add a ProdConsumer member to CarSystem:
// car_system.hpp
#if CONFIG_PROD_CONSUMER
#include "prod_consumer.hpp"
#endif
class CarSystem : private zpp_lib::NonCopyable<CarSystem> {
// ...existing members...
#if CONFIG_PROD_CONSUMER
ProdConsumer _prodConsumer;
#endif
};
In CarSystem::start(), start and join the subsystem:
#if CONFIG_PROD_CONSUMER
res = _prodConsumer.start();
if (!res) {
LOG_ERR("Cannot start ProdConsumer: %d", static_cast<int>(res.error()));
return res;
}
#endif
// ...existing thread joins...
#if CONFIG_PROD_CONSUMER
res = _prodConsumer.join();
if (!res) {
LOG_ERR("Cannot join ProdConsumer: %d", static_cast<int>(res.error()));
return res;
}
#endif
In CarSystem::stop():
#if CONFIG_PROD_CONSUMER
_prodConsumer.stop();
#endif
0.5 — Build and verify
west build -b nrf5340dk/nrf5340/cpuapp car_system
You should see log output like:
ProdConsumer Producer starting at 42 ms
ProdConsumer Consumer starting at 42 ms
ProdConsumer Producer: writing data (seq=0, value=0)
ProdConsumer Consumer: received data (seq=0, timestamp=92 ms, value=0)
ProdConsumer Producer: writing data (seq=1, value=42)
ProdConsumer Consumer: received data (seq=1, timestamp=592 ms, value=42)
At this point everything works, but there is no memory protection whatsoever. A bug in the consumer could overwrite the producer’s stack, corrupt kernel data structures or poke random peripheral registers — and the system would silently continue with a corrupted state.
Questions
- Why does the consumer use _syncSemaphore.acquire() (blocking) instead of polling _stopFlag in a loop?
- What would happen if you removed the mutex and let both threads access _sharedData without synchronization? Would the system crash immediately?
- The barrier is initialized with NBR_OF_TASKS = 2, meaning only the producer and consumer synchronize on it. Why don’t we include CarSystem’s periodic tasks in this barrier?
Step 1 — Userspace with a Shared Domain
With the supervisor-mode ProdConsumer working (Step 0), you can now flip the
userspace switch. In this step, every application thread (including the producer
and consumer) belongs to one common memory domain (app_domain) containing:
- z_libc_partition: the C library’s internal globals
- zpp_lib_partition: the zpp_lib library globals
- app_partition: the application’s own globals
Both the producer and the consumer see the same set of memory. There is no isolation between them yet, but they are isolated from the kernel and from peripherals they have not been granted access to.
┌──────────────────────────────────────────┐
│ app_domain │
│ ┌───────────┐ ┌──────────┐ ┌─────────┐ │
│ │ z_libc │ │ zpp_lib │ │ app │ │
│ │ partition │ │ partition│ │partition│ │
│ └───────────┘ └──────────┘ └─────────┘ │
│ │
│ Producer thread ◀──────▶ Consumer thread│
│ (both share everything) │
└──────────────────────────────────────────┘
1.1 — Enable userspace in the build
Userspace is activated via an overlay configuration file that is passed at
build time with -DEXTRA_CONF_FILE.
Create (or update) prj_user_mode.conf:
# enable user mode
CONFIG_USERSPACE=y
CONFIG_LOG_MODE_MINIMAL=y
# ProdConsumer adds 1 semaphore + 1 barrier (default pool size of 2 is not enough)
CONFIG_ZPP_SEMAPHORE_POOL_SIZE=4
# Enable INFO-level app logging so ProdConsumer output is visible
CONFIG_APP_LOG_LEVEL_INF=y
Pool sizes
Userspace objects (semaphores, mutexes, barriers, …) are allocated from
static pools whose sizes must be configured at build time. If you get a
boot-time assertion such as __ASSERT zpp semaphore pool exhausted, increase
the pool size in the conf file.
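To see why a pool that is too small fails at the moment an extra object is requested, consider this host-side model of a fixed-size pool. `StaticPool` and `FakeSemaphore` are hypothetical names; the actual zpp_lib pool implementation is not shown in this codelab and may differ.

```cpp
#include <array>
#include <cstddef>

// Host-side model of a pool whose capacity N is fixed at build time.
template <typename T, std::size_t N>
class StaticPool {
 public:
  // Returns a pointer to a free slot, or nullptr when the pool is exhausted.
  T* allocate() {
    for (std::size_t i = 0; i < N; ++i) {
      if (!used_[i]) {
        used_[i] = true;
        return &slots_[i];
      }
    }
    return nullptr;
  }

  // Number of slots that have not been handed out yet.
  std::size_t free_slots() const {
    std::size_t n = 0;
    for (bool u : used_) {
      if (!u) {
        ++n;
      }
    }
    return n;
  }

 private:
  std::array<T, N> slots_{};
  std::array<bool, N> used_{};
};

// Placeholder object type for the sketch.
struct FakeSemaphore {
  unsigned count = 0;
};
```

A pool of size 2 hands out two objects and returns nullptr for the third request; raising the size to 4 leaves room to spare.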
1.2 — Set up the memory domain
The domain must be created from supervisor context — typically early in
main(), before any user thread is started.
userspace/init_domain.hpp:
#pragma once
#if CONFIG_USERSPACE
#include <zephyr/app_memory/app_memdomain.h>
#include <cstdint>
namespace car_system {
#define APP_DATA K_APP_DMEM(app_partition)
extern struct k_mem_domain app_domain;
extern void init_domain();
} // namespace car_system
#else
#define APP_DATA
#endif
userspace/init_domain.cpp:
#if CONFIG_USERSPACE
#include "init_domain.hpp"
#include <zephyr/logging/log.h>
#include <zephyr/sys/libc-hooks.h>
LOG_MODULE_DECLARE(car_system, CONFIG_APP_LOG_LEVEL);
K_APPMEM_PARTITION_DEFINE(zpp_lib_partition);
K_APPMEM_PARTITION_DEFINE(app_partition);
namespace car_system {
struct k_mem_domain app_domain;
void init_domain() {
struct k_mem_partition* partitions[] = {
#if Z_LIBC_PARTITION_EXISTS
&z_libc_partition,
#endif
&zpp_lib_partition,
&app_partition};
auto ret = k_mem_domain_init(&app_domain, ARRAY_SIZE(partitions), partitions);
__ASSERT(ret == 0, "k_mem_domain_init failed %d", ret);
ARG_UNUSED(ret);
k_mem_domain_add_thread(&app_domain, k_current_get());
}
} // namespace car_system
#endif
And in main():
#if CONFIG_USERSPACE
car_system::init_domain();
#endif
K_APPMEM_PARTITION_DEFINE
This macro asks the linker to collect every global variable tagged with
K_APP_DMEM(partition) or K_APP_BMEM(partition) into a dedicated, MPU-aligned
section. At runtime the partition’s start and size fields point to that
section. You never allocate memory manually — the build system does it for
you.
The two tagging macros differ only in which linker section they target:
- K_APP_DMEM(partition): places an initialized variable (.data) into the partition. Use this for globals that have a non-zero initial value, e.g. K_APP_DMEM(app_partition) int counter = 42;.
- K_APP_BMEM(partition): places a zero-initialized variable (.bss) into the partition. Use this for globals whose initial value is zero, e.g. K_APP_BMEM(app_partition) int counter;.
The distinction mirrors the standard .data vs. .bss split: .data
variables consume space in the binary image (their initial values must be
stored in flash), while .bss variables only consume RAM and are
zero-filled at boot.
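"MPU-aligned" has a practical consequence for RAM usage: a partition occupies whole MPU granules even if it holds only a few bytes. The sketch below assumes a 32-byte granule, which is the region granularity of ARMv8-M MPUs such as the one in the Cortex-M33; the helper names are made up, and the exact padding applied by the Zephyr linker script is not taken from this codelab.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: 32-byte MPU granule (ARMv8-M, e.g. Cortex-M33).
constexpr std::uint32_t kMpuGranule = 32;

// Rounds `value` up to the next multiple of `granule`.
inline std::uint32_t align_up(std::uint32_t value, std::uint32_t granule) {
  assert(granule != 0);
  return (value + granule - 1) / granule * granule;
}

// RAM footprint of a partition that holds `payload` bytes of tagged globals.
inline std::uint32_t partition_footprint(std::uint32_t payload) {
  return align_up(payload, kMpuGranule);
}
```

Under this assumption a partition holding a single byte still takes 32 bytes, and one holding 33 bytes takes 64.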
1.3 — Adapt the ProdConsumer for userspace
Three things change compared to supervisor mode:
a) Thread creation flag
Threads that should drop to user mode must be created with the user-mode
flag (the third bool parameter in zpp_lib::Thread):
ProdConsumer::ProdConsumer()
#if CONFIG_USERSPACE
: _threads{
zpp_lib::Thread(
zpp_lib::PreemptableThreadPriority::PriorityLow, "Producer", true),
zpp_lib::Thread(
zpp_lib::PreemptableThreadPriority::PriorityVeryLow, "Consumer", true)}
#else
: _threads{
zpp_lib::Thread(
zpp_lib::PreemptableThreadPriority::PriorityLow, "Producer"),
zpp_lib::Thread(
zpp_lib::PreemptableThreadPriority::PriorityVeryLow, "Consumer")}
#endif
The true flag means “after thread_start(), drop this thread to
unprivileged mode”. The thread begins executing in supervisor mode (so that
start() can set up grants), but transitions to user mode before the thread
function body runs.
b) Shared data placement
In supervisor mode, _sharedData lives as a private class member — any
thread can access any address. In user mode, the variable must be placed in
a partition that all accessing threads share.
First, guard _sharedData in the header so it only exists in supervisor mode:
// prod_consumer.hpp — inside the class
#if !CONFIG_USERSPACE
// Shared data between producer and consumer (class member in non-userspace mode)
SensorData _sharedData{};
#endif
Then, in prod_consumer.cpp, declare a partition-tagged global that replaces
the class member when userspace is enabled (place this at file scope, before the
namespace):
#if CONFIG_USERSPACE
#include "userspace/init_domain.hpp"
APP_DATA car_system::SensorData gProdConsumerSharedData = {};
#endif
The APP_DATA macro expands to K_APP_DMEM(app_partition), which tells the
linker to place this variable in the app_partition section. Both threads are
in app_domain which includes app_partition, so both can read and write it.
Finally, update producer_task() and consumer_task() to use the global when
in userspace mode. In the producer, the write changes to:
// Write data under mutex protection
auto lockRes = _dataMutex.lock();
if (!lockRes) {
LOG_ERR("ProdConsumer Producer: mutex lock failed");
} else {
#if CONFIG_USERSPACE
gProdConsumerSharedData = data;
#else
_sharedData = data;
#endif
auto unlockRes = _dataMutex.unlock();
if (!unlockRes) {
LOG_ERR("ProdConsumer Producer: mutex unlock failed");
}
}
In the consumer, the read changes to:
#if CONFIG_USERSPACE
receivedData = gProdConsumerSharedData;
#else
receivedData = _sharedData;
#endif
Class members and userspace
A class member variable lives wherever the class instance is allocated.
If the instance is on the stack or in an untagged global, user threads
cannot access it. That is why shared data moves to a partition-tagged
global.
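The point that a member "lives wherever the instance is allocated" can be verified on a host with simple address arithmetic. The types and globals below are hypothetical stand-ins, not the codelab's classes.

```cpp
#include <cstddef>
#include <cstdint>

// True when the `len` bytes at `p` lie inside [base, base + size).
inline bool lies_within(const void* p, std::size_t len, const void* base,
                        std::size_t size) {
  const auto a = reinterpret_cast<std::uintptr_t>(p);
  const auto b = reinterpret_cast<std::uintptr_t>(base);
  return a >= b && len <= size && a - b <= size - len;
}

struct Payload {
  std::uint32_t sequence_number;
  std::uint32_t sensor_value;
};

// Stand-in for a class that owns its shared data as a member.
struct Owner {
  std::uint32_t other_state = 0;
  Payload shared{};
};

Owner gOwner;       // stand-in for an instance in an untagged global
Payload gSeparate;  // stand-in for a separately placed global
```

`gOwner.shared` always falls inside the storage of `gOwner`, so it inherits whatever protection applies to that instance, whereas `gSeparate` is a distinct object that can be tagged into a partition on its own.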
c) Kernel object grants
User threads cannot use a kernel object (mutex, semaphore, barrier, …) unless
they have been explicitly granted access. This is done from supervisor
context, typically in start() before the threads begin running:
#if CONFIG_USERSPACE
_barrier.grant_access(_threads[PRODUCER_INDEX].get_tid());
_barrier.grant_access(_threads[CONSUMER_INDEX].get_tid());
_syncSemaphore.grant_access(_threads[PRODUCER_INDEX].get_tid());
_syncSemaphore.grant_access(_threads[CONSUMER_INDEX].get_tid());
_dataMutex.grant_access(_threads[PRODUCER_INDEX].get_tid());
_dataMutex.grant_access(_threads[CONSUMER_INDEX].get_tid());
#endif
Without these grants, the first syscall from user mode would trigger a kernel oops — the kernel detects an unauthorized operation and terminates the thread.
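Conceptually, each kernel object carries a small access list that is consulted on every syscall from user mode. The following host-side model uses one permission bit per thread index; `KernelObjectModel`, `kMaxThreads` and the index values are assumptions for illustration, not Zephyr's or zpp_lib's actual data structures.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

constexpr std::size_t kMaxThreads = 16;  // assumed limit for this sketch

// Host-side model of a kernel object with a per-thread permission bit.
class KernelObjectModel {
 public:
  // Supervisor-side operation: allow `thread_index` to use this object.
  void grant_access(std::size_t thread_index) {
    assert(thread_index < kMaxThreads);
    perms_.set(thread_index);
  }

  // Check performed on each simulated syscall from user mode.
  bool syscall_allowed(std::size_t thread_index) const {
    assert(thread_index < kMaxThreads);
    return perms_.test(thread_index);
  }

 private:
  std::bitset<kMaxThreads> perms_;
};
```

Granting access to the producer only leaves the consumer's check failing, which in the real system corresponds to the kernel oops described above.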
1.4 — Declaring the CarSystem instance in the partition
Because the CarSystem object (and its ProdConsumer member) must be
accessible from user threads, its global instance must reside in
app_partition:
#if CONFIG_USERSPACE
APP_DATA static car_system::CarSystem carSystem;
#endif
In non-userspace mode, it can stay as a local variable in main():
#if !defined(CONFIG_USERSPACE)
car_system::CarSystem carSystem;
#endif
1.5 — Build and verify
Build with the overlay:
west build -b nrf5340dk/nrf5340/cpuapp car_system \
-- -DEXTRA_CONF_FILE="prj_user_mode.conf"
You should see the producer and consumer exchanging data exactly as before. The difference is invisible at the log level — but the MPU is now enforcing access rules.
Questions
- What would happen if you removed the grant_access() call for the mutex? Try it and describe the resulting error.
- What would happen if gProdConsumerSharedData was declared without the APP_DATA tag? Where would it end up, and what fault would the consumer see?
- Why is CONFIG_LOG_MODE_MINIMAL=y needed in prj_user_mode.conf?
Solution
- The thread would trigger a kernel oops (syscall access denied) on the first _dataMutex.lock() call. The kernel checks per-object permissions on every syscall from user mode and terminates the thread if the grant is missing.
- Without APP_DATA, the variable would be placed in the default .bss/.data section, which is not part of any user-accessible partition. The first read or write from user mode would trigger a MemManage fault (MPU violation), causing a system reset or thread termination.
- The full deferred logging subsystem uses kernel objects internally (message queues, work queues) that are not automatically granted to user threads. Minimal logging avoids those internal objects by printing synchronously. Without it, the first LOG_INF from user mode would trigger a syscall access fault.
Step 2 — Dedicated Memory Partitions
The first step proved that isolation from the kernel works. But the producer
and consumer still share app_partition — the producer can read and write the
consumer’s private variables, and vice versa. In a real system with trust
boundaries, this is not enough.
In this step, each thread gets its own private partition plus a small shared partition for the data they actually need to exchange:
┌─────────────────────────────────────────────────────────────┐
│ producer_domain │
│ ┌────────┐ ┌────────┐ ┌────────┐ ┌──────────┐ ┌─────────┐ │
│ │z_libc │ │zpp_lib │ │ app │ │producer │ │pc_shared│ │
│ │ │ │ │ │ │ │partition │ │partition│ │
│ └────────┘ └────────┘ └────────┘ └──────────┘ └─────────┘ │
└─────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│ consumer_domain │
│ ┌────────┐ ┌────────┐ ┌────────┐ ┌──────────┐ ┌─────────┐ │
│ │z_libc │ │zpp_lib │ │ app │ │consumer │ │pc_shared│ │
│ │ │ │ │ │ │ │partition │ │partition│ │
│ └────────┘ └────────┘ └────────┘ └──────────┘ └─────────┘ │
└─────────────────────────────────────────────────────────────┘
- The producer can access producer_partition (private) and pc_shared_partition (the exchanged SensorData), but not consumer_partition.
- The consumer can access consumer_partition (private) and pc_shared_partition, but not producer_partition.
- Both still need zpp_lib_partition, app_partition, and z_libc_partition for library code and common application globals.
A write from the producer into consumer_partition would trigger a
MemManage fault — true isolation.
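The two domain layouts can be written down as sets of partitions and queried. This is a host-side model in which each partition is one bit of a mask; the enum and struct names are invented for the sketch and do not correspond to Zephyr types.

```cpp
#include <cstdint>

// One bit per partition named in the diagram above.
enum Partition : std::uint32_t {
  kLibc = 1u << 0,
  kZppLib = 1u << 1,
  kApp = 1u << 2,
  kProducer = 1u << 3,
  kConsumer = 1u << 4,
  kPcShared = 1u << 5,
};

// A domain is modelled as the set (bitmask) of partitions it contains.
struct DomainMask {
  std::uint32_t partitions;

  // False models the MemManage fault for an access outside the domain.
  constexpr bool can_access(Partition p) const {
    return (partitions & p) != 0;
  }
};

constexpr DomainMask kProducerDomain{kLibc | kZppLib | kApp | kProducer |
                                     kPcShared};
constexpr DomainMask kConsumerDomain{kLibc | kZppLib | kApp | kConsumer |
                                     kPcShared};
```

Both masks contain kPcShared, while kProducer and kConsumer each appear in exactly one of them, which is the isolation property this step is after.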
2.1 — Add the Kconfig option
config PROD_CONSUMER_PARTITIONS
bool "Use separate memory partitions for producer and consumer"
depends on PROD_CONSUMER && USERSPACE
default n
help
When enabled, the producer and consumer each get their own memory
partition (producer_partition, consumer_partition) plus a shared
partition (pc_shared_partition) for the data exchanged between them.
When disabled, both threads share app_partition.
Note the depends on PROD_CONSUMER && USERSPACE — dedicated partitions only
make sense when userspace is already enabled.
2.2 — Define partition macros and the init function
userspace/prod_consumer_partitions.hpp:
#pragma once
#if CONFIG_PROD_CONSUMER_PARTITIONS
#include <zephyr/app_memory/app_memdomain.h>
// Macros for tagging globals into the correct partition
#define PRODUCER_DATA K_APP_DMEM(producer_partition)
#define PRODUCER_BSS K_APP_BMEM(producer_partition)
#define CONSUMER_DATA K_APP_DMEM(consumer_partition)
#define CONSUMER_BSS K_APP_BMEM(consumer_partition)
#define PC_SHARED_DATA K_APP_DMEM(pc_shared_partition)
#define PC_SHARED_BSS K_APP_BMEM(pc_shared_partition)
namespace car_system {
void init_prod_consumer_domains(k_tid_t producer_tid, k_tid_t consumer_tid);
} // namespace car_system
#endif
2.3 — Implement the domain setup
userspace/prod_consumer_partitions.cpp:
#if CONFIG_PROD_CONSUMER_PARTITIONS
#include "prod_consumer_partitions.hpp"
#include <zephyr/logging/log.h>
#include <zephyr/sys/libc-hooks.h>
#include "init_domain.hpp"
LOG_MODULE_DECLARE(car_system, CONFIG_APP_LOG_LEVEL);
extern struct k_mem_partition zpp_lib_partition;
extern struct k_mem_partition app_partition;
// Define the three partitions
K_APPMEM_PARTITION_DEFINE(producer_partition);
K_APPMEM_PARTITION_DEFINE(consumer_partition);
K_APPMEM_PARTITION_DEFINE(pc_shared_partition);
// Each partition must contain at least one variable so the linker gives it
// a non-zero size — Zephyr rejects zero-sized partitions in k_mem_domain_init.
PRODUCER_DATA volatile uint8_t producer_placeholder;
CONSUMER_DATA volatile uint8_t consumer_placeholder;
namespace car_system {
static struct k_mem_domain producer_domain;
static struct k_mem_domain consumer_domain;
void init_prod_consumer_domains(k_tid_t producer_tid, k_tid_t consumer_tid) {
int ret;
// Producer domain: zpp_lib + app + producer's own + shared + libc
struct k_mem_partition* producer_parts[] = {
#if Z_LIBC_PARTITION_EXISTS
&z_libc_partition,
#endif
&zpp_lib_partition,
&app_partition,
&producer_partition,
&pc_shared_partition};
ret = k_mem_domain_init(&producer_domain, ARRAY_SIZE(producer_parts),
producer_parts);
__ASSERT(ret == 0, "producer k_mem_domain_init failed %d", ret);
ARG_UNUSED(ret);
k_mem_domain_add_thread(&producer_domain, producer_tid);
// Consumer domain: zpp_lib + app + consumer's own + shared + libc
struct k_mem_partition* consumer_parts[] = {
#if Z_LIBC_PARTITION_EXISTS
&z_libc_partition,
#endif
&zpp_lib_partition,
&app_partition,
&consumer_partition,
&pc_shared_partition};
ret = k_mem_domain_init(&consumer_domain, ARRAY_SIZE(consumer_parts),
consumer_parts);
__ASSERT(ret == 0, "consumer k_mem_domain_init failed %d", ret);
ARG_UNUSED(ret);
k_mem_domain_add_thread(&consumer_domain, consumer_tid);
}
} // namespace car_system
#endif
Placeholder variables
The linker will give a partition zero size if no variable is tagged into
it. k_mem_domain_init() rejects zero-sized partitions with an assertion
failure. The producer_placeholder / consumer_placeholder variables exist
solely to ensure the partitions are non-empty. In a real application, these
would be replaced by actual per-thread globals.
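The rule can be pictured with a host-side validation routine. `PartitionModel` and `domain_init_model` are hypothetical names; the sketch reports the rejection as an error code and does not reproduce how the real kernel reports it.

```cpp
#include <cerrno>
#include <cstddef>
#include <cstdint>

// Host-side model of a partition: just a start address and a size.
struct PartitionModel {
  std::uintptr_t start;
  std::size_t size;
};

// Returns 0 when every partition is non-empty, -EINVAL otherwise.
inline int domain_init_model(const PartitionModel* parts, std::size_t count) {
  for (std::size_t i = 0; i < count; ++i) {
    if (parts[i].size == 0) {
      return -EINVAL;  // an empty partition is rejected
    }
  }
  return 0;
}
```

A partition list in which every entry has at least one tagged variable passes the check; a list with one empty partition is refused.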
Why still include app_partition?
The CarSystem object, logging buffers, and other common application globals
live in app_partition. Both threads need read access to those. If you
removed app_partition from the domain, the first access to any
APP_DATA-tagged variable would fault.
2.4 — Place shared data in pc_shared_partition
In prod_consumer.cpp, the shared data placement now depends on the
configuration:
#if CONFIG_PROD_CONSUMER_PARTITIONS
#include "userspace/prod_consumer_partitions.hpp"
PC_SHARED_DATA car_system::SensorData gProdConsumerSharedData = {};
#elif CONFIG_USERSPACE
#include "userspace/init_domain.hpp"
APP_DATA car_system::SensorData gProdConsumerSharedData = {};
#endif
When PROD_CONSUMER_PARTITIONS is enabled, gProdConsumerSharedData lives in
pc_shared_partition — accessible to both threads. Any thread-private data
tagged with PRODUCER_DATA or CONSUMER_DATA would only be accessible to its
respective thread.
2.5 — Wire domain initialization into start()
In ProdConsumer::start(), after granting kernel object access and before
the threads drop to user mode, call the domain setup:
#if CONFIG_USERSPACE
// Grant kernel object access (same as Step 1)
_barrier.grant_access(_threads[PRODUCER_INDEX].get_tid());
_barrier.grant_access(_threads[CONSUMER_INDEX].get_tid());
_syncSemaphore.grant_access(_threads[PRODUCER_INDEX].get_tid());
_syncSemaphore.grant_access(_threads[CONSUMER_INDEX].get_tid());
_dataMutex.grant_access(_threads[PRODUCER_INDEX].get_tid());
_dataMutex.grant_access(_threads[CONSUMER_INDEX].get_tid());
#if CONFIG_PROD_CONSUMER_PARTITIONS
// Move each thread into its own memory domain
init_prod_consumer_domains(_threads[PRODUCER_INDEX].get_tid(),
_threads[CONSUMER_INDEX].get_tid());
#endif
#endif
Order matters
init_prod_consumer_domains() calls k_mem_domain_add_thread(), which
removes the thread from its current domain (app_domain) and adds it to
the new one. This must happen after init_domain() has run (so the
common partitions already exist) and before the threads start executing
in user mode.
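The "remove from the current domain, add to the new one" behaviour can be modelled in a few lines. `DomainModel`, `ThreadModel` and `add_thread` are illustrative names for a host-side sketch and do not mirror the kernel's internal bookkeeping.

```cpp
#include <cstddef>

// Host-side model: a domain only counts its member threads.
struct DomainModel {
  std::size_t thread_count = 0;
};

// A thread refers to at most one domain at a time.
struct ThreadModel {
  DomainModel* domain = nullptr;
};

// Joining a new domain implicitly leaves the previous one.
inline void add_thread(DomainModel& target, ThreadModel& thread) {
  if (thread.domain != nullptr) {
    --thread.domain->thread_count;
  }
  thread.domain = &target;
  ++target.thread_count;
}
```

After a thread that started in the common domain is added to its dedicated domain, the common domain no longer counts it as a member.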
2.6 — Activate in prj_user_mode.conf
# Enable separate memory partitions for producer/consumer
CONFIG_PROD_CONSUMER_PARTITIONS=y
Build and flash as before:
west build -b nrf5340dk/nrf5340/cpuapp car_system \
-- -DEXTRA_CONF_FILE="prj_user_mode.conf"
The output should be identical to Step 1. The difference is purely in the fault isolation: the producer and consumer can no longer corrupt each other’s private data.
Questions
- Draw the MPU region map for the producer thread showing which partitions are accessible and which are not.
- What happens if the producer tries to write to a variable tagged with CONSUMER_DATA? At what level is the fault caught (hardware or software)?
- In the diagram above, both domains include app_partition. Could you remove it from one of them? What would break?
- Why are producer_domain and consumer_domain declared as static (file-local) variables in prod_consumer_partitions.cpp rather than as globals? What would change if they were global?
Solution
- Producer thread MPU regions:

| Region | Partition | Access |
|---|---|---|
| 0 | z_libc_partition | RW |
| 1 | zpp_lib_partition | RW |
| 2 | app_partition | RW |
| 3 | producer_partition | RW |
| 4 | pc_shared_partition | RW |
| - | consumer_partition | FAULT |
| - | kernel memory | FAULT |
| - | peripheral MMIO | FAULT (unless granted) |

- A write to a CONSUMER_DATA variable triggers a MemManage fault, caught by the hardware MPU before the write completes. The CPU vectors to the fault handler, which typically logs the offending address and terminates the thread (or resets the system). No software check is needed; the MPU performs the check in hardware as part of the memory access.
- Removing app_partition would cause a hardware fault on the first access to any APP_DATA-tagged variable, for example the CarSystem object itself, the log buffers, or any common application global. Both threads need read (and sometimes write) access to shared application state.
- The k_mem_domain structs are only used during initialization (k_mem_domain_init + k_mem_domain_add_thread). After that, the kernel maintains its own internal references. Making them static is a good practice (it limits visibility), but making them global would also work: the kernel does not care about the C++ linkage of the struct, only about the pointer it received during init.
Summary
| Aspect | Step 0: supervisor | Step 1: shared domain | Step 2: per-thread domains |
|---|---|---|---|
| Thread privilege | Supervisor | User | User |
| Memory isolation | None | Threads vs. kernel | Threads vs. kernel and vs. each other |
| Shared data location | Class member (_sharedData) | APP_DATA global | PC_SHARED_DATA global |
| Private data safety | Trust-based | Trust-based | MPU-enforced |
| Kernel object access | Implicit | Explicit grants | Explicit grants |
| Fault on bad access | Silent corruption | MemManage fault | MemManage fault |
The progression from supervisor → shared domain → per-thread domains is a practical pattern for introducing userspace incrementally:
- Get the application working in supervisor mode first (Step 0).
- Flip the userspace switch and fix what breaks (grants, data placement).
- When isolation between threads matters, split into per-thread domains.