Robust Design Patterns - Part 3 - Task Watchdog in Userspace

Introduction

In this codelab, we address a problem created by the userspace/supervisor separation introduced in Part 2.

In Part 1 you added a task-level software watchdog that monitors every periodic thread: each task registers a channel with task_wdt_add() at startup and refreshes it with task_wdt_feed() every period. If a task hangs, the watchdog fires its callback.

In Part 2 you moved those same tasks into userspace — they now run under MPU protection and can only access memory they have been explicitly granted.

The problem: task_wdt_add() and task_wdt_feed() are ordinary C functions, not Zephyr syscalls. A user-mode thread that calls them directly triggers an immediate CPU fault — the MPU blocks the unprivileged access to kernel-only memory.

This codelab shows how to bridge the gap with a supervisor-mode proxy thread that accepts watchdog requests through message queues (which are proper syscalls) and calls the real watchdog API on behalf of user-mode tasks.

What you’ll build

A thin proxy layer composed of:

  • task_wdt_proxy.hpp / task_wdt_proxy.cpp — the proxy implementation.
  • Three message queues:

    • a feed queue (user → proxy, non-blocking), and
    • a register request/response pair (user ↔ proxy, synchronous).
  • A supervisor-mode proxy thread that drains feeds and handles registrations in a tight polling loop.

Integration with the existing CarSystem:

  1. Step 0 — Supervisor mode: direct task_wdt_add() / task_wdt_feed() calls inside task_method(), initialised from main().
  2. Step 1 — Userspace mode: the proxy is initialised instead; every task goes through task_wdt_proxy_register() / task_wdt_proxy_feed(), and the proxy message queues are granted to each user thread before it starts.

What you’ll learn

  • Why task_wdt_add() / task_wdt_feed() cannot be called from user-mode threads and how the Zephyr syscall boundary works.
  • The proxy pattern: delegating privileged operations to a dedicated supervisor thread via message queues.
  • How to design synchronous (registration) versus asynchronous (feed) inter-thread communication, and the trade-offs of each.
  • Why proxy globals must be placed in zpp_lib_partition and how the PROXY_DATA macro achieves this.
  • How to grant user threads access to kernel message queue objects.

What you’ll need

Ensure userspace is enabled

This codelab only makes sense when CONFIG_USERSPACE=y. The supervisor-mode path (Step 0) already exists; you are adding the proxy layer on top for the userspace path.

Concepts: Why the Proxy is Necessary

The Zephyr syscall boundary

When a thread runs in user mode, the CPU’s MPU enforces a strict boundary:

Operation                               Mechanism                                               User-mode accessible?
Kernel IPC (k_msgq_put, k_sem_give, …)  Syscall — traps to supervisor via the svc instruction   Yes
task_wdt_add() / task_wdt_feed()        Regular C function — executes inline in caller context  No — MPU fault
k_object_access_grant()                 Syscall                                                 Yes (from supervisor; grants another thread access)

The task watchdog subsystem was designed for supervisor-mode threads. Its API functions are not wrapped as syscalls, so they cannot be called directly from a user thread.
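
To make the boundary concrete, here is a minimal sketch (illustrative only, not code from the codelab) of what the table rows mean for a thread running in user mode:

// Sketch: a user-mode thread under CONFIG_USERSPACE=y.
void user_task_entry() {
  // Compiles and links fine, since task_wdt_feed() is an ordinary function.
  // At runtime it executes in unprivileged context and faults as soon as it
  // touches kernel-only data.
  task_wdt_feed(0);

  // Kernel IPC behaves differently: k_msgq_put() is a syscall wrapper, so the
  // CPU traps to supervisor mode, which validates the object pointer and
  // performs the operation on the thread's behalf.
  // k_msgq_put(&granted_queue, &msg, K_NO_WAIT);  // granted_queue: hypothetical
}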

The proxy pattern

The solution is a permanent supervisor-mode helper thread — the proxy — that owns the task watchdog calls and exposes the functionality through message queues:

User thread (user mode)            Proxy thread (supervisor mode)
─────────────────────────          ────────────────────────────────
task_wdt_proxy_register()          while (true) {
  put  register_req_queue    ──▶    if nbr_queued(feed) > 0:
  get  register_resp_queue   ◀──      try_get(feed)  task_wdt_feed()
                                     if nbr_queued(reg_req) > 0:
task_wdt_proxy_feed()                  try_get(reg_req)
  if not full:                            task_wdt_add()
    put  feed_queue          ──▶        put(reg_resp)
                                     sleep 10 ms
                                   }

Two communication patterns:

Channel              Direction      Blocking?                         Why
register_req_queue   User → Proxy   Blocking put (1 s timeout)        One registration at a time; caller must get a channel ID back
register_resp_queue  Proxy → User   Blocking get (1 s timeout)        Caller waits for the ID
feed_queue           User → Proxy   Non-blocking (occupancy-guarded)  A stalling feed defeats the purpose of the watchdog

Why is feeding non-blocking?

If task_wdt_proxy_feed() blocked, a slow proxy would stall the calling task — which might cause it to miss its own watchdog deadline. A dropped feed (queue full) is logged at error level and is far more visible than a silent livelock.
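
From the caller's side the contract looks like this (a sketch using the proxy API built in Step 1; wd_id is the channel ID obtained at registration):

int ret = car_system::task_wdt_proxy_feed(wd_id);  // never blocks the task
if (ret == -EAGAIN) {
  // The feed was dropped because the queue was full. With a timeout of 2x the
  // period and one feed per period, a single dropped feed still leaves a full
  // period of margin before the watchdog fires.
}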

Step 0 — Supervisor Mode (Direct API)

In supervisor mode the task watchdog API is used directly. This step consolidates the existing code into a single reviewable picture before the proxy layer is added.

0.1 — Kconfig

Enable the task watchdog subsystem in prj.conf:

CONFIG_APP_TASK_WDT=y

The APP_TASK_WDT option (defined in the project Kconfig) selects TASK_WDT, which pulls in Zephyr’s software watchdog subsystem.

config APP_TASK_WDT
  bool "Enable task-level watchdog"
  select TASK_WDT
  default y
  help
    Enables per-thread software watchdog channels via task_wdt_init().
Full Kconfig
menu "Zephyr"
source "Kconfig.zephyr"
endmenu

config PERIODIC_TASKS
  bool "Build system with periodic tasks"
  default n
  help
    This option must be enabled for adding periodic tasks to the CarSystem.

config APERIODIC_TASKS
  bool "Build system with aperiodic tasks"
  depends on PERIODIC_TASKS
  default n
  help
    This option must be enabled for adding aperiodic tasks to the CarSystem.
    This option cannot be enabled without PERIODIC_TASKS.

config TASK_DEPENDENCIES
  bool "Build system with task dependencies"
  depends on PERIODIC_TASKS
  default n
  help
    This option must be enabled for adding dependencies between periodic tasks.
    This option cannot be enabled without PERIODIC_TASKS.

config PERIOD_OFFSET_TOLERANCE
  int "Value in microseconds used for checking period offset tolerance (TaskRecorder)"
  depends on TEST
  default 700
  help 
    This option allows to specify the period offset tolerance to be used in TaskRecorder

config COMPUTATION_TIME_OFFSET_TOLERANCE
  int "Value in microseconds used for checking computation time offset tolerance (TaskRecorder)"
  depends on TEST
  default 300
  help 
    This option allows to specify the computation time offset tolerance to be used in TaskRecorder

choice PROD_CONSUMER_MODE
  prompt "Producer-consumer mode"
  default PROD_CONSUMER_NONE
  help
    Select the producer-consumer demonstration mode.
    Only one mode can be active at a time.

config PROD_CONSUMER_NONE
  bool "Disabled"

config PROD_CONSUMER
  bool "Standalone producer-consumer demo"
  help
    Enable a producer-consumer subsystem where a producer thread writes
    sensor data to a shared variable (protected by a mutex) and signals
    a consumer thread via a semaphore. Works in both supervisor and
    userspace modes.

config PROD_CONSUMER_INTEGRATED
  bool "Engine/Display producer-consumer demo"
  depends on PERIODIC_TASKS
  help
    When enabled, the Engine periodic task acts as a producer and the
    Display periodic task acts as a consumer. Engine writes SensorData
    every period and Display reads it, demonstrating producer-consumer
    with existing threads rather than dedicated ones. Works in both
    supervisor and userspace modes.

endchoice

config PROD_CONSUMER_PARTITIONS
  bool "Use separate memory partitions for producer and consumer"
  depends on (PROD_CONSUMER || PROD_CONSUMER_INTEGRATED) && USERSPACE
  default n
  help
    When enabled, the producer and consumer each get their own memory
    partition (producer_partition, consumer_partition) plus a shared
    partition (pc_shared_partition) for the data exchanged between them.
    When disabled, both threads share app_partition.

config PROD_CONSUMER_SLAB
  bool "Use a memory slab for zero-copy producer-consumer exchange"
  depends on PROD_CONSUMER
  default n
  help
    When enabled, the producer allocates SensorData blocks from a
    k_mem_slab pool and passes pointers via message queues to the
    consumer, which returns them after processing. This eliminates
    data copies and the mutex. In userspace mode, the slab is
    initialized from supervisor context at startup and only
    message queue syscalls are used at runtime.

config PROD_CONSUMER_SLAB_NUM_BLOCKS
  int "Number of memory slab blocks for producer-consumer"
  depends on PROD_CONSUMER_SLAB
  default 4
  range 2 16
  help
    Number of fixed-size SensorData blocks pre-allocated in the slab.
    Controls how many samples can be in-flight between the producer
    and consumer at the same time.

config APP_WATCHDOG
  bool "Enable system hardware watchdog"
  select WATCHDOG
  default y
  help
    Enables the hardware watchdog and its feeder thread. When disabled,
    the initSystemWatchdog() call in main.cpp is compiled out.

if APP_WATCHDOG

config WDT_FEEDER_ZPP_THREAD
  bool "Use a zpp_lib thread for the system watchdog feeder"
  depends on USE_ZPP_LIB
  default y
  help
    When enabled, the system watchdog feeder uses a zpp_lib thread at idle priority.
    When disabled, a native K_THREAD_DEFINE thread is used instead.

config WDT_FEEDER_MAX_FEEDS
  int "Maximum number of watchdog feeds (0 = infinite)"
  default 0
  range 0 2147483647
  help
    Controls how many times the feeder thread feeds the hardware watchdog
    before stopping. 0 means feed indefinitely (production behaviour).
    Set to a positive value to deliberately let the watchdog fire after
    that many feeds, which is useful for testing the watchdog reset path.
    Negative values are invalid.

config APP_TASK_WDT
  bool "Enable task-level watchdog"
  select WATCHDOG
  select TASK_WDT
  default y
  help
    Enables per-thread software watchdog channels via task_wdt_init().
    Requires APP_WATCHDOG as hardware fallback.

# Set TASK_WDT_CHANNELS default to ZPP_THREAD_POOL_SIZE if zpp_lib is used
config TASK_WDT_CHANNELS
    int "Maximum number of task watchdog channels (override)"
    depends on APP_TASK_WDT && USE_ZPP_LIB
    default ZPP_THREAD_POOL_SIZE
    help
      Override: Set the default number of task watchdog channels to match the zpp thread pool size.

endif  # APP_WATCHDOG

module = APP
module-str = APP
source "subsys/logging/Kconfig.template.log_config"

One thing to note compared to the simplified excerpt above:

  • TASK_WDT_CHANNELS is automatically set to ZPP_THREAD_POOL_SIZE when zpp_lib is in use. This means the feed queue and the k_msgq backing it are sized to exactly the number of threads that could ever register a channel — no manual tuning needed.
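
As a worked example, assuming CONFIG_ZPP_THREAD_POOL_SIZE=8 (the value is project-specific):

# Derived sizing, no manual tuning:
CONFIG_TASK_WDT_CHANNELS = 8                   # one channel per possible thread
feed queue capacity      = 8 messages          # CONFIG_TASK_WDT_CHANNELS entries
feed buffer size         = sizeof(int) * 8 = 32 bytes   # feed_buffer in Step 1.2b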

0.2 — Initialise the subsystem in main()

The watchdog subsystem must be initialised before any task registers a channel. In main(), before carSystem.start():

#if CONFIG_APP_TASK_WDT && !CONFIG_USERSPACE
  int err = task_wdt_init(nullptr);
  if (err < 0) {
    LOG_ERR("Task watchdog init failed: %d", err);
    return err;
  }
#endif  // CONFIG_APP_TASK_WDT && !CONFIG_USERSPACE

The nullptr argument means “use the hardware watchdog as the fallback if CONFIG_TASK_WDT_HW_FALLBACK is enabled; otherwise just fail gracefully when all software channels have expired”.
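
If you do want the hardware fallback, pass a real watchdog device instead of nullptr. A sketch, assuming the board devicetree defines a watchdog0 alias and CONFIG_TASK_WDT_HW_FALLBACK=y:

#include <zephyr/device.h>
#include <zephyr/task_wdt/task_wdt.h>

const struct device* hw_wdt = DEVICE_DT_GET(DT_ALIAS(watchdog0));
int err = task_wdt_init(hw_wdt);  // software channels backed by the hardware WDT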

0.3 — Register and feed inside task_method()

In car_system.cpp, at the top of task_method() (after the barrier wait), each task registers one channel whose timeout is twice its period — a generous margin that allows one full period to be late before the watchdog fires (a 50 ms task, for example, gets a 100 ms channel):

#if CONFIG_TASK_WDT && !CONFIG_USERSPACE
  const uint32_t wdt_timeout_ms = static_cast<uint32_t>((2 * taskInfo._period).count());
  int wd_id = task_wdt_add(wdt_timeout_ms, my_task_wd_callback, nullptr);
  if (wd_id < 0) {
    LOG_ERR("Failed to add task watchdog channel");
    return;
  }
#endif

In the task method/function, at the bottom of the loop (after computation, before sleep_until()):

#if CONFIG_TASK_WDT && !CONFIG_USERSPACE
  task_wdt_feed(wd_id);
#endif

The callback defined at file scope is called by the watchdog subsystem if the timeout expires:

static void my_task_wd_callback(int channel_id, void* user_data) {
  LOG_ERR("Thread on channel %d is stuck!", channel_id);
  // Example recovery policy: abort the stuck thread and signal a supervisor
  // to restart it (sensor_tid and restart_sem are file-scope objects in this
  // example).
  k_thread_abort(sensor_tid);
  k_sem_give(&restart_sem);
}

ISR context

The watchdog callback runs in ISR context (or in a dedicated work-queue thread, depending on the Zephyr version). Keep it short: log, abort the offending thread, signal a supervisor. Never call blocking APIs.
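
A variant worth considering (a sketch, not the codelab's implementation): pass the calling thread's ID as user_data at registration, so the callback needs no per-task global such as sensor_tid:

// At registration: task_wdt_add(wdt_timeout_ms, my_task_wd_callback, k_current_get());
static void my_task_wd_callback(int channel_id, void* user_data) {
  k_tid_t tid = static_cast<k_tid_t>(user_data);
  LOG_ERR("Thread %p on channel %d is stuck!", (void*)tid, channel_id);
  k_thread_abort(tid);  // ISR-safe; defer heavier recovery to a supervisor thread
}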

0.4 — Build and verify

west build -b nrf5340dk/nrf5340/cpuapp car_system --pristine

Each task logs a feed line every period:

Task watchdog refreshed (iteration=1)
Task watchdog refreshed (iteration=2)
...

Stop feeding one task (e.g., by injecting a long busy_wait) and confirm the callback fires within 2× that task’s period.
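
One way to inject the fault (a sketch; the kInjectHang flag is hypothetical):

// Inside one task's loop, in place of the normal computation:
if (kInjectHang) {
  k_busy_wait(3 * wdt_timeout_ms * 1000U);  // busy-wait in µs, well past 2x the period
}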

Questions

  1. What is the difference between the hardware watchdog (CONFIG_APP_WATCHDOG) and the task watchdog (CONFIG_APP_TASK_WDT)? When would you need both?
  2. Why is the timeout set to 2 × period rather than 1 × period?
  3. What happens to the system if task_wdt_init() is called after the first task thread has already started?

Step 1 — Userspace Mode (Proxy)

With CONFIG_USERSPACE=y, the direct calls to task_wdt_add() and task_wdt_feed() still compile, but executing them from a user-mode thread faults at runtime. This step adds the supervisor-mode proxy.

1.1 — Create task_wdt_proxy.hpp

The header exposes four public functions and two request/response structs, all guarded by CONFIG_APP_TASK_WDT && CONFIG_USERSPACE:

// task_wdt_proxy.hpp
#pragma once

#if CONFIG_APP_TASK_WDT && CONFIG_USERSPACE

#include <zephyr/kernel.h>
#include <zephyr/task_wdt/task_wdt.h>

namespace car_system {

struct TaskWdtRegisterRequest {
  uint32_t reload_period_ms;
  task_wdt_callback_t callback;
  void* user_data;
};

struct TaskWdtRegisterResponse {
  int channel_id;  // >=0 on success, <0 on error
};

// Initialise: calls task_wdt_init() and starts the proxy thread.
// Must be called from supervisor mode (e.g. main()).
int task_wdt_proxy_init();

// Register a new task watchdog channel via the proxy.
// Callable from user-mode threads. Blocks until the proxy processes the
// request and returns the assigned channel_id (or a negative error code).
int task_wdt_proxy_register(uint32_t reload_period_ms,
                            task_wdt_callback_t callback,
                            void* user_data);

// Feed a task watchdog channel via the proxy.
// Callable from user-mode threads. Non-blocking put into the feed queue.
int task_wdt_proxy_feed(int channel_id);

// Grant the given thread access to the proxy message queues.
// Must be called from supervisor mode before the thread drops to user mode.
void task_wdt_proxy_grant_access(k_tid_t tid);

}  // namespace car_system

#endif  // CONFIG_APP_TASK_WDT && CONFIG_USERSPACE

1.2 — Create task_wdt_proxy.cpp

a) The PROXY_DATA partition tag

All message queue objects and the proxy thread must be placed in a memory partition that user-mode threads are allowed to address — otherwise the syscall validation inside k_msgq_put / k_msgq_get would reject the object pointer even after grant_access() has been called.

The appropriate partition is zpp_lib_partition: it is already included in every thread’s domain by init_domain(), and zpp_lib places its own kernel objects there. A PROXY_DATA macro is defined to tag proxy globals into that same partition:

#include <zephyr/app_memory/app_memdomain.h>

#define PROXY_DATA K_APP_DMEM(zpp_lib_partition)

Why zpp_lib_partition and not app_partition?

The message queue objects are allocated at file scope and initialised before main() runs. At that point app_partition may not yet be set up by init_domain(). zpp_lib_partition is defined unconditionally by the build system and is safe to use for pre-main() globals. In addition, placing proxy objects in the same partition as other zpp_lib objects keeps the MPU region count low — each partition costs one MPU region slot.

b) Message queue buffers and objects

Three message queues bridge the privilege boundary. zpp_lib::MessageQueue requires an externally allocated buffer, and both the buffer and the queue object itself must carry the PROXY_DATA tag so that the MPU allows user-mode threads to reach them:

static constexpr int kRegQueueDepth = 1;

// Buffers
PROXY_DATA static char feed_buffer[sizeof(int) * CONFIG_TASK_WDT_CHANNELS];
PROXY_DATA static char reg_req_buffer[sizeof(TaskWdtRegisterRequest) * kRegQueueDepth];
PROXY_DATA static char reg_resp_buffer[sizeof(TaskWdtRegisterResponse) * kRegQueueDepth];

// Queues
PROXY_DATA zpp_lib::MessageQueue<int, CONFIG_TASK_WDT_CHANNELS>
    feed_queue(feed_buffer);

PROXY_DATA zpp_lib::MessageQueue<TaskWdtRegisterRequest, kRegQueueDepth>
    register_req_queue(reg_req_buffer);

PROXY_DATA zpp_lib::MessageQueue<TaskWdtRegisterResponse, kRegQueueDepth>
    register_resp_queue(reg_resp_buffer);

The proxy thread object is also tagged, and created with userMode = false so it stays in supervisor mode:

PROXY_DATA zpp_lib::Thread proxy_thread(
    zpp_lib::PreemptableThreadPriority::PriorityHigh,
    "task_wdt_proxy",
    false /* supervisor mode */);

c) Poll the queues with try_get_for()/try_put_for() and a zero timeout

-ENOMSG vs -EAGAIN in Zephyr

Zephyr message queue functions distinguish two failure cases depending on the timeout argument:

  • K_NO_WAIT (timeout = 0): z_impl_k_msgq_put() / z_impl_k_msgq_get() return -ENOMSG immediately when the queue is full or empty.
  • Finite timeout that expires: the same functions return -EAGAIN once the deadline passes without the operation completing.

In zpp_lib this is mapped explicitly to a boolean value:

ZephyrBoolResult res;
if ((K_TIMEOUT_EQ(k_timeout, K_NO_WAIT) && ret == -ENOMSG) ||
    (!K_TIMEOUT_EQ(k_timeout, K_NO_WAIT) && ret == -EAGAIN)) {
  // timeout -> return false without error
  res.assign_value(false);
} else if (ret != 0) {
  // other failure -> return false with error
  __ASSERT(
      false, "Cannot put message: %d (timeout is %lld msecs)", ret, timeout.count());
  res.assign_value(false);
  res.assign_error(zephyr_to_zpp_error_code(ret));
}
return res;
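
Seen from the caller, a false result is therefore an ordinary "no message / no space" outcome rather than an error. A usage sketch against the proxy's feed queue:

int channel_id;
// 0ms maps to K_NO_WAIT: an empty queue yields false, with no assert and no error.
while (feed_queue.try_get_for(0ms, channel_id)) {
  // a message was dequeued into channel_id
}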

Update zpp_lib

Update zpp_lib to version v1.0: change into deps/zpp_lib and run git checkout tags/v1.0. Failing to do so results in faulty behavior when using try_get_for()/try_put_for(). You should also update your west.yml file accordingly. If you prefer to perform a west update after modifying west.yml, do NOT forget to reapply the patch to the Zephyr RTOS library with git cherry-pick 4a3c5ed1f6f3ae3351fe48f11382c0a94e0aea02.

d) Proxy thread function

static constexpr auto kNoWait = 0ms; 
static constexpr auto kTimeout = 10ms; 

void proxy_thread_func() {
  LOG_INF("Task WDT proxy thread started");

  while (true) {
    // 1. Drain all pending feed requests.
    int channel_id;
    while (feed_queue.try_get_for(kNoWait, channel_id)) {
      int ret = task_wdt_feed(channel_id);
      if (ret < 0) {
        LOG_ERR("task_wdt_feed(%d) failed: %d", channel_id, ret);
      }
    }

    // 2. Check for registration requests.
    TaskWdtRegisterRequest req;
    if (register_req_queue.try_get_for(kNoWait, req)) {
      int id = task_wdt_add(req.reload_period_ms, req.callback, req.user_data);
      if (id < 0) {
        LOG_ERR("task_wdt_add(%u) failed: %d", req.reload_period_ms, id);
      } else {
        LOG_INF("Task WDT channel %d registered (timeout=%u ms)",
                id, req.reload_period_ms);
      }
      TaskWdtRegisterResponse resp{.channel_id = id};
      if (!register_resp_queue.try_put_for(1000ms, resp)) {
        LOG_ERR("Failed to send registration response for channel %d", id);
      }
    }

    // 3. Sleep briefly before polling again.
    //    10 ms keeps feed latency well below the shortest task period (50 ms).
    zpp_lib::ThisThread::sleep_for(kTimeout);
  }
}

Why a polling loop with sleep_for(10ms)?

The proxy must service both feeds and registrations in a round-robin fashion. A blocking wait on one queue would prevent the other from being serviced. Zero-timeout (non-blocking) reads followed by a short sleep_for(10ms) ensure both queues are checked on every iteration without burning CPU.
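
For reference, Zephyr can also block on several queues at once with k_poll(), which removes the polling latency entirely. A sketch of that alternative, assuming the raw struct k_msgq handles (feed_msgq and reg_req_msgq, hypothetical names) were accessible instead of the zpp_lib wrappers:

struct k_poll_event events[2] = {
    K_POLL_EVENT_STATIC_INITIALIZER(K_POLL_TYPE_MSGQ_DATA_AVAILABLE,
                                    K_POLL_MODE_NOTIFY_ONLY, &feed_msgq, 0),
    K_POLL_EVENT_STATIC_INITIALIZER(K_POLL_TYPE_MSGQ_DATA_AVAILABLE,
                                    K_POLL_MODE_NOTIFY_ONLY, &reg_req_msgq, 0),
};

// Wakes as soon as either queue has data, instead of a 10 ms worst case.
k_poll(events, 2, K_FOREVER);

The sleep-based loop is kept here because it stays within the zpp_lib API, and its 10 ms worst case is comfortably below every watchdog timeout.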

e) Public API implementations

task_wdt_proxy_init() calls the real task_wdt_init() (supervisor privilege required) then starts the proxy thread:

int task_wdt_proxy_init() {
  int ret = task_wdt_init(nullptr);
  if (ret < 0) {
    LOG_ERR("task_wdt_init failed: %d", ret);
    return ret;
  }
  auto res = proxy_thread.start(proxy_thread_func);
  if (!res) {
    LOG_ERR("Failed to start task WDT proxy thread: %d",
            static_cast<int>(res.error()));
    return -ENODEV;
  }
  LOG_INF("Task WDT proxy initialized");
  return 0;
}

task_wdt_proxy_register() is synchronous — it puts a request and blocks until the proxy responds with the channel ID:

int task_wdt_proxy_register(uint32_t reload_period_ms,
                            task_wdt_callback_t callback,
                            void* user_data) {
  TaskWdtRegisterRequest req{
      .reload_period_ms = reload_period_ms,
      .callback         = callback,
      .user_data        = user_data,
  };
  if (!register_req_queue.try_put_for(1000ms, req)) {
    LOG_ERR("Registration request queue full or timed out");
    return -ETIMEDOUT;
  }
  TaskWdtRegisterResponse resp;
  if (!register_resp_queue.try_get_for(1000ms, resp)) {
    LOG_ERR("Registration response timed out");
    return -ETIMEDOUT;
  }
  return resp.channel_id;
}

task_wdt_proxy_feed() is asynchronous. The occupancy check avoids the -ENOMSG assertion trap described earlier — if the queue is full a feed is simply dropped and logged:

int task_wdt_proxy_feed(int channel_id) {
  // Check occupancy first: try_put_for(0ms) maps to K_NO_WAIT and returns
  // -ENOMSG on a full queue (a case pre-v1.0 zpp_lib asserted on); the
  // explicit check also lets us log the drop and return -EAGAIN.
  if (feed_queue.get_nbr_of_queued_messages() >= CONFIG_TASK_WDT_CHANNELS) {
    LOG_ERR("WDT feed queue full — channel %d feed dropped", channel_id);
    return -EAGAIN;
  }
  // The check above makes space likely but not guaranteed (another task may
  // feed concurrently), so the 10 ms timeout acts as the safety net.
  static_cast<void>(feed_queue.try_put_for(10ms, channel_id));
  return 0;
}

task_wdt_proxy_grant_access() grants a thread access to all three queues so it can call put/get from user mode:

void task_wdt_proxy_grant_access(k_tid_t tid) {
  feed_queue.grant_access(tid);
  register_req_queue.grant_access(tid);
  register_resp_queue.grant_access(tid);
}

1.3 — Initialise the proxy in main()

Replace the direct task_wdt_init() call with the proxy initialiser, guarded by the userspace flag. The full conditional block in main() reads:

#if CONFIG_APP_TASK_WDT && !CONFIG_USERSPACE
  int err = task_wdt_init(nullptr);
  if (err < 0) {
    LOG_ERR("Task watchdog init failed: %d", err);
    return err;
  }
#elif CONFIG_APP_TASK_WDT && CONFIG_USERSPACE
  int err = car_system::task_wdt_proxy_init();
  if (err < 0) {
    LOG_ERR("Task watchdog proxy init failed: %d", err);
    return err;
  }
#endif  // CONFIG_APP_TASK_WDT

The include at the top of main.cpp:

#if CONFIG_APP_TASK_WDT && CONFIG_USERSPACE
#include "task_wdt_proxy.hpp"
#endif

1.4 — Grant proxy queue access in CarSystem::start()

Before the task threads drop to user mode they must be granted access to all three proxy queues. In the existing CONFIG_USERSPACE block inside start(), alongside the barrier grants, add:

#if CONFIG_USERSPACE
  for (uint8_t taskIndex = 0; taskIndex < kNbrOfPeriodicTasks; taskIndex++) {
    k_tid_t tid = _threads[taskIndex].get_tid();
    _barrier.grant_access(tid);
#if CONFIG_APP_TASK_WDT
    car_system::task_wdt_proxy_grant_access(tid);
#endif  // CONFIG_APP_TASK_WDT
  }
  // ... existing PROD_CONSUMER_INTEGRATED grants ...
#endif  // CONFIG_USERSPACE

Grant before the thread executes its first syscall

task_wdt_proxy_grant_access() must be called after the thread TID is known (i.e., after _threads[taskIndex].start()) but before the thread executes its first put/get from user mode. The CarSystem::start() sequence already satisfies this: threads are started, grants are applied, and then the threads proceed through the barrier.

1.5 — Use the proxy API in task_method()

Replace the two direct watchdog calls with their proxy equivalents, guarded by the appropriate flags:

  • Registration (runs once, after the barrier):
#if CONFIG_TASK_WDT && !CONFIG_USERSPACE
  const uint32_t wdt_timeout_ms = static_cast<uint32_t>((2 * taskInfo._period).count());
  int wd_id = task_wdt_add(wdt_timeout_ms, my_task_wd_callback, nullptr);
  if (wd_id < 0) {
    LOG_ERR("Failed to add task watchdog channel");
    return;
  }
#elif CONFIG_APP_TASK_WDT && CONFIG_USERSPACE
  const uint32_t wdt_timeout_ms = static_cast<uint32_t>((2 * taskInfo._period).count());
  int wd_id = car_system::task_wdt_proxy_register(
      wdt_timeout_ms, my_task_wd_callback, nullptr);
  if (wd_id < 0) {
    LOG_ERR("Failed to add task watchdog channel via proxy");
    return;
  }
#endif
  • Feeding (runs every period, after computation, before sleep_until()):
#if CONFIG_TASK_WDT && !CONFIG_USERSPACE
  LOG_DBG("Task watchdog refreshed (iteration=%d)", iteration++);
  task_wdt_feed(wd_id);
#elif CONFIG_APP_TASK_WDT && CONFIG_USERSPACE
  LOG_DBG("Task watchdog refreshed via proxy (iteration=%d)", iteration++);
  const auto feedRes = car_system::task_wdt_proxy_feed(wd_id);
  if (feedRes != 0) {
    LOG_ERR("Task watchdog proxy feed failed (wd_id=%d, err=%d)", wd_id, feedRes);
  }
#endif
  • iteration counter — declared once, covers both branches because CONFIG_APP_TASK_WDT selects CONFIG_TASK_WDT through Kconfig:
#if CONFIG_TASK_WDT
  int iteration = 1;
#endif

1.6 — Configure prj_user_mode.conf

The userspace overlay file prj_user_mode.conf is inherited from Part 2. Three entries must be added (or verified present) for the task watchdog proxy:

# task_wdt_proxy needs 3 message queues (feed + register_req + register_resp)
CONFIG_ZPP_MSGQ_POOL_SIZE=3

# Enable watchdog — system (hardware fallback) and task (software channels)
CONFIG_APP_WATCHDOG=y
CONFIG_APP_TASK_WDT=y

Why each entry is needed:

  • CONFIG_ZPP_MSGQ_POOL_SIZE=3: zpp_lib pre-allocates MessageQueue control blocks from a fixed pool. The proxy creates three queues (feed, register_req, register_resp); without this setting, construction of the third queue asserts.
  • CONFIG_APP_WATCHDOG=y: APP_TASK_WDT selects WATCHDOG (the hardware backend). Without the hardware watchdog enabled, the build fails.
  • CONFIG_APP_TASK_WDT=y: enables the task watchdog subsystem and compiles in the proxy layer under CONFIG_APP_TASK_WDT && CONFIG_USERSPACE.

The complete prj_user_mode.conf (for reference):

Full prj_user_mode.conf
# enable user mode
CONFIG_USERSPACE=y
CONFIG_LOG_MODE_MINIMAL=y
# ProdConsumer adds 1 semaphore + 1 barrier (default pool size of 2 is not enough)
CONFIG_ZPP_SEMAPHORE_POOL_SIZE=4
# task_wdt_proxy needs 3 message queues (feed + register_req + register_resp)
CONFIG_ZPP_MSGQ_POOL_SIZE=3
# Enable INFO-level app logging so ProdConsumer output is visible
CONFIG_APP_LOG_LEVEL_INF=y
# Enable producer_consumer (integrated) — requires PERIODIC_TASKS
CONFIG_PERIODIC_TASKS=y
CONFIG_PROD_CONSUMER_INTEGRATED=y
# Enable separate memory partitions for producer/consumer
CONFIG_PROD_CONSUMER_PARTITIONS=y
# Enable watchdog - system and task
CONFIG_APP_WATCHDOG=y
CONFIG_APP_TASK_WDT=y

1.7 — Build and verify

Build with the userspace overlay:

west build -b nrf5340dk/nrf5340/cpuapp car_system --pristine \
    --extra-conf="prj_user_mode.conf"

Expected log output:

Task WDT proxy initialized
Task WDT proxy thread started
Task WDT channel 0 registered (timeout=100 ms)
Task WDT channel 1 registered (timeout=250 ms)
Task WDT channel 2 registered (timeout=400 ms)
Task WDT channel 3 registered (timeout=500 ms)
Task watchdog refreshed via proxy (iteration=1)
Task watchdog refreshed via proxy (iteration=2)
...

The channel IDs increase monotonically as each task registers during its startup phase. All four tasks refresh their channels every period without any visible delay compared to the supervisor-mode baseline.

Questions

  1. The registration queue has depth 1. What would happen if two tasks tried to register simultaneously? Is this a realistic scenario in this design?
  2. Why does task_wdt_proxy_feed() check get_nbr_of_queued_messages() instead of calling try_put_for(0ms, …) directly?
  3. The proxy thread sleeps for 10 ms between polls. The shortest task period is 50 ms. What is the worst-case delay between a user thread calling task_wdt_proxy_feed() and the underlying task_wdt_feed() actually executing?
  4. Why are the message queue objects tagged with PROXY_DATA / K_APP_DMEM(zpp_lib_partition) even though they also receive grant_access() calls?
  5. What would happen if you forgot to call task_wdt_proxy_grant_access() for a task thread?

Sequence Diagram

main()          proxy_thread         Engine task (user mode)
  │                │                        │
  │ task_wdt_proxy_init()                   │
  │──────────────▶│                         │
  │               │ task_wdt_init()         │
  │               │ (supervisor OK)         │
  │               │                         │
  │ start() + grant_access(Engine tid)      │
  │─────────────────────────────────────▶   │
  │                                         │ (barrier wait)
  │               │          task_wdt_proxy_register()
  │               │◀──── reg_req_queue ─────│
  │               │ task_wdt_add()          │
  │               │──── reg_resp_queue ────▶│ wd_id = 0
  │               │                         │
  │               │  (every period…)        │
  │               │          task_wdt_proxy_feed(0)
  │               │◀──── feed_queue ────────│
  │               │ task_wdt_feed(0)        │
  │               │                         │

Summary

Aspect                    Step 0: Supervisor                 Step 1: Userspace proxy
WDT init                  task_wdt_init() in main()          task_wdt_proxy_init() in main()
Registration              task_wdt_add() in task_method()    task_wdt_proxy_register() in task_method()
Feeding                   task_wdt_feed() in task_method()   task_wdt_proxy_feed() in task_method()
Proxy thread              None                               High-priority supervisor thread
Queue grants              None                               task_wdt_proxy_grant_access() per thread
Queue + object placement  N/A                                PROXY_DATA → zpp_lib_partition
Feed latency              0 (direct)                         ≤ 10 ms (proxy polling period)
Registration latency      0 (direct)                         ≤ 20 ms (two proxy polling cycles)
Zero-wait queue calls     N/A                                Not used by user threads — occupancy check + 10 ms timeout

The proxy pattern is a general technique applicable to any non-syscall API that must be used from user-mode threads. The key design decisions are:

  • PROXY_DATA partition tag: proxy globals must live in a partition all threads can address, so their pointers are valid when passed to syscalls.
  • Occupancy-guarded reads/writes: zpp_lib maps zero-duration timeouts to K_NO_WAIT.
  • Asynchronous feeds (occupancy-checked, non-blocking): protect caller timing at the cost of at most one dropped feed in extreme circumstances.
  • Synchronous registration (blocking request/response pair): the caller must receive a valid channel ID before it can feed, so blocking is unavoidable.
  • Polling with a short sleep (10 ms): simple, predictable, low CPU cost, and feeds are serviced well within any task’s watchdog timeout.
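
As a closing illustration, the same three-piece shape transplants to any other privileged call. A skeleton sketch with hypothetical names (PrivilegedRequest, PrivilegedResponse, privileged_op, req_queue, resp_queue):

// Generic proxy skeleton: user threads enqueue requests; one supervisor
// thread owns the privileged call and ships results back.
struct PrivilegedRequest  { int arg; };
struct PrivilegedResponse { int result; };

void generic_proxy_thread() {
  while (true) {
    PrivilegedRequest req;
    if (req_queue.try_get_for(0ms, req)) {             // non-blocking poll
      PrivilegedResponse resp{.result = privileged_op(req.arg)};
      resp_queue.try_put_for(1000ms, resp);            // bounded response put
    }
    zpp_lib::ThisThread::sleep_for(10ms);              // polling period
  }
}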

Going beyond / References