Robust Design Patterns - Part 1
Introduction
In this codelab, we address the problem of unresponsive applications. “Unresponsive” refers to the situation in which a program installed on an MCU becomes unavailable for various reasons (e.g. a CPU-gobbling task, a deadlock, …).
What you’ll build
In this codelab, you will reuse the application from the previous codelabs (or that of the project) and add a watchdog to it, ensuring that the system “always” returns to a known, running state no matter what goes wrong in the software.
What you’ll learn
- What types of Watchdogs various platforms offer and what their main differences are.
- How to dimension and implement a Watchdog - the solution of last resort for safe and reliable systems.
- The inner workings of the watchdog on the target we are using.
- How to recover the reset reason of a system.
- How to keep values in memory so that they survive a reboot.
- How to make an idle system visible… 🙃
What you’ll need
- You need to have finished the Digging into Zephyr RTOS codelab.
- You need to have completed the Scheduling of periodic tasks codelab.
- You need to have finalized the Robust Development Methodologies (I) codelab and the Robust Development Methodologies (II) codelab.
- The codelabs related to scheduling (part 1 and part 2) must be concluded.
Introduction to Watchdog
In the first part of the codelab, we will learn about the watchdog options available on our board. The board we are using for this course (nRF5340) offers multiple independent watchdogs, called WDT, that are used to detect and resolve malfunctions due to software failures. A watchdog works by triggering a reset sequence when it is not refreshed within the expected time window. As can be seen below, the board offers two watchdogs for the application core and one for the network core.

The watchdogs rely on an independent 32.768 kHz low-speed clock
(LFCLK - for which multiple sources are possible) and thus remain active even
if the main clock (HFCLK) fails, as can be seen in the picture below.

As stated, there are two watchdogs that can be activated on the application core. In the present codelab we are going to put WDT0 into service.
Questions
Can you explain:
- what the interest of having two system watchdogs is?
- whether it is possible to avoid a reset while debugging with the WDT activated?
Dimensioning the watchdog
So, the first question one asks is: “How long should I wait before the reset is triggered?”. That is, how big should the refresh window be? This corresponds to the maximum time allowed without a refresh from the application before a reset is issued. The calculation assumes that the idle task shall run at least every second engine-task period (see Phase A - Periodic tasks), because at design time it was calculated that the system shall have spare capacity. Not being able to run the idle task therefore means that there is no such spare capacity, which is deemed an unsustainable situation.
If one reads the nRF5340 manual, one sees that the formula for calculating the maximum time allowed without a refresh is the following:
\(timeout [s] = \frac{ CRV_{reg} + 1 }{32768}\)
Questions
- First of all: calculate the parameters of the watchdog so as to comply with the timing requirements stated above.
- Secondly, although a watchdog is very important, it can be a burden while debugging, as it triggers resets at speeds a human being cannot handle (in the milliseconds to single-digit seconds range). So: how could we tackle this?
Danger
The present solution assumes that the application cannot withstand a situation in which the idle task is not served for a whole period, as this would imply the system has no free capacity left - something deemed impossible at design time. Obviously, this is not how the watchdog is applied most of the time. In fact, the watchdog is more often used as a solution of last resort: its application is scrutinized carefully, resulting in a slower triggering of such a measure, for instance by applying other recovery measures first before resetting the system.
To avoid landing in a difficult situation, the chosen timeout is 10x the period of the idle task. The reason is that the watchdog survives a reboot of the system, so the boot sequence shall have time to execute before the watchdog bites.
Note
In reality this computation is already done by Zephyr RTOS; we have the luxury of simply specifying the time in ms in our code.
Implementing a watchdog
First of all, we need to ensure that a task with a very low priority exists. For this, we define a new task with the second-lowest priority, just above the idle task. Concretely,
- activate watchdog support in your `prj.conf`

  ```conf
  # configure system watchdog
  CONFIG_WATCHDOG=y
  ```

- add an `overlay` granting access to the watchdog

  ```dts
  /*
   * Copyright 2024 Nordic Semiconductor ASA
   * SPDX-License-Identifier: Apache-2.0
   */
  &wdt0 {
      status = "okay";
  };
  ```

- include the following into your `main.cpp`:

  ```cpp
  using namespace std::chrono_literals;

  static constexpr auto kWdtTimeOut{500ms};
  static constexpr auto k50msTimeOut{50ms};
  static constexpr auto kIdleThreadStackSize{512};

  // Get the watchdog device from the Devicetree alias and prepare channel ID
  static const struct device *const wdt = DEVICE_DT_GET(DT_ALIAS(watchdog0));
  static int wdt_channel_id;

  // Function for refreshing the watchdog
  static void wdt_feed_thread(void *p1, void *p2, void *p3) {
      while (1) {
          wdt_feed(wdt, wdt_channel_id);
          LOG_DBG("System watchdog refreshed");
          k_sleep(K_MSEC(k50msTimeOut.count()));
      }
  }
  ```

More elegant and, especially, more secure
In the above code, we are forced to convert the `std::chrono` duration into an untyped value for its use in `K_MSEC(k50msTimeOut.count())`. If we used the appropriate `zpp_lib` method, this would - elegantly, and more securely, since the type would be checked - become:

```cpp
zpp_lib::ThisThread::sleep_for(k50msTimeOut);
```

Decide whether you want a `c-style` or a `zpp_lib` idle thread definition.

**c-style**

```cpp
// Thread for refreshing the watchdog when there is nothing else to do
K_THREAD_DEFINE(wdt_feeder, kIdleThreadStackSize, wdt_feed_thread,
                wdt, &wdt_channel_id, nullptr,
                CONFIG_NUM_PREEMPT_PRIORITIES - 1, 0, -1);
```

**zpp_lib**

```cpp
#include "zpp_include/thread.hpp"
#include "zpp_include/types.hpp"

zpp_lib::Thread wdt_feeder(
    zpp_lib::PreemptableThreadPriority::PriorityIdle,
    "wdt_feeder"  // optional name
);
```

and then instantiate what is needed within an
`initSystemWatchdog()` function.

**c-style**

```cpp
static int initSystemWatchdog() {
    int err;

    // 1. Check that the watchdog device is ready
    if (!device_is_ready(wdt)) {
        LOG_ERR("Watchdog device not ready");
        return -ENODEV;
    }

    // 2. Configure the timeout behavior
    struct wdt_timeout_cfg wdt_config = {
        .window = {
            .min = 0U,                   // No minimum wait time
            .max = kWdtTimeOut.count(),  // Maximum window
        },
        .callback = nullptr,             // No callback (direct reset)
        .flags = WDT_FLAG_RESET_SOC,     // Reset the whole chip on timeout
    };

    // 3. Install the timeout and get a channel ID
    wdt_channel_id = wdt_install_timeout(wdt, &wdt_config);
    if (wdt_channel_id < 0) {
        LOG_ERR("System Watchdog install error: %d", wdt_channel_id);
        return wdt_channel_id;
    }

    // 4. Start the watchdog with optional pause-on-debug
    err = wdt_setup(wdt, WDT_OPT_PAUSE_HALTED_BY_DBG);
    if (err < 0) {
        LOG_ERR("System Watchdog setup error: %d", err);
        return err;
    }

    // 5. Start the watchdog refresh thread
    k_thread_start(wdt_feeder);

    LOG_INF("System Watchdog started! Feed it within %lld ms...", kWdtTimeOut.count());
    return 0;
}
```

**zpp_lib**
```cpp
static int initSystemWatchdog() {
    int err;

    // 1. Check that the watchdog device is ready
    if (!device_is_ready(wdt)) {
        LOG_ERR("Watchdog device not ready");
        return -ENODEV;
    }

    // 2. Configure the timeout behavior
    struct wdt_timeout_cfg wdt_config = {
        .window = {
            .min = 0U,                   // No minimum wait time
            .max = kWdtTimeOut.count(),  // Maximum window
        },
        .callback = nullptr,             // No callback (direct reset)
        .flags = WDT_FLAG_RESET_SOC,     // Reset the whole chip on timeout
    };

    // 3. Install the timeout and get a channel ID
    wdt_channel_id = wdt_install_timeout(wdt, &wdt_config);
    if (wdt_channel_id < 0) {
        LOG_ERR("Watchdog install error: %d", wdt_channel_id);
        return wdt_channel_id;
    }

    // 4. Start the watchdog with optional pause-on-debug
    err = wdt_setup(wdt, WDT_OPT_PAUSE_HALTED_BY_DBG);
    if (err < 0) {
        LOG_ERR("System Watchdog setup error: %d", err);
        return err;
    }

    // 5. Start the watchdog refresh - attention to the lifetime of the objects used by the lambda
    err = wdt_feeder.start([&]() {
        wdt_feed_thread(static_cast<void *>(const_cast<struct device *>(wdt)),
                        &wdt_channel_id, nullptr);
    });
    if (err < 0) {
        LOG_ERR("System Watchdog start error: %d", err);
        return err;
    }

    LOG_INF("System Watchdog started! Feed it within %lld ms...", kWdtTimeOut.count());
    return 0;
}
```

Danger
Pay attention to the lifetime of the objects used by the lambda - since they are captured by reference, this would become an issue if the enclosing scope exited before the lambda finished executing. (Here both `wdt` and `wdt_channel_id` have static storage duration, so they outlive the thread.)
Note
Obviously, the initSystemWatchdog() function shall be called from within main().
From this point onwards, every time your very own idle task
is active, you should be able to spot it if you set the right
LOG level 😎.
Warning
In order to make it work, you obviously need to:

- define the constant `k50msTimeOut`
- add the necessary `#include` references as applicable (e.g. `<zephyr/kernel.h>`, `<zephyr/device.h>`, `<zephyr/drivers/watchdog.h>`)
Question
Answer the following questions:
- What would one need to do to install a callback?
- What does `WDT_FLAG_RESET_SOC` mean? What other options are there?
Info
Zephyr RTOS does not expose the idle task directly, hence the solution we use here. However, we could have used other options, like ensuring that the most important task refreshes the feed, relaxing the time constraints, and so forth.
Understanding what triggered the reset of the board
At times, the reset happens so suddenly that we do not really understand (nor do we have the time to log anything) what caused the reset in the first place.
So, the question arises: is there a way to know the reset reason? Luckily, there is.
This information is oftentimes vendor specific, but there is a Zephyr RTOS implementation described in the Hardware Info chapter. However, as an example, the Nordic Semiconductor solution is used here.
```cpp
static void resetReason() {
    /* 1. Read and clear the reset reason */
    uint32_t reason = nrfx_reset_reason_get();
    nrfx_reset_reason_clear(reason);

    if (reason & NRFX_RESET_REASON_DOG_MASK) {
        LOG_ERR("Reboot Cause: WATCHDOG RESET");
    } else if (reason & NRFX_RESET_REASON_RESETPIN_MASK) {
        LOG_ERR("Reboot Cause: PIN RESET");
    } else if (reason & NRFX_RESET_REASON_SREQ_MASK) {
        LOG_ERR("Reboot Cause: SOFTWARE RESET");
    } else {
        LOG_ERR("Reboot Cause: POWER-ON / OTHER (0x%08x)", reason);
    }
}
```
Warning
One shall include the following for it to work:

```cpp
#include <helpers/nrfx_reset_reason.h>  // Include for reset reason
```
For more information about reset reasons, read the corresponding vendor information under https://docs.nordicsemi.com/bundle/ncs-2.7.0/page/nrfx/nrfx_api/reset_reason.html.
Memorizing values surviving a reset
So, now that we know we can retrieve the reset reason, we can ask ourselves the next question: how am I to know how often this has occurred? In an ideal world, we would store the information in a file, a database, or some other form of storage, but this is not always possible when there is very little time. One solution is to store it in dedicated chips. Another is to have a part of the RAM that is not re-initialized at start, so that we can store and retrieve information from it. This facility is called Retained Memory in Zephyr RTOS, and it provides a way of reading from/writing to memory areas whose contents are retained as long as the device is powered.
Critical
Data may be lost in low-power modes and will certainly be lost in case of a power outage.
For this to work, one needs to
- activate the functionality in `prj.conf`

  ```conf
  CONFIG_RETAINED_MEM=y
  CONFIG_RETAINED_MEM_ZEPHYR_RAM=y
  ```
- define the corresponding information in the overlay so as to carve out a memory block (a 4 KB block in this example) at the end of SRAM for retained memory, register it as a Zephyr retained memory device and ensure the rest of SRAM does not overlap with it. Concretely:

  ```dts
  / {
      sram0_retained@2006f000 {
          compatible = "zephyr,memory-region", "mmio-sram";
          reg = <0x2006f000 0x1000>;
          zephyr,memory-region = "RetainedMem";
          status = "okay";

          retained_mem0: retainedmem {
              compatible = "zephyr,retained-ram";
              status = "okay";
          };
      };

      chosen {
          zephyr,retained-mem = &retained_mem0;
      };

      aliases {
          retainedmemdevice = &retained_mem0;
      };
  };

  &sram0_image {
      /* Shrink SRAM to avoid overlap with retained memory region */
      reg = <0x20000000 0x6f000>;
  };
  ```

  Here is what each part of the devicetree overlay does:

  - `/ { ... };`: this block defines new nodes or properties at the root of the devicetree.
  - `sram0_retained@2006f000 { ... };`: this node defines a memory region starting at address `0x2006f000` with size `0x1000` (4 KB).
    - `compatible = "zephyr,memory-region", "mmio-sram";`: declares this node as a generic memory region and memory-mapped SRAM.
    - `reg = <0x2006f000 0x1000>;`: sets the base address and size.
    - `zephyr,memory-region = "RetainedMem";`: names this region “RetainedMem” for Zephyr.
    - `status = "okay";`: enables this node.
  - `retained_mem0: retainedmem { ... };`: this is a child node representing the actual retained RAM device.
    - `compatible = "zephyr,retained-ram";`: marks it as a Zephyr retained RAM device.
    - `status = "okay";`: enables this device.
  - `chosen { zephyr,retained-mem = &retained_mem0; };`: tells Zephyr to use `retained_mem0` as the system’s retained memory device.
  - `aliases { retainedmemdevice = &retained_mem0; };`: creates an alias so you can refer to the retained memory device as `retainedmemdevice` in code.
  - `&sram0_image { reg = <0x20000000 0x6f000>; };`: shrinks the main SRAM region to avoid overlapping with the retained memory region you just defined.
- create a function that initializes the memory and reads its content. In the example below, one not only reads the data but also increments it by one, the goal being to store a value that is incremented every time the memory is initialized.

  ```cpp
  static uint32_t initAndIncreasePseudoStaticMemory() {
      const struct device *const ret_mem = DEVICE_DT_GET(DT_CHOSEN(zephyr_retained_mem));
      uint32_t saved_val = 0;

      /* Check Retained Memory first */
      if (!device_is_ready(ret_mem)) {
          return 0;
      }
      retained_mem_read(ret_mem, 0,
                        static_cast<uint8_t *>(static_cast<void *>(&saved_val)),
                        sizeof(saved_val));
      saved_val++;

      /* Save the value to memory before we let the watchdog bite */
      retained_mem_write(ret_mem, 0,
                         static_cast<uint8_t *>(static_cast<void *>(&saved_val)),
                         sizeof(saved_val));
      return saved_val;
  }
  ```

- define a function that clears the content of the memory section
  ```cpp
  static void clearRetainedMemory() {
      const struct device *const ret_mem = DEVICE_DT_GET(DT_CHOSEN(zephyr_retained_mem));
      if (device_is_ready(ret_mem)) {
          retained_mem_clear(ret_mem);
      }
  }
  ```
Question
How can you modify `resetReason()` so that it

- reads the value, prints out the number of times the watchdog has hit sequentially (see below) and increments it every time there is such a watchdog reset

  ```
  [00:00:00.250,244] <inf> car_system: Program started
  [00:00:00.250,305] <err> car_system: Reboot Cause: WATCHDOG RESET (35 times in a row)
  [00:00:00.250,305] <inf> car_system: Starting system watchdog
  [00:00:00.250,335] <inf> car_system: Watchdog started! Feed it within 2000 ms...
  ```

- resets the memory value to 0 every time there has been a reset initiated by anything but a watchdog?
Task watchdog
A hardware watchdog can only monitor one thing: “is anything feeding me?” In a multi-threaded RTOS, that is not enough. If thread A is stuck but thread B keeps feeding the watchdog, the system never resets — even though it is broken.
The task watchdog (task_wdt) is a software layer that monitors individual
threads. Each thread gets its own channel and must call task_wdt_feed() within
its deadline. The hardware watchdog is only fed when all channels are healthy.
One stuck thread → hardware starves → system resets.
Thread A Thread B Thread C
| | |
v v v
channel 0 channel 1 channel 2
\ | /
\ | /
+------ task_wdt (software) ----+
single k_timer
|
v
hardware watchdog
(optional fallback)
All channels share a single k_timer. On every task_wdt_feed(), the timer is
reprogrammed to the nearest deadline across all channels. If no thread misses
its deadline, the timer ISR never fires. This makes it very lightweight — no
periodic polling, no extra threads.
In order to activate this task watchdog framework, one
- adds the following to `prj.conf`

  ```conf
  CONFIG_TASK_WDT=y
  CONFIG_TASK_WDT_HW_FALLBACK=n  # no hardware wdt as safety net in this codelab
  ```

  When the hardware fallback is enabled, the hardware watchdog timeout is set to `MIN_TIMEOUT + HW_FALLBACK_DELAY`. This ensures the software layer always has a chance to detect the failure before the hardware forces a reset.

  Hint

  Obviously, the values need to be adapted to your timing needs.
- implements the following APIs:

  - `task_wdt_init(hw_wdt)` in `main()`. If a system watchdog is available, either you remove it and pass its handle to `task_wdt_init`, or you pass `nullptr`. For this codelab and the `car-sim` project, this shall be set to `nullptr`:

    ```cpp
    int err = task_wdt_init(nullptr);
    if (err < 0) {
        LOG_ERR("Task watchdog init failed: %d", err);
        return err;
    }
    ```
  - `task_wdt_add(timeout_ms, callback, user_data)` in the defined tasks one would like to monitor. For example:

    ```cpp
    LOG_DBG("Thread %d starting at time %lld ms", taskIndex, nextPeriodStartTime.count());

    /* When registering: */
    int wd_id = task_wdt_add(500, my_task_wd_callback, NULL);
    if (wd_id < 0) {
        LOG_ERR("Failed to add task watchdog channel");
        return;
    }
    ```

    Danger
    If no `callback` is specified, a reset takes place.
  - `task_wdt_feed(channel_id)` for feeding the watchdog. Very straightforward, as in:

    ```cpp
    task_wdt_feed(wd_id);
    ```

  - `task_wdt_delete(channel_id)` for unregistering the task.
- adds a callback for the timeouts - an example may be

  ```cpp
  static void my_task_wd_callback(int channel_id, void *user_data) {
      LOG_ERR("Thread on channel %d is stuck!", channel_id);
      /* System will reset after this returns if CONFIG_TASK_WDT_HW_FALLBACK=y.
       * One may choose to signal a resume of the task duty if so desired.
       * IMPORTANT: this runs in ISR context. */
  }
  ```
Note
How the Timer Works Internally
- `task_wdt_add()` records an absolute tick deadline for the channel.
- `task_wdt_feed()` updates that channel’s deadline, then scans all channels to find the soonest one. It reprograms the single k_timer to fire at that soonest deadline. If hardware fallback is enabled, it also feeds the hardware watchdog.
- If a thread stops feeding, the timer eventually fires. The ISR invokes the expired channel’s callback (or resets the system).
- During normal operation, the timer ISR never actually runs — it keeps getting pushed forward.
Details available under Task Watchdog Source code.
Question
What would setting `CONFIG_TASK_WDT_HW_FALLBACK=y` imply? What is the
meaning of the following in that context?

```conf
CONFIG_TASK_WDT_MIN_TIMEOUT=500          # minimum timeout across all channels (ms)
CONFIG_TASK_WDT_HW_FALLBACK_DELAY=1000   # extra margin before hw wdt fires (ms)
```
Task Watchdog vs. Hardware Watchdog
| | Hardware Watchdog | Task Watchdog |
|---|---|---|
| Monitors | Whole system | Individual threads |
| Backed by | Dedicated hardware peripheral | k_timer (software) |
| Channels | Usually 1 | As many as you configure |
| Failure mode | System reset | Callback or system reset |
| Catches | Hard faults, scheduler hang | Thread deadlocks, starvation |
They are complementary. The task watchdog sits on top of the hardware watchdog, using it as a fallback safety net.
Implement Reset Reason, System and Task Watchdogs for all threads of your car-sim project
Now that you have all the elements required, go ahead and implement

- a system watchdog fulfilling the requirements as stated in the exercise. Make sure that the reset reason is recorded and that the number of watchdog-caused resets is available in an error log like

  ```
  [00:00:00.250,305] <err> car_system: Reboot Cause: WATCHDOG RESET (35 times in a row)
  ```

- task watchdogs for all your car components. Ensure that

  - the deadlines are those specified by the requirements
  - a `LOG_ERR("Thread on channel %d is stuck!", channel_id)` is implemented in a task-specific callback
Going beyond / References
- nRF5340 reference manual, check out chapter 7.41
- Nordic Semiconductor nRF5340 reset reasons
- [Retained Memory](https://docs.zephyrproject.org/latest/hardware/peripherals/retained_mem.html)
- Task Watchdog Official docs
- Task Watchdog API reference
- Task Watchdog Source code (~200 lines, very readable)
- Task Watchdog Original PR with design discussion