Bare Metal vs RTOS
Bare metal means running code directly on hardware with no operating system — just main(), a super loop, and interrupt handlers. An RTOS adds a scheduler, tasks, and synchronization primitives. The choice shapes every aspect of firmware architecture.
Why It Matters
Picking the wrong approach costs time. Bare metal on a complex multi-sensor system leads to a tangled state machine. An RTOS on a simple LED blinker adds RAM overhead and debug complexity for no benefit. Understanding the trade-offs lets you choose correctly and know when to switch.
How It Works
The Super Loop (Bare Metal)
The simplest embedded architecture: initialize hardware, then loop forever.
    int main(void) {
        clock_init();
        gpio_init();
        uart_init();
        adc_init();

        while (1) {
            read_sensors();    // ~2 ms
            process_data();    // ~5 ms
            update_display();  // ~10 ms
            check_buttons();   // ~1 ms
            // total loop time: ~18 ms -> everything runs at ~55 Hz
        }
    }

Problem: every function must wait for the previous one to finish. If update_display() takes 50 ms (e.g., SPI to an LCD), the loop period grows to about 58 ms and check_buttons() runs at only ~17 Hz. A button press shorter than one loop period can be missed entirely.
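The loop-rate arithmetic above can be sketched in a few lines of C. The helper names `loop_period_ms` and `loop_rate_hz` are hypothetical, and the stage durations are the approximate figures from the example, not measurements.

```c
#include <stddef.h>
#include <stdint.h>

/* Sum the worst-case duration of every stage in one pass of the super loop. */
uint32_t loop_period_ms(const uint32_t *stage_ms, size_t count) {
    uint32_t total = 0;
    for (size_t i = 0; i < count; i++) {
        total += stage_ms[i];
    }
    return total;
}

/* Every stage runs once per pass, so all stages share this rate (integer Hz, rounded down). */
uint32_t loop_rate_hz(uint32_t period_ms) {
    return period_ms ? 1000u / period_ms : 0u;
}
```

With stages of 2, 5, 10, and 1 ms the period is 18 ms (55 Hz). Raising the display stage to 50 ms gives 58 ms (17 Hz), and that slower rate applies to the button check as well.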
Improving Bare Metal: ISR + Flags
Timer and peripheral interrupts break the loop dependency:
    #include <stdint.h>
    // plus the device header that defines EXTI and __WFI()

    volatile uint8_t sensor_flag = 0;
    volatile uint8_t button_flag = 0;
    volatile uint32_t ms_ticks = 0;

    void SysTick_Handler(void) {           // assumed to fire every 1 ms
        ms_ticks++;
        if (ms_ticks % 100 == 0) sensor_flag = 1;  // every 100 ms
    }

    void EXTI0_IRQHandler(void) {
        EXTI->PR = (1U << 0);              // clear the pending bit (write 1 to clear)
        button_flag = 1;
    }

    int main(void) {
        init_all();
        while (1) {
            if (sensor_flag) {
                sensor_flag = 0;           // clear first, then handle
                read_sensors();
                process_data();
            }
            if (button_flag) {
                button_flag = 0;
                handle_button();
            }
            update_display();              // runs every loop iteration
            __WFI();                       // sleep until next interrupt
        }
    }

This “super loop + ISR flags” pattern handles moderate complexity well. It is deterministic, costs only a few bytes of RAM for the flags, and is easy to reason about. Because SysTick wakes the core every tick, a flag set just before __WFI() waits at most one tick before the loop sees it.
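Once a design grows beyond a handful of flags, they are often folded into a single event mask. A plain read-modify-write on a shared mask can lose a bit that an ISR sets in between, so the sketch below uses atomic operations. It assumes a C11 toolchain with `<stdatomic.h>`; on cores without atomic read-modify-write instructions (e.g. Cortex-M0) the same effect is usually achieved by briefly disabling interrupts. The names `post_event`, `take_events`, `EVT_SENSOR`, and `EVT_BUTTON` are illustrative and not part of the original example.

```c
#include <stdatomic.h>
#include <stdint.h>

#define EVT_SENSOR (1u << 0)
#define EVT_BUTTON (1u << 1)

static atomic_uint pending_events;  /* static storage, so it starts at zero */

/* Called from an ISR: set one or more event bits without disturbing the others. */
void post_event(unsigned int mask) {
    atomic_fetch_or(&pending_events, mask);
}

/* Called from the main loop: fetch all pending events and clear them in one step. */
unsigned int take_events(void) {
    return atomic_exchange(&pending_events, 0u);
}
```

The main loop would call `take_events()` once per iteration and test the returned bits (`ev & EVT_SENSOR`, `ev & EVT_BUTTON`). Like the individual flags, the mask records that an event happened, not how many times.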
When RTOS Adds Value
The RTOS version of the same system:
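    // Assumes FreeRTOS.h, task.h, queue.h, and semphr.h are included, and that
    // display_queue and button_sem were created before the scheduler started.
    void sensor_task(void *p) {
        (void)p;
        Result result;                     // filled in by process_data(); signature assumed
        while (1) {
            read_sensors();
            process_data(&result);
            xQueueSend(display_queue, &result, portMAX_DELAY);
            vTaskDelay(pdMS_TO_TICKS(100));  // relative delay: period is 100 ms plus run time
        }
    }

    void button_task(void *p) {
        (void)p;
        while (1) {
            xSemaphoreTake(button_sem, portMAX_DELAY);  // blocks until the ISR gives it
            handle_button();
        }
    }

    void display_task(void *p) {
        (void)p;
        Result r;
        while (1) {
            xQueueReceive(display_queue, &r, portMAX_DELAY);
            update_display(&r);
        }
    }

Each concern is isolated. Given a higher priority, the sensor task preempts a slow display update instead of waiting for it. Priorities ensure the most important ready task runs first. Blocking calls (queue receive, semaphore take) replace polling and flags: a blocked task consumes no CPU time until its event arrives.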
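To make the decoupling concrete, here is a host-runnable model of the idea behind the display queue: a fixed-capacity buffer that copies items in and out by value. This is a simplified sketch, not FreeRTOS's implementation. A real RTOS queue also blocks and wakes tasks and is safe to use across tasks and ISRs, which this model is not. The `Result` payload field and the names `ResultQueue`, `queue_send`, and `queue_receive` are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define QUEUE_CAPACITY 4u

typedef struct { int32_t temperature_c; } Result;  /* hypothetical payload */

typedef struct {
    Result items[QUEUE_CAPACITY];
    size_t head;   /* index of the oldest item */
    size_t count;  /* number of items currently stored */
} ResultQueue;

/* Copy one item in; returns false when full (an RTOS would block the sender instead). */
bool queue_send(ResultQueue *q, const Result *item) {
    if (q->count == QUEUE_CAPACITY) return false;
    q->items[(q->head + q->count) % QUEUE_CAPACITY] = *item;
    q->count++;
    return true;
}

/* Copy the oldest item out; returns false when empty (an RTOS would block the receiver instead). */
bool queue_receive(ResultQueue *q, Result *out) {
    if (q->count == 0) return false;
    *out = q->items[q->head];
    q->head = (q->head + 1) % QUEUE_CAPACITY;
    q->count--;
    return true;
}
```

The queue depth sets how far the producer may run ahead of the consumer: with four slots, the sensor side can post four results before a slow display has to catch up.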
Decision Matrix
| Factor | Bare Metal | RTOS |
|---|---|---|
| Complexity | Simple (1-3 tasks) | Multiple concurrent activities (4+) |
| Timing | Deterministic (you control the loop) | Deterministic within priority rules |
| Latency | ISR response: immediate | ISR response unchanged; a task unblocked by an ISR runs after a context switch (microseconds); timed wakeups have 1-tick (~1ms) resolution |
| Blocking IO | Cannot block in main loop | Tasks block independently |
| RAM overhead | Zero (just your stack) | Kernel ~2-4KB + per-task stack (128-512 words each) |
| Flash overhead | Just your code | +6-10KB for kernel |
| Debug | Simple — single thread of execution | Harder — race conditions, priority issues |
| Networking/USB | Painful (state machines) | Natural (blocking socket/transfer calls) |
| Safety-critical | Easier to certify (less code) | Pre-certified kernels exist, e.g. SAFERTOS (FreeRTOS-derived, IEC 61508) |
| Power | Full control over sleep modes | Tickless idle mode available in FreeRTOS |
| MCU size | Any, including tiny 8KB Flash | Needs at least ~20KB Flash, ~8KB RAM |
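
The RAM row can be turned into a quick budget check. This is a back-of-the-envelope sketch using the table's rough figures; `rtos_ram_bytes` is a hypothetical helper, a 4-byte stack word is assumed (as on 32-bit Cortex-M parts), and real numbers depend on the port and configuration.

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_WORD_BYTES 4u  /* assumption: 32-bit MCU */

/* Estimate RTOS RAM use: fixed kernel cost plus each task's stack (sizes given in words). */
uint32_t rtos_ram_bytes(uint32_t kernel_bytes, const uint16_t *stack_words, size_t task_count) {
    uint32_t total = kernel_bytes;
    for (size_t i = 0; i < task_count; i++) {
        total += (uint32_t)stack_words[i] * STACK_WORD_BYTES;
    }
    return total;
}
```

For example, a 3 KB kernel plus three tasks with 128-, 256-, and 512-word stacks comes to 3072 + 3584 = 6656 bytes, which leaves about 1.5 KB for everything else on an 8 KB part.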
When Bare Metal Wins
- Simple, single-purpose devices: a thermostat, an LED controller, a motor driver with one PID loop
- Extremely resource-constrained MCUs: 8-bit AVR, tiny Cortex-M0 with 16KB Flash
- Hard real-time with sub-microsecond jitter: ISR-driven control where the jitter added by RTOS critical sections, or the 1-tick granularity of task timing, is unacceptable
- Safety certification: less code = less to certify
- Learning: understand the hardware first, add abstraction later
When RTOS Wins
- Multiple concurrent protocols: UART + SPI + I2C + USB all active
- Networking stacks: TCP/IP (lwIP), BLE, MQTT need blocking calls and background processing
- Complex async workflows: wait for sensor AND user input AND network response
- Multiple developers: tasks provide natural code boundaries
- Middleware that expects it: many vendor SDKs (STM32 USB, ESP-IDF) assume FreeRTOS
Hybrid Approaches
Real systems often mix both:
- Super loop + timer ISRs: the ISR+flag pattern above, which is sufficient for a large share of embedded projects
- RTOS with bare-metal ISRs: critical control loops run in ISRs at hardware priority (above RTOS), while communication and UI run as RTOS tasks
- Cooperative scheduler: a minimal round-robin without preemption — lighter than FreeRTOS, more structured than raw super loop
    // Minimal cooperative scheduler (no RTOS needed)
    #include <stdint.h>

    typedef struct { void (*fn)(void); uint32_t period; uint32_t next; } Task;

    Task tasks[] = {
        { read_sensors,   100, 0 },  // every 100 ms
        { update_display, 200, 0 },  // every 200 ms
        { check_comms,     50, 0 },  // every 50 ms
    };
    #define NUM_TASKS (sizeof tasks / sizeof tasks[0])

    int main(void) {
        init_all();
        while (1) {
            uint32_t now = HAL_GetTick();
            for (uint32_t i = 0; i < NUM_TASKS; i++) {
                // Signed difference stays correct when the 32-bit tick counter wraps
                if ((int32_t)(now - tasks[i].next) >= 0) {
                    tasks[i].fn();
                    tasks[i].next = now + tasks[i].period;
                }
            }
            __WFI();
        }
    }

This gives time-based scheduling without any RTOS overhead. It breaks down when a task runs longer than the shortest period, because every other task is delayed until it returns, or when you need blocking calls.
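Tick comparisons in a scheduler like this need care: a 32-bit millisecond counter wraps after about 49.7 days, and a plain `now >= deadline` test gives the wrong answer around the wrap. A common idiom, sketched here with the hypothetical helper `tick_reached`, compares the signed difference instead. It is valid as long as deadlines lie less than 2^31 ticks in the future.

```c
#include <stdbool.h>
#include <stdint.h>

/* True once `now` has reached or passed `deadline`, even across counter wraparound. */
bool tick_reached(uint32_t now, uint32_t deadline) {
    return (int32_t)(now - deadline) >= 0;
}
```

For example, with `now` at 5 just after a wrap and a deadline of 0xFFFFFFF0 set just before it, the signed difference is 21, so the deadline counts as reached, whereas `5 >= 0xFFFFFFF0` would be false.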
Related
- RTOS Fundamentals — tasks, queues, mutexes, FreeRTOS API
- Interrupts and Timers — ISR+flag pattern that powers bare metal
- Scheduling — theory behind preemptive and cooperative schedulers
- Concurrency and Synchronization — mutex/semaphore concepts
- Microcontroller Architecture — RAM/Flash constraints that influence the choice