A Step-by-Step Tutorial
Writing an Optiboot-style UART Bootloader from Scratch
For Windows | Arduino IDE 1.8.x | ATmega328P @ 16MHz
Chapter 1: How the ATmega328P Boots
1.1 — What is the ATmega328P at its Core?
The ATmega328P is an 8-bit microcontroller made by Microchip (formerly Atmel). It contains three separate memory types:
| Memory | Size | Purpose | Survives Power Off? |
|---|---|---|---|
| Flash | 32KB | Stores your program code | Yes |
| SRAM | 2KB | Stores variables while running | No |
| EEPROM | 1KB | Stores small persistent data | Yes |
For booting, only Flash matters: neither SRAM nor EEPROM plays a part in where the CPU starts executing. (The bootloader itself will use a little SRAM for its page buffer.)
📖 Datasheet §1: "The ATmega328P is a low-power CMOS 8-bit microcontroller based on the AVR enhanced RISC architecture"
1.2 — The Program Counter (PC)
The CPU has a special internal register called the Program Counter (PC). It holds the address of the next instruction to execute. The CPU is a machine that does this millions of times per second:
loop forever:
1. Read instruction at address stored in PC
2. Execute that instruction
3. Increment PC (or jump if instruction says so)
📖 Datasheet §6.3: "The Program Counter (PC) is 14 bits wide, addressing the 16K word (32KB) program memory space"
⚠️ Note: PC counts in WORDS (2 bytes each). 2^14 = 16,384 words = 32,768 bytes = 32KB. You will see both word and byte addresses in the datasheet.
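The word/byte distinction comes up repeatedly in later chapters, so a small host-side sketch may help. The helper names `word_to_byte`, `byte_to_word` and `pc_words` are hypothetical, chosen only for illustration, and the code compiles with any desktop C compiler rather than only avr-gcc.

```c
#include <stdint.h>

/* The PC and the datasheet's boot tables count 16-bit words;
 * .hex files and linker addresses count bytes. */
uint16_t word_to_byte(uint16_t word_addr) { return (uint16_t)(word_addr << 1); }
uint16_t byte_to_word(uint16_t byte_addr) { return (uint16_t)(byte_addr >> 1); }

/* Number of words a Program Counter of the given width can address. */
uint32_t pc_words(unsigned pc_bits) { return (uint32_t)1 << pc_bits; }
```

For instance, the datasheet's word address 0x3E00 and the byte address 0x7C00 used throughout this tutorial name the same Flash location.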
1.3 — The Flash Memory Map in Detail
The full 32KB Flash with real addresses:
BYTE ADDRESS CONTENT
┌────────────────────────────────────────────┐
│ 0x0000 ← RESET VECTOR │
│ 0x0002 ← INT0 vector │
│ 0x0004 ← INT1 vector │
│ ... other interrupt vectors │
├────────────────────────────────────────────┤
│ │
│ APPLICATION SECTION │
│ (your sketch / program) │
│ │
├────────────────────────────────────────────┤
│ 0x7C00 ← BOOT START (BOOTSZ=512 words) │
│ BOOTLOADER SECTION │
│ (our code lives here) │
│ 0x7FFF ← TOP OF FLASH │
└────────────────────────────────────────────┘
Total: 32,768 bytes (32KB)
📖 Datasheet §27.5, Table 27-5: Defines boot section sizes and start addresses
1.4 — What Happens in the First Nanoseconds After Reset?
Things that cause a reset: power-on (VCC rises), the RESET pin pulled low, a Watchdog Timer timeout (which is also how software can force a reset), and brown-out detection.
📖 Datasheet §8.1: "The ATmega328P provides several reset sources"
Regardless of reset source, the sequence is always:
Reset occurs
│
▼
CPU initializes internally (I/O registers reset to defaults)
│
▼
┌─────────────────────────────────────────────┐
│ CPU checks: Is BOOTRST fuse programmed? │
└─────────────────────────────────────────────┘
│ │
NO YES
│ │
▼ ▼
PC = 0x0000 PC = 0x7C00
Your app runs Bootloader runs
📖 Datasheet §27.4: "If the BOOTRST Fuse is programmed, the reset vector is pointing to the Boot Flash start address"
1.5 — The BOOTRST Fuse Bit
Fuse bits are configuration bits stored outside Flash in dedicated hardware. They survive power cycling and are written with a programmer like USBasp.
⚠️ AVR Fuse Convention: Bit = 1 means UNPROGRAMMED (not active, factory default). Bit = 0 means PROGRAMMED (active). So BOOTRST = 0 means bootloader IS active. This inverted logic confuses everyone at first!
1.6 — The Two Upload Scenarios
Scenario A: USBasp Without a Bootloader
The USBasp talks SPI directly to the chip hardware (ICSP header). It writes raw bytes to Flash starting at 0x0000. No bootloader is needed or involved. Even a completely blank chip can be programmed this way. BOOTRST fuse is NOT set, so CPU starts at 0x0000 and your code runs immediately.
Scenario B: Arduino IDE Upload (With Bootloader)
The Arduino IDE runs Avrdude, which talks over UART to the bootloader running on the chip. A pulse on the DTR line triggers a hardware reset; because BOOTRST is programmed, the PC goes to 0x7C00 and the bootloader starts. It waits for Avrdude, receives the sketch over UART, writes it to Flash starting at 0x0000, and then hands control to the application.
| | USBasp (no bootloader) | Arduino IDE (with bootloader) |
|---|---|---|
| How it writes Flash | SPI/ICSP hardware | UART + bootloader software |
| Reset goes to | 0x0000 always | 0x7C00 (bootloader) first |
| Bootloader needed? | No | Yes |
| Can overwrite bootloader? | Yes (dangerous) | No (lock bits protect it) |
1.7 — Every Startup, Without Exception
Once a bootloader is installed and BOOTRST is set, every single startup follows this flow:
POWER ON or RESET
│
▼
PC = 0x7C00 (BOOTRST forces this)
│
▼
BOOTLOADER RUNS
1. Initialize UART
2. Wait ~1 second for Avrdude
│
┌────┴────┐
│ │
Avrdude No response (timeout)
connects │
│ Jump to 0x0000
│ │
Receive YOUR SKETCH RUNS
& flash
sketch
│
Reset → bootloader → timeout → 0x0000
1.8 — Chapter 1 Summary
| Concept | Key Fact |
|---|---|
| Program Counter | Holds address of next instruction, starts at 0x0000 or 0x7C00 |
| Flash layout | Application at bottom (0x0000), bootloader at top (0x7C00) |
| BOOTRST fuse | 0 = programmed = CPU starts at boot section on reset |
| AVR fuse logic | 0 = active, 1 = inactive (inverted; always remember this!) |
| USBasp upload | SPI directly to hardware, no bootloader needed, writes from 0x0000 |
| Arduino upload | UART to bootloader, bootloader writes sketch to 0x0000 |
| Every startup | If BOOTRST set → bootloader always runs first, then jumps to app |
| Bootloader's job | Check for new upload → if none, hand control to application |
Chapter 2: Bootloader Section & Fuses
We only care about 3 things from the fuse system:
1. BOOTRST → Tell CPU to start at bootloader on reset
2. BOOTSZ → Tell CPU how big our bootloader is (sets start address)
3. Lock bits → Protect bootloader from being overwritten
2.1 — BOOTSZ: Choosing Our Bootloader Size
📖 Datasheet §27.5, Table 27-5: Boot section sizes and start addresses
| BOOTSZ1 | BOOTSZ0 | Size (words) | Size (bytes) | Start Address (byte) |
|---|---|---|---|---|
| 1 | 1 | 256 words | 512 bytes | 0x7E00 |
| 1 | 0 | 512 words | 1024 bytes | 0x7C00 ← we use this |
| 0 | 1 | 1024 words | 2048 bytes | 0x7800 |
| 0 | 0 | 2048 words | 4096 bytes | 0x7000 |
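We pick 512 words (1024 bytes) starting at 0x7C00. Stock Optiboot squeezes into the smallest option (256 words / 512 bytes at 0x7E00) through aggressive size optimization; 1024 bytes gives our more readable C code room to fit and costs the application only 512 extra bytes. The two larger options would waste application space.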
2.2 — The Fuse Bytes (Only What We Touch)
The ATmega328P has 3 fuse bytes. We only touch the High Fuse Byte (HFUSE). Our HFUSE value is 0xDC (binary 1101 1100): bit 0 (BOOTRST) = 0 makes the reset vector point at the bootloader, and bits 2:1 (BOOTSZ1:0) = 10 select 512 words. Note that 0xDA, which appears in many Arduino board definitions, encodes BOOTSZ = 01 (1024 words, start 0x7800) and does not match our 0x7C00 layout.
⚠️ Critical: SPIEN (bit 5) must stay 0 (programmed/active). It enables SPI programming. If you accidentally set it to 1, you can no longer program the chip with USBasp!
📖 Datasheet §27.4, Table 27-3: High Fuse Byte bit description
avrdude -c usbasp -p m328p -U hfuse:w:0xDC:m
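Because the inverted fuse logic is easy to get wrong, it can help to decode a candidate HFUSE value before writing it. The following is a minimal host-side sketch, not part of the bootloader; the function names are hypothetical, and the bit positions (BOOTRST = bit 0, BOOTSZ0 = bit 1, BOOTSZ1 = bit 2) are taken from the High Fuse Byte table in the datasheet.

```c
#include <stdint.h>
#include <stdbool.h>

/* Bit positions in the ATmega328P High Fuse Byte. */
#define HFUSE_BOOTRST 0
#define HFUSE_BOOTSZ0 1

/* AVR fuses are inverted: a 0 bit means "programmed" (active). */
bool bootrst_programmed(uint8_t hfuse) {
    return (hfuse & (1u << HFUSE_BOOTRST)) == 0;
}

/* Boot section size in words from the raw BOOTSZ1:0 bits:
 * 11 -> 256, 10 -> 512, 01 -> 1024, 00 -> 2048. */
uint16_t boot_size_words(uint8_t hfuse) {
    uint8_t bootsz = (hfuse >> HFUSE_BOOTSZ0) & 0x3;
    return (uint16_t)(2048u >> bootsz);
}

/* Byte address where the boot section starts in 32KB of Flash. */
uint16_t boot_start_byte(uint8_t hfuse) {
    return (uint16_t)(0x8000u - 2u * boot_size_words(hfuse));
}
```

With these helpers, 0xDC decodes to a boot section starting at 0x7C00 with BOOTRST programmed, while the factory default 0xD9 decodes to 0x7000 with BOOTRST unprogrammed.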
2.3 — Lock Bits: Protecting Our Bootloader
📖 Datasheet §27.6, Table 27-7: Boot Lock Bit table
| BLB12 | BLB11 | Effect |
|---|---|---|
| 1 | 1 | No restrictions (default, dangerous!) |
| 1 | 0 | Application cannot WRITE to boot section ← we use this |
| 0 | 1 | Application cannot READ from boot section |
| 0 | 0 | Application cannot READ or WRITE boot section |
Our Lock byte value is 0xEF. Set lock bits LAST — after the bootloader is flashed and working. Lock bits can only be cleared by a full chip erase which would wipe your bootloader.
avrdude -c usbasp -p m328p -U lock:w:0xEF:m
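The same kind of check works for the lock byte. This is a small host-side sketch under the assumption that BLB11 is bit 4 and BLB12 is bit 5 of the lock byte, as in the datasheet's lock bit table; the function names are hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

#define LOCK_BLB11 4  /* Boot Lock Bit 11 */
#define LOCK_BLB12 5  /* Boot Lock Bit 12 */

/* True when the lock byte blocks SPM writes into the boot section.
 * BLB11 = 0 (programmed) is the bit that forbids those writes. */
bool boot_section_write_protected(uint8_t lock) {
    return (lock & (1u << LOCK_BLB11)) == 0;
}

/* True when code in the application section may not read the boot section.
 * BLB12 = 0 (programmed) is the bit that forbids those reads. */
bool boot_section_read_protected(uint8_t lock) {
    return (lock & (1u << LOCK_BLB12)) == 0;
}
```

For 0xEF this reports write protection on and read protection off, which matches the table row marked "we use this".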
2.4 — The Complete Fuse Setup
# 1. Set fuses (BOOTRST active, BOOTSZ = 512 words)
avrdude -c usbasp -p m328p -U hfuse:w:0xDC:m
# 2. Flash our bootloader binary
avrdude -c usbasp -p m328p -U flash:w:build\bootloader.hex:i
# 3. Set lock bits LAST (protect bootloader section)
avrdude -c usbasp -p m328p -U lock:w:0xEF:m
2.5 — Chapter 2 Summary
| Thing | Value | Why |
|---|---|---|
| Bootloader size | 512 words / 1024 bytes | Small enough, fits our code |
| Bootloader start | 0x7C00 | Calculated from BOOTSZ |
| HFUSE | 0xDC | BOOTRST=0, BOOTSZ=10 |
| Lock byte | 0xEF | App cannot overwrite bootloader |
| Set fuses | First | With USBasp before anything |
| Set lock bits | Last | After bootloader is flashed and verified |
Chapter 3: Toolchain Setup (Windows)
3.1 — Tools Already Installed via Arduino IDE
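Since Arduino IDE 1.8.x is installed, you already have everything needed. No downloads required. The path below applies when the AVR core was installed or updated through Boards Manager, and the version folder name may differ on your machine; a fresh IDE install keeps the same tools inside its own installation folder under hardware\tools\avr\bin.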
C:\Users\<username>\AppData\Local\Arduino15\packages\arduino\
tools\avr-gcc\5.4.0-atmel3.6.1-arduino2\bin\
avr-gcc.exe ← the compiler
avr-objcopy.exe ← converts compiled output to .hex
avr-size.exe ← shows how big our binary is
⚠️ Note: Replace <username> with your actual Windows username everywhere in the scripts below.
3.2 — Our Project Folder
C:\AVR_Bootloader\
├── src\
│ └── main.c ← our bootloader source code
├── build\ ← compiled output goes here
├── build.bat ← our build script
└── flash.bat ← our flash script
3.3 — The Build Batch File (build.bat)
Create build.bat in C:\AVR_Bootloader\ with this content:
@echo off
REM ATmega328P Bootloader Build Script
set AVR=C:\Users\<username>\AppData\Local\Arduino15\packages\arduino\tools\avr-gcc\5.4.0-atmel3.6.1-arduino2\bin
set SRC=src\main.c
set BUILD=build
set OUT=bootloader
set MCU=atmega328p
set F_CPU=16000000UL
set BOOT_ADDR=0x7C00
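REM Create the output folder on first run so the compiler can write into it
if not exist %BUILD% mkdir %BUILD%
echo [1/4] Compiling...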
%AVR%\avr-gcc.exe -mmcu=%MCU% -DF_CPU=%F_CPU% -Os -std=c99 ^
-Wl,--section-start=.text=%BOOT_ADDR% ^
-o %BUILD%\%OUT%.elf %SRC%
if errorlevel 1 goto error
echo [2/4] Creating .hex file...
%AVR%\avr-objcopy.exe -O ihex -R .eeprom %BUILD%\%OUT%.elf %BUILD%\%OUT%.hex
if errorlevel 1 goto error
echo [3/4] Checking binary size...
%AVR%\avr-size.exe --format=avr --mcu=%MCU% %BUILD%\%OUT%.elf
echo [4/4] Done!
echo Output: %BUILD%\%OUT%.hex
goto end
:error
echo BUILD FAILED!
:end
pause
3.4 — The Most Important Line Explained
-Wl,--section-start=.text=%BOOT_ADDR%
-Wl, → pass this flag to the linker
--section-start → place this section at this address
.text → where compiled code lives
=%BOOT_ADDR% → = 0x7C00 (our bootloader start address)
3.5 — Other Compiler Flags Explained
| Flag | Purpose |
|---|---|
| -mmcu=atmega328p | Tells compiler exactly which AVR chip; sets correct register addresses |
| -DF_CPU=16000000UL | Defines F_CPU as 16MHz; used in Chapter 4 for baud rate calculation |
| -Os | Optimize for SIZE not speed; bootloader must fit in 1024 bytes! |
| -std=c99 | Use C99 standard; allows cleaner code style |
3.6 — Flash Script (flash.bat)
@echo off
REM ATmega328P Bootloader Flash Script
set AVRDUDE=C:\Users\<username>\AppData\Local\Arduino15\packages\arduino\tools\avrdude\6.3.0-arduino17\bin\avrdude.exe
set AVRDUDE_CONF=C:\Users\<username>\AppData\Local\Arduino15\packages\arduino\tools\avrdude\6.3.0-arduino17\etc\avrdude.conf
set HEX=build\bootloader.hex
set MCU=atmega328p
set PROGRAMMER=usbasp
echo [1/3] Setting fuses...
%AVRDUDE% -C %AVRDUDE_CONF% -p %MCU% -c %PROGRAMMER% -U hfuse:w:0xDC:m
if errorlevel 1 goto error
echo [2/3] Flashing bootloader...
%AVRDUDE% -C %AVRDUDE_CONF% -p %MCU% -c %PROGRAMMER% -U flash:w:%HEX%:i
if errorlevel 1 goto error
echo [3/3] Setting lock bits...
%AVRDUDE% -C %AVRDUDE_CONF% -p %MCU% -c %PROGRAMMER% -U lock:w:0xEF:m
if errorlevel 1 goto error
echo All done! Bootloader installed successfully.
goto end
:error
echo FAILED! Check USBasp connection.
:end
pause
3.7 — Chapter 3 Summary
| File | Purpose |
|---|---|
| build.bat | Compiles src\main.c → build\bootloader.hex |
| flash.bat | Sets fuses, flashes hex, sets lock bits via USBasp |
Chapter 4: UART From Scratch
4.1 — What is UART?
UART stands for Universal Asynchronous Receiver Transmitter. It is one of the simplest ways two devices can talk: just two signal wires, TX (transmit) and RX (receive), plus a shared ground. Asynchronous means there is no shared clock wire, so both sides must agree on the speed beforehand. That speed is the baud rate, measured in bits per second.
📖 Datasheet §19.1: "The Universal Synchronous and Asynchronous serial Receiver and Transmitter (USART) is a highly flexible serial communication device"
4.2 — How UART Sends a Byte
When idle, the TX line sits HIGH. To send one byte (8 bits):
Frame = 1 start bit + 8 data bits + 1 stop bit = 10 bits total
Idle Start D0 D1 D2 D3 D4 D5 D6 D7 Stop Idle
────┐ ┌──┐ ┌─┐ ┌─┐ ┌─┐ ┌─┐ ┌─┐ ┌─┐ ┌─┐ ┌─┐ ┌────────
│ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │
└──┘ └─┘ └─┘ └─┘ └─┘ └─┘ └─┘ └─┘ └─┘ └─┘
At 115200 baud:
1 bit = 1 / 115200 = 8.68 microseconds
1 byte = 8.68 × 10 = 86.8 microseconds
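The frame arithmetic above can be reproduced with integer math. This is an illustrative host-side sketch; `bit_time_ns` and `bytes_per_second` are hypothetical helper names, and the 10-bit frame assumes the 8N1 format used in this tutorial.

```c
#include <stdint.h>

/* Bits on the wire per byte for 8N1: 1 start + 8 data + 1 stop. */
#define FRAME_BITS_8N1 10u

/* Duration of one bit in nanoseconds (integer division, truncated). */
uint32_t bit_time_ns(uint32_t baud) {
    return 1000000000UL / baud;
}

/* Payload throughput in bytes per second. */
uint32_t bytes_per_second(uint32_t baud) {
    return baud / FRAME_BITS_8N1;
}
```

At 115200 baud this gives 11,520 bytes per second, so a 128-byte Flash page needs roughly 11 ms on the wire before any programming time is added.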
4.3 — The 5 Registers We Need
| Register | Purpose |
|---|---|
| UBRR0 | Sets the baud rate |
| UCSR0B | Enables transmitter and receiver |
| UCSR0C | Sets frame format (8 data bits, 1 stop bit) |
| UDR0 | Write here to send, Read here to receive |
| UCSR0A | Status flags: is TX done? is RX ready? |
4.4 — UBRR0: Setting the Baud Rate
📖 Datasheet §19.10: USART Baud Rate Registers
The formula to calculate UBRR (normal speed mode, U2X0 = 0):
UBRR = F_CPU / (16 × BAUD) - 1
For a 16MHz CPU at 115200 baud:
UBRR = 16,000,000 / (16 × 115,200) - 1 = 8.68 - 1 = 7.68 → 8 (rounded to nearest)
📖 Datasheet §19.11, Table 19-9: Confirms UBRR=8 for 16MHz / 115200 baud. The actual baud rate is 16,000,000 / (16 × 9) ≈ 111,111, an error of about -3.5%. This usually works in practice but leaves little margin. (Optiboot itself uses double-speed mode, U2X0 = 1 with UBRR = 16, which brings the error down to about 2.1%.)
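#define F_CPU 16000000UL
#define BAUD 115200UL
/* Round to nearest: plain integer division truncates 8.68 to 8 and would yield 7 */
#define UBRR_VAL ((F_CPU + 8UL * BAUD) / (16UL * BAUD) - 1) // = 8
UBRR0H = (uint8_t)(UBRR_VAL >> 8); // high byte first
UBRR0L = (uint8_t)UBRR_VAL; // low byte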
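Integer division is the trap in this calculation, so here is a host-side sketch that compares a truncating and a rounding version of the formula and reports the baud rate each setting really produces. The function names are hypothetical; the formulas are the normal-speed (U2X0 = 0) ones from this section.

```c
#include <stdint.h>

/* UBRR for normal-speed mode, truncating like plain integer division. */
uint16_t ubrr_truncated(uint32_t f_cpu, uint32_t baud) {
    return (uint16_t)(f_cpu / (16UL * baud) - 1);
}

/* Same formula, rounded to nearest by adding half the divisor first. */
uint16_t ubrr_rounded(uint32_t f_cpu, uint32_t baud) {
    return (uint16_t)((f_cpu + 8UL * baud) / (16UL * baud) - 1);
}

/* Baud rate the hardware actually generates for a given UBRR. */
uint32_t actual_baud(uint32_t f_cpu, uint16_t ubrr) {
    return f_cpu / (16UL * (ubrr + 1UL));
}
```

For 16MHz and 115200 baud the truncating version returns 7 (125,000 baud, about 8.5% fast), while the rounding version returns 8 (111,111 baud, about 3.5% slow). Only the second is close enough to be usable.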
4.5 — UCSR0B: Enabling TX and RX
📖 Datasheet §19.9.3: UCSR0B description. We only need RXEN0 (bit 4) to enable receiver and TXEN0 (bit 3) to enable transmitter.
UCSR0B = (1 << RXEN0) | (1 << TXEN0);
4.6 — UCSR0C: Frame Format
We want 8N1: 8 data bits, No parity, 1 stop bit. This is the framing Avrdude expects from Arduino-style serial bootloaders.
📖 Datasheet §19.9.4: UCSR0C description. UCSZ01:00 = 11 sets 8 data bits. UPM = 00 = no parity. USBS = 0 = 1 stop bit.
UCSR0C = (1 << UCSZ01) | (1 << UCSZ00);
4.7 — Putting It Together: uart_init()
void uart_init(void)
{
UBRR0H = 0; // high byte of 8
UBRR0L = 8; // low byte of 8
UCSR0B = (1 << RXEN0) | (1 << TXEN0);
UCSR0C = (1 << UCSZ01) | (1 << UCSZ00);
}
4.8 — UCSR0A: Status Flags
📖 Datasheet §19.9.1: UCSR0A description. UDRE0: TX buffer empty, safe to send next byte. RXC0: New byte waiting to be read.
4.9 — uart_send() and uart_receive()
void uart_send(uint8_t byte)
{
while (!(UCSR0A & (1 << UDRE0))); // wait until TX buffer empty
UDR0 = byte; // hardware sends it automatically
}
uint8_t uart_receive(void)
{
while (!(UCSR0A & (1 << RXC0))); // wait until byte arrives
return UDR0; // read and return byte
}
4.10 — Chapter 4 Summary
| Register | What We Set | Why |
|---|---|---|
| UBRR0H:L | 8 | 115200 baud at 16MHz |
| UCSR0B | RXEN0, TXEN0 | Enable RX and TX hardware |
| UCSR0C | UCSZ01, UCSZ00 | 8N1 frame format |
| UCSR0A | Read only | Check UDRE0 before send, RXC0 before receive |
| UDR0 | Write to send, Read to receive | The actual data register |
Chapter 5: The STK500v1 Protocol
5.1 — What is STK500v1?
When you hit Upload in Arduino IDE, Avrdude runs and starts sending bytes over UART to our bootloader following a protocol called STK500v1. Every exchange follows the same structure:
Every COMMAND Avrdude sends:
┌──────────┬──────────────────────┬──────────┐
│ CMD │ PARAMETERS │ SYNC │
│ 1 byte │ 0 or more bytes │ 0x20 │
└──────────┴──────────────────────┴──────────┘
CRC_EOP
Every RESPONSE we send:
┌──────────┬──────────────────────┬──────────┐
│ 0x14 │ DATA (if any) │ 0x10 │
└──────────┴──────────────────────┴──────────┘
INSYNC OK
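The framing rule can be tried out on a desktop machine before any hardware is involved. The following is a minimal sketch for parameterless commands only; `stk_reply_simple` is a hypothetical helper that works on byte buffers instead of the UART, and the constants are the STK500v1 values listed in 5.2.

```c
#include <stdint.h>
#include <stddef.h>

#define STK_OK      0x10
#define STK_INSYNC  0x14
#define STK_NOSYNC  0x15
#define STK_CRC_EOP 0x20

/* Build the reply for one parameterless command frame held in `in`.
 * Returns the number of reply bytes written (`out` must hold at least 2). */
size_t stk_reply_simple(const uint8_t *in, size_t in_len, uint8_t *out) {
    /* A parameterless frame is exactly: CMD, CRC_EOP. */
    if (in_len != 2 || in[1] != STK_CRC_EOP) {
        out[0] = STK_NOSYNC;   /* framing lost: tell the host to resync */
        return 1;
    }
    out[0] = STK_INSYNC;       /* every good reply starts with INSYNC */
    out[1] = STK_OK;           /* and ends with OK (no data in between here) */
    return 2;
}
```

Feeding it the two bytes 0x30 0x20 (GET_SYNC followed by CRC_EOP) yields 0x14 0x10, the same exchange Avrdude repeats several times at the start of every upload.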
5.2 — Protocol Constants
#define STK_OK 0x10
#define STK_INSYNC 0x14
#define STK_NOSYNC 0x15
#define STK_CRC_EOP 0x20
#define STK_GET_SYNC 0x30
#define STK_GET_PARAMETER 0x41
#define STK_SET_DEVICE 0x42
#define STK_SET_DEVICE_EXT 0x45
#define STK_ENTER_PROGMODE 0x50
#define STK_LEAVE_PROGMODE 0x51
#define STK_LOAD_ADDRESS 0x55
#define STK_PROG_PAGE 0x64
#define STK_READ_PAGE 0x74
#define STK_READ_SIGN 0x75
5.3 — Commands We Handle
| Command | Parameters | What We Do |
|---|---|---|
| GET_SYNC (0x30) | None | Reply INSYNC + OK |
| GET_PARAMETER (0x41) | 1 byte param ID | Reply version number |
| SET_DEVICE (0x42) | 20 bytes | Drain bytes, reply OK |
| SET_DEVICE_EXT (0x45) | 5 bytes | Drain bytes, reply OK |
| ENTER_PROGMODE (0x50) | None | Reply OK |
| LOAD_ADDRESS (0x55) | 2 byte address (low byte first) | Store address, reply OK |
| PROG_PAGE (0x64) | 2 byte length (high byte first) + memory type + data | Write to Flash (Ch6), reply OK |
| READ_PAGE (0x74) | 2 byte length (high byte first) + memory type | Send Flash bytes back |
| READ_SIGN (0x75) | None | Send 0x1E 0x95 0x0F |
| LEAVE_PROGMODE (0x51) | None | Reply OK, jump to app (Ch7) |
5.4 — Key Command Details
STK_LOAD_ADDRESS — Set Write Address
case STK_LOAD_ADDRESS:
{
uint16_t lo = uart_receive();
uint16_t hi = uart_receive();
address = (hi << 8) | lo;
/* address is a WORD address — multiply by 2 for SPM */
get_sync();
uart_send(STK_INSYNC);
uart_send(STK_OK);
break;
}
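The byte order and the word-to-byte conversion are easy to mix up, so here they are in isolation. This is a host-side sketch with hypothetical helper names; it assumes, as the handler above does, that the low byte arrives first and that the value counts words.

```c
#include <stdint.h>

/* STK500v1 sends the address low byte first, and it counts 16-bit words. */
uint16_t stk_word_address(uint8_t lo, uint8_t hi) {
    return (uint16_t)(((uint16_t)hi << 8) | lo);
}

/* Byte address of the same location, as used for Flash programming. */
uint32_t stk_byte_address(uint8_t lo, uint8_t hi) {
    return (uint32_t)stk_word_address(lo, hi) * 2u;
}
```

The second Flash page, for example, arrives as the bytes 0x40 0x00, which is word address 0x0040 and byte address 0x0080.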
STK_PROG_PAGE — Write Flash
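case STK_PROG_PAGE:
{
    /* Read the length in two statements: C does not define which operand
       of | is evaluated first, so a single expression could swap the bytes */
    uint16_t len = (uint16_t)uart_receive() << 8; // high byte first
    len |= uart_receive(); // then low byte
    uint8_t type = uart_receive(); // 'F' = Flash
    for (uint16_t i = 0; i < len; i++)
    {
        uint8_t b = uart_receive();
        if (i < PAGE_SIZE)
            page_buffer[i] = b; // never overrun the 128-byte buffer
    }
    if (len > PAGE_SIZE)
        len = PAGE_SIZE;
    get_sync();
    if (type == 'F')
        write_flash_page(address, page_buffer, len); // Chapter 6
    uart_send(STK_INSYNC);
    uart_send(STK_OK);
    break;
}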
STK_READ_SIGN — Chip Signature
📖 Datasheet §27.8.1: "The ATmega328P has a three byte signature code: 0x1E, 0x95, 0x0F"
case STK_READ_SIGN:
get_sync();
uart_send(STK_INSYNC);
uart_send(0x1E); // Atmel/Microchip
uart_send(0x95); // 32KB Flash
uart_send(0x0F); // ATmega328P
uart_send(STK_OK);
break;
5.5 — The Full Upload Sequence
Avrdude Our Bootloader
│── GET_SYNC (×several) ────────────►│
│◄── INSYNC + OK ───────────────────│
│── GET_PARAMETER (SW major/minor) ─►│
│◄── INSYNC + version + OK ─────────│
│── SET_DEVICE (20 bytes) ──────────►│ (ignored)
│── SET_DEVICE_EXT (5 bytes) ────────►│ (ignored)
│── ENTER_PROGMODE ────────────────►│
│── READ_SIGN ─────────────────────►│
│◄── INSYNC + 1E 95 0F + OK ────────│
│── LOAD_ADDRESS (page 0) ──────────►│ address = 0x0000
│── PROG_PAGE (128 bytes) ──────────►│ write page to flash
│── READ_PAGE (verify) ─────────────►│ read back and send
│ ... repeat for every page ... │
│── LEAVE_PROGMODE ────────────────►│
│◄── INSYNC + OK ───────────────────│
jump to 0x0000
Chapter 6: Flash Self-Programming
6.1 — What is Flash Self-Programming?
Normal code reads from Flash. Our bootloader needs to write to Flash — writing the incoming sketch data into the application section. This is called Self-Programming — the chip modifying its own Flash while running.
📖 Datasheet §27.1: "The Boot program can use any available data interface and associated protocol to read code and write (program) that code into the Flash memory"
6.2 — The Most Important Constraint: Pages
You cannot write individual bytes to Flash. Flash is organized into fixed blocks called pages. You must write a whole page at a time.
📖 Datasheet §27.5: "The Flash is organized in pages. When programming the Flash, the program data must be written one page at a time"
Flash page size = 64 WORDS = 128 BYTES
Total pages = 256 pages
256 pages × 128 bytes = 32,768 bytes = 32KB
📖 Datasheet §27.5, Table 27-5: "Page Size: 64 words / 128 bytes"
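The page arithmetic above can be written as a few one-line helpers. This is an illustrative host-side sketch; the names `page_index`, `page_base` and `pages_needed` are hypothetical and only the 128-byte page size comes from the text.

```c
#include <stdint.h>

#define PAGE_SIZE_BYTES 128u   /* 64 words per page on the ATmega328P */

/* Index of the page that contains a given byte address. */
uint16_t page_index(uint16_t byte_addr) {
    return (uint16_t)(byte_addr / PAGE_SIZE_BYTES);
}

/* First byte address of the page that contains byte_addr. */
uint16_t page_base(uint16_t byte_addr) {
    return (uint16_t)(byte_addr & ~(PAGE_SIZE_BYTES - 1u));
}

/* Pages needed to hold a sketch of the given size (rounds up). */
uint16_t pages_needed(uint16_t sketch_bytes) {
    return (uint16_t)((sketch_bytes + PAGE_SIZE_BYTES - 1u) / PAGE_SIZE_BYTES);
}
```

A 1000-byte sketch, for example, occupies 8 pages, and byte address 0x7C00 is the start of page 248, which leaves pages 0 to 247 for the application.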
6.3 — The 3 Step Writing Process
Step 1 — ERASE the page
Flash bits can only go 1→0 when writing.
Erase resets all bits back to 1 (0xFF).
Must erase before writing new data.
Step 2 — FILL the page buffer
Load your 128 bytes into a temporary hardware
buffer inside the chip (word by word — 2 bytes at a time).
NOT written to Flash yet.
Step 3 — WRITE the page buffer to Flash
Hardware copies buffer into actual Flash page.
Takes ~3.7ms (up to 4.5ms). The CPU is halted only for pages in the NRWW section (see 6.5); otherwise we busy-wait.
⚠️ Critical: If you skip the erase step, bits that were 0 stay 0. Your new data gets corrupted. Always erase first.
📖 Datasheet §27.3: Page Erase, Fill Temporary Buffer, Write Page from Temporary Buffer
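Why erasing matters is easiest to see in a small simulation. The sketch below models one page in ordinary host memory, not real Flash; `flash_page_t`, `sim_page_erase` and `sim_page_write` are hypothetical names, and the only behavior modeled is the rule stated above that programming can turn bits from 1 to 0 but never back.

```c
#include <stdint.h>
#include <string.h>

#define PAGE_BYTES 128

/* Model of one Flash page in host memory. */
typedef struct { uint8_t cell[PAGE_BYTES]; } flash_page_t;

/* Erase: every bit returns to 1. */
void sim_page_erase(flash_page_t *p) {
    memset(p->cell, 0xFF, PAGE_BYTES);
}

/* Write: programming can only pull bits from 1 to 0, so the result
 * is the AND of what is already stored and the new data. */
void sim_page_write(flash_page_t *p, const uint8_t *data) {
    for (int i = 0; i < PAGE_BYTES; i++)
        p->cell[i] &= data[i];
}
```

Writing 0xF0 over a page that still holds 0x0F leaves 0x00 in this model, while erasing first produces the intended 0xF0.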
6.4 — The SPM Instruction and boot.h
SPM (Store Program Memory) is a special AVR assembly instruction. We use <avr/boot.h> which wraps it in clean C macros:
#include <avr/boot.h>
boot_page_erase(byte_address); // Erase one page
boot_spm_busy_wait(); // Wait for operation to complete
boot_page_fill(byte_address, w); // Fill one word (2 bytes) into buffer
boot_page_write(byte_address); // Write page buffer to Flash
boot_rww_enable(); // Re-enable app section reading
6.5 — RWW vs NRWW Sections
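📖 Datasheet §27.1: "The Flash memory is organized in two sections: Read-While-Write (RWW) and No Read-While-Write (NRWW)"
On the ATmega328P this split is fixed by hardware and does not move with BOOTSZ: the RWW section is 0x0000-0x6FFF and the NRWW section is 0x7000-0x7FFF. Our bootloader at 0x7C00 sits in NRWW, so it can keep executing while a page in the RWW section is erased or written. Application pages from 0x7000 to 0x7BFF are also NRWW; the CPU is simply halted while those are programmed. While an RWW page is being programmed the RWW section cannot be read, so after programming we call boot_rww_enable() to re-enable reading from it.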
6.6 — write_flash_page(): The Full Function
#define PAGE_SIZE 128
#define BOOT_START 0x7C00 /* byte address of the boot section (BOOTSZ = 512 words) */
void write_flash_page(uint16_t word_addr,
uint8_t *data,
uint16_t length)
{
/* Safety — never write into bootloader section */
if (word_addr >= (BOOT_START / 2))
return;
/* Convert word address to byte address for SPM */
uint32_t byte_addr = (uint32_t)word_addr * 2;
/* Step 1 — Erase the page (~3.7ms) */
boot_page_erase(byte_addr);
boot_spm_busy_wait();
/* Step 2 — Fill page buffer word by word */
for (uint16_t i = 0; i < length; i += 2)
{
uint16_t word = data[i] | ((uint16_t)data[i + 1] << 8);
boot_page_fill(byte_addr + i, word);
}
/* Step 3 — Write page buffer to Flash (~3.7ms) */
boot_page_write(byte_addr);
boot_spm_busy_wait();
/* Re-enable RWW section for reading */
boot_rww_enable();
}
6.7 — Timing
📖 Datasheet §27.8.1, Table 27-14: Page Erase: 3.7ms. Page Write: 3.7ms. Total per page: ~7.4ms. Worst case for a completely full application section: 248 pages × 7.4ms ≈ 1.8 seconds, not counting UART transfer time.
6.8 — Chapter 6 Summary
| Concept | Key Point |
|---|---|
| Page size | 128 bytes; must write whole pages |
| 3 steps | Erase → Fill buffer → Write |
| SPM | Hardware instruction for Flash writing |
| <avr/boot.h> | Clean C macros wrapping SPM |
| Timing | ~7.4ms per page (erase + write) |
| Word vs byte | Multiply word address × 2 for SPM |
| RWW | Must call boot_rww_enable() after every write |
| Guard check | Never write at or above 0x7C00 |
Chapter 7: Jumping to the App & Watchdog Timer
7.1 — What is the Watchdog Timer?
The Watchdog Timer (WDT) is a completely independent hardware timer built into the ATmega328P. It runs on its own internal oscillator — separate from your main clock. Think of it like a dead man's switch: your code must periodically kick the watchdog to prove it is still alive. If it does not kick it in time — the watchdog resets the CPU. No exceptions.
📖 Datasheet §10.1: "The Watchdog Timer is clocked from a separate on-chip oscillator which runs at 128kHz"
7.2 — Why Its Own Oscillator?
The Watchdog runs on its own 128kHz oscillator completely independent from the main 16MHz system clock. Even if your code completely locks up in an infinite loop, crashes into invalid memory, or the main oscillator glitches — the watchdog timer keeps counting and resets the chip.
7.3 — What Can the Watchdog Do?
📖 Datasheet §10.2: Watchdog Timer modes
| Mode | What Happens When Timer Expires |
|---|---|
| Reset mode ← we use | Chip resets immediately. PC goes to 0x0000 or 0x7C00 |
| Interrupt mode | Fires an interrupt. Your ISR handles it. Code keeps running. |
| Both | First fires interrupt. If not cleared → then resets. |
7.4 — Watchdog Timeout Periods
📖 Datasheet §10.3, Table 10-2: Watchdog Timer prescale select
| WDP bits | Timeout | Use Case |
|---|---|---|
| 000 | 16 ms | Very tight safety loop |
| 101 | 500 ms | |
| 110 | 1 sec | ← we use this (upload window) |
| 111 | 2 sec | Most common safety timeout |
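The timeouts in the table follow from the 128kHz oscillator and a power-of-two cycle count. The sketch below reproduces them on a desktop machine; the helper names are hypothetical, and it assumes the relationship of 2048 cycles for prescaler value 0, doubling with each step, which is how the datasheet table is laid out.

```c
#include <stdint.h>

#define WDT_OSC_HZ 128000UL   /* nominal watchdog oscillator frequency */

/* Oscillator cycles before timeout for a prescaler value of 0..9. */
uint32_t wdt_cycles(uint8_t wdp) {
    return 2048UL << wdp;
}

/* Nominal timeout in milliseconds for the same prescaler value. */
uint32_t wdt_timeout_ms(uint8_t wdp) {
    return wdt_cycles(wdp) * 1000UL / WDT_OSC_HZ;
}
```

Prescaler value 6 (binary 110) works out to 1024 ms, which the datasheet rounds to 1 second. The oscillator is not precise, so real timeouts vary with voltage and temperature.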
7.5 — Why We Need It In Our Bootloader
Without a timeout, our uart_receive() waits forever. If nobody connects, the bootloader is stuck and your sketch never runs. The watchdog timer gives us a 1-second window: if Avrdude connects, we disable the watchdog and proceed with upload. If nobody connects, the watchdog fires, chip resets, we detect WDRF and jump to the app immediately.
7.6 — Detecting a Watchdog Reset (MCUSR)
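📖 Datasheet §8.4: MCUSR register, which tells us WHY the chip reset. WDRF (bit 3) is set when watchdog caused the reset.
/* Very first thing in main() */
uint8_t mcusr = MCUSR; // save reset reason
MCUSR = 0; // clear all flags (WDRF must be clear before the watchdog can be disabled)
if (mcusr & (1 << WDRF))
{
    /* Watchdog reset: hardware is already in its clean reset state,
       so stop the watchdog and start the app directly.
       Calling jump_to_app() here would reset the chip again, forever. */
    wdt_disable();
    ((void (*)(void))0)(); // jump to 0x0000
}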
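The decision taken at startup can be isolated as a pure function and tested on a desktop machine. This is a sketch under the assumption that the MCUSR bits are PORF = 0, EXTRF = 1, BORF = 2 and WDRF = 3; the `MCUSR_*` macro names, the enum and `decide_boot_action` are hypothetical and deliberately different from the avr-libc names.

```c
#include <stdint.h>

/* Bit positions in MCUSR on the ATmega328P. */
#define MCUSR_PORF  0   /* power-on reset */
#define MCUSR_EXTRF 1   /* external reset (RESET pin, e.g. the DTR pulse) */
#define MCUSR_BORF  2   /* brown-out reset */
#define MCUSR_WDRF  3   /* watchdog reset */

typedef enum { BOOT_WAIT_FOR_UPLOAD, BOOT_START_APP } boot_action_t;

/* Decide what the bootloader should do from a saved copy of MCUSR. */
boot_action_t decide_boot_action(uint8_t mcusr) {
    if (mcusr & (1u << MCUSR_WDRF))
        return BOOT_START_APP;       /* upload window expired or upload finished */
    return BOOT_WAIT_FOR_UPLOAD;     /* power-on, external or brown-out reset */
}
```

Keeping the decision separate from the register access makes it clear that only a watchdog reset skips the upload window; every other reset source opens it.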
7.7 — Why We Cannot Just goto 0x0000
Problem 1: The watchdog keeps running. The app never pets it, watchdog fires after 1 second, chip resets, bootloader runs again — infinite loop.
Problem 2: Dirty hardware state. The bootloader has initialized UART and modified hardware state. The app inherits all that instead of a clean reset state and may break.
7.8 — The Correct Way: Watchdog Reset Trick
Instead of jumping → use the watchdog to RESET the chip!
1. Set watchdog to shortest timeout (16ms)
2. Do nothing (don't pet it)
3. Watchdog fires after 16ms
4. Chip fully resets — clean hardware state!
5. Bootloader runs again
6. Sees WDRF flag in MCUSR → jump to 0x0000 immediately
7. App runs with perfectly clean hardware state ✅
📖 Datasheet §10.8: Watchdog System Reset Mode
7.9 — The Code
#include <avr/wdt.h>
/* Enable 1 second watchdog */
void watchdog_enable_1s(void)
{
wdt_enable(WDTO_1S);
}
/* Safe jump to application */
void jump_to_app(void)
{
wdt_enable(WDTO_15MS); // shortest timeout
while(1); // sit and wait for reset
/* After ~16ms → RESET → clean hardware state */
/* Bootloader restarts → sees WDRF → jumps to 0x0000 */
}
/* Disable watchdog once upload starts */
void watchdog_disable(void)
{
wdt_reset();
wdt_disable();
}
7.10 — Chapter 7 Summary
| Concept | Key Point |
|---|---|
| Watchdog | Independent 128kHz hardware timer |
| Purpose | Resets chip if code hangs or crashes |
| Our use | 1 second upload window timeout |
| MCUSR | Tells us WHY the chip reset |
| WDRF flag | Set when watchdog caused the reset |
| jump_to_app() | Set WDT to 16ms, loop, let it reset cleanly |
| Disable WDT | Call once Avrdude connects, because the upload takes time |
| Clean reset | Watchdog reset gives app a clean hardware state |
Chapter 8: The Complete Bootloader
8.1 — The Complete main.c
Create this file at C:\AVR_Bootloader\src\main.c
/*
* ATmega328P Bootloader
* Compatible with Arduino IDE (STK500v1 / Avrdude)
*
* Application : 0x0000 - 0x7BFF (31,744 bytes)
* Bootloader : 0x7C00 - 0x7FFF (1,024 bytes)
* HFUSE = 0xDC LOCK = 0xEF UART = 115200 8N1
*/
#include <avr/io.h>
#include <avr/boot.h>
#include <avr/pgmspace.h>
#include <avr/interrupt.h>
#include <avr/wdt.h>
#include <stdint.h>
#define BOOT_START 0x7C00
#define PAGE_SIZE 128
#define F_CPU 16000000UL
#define BAUD 115200UL
#define UBRR_VAL ((F_CPU + 8UL * BAUD) / (16UL * BAUD) - 1) /* rounds 7.68 to 8; plain division would give 7 */
#define STK_OK 0x10
#define STK_INSYNC 0x14
#define STK_NOSYNC 0x15
#define STK_CRC_EOP 0x20
#define STK_GET_SYNC 0x30
#define STK_GET_PARAMETER 0x41
#define STK_SET_DEVICE 0x42
#define STK_SET_DEVICE_EXT 0x45
#define STK_ENTER_PROGMODE 0x50
#define STK_LEAVE_PROGMODE 0x51
#define STK_LOAD_ADDRESS 0x55
#define STK_PROG_PAGE 0x64
#define STK_READ_PAGE 0x74
#define STK_READ_SIGN 0x75
#define SIGNATURE_0 0x1E
#define SIGNATURE_1 0x95
#define SIGNATURE_2 0x0F
static uint8_t page_buffer[PAGE_SIZE];
static uint16_t address = 0;
/* ── UART ── */
void uart_init(void) {
UBRR0H = (uint8_t)(UBRR_VAL >> 8);
UBRR0L = (uint8_t)(UBRR_VAL);
UCSR0B = (1 << RXEN0) | (1 << TXEN0);
UCSR0C = (1 << UCSZ01) | (1 << UCSZ00);
}
void uart_send(uint8_t b) {
while (!(UCSR0A & (1 << UDRE0)));
UDR0 = b;
}
uint8_t uart_receive(void) {
while (!(UCSR0A & (1 << RXC0)));
return UDR0;
}
/* ── STK500v1 helper ── */
void get_sync(void) {
uint8_t eop = uart_receive();
if (eop != STK_CRC_EOP) uart_send(STK_NOSYNC);
}
/* ── Flash self-programming ── */
void write_flash_page(uint16_t word_addr, uint8_t *data, uint16_t len) {
if (word_addr >= (BOOT_START / 2)) return;
uint32_t byte_addr = (uint32_t)word_addr * 2;
boot_page_erase(byte_addr); boot_spm_busy_wait();
for (uint16_t i = 0; i < len; i += 2) {
uint16_t word = data[i] | ((uint16_t)data[i+1] << 8);
boot_page_fill(byte_addr + i, word);
}
boot_page_write(byte_addr); boot_spm_busy_wait();
boot_rww_enable();
}
/* ── Jump to application ── */
void jump_to_app(void) {
wdt_enable(WDTO_15MS);
while(1);
}
/* ── Main ── */
int main(void) {
uint8_t mcusr = MCUSR;
MCUSR = 0;
if (mcusr & (1 << WDRF)) {
wdt_disable();
((void (*)(void))0)(); // jump to 0x0000
}
wdt_enable(WDTO_1S);
uart_init();
while (1) {
uint8_t cmd = uart_receive();
switch (cmd) {
case STK_GET_SYNC:
wdt_disable();
get_sync();
uart_send(STK_INSYNC); uart_send(STK_OK);
break;
case STK_GET_PARAMETER: {
uint8_t p = uart_receive(); get_sync();
uart_send(STK_INSYNC);
if (p == 0x80) uart_send(0x02);
else if (p == 0x81) uart_send(0x01);
else uart_send(0x00);
uart_send(STK_OK); break; }
case STK_SET_DEVICE:
for (uint8_t i=0;i<20;i++) uart_receive();
get_sync();
uart_send(STK_INSYNC); uart_send(STK_OK); break;
case STK_SET_DEVICE_EXT:
for (uint8_t i=0;i<5;i++) uart_receive();
get_sync();
uart_send(STK_INSYNC); uart_send(STK_OK); break;
case STK_ENTER_PROGMODE:
get_sync();
uart_send(STK_INSYNC); uart_send(STK_OK); break;
case STK_LOAD_ADDRESS: {
uint16_t lo = uart_receive();
uint16_t hi = uart_receive();
address = (hi << 8) | lo;
get_sync();
uart_send(STK_INSYNC); uart_send(STK_OK); break; }
case STK_PROG_PAGE: {
/* Two statements: operand evaluation order of | is unspecified in C */
uint16_t len = (uint16_t)uart_receive() << 8; /* high byte first */
len |= uart_receive();
uint8_t type = uart_receive();
for (uint16_t i=0;i<len;i++) {
uint8_t b = uart_receive();
if (i < PAGE_SIZE) page_buffer[i] = b; /* never overrun the buffer */
}
if (len > PAGE_SIZE) len = PAGE_SIZE;
get_sync();
if (type=='F') write_flash_page(address,page_buffer,len);
uart_send(STK_INSYNC); uart_send(STK_OK); break; }
case STK_READ_PAGE: {
uint16_t len = (uint16_t)uart_receive() << 8; /* high byte first */
len |= uart_receive();
uint8_t type = uart_receive(); get_sync();
uart_send(STK_INSYNC);
if (type=='F')
for (uint16_t i=0;i<len;i++)
uart_send(pgm_read_byte((uint32_t)(address*2)+i));
uart_send(STK_OK); break; }
case STK_READ_SIGN:
get_sync();
uart_send(STK_INSYNC);
uart_send(SIGNATURE_0);
uart_send(SIGNATURE_1);
uart_send(SIGNATURE_2);
uart_send(STK_OK); break;
case STK_LEAVE_PROGMODE:
get_sync();
uart_send(STK_INSYNC); uart_send(STK_OK);
jump_to_app(); break;
default:
get_sync();
uart_send(STK_INSYNC); uart_send(STK_OK); break;
}
}
return 0;
}
8.2 — Build It
Open a command prompt in C:\AVR_Bootloader\ and run build.bat. The critical check: the Program size reported by avr-size must not exceed 1024 bytes, the size of our bootloader section.
8.3 — Verify the .hex File
Open build\bootloader.hex in Notepad. The first line should show address 7C00:
:10 7C00 00 xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx xx
▲
└── 7C00 = our bootloader start address ✅
If you see 0000 here — the linker flag in build.bat is not working.
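The same check can be automated. The following host-side sketch pulls the 16-bit load address out of one Intel HEX record; `hex_record_address` is a hypothetical helper, it does not validate the checksum, and it expects the record exactly as it appears in the file (no spaces, unlike the annotated line above).

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Parse the 16-bit load address from one Intel HEX record.
 * Record layout: ':' LL AAAA TT DD... CC, written without spaces.
 * Returns -1 if the line is too short or does not start with ':'. */
long hex_record_address(const char *line) {
    if (line == NULL || line[0] != ':' || strlen(line) < 11)
        return -1;
    char field[5];
    memcpy(field, line + 3, 4);   /* skip ':' and the 2-digit length */
    field[4] = '\0';
    return strtol(field, NULL, 16);
}
```

Applied to the first line of build\bootloader.hex, a result of 0x7C00 means the linker flag took effect; 0 means the code was linked at the application start instead.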
8.4 — Flash It
Connect your USBasp and run flash.bat. It will set fuses, flash the bootloader, then set lock bits in that order.
8.5 — Test It
Test 1 — Timeout Works
Power on the board with no USB-Serial connected. Wait 2-3 seconds. Your existing sketch should run. This confirms the 1-second watchdog timeout and jump-to-app are working correctly.
Test 2 — Arduino IDE Upload Works
Open Arduino IDE, select Arduino Uno board, select the correct COM port, open the Blink sketch, and hit Upload. You should see bytes written and verified, then the LED starts blinking immediately.
8.6 — Complete Project Structure
C:\AVR_Bootloader\
├── src\
│ └── main.c ← the bootloader source
├── build\
│ ├── bootloader.elf ← compiled binary (intermediate)
│ └── bootloader.hex ← final hex file (flashed to chip)
├── build.bat ← compiles main.c → bootloader.hex
└── flash.bat ← sets fuses, flashes hex, sets lock bits
8.7 — Complete Tutorial Summary
| Chapter | What We Built |
|---|---|
| 1: How the chip boots | Memory map, BOOTRST fuse, two upload scenarios, startup flow |
| 2: Fuses & memory | BOOTSZ=512 words, start=0x7C00, HFUSE=0xDC, lock bits=0xEF |
| 3: Toolchain | Build and flash batch scripts for Windows using Arduino IDE tools |
| 4: UART | uart_init, uart_send, uart_receive; 5 registers, baud rate math |
| 5: STK500v1 protocol | 10 commands, handshake, address loading, data transfer |
| 6: Flash self-programming | Erase, fill, write, 128 byte pages, boot.h macros |
| 7: Watchdog & safe jump | 1 second window, WDRF detection, clean app handoff |
| 8: Complete bootloader | Everything assembled, built, flashed and tested |
✅ Tutorial Complete!
You now have a fully working Optiboot-style bootloader written from scratch, with every line of code tied back to the ATmega328P datasheet.