ATmega328P Bootloaders
Table of Contents
1. Introduction
2. Understanding Bootloader Memory Constraints
3. The ATmega328P Memory Architecture
4. Fuse Settings and Boot Section Configuration
5. Bootloader Approaches - Complete Overview
6. Approach 1: FAT Filesystem on SD Card
7. Approach 2: Raw Flash Image on SD Card
8. Approach 3: XMODEM Serial Protocol
9. Additional Bootloader Methods
10. Performance and Size Comparison
11. Recommendations by Use Case
12. Conclusion
Introduction
Developing a bootloader for the ATmega328P (the microcontroller on the Arduino Nano and Uno) is challenging but rewarding. This guide covers the memory architecture, fuse configuration, and several bootloader designs, with a focus on size-optimized solutions that fit within the chip's limited flash.
What You'll Learn
· Memory architecture of ATmega328P
· Fuse configuration for boot sections
· Three main bootloader approaches
· Additional bootloader methods for special cases
· Performance comparisons and recommendations
Understanding Bootloader Memory Constraints
The ATmega328P has limited resources that every bootloader developer must understand:
Critical Memory Limits
| Memory Type | Size | Bootloader Target | Share of Total |
|---|---|---|---|
| Flash (Program Memory) | 32 KB | < 2-4 KB | 6-12% |
| SRAM (Data Memory) | 2 KB | < 200 bytes | < 10% |
| EEPROM | 1 KB | Optional | - |
Why Bootloaders Must Be Small
When you flash a bootloader, it permanently occupies a portion of the flash. The remainder is available for your application:
text
Total Flash: 32,768 bytes (32 KB)
├── Bootloader Section: 2,048 - 4,096 bytes (2-4 KB)
└── Application Space: 28,672 - 30,720 bytes (28-30 KB)
A Real Memory Usage Example
Here's actual output from compiling a bootloader:
text
AVR Memory Usage
----------------
Device: atmega328p

Program:    3880 bytes (11.8% Full)
(.text + .data + .bootloader)

Data:        256 bytes (12.5% Full)
(.data + .bss + .noinit)
Analysis:
· Program uses 3,880 bytes (about 3.8 KB), which exceeds a 2KB (2,048-byte) boot section
· Data uses 256 bytes of SRAM, slightly above the 200-byte target but workable for a bootloader
· This build needs the 4KB (4,096-byte) boot section to fit
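The fit check in the analysis above can be sketched in a few lines of portable C. This is a minimal illustration, not part of the original build: the function names `smallest_boot_section` and `application_bytes` are hypothetical, and the four section sizes are the ones the ATmega328P offers.

```c
#include <stdint.h>

/* Boot section sizes selectable on the ATmega328P, in bytes, smallest first. */
static const uint16_t BOOT_SECTION_SIZES[] = { 512, 1024, 2048, 4096 };

/* Returns the smallest boot section (in bytes) that holds a bootloader of
 * program_bytes, or 0 if it exceeds the 4 KB maximum. */
uint16_t smallest_boot_section(uint16_t program_bytes)
{
    for (uint8_t i = 0; i < 4; i++) {
        if (program_bytes <= BOOT_SECTION_SIZES[i]) {
            return BOOT_SECTION_SIZES[i];
        }
    }
    return 0;
}

/* Flash left for the application once the boot section is reserved. */
uint16_t application_bytes(uint16_t boot_section_bytes)
{
    return (uint16_t)(32768u - boot_section_bytes);
}
```

For the 3,880-byte build above, `smallest_boot_section(3880)` selects the 4,096-byte section, which leaves 28,672 bytes for the application.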
The ATmega328P Memory Architecture
Flash Memory Layout
| Address Range | Size | Illustrative Usage Layout |
|---|---|---|
| 0x0000 - 0x1FFF | 8 KB | Application Part 1 |
| 0x2000 - 0x3FFF | 8 KB | Application Part 2 |
| 0x4000 - 0x5FFF | 8 KB | Application Part 3 |
| 0x6000 - 0x6FFF | 4 KB | Application Part 4 |
| 0x7000 - 0x77FF | 2 KB | Additional boot section area (used when configured for 4KB) |
| 0x7800 - 0x7FFF | 2 KB | Boot section used by 2KB configurations |
Bootloader Section Sizes
| Bootloader Size | Start Address | End Address | App Space | Fuse Setting (HFUSE) |
|---|---|---|---|---|
| 512 bytes | 0x7E00 | 0x7FFF | 32,256 bytes | 0xDE (BOOTSZ=11, 512B; Optiboot default) |
| 1 KB | 0x7C00 | 0x7FFF | 31,744 bytes | 0xDC (BOOTSZ=10, 1KB section) |
| 2 KB | 0x7800 | 0x7FFF | 30,720 bytes | 0xDA (BOOTSZ=01, 2KB; legacy ATmegaBOOT Nano) |
| 4 KB | 0x7000 | 0x7FFF | 28,672 bytes | 0xD8 (BOOTSZ=00, 4KB; max on ATmega328P) |
Note: The HFUSE values shown above assume BOOTRST is programmed (BOOTRST=0) and the other fuse bits follow the common Arduino configuration (SPIEN enabled, debugWIRE disabled, etc.).
Fuse Settings and Boot Section Configuration
Important Warning About Fuse Programming
Incorrect fuse settings can make the microcontroller difficult or impossible to program using normal ISP methods. In particular, disabling RESET (RSTDISBL) or SPI programming access can require a high-voltage programmer to recover the chip.
What Are Fuses?
Fuses are non-volatile configuration bits that control hardware features including:
· Boot section size and location
· Reset vector behavior
· Clock source and timing
· Brown-out detection levels
Critical Fuse Settings for Bootloaders
| Fuse | HFUSE Bit | Factory Default | Description |
|---|---|---|---|
| BOOTRST | 0 | 1 (unprogrammed, disabled) | Reset vector points to bootloader when 0 |
| BOOTSZ1 | 2 | 0 (programmed); varies by board | Boot section size bit 1 |
| BOOTSZ0 | 1 | 0 (programmed); varies by board | Boot section size bit 0 |
Boot Section Encoding
Note: AVR datasheets often describe boot section sizes in words (1 word = 2 bytes), while most AVR tools use byte addresses. The table below uses byte-based boot section sizes and byte addresses for clarity.
| BOOTSZ1 | BOOTSZ0 | Boot Size | Start Address (bytes) | Typical Use |
|---|---|---|---|---|
| 0 | 0 | 4 KB (2048 words) | 0x7000 | Large custom bootloaders |
| 0 | 1 | 2 KB (1024 words) | 0x7800 | Legacy Arduino Nano |
| 1 | 0 | 1 KB (512 words) | 0x7C00 | Medium size |
| 1 | 1 | 512 B (256 words) | 0x7E00 | Modern Optiboot (recommended) |
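The encoding above is regular enough to compute instead of looking up: each step of BOOTSZ halves the section, starting from 4 KB at BOOTSZ=00. The following sketch decodes an HFUSE value into the boot section size and its start address in both byte and word form. The function names are hypothetical; the bit positions are the ones given in the fuse tables of this guide.

```c
#include <stdint.h>

#define FLASH_END_BYTES 0x8000u   /* 32 KB of flash, byte addressed */

/* Boot section size in bytes for a given HFUSE value.
 * BOOTSZ1:BOOTSZ0 sit in bits 2:1; 00 = 4 KB down to 11 = 512 B. */
uint16_t boot_size_bytes(uint8_t hfuse)
{
    uint8_t bootsz = (hfuse >> 1) & 0x03;
    return (uint16_t)(4096u >> bootsz);
}

/* Byte address where the boot section starts (the form most tools use). */
uint16_t boot_start_byte_addr(uint8_t hfuse)
{
    return (uint16_t)(FLASH_END_BYTES - boot_size_bytes(hfuse));
}

/* Word address of the same location (the form the datasheet lists). */
uint16_t boot_start_word_addr(uint8_t hfuse)
{
    return boot_start_byte_addr(hfuse) / 2;
}
```

For example, `boot_start_byte_addr(0xDA)` yields 0x7800 and `boot_start_word_addr(0xD8)` yields 0x3800. Keeping both forms side by side helps avoid the common mistake of passing a byte address where a word address is expected.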
Arduino Nano Legacy Fuse Settings
text
Low Fuse:  0xFF (External crystal, longest start-up delay)
High Fuse: 0xDA (Legacy ATmegaBOOT Nano: 2KB boot section; modern Optiboot uses 0xDE for 512B)
Extended:  0x05 (Brown-out at 2.7V)
ATmega328P Fuse Bytes Layout
Low Fuse Byte (LFUSE)
Bit:  7 6 5 4 3 2 1 0
      | | | | | | | |
      | | | | | | | CKSEL0
      | | | | | | CKSEL1
      | | | | | CKSEL2
      | | | | CKSEL3
      | | | SUT0
      | | SUT1
      | CKOUT
      CKDIV8
LFUSE Bit Meanings
| Bit | Name | Function |
|---|---|---|
| 7 | CKDIV8 | Divide system clock by 8 |
| 6 | CKOUT | Output system clock on PB0 |
| 5 | SUT1 | Startup time selection |
| 4 | SUT0 | Startup time selection |
| 3 | CKSEL3 | Clock source selection |
| 2 | CKSEL2 | Clock source selection |
| 1 | CKSEL1 | Clock source selection |
| 0 | CKSEL0 | Clock source selection |
High Fuse Byte (HFUSE)
Bit:  7 6 5 4 3 2 1 0
      | | | | | | | |
      | | | | | | | BOOTRST
      | | | | | | BOOTSZ0
      | | | | | BOOTSZ1
      | | | | EESAVE
      | | | WDTON
      | | SPIEN
      | DWEN
      RSTDISBL
HFUSE Bit Meanings
| Bit | Name | Function |
|---|---|---|
| 7 | RSTDISBL | Disable external RESET pin |
| 6 | DWEN | Enable debugWIRE |
| 5 | SPIEN | Enable SPI programming |
| 4 | WDTON | Watchdog always on |
| 3 | EESAVE | Preserve EEPROM during chip erase |
| 2 | BOOTSZ1 | Bootloader size selection |
| 1 | BOOTSZ0 | Bootloader size selection |
| 0 | BOOTRST | Start execution from bootloader |
Extended Fuse Byte (EFUSE)
Bit:  7 6 5 4 3 2 1 0
      | | | | | | | |
      | | | | | | | BODLEVEL0
      | | | | | | BODLEVEL1
      | | | | | BODLEVEL2
      | | | | Reserved
      | | | Reserved
      | | Reserved
      | Reserved
      Reserved
EFUSE Bit Meanings
| Bit | Name | Function |
|---|---|---|
| 7 | Reserved | Keep as 1 |
| 6 | Reserved | Keep as 1 |
| 5 | Reserved | Keep as 1 |
| 4 | Reserved | Keep as 1 |
| 3 | Reserved | Keep as 1 |
| 2 | BODLEVEL2 | Brown-out detection level |
| 1 | BODLEVEL1 | Brown-out detection level |
| 0 | BODLEVEL0 | Brown-out detection level |
AVR Fuse Logic
Important: AVR fuse bits use inverted logic. A value of 0 means the feature is programmed/enabled, while 1 means unprogrammed/disabled.
0 = programmed/enabled
1 = unprogrammed/disabled
Example: SPIEN = 0 means SPI programming is enabled.
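Because the inverted logic is easy to get backwards, it can help to wrap the test in a small helper. The sketch below is illustrative only: the enum and the name `fuse_is_programmed` are hypothetical, while the bit positions follow the HFUSE layout shown earlier.

```c
#include <stdbool.h>
#include <stdint.h>

/* HFUSE bit positions on the ATmega328P. */
enum { BOOTRST = 0, BOOTSZ0 = 1, BOOTSZ1 = 2, EESAVE = 3,
       WDTON = 4, SPIEN = 5, DWEN = 6, RSTDISBL = 7 };

/* A fuse bit is "programmed" (feature active) when it reads 0. */
bool fuse_is_programmed(uint8_t fuse_byte, uint8_t bit)
{
    return ((fuse_byte >> bit) & 1u) == 0;
}
```

With HFUSE = 0xDE (binary 11011110), `fuse_is_programmed(0xDE, SPIEN)` and `fuse_is_programmed(0xDE, BOOTRST)` are both true, while RSTDISBL and DWEN are unprogrammed, so the RESET pin still works and debugWIRE is off.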
Clarification: Modern Arduino Uno and newer Nano boards typically use Optiboot with a 512-byte boot section (HFUSE = 0xDE). Older Nano boards using the legacy ATmegaBOOT bootloader commonly used a 2KB boot section (HFUSE = 0xDA).
Common Arduino Uno Fuse Values
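| Fuse | Hex | Binary |
|---|---|---|
| LFUSE | 0xFF | 11111111 |
| HFUSE | 0xDE | 11011110 |
| EFUSE | 0x05 | 00000101 |

Note: 0x05 sets only the three BODLEVEL bits. The reserved upper bits read back as 1 on the device, so some tools report the same setting as 0xFD.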
Changing Fuses for 4KB Boot Section
bash
# Single command to set 4KB boot section
avrdude -c usbasp -p m328p -U hfuse:w:0xD8:m

# Verify new settings
avrdude -c usbasp -p m328p -U hfuse:r:-:h
Expected output (the fuse value appears among avrdude's status messages):
text
0xd8
Note on Legacy Boot Section Size
Many legacy Arduino Nano boards reserve a 2KB boot section even though modern Optiboot implementations need only about 0.5KB. The difference, roughly 1.5KB of flash, sits unused and could be reclaimed for the application or used by a larger custom bootloader.
Bootloader Approaches - Complete Overview
Summary Comparison Table
| # | Method | Storage | Code Size | Complexity | Fuse Change (from 2KB section) | External Hardware |
|---|---|---|---|---|---|---|
| 1 | FAT Filesystem on SD | SD Card | 3.6-4 KB | High | Yes (4KB) | SD Card + SPI |
| 2 | Raw Sectors on SD | SD Card | 1.2-1.3 KB | Medium | No (2KB) | SD Card + SPI |
| 3 | XMODEM over UART | Serial | ~0.9 KB | Low | No | USB Cable |
| 4 | YMODEM over UART | Serial | 1.2-1.8 KB | Medium | No | USB Cable |
| 5 | Custom Serial | Serial | 0.3-0.5 KB | Low | No | USB Cable |
| 6 | I2C Bootloader | I2C EEPROM | 0.8-1.2 KB | Medium | No | I2C Master |
| 7 | USB HID (V-USB) | USB | 1.8-2.2 KB | High | No | Resistors + Crystal |
| 8 | OTA Wireless | RF Module | 3-5 KB | Very High | Yes (4KB) | NRF24L01/ESP |
| 9 | Watchdog Timer | None | +0.1 KB | Low | No | None (modifies existing) |
| 10 | Dual Image | Flash | 0.5-0.8 KB | High | No | None |
Decision Flowchart
text
Start: Need a bootloader?
│
├─→ Need file compatibility (FAT)?
│   ├─→ YES → Use FAT SD Card (Approach 1)
│   └─→ NO  → Continue
│
├─→ Have SD card hardware?
│   ├─→ YES → Use Raw SD Card (Approach 2)
│   └─→ NO  → Continue
│
├─→ Have USB-serial available?
│   ├─→ YES → Use XMODEM (Approach 3)
│   └─→ NO  → Use I2C or Custom protocol
│
└─→ Need OTA capability?
    ├─→ YES → Wireless bootloader
    └─→ NO  → Done
Approach 1: FAT Filesystem on SD Card
Overview
This approach implements a read-only FAT16/FAT32 reader that loads firmware from a standard SD card formatted on any computer.
Size Breakdown
| Component | Size (bytes) | Description |
|---|---|---|
| FAT Reader | 700-800 | MBR/BPB parsing, cluster walking |
| SD Card Driver | 600-700 | SPI communication, command handling |
| CRC32 | 40 | Checksum verification |
| Flash Writer | 200 | Page erase/write operations |
| Main Logic | 300 | Orchestration and error handling |
| Total (flash) | ~1840-2040 | Code only |
| RAM Buffers (SRAM) | 1208 | .data + .bss (sector + page buffers); separate from flash |
| Boot section fit | ~1840-2040 of 2048 | These components alone leave almost no margin in a 2KB section; use the 4KB section |
Memory Map
text
Flash (3.8 KB total; the listed components account for about 2 KB):
├── FAT Filesystem Parser (800 bytes)
├── SD Card Driver (700 bytes)
├── CRC32 Implementation (40 bytes)
├── Flash Writer (200 bytes)
└── Main Control (300 bytes)

RAM (1.2 KB):
├── Sector Buffer (512 bytes)
├── Page Buffer (128 bytes)
├── Shared SBUF (512 bytes)
└── Variables (56 bytes)
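Most of the "cluster walking" in the FAT reader reduces to one address calculation: turning a cluster number into an absolute SD card sector. The sketch below shows that step for FAT16 with 512-byte sectors. It is a simplified illustration under those assumptions, and the struct and function names are hypothetical, not taken from any particular FAT library.

```c
#include <stdint.h>

/* Subset of the BIOS Parameter Block needed to locate file data on FAT16. */
typedef struct {
    uint32_t partition_lba;       /* first sector of the partition (from the MBR) */
    uint16_t reserved_sectors;    /* sectors before the first FAT */
    uint8_t  fat_count;           /* number of FAT copies, usually 2 */
    uint16_t sectors_per_fat;
    uint16_t root_entry_count;    /* 32-byte directory entries in the root */
    uint8_t  sectors_per_cluster;
} fat16_geometry;

/* First sector of the data region, where cluster 2 begins. */
uint32_t fat16_first_data_sector(const fat16_geometry *g)
{
    uint32_t root_dir_sectors =
        ((uint32_t)g->root_entry_count * 32u + 511u) / 512u;
    return g->partition_lba + g->reserved_sectors
         + (uint32_t)g->fat_count * g->sectors_per_fat
         + root_dir_sectors;
}

/* Absolute LBA of the first sector of a data cluster (numbering starts at 2). */
uint32_t fat16_cluster_to_lba(const fat16_geometry *g, uint16_t cluster)
{
    return fat16_first_data_sector(g)
         + (uint32_t)(cluster - 2u) * g->sectors_per_cluster;
}
```

Computing `fat16_first_data_sector` once and caching the result keeps the per-cluster work down to one multiply and one add, which matters when every byte of the boot section counts.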
Pros and Cons
| Pros | Cons |
|---|---|
| User-friendly (drag & drop files) | Large code size (needs 4KB boot) |
| Works on any PC OS | High RAM usage |
| CRC verification possible | Complex debugging |
| Can store multiple files | Slow boot time (FAT parsing) |
When to Use
· End-user products where users update by copying files
· Development environments where SD card is already present
· Data logging devices that already have SD card hardware
Approach 2: Raw Flash Image on SD Card
Overview
This approach bypasses the filesystem entirely and reads firmware from fixed SD card sectors. The code is much smaller, but the SD card must be prepared with special tools.
Size Breakdown
| Component | Size (bytes) | Description |
|---|---|---|
| SD Card Driver | 600-700 | SPI communication (minimal) |
| Raw Sector Reader | 200 | Direct LBA reads |
| CRC16 | 30 | Simple checksum |
| Flash Writer | 200 | Page operations |
| Main Logic | 150 | Simple loop |
| Total (flash) | ~1180-1280 | Fits in 2KB boot! |
| RAM Buffers (SRAM) | 640 | Single 512-byte buffer + variables; separate from flash |
| Boot section fit | ~1180-1280 of 2048 | Fits in the 2KB boot section with room to spare |
Memory Layout with 2KB Boot Section
text
Flash Layout (2KB boot section at 0x7800; regions are rounded up and larger than the code they hold):
0x7800 - 0x7803 : Bootloader entry point
0x7804 - 0x7CFF : SD Card driver (700 bytes)
0x7D00 - 0x7DFF : Raw reader (200 bytes)
0x7E00 - 0x7EFF : Flash writer (200 bytes)
0x7F00 - 0x7FFF : Main logic (variables live in SRAM, not flash)

SD Card Layout:
Sector 0-99    : Reserved (MBR, etc.)
Sector 100-164 : Firmware image (65 sectors reserved; a full 32KB image needs 64)
Sector 165     : CRC/version info (optional)
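With a fixed layout like the one above, locating firmware data is plain arithmetic on 512-byte sectors. The following sketch assumes the firmware image starts at sector 100, as in the SD card layout; the macro and function names are hypothetical.

```c
#include <stdint.h>

#define SECTOR_BYTES        512u
#define FIRMWARE_FIRST_LBA  100u   /* first firmware sector in the layout above */

/* Number of 512-byte sectors needed to hold image_bytes (rounded up). */
uint32_t sectors_for_image(uint32_t image_bytes)
{
    return (image_bytes + SECTOR_BYTES - 1) / SECTOR_BYTES;
}

/* LBA of the sector that holds a given flash byte address of the image. */
uint32_t lba_for_flash_addr(uint16_t flash_addr)
{
    return FIRMWARE_FIRST_LBA + flash_addr / SECTOR_BYTES;
}

/* Offset inside that sector of the data destined for flash_addr. */
uint16_t offset_in_sector(uint16_t flash_addr)
{
    return (uint16_t)(flash_addr % SECTOR_BYTES);
}
```

Since the ATmega328P flash page is 128 bytes, each 512-byte sector supplies exactly four pages, so the bootloader can read one sector and program four pages before fetching the next.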
Pros and Cons
| Pros | Cons |
|---|---|
| Small code (fits in 2KB) | Requires special tools to prepare SD card |
| Fast boot (no filesystem parsing) | No file names (must know sector numbers) |
| Simple and reliable | User can't just copy files |
| Easy to debug | Must document sector layout |
When to Use
· Developer-focused tools where you control the update process
· Space-constrained bootloaders (under 2KB)
· Fixed-format field updates with prepared SD cards
Approach 3: XMODEM Serial Protocol
Overview
Uses the classic XMODEM protocol over UART to receive firmware. No SD card needed - just the USB cable you already use for programming.
Size Breakdown (Actual Measured)
| Component | Size (bytes) | Notes |
|---|---|---|
| XMODEM Protocol | 300 | |
| UART Driver | 100 | |
| CRC16 | 30 | |
| Flash Writer | 200 | |
| Delay/Timing | 100 | |
| Main Logic | 150 | |
| Total Code (flash) | ~880 | Fits in 2KB boot section |
| RAM (stack + buffers) | 256 | SRAM (not flash) |
| Boot section fit | ~880 of 2048 | Fits easily in 2KB, and also in a 1KB section |
Host-side software is required for XMODEM uploads. Common options include custom Python scripts, terminal programs with XMODEM support, or embedded upload utilities.
Protocol Flow Diagram
text
PC Host                                     Arduino Nano (Bootloader)
   │                                            │
   │  Send 'U' trigger                          │
   │ ─────────────────────────────────────────► │
   │                                            │
   │  Respond with 'C' (CRC mode)               │
   │ ◄───────────────────────────────────────── │
   │                                            │
   │  Send SOH + block # + data + CRC           │
   │ ─────────────────────────────────────────► │
   │                                            │
   │                   Verify CRC, flash page   │
   │                                            │
   │  Send ACK                                  │
   │ ◄───────────────────────────────────────── │
   │                                            │
   │  ... repeat for all blocks ...             │
   │                                            │
   │  Send EOT (End of Transmission)            │
   │ ─────────────────────────────────────────► │
   │                                            │
   │  Send ACK, jump to app                     │
   │ ◄───────────────────────────────────────── │
   │                                            │
XMODEM Packet Structure
text
+------+------+------+------------------+---------+
| SOH  | Blk# | ~Blk | Data (128 bytes) | CRC     |
| 0x01 | 0x01 | 0xFE | ..............   | 2 bytes |
+------+------+------+------------------+---------+

Legend:
SOH  = Start of Header (0x01)
Blk# = Block number (starts at 1, wraps from 255 to 0)
~Blk = Bitwise complement of block number
Data = 128 bytes of firmware
CRC  = 16-bit CRC of data (high byte first)
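The packet check on the receiving side can be sketched as follows. The CRC is the standard CRC-16/XMODEM (polynomial 0x1021, initial value 0), computed bit by bit to keep the code small. The names `crc16_xmodem` and `xmodem_packet_ok` are hypothetical; this is a minimal illustration of the validation step, not the complete receive loop.

```c
#include <stddef.h>
#include <stdint.h>

#define XMODEM_SOH       0x01
#define XMODEM_DATA_LEN  128

/* CRC-16/XMODEM: polynomial 0x1021, initial value 0, no reflection. */
uint16_t crc16_xmodem(const uint8_t *data, size_t len)
{
    uint16_t crc = 0;
    for (size_t i = 0; i < len; i++) {
        crc ^= (uint16_t)data[i] << 8;
        for (uint8_t bit = 0; bit < 8; bit++) {
            crc = (crc & 0x8000) ? (uint16_t)((crc << 1) ^ 0x1021)
                                 : (uint16_t)(crc << 1);
        }
    }
    return crc;
}

/* Returns 1 if the 133-byte packet is well formed and carries the expected
 * block number, else 0. The CRC covers only the 128 data bytes. */
int xmodem_packet_ok(const uint8_t *pkt, uint8_t expected_block)
{
    if (pkt[0] != XMODEM_SOH) return 0;
    if (pkt[1] != expected_block) return 0;
    if ((uint8_t)(pkt[1] + pkt[2]) != 0xFF) return 0;   /* block + ~block */
    uint16_t received = (uint16_t)((pkt[3 + XMODEM_DATA_LEN] << 8) |
                                    pkt[4 + XMODEM_DATA_LEN]);
    return crc16_xmodem(&pkt[3], XMODEM_DATA_LEN) == received;
}
```

A convenient property on this chip is that one XMODEM block is exactly one 128-byte flash page, so each validated packet maps to a single page write with no extra buffering.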
Pros and Cons
| Pros | Cons |
|---|---|
| Very small code (fits in 2KB) | Slower than SD card (115200 baud) |
| No extra hardware needed | Requires PC software for upload |
| Uses existing USB-serial | No file system (send binary only) |
| Reliable with CRC verification | Must trigger within 1 second |
| Easy to debug | Single-threaded (no other tasks during flash) |
When to Use
· Default choice for most hobbyist projects
· Space-constrained bootloaders (must fit in 2KB)
· No SD card hardware on the board
· Rapid development with frequent updates
Additional Bootloader Methods
Method 4: YMODEM Protocol
Improvements over XMODEM:
· Larger 1KB block sizes (vs XMODEM's 128 bytes)
· File name preservation
· Variable block sizes (128 or 1024 bytes)
Code size: 1.2-1.8 KB
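Best for: Larger transfers, where 1KB blocks reduce per-block overhead. The usual sweet spot (files over 64KB) exceeds the ATmega328P's 32KB flash, so the gain on this chip is modest.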
Method 5: Custom Serial Protocol
Minimal implementation:
c
// Packet: [0xAA][LEN][DATA...][CRC8]
// Only 40 lines of code!
Code size: 0.3-0.5 KB (the smallest standalone option covered here)
Best for: Custom PC software control
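A parser for the packet format above might look like the sketch below. Several details are assumptions made for illustration, since the format only names the fields: the CRC-8 uses polynomial 0x07 with an initial value of 0, and it covers LEN and DATA but not the 0xAA start byte. The function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define PKT_START 0xAA

/* CRC-8, polynomial 0x07, initial value 0 (an assumed choice). */
uint8_t crc8(const uint8_t *data, size_t len)
{
    uint8_t crc = 0;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (uint8_t bit = 0; bit < 8; bit++) {
            crc = (crc & 0x80) ? (uint8_t)((crc << 1) ^ 0x07)
                               : (uint8_t)(crc << 1);
        }
    }
    return crc;
}

/* Validates [0xAA][LEN][DATA...][CRC8] held in buf[0..n-1].
 * Returns the payload length and sets *payload on success,
 * or returns -1 on any framing or CRC error.
 * Assumption: the CRC covers LEN and DATA, not the start byte. */
int parse_packet(const uint8_t *buf, size_t n, const uint8_t **payload)
{
    if (n < 3 || buf[0] != PKT_START) return -1;
    uint8_t len = buf[1];
    if (n != (size_t)len + 3) return -1;
    if (crc8(&buf[1], (size_t)len + 1) != buf[2 + len]) return -1;
    *payload = &buf[2];
    return len;
}
```

Including LEN in the CRC means a corrupted length byte is caught by the same check as corrupted data, at no extra code cost.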
Method 6: I2C Bootloader
Use case: Update from another microcontroller or external EEPROM
text
External I2C EEPROM (24LC256)
        │
        │ I2C (SDA/SCL)
        ▼
ATmega328P (Bootloader)
        │
        │ Flash programming
        ▼
Internal Flash
Code size: 0.8-1.2 KB
Best for: Multi-chip products, field updates via external programmer
Method 7: USB HID Bootloader
Using V-USB library: Software USB implementation
Requirements:
· External crystal at a frequency V-USB supports (12MHz and 16MHz crystals both work)
· 3.6V Zener diodes on D+ and D-
· 68Ω resistors on data lines
Code size: 1.8-2.2 KB (tight fit in 2KB)
Best for: When you want USB but have no serial chip
Method 8: OTA Wireless Bootloader
Compatible modules:
· NRF24L01+ (2.4GHz)
· HC-05/HC-06 (Bluetooth)
· ESP8266 (WiFi)
· LoRa modules
Protocol: Simple packet-based with retries
Code size: 3-5 KB (needs the 4KB boot section; since 4KB is the maximum, anything larger must be trimmed to fit)
Best for: Remote updates, IoT devices
Method 9: Watchdog Timer Bootloader
Concept: Enter bootloader by resetting multiple times
c
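// In application (Arduino sketch, built with avr-gcc)
#include <avr/eeprom.h>

#define RESET_COUNT_ADDR ((uint8_t *)0)  // EEPROM byte holding the reset counter
#define BOOT_START 0x7000u               // 4KB boot section start (byte address)

void setup() {
    uint8_t reset_count = eeprom_read_byte(RESET_COUNT_ADDR);
    if (reset_count == 0xFF) reset_count = 0;  // erased EEPROM reads 0xFF
    if (reset_count > 3) {
        eeprom_update_byte(RESET_COUNT_ADDR, 0);  // avoid re-entering on the next boot
        // Jump to bootloader; avr-gcc function pointers hold word addresses
        ((void (*)(void))(BOOT_START / 2))();
    }
    // Store reset count
    eeprom_update_byte(RESET_COUNT_ADDR, reset_count + 1);
    // Clear after 2 seconds
    delay(2000);
    eeprom_update_byte(RESET_COUNT_ADDR, 0);
}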
Additional code: ~100 bytes
Best for: Adding bootloader entry to existing apps
Method 10: Dual Image Bootloader
Concept: Keep two application images in flash
text
Flash Layout (example only; unequal image sizes for illustration):
0x0000-0x3FFF : Image 1 (Active)
0x4000-0x6FFF : Image 2 (Update)
0x7000-0x7FFF : Bootloader
Benefits:
· Atomic updates (always valid image)
· Rollback capability
· No "bricked" devices
Drawback: Uses 50% of flash for redundancy
Code size: 500-800 bytes (just the switcher)
Best for: Safety-critical applications, remote locations
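The "switcher" at the heart of a dual-image design is a small decision function. The sketch below assumes the bootloader keeps a validity flag and a version number for each image; that metadata layout and the names `image_info` and `select_boot_image` are hypothetical, chosen only to illustrate the rollback logic.

```c
#include <stdint.h>

/* Per-image metadata the switcher is assumed to keep (hypothetical layout). */
typedef struct {
    uint8_t  crc_ok;    /* 1 if the image's stored CRC matches its contents */
    uint16_t version;   /* higher number = newer build */
} image_info;

/* Returns 0 or 1 for the image slot to boot, or -1 if neither is valid.
 * Prefers the newer valid image; an invalid image is never chosen. */
int select_boot_image(const image_info *slot0, const image_info *slot1)
{
    if (slot0->crc_ok && slot1->crc_ok) {
        return (slot1->version > slot0->version) ? 1 : 0;
    }
    if (slot0->crc_ok) return 0;
    if (slot1->crc_ok) return 1;
    return -1;   /* stay in the bootloader and wait for a new upload */
}
```

Rollback falls out of the same rule: if an update fails its CRC check, the older image is still the only valid one and is selected automatically.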
Performance and Size Comparison
Note: Flash memory usage and SRAM usage are independent resources on the ATmega328P. The comparison values below are shown separately to avoid implying that RAM usage consumes flash bootloader space.
Code Size Comparison Table
| Bootloader Type | Code / Flash (bytes) | RAM / SRAM (bytes) | Fits 2KB Boot Section? |
|---|---|---|---|
| Minimal Custom Serial | 350 | 32 | ✅ Yes |
| Optiboot (Arduino stock, the gold standard) | ~512 | ~20 | ✅ Yes |
| XMODEM | 880 | 256 | ✅ Yes |
| YMODEM | 1500 | 256 | ✅ Yes |
| Raw SD Card | 1280 | 640 | ✅ Yes |
| USB HID (V-USB) | 2100 | 512 | ❌ No |
| FAT SD Card | 1850-2040 | 1208 | ❌ No practical margin (use 4KB) |
| OTA Wireless | 4000 | 1024 | ❌ No |
Transfer Speed Comparison
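| Method | Interface | Throughput | Time for 32KB |
|---|---|---|---|
| USB-serial (115200 baud) | UART | 11.5 KB/s | ~2.8 seconds |
| USB-serial (250000 baud) | UART | 25 KB/s | ~1.3 seconds |
| SD Card (SPI @ 8MHz) | SPI | ~100-250 KB/s sustained (practical) | ~130 - 320 milliseconds |
| I2C EEPROM (400kHz) | I2C | ~44 KB/s (9 clock bits per byte) | ~740 milliseconds |
| Wireless (NRF24L01) | SPI | 200 KB/s | ~160 milliseconds |

Note: These times cover the link transfer only. Programming flash adds a few milliseconds per 128-byte page, so writing a full image takes roughly a second or more regardless of how fast the data arrives.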
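The UART rows follow from simple arithmetic: with 8N1 framing each byte costs 10 bit times (1 start, 8 data, 1 stop), so the payload rate is the baud rate divided by 10. The sketch below reproduces those figures; the function names are hypothetical and protocol overhead such as XMODEM headers and ACK turnaround is ignored.

```c
#include <stdint.h>

/* Payload bytes per second on a UART using 8N1 framing:
 * each byte costs 10 bit times (start + 8 data + stop). */
uint32_t uart_bytes_per_second(uint32_t baud)
{
    return baud / 10u;
}

/* Milliseconds to move image_bytes at the given payload rate, rounded up. */
uint32_t transfer_time_ms(uint32_t image_bytes, uint32_t bytes_per_second)
{
    return (uint32_t)(((uint64_t)image_bytes * 1000u + bytes_per_second - 1)
                      / bytes_per_second);
}
```

At 115200 baud this gives 11,520 bytes per second and about 2,845 ms for a 32,768-byte image, matching the "~2.8 seconds" entry in the table.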
Reliability Comparison
| Method | CRC | Retransmit | Acknowledge | Success Rate (estimated) |
|---|---|---|---|---|
| XMODEM | ✓ 16-bit | ✓ | ✓ | 99.9%+ |
| YMODEM | ✓ 16-bit | ✓ | ✓ | 99.9%+ |
| Raw SD | Optional | ❌ | ❌ | ~95% |
| FAT SD | Optional | ❌ | ❌ | ~95% |
| Custom | Implementation dependent | Varies | Varies | Varies |
Recommendations by Use Case
Hobbyist / Maker
Recommendation: XMODEM over UART
· No extra hardware
· Simple to use
· Fits in 2KB
· Existing PC tools available
Commercial Product (End-User Updates)
Recommendation: FAT SD Card with 4KB boot
· User-friendly (drag & drop)
· File compatibility
· Can store multiple versions
· CRC verification
Field Updates (No PC)
Recommendation: Raw SD Card
· Small code footprint
· Reliable
· No filesystem corruption risk
· Works with prepared SD cards
IoT / Remote Devices
Recommendation: OTA Wireless
· Remote updates
· Cellular or WiFi options
· Needs larger boot section
· Requires robust protocol
Safety-Critical Systems
Recommendation: Dual Image + XMODEM
· Atomic updates
· Rollback capability
· Never bricks
· Verified boot
Minimal Hardware (Nano only)
Recommendation: XMODEM
· Uses existing USB-serial
· Tiny code size
· Reliable protocol
· Perfect for Nano
Development/Prototyping
Recommendation: Custom Serial + Bootloader Trigger
· Fastest iteration
· Minimal code
· Easy to debug
· Flexible protocol
Conclusion
Key Takeaways
1. Size matters - Bootloaders must be small. The 2KB boot section on legacy Arduino Nano boards, and the 512-byte section used by Optiboot, severely limit the options.
2. Fuse changes enable larger bootloaders - Configuring a 4KB boot section (HFUSE = 0xD8 with BOOTRST enabled) provides enough flash space for FAT filesystem support and more complex bootloader protocols.
3. Three main approaches cover most needs:
o XMODEM - Best for most projects (small, no extra hardware)
o Raw SD - Good balance of size and storage
o FAT SD - User-friendly but needs 4KB boot
4. XMODEM is the sweet spot - At 880 bytes, it fits comfortably in 2KB and uses existing USB-serial.
5. Always include verification - CRC checking catches corrupted transfers before a bad image is booted, which is the most common way a device gets bricked.
6. LED feedback is essential - Visual indicators help debug bootloader issues.
Final Architecture Recommendations
For most Arduino Nano projects:
text
Bootloader: XMODEM over UART (880 bytes)
Location:   1KB (0x7C00) or 2KB (0x7800) boot section
Fuses:      HFUSE 0xDC (1KB) or 0xDA (2KB); the 512B Optiboot setting (0xDE) is too small for 880 bytes
Upload:     USB cable + Python script
Result:     Reliable, maintainable, fits comfortably
For products needing SD card:
text
Bootloader: FAT filesystem on SD (1850 bytes)
Location:   4KB boot section (0x7000-0x7FFF)
Fuses:      0xD8 (4KB boot)
Upload:     Drag & drop firmware.bin to SD card
Result:     User-friendly, but leaves only 28KB of application space
Final Words
Developing a bootloader for ATmega328P is challenging but achievable. The XMODEM approach provides the best balance of features, reliability, and code size for most applications. By understanding the memory constraints and fuse settings, you can create a robust update system that fits within the limited resources.