Files
DD3-LoRa-Bridge-MultiSender/CODE_REVIEW.md
acidburns 32cd0652c9 Improve reliability and add data recovery tools
## Bug Fixes
- Fix integer overflow potential in history bin allocation (web_server.cpp)
  Using uint64_t for intermediate multiplication prevents overflow with different constants

- Prevent data loss during WiFi failures (main.cpp)
  Device now automatically attempts WiFi reconnection every 30 seconds when in AP mode
  Exits AP mode and resumes MQTT transmission as soon as WiFi becomes available
  Data collection and SD logging continue regardless of connectivity

## New Features
- Add standalone MQTT data republisher for lost data recovery
  - Command-line tool (republish_mqtt.py) with interactive and scripting modes
  - GUI tool (republish_mqtt_gui.py) for user-friendly recovery
  - Rate-limited publishing (5 msg/sec default, configurable 1-100)
  - Manual time range selection or auto-detect missing data via InfluxDB
  - Cross-platform support (Windows, macOS, Linux)
  - Converts SD card CSV exports back to MQTT format

## Documentation
- Add comprehensive code review (CODE_REVIEW.md)
  - 16 detailed security and quality assessments
  - Identifies critical HTTPS/auth gaps, medium-priority overflow issues
  - Confirms absence of buffer overflows and unsafe string functions
  - Grade: B+ with areas for improvement

- Add republisher documentation (REPUBLISH_README.md, REPUBLISH_GUI_README.md)
  - Installation and usage instructions
  - Example commands and scenarios
  - Troubleshooting guide
  - Performance characteristics

## Dependencies
- Add requirements_republish.txt
  - paho-mqtt>=1.6.1
  - influxdb-client>=1.18.0

## Impact
- Eliminates data loss scenario where unreliable WiFi leaves device stuck in AP mode
- Provides recovery mechanism for any historical data missed during outages
- Improves code safety with explicit overflow-resistant arithmetic
- Increases operational visibility with comprehensive code review
2026-03-11 17:01:22 +01:00

14 KiB
Raw Permalink Blame History

Code Review: DD3 LoRa Bridge MultiSender

Date: March 11, 2026
Reviewer: Security Analysis
Focus: Buffer overflows, memory issues, security risks, and bugs


Executive Summary

The codebase is generally well-written with good defensive programming practices. Most critical vulnerabilities are mitigated through bounds checking and safe API usage. However, there are several issues ranging from minor to moderate severity that should be addressed.


Critical Issues

1. ⚠️ No HTTPS/TLS - Credentials transmitted in plaintext

Severity: CRITICAL
File: web_server.cpp
Issue: The web server runs on plain HTTP (port 80) without any encryption.

  • WiFi credentials, MQTT credentials, and API authentication are sent in plaintext
  • All data exchanges (history, configuration, status) are unencrypted
  • An attacker on the network can easily capture credentials and impersonate users
  • User login credentials transmitted via HTTP Basic Auth are also vulnerable

Impact: Complete loss of confidentiality for all sensitive data
Recommendation:

  • Implement HTTPS/TLS support on the ESP32 web server
  • Consider at minimum disabling HTTP when HTTPS is available
  • Alternatively, restrict web access to local network only with firewall rules
  • Document this limitation prominently

Code: web_server.cpp L580-620 - All server.send() calls use HTTP


2. ⚠️ Default weak credentials - "admin/admin"

Severity: HIGH
File: config.h
Issue:

constexpr const char *WEB_AUTH_DEFAULT_USER = "admin";
constexpr const char *WEB_AUTH_DEFAULT_PASS = "admin";

Impact: Default accounts are easily guessable; most users won't change them, especially in AP mode where WEB_AUTH_REQUIRE_AP = false (no auth required)
Recommendation:

  • Force users to create strong credentials during initial setup
  • Generate random default credentials (or use MAC address-based credentials)
  • Never store credentials in plain-text constants
  • In AP mode, either enable auth or display a security warning

High Priority Issues

3. ⚠️ AP mode has no authentication

Severity: HIGH
File: config.h, web_server.cpp
Issue:

constexpr bool WEB_AUTH_REQUIRE_AP = false;  // AP mode has NO authentication!

When device acts as an access point, all endpoints can be accessed without any authentication.

Impact: Any device that connects to the AP can access all functionality:

  • Download meter data and history
  • Change WiFi/MQTT configuration
  • Change web UI credentials
  • Affect system behavior

Recommendation:

  • Require authentication even in AP mode
  • Or implement a time-limited "setup mode" that requires initial password setup
  • Display a prominent warning on AP mode UI

4. ⚠️ Integer overflow potential in history bin allocation

Severity: MEDIUM
File: web_server.cpp
Code:

uint32_t bins = (static_cast<uint32_t>(days) * 24UL * 60UL) / res_min;
if (bins == 0 || bins > SD_HISTORY_MAX_BINS) {
    // error handling...
    return;
}

Issue: While bounds checks are in place, the multiplication days * 24 * 60 uses 32-bit math after casting. Although mitigated by SD_HISTORY_MAX_DAYS = 30 and SD_HISTORY_MIN_RES_MIN = 1, the order of operations could be unsafe with different constants.

Current Safety: The bounds check at L776 prevents allocation of more than 4000 bins. Max days (30) × 24 × 60 = 43,200 bins, but then divided by res_min (minimum 1), result is capped at 4000.

Recommendation:

  • Reorder the multiplication to avoid overflow: ((days * 24) * 60) / res_min → safer to do: (days / res_min_in_days) * minutes_per_day to prevent intermediate overflow
  • Or explicitly check: if (days > UINT32_MAX / (24 * 60)) { error; }

5. ⚠️ Potential memory leak in history processing on error

Severity: MEDIUM
File: web_server.cpp
Code:

g_history.bins = new (std::nothrow) HistoryBin[bins];
if (!g_history.bins) {
    g_history.error = true;
    g_history.error_msg = "oom";
    server.send(200, "application/json", "{\"ok\":false,\"error\":\"oom\"}");
    return;
}

Issue: If a new history request is made while a previous request has error state with allocated g_history.bins, the history_reset() function properly cleans up. However, if the device loses power or crashes between allocation and cleanup, memory isn't freed (minor issue, but worth noting on embedded system).

Mitigation: The history_reset() function properly cleans up on next use.

Recommendation:

  • Ensure history_reset() is always called before allocating new bins Already done at L781

Medium Priority Issues

6. ⚠️ String buffer size assumptions in CSV line parsing

Severity: MEDIUM
File: web_server.cpp
Code:

char line[160];
size_t n = g_history.file.readBytesUntil('\n', line, sizeof(line) - 1);
line[n] = '\0';

Issue: If SD card contains a line longer than 160 bytes (minus 1 for null terminator), the function will silently truncate data and re-attempt. The CSV data format is expected to be compact, but if corrupted files exist, this could cause parsing failures.

Mitigation: The function gracefully handles parse failures with if (!history_parse_line(line, ts, p)) { continue; } and returns false on oversized fields at L323.

Recommendation:

  • This is acceptable for the use case. Consider logging truncation warnings if SERIAL_DEBUG_MODE is enabled.

7. ⚠️ CSV injection vulnerability in meter data logging

Severity: MEDIUM (Low practical risk)
File: sd_logger.cpp
Code:

f.print(data.total_power_w, 1);  // Directly prints floating point
f.print(data.energy_total_kwh, 3);

Issue: If floating-point values could be controlled by attacker, they could potentially inject CSV/formula injection attacks (e.g., =1+1 starts formula in Excel). The power_w values are calculated from meter readings, so this has LOW practical risk.

Impact: Low - values come from trusted LoRa devices, not user input
Recommendation:

  • If you want to be extra safe, sanitize by checking first character: if value starts with =, +, @, or -, prefix with single quote or space
  • For now, this is acceptable given the trusted data source

Low Priority Issues / Best Practice Recommendations

8. Path construction could use better validation

Severity: LOW
File: web_server.cpp
Code:

static bool sanitize_sd_download_path(String &path, String &error) {
    // ... checks for "..", "\", "//" ...
    if (!path.startsWith("/dd3/")) {
        error = "prefix";
        return false;
    }
}

Assessment: Path traversal protection is GOOD

  • Checks for .. (parent directory)
  • Checks for \ (backslash)
  • Checks for // (double slashes)
  • Requires /dd3/ prefix
  • Limits path length to 160 characters

The implementation is solid. No changes needed.


9. HTML escaping is properly implemented

Severity: N/A
File: html_util.cpp
Assessment: XSS protection is GOOD

case '&': out += "&amp;"; break;
case '<': out += "&lt;"; break;
case '>': out += "&gt;"; break;
case '"': out += "&quot;"; break;
case '\'': out += "&#39;"; break;

All unsafe HTML characters are properly escaped. Good defensive programming.


10. Buffer overflow checks are generally sound

Severity: N/A
Files: meter_driver.cpp, lora_transport.cpp
Assessment: NO UNSAFE STRING FUNCTIONS FOUND

  • No strcpy, strcat, sprintf, gets, scanf used

  • All buffer writes check bounds before writing

  • Example from meter_driver.cpp L50:

    if (n + 1 < sizeof(num_buf)) {  // Bounds check BEFORE write
        num_buf[n++] = c;
    }
    
  • Example from lora_transport.cpp L119:

    if (pkt.payload_len > LORA_MAX_PAYLOAD) {
        return false;  // Reject oversized payloads
    }
    memcpy(&buffer[idx], pkt.payload, pkt.payload_len);
    

11. Zigzag encoding is correct

Severity: N/A
File: payload_codec.cpp
Code:

uint32_t zigzag32(int32_t x) {
    return (static_cast<uint32_t>(x) << 1) ^ static_cast<uint32_t>(x >> 31);
}

Assessment: CORRECT

  • Proper cast to uint32_t before shift avoids UB
  • Standard protobuf zigzag encoding pattern
  • Correctly handles signed integers

12. Payload encoding/decoding has solid bounds checking

Severity: N/A
File: payload_codec.cpp
Assessment: GOOD DEFENSIVE PROGRAMMING

Examples of proper bounds checks:

// Check maximum samples
if (in.n > kMaxSamples) return false;

// Check feature mask validity
if ((in.present_mask & ~kPresentMaskValidBits) != 0) return false;

// Check consistency
if (bit_count32(in.present_mask) != in.n) return false;

// Check monotonically increasing energy
if (in.energy_wh[i] < in.energy_wh[i - 1]) return false;

// Check for 32-bit overflow when adding deltas
uint64_t sum = static_cast<uint64_t>(out->energy_wh[i-1]) + delta;
if (sum > UINT32_MAX) return false;

// Check phase value ranges
if (value < INT16_MIN || value > INT16_MAX) return false;

Excellent work on defense-in-depth.


13. LoRa frame validation is robust

Severity: N/A
File: lora_transport.cpp
Assessment: GOOD

  • Validates minimum packet size
  • Validates maximum packet size
  • CRC verification
  • Message kind validation
  • Signal strength logging

14. ⚠️ Time-based security: Minimum epoch check

Severity: LOW
File: config.h
Code:

constexpr uint32_t MIN_ACCEPTED_EPOCH_UTC = 1769904000UL; // 2026-02-01 00:00:00 UTC

Issue: This constant is a static minimum and won't be appropriate over time. In 2030, this will reject legitimate timestamps from 2026-2029.

Recommendation:

  • Calculate dynamically: MIN_ACCEPTED_EPOCH = compile_time_epoch - 5_years
  • Or use a configuration that can be updated via firmware
  • Or accept any reasonable recent timestamp (e.g., >= 2020-01-01)

15. Floating point NaN handling is correct

Assessment: GOOD

The code properly uses isnan() throughout:

No integer division by zero issues detected either (checks for zero before division).


16. Integer casting for power calculations handles overflow

Severity: N/A
File: web_server.cpp
Code:

static int32_t round_power_w(float value) {
    if (isnan(value)) return 0;
    long rounded = lroundf(value);
    if (rounded > INT32_MAX) return INT32_MAX;  // Overflow protection
    if (rounded < INT32_MIN) return INT32_MIN;  // Underflow protection
    return static_cast<int32_t>(rounded);
}

Assessment: EXCELLENT - Defensive against both positive and negative overflows


Summary Table

ID Issue Severity Category Status
1 No HTTPS/TLS CRITICAL Security ⚠️ Needs Fix
2 Weak default credentials HIGH Security ⚠️ Needs Fix
3 AP mode no auth HIGH Security ⚠️ Needs Fix
4 Integer overflow in bins MEDIUM Memory ⚠️ Needs Review
5 Memory leak potential MEDIUM Memory Mitigated
6 CSV line truncation MEDIUM Data Handling Safe
7 CSV injection MEDIUM Security Low Risk
8 Path traversal LOW Security Well Protected
9-16 Best practices N/A Quality GOOD

Recommendations for Fixes

Immediate (Critical Path)

  1. Enable HTTPS - Implement TLS on ESP32 web server
  2. Strengthen AP mode security - Either enable auth or use time-limited setup mode
  3. Improve default credentials - Generate strong defaults or force user configuration

Short-term (High Priority)

  1. Fix integer overflow checks - Add explicit overflow detection before bin allocation
  2. Document security limitations - Clearly state that HTTPS is not available

Long-term (Nice to Have)

  1. Add audit logging - Log all configuration changes and data access
  2. Implement certificate pinning - Once HTTPS is added
  3. Add device firmware signature verification - Prevent unauthorized updates

Testing Recommendations

# Verify no plaintext credentials in traffic
tcpdump -i <interface> port 80 or port 1883 | grep -i password

# Test path traversal protection
curl "http://device/sd/download?path=/etc/passwd"
curl "http://device/sd/download?path=/../../../"

# Test XSS protection
curl "http://device/sender/<img%20src=x%20onerror=alert(1)>"

# Test OOM handling with large history requests
curl "http://device/history/start?days=365&res=1"

Overall Assessment

Grade: B+ (Good with areas for improvement)

Strengths:

  • Solid use of safe APIs and standard library functions
  • Excellent bounds checking throughout
  • Good defensive programming practices
  • CRC validation and format validation

Weaknesses:

  • Lack of encryption (HTTPS)
  • Weak default security posture
  • No security in AP mode
  • Need better overflow protection in integer arithmetic

The codebase demonstrates good engineering practices and would be production-ready once the critical HTTPS and authentication issues are addressed.