acidburns def09160d0 refactor lora payload timing
Bump the batch payload codec to schema v4 with separate meter-time and UTC anchors, then use meter seconds for sparse batch slotting and receiver reconstruction.

Update the current 868 MHz bench configuration, allow ACKs from configured receiver short IDs, improve AP-to-STA recovery, quiet the test build, and document the changed protocol in the README.
2026-06-30 12:19:27 +02:00

# DD3-LoRa-Bridge-MultiSender
Firmware for LilyGO T3 v1.6.1 (`ESP32 + SX1276 + SSD1306`) that runs in two roles:
- `Sender` (`GPIO14` HIGH): reads one IEC 62056-21 meter, builds 30-slot sparse batches, sends via LoRa.
- `Receiver` (`GPIO14` LOW): receives/ACKs batches, publishes MQTT, serves web UI, logs to SD.
## Architecture Summary
- Single codebase, role selected at boot by `detect_role()` (`src/config.cpp`).
- LoRa transport is wrapped with firmware-level CRC16-CCITT (`src/lora_transport.cpp`).
- Sender meter ingest is decoupled from LoRa waits via FreeRTOS meter reader task + queue on ESP32 (`src/sender_state_machine.cpp`).
- Batch payload codec is schema `v4` with a 30-bit `present_mask` over `[meter_t_last-29, meter_t_last]` and a separate UTC anchor (`lib/dd3_legacy_core/src/payload_codec.cpp`).
- Sender retries reuse cached encoded payload bytes (no re-encode on retry path).
- Sender ACK receive windows adapt from observed ACK RTT + miss streak.
- Sender catch-up mode drains backlog with immediate extra sends when more than one batch is queued (still ACK-gated, single inflight batch).
- Sender only starts normal metering/transmit flow after valid time bootstrap from receiver ACK.
- Sender fault counters are reset at first valid time sync and again at each UTC hour boundary.
- Receiver runs in STA mode if the stored config is valid and the connection succeeds; otherwise it falls back to AP mode.
- Current bench defaults in `include/config.h`: 868 MHz LoRa, sender short-ID `0x6540`, receiver short-ID `0x7EB4`.
## LoRa Protocol
On-air frame:
`[msg_kind:1][device_short_id:2][payload...][crc16:2]`
`msg_kind`:
- `0`: `BatchUp`
- `1`: `AckDown`
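
The frame layout above can be exercised on a host with a short Python sketch. The README does not state the CRC16-CCITT variant or the byte order of `device_short_id` and `crc16`, so this sketch assumes the CCITT-FALSE variant (polynomial `0x1021`, initial value `0xFFFF`), big-endian fields, and a CRC computed over every byte that precedes it. The function names are illustrative and are not taken from `src/lora_transport.cpp`.

```python
import struct

MSG_BATCH_UP = 0
MSG_ACK_DOWN = 1


def crc16_ccitt(data: bytes, crc: int = 0xFFFF) -> int:
    """CRC16 with polynomial 0x1021, MSB first. The 0xFFFF init is an assumption."""
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def build_frame(msg_kind: int, short_id: int, payload: bytes) -> bytes:
    """[msg_kind:1][device_short_id:2][payload...][crc16:2] (byte order assumed)."""
    body = struct.pack(">BH", msg_kind, short_id) + payload
    return body + struct.pack(">H", crc16_ccitt(body))


def parse_frame(frame: bytes):
    """Return (msg_kind, short_id, payload), or None if too short or CRC fails."""
    if len(frame) < 5:
        return None
    body = frame[:-2]
    (rx_crc,) = struct.unpack(">H", frame[-2:])
    if crc16_ccitt(body) != rx_crc:
        return None
    msg_kind, short_id = struct.unpack(">BH", body[:3])
    return msg_kind, short_id, body[3:]
```

A frame that fails the CRC check is dropped before any payload handling, which matches the role of a transport-level integrity check.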
### BatchUp
Transport layer chunks payload into:
`[batch_id_le:2][chunk_index:1][chunk_count:1][total_len_le:2][chunk_payload...]`
Receiver reassembles all chunks before decode.
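
Reassembly from this chunk header can be sketched as follows. The header layout (little-endian `batch_id` and `total_len`) comes from the line above; the `Reassembler` class, its reset-on-new-batch policy, and its tolerance for out-of-order chunks are assumptions made for illustration, not the firmware's behavior.

```python
import struct

# batch_id_le:2, chunk_index:1, chunk_count:1, total_len_le:2
CHUNK_HEADER = struct.Struct("<HBBH")


class Reassembler:
    """Collects the chunks of one batch at a time (hypothetical helper)."""

    def __init__(self):
        self.key = None      # (batch_id, chunk_count, total_len) of the batch in progress
        self.chunks = {}     # chunk_index -> chunk payload bytes

    def feed(self, data: bytes):
        """Return the full payload once all chunks arrived, else None."""
        if len(data) < CHUNK_HEADER.size:
            return None
        batch_id, index, count, total_len = CHUNK_HEADER.unpack_from(data)
        if count == 0 or index >= count:
            return None
        key = (batch_id, count, total_len)
        if key != self.key:
            # A different batch started: discard any partial state.
            self.key = key
            self.chunks = {}
        self.chunks[index] = data[CHUNK_HEADER.size:]
        if len(self.chunks) < count:
            return None
        payload = b"".join(self.chunks[i] for i in range(count))
        self.key = None
        self.chunks = {}
        # Reject the batch if the joined length disagrees with the header.
        return payload if len(payload) == total_len else None
```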
Payload codec (`schema=4`, magic `0xDDB3`) carries:
- metadata: sender ID, batch ID, `meter_t_last`, `ts_utc_last`, `present_mask`, battery mV, error counters
- arrays per present sample: `energy_wh[]`, `p1_w[]`, `p2_w[]`, `p3_w[]`
A batch with no samples (`n == 0`, `present_mask == 0`) is valid and is used for sync request packets.
Schema `v4` is not wire-compatible with schema `v3`: both sender and receiver must be flashed with matching firmware.
### AckDown (7 bytes payload)
`[flags:1][batch_id_be:2][epoch_utc_be:4]`
- `flags bit0`: `time_valid`
- ACK is repeated (`ACK_REPEAT_COUNT=3`, `ACK_REPEAT_DELAY_MS=200`)
- Sender sets local time only if `time_valid=1` and `epoch >= MIN_ACCEPTED_EPOCH_UTC` (`2026-02-01 00:00:00 UTC`)
- Sender ACK wait windows are adaptive (short first window, expanded second window on miss)
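
As an illustration of the `AckDown` layout and the time-acceptance rule, the following host-side sketch decodes the 7-byte payload. The field layout and the `2026-02-01 00:00:00 UTC` threshold come from the text above; the function name and the returned dictionary are hypothetical.

```python
import struct
from datetime import datetime, timezone

# Threshold stated in the README: 2026-02-01 00:00:00 UTC.
MIN_ACCEPTED_EPOCH_UTC = int(datetime(2026, 2, 1, tzinfo=timezone.utc).timestamp())

# flags:1, batch_id_be:2, epoch_utc_be:4 (big-endian, no padding)
ACK_PAYLOAD = struct.Struct(">BHI")


def parse_ack(payload: bytes):
    """Decode an AckDown payload; return None if the length is not 7 bytes."""
    if len(payload) != ACK_PAYLOAD.size:
        return None
    flags, batch_id, epoch_utc = ACK_PAYLOAD.unpack(payload)
    time_valid = bool(flags & 0x01)  # flags bit0
    return {
        "batch_id": batch_id,
        "time_valid": time_valid,
        "epoch_utc": epoch_utc,
        # The sender only adopts the time when both conditions hold.
        "set_time": time_valid and epoch_utc >= MIN_ACCEPTED_EPOCH_UTC,
    }
```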
## Time Bootstrap and Timezone
Sender boot starts in sync-only mode:
- `g_time_acquired=false`
- sends sync requests every `SYNC_REQUEST_INTERVAL_MS` (`15s`)
- does not run normal 1 Hz sample/batch flow yet
After valid ACK time:
- `time_set_utc()` is called
- `g_time_acquired=true`
- sender fault counters are reset once (`err_m`, `err_d`, `err_tx`, last-error state)
- normal 1 Hz sampling + periodic batch transmission starts
After initial sync:
- sender fault counters are reset again once per UTC hour when the hour index changes (`HH:00 UTC` boundary)
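
The reset rule can be pictured as tracking an hour index, here taken to be `epoch_utc // 3600`. This is a minimal sketch under that assumption; the class and method names are hypothetical and only the counter names `err_m`, `err_d`, `err_tx` come from the README.

```python
class FaultCounters:
    """Counters that reset on first time sync and at each UTC hour boundary."""

    def __init__(self):
        self.err_m = 0
        self.err_d = 0
        self.err_tx = 0
        self.hour_index = None  # None until the first valid time sync

    def on_time(self, epoch_utc: int) -> bool:
        """Call with the current UTC epoch once time is valid.

        Returns True when the counters were reset by this call.
        """
        hour = epoch_utc // 3600
        if hour == self.hour_index:
            return False
        # First sync (hour_index is None) or a new UTC hour.
        self.hour_index = hour
        self.err_m = self.err_d = self.err_tx = 0
        return True
```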
Timezone:
- `TIMEZONE_TZ` from `include/config.h` is applied in `time_manager`.
- Web/OLED local-time rendering uses this timezone.
- Default: `CET-1CEST,M3.5.0/2,M10.5.0/3`.
## Sender Meter Path
Implemented by `src/meter_driver.cpp` and sender loop in `src/sender_state_machine.cpp`:
- UART: `Serial2`, `GPIO34`, `9600 7E1`
- ESP32 RX buffer enlarged to `8192`
- Frame detection `/ ... !`, timeout `METER_FRAME_TIMEOUT_MS=3000`
- Single-pass OBIS line dispatch (no repeated multi-key scans per line)
- Fixed-point decimal parser (dot/comma decimals), with early-exit once all required OBIS fields are captured
- Parsed OBIS fields:
  - `0-0:96.8.0*255` meter seconds index (Sekundenindex, hex u32)
  - `1-0:1.8.0` total energy (converted from Wh to kWh when the meter reports Wh)
  - `1-0:16.7.0` total active power
  - `1-0:36.7.0`, `56.7.0`, `76.7.0` phase powers
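
To make the single-pass dispatch and fixed-point parsing concrete, here is a host-side sketch, not the code in `src/meter_driver.cpp`. It assumes data lines of the form `code*255(value*unit)`, that the phase codes expand to `1-0:56.7.0` and `1-0:76.7.0`, that power values are in watts, and that energy is kept as integer Wh. All function and field names are illustrative.

```python
import re

_DECIMAL = re.compile(r"([+-]?)(\d+)(?:[.,](\d+))?", re.ASCII)
_OBIS_LINE = re.compile(r"([0-9:.\-]+)(?:\*\d+)?\(([^*)]*)(?:\*([^)]*))?\)")

# One lookup per line instead of repeated scans for every key.
_POWER_FIELDS = {
    "1-0:16.7.0": "p_w",
    "1-0:36.7.0": "p1_w",
    "1-0:56.7.0": "p2_w",  # assumed full code for `56.7.0`
    "1-0:76.7.0": "p3_w",  # assumed full code for `76.7.0`
}


def parse_fixed(text: str, decimals: int = 3):
    """Parse '12.5' or '12,5' into an int scaled by 10**decimals; None on junk."""
    m = _DECIMAL.fullmatch(text.strip())
    if not m:
        return None
    sign, whole, frac = m.groups()
    frac = ((frac or "") + "0" * decimals)[:decimals]  # pad or truncate
    value = int(whole) * 10 ** decimals + int(frac or "0")
    return -value if sign == "-" else value


def parse_meter_line(line: str):
    """Return (field, integer value) for a known OBIS line, else None."""
    m = _OBIS_LINE.fullmatch(line.strip())
    if not m:
        return None
    code, raw, unit = m.groups()
    if code == "0-0:96.8.0":
        try:
            return ("meter_seconds", int(raw, 16))
        except ValueError:
            return None
    if code == "1-0:1.8.0":
        milli = parse_fixed(raw, 3)
        if milli is None:
            return None
        # kWh with 3 decimals is already Wh; Wh with 3 decimals is milli-Wh.
        return ("energy_wh", milli // 1000 if unit == "Wh" else milli)
    if code in _POWER_FIELDS:
        watts = parse_fixed(raw, 0)
        return None if watts is None else (_POWER_FIELDS[code], watts)
    return None
```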
Timestamp derivation:
- anchor offset: `epoch_offset = epoch_now - meter_seconds`
- sample epoch: `ts_utc = meter_seconds + epoch_offset`
- jump checks: rollback, wall-time delta mismatch, anchor drift
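
The anchor arithmetic can be sketched as below. The two formulas are taken from the list above; the class, the `MAX_ANCHOR_DRIFT_S` threshold, and the reason strings are hypothetical, and only the rollback and drift checks are modeled (the wall-time delta check is omitted).

```python
MAX_ANCHOR_DRIFT_S = 5  # hypothetical threshold, not taken from the firmware


class MeterTimeAnchor:
    """Maps meter seconds to UTC through a fixed offset (illustrative only)."""

    def __init__(self):
        self.epoch_offset = None
        self.last_meter_seconds = None

    def anchor(self, epoch_now: int, meter_seconds: int) -> None:
        # epoch_offset = epoch_now - meter_seconds
        self.epoch_offset = epoch_now - meter_seconds
        self.last_meter_seconds = meter_seconds

    def ts_utc(self, meter_seconds: int):
        # ts_utc = meter_seconds + epoch_offset
        if self.epoch_offset is None:
            return None
        return meter_seconds + self.epoch_offset

    def check(self, meter_seconds: int, epoch_now: int):
        """Return None when the sample is plausible, else a reason string."""
        if self.epoch_offset is None:
            return "no_anchor"
        if meter_seconds < self.last_meter_seconds:
            return "rollback"
        if abs(epoch_now - self.ts_utc(meter_seconds)) > MAX_ANCHOR_DRIFT_S:
            return "drift"
        self.last_meter_seconds = meter_seconds
        return None
```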
Sender builds sparse 30-slot windows in meter-seconds space and sends every `METER_SEND_INTERVAL_MS` (`30s`).
Samples without a valid meter seconds value are rejected for normal batch transmission.
When backlog is present (`batch_q > 1`), sender transmits additional queued batches immediately after ACK to reduce lag, while keeping stop-and-wait ACK semantics.
Sender diagnostics (serial debug mode):
- periodic structured `diag:` line with:
  - meter parser counters (`ok/parse_fail/overflow/timeout`)
  - meter queue stats (`depth/high-watermark/drops`)
  - ACK stats (`last RTT`, `EWMA RTT`, `miss streak`, timeout/retry totals)
  - sender runtime totals (`rx window ms`, `sleep ms`)
- diagnostics are local-only (serial); LoRa payload schema/fields are unchanged.
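
The `EWMA RTT` and `miss streak` values in the diagnostics line are the inputs to the adaptive ACK window. The README does not give the formula, so the following is only one plausible shape: every constant (`alpha`, the clamp limits, the doubling of the second window) and the class itself are assumptions chosen for illustration.

```python
class AckWindow:
    """Derives an ACK receive window from EWMA RTT and miss streak (sketch)."""

    def __init__(self, alpha: float = 0.25, min_ms: int = 200, max_ms: int = 2000):
        self.alpha = alpha          # hypothetical smoothing factor
        self.min_ms = min_ms        # hypothetical lower clamp
        self.max_ms = max_ms        # hypothetical upper clamp
        self.ewma_rtt_ms = None
        self.miss_streak = 0

    def on_ack(self, rtt_ms: float) -> None:
        """Fold an observed RTT into the average and clear the miss streak."""
        if self.ewma_rtt_ms is None:
            self.ewma_rtt_ms = float(rtt_ms)
        else:
            self.ewma_rtt_ms = (1 - self.alpha) * self.ewma_rtt_ms + self.alpha * rtt_ms
        self.miss_streak = 0

    def on_miss(self) -> None:
        self.miss_streak += 1

    def window_ms(self, attempt: int) -> int:
        """attempt 0 is the short first window, attempt 1 the expanded second one."""
        if self.ewma_rtt_ms is None:
            return self.max_ms  # no RTT observed yet: listen as long as allowed
        base = 2.0 * self.ewma_rtt_ms * (1 + self.miss_streak)
        if attempt > 0:
            base *= 2
        return int(min(max(base, self.min_ms), self.max_ms))
```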
## Receiver Behavior
For decoded `BatchUp`:
1. Reassemble and decode.
2. Validate sender identity (`EXPECTED_SENDER_IDS` and payload sender ID mapping).
3. Reject unknown/mismatched senders before ACK and before SD/MQTT/web updates.
4. Send `AckDown` promptly for accepted senders.
5. Track duplicates per configured sender.
6. If duplicate: update duplicate counters/time, skip data write/publish.
7. If `n==0`: sync request path only.
8. Else reconstruct each sample's meter time from `meter_t_last` and `present_mask`, and its UTC timestamp from `ts_utc_last` and `present_mask`, then:
   - append to SD CSV
   - publish MQTT state
   - update web status and last batch table
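
The reconstruction in the last step can be sketched as follows. The README fixes the window as `[meter_t_last-29, meter_t_last]` but not the bit order, so this sketch assumes bit `i` of `present_mask` marks second `meter_t_last - 29 + i` (bit 29 is the newest slot) and that the value arrays are packed oldest first. The function name and the returned dictionaries are hypothetical.

```python
SLOTS = 30  # window width: [meter_t_last-29, meter_t_last]


def reconstruct(meter_t_last, ts_utc_last, present_mask, energy_wh, p1_w, p2_w, p3_w):
    """Expand a sparse batch into per-second samples, oldest first."""
    n = bin(present_mask & ((1 << SLOTS) - 1)).count("1")
    if any(len(values) != n for values in (energy_wh, p1_w, p2_w, p3_w)):
        raise ValueError("array lengths do not match present_mask")
    samples = []
    k = 0  # index into the packed arrays
    for slot in range(SLOTS):
        if not (present_mask >> slot) & 1:
            continue
        age = SLOTS - 1 - slot  # seconds before the last slot (bit order assumed)
        samples.append({
            "meter_t": meter_t_last - age,
            "ts_utc": ts_utc_last - age,
            "energy_wh": energy_wh[k],
            "p1_w": p1_w[k],
            "p2_w": p2_w[k],
            "p3_w": p3_w[k],
        })
        k += 1
    return samples
```

With `present_mask == 0` and empty arrays the function returns an empty list, which corresponds to the sync request case.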
ACK validation accepts only configured sender IDs, the sender's own short-ID, or configured receiver short-IDs (`EXPECTED_RECEIVER_IDS`). This allows receiver-originated ACKs without accepting arbitrary device IDs.
## MQTT
State topic:
- `smartmeter/<device_id>/state`
Fault topic (retained):
- `smartmeter/<device_id>/faults`
State JSON (`lib/dd3_legacy_core/src/json_codec.cpp`) includes:
- `id`, `ts`, `e_kwh`
- `p_w`, `p1_w`, `p2_w`, `p3_w`
- `bat_v`, `bat_pct`
- optional link: `rssi`, `snr`
- `err_last`, `rx_reject`, `rx_reject_text`
- non-zero fault counters when available
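
For orientation, a state message with these keys could be assembled as in the sketch below. The key names and the topic come from the lists above; the value types, the fault-counter key names (borrowed here from the CSV columns `err_m`, `err_d`, `err_tx`), and the helper itself are assumptions, not the output of `json_codec.cpp`.

```python
import json

# Keys that the README lists as always present (besides `id`).
REQUIRED_KEYS = (
    "ts", "e_kwh", "p_w", "p1_w", "p2_w", "p3_w",
    "bat_v", "bat_pct", "err_last", "rx_reject", "rx_reject_text",
)


def build_state(device_id, values, link=None, faults=None):
    """Return (topic, JSON string) for one state update (hypothetical helper)."""
    doc = {"id": device_id}
    doc.update({key: values[key] for key in REQUIRED_KEYS})
    if link is not None:
        # Optional link quality as an (rssi, snr) pair.
        doc["rssi"], doc["snr"] = link
    # Only non-zero fault counters are included.
    doc.update({name: count for name, count in (faults or {}).items() if count})
    return f"smartmeter/{device_id}/state", json.dumps(doc)
```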
Sender fault counter lifecycle:
- counters are cumulative only within the current UTC-hour window after first sync
- counters reset on first valid sender time sync and at each subsequent UTC hour boundary
Home Assistant discovery:
- enabled by `ENABLE_HA_DISCOVERY=true`
- publishes to `homeassistant/sensor/<device_id>/<key>/config`
- `unique_id` format is `<device_id>_<key>` (example: `dd3-F19C_energy`)
- device metadata:
  - `identifiers: ["<device_id>"]`
  - `name: "<device_id>"`
  - `model: "DD3-LoRa-Bridge"`
  - `manufacturer: "AcidBurns"` (from `HA_MANUFACTURER` in `include/config.h`)
- single source of truth: change manufacturer only in `include/config.h`
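
A discovery message following these rules might look like the sketch below. The topic, `unique_id` format, and device metadata come from the list above. `state_topic` and `value_template` are standard Home Assistant MQTT discovery keys, but whether the firmware emits them, and which JSON field each `<key>` maps to, is an assumption here (the `json_field` parameter is hypothetical).

```python
import json

HA_MANUFACTURER = "AcidBurns"  # value the README attributes to include/config.h


def ha_discovery(device_id: str, key: str, json_field: str):
    """Return (topic, JSON config) for one sensor entity (illustrative helper)."""
    topic = f"homeassistant/sensor/{device_id}/{key}/config"
    config = {
        "unique_id": f"{device_id}_{key}",
        "state_topic": f"smartmeter/{device_id}/state",             # assumed
        "value_template": "{{ value_json.%s }}" % json_field,        # assumed
        "device": {
            "identifiers": [device_id],
            "name": device_id,
            "model": "DD3-LoRa-Bridge",
            "manufacturer": HA_MANUFACTURER,
        },
    }
    return topic, json.dumps(config)
```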
## Web UI, Wi-Fi, SD
- Wi-Fi/MQTT/NTP/web-auth config is stored in Preferences.
- AP fallback SSID prefix: `DD3-Bridge-`.
- Default web credentials: `admin/admin`.
- AP auth requirement is controlled by `WEB_AUTH_REQUIRE_AP` (default `true`).
- STA auth requirement is controlled by `WEB_AUTH_REQUIRE_STA` (default `true`).
- If the receiver boots into AP mode with saved Wi-Fi and MQTT config, it periodically retries STA mode and reinitializes NTP, MQTT, and the web server after a successful reconnect.
Web timestamp display:
- human-facing timestamps show `epoch (HH:MM:SS TZ)` in local configured timezone.
SD CSV logging (`src/sd_logger.cpp`):
- header: `ts_utc,ts_hms_local,p_w,p1_w,p2_w,p3_w,e_kwh,bat_v,bat_pct,rssi,snr,err_m,err_d,err_tx,err_last`
- `ts_hms_local` is local `HH:MM:SS` derived from `TIMEZONE_TZ`
- per-day file partition uses local date from `TIMEZONE_TZ`: `/dd3/<device_id>/YYYY-MM-DD.csv`
History parser (`src/web_server.cpp`):
- accepts both:
  - current layout (`ts_utc,ts_hms_local,p_w,...`)
  - legacy layout (`ts_utc,p_w,...`)
- daily file lookup prefers local-date filenames and falls back to legacy UTC-date filenames for backward compatibility
- requires full numeric parse for `ts_utc` and `p_w` (rejects trailing junk)
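
The dual-layout rule and the strict numeric check can be illustrated with a small host-side parser. How `src/web_server.cpp` tells the layouts apart is not stated, so this sketch assumes a colon in the second column (`HH:MM:SS`) marks the current layout; the function name is hypothetical.

```python
import re

_UINT = re.compile(r"\d+", re.ASCII)
_NUMBER = re.compile(r"[+-]?\d+(?:\.\d+)?", re.ASCII)


def parse_history_line(line: str):
    """Return (ts_utc, p_w) from a CSV row in either layout, else None."""
    fields = line.strip().split(",")
    # fullmatch rejects trailing junk such as '1769904000x'.
    if len(fields) < 2 or not _UINT.fullmatch(fields[0]):
        return None  # header row or malformed timestamp
    # Current layout: ts_utc,ts_hms_local,p_w,...  Legacy layout: ts_utc,p_w,...
    p_index = 2 if ":" in fields[1] else 1
    if len(fields) <= p_index or not _NUMBER.fullmatch(fields[p_index]):
        return None
    return int(fields[0]), float(fields[p_index])
```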
OLED duplicate display:
- receiver sender-pages show duplicate rate as `pct (absolute)` and last duplicate as `HH:MM`.
## Build Environments
From `platformio.ini`:
- `production`: serial debug off, light sleep on
- `debug`: serial diagnostics on, real meter and real LoRa
- `test`: synthetic meter samples and payload codec self-test, serial debug off
Example:
```bash
python -m platformio run -e production
```
## Test Mode
`ENABLE_TEST_MODE` replaces normal loops with `test_sender_loop` / `test_receiver_loop` (`src/test_mode.cpp`):
- Sender emits periodic JSON test payloads over LoRa.
- Receiver decodes test payloads, updates display test codes, publishes MQTT to:
  - `smartmeter/<device_id>/test`