Document minimal batch/ack protocol and timestamp safety rules

2026-02-04 11:57:59 +01:00
parent f0503af8c7
commit 373667ab8a

README.md (453 lines changed)

@@ -1,419 +1,88 @@
-# DD3 LoRa Bridge (Multi-Sender)
+# DD3-LoRa-Bridge-MultiSender

-Unified firmware for LilyGO T3 v1.6.1 (ESP32 + SX1276 + SSD1306) that runs as **Sender** or **Receiver** based on a GPIO jumper. Senders read DD3 smart meter values and transmit compact binary batches over LoRa. The receiver validates packets, publishes to MQTT, provides a web UI, and shows per-sender status on the OLED.
+Unified firmware for LilyGO T3 v1.6.1 (ESP32 + SX1276 + SSD1306) running as either:
+- `Sender`: reads meter values and sends binary batches over LoRa.
+- `Receiver`: accepts batches, ACKs with optional time, publishes to MQTT/web.
-## Hardware
-Board: **LilyGO T3 LoRa32 v1.6.1** (ESP32 + SX1276 + SSD1306 128x64 + LiPo)
-Variants:
-- SX1276 **433 MHz** module (default build)
-- SX1276 **868 MHz** module (use 868 build environments)
+## Protocol (minimal)

-### Pin Mapping
-- LoRa (SX1276)
-  - SCK: GPIO5
-  - MISO: GPIO19
-  - MOSI: GPIO27
-  - NSS/CS: GPIO18
-  - RST: GPIO23
-  - DIO0: GPIO26
-- OLED (SSD1306)
-  - SDA: GPIO21
-  - SCL: GPIO22
-  - RST: **not used** (SSD1306 init uses `-1` reset pin)
-  - I2C address: 0x3C
-- microSD (on-board)
-  - CS: GPIO13
-  - MOSI: GPIO15
-  - SCK: GPIO14
-  - MISO: GPIO2
-- I2C RTC (DS3231)
-  - SDA: GPIO21
-  - SCL: GPIO22
-  - I2C address: 0x68
-- Battery ADC: GPIO35 (via on-board divider)
-- **Role select**: GPIO14 (INPUT_PULLDOWN, sampled at boot, **shared with SD SCK**)
-  - HIGH = Sender
-  - LOW/floating = Receiver
-- **OLED control**: GPIO13 (INPUT_PULLDOWN, sender only, **shared with SD CS**)
-  - HIGH = force OLED on
-  - LOW = allow auto-off after timeout
-  - Not used on receiver (OLED always on)
-- Smart meter UART RX: GPIO34 (input-only, always connected)
+Frame format:

-### Notes on GPIOs
-- GPIO34/35/36/39 are input-only and have **no internal pullups/pulldowns**.
-- Strap pins (GPIO0/2/4/5/12/15) can affect boot; avoid for role or control jumpers.
-- GPIO14 is shared between role select and SD SCK. **Do not attach the role jumper in Receiver mode if the SD card is connected/used**, and never force GPIO14 high when using SD.
-- GPIO13 is shared between OLED control and SD CS. Avoid driving OLED control when SD is active.
-- Receiver firmware releases GPIO14 to `INPUT` (no pulldown) after boot before SD SPI init.
+`[msg_kind:1][dev_id_short:2][payload...][crc16:2]`
-## Firmware Roles
-### Sender (battery-powered)
-- Reads smart meter via optical IR (UART 9600 7E1).
-- Extracts OBIS values:
-  - Energy total: 1-0:1.8.0*255
-  - Total power: 1-0:16.7.0*255
-  - Phase power: 36.7 / 56.7 / 76.7
-- Meter input is parsed via a non-blocking RX state machine; the last valid frame is reused for 1 Hz sampling.
-- Reads battery voltage and estimates SoC.
-- Builds compact binary batch payload, wraps in LoRa packet, transmits.
-- Light sleeps between meter reads; batches are sent every 30s.
-- Listens for LoRa time sync packets to set UTC clock.
-- Uses DS3231 RTC after boot if no time sync has arrived yet.
-- OLED shows status + meter data pages.
+`msg_kind`:
+- `0`: `BATCH_UP` (Sender -> Receiver)
+- `1`: `ACK_DOWN` (Receiver -> Sender)

-**Sender flow (pseudo-code)**:
-```cpp
-void sender_loop() {
-  meter_read_every_second(); // OBIS -> MeterData samples
-  read_battery(data); // VBAT + SoC
-  if (time_to_send_batch()) {
-    payload = encode_batch(samples, batch_id); // compact binary batch
-    lora_send(packet(MeterBatch, payload));
-  }
+Removed from protocol:
+- protocol version field
+- payload type field
+- MeterData JSON/compressed LoRa path
+- standalone TimeSync packets
+CRC16 validation is still required on every frame.

-  display_set_last_meter(data);
-  display_set_last_read(ok);
-  display_set_last_tx(ok);
-  display_tick();
+## Payloads

-  lora_receive_time_sync(); // optional
-  light_sleep_until_next_event();
-}
-```
+### 1) `BATCH_UP`
+- Uses existing binary batch/chunk transport.
+- `sample_count == 0` is valid and means `SYNC_REQUEST`.
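The new frame layout `[msg_kind:1][dev_id_short:2][payload...][crc16:2]` and the rule that every frame must pass CRC16 validation can be sketched as below. This is a minimal sketch, not the firmware's code. The README fixes only the field order and sizes, so the CRC variant (CRC-16/CCITT-FALSE), the big-endian order of `dev_id_short` and `crc16`, the CRC covering every byte before the trailer, and the names `crc16_ccitt`, `build_frame`, `parse_frame`, and `Frame` are all assumptions made for illustration.

```cpp
#include <stddef.h>
#include <stdint.h>
#include <vector>

// Message kinds as listed in the README.
enum MsgKind : uint8_t { BATCH_UP = 0, ACK_DOWN = 1 };

struct Frame {
  MsgKind kind;
  uint16_t dev_id_short;
  std::vector<uint8_t> payload;
};

// ASSUMPTION: CRC-16/CCITT-FALSE (poly 0x1021, init 0xFFFF, no reflection).
// The README does not say which CRC16 variant the firmware uses.
uint16_t crc16_ccitt(const uint8_t *data, size_t len) {
  uint16_t crc = 0xFFFF;
  for (size_t i = 0; i < len; ++i) {
    crc ^= static_cast<uint16_t>(data[i] << 8);
    for (int bit = 0; bit < 8; ++bit) {
      crc = (crc & 0x8000) ? static_cast<uint16_t>((crc << 1) ^ 0x1021)
                           : static_cast<uint16_t>(crc << 1);
    }
  }
  return crc;
}

// Builds [msg_kind:1][dev_id_short:2][payload...][crc16:2].
// ASSUMPTION: multi-byte fields are big-endian; CRC covers all preceding bytes.
std::vector<uint8_t> build_frame(MsgKind kind, uint16_t dev_id_short,
                                 const std::vector<uint8_t> &payload) {
  std::vector<uint8_t> frame;
  frame.reserve(payload.size() + 5);
  frame.push_back(static_cast<uint8_t>(kind));
  frame.push_back(static_cast<uint8_t>(dev_id_short >> 8));
  frame.push_back(static_cast<uint8_t>(dev_id_short & 0xFF));
  frame.insert(frame.end(), payload.begin(), payload.end());
  const uint16_t crc = crc16_ccitt(frame.data(), frame.size());
  frame.push_back(static_cast<uint8_t>(crc >> 8));
  frame.push_back(static_cast<uint8_t>(crc & 0xFF));
  return frame;
}

// Returns false for short frames, CRC mismatches, and unknown msg_kind values.
bool parse_frame(const uint8_t *buf, size_t len, Frame &out) {
  if (len < 5) return false;  // 1 + 2 header bytes, 2 CRC bytes
  const uint16_t want = static_cast<uint16_t>((buf[len - 2] << 8) | buf[len - 1]);
  if (crc16_ccitt(buf, len - 2) != want) return false;
  if (buf[0] > ACK_DOWN) return false;
  out.kind = static_cast<MsgKind>(buf[0]);
  out.dev_id_short = static_cast<uint16_t>((buf[1] << 8) | buf[2]);
  out.payload.assign(buf + 3, buf + len - 2);
  return true;
}
```

With only two message kinds and no version or payload-type field, the fixed overhead per frame is 5 bytes, and a receiver can reject anything that fails the length, CRC, or `msg_kind` check before looking at the payload.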
-**Key sender functions**:
-```cpp
-bool meter_read(MeterData &data); // parse OBIS fields
-void read_battery(MeterData &data); // ADC -> volts + percent
-bool meterDataToJson(const MeterData&, String&);
-bool compressBuffer(const uint8_t*, size_t, uint8_t*, size_t, size_t&); // MeterData only
-bool lora_send(const LoraPacket &pkt); // add header + CRC16 and transmit
-```
+### 2) `ACK_DOWN` (7 bytes)
+- `flags` (`u8`): bit0 = `time_valid`
+- `batch_id` (`u16`, big-endian)
+- `epoch_utc` (`u32`, big-endian)

-### Receiver (USB-powered)
-- WiFi STA connect using stored config; if not available/fails, starts AP.
-- NTP sync (UTC) and local display in Europe/Berlin.
-- Receives LoRa packets, verifies CRC16, decompresses MeterData JSON, decodes binary batches.
-- Publishes meter JSON to MQTT.
-- Sends ACKs for MeterBatch packets and de-duplicates by batch_id.
-- Web UI:
-  - AP mode: status + WiFi/MQTT config.
-  - STA mode: status + per-sender pages.
-- OLED cycles through receiver status and per-sender pages (receiver OLED never sleeps).
+Receiver sets:
+- `time_valid=1` only when receiver time is authoritative and sane.
+- Otherwise `time_valid=0` and `epoch_utc=0`.
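The 7-byte `ACK_DOWN` payload (`flags`, `batch_id`, `epoch_utc`, with the multi-byte fields big-endian) could be packed and unpacked roughly as follows. This is a sketch under assumptions: the fields are taken to appear on the wire in the order the README lists them, and `AckDown`, `encode_ack_down`, and `decode_ack_down` are hypothetical names rather than the firmware's API.

```cpp
#include <stddef.h>
#include <stdint.h>

constexpr size_t ACK_DOWN_LEN = 7;             // 1 + 2 + 4 bytes
constexpr uint8_t ACK_FLAG_TIME_VALID = 0x01;  // flags bit0

struct AckDown {
  bool time_valid;
  uint16_t batch_id;
  uint32_t epoch_utc;
};

// Packs flags, batch_id (big-endian), epoch_utc (big-endian).
// Per the README, epoch_utc is forced to 0 whenever time_valid is 0.
void encode_ack_down(const AckDown &ack, uint8_t out[ACK_DOWN_LEN]) {
  const uint32_t epoch = ack.time_valid ? ack.epoch_utc : 0;
  out[0] = ack.time_valid ? ACK_FLAG_TIME_VALID : 0;
  out[1] = static_cast<uint8_t>(ack.batch_id >> 8);
  out[2] = static_cast<uint8_t>(ack.batch_id & 0xFF);
  out[3] = static_cast<uint8_t>(epoch >> 24);
  out[4] = static_cast<uint8_t>((epoch >> 16) & 0xFF);
  out[5] = static_cast<uint8_t>((epoch >> 8) & 0xFF);
  out[6] = static_cast<uint8_t>(epoch & 0xFF);
}

// Rejects any payload that is not exactly 7 bytes long.
bool decode_ack_down(const uint8_t *buf, size_t len, AckDown &out) {
  if (len != ACK_DOWN_LEN) return false;
  out.time_valid = (buf[0] & ACK_FLAG_TIME_VALID) != 0;
  out.batch_id = static_cast<uint16_t>((buf[1] << 8) | buf[2]);
  out.epoch_utc = (static_cast<uint32_t>(buf[3]) << 24) |
                  (static_cast<uint32_t>(buf[4]) << 16) |
                  (static_cast<uint32_t>(buf[5]) << 8) |
                  static_cast<uint32_t>(buf[6]);
  return true;
}
```

Zeroing `epoch_utc` on the encode side means a sender that ignores the flag by mistake still sees an epoch far below any plausible minimum.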
-**Receiver loop (pseudo-code)**:
-```cpp
-void receiver_loop() {
-  if (lora_receive(pkt)) {
-    if (pkt.type == MeterData) {
-      json = decompressBuffer(pkt.payload);
-      if (jsonToMeterData(json, data)) {
-        update_sender_status(data);
-        mqtt_publish_state(data);
-      }
-    } else if (pkt.type == MeterBatch) {
-      batch = reassemble_and_decode_batch(pkt);
-      for (sample in batch) {
-        update_sender_status(sample);
-        mqtt_publish_state(sample);
-      }
-    }
-  }
+## Time bootstrap safety

-  if (time_to_send_timesync()) {
-    time_send_timesync(self_short_id); // always every 60s (receiver is mains-powered)
-  }
+Sender starts with:
+- `g_time_acquired=false`
+- no real sampling/batching
+- periodic `SYNC_REQUEST` every `SYNC_REQUEST_INTERVAL_MS` (default `15000ms`)

-  mqtt_loop();
-  web_server_loop();
-  display_set_receiver_status(...);
-  display_tick();
-}
-```
+Sender only accepts time from `ACK_DOWN` if:
+- `time_valid == 1`
+- `epoch_utc >= 2026-02-01 00:00:00 UTC` (`MIN_ACCEPTED_EPOCH_UTC = 1769904000`)

-Receiver keeps the SX1276 in continuous RX, re-entering RX after any transmit (ACK or time sync).
+Only then:
+- system time is set
+- `g_time_acquired=true`
+- normal 1 Hz sampling + batch transmit starts

-**Key receiver functions**:
-```cpp
-bool lora_receive(LoraPacket &pkt, uint32_t timeout_ms);
-bool jsonToMeterData(const String &json, MeterData &data);
-bool decode_batch(const uint8_t *buf, size_t len, BatchInput *out);
-bool mqtt_publish_state(const MeterData &data);
-void web_server_loop(); // AP or STA UI
-void time_send_timesync(uint16_t self_id);
-```
+This guarantees no pre-`2026-02-01` epoch reaches MQTT or SD/DB paths.
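The sender-side bootstrap rules (no sampling until a valid time arrives, a `SYNC_REQUEST` every `SYNC_REQUEST_INTERVAL_MS`, and time accepted only when `time_valid == 1` and `epoch_utc >= MIN_ACCEPTED_EPOCH_UTC`) can be sketched as a small state gate. The two constants come from the README. `SenderTimeState`, `accept_ack_time`, `sync_request_due`, and `sampling_enabled` are hypothetical names, and holding the state in a struct rather than in globals such as `g_time_acquired` is an assumption made so the logic can be exercised off-device.

```cpp
#include <stdint.h>

constexpr uint32_t MIN_ACCEPTED_EPOCH_UTC = 1769904000UL;  // 2026-02-01 00:00:00 UTC
constexpr uint32_t SYNC_REQUEST_INTERVAL_MS = 15000;       // README default

struct SenderTimeState {
  bool time_acquired = false;        // plays the role of g_time_acquired
  uint32_t epoch_utc = 0;            // last accepted epoch
  bool sync_request_sent = false;    // true after the first SYNC_REQUEST
  uint32_t last_sync_request_ms = 0;
};

// Applies the acceptance rule to the time carried by an ACK_DOWN.
// Returns true only when the time was accepted.
bool accept_ack_time(SenderTimeState &s, bool time_valid, uint32_t epoch_utc) {
  if (!time_valid || epoch_utc < MIN_ACCEPTED_EPOCH_UTC) return false;
  s.epoch_utc = epoch_utc;  // real firmware would set the system clock here
  s.time_acquired = true;
  return true;
}

// True when an empty BATCH_UP (sample_count == 0) should be sent now.
// Unsigned subtraction keeps the comparison correct across millis() wraparound.
bool sync_request_due(SenderTimeState &s, uint32_t now_ms) {
  if (s.time_acquired) return false;
  if (s.sync_request_sent &&
      (now_ms - s.last_sync_request_ms) < SYNC_REQUEST_INTERVAL_MS) {
    return false;
  }
  s.sync_request_sent = true;
  s.last_sync_request_ms = now_ms;
  return true;
}

// Sampling and batching stay disabled until a valid time has been accepted.
bool sampling_enabled(const SenderTimeState &s) { return s.time_acquired; }
```

Because every sample timestamp is derived from a clock that was set through this gate, no sample can carry an epoch earlier than the minimum.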
-## Test Mode (compile-time)
-Enabled by `-DENABLE_TEST_MODE` (see `platformio.ini` test environment).
+## Receiver behavior

-- Sender: sends 4-digit test code every ~30s in JSON.
-- Receiver: shows last test code per sender and publishes to `/test` topic.
-- Normal behavior is excluded from test builds.
+On `BATCH_UP`:
+1. Decode batch/chunks.
+2. Send `ACK_DOWN` immediately.
+3. If `sample_count == 0`: treat as `SYNC_REQUEST`, do not publish MQTT/update stats.
+4. Else decode and publish samples as normal.
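The receiver steps for `BATCH_UP` can be sketched as follows, with the ACK sent before any publishing and an empty batch treated as a `SYNC_REQUEST`. Everything here is illustrative: `Sample`, `Batch`, `ReceiverLog`, `send_ack_down`, `publish_sample`, and `on_batch_up` are hypothetical stand-ins, and the radio and MQTT calls are replaced by counters so the ordering can be checked without hardware.

```cpp
#include <stdint.h>
#include <vector>

struct Sample {
  uint32_t ts_utc;
  int16_t p_w;
};

// Result of step 1 (decode batch/chunks); decoding itself is out of scope here.
struct Batch {
  uint16_t batch_id;
  std::vector<Sample> samples;  // empty means sample_count == 0
};

// Records side effects instead of touching the radio or the MQTT client.
struct ReceiverLog {
  std::vector<uint16_t> acked_batch_ids;
  unsigned published = 0;
};

// Hypothetical stand-in for transmitting an ACK_DOWN frame.
void send_ack_down(ReceiverLog &log, uint16_t batch_id) {
  log.acked_batch_ids.push_back(batch_id);
}

// Hypothetical stand-in for the MQTT publish and per-sender stats update.
void publish_sample(ReceiverLog &log, const Sample &) { ++log.published; }

void on_batch_up(ReceiverLog &log, const Batch &batch) {
  send_ack_down(log, batch.batch_id);  // step 2: ACK before any slow work
  if (batch.samples.empty()) {
    return;                            // step 3: SYNC_REQUEST, nothing to publish
  }
  for (const Sample &s : batch.samples) {
    publish_sample(log, s);            // step 4: normal publish path
  }
}
```

Acknowledging first keeps the reply inside the sender's ACK wait window even when publishing is slow.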
-**Test sender (pseudo-code)**:
-```cpp
-void test_sender_loop() {
-  code = random_4_digits();
-  json = {id, role:"sender", test_code: code, ts};
-  lora_send(packet(TestCode, compress(json)));
-  display_set_test_code(code);
-}
-```
+## Sender/Receiver debug logs (`SERIAL_DEBUG_MODE`)

-**Test receiver (pseudo-code)**:
-```cpp
-void test_receiver_loop() {
-  if (pkt.type == TestCode) {
-    json = decompress(pkt.payload);
-    update_sender_test_code(json);
-    mqtt_publish_test(id, json);
-  }
-}
-```
+Sender:
+- `sync: request tx batch_id=%u`
+- `ack: rx ok batch_id=%u time_valid=%u epoch=%lu set=%u`
+- `ack: timeout batch_id=%u retry=%u`

-## LoRa Protocol
-Packet layout:
+Receiver:
+- `ack: tx batch_id=%u time_valid=%u epoch=%lu samples=%u`
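The log formats above use `%u` and `%lu`, so fixed-width values need explicit casts to `unsigned` and `unsigned long` to match the format specifiers portably. A minimal sketch for the sender's `ack: rx ok` line follows. `format_ack_rx_log` is a hypothetical helper, and reading `set` as "the sender applied this time to its clock" is an assumption about what the field means.

```cpp
#include <stdint.h>
#include <cstdio>
#include <string>

// Formats the sender-side "ack: rx ok" debug line from the README.
// uint32_t is cast to unsigned long because the format uses %lu.
std::string format_ack_rx_log(uint16_t batch_id, bool time_valid,
                              uint32_t epoch_utc, bool set) {
  char line[96];
  std::snprintf(line, sizeof(line),
                "ack: rx ok batch_id=%u time_valid=%u epoch=%lu set=%u",
                static_cast<unsigned>(batch_id), time_valid ? 1u : 0u,
                static_cast<unsigned long>(epoch_utc), set ? 1u : 0u);
  return std::string(line);
}
```

On the device the same format string would typically go straight to the serial port; returning a string here only keeps the sketch testable on a host.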
-```
-[0] protocol_version (1)
-[1] role (0=sender, 1=receiver)
-[2..3] device_id_short (uint16)
-[4] payload_type (0=meter, 1=test, 2=time_sync, 3=meter_batch, 4=ack)
-[5..N-3] payload bytes (compressed JSON for MeterData, binary for MeterBatch/Test/TimeSync)
-[N-2..N-1] CRC16 (bytes 0..N-3)
-```
+## Removed hardware dependency

-LoRa radio settings:
-- Frequency: **433 MHz** or **868 MHz** (set by build env via `LORA_FREQUENCY_HZ`)
-- SF12, BW 125 kHz, CR 4/5, CRC on, Sync Word 0x34
-- When `SERIAL_DEBUG_MODE` is enabled, LoRa TX logs include timing breakdowns for `idle/begin/write/end` to diagnose long transmit times.
+DS3231 RTC support was removed:
+- no RTC files
+- no RTC init/load/set logic
+- no `ENABLE_DS3231` flow
-## Data Format
-MeterData JSON (sender + MQTT):
-```json
-{
-  "id": "F19C",
-  "ts": 1737200000,
-  "e_kwh": 1234.57,
-  "p_w": 950.00,
-  "p1_w": 500.00,
-  "p2_w": 450.00,
-  "p3_w": 0.00,
-  "bat_v": 3.92,
-  "bat_pct": 78,
-  "rx_reject": 0,
-  "rx_reject_text": "none"
-}
-```
-### Binary MeterBatch Payload (LoRa)
-Fixed header (little-endian):
-- `magic` u16 = 0xDDB3
-- `schema` u8 = 2
-- `flags` u8 = 0x01 (bit0 = signed phases)
-- `sender_id` u16 (1..NUM_SENDERS, maps to `EXPECTED_SENDER_IDS`)
-- `batch_id` u16
-- `t_last` u32 (unix seconds of last sample)
-- `dt_s` u8 (seconds, >0)
-- `n` u8 (sample count, <=30)
-- `battery_mV` u16
-- `err_m` u8 (meter read failures, sender-side counter)
-- `err_d` u8 (decode failures, sender-side counter)
-- `err_tx` u8 (LoRa TX failures, sender-side counter)
-- `err_last` u8 (last error code: 0=None, 1=MeterRead, 2=Decode, 3=LoraTx, 4=TimeSync)
-- `err_rx_reject` u8 (last RX reject reason)
-- `err_rx_reject` u8 (last RX reject reason: 0=None, 1=crc_fail, 2=bad_protocol_version, 3=wrong_role, 4=wrong_payload_type, 5=length_mismatch, 6=device_id_mismatch, 7=batch_id_mismatch)
-- MQTT faults payload also includes `err_last_text` (string) and `err_last_age` (seconds).
-Body:
-- `E0` u32 (absolute energy in Wh)
-- `dE[1..n-1]` ULEB128 (delta vs previous, >=0)
-- `P1_0` s16 (absolute W)
-- `dP1[1..n-1]` signed varint (ZigZag + ULEB128)
-- `P2_0` s16
-- `dP2[1..n-1]` signed varint
-- `P3_0` s16
-- `dP3[1..n-1]` signed varint
-Notes:
-- Receiver reconstructs timestamps from `t_last` and `dt_s`.
-- Total power is computed on receiver as `p1 + p2 + p3`.
-- Sender error counters are carried in the batch header and applied to all samples.
-- Receiver ACKs MeterBatch as soon as the batch is reassembled, before MQTT/web/UI work, to avoid missing the sender ACK window.
-- Receiver repeats ACKs (`ACK_REPEAT_COUNT`) spaced by `ACK_REPEAT_DELAY_MS` to cover sender RX latency.
-- Sender ACK RX window is derived from LoRa airtime (bounded min/max) and retried once if the first window misses.
-## Device IDs
-- Derived from WiFi STA MAC.
-- `short_id = (MAC[4] << 8) | MAC[5]`
-- `device_id = dd3-%04X`
-- JSON `id` uses only the last 4 hex digits (e.g., `F19C`) to save airtime.
-Receiver expects known senders in `include/config.h` via:
-```cpp
-constexpr uint8_t NUM_SENDERS = 1;
-inline constexpr uint16_t EXPECTED_SENDER_IDS[NUM_SENDERS] = { 0xF19C };
-```
-## OLED Behavior
-- Sender: OLED stays on for `OLED_AUTO_OFF_MS` after boot or last activity.
-- Activity is detected while `PIN_OLED_CTRL` is held high, or on the high->low edge when the control is released.
-- Receiver: OLED is always on (no auto-off).
-- Pages rotate every 4s.
-## Power & Battery
-- Sender disables WiFi/BLE, reads VBAT via ADC, and converts voltage to % using a LiPo curve:
-  - 4.2 V = 100%
-  - 2.9 V = 0%
-  - linear interpolation between curve points
-- Uses deep sleep between cycles (`SENDER_WAKE_INTERVAL_SEC`).
-- Sender CPU is throttled to 80 MHz and LoRa RX is only enabled in short windows (ACK wait or time-sync).
-- Battery sampling averages 5 ADC reads and updates at most once per `BATTERY_SAMPLE_INTERVAL_MS` (default 60s).
-- `BATTERY_CAL` applies a scale factor to match measured VBAT.
-- When `SERIAL_DEBUG_MODE` is enabled, each ADC read logs the 5 raw samples, average, and computed voltage.
-## Web UI
-- AP SSID: `DD3-Bridge-<short_id>` (prefix configurable)
-- AP password: `changeme123` (configurable)
-- Endpoints:
-  - `/`: status overview
-  - `/wifi`: WiFi/MQTT/NTP config (AP and STA)
-  - `/sender/<device_id>`: per-sender details
-- Sender IDs on `/` are clickable (open sender page in a new tab).
-- In STA mode, the UI is also available via the board's IP/hostname on your WiFi network.
-- Main page shows SD card file listing (downloadable).
-- Sender page includes a history chart (power) with configurable range/resolution/mode.
-## Security
-- Basic Auth is supported for the web UI. In STA mode it is enabled by default; AP mode is optional.
-- Config flags in `include/config.h`:
-  - `WEB_AUTH_REQUIRE_STA` (default `true`)
-  - `WEB_AUTH_REQUIRE_AP` (default `false`)
-  - `WEB_AUTH_DEFAULT_USER` / `WEB_AUTH_DEFAULT_PASS`
-- Web credentials are stored in NVS. `/wifi`, `/sd/download`, `/history/*`, `/`, `/sender/*`, and `/manual` require auth when enabled.
-- Password inputs are not prefilled. Leaving a password blank keeps the stored value; use the "clear password" checkbox to erase it.
-- User-controlled strings are HTML-escaped before embedding in pages.
-## MQTT
-- Topic: `smartmeter/<deviceId>/state`
-- QoS 0
-- Test mode: `smartmeter/<deviceId>/test`
-- Client ID: `dd3-bridge-<device_id>` (stable, derived from MAC)
-## NTP
-- NTP servers are configurable in the web UI (`/wifi`).
-- Defaults: `pool.ntp.org` and `time.nist.gov`.
-## RTC (DS3231)
-- Optional DS3231 on the I2C bus. Connect SDA to GPIO21 and SCL to GPIO22 (same bus as the OLED).
-- Enable/disable with `ENABLE_DS3231` in `include/config.h`.
-- Receiver time sync packets set the RTC.
-- On boot, if no LoRa time sync has arrived yet, the sender uses the RTC time as the initial `ts_utc`.
-- Receiver keeps sending time sync every 60 seconds.
-- If a senders timestamps drift from receiver time by more than `TIME_SYNC_DRIFT_THRESHOLD_SEC`, the receiver enters a burst mode (every `TIME_SYNC_BURST_INTERVAL_MS` for `TIME_SYNC_BURST_DURATION_MS`).
-- Sender raises a local `TimeSync` error if it has not received a time beacon for `TIME_SYNC_ERROR_TIMEOUT_MS` (default 2 days). This is shown on the sender OLED only and is not sent over LoRa.
-- RTC loads are validated (reject out-of-range epochs) so LoRa TimeSync can recover if the RTC is wrong.
-- Sender uses a short “fast acquisition” mode on boot (until first LoRa TimeSync) with wider RX windows to avoid phase-miss.
-## Build Environments
-- `lilygo-t3-v1-6-1`: production build (debug on)
-- `lilygo-t3-v1-6-1-test`: test build with `ENABLE_TEST_MODE`
-- `lilygo-t3-v1-6-1-868`: production build for 868 MHz modules (debug on)
-- `lilygo-t3-v1-6-1-868-test`: test build for 868 MHz modules
-- `lilygo-t3-v1-6-1-payload-test`: build with `PAYLOAD_CODEC_TEST`
-- `lilygo-t3-v1-6-1-868-payload-test`: 868 MHz build with `PAYLOAD_CODEC_TEST`
-- `lilygo-t3-v1-6-1-prod`: production build with serial debug off
-- `lilygo-t3-v1-6-1-868-prod`: 868 MHz production build with serial debug off
-## Config Knobs
-Key timing settings in `include/config.h`:
-- `METER_SAMPLE_INTERVAL_MS`
-- `METER_SEND_INTERVAL_MS`
-- `BATTERY_SAMPLE_INTERVAL_MS`
-- `BATTERY_CAL`
-- `BATCH_ACK_TIMEOUT_MS`
-- `BATCH_MAX_RETRIES`
-- `BATCH_QUEUE_DEPTH`
-- `BATCH_RETRY_POLICY` (keep or drop on retry exhaustion)
-- `SERIAL_DEBUG_MODE_FLAG` (build flag) / `SERIAL_DEBUG_DUMP_JSON`
-- `LORA_SEND_BYPASS` (debug only)
-- `ENABLE_SD_LOGGING` / `PIN_SD_CS`
-- `SENDER_TIMESYNC_WINDOW_MS`
-- `SENDER_TIMESYNC_CHECK_SEC_FAST` / `SENDER_TIMESYNC_CHECK_SEC_SLOW`
-- `TIME_SYNC_DRIFT_THRESHOLD_SEC`
-- `TIME_SYNC_BURST_INTERVAL_MS` / `TIME_SYNC_BURST_DURATION_MS`
-- `TIME_SYNC_ERROR_TIMEOUT_MS`
-- `SD_HISTORY_MAX_DAYS` / `SD_HISTORY_MIN_RES_MIN`
-- `SD_HISTORY_MAX_BINS` / `SD_HISTORY_TIME_BUDGET_MS`
-- `WEB_AUTH_REQUIRE_STA` / `WEB_AUTH_REQUIRE_AP` / `WEB_AUTH_DEFAULT_USER` / `WEB_AUTH_DEFAULT_PASS`
-## Limits & Known Constraints
-- **Compression**: MeterData uses lightweight RLE (good for JSON but not optimal).
-- **OBIS parsing**: supports IEC 62056-21 ASCII (Mode D); may need tuning for some meters.
-- **Payload size**: single JSON frames < 256 bytes (ArduinoJson static doc); binary batch frames are chunked and reassembled (typically 1 chunk).
-- **Battery ADC**: uses a divider (R44/R45 = 100K/100K) with a configurable `BATTERY_CAL` scale and LiPo % curve.
-- **OLED**: no hardware reset line is used (matches working reference).
-- **Batch ACKs**: sender waits for ACK after a batch and retries up to `BATCH_MAX_RETRIES` with `BATCH_ACK_TIMEOUT_MS` between attempts.
-## SD Logging (Receiver)
-Optional CSV logging to microSD (FAT32) when `ENABLE_SD_LOGGING = true`.
-- Path: `/dd3/<device_id>/YYYY-MM-DD.csv`
-- Columns:
-  `ts_utc,p_w,p1_w,p2_w,p3_w,e_kwh,bat_v,bat_pct,rssi,snr,err_m,err_d,err_tx,err_last`
-- `err_last` is written as text (`meter`, `decode`, `loratx`) only on the last sample of a batch that reports an error.
-- Files are downloadable from the main UI page.
-- Downloads only allow absolute paths under `/dd3/`, reject `..`, backslashes, and repeated slashes, and enforce a max path length.
-- History chart on sender page stream-parses CSVs and bins data in the background.
-- SD uses the on-board microSD SPI pins (CS=13, MOSI=15, SCK=14, MISO=2).
-## Files & Modules
-- `include/config.h`, `src/config.cpp`: pins, radio settings, sender IDs
-- `include/data_model.h`, `src/data_model.cpp`: MeterData + ID init
-- `include/json_codec.h`, `src/json_codec.cpp`: JSON encode/decode
-- `include/compressor.h`, `src/compressor.cpp`: RLE compression
-- `include/lora_transport.h`, `src/lora_transport.cpp`: LoRa packet + CRC
-- `src/payload_codec.h`, `src/payload_codec.cpp`: binary batch encoder/decoder
-- `include/meter_driver.h`, `src/meter_driver.cpp`: IEC 62056-21 ASCII parse
-- `include/power_manager.h`, `src/power_manager.cpp`: ADC + sleep
-- `include/time_manager.h`, `src/time_manager.cpp`: NTP + time sync
-- `include/wifi_manager.h`, `src/wifi_manager.cpp`: NVS config + WiFi
-- `include/mqtt_client.h`, `src/mqtt_client.cpp`: MQTT publish
-- `include/web_server.h`, `src/web_server.cpp`: AP/STA web pages
-- `include/display_ui.h`, `src/display_ui.cpp`: OLED pages + control
-- `include/test_mode.h`, `src/test_mode.cpp`: test sender/receiver
-- `src/main.cpp`: role detection and main loop
-## Quick Start
-1. Set role jumper on GPIO14:
-   - LOW: sender
-   - HIGH: receiver
-2. OLED control on GPIO13:
-   - HIGH: always on
-   - LOW: auto-off after 10 minutes
-3. Build and upload:
+## Build
 ```bash
-pio run -e lilygo-t3-v1-6-1 -t upload --upload-port COMx
+pio run -e lilygo-t3-v1-6-1
+pio run -e lilygo-t3-v1-6-1-test
 ```
-Test mode:
-```bash
-pio run -e lilygo-t3-v1-6-1-test -t upload --upload-port COMx
-```
-868 MHz builds:
-```bash
-pio run -e lilygo-t3-v1-6-1-868 -t upload --upload-port COMx
-```
-868 MHz test mode:
-```bash
-pio run -e lilygo-t3-v1-6-1-868-test -t upload --upload-port COMx
-```