telemetry: fix log deduplication — strip ANSI codes, tz offsets, mid-line timestamps (v0.30.6)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This commit is contained in:
2026-02-25 16:01:53 +01:00
parent 17db33e419
commit 19f2c908fc
3 changed files with 28 additions and 12 deletions
+1 -1
View File
@@ -892,7 +892,7 @@ Each report push now includes per-app telemetry data:
**Log scanning** (`logscanner.go`):
- `ScanContainerLogs(containerNames, since, logger)` runs `docker logs --since=15m --tail=1000` sequentially on all non-protected deployed containers.
- Classifies lines by keyword match (errors: `error`, `fatal`, `panic`, `crit`, `oom`, `killed`, `exception`, `traceback`; warnings: `warn`, `warning`) on the first 5 words (case-insensitive).
- Deduplicates via fingerprinting: strips timestamps, replaces 6+ digit numbers with `<N>`, 8+ char hex with `<HEX>`, UUIDs with `<UUID>`. Groups identical fingerprints, keeps top 10 per container.
- Deduplicates via fingerprinting: strips ANSI escape codes, ISO timestamps (with timezone offsets), and syslog timestamps (including mid-line); replaces 6+ digit numbers with `<N>`, 8+ char hex with `<HEX>`, UUIDs with `<UUID>`. Groups identical fingerprints, keeps top 10 per container.
- Returns `[]ContainerLogSummary` with `ErrorCount`, `WarnCount`, `RecentIssues []LogIssue`.
**Report integration** (`report/telemetry.go`):