
BGtopoVJ Blue Rectangle/Square Detection PoC

This is a practical first-pass pipeline for finding blue/light-blue square and rectangle symbols in BGtopoVJ raster maps, then using those detections to score a coordinate dataset and bootstrap a YOLO detector.

The PoC is intentionally hybrid:

  1. Download original BGtopoVJ *.tif + *.map sheet pairs.
  2. Open the raster through GDAL/Rasterio, preferring the OziExplorer .map sidecar when available.
  3. Mine weak candidates using OpenCV HSV thresholding + contour/rectangle filters.
  4. Generate QA overlays and an HTML report.
  5. Score your known coordinates against nearby candidates.
  6. Export weak labels into YOLO format.
  7. Train a first YOLO model on your RTX 3080 FE after you review/clean the weak labels.

This is not meant to be a final truth engine on day one. It is meant to rapidly produce reviewable candidates, hard negatives, and a training set.


Hardware fit

Your RTX 3080 FE is enough for the first detector. Start with:

  • yolov8s.pt
  • imgsz=1024
  • batch=2 or batch=4
  • epochs=80

If you hit CUDA OOM, lower batch first. Do not lower image size below 896 too early, because the target symbols are small.

16 GB of system RAM is tight for country-scale processing but fine for per-sheet scanning. Avoid loading the whole corpus at once; the PoC scans each sheet window by window.
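As a rough illustration of window-based scanning, the sketch below yields overlapping windows that cover a raster of a given size. It is not the PoC's actual implementation: the function name iter_windows and the default tile and overlap values are assumptions chosen to match the export settings used later in this README.

```python
from typing import Iterator, Tuple


def iter_windows(
    width: int, height: int, tile: int = 1024, overlap: int = 128
) -> Iterator[Tuple[int, int, int, int]]:
    """Yield (col_off, row_off, win_w, win_h) windows covering the raster.

    Hypothetical helper: neighbouring windows share `overlap` pixels so a
    small symbol on a window edge is fully visible in at least one window.
    """
    if not 0 <= overlap < tile:
        raise ValueError("overlap must be in [0, tile)")
    stride = tile - overlap
    for row_off in range(0, max(height - overlap, 1), stride):
        for col_off in range(0, max(width - overlap, 1), stride):
            # Edge windows are clipped to the raster extent.
            yield (
                col_off,
                row_off,
                min(tile, width - col_off),
                min(tile, height - row_off),
            )


windows = list(iter_windows(2000, 1000))
```

Each tuple has the same field order as rasterio.windows.Window(col_off, row_off, width, height), so only one window of pixels needs to be in memory at a time.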


Install locally

Linux / WSL / Manjaro-like

python -m venv .venv
source .venv/bin/activate
pip install --upgrade pip
pip install -r requirements.txt

GDAL/Rasterio can be the annoying part. If rasterio.open() fails on a .map file, install GDAL from your OS package manager, or use the Docker option below.

GPU Docker option

docker compose -f docker-compose.gpu.yml build
docker compose -f docker-compose.gpu.yml run --rm bgtopo-bluebox bash

Inside the container:

./scripts/run_pilot.sh

Run the pilot

./scripts/run_pilot.sh

This downloads only two sheets, scans them, and writes the report, overlays, and candidate CSV files:

reports/poc_report.html
reports/overlays/*.png
data/interim/candidates/*_candidates.csv

Inspect the overlays. If too many rivers/text labels are detected, tighten configs/blue_detector.yaml. If real blue rectangles are missed, loosen the HSV ranges and size filters.
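To make the tighten/loosen trade-off concrete, here is a minimal pure-Python sketch of an HSV range test plus a bounding-box fill ratio, which is one simple way to separate solid rectangles from thin linework. The threshold values, the HsvRange class, and both function names are hypothetical; the real keys in configs/blue_detector.yaml may differ. Values are on the OpenCV scale (H 0-179, S and V 0-255).

```python
import colorsys
from dataclasses import dataclass


@dataclass(frozen=True)
class HsvRange:
    # Hypothetical thresholds, not taken from configs/blue_detector.yaml.
    h_min: int = 90
    h_max: int = 130
    s_min: int = 60
    v_min: int = 80


def is_blue(rgb, rng=HsvRange()):
    """Return True when an (r, g, b) pixel falls inside the HSV range."""
    r, g, b = (c / 255.0 for c in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    return (
        rng.h_min <= h * 180 <= rng.h_max
        and s * 255 >= rng.s_min
        and v * 255 >= rng.v_min
    )


def fill_ratio(mask):
    """Fraction of the mask's bounding box covered by True pixels."""
    pts = [(x, y) for y, row in enumerate(mask) for x, on in enumerate(row) if on]
    if not pts:
        return 0.0
    xs = [p[0] for p in pts]
    ys = [p[1] for p in pts]
    box_area = (max(xs) - min(xs) + 1) * (max(ys) - min(ys) + 1)
    return len(pts) / box_area
```

Raising s_min rejects pale blues (and with them some faded symbols), while a minimum fill ratio rejects diagonal river segments and text strokes that cover little of their bounding box.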


Manual one-sheet run

python -m bgtopo_poc.cli inventory \
  --config configs/blue_detector.yaml \
  --out data/manifest.csv \
  --limit 1

python -m bgtopo_poc.cli download \
  --manifest data/manifest.csv \
  --out-dir data/raw \
  --out-manifest data/manifest_downloaded.csv \
  --limit 1

python -m bgtopo_poc.cli detect \
  --config configs/blue_detector.yaml \
  --sheet-id K-34-009-2 \
  --map data/raw/K-34-009-2/K-34-009-2.map \
  --tif data/raw/K-34-009-2/K-34-009-2.tif \
  --out-dir data/interim/candidates

python -m bgtopo_poc.cli overlay \
  --tif data/raw/K-34-009-2/K-34-009-2.tif \
  --candidates data/interim/candidates/K-34-009-2_candidates.csv \
  --out reports/overlays/K-34-009-2_overlay.png

Score your 60k coordinates

Expected coordinate CSV columns:

id,lat,lon,expected
pt001,42.58837223,23.19638729,unknown
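
Before scoring 60k rows, it can help to sanity-check the file. The sketch below only assumes the four columns shown above; load_points is a hypothetical helper, not part of bgtopo_poc, and it does not validate the expected column because its allowed values are not specified here.

```python
import csv
import io

REQUIRED = ("id", "lat", "lon", "expected")


def load_points(handle):
    """Parse coordinate rows and return (valid_rows, errors)."""
    reader = csv.DictReader(handle)
    missing = [c for c in REQUIRED if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    rows, errors = [], []
    for line_no, rec in enumerate(reader, start=2):  # header is line 1
        try:
            lat, lon = float(rec["lat"]), float(rec["lon"])
        except (TypeError, ValueError):
            errors.append((line_no, "lat/lon not numeric"))
            continue
        if not (-90 <= lat <= 90 and -180 <= lon <= 180):
            errors.append((line_no, "lat/lon out of range"))
            continue
        rows.append(
            {"id": rec["id"], "lat": lat, "lon": lon, "expected": rec["expected"]}
        )
    return rows, errors


sample = io.StringIO(
    "id,lat,lon,expected\n"
    "pt001,42.58837223,23.19638729,unknown\n"
)
rows, errors = load_points(sample)
```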

Then run:

python -m bgtopo_poc.cli score-coords \
  --config configs/blue_detector.yaml \
  --sheet-id K-34-009-2 \
  --coordinates data/coordinates/your_60k_points.csv \
  --candidates data/interim/candidates/K-34-009-2_candidates.csv \
  --map data/raw/K-34-009-2/K-34-009-2.map \
  --tif data/raw/K-34-009-2/K-34-009-2.tif \
  --out-dir data/interim/coordinate_scores \
  --coord-crs EPSG:4326
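
Conceptually, scoring a point against nearby candidates can be as simple as a distance test in pixel space. The sketch below is an assumption about the idea, not the score-coords implementation: the function name, the label strings, and both radii are hypothetical.

```python
import math


def score_point(px, py, candidates, hit_radius=15.0, review_radius=40.0):
    """Classify one pixel-space point by distance to the nearest candidate.

    candidates: iterable of (cx, cy) candidate centres in sheet pixels.
    The radii are hypothetical and should be tuned against reviewed points.
    """
    nearest = min(
        (math.hypot(px - cx, py - cy) for cx, cy in candidates),
        default=math.inf,
    )
    if nearest <= hit_radius:
        label = "positive"
    elif nearest <= review_radius:
        label = "review"
    else:
        label = "negative"
    return label, nearest
```

The middle band exists so that borderline points go to manual review instead of being silently counted as hits or misses.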

Extract review crops for predicted positives/review cases:

python -m bgtopo_poc.cli crops \
  --scores data/interim/coordinate_scores/K-34-009-2_coordinate_scores.csv \
  --map data/raw/K-34-009-2/K-34-009-2.map \
  --tif data/raw/K-34-009-2/K-34-009-2.tif \
  --out-dir data/interim/crops/K-34-009-2 \
  --crop-size 256

Important: The PoC currently scores coordinates sheet-by-sheet. The next production step is assigning every point to the right sheet footprint automatically. That step depends on the .map georeferencing opening correctly on your system, so confirm that first.
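
A minimal sketch of coordinate-to-sheet assignment, assuming each sheet footprint is already available as a lon/lat bounding box. The sheet ids and bounds below are made up for illustration; real footprints would come from the .map georeferencing.

```python
def assign_sheets(points, footprints):
    """Map each point id to the ids of sheets whose bounding box contains it.

    points: iterable of (point_id, lat, lon) in EPSG:4326.
    footprints: dict of sheet_id -> (west, south, east, north), hypothetical.
    A point on a shared edge can match more than one sheet.
    """
    assigned = {}
    for pid, lat, lon in points:
        assigned[pid] = [
            sid
            for sid, (west, south, east, north) in footprints.items()
            if west <= lon <= east and south <= lat <= north
        ]
    return assigned


# Illustrative footprints only, not real BGtopoVJ sheet bounds.
footprints = {
    "sheet-A": (23.00, 42.50, 23.25, 42.6667),
    "sheet-B": (23.25, 42.50, 23.50, 42.6667),
}
points = [("pt001", 42.58837223, 23.19638729), ("pt002", 41.0, 25.0)]
assigned = assign_sheets(points, footprints)
```

Points that match no sheet (an empty list) are worth logging, since they usually indicate a missing sheet or a CRS mix-up.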


Export YOLO dataset

After reviewing/correcting candidates, export YOLO tiles:

python -m bgtopo_poc.cli export-yolo \
  --config configs/blue_detector.yaml \
  --sheet-id K-34-009-2 \
  --tif data/raw/K-34-009-2/K-34-009-2.tif \
  --candidates data/interim/candidates/K-34-009-2_candidates.csv \
  --out-dir data/yolo/K-34-009-2 \
  --tile-size 1024 \
  --overlap 128
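
For reference, a YOLO label line is "class x_center y_center width height", with all four values normalized to the tile size. The sketch below converts a sheet-pixel box into a label line for one tile; to_yolo_line, the (x, y, w, h) box layout, and the single class id 0 are assumptions, not the export-yolo implementation.

```python
def to_yolo_line(box, tile_origin, tile_size=1024, class_id=0):
    """Convert a sheet-pixel box (x, y, w, h) to a YOLO label line.

    tile_origin is the tile's top-left corner in sheet pixels. The box is
    clipped to the tile; None is returned when it does not intersect it.
    """
    x, y, w, h = box
    tx, ty = tile_origin
    x0, y0 = max(x - tx, 0), max(y - ty, 0)
    x1, y1 = min(x + w - tx, tile_size), min(y + h - ty, tile_size)
    if x1 <= x0 or y1 <= y0:
        return None
    cx = (x0 + x1) / 2 / tile_size
    cy = (y0 + y1) / 2 / tile_size
    bw = (x1 - x0) / tile_size
    bh = (y1 - y0) / tile_size
    return f"{class_id} {cx:.6f} {cy:.6f} {bw:.6f} {bh:.6f}"


line = to_yolo_line((992, 480, 64, 64), tile_origin=(896, 0))
```

Because tiles overlap, the same symbol can legitimately appear in the label files of two adjacent tiles.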

Then train:

python -m bgtopo_poc.cli train-yolo \
  --data-yaml data/yolo/K-34-009-2/data.yaml \
  --model yolov8s.pt \
  --imgsz 1024 \
  --epochs 80 \
  --batch 4 \
  --device 0

What to improve after this PoC works

  1. Add automatic sheet-footprint discovery and coordinate-to-sheet assignment.
  2. Add CVAT export/import so weak labels can be corrected by hand.
  3. Add hard-negative mining for rivers, lakes, blue text and blue linework.
  4. Add calibrated coordinate scoring using a small sklearn model trained on reviewed points.
  5. Add active learning: prioritize review crops where the model and rule detector disagree.
  6. Add full-map batch inference with overlap-aware de-duplication.
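
For item 6, one common approach is greedy non-maximum suppression over detections mapped back to sheet coordinates. The sketch below shows that swapped-in technique in plain Python; the (x0, y0, x1, y1, score) tuple layout and the 0.5 IoU threshold are assumptions.

```python
def iou(a, b):
    """Intersection over union of two (x0, y0, x1, y1, ...) boxes."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(ix1 - ix0, 0) * max(iy1 - iy0, 0)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def dedupe(dets, iou_thr=0.5):
    """Greedy NMS: keep the highest-scoring box of each overlapping group.

    dets: iterable of (x0, y0, x1, y1, score) in sheet pixel coordinates,
    so duplicates from neighbouring overlapping tiles line up.
    """
    kept = []
    for det in sorted(dets, key=lambda d: d[4], reverse=True):
        if all(iou(det, k) < iou_thr for k in kept):
            kept.append(det)
    return kept


dets = [
    (100, 100, 140, 140, 0.9),  # seen in tile 1
    (102, 101, 141, 142, 0.7),  # same symbol seen again in tile 2
    (500, 500, 540, 540, 0.8),
]
unique = dedupe(dets)
```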

Output files

data/manifest.csv                         # discovered remote assets
data/manifest_downloaded.csv              # local paths after download
data/interim/candidates/*_candidates.csv  # weak detections
data/interim/coordinate_scores/*.csv      # coordinate-level predictions
data/interim/crops/*/*.png                # review crops
reports/overlays/*.png                    # visual QA overlays
reports/poc_report.html                   # summary report
data/yolo/*/data.yaml                     # YOLO dataset config
runs/bgtopo_bluebox/*                     # YOLO training runs