STHN introduces a coarse-to-fine deep homography estimation approach for UAV thermal geo-localization, enabling accurate alignment between thermal and satellite imagery even with significant appearance differences and geometric noise.
Accurate geo-localization of Unmanned Aerial Vehicles (UAVs) is crucial for outdoor applications including search and rescue operations, power line inspections, and environmental monitoring. The vulnerability of Global Navigation Satellite System (GNSS) signals to interference and spoofing necessitates additional robust localization methods for autonomous navigation. Visual Geo-localization (VG), which leverages onboard cameras and reference satellite maps, offers a promising solution for absolute localization. Specifically, Thermal Geo-localization (TG), which relies on image-based matching between thermal imagery and satellite databases, stands out by utilizing infrared cameras for effective nighttime localization. However, the efficiency and effectiveness of current TG approaches are hindered by dense sampling on satellite maps and geometric noise in thermal query images. To overcome these challenges, we introduce STHN, a novel UAV thermal geo-localization approach that employs a coarse-to-fine deep homography estimation method. This method attains reliable thermal geo-localization within a 512-meter radius of the UAV's last known location even with a challenging 11% size ratio between thermal and satellite images, despite the presence of indistinct textures and self-similar patterns. We further show how our research significantly enhances UAV thermal geo-localization performance and robustness against geometric noise under low-visibility conditions in the wild. The code is made publicly available.
STHN uses a coarse-to-fine deep homography estimation pipeline with three main components: a Thermal Generative Module (TGM), a coarse alignment module, and a refinement module. The coarse alignment module first estimates a rough homography between resized satellite and thermal images. The refinement module then crops the aligned region and performs fine-grained alignment. A two-stage training strategy with bounding box augmentation ensures robust refinement.
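The coarse-then-refine flow described above can be illustrated with a small geometric sketch. It is not the STHN implementation: the helper names, the square crop rule, the `margin` padding, and the `oracle_refiner` stand-in for the refinement network are all assumptions made for illustration, and every homography is expressed directly in full-resolution satellite pixels. The sketch shows only how a homography estimated inside the crop can be composed back into satellite coordinates.

```python
# Illustrative sketch only: names, crop rule, and margin are assumptions,
# not the STHN code. Matrices are 3x3 nested lists.

def matmul3(a, b):
    """Multiply two 3x3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def apply_h(h, pt):
    """Apply homography h to a 2D point via homogeneous coordinates."""
    x, y = pt
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)

def square_box(points, margin):
    """Square box (x0, y0, side) around the points, padded by margin pixels."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    side = max(max(xs) - min(xs), max(ys) - min(ys)) + 2 * margin
    cx = (min(xs) + max(xs)) / 2
    cy = (min(ys) + max(ys)) / 2
    return (cx - side / 2, cy - side / 2, side)

def crop_to_full(box, crop_size):
    """Map pixels of the resized crop back to satellite pixels."""
    x0, y0, side = box
    s = side / crop_size
    return [[s, 0.0, x0], [0.0, s, y0], [0.0, 0.0, 1.0]]

def full_to_crop(box, crop_size):
    """Inverse of crop_to_full: satellite pixels to resized-crop pixels."""
    x0, y0, side = box
    s = crop_size / side
    return [[s, 0.0, -x0 * s], [0.0, s, -y0 * s], [0.0, 0.0, 1.0]]

def coarse_to_fine(h_coarse, refine_fn, thermal_size, crop_size, margin):
    """Crop around the coarse alignment, refine inside the crop, compose."""
    corners = [(0, 0), (thermal_size, 0),
               (thermal_size, thermal_size), (0, thermal_size)]
    box = square_box([apply_h(h_coarse, c) for c in corners], margin)
    h_fine = refine_fn(box)  # thermal pixels -> crop pixels
    return matmul3(crop_to_full(box, crop_size), h_fine), box

# Toy data: the coarse estimate is off by (4, -3) pixels from the truth.
H_TRUE = [[0.5, 0.0, 104.0], [0.0, 0.5, 197.0], [0.0, 0.0, 1.0]]
H_COARSE = [[0.5, 0.0, 100.0], [0.0, 0.5, 200.0], [0.0, 0.0, 1.0]]

def oracle_refiner(box, crop_size=256):
    """Hypothetical perfect refiner: the true homography seen from the crop."""
    return matmul3(full_to_crop(box, crop_size), H_TRUE)

H_TOTAL, BOX = coarse_to_fine(H_COARSE, oracle_refiner, 256, 256, 16.0)
```

In this toy setup the composed `H_TOTAL` recovers `H_TRUE` because the stand-in refiner is perfect. With a learned refiner, the padding around the coarse box is what leaves room to absorb coarse-stage error.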
Comparison of test MACE (m) for different homography estimation methods across search distances DC. Lower is better.
| Method | DC=50m | DC=64m | DC=128m | DC=256m | DC=512m | Failure Rate |
|---|---|---|---|---|---|---|
| SIFT + RANSAC | 442.20 | 654.77 | 547.29 | 529.63 | 1650.46 | 99.6% |
| SIFT + MAGSAC++ | 512.60 | 438.54 | 529.46 | 561.64 | 693.03 | 99.7% |
| ORB + RANSAC | 720.80 | 733.69 | 733.94 | 4614.84 | 975.83 | 82.6% |
| LoFTR + RANSAC | 1123.74 | 1697.33 | 1317.69 | 1269.71 | 2564.65 | 0% |
| DHN | 16.78 | 20.43 | 77.68 | 197.27 | 457.23 | 0% |
| IHN | 5.91 | 7.81 | 51.74 | 190.93 | 367.24 | 0% |
| Ours (WS=512) | 4.24 | 4.93 | 14.97 | 142.71 | 347.50 | 0% |
| Ours (WS=1024) | 4.92 | 5.31 | 6.03 | 9.22 | 86.74 | 0% |
| Ours (WS=1536) | 6.50 | 7.04 | 7.27 | 16.78 | 16.42 | 0% |
| Ours (two-stage) | 7.51 | 7.20 | 7.51 | 14.99 | 12.70 | 0% |
Blue bold = best result. Underlined = second best. Highlighted row = our method.
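MACE is not defined in this section. Assuming the usual meaning in the homography estimation literature (Mean Average Corner Error), it could be computed as in the minimal sketch below: average the distance between predicted and ground-truth corners per sample, then average over samples. The function names and the `meters_per_pixel` scale are hypothetical.

```python
import math

def average_corner_error(pred_corners, gt_corners):
    """Mean Euclidean distance between matching corners of one sample."""
    dists = [math.dist(p, g) for p, g in zip(pred_corners, gt_corners)]
    return sum(dists) / len(dists)

def mace(pred_batch, gt_batch, meters_per_pixel=1.0):
    """Mean of per-sample average corner errors, scaled to meters.

    meters_per_pixel is a hypothetical ground resolution, not a value
    taken from the source.
    """
    errors = [average_corner_error(p, g) for p, g in zip(pred_batch, gt_batch)]
    return meters_per_pixel * sum(errors) / len(errors)

# Toy data: one prediction with every corner off by 5 px, one exact.
GT = [(0, 0), (512, 0), (512, 512), (0, 512)]
PRED_SHIFTED = [(x + 3, y + 4) for x, y in GT]
MACE_EXAMPLE = mace([PRED_SHIFTED, GT], [GT, GT])
```

Here `MACE_EXAMPLE` is 2.5: the two per-sample errors are 5 and 0, and they are averaged.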
Comparison with image-based matching methods at DC=512m.
| Method | Test CE (m) | Latency (ms) |
|---|---|---|
| AnyLoc-VLAD-DINOv2 | 258.21 | 352,404 |
| STGL-NetVLAD-ResNet50 | 89.31 | 7,180 |
| STGL-GeM-ResNet50 | 13.52 | 4,919 |
| Ours (one-stage) | 15.90 | 35.2 |
| Ours (two-stage) | 12.12 | 63.9 |
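CE is likewise not defined in this section. Assuming it denotes center error, the distance between the predicted and ground-truth center of the thermal footprint on the satellite map, a minimal sketch follows. Taking the center as the mean of the four corners and the `meters_per_pixel` scale are assumptions made for illustration.

```python
import math

def quad_center(corners):
    """Center of a quadrilateral, taken here as the mean of its corners."""
    n = len(corners)
    return (sum(x for x, _ in corners) / n, sum(y for _, y in corners) / n)

def center_error(pred_corners, gt_corners, meters_per_pixel=1.0):
    """Distance between predicted and ground-truth centers, in meters.

    meters_per_pixel is a hypothetical ground resolution.
    """
    return meters_per_pixel * math.dist(quad_center(pred_corners),
                                        quad_center(gt_corners))

GT = [(0, 0), (512, 0), (512, 512), (0, 512)]
PRED_SHIFTED = [(x + 6, y + 8) for x, y in GT]           # translated footprint
PRED_ROTATED = [(512, 0), (512, 512), (0, 512), (0, 0)]  # 90-degree rotation

CE_SHIFTED = center_error(PRED_SHIFTED, GT)
CE_ROTATED = center_error(PRED_ROTATED, GT)
```

Under this definition a prediction that is rotated about the correct center has zero center error even though its corners are far off, so a center-based metric measures position only, which is what comparison against retrieval-style matching methods requires.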
We investigate the effects of the Thermal Generative Module (TGM) and the relationship between satellite image size WS and search distance DC on alignment accuracy.
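One way to reason about the relationship between WS and DC is a simple containment check, sketched below. It rests on assumptions not stated in this section: a square thermal footprint of side WT = 512 m, a satellite window centered on the last known location, and per-axis center offsets of up to DC. The 512 m value is chosen only because, with WS = 1536, it gives an area ratio of about 11%, matching the figure quoted above.

```python
def min_window_size(thermal_size_m, search_distance_m):
    """Smallest centered window that fully contains the thermal footprint
    when its center may be offset by up to search_distance_m per axis."""
    return thermal_size_m + 2 * search_distance_m

def area_ratio(thermal_size_m, window_size_m):
    """Fraction of the satellite window area covered by the thermal image."""
    return (thermal_size_m / window_size_m) ** 2

WT = 512  # assumed thermal footprint side in meters (hypothetical)

for ws in (512, 1024, 1536):
    # Largest offset for which the footprint stays fully inside the window.
    max_dc = (ws - WT) / 2
    print(f"WS={ws}: full containment up to DC={max_dc:.0f} m, "
          f"area ratio {area_ratio(WT, ws):.1%}")
```

Full containment is a sufficient geometric condition for the alignment target to be visible, not a necessary one: a partially overlapping footprint may still be aligned, but a smaller window leaves less usable context as DC grows.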
Our two-stage method maintains accurate localization under rotation, resizing, and perspective transformation noise. Green = Ground Truth, Blue = Coarse Alignment, Red = Final Prediction.
@ARTICLE{xiao2024sthn,
author={Xiao, Jiuhong and Zhang, Ning and Tortei, Daniel and Loianno, Giuseppe},
journal={IEEE Robotics and Automation Letters},
title={STHN: Deep Homography Estimation for UAV Thermal Geo-Localization With Satellite Imagery},
year={2024},
volume={9},
number={10},
pages={8754-8761},
doi={10.1109/LRA.2024.3448129}}