The offline geocoder we wanted

by gipsyjaeger

13 hours ago

What is this?

This is an offline reverse geocoder written in Python. Given a latitude-longitude pair, it returns the administrative region that contains the point (country, state, or district) without calling any external APIs. This avoids API costs, rate limits, and network dependency.

Why build another reverse geocoder?

Most offline reverse geocoders rely on nearest-neighbor lookups. While fast, this approach often fails near borders because the closest location is not always the correct administrative region. This project focuses on correctness over proximity by verifying which boundary a coordinate actually falls inside.
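To make the border problem concrete, here is a minimal sketch with made-up data. The two rectangular regions, their names, and the helper functions are hypothetical and not taken from the project. It uses flat x/y coordinates rather than real latitude and longitude.

```python
import math

# Two hypothetical regions sharing the border x = 10, stored as
# (xmin, ymin, xmax, ymax). "Wide" is large, "Narrow" is a thin strip.
REGIONS = {
    "Wide": (0.0, 0.0, 10.0, 4.0),
    "Narrow": (10.0, 0.0, 12.0, 4.0),
}


def center(box):
    """Center point of an axis-aligned rectangle."""
    xmin, ymin, xmax, ymax = box
    return ((xmin + xmax) / 2, (ymin + ymax) / 2)


def nearest_region(point):
    """Proximity-only lookup: pick the region whose center is closest."""
    return min(REGIONS, key=lambda name: math.dist(point, center(REGIONS[name])))


def containing_region(point):
    """Boundary-aware lookup: pick the region the point actually lies in."""
    x, y = point
    for name, (xmin, ymin, xmax, ymax) in REGIONS.items():
        if xmin <= x < xmax and ymin <= y < ymax:
            return name
    return None


query = (9.0, 2.0)  # inside "Wide", but close to the shared border
print(nearest_region(query))     # Narrow (its center is 2 units away, Wide's is 4)
print(containing_region(query))  # Wide
```

The point sits inside the large region, yet the small neighbor's center is closer, so a proximity-only lookup picks the wrong one.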

How does it work?

A KD-Tree is used to quickly shortlist nearby administrative boundaries. For those candidates, the system performs polygon containment checks to confirm the true region. It supports both single-process execution for small workloads and multiprocessing for large batch processing.
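The two-stage lookup described above can be sketched roughly as follows. This is not the project's code. It assumes the KD-Tree indexes one centroid per region, uses a small hand-written tree and a ray-casting containment test instead of whatever libraries the project relies on, and treats coordinates as planar. All names and the sample regions are hypothetical.

```python
import heapq
import math


def build_kdtree(items, depth=0):
    """Build a 2-D tree from [((x, y), name), ...]; a node is (item, left, right)."""
    if not items:
        return None
    axis = depth % 2
    items = sorted(items, key=lambda it: it[0][axis])
    mid = len(items) // 2
    return (items[mid],
            build_kdtree(items[:mid], depth + 1),
            build_kdtree(items[mid + 1:], depth + 1))


def k_nearest(node, point, k, depth=0, heap=None):
    """Collect the k nearest items in a heap of (-distance, name) entries."""
    if heap is None:
        heap = []
    if node is None:
        return heap
    (coords, name), left, right = node
    dist = math.dist(point, coords)
    if len(heap) < k:
        heapq.heappush(heap, (-dist, name))
    elif dist < -heap[0][0]:
        heapq.heapreplace(heap, (-dist, name))
    diff = point[depth % 2] - coords[depth % 2]
    near, far = (left, right) if diff < 0 else (right, left)
    k_nearest(near, point, k, depth + 1, heap)
    # Only cross the splitting plane if a closer item could be on the far side.
    if len(heap) < k or abs(diff) < -heap[0][0]:
        k_nearest(far, point, k, depth + 1, heap)
    return heap


def contains(polygon, point):
    """Ray casting: count edge crossings of a ray going right from the point."""
    x, y = point
    inside = False
    for i in range(len(polygon)):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % len(polygon)]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def centroid(polygon):
    """Vertex average; good enough for the simple shapes used here."""
    return (sum(x for x, _ in polygon) / len(polygon),
            sum(y for _, y in polygon) / len(polygon))


def reverse_geocode(tree, regions, point, k=2):
    """Shortlist k candidates by centroid distance, then confirm by containment."""
    shortlist = sorted(k_nearest(tree, point, k), reverse=True)  # nearest first
    for _, name in shortlist:
        if contains(regions[name], point):
            return name
    return None


# Hypothetical regions as vertex lists.
regions = {
    "Wide": [(0, 0), (10, 0), (10, 4), (0, 4)],
    "Narrow": [(10, 0), (12, 0), (12, 4), (10, 4)],
    "North": [(0, 4), (12, 4), (12, 8), (0, 8)],
}
tree = build_kdtree([(centroid(poly), name) for name, poly in regions.items()])

print(reverse_geocode(tree, regions, (9.0, 2.0)))    # Wide
print(reverse_geocode(tree, regions, (11.0, 1.0)))   # Narrow
print(reverse_geocode(tree, regions, (50.0, 50.0)))  # None
```

One trade-off in this sketch is the shortlist size k: too small and the containing region may not make the shortlist when its centroid is far away, too large and more polygon checks are run per point.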

Performance

The system processes 10,000 coordinates in under 2 seconds, with an average polygon validation time below 0.4 milliseconds per coordinate.

Who is this for?

Anyone who needs reverse geocoding with predictable costs, especially for large-scale batch processing.

Implementation notes

This started as a toy implementation to explore boundary-aware reverse geocoding, but it turned out to be reliable enough for real production use. The dataset covers more than 210 countries with over 145,000 administrative boundaries.

Links

Source code: https://github.com/SOORAJTS2001/gazetteer

Documentation: https://gazetteer.readthedocs.io/en/stable

Feedback is welcome, especially around the approach, performance trade-offs, and edge cases.