Open‑source Geocoding with Nominatim and Geopy

How to convert addresses ↔ coordinates at scale using nothing but Python and OpenStreetMap

Context: This guide complements our ArcGIS‑based batch geocoding and reverse‑geocoding articles. If you’re looking for a no‑credit, fully open‑source workflow, read on.


1  Why choose an open‑source stack?

| Proprietary locator (ArcGIS, Google) | Nominatim + Geopy |
|---|---|
| Pay-as-you-go credits or API fees | 100 % free to use* |
| Global coverage, consistent quality | Community-driven OSM data (excellent in cities, variable elsewhere) |
| Closed-source algorithms | Transparent, replicable |
| Needs internet or local licence | Can run 100 % offline (self-hosted) |

*You must respect the OpenStreetMap Nominatim usage policy: send at most 1 request per second to the public API, and include a unique User-Agent string that identifies your application.


2  Prerequisites

  1. Python 3.8+ (Anaconda or system install).
  2. Packages: geopy, pandas, requests, tqdm (for progress bars). Install them with:
     pip install geopy pandas requests tqdm
  3. Input data: CSV or Excel file with either an address column (forward geocode) or lat, lon columns (reverse geocode).
  4. (Optional) Docker Desktop if you plan to self‑host Nominatim for unlimited throughput.

3  Method 1 — Quick one‑liner look‑ups with Geopy

For small jobs (≤ 500 addresses) the short script below is often enough.

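from geopy.geocoders import Nominatim

# Use a plain-ASCII User-Agent that identifies your app and a contact address
geolocator = Nominatim(user_agent="my-gis-blog/0.1 (+your-email@example.com)")

location = geolocator.geocode("380 New York St Redlands CA")
if location:  # geocode() returns None when nothing matches
    print(location.latitude, location.longitude)
else:
    print("No match found")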

For reverse geocoding, replace the .geocode(...) call with .reverse("34.056,-117.195", language="en", zoom=18); the returned object's .address attribute holds the matched address.

Rate‑limit reminder: Always add time.sleep(1) between requests if you are using the public API.
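The reminder above can be sketched as a small loop. This is a minimal illustration rather than part of the original script: `geocode_all` and its `pause` parameter are hypothetical names, and the function accepts any callable shaped like `geolocator.geocode`, so it can be tried without touching the network.

```python
import time


def geocode_all(addresses, geocode, delay=1.0, pause=time.sleep):
    """Geocode addresses one by one, pausing `delay` seconds between requests.

    `geocode` is any callable that takes an address string and returns an
    object with .latitude/.longitude, or None (the shape of geopy's
    Nominatim.geocode). `pause` is injectable so the loop can be tested.
    """
    results = {}
    for i, address in enumerate(addresses):
        if i > 0:
            pause(delay)  # stay at or below 1 request per second
        location = geocode(address)
        # Unmatched addresses come back as None; record them explicitly.
        results[address] = (location.latitude, location.longitude) if location else None
    return results
```

With geopy installed, `geocode_all(["380 New York St Redlands CA"], geolocator.geocode)` would run the same loop against the public API.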


4  Method 2 — Bulk CSV geocoding (public API, respectful)

Below is a fully‑commented script that reads a CSV, geocodes each address, and writes results to a new file.

import pandas as pd
from geopy.geocoders import Nominatim
from geopy.extra.rate_limiter import RateLimiter
from tqdm import tqdm

tqdm.pandas()                               # enables .progress_apply on pandas objects

df = pd.read_csv("customers.csv")           # needs column 'address'

# Plain-ASCII User-Agent identifying your application; 10-second timeout per request
geolocator = Nominatim(user_agent="my-gis-blog/0.1", timeout=10)

# Wrap with RateLimiter: min 1 sec between calls as per Nominatim policy
geocode = RateLimiter(geolocator.geocode, min_delay_seconds=1)

# Geocode each address; unmatched addresses yield None
df["location"] = df["address"].progress_apply(geocode)
df["lat"] = df["location"].apply(lambda loc: loc.latitude if loc else None)
df["lon"] = df["location"].apply(lambda loc: loc.longitude if loc else None)

df.to_csv("customers_geocoded.csv", index=False)

Typical throughput: ~3 500 addresses per hour on the public endpoint, just under the ceiling of 3 600 per hour that the 1 request/second limit imposes.
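For input files that carry lat and lon columns instead of addresses, the same pattern applies in reverse. The sketch below is an illustration under stated assumptions, not the article's script: it uses only the standard library `csv` module instead of pandas, `reverse_rows` and `reverse_csv` are hypothetical helpers, and `reverse` stands for a rate-limited callable such as `RateLimiter(geolocator.reverse, min_delay_seconds=1)`.

```python
import csv
import io


def reverse_rows(rows, reverse):
    """Add an 'address' field to each row that has 'lat' and 'lon' fields.

    `reverse` is any callable that accepts a (lat, lon) tuple and returns an
    object with an .address attribute, or None when nothing is found.
    """
    out = []
    for row in rows:
        try:
            point = (float(row["lat"]), float(row["lon"]))
        except (KeyError, TypeError, ValueError):
            # Missing or non-numeric coordinates: keep the row, skip the lookup.
            out.append({**row, "address": ""})
            continue
        location = reverse(point)
        out.append({**row, "address": location.address if location else ""})
    return out


def reverse_csv(text, reverse):
    """Read CSV text, reverse-geocode every row, and return the new CSV text."""
    rows = reverse_rows(csv.DictReader(io.StringIO(text)), reverse)
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Rows with unusable coordinates are kept with an empty address rather than dropped, so the output still lines up with the input file.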


5  Method 3 — Self‑hosted Nominatim via Docker (high‑volume, offline)

When you need millions of requests or must keep data on‑prem, spin up your own Nominatim server.

5.1  Spin up a container

git clone https://github.com/mediagis/nominatim-docker.git
cd nominatim-docker
# download a country extract from geofabrik.de (the US file is several GB; check the size listed there)
wget https://download.geofabrik.de/north-america/us-latest.osm.pbf
# edit .env to point to the PBF file
sudo docker compose up -d

The stack provisions PostgreSQL/PostGIS + the Nominatim service exposed on http://localhost:8080.

5.2  Change your Geopy endpoint

# domain takes host:port only; the protocol is set separately through scheme
geolocator = Nominatim(domain="localhost:8080", scheme="http", user_agent="local-nominatim")

No more rate limits! Add a load‑balancer or additional read‑replicas for extra concurrency.
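Because a self-hosted server has no rate limit, requests can be issued in parallel. A minimal sketch, assuming a thread pool suits your server's capacity: `geocode_concurrently` and `max_workers=8` are illustrative choices, not values from the Nominatim documentation, and `geocode` is any callable shaped like `geolocator.geocode`. Whether your geopy setup is safe to share across threads is worth checking before relying on this.

```python
from concurrent.futures import ThreadPoolExecutor


def geocode_concurrently(addresses, geocode, max_workers=8):
    """Geocode addresses in parallel threads, preserving input order.

    Only use this against a server you operate; the public Nominatim API
    allows at most 1 request per second.
    """
    def safe(address):
        try:
            return geocode(address)
        except Exception:  # one failed lookup should not abort the batch
            return None

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as `addresses`.
        return list(pool.map(safe, addresses))
```

Failed lookups are returned as None in place, so the result list can be zipped back onto the input addresses.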


6  Choosing between public vs self‑hosted

| Metric | Public OSM Nominatim | Self-hosted Nominatim |
|---|---|---|
| Requests/day | 86 400 (1/s) | Unlimited (hardware-bound) |
| Cost | Free | VPS ≈ $40/mo or on-prem server |
| Latency | 300-500 ms | 50-100 ms (local network) |
| Data freshness | Planet file updated weekly | You control import schedule |
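To make the requests-per-day comparison concrete, the arithmetic can be sketched as below. `estimate_hours` is a hypothetical helper, and the 50 requests per second used for the self-hosted case is purely an assumed figure; real throughput depends on your hardware.

```python
def estimate_hours(n_addresses, requests_per_second):
    """Return the wall-clock hours needed to geocode n_addresses sequentially."""
    if requests_per_second <= 0:
        raise ValueError("requests_per_second must be positive")
    return n_addresses / requests_per_second / 3600


# Public API: capped at 1 request per second by the usage policy.
public_hours = estimate_hours(1_000_000, 1)      # about 277.8 hours (~11.6 days)
# Self-hosted: 50 requests/second is an assumed, hardware-dependent figure.
local_hours = estimate_hours(1_000_000, 50)      # about 5.6 hours
print(f"public: {public_hours:.1f} h, self-hosted: {local_hours:.1f} h")
```

At one million addresses the public endpoint would need well over a week of continuous requests, which is the point where self-hosting starts to pay off.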

7  Data quality & troubleshooting

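  • No result: The address may not exist in OSM; relax or remove any country_codes filter, or retry with a simplified (partial) address.
  • Multiple results: Call geocode with exactly_one=False, then pick the candidate with the highest importance value in its .raw dictionary, or ask the user to choose.
  • Timeout errors: Increase timeout in the Geopy constructor, or split the job into smaller batches.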
  • Encoding issues: Ensure UTF‑8; strip emojis.
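The first two tips can be combined in a short sketch. It assumes each candidate exposes Nominatim's `importance` field through `.raw`, as geopy `Location` objects do; `pick_best` and `geocode_with_fallback` are hypothetical helper names, and dropping leading address parts is only one possible simplification strategy.

```python
def pick_best(candidates):
    """Return the candidate with the highest Nominatim 'importance', or None.

    `candidates` is what geocode(..., exactly_one=False) returns: a list of
    objects with a .raw dict, or None when there is no match.
    """
    if not candidates:
        return None
    return max(candidates, key=lambda loc: loc.raw.get("importance", 0.0))


def geocode_with_fallback(address, geocode):
    """Try the full address, then drop leading comma-separated parts.

    "Unit 4, 380 New York St, Redlands, CA" is retried as
    "380 New York St, Redlands, CA", then "Redlands, CA", then "CA".
    Returns (location, query_used), or (None, None) if every attempt fails.
    """
    parts = [p.strip() for p in address.split(",") if p.strip()]
    for start in range(len(parts)):
        query = ", ".join(parts[start:])
        location = pick_best(geocode(query, exactly_one=False))
        if location is not None:
            return location, query
    return None, None
```

A match found through a shortened query is coarser than a full-address match, so it is worth storing `query_used` alongside the coordinates for later review.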

8  Exporting geocoded data to GIS formats

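  1. CSV → GeoPackage (for a Shapefile, use -f "ESRI Shapefile" and a .shp output name instead). The column names are passed as CSV open options with -oo, and -a_srs assigns WGS 84:
     ogr2ogr -f GPKG -oo X_POSSIBLE_NAMES=lon -oo Y_POSSIBLE_NAMES=lat -a_srs EPSG:4326 customers.gpkg customers_geocoded.csv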
  2. Load directly into QGIS, symbolise by match quality.
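If GDAL is not installed, one alternative is to write GeoJSON with the Python standard library, which QGIS can also load. This is a sketch of a swapped-in technique, not the ogr2ogr route described above: `rows_to_geojson` is a hypothetical helper, and it assumes the lat and lon column names produced by the bulk script.

```python
import csv
import io
import json


def rows_to_geojson(rows, lat_field="lat", lon_field="lon"):
    """Build a GeoJSON FeatureCollection from dict rows; skip rows without coordinates."""
    features = []
    for row in rows:
        try:
            lat, lon = float(row[lat_field]), float(row[lon_field])
        except (KeyError, TypeError, ValueError):
            continue  # ungeocoded row (empty lat/lon)
        properties = {k: v for k, v in row.items() if k not in (lat_field, lon_field)}
        features.append({
            "type": "Feature",
            # GeoJSON (RFC 7946) orders coordinates as [longitude, latitude].
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": properties,
        })
    return {"type": "FeatureCollection", "features": features}


# Example with in-memory CSV text shaped like customers_geocoded.csv.
sample = "address,lat,lon\n380 New York St Redlands CA,34.056,-117.195\nUnknown place,,\n"
collection = rows_to_geojson(csv.DictReader(io.StringIO(sample)))
print(json.dumps(collection, indent=2))
```

For a real file, pass `csv.DictReader(open("customers_geocoded.csv", newline="", encoding="utf-8"))` and write the result with `json.dump`.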

9  Linking into your ArcGIS workflow

Even if you primarily work in ArcGIS Pro, you can import this open‑source output:

  • Use Add Data → customers.gpkg.
  • Join to enterprise geodatabase tables for further analysis.
  • Combine with the Reverse Geocoding table to build QA dashboards.

FAQ

Q: Is it legal to use Nominatim for commercial projects?
A: Yes, provided you comply with the OSM licence (ODbL) and the Nominatim usage policy. Attribution to OpenStreetMap contributors is required.

Q: How accurate is OSM geocoding compared to paid services?
A: In major cities accuracy is comparable; coverage and address completeness can be lower in rural or under‑mapped regions. Always validate samples.

Q: Can I cache results to avoid duplicate requests?
A: Absolutely. Store the place_id or lat/lon in a local database and query it before calling the API.
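A minimal sketch of such a cache, assuming SQLite as the local database and the normalised address string as the lookup key: `CachedGeocoder` is a hypothetical class, not part of geopy, and it wraps any callable shaped like `geolocator.geocode`.

```python
import sqlite3


class CachedGeocoder:
    """Wrap a geocode callable with a SQLite-backed cache keyed by address."""

    def __init__(self, geocode, db_path=":memory:"):
        self._geocode = geocode
        self._db = sqlite3.connect(db_path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS cache "
            "(address TEXT PRIMARY KEY, lat REAL, lon REAL)"
        )

    def lookup(self, address):
        """Return (lat, lon), or None for a known miss, calling the API at most once."""
        key = " ".join(address.lower().split())  # normalise case and whitespace
        row = self._db.execute(
            "SELECT lat, lon FROM cache WHERE address = ?", (key,)
        ).fetchone()
        if row is not None:
            return None if row[0] is None else (row[0], row[1])
        location = self._geocode(address)
        lat, lon = (location.latitude, location.longitude) if location else (None, None)
        # Misses are stored as NULLs so they are not re-requested either.
        self._db.execute("INSERT INTO cache VALUES (?, ?, ?)", (key, lat, lon))
        self._db.commit()
        return None if lat is None else (lat, lon)
```

Passing a file path such as `db_path="geocache.sqlite"` keeps the cache between runs; the default in-memory database only lasts for the current process.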


Now you have a full, cost‑free pipeline for geocoding at any scale—pair it with our ArcGIS and reverse‑geocoding guides to choose the right tool for every budget and project.

About the Author
I'm Daniel O'Donohue, the voice and creator behind The MapScaping Podcast ( A podcast for the geospatial community ). With a professional background as a geospatial specialist, I've spent years harnessing the power of spatial to unravel the complexities of our world, one layer at a time.