Table of contents
Open Table of contents
Introduction
Have you ever wondered how Google Maps can quickly find nearby restaurants or how Uber efficiently matches riders with drivers? The secret often lies in a clever data structure called geohash.
In this post, we’ll explore what geohashing is, why it’s so effective for spatial data processing, and how to implement it in Go.
But first, let’s dive into the problem geohashing was designed to solve.
The Problem: Finding Nearby Restaurants
Imagine you’re searching for nearby restaurants because you’re hungry. In a 2D plane, every location is defined by two properties: latitude and longitude (e.g., 52.3 and 4.8).
Latitude vs longitude
If you store these coordinates in a traditional SQL database, you’ll face a challenge: using a conventional B-tree index, you can efficiently search by either latitude or longitude, but not both simultaneously. This makes proximity searches like “What’s nearby?” inefficient.
The Solution: Geohashing
Geohashing solves this problem by encoding latitude and longitude into a compact string of letters and digits. For instance, 52.3 and 4.8 from above could be represented as “u173zk7d”. What makes geohashing so useful is that nearby locations share common prefixes, making it ideal for spatial indexing and fast proximity searches.
How it Works
Demo
Geohashing iteratively divides the Earth’s surface into a grid of 32 regions, each represented by a unique character. The longer the geohash, the more specific the location. For example, “u173” covers a large area of Amsterdam, while “u173zk7d” pinpoints a precise spot on Kerkstraat.
This method transforms complex 2D spatial indexing into a simpler 1D problem, making it compatible with traditional database indexing methods such as B-trees, allowing for fast and efficient spatial queries just like any other type of data.
Implementation
The geohashing algorithm works by iteratively dividing the latitude and longitude ranges, alternating between them, and converting the resulting bits into a base32 string. Let’s go over the steps one by one.
func GenerateGeohash(x, y float64) string {
var xBits, yBits uint32
xs := []float64{-180.0, 0.0, 180.0}
ys := []float64{-90.0, 0.0, 90.0}
// coordBits = 30
for i := coordBits - 1; i >= 0; i-- {
processCoordinate(x >= xs[1], &xBits, xs, i)
processCoordinate(y >= ys[1], &yBits, ys, i)
}
resultBits := interleaveBits(xBits, yBits)
return encodeGeohash(resultBits)
}
Process Coordinates
The first step is to divide the coordinate space:
- Longitude ranges from -180 to 180 degrees.
- Latitude ranges from -90 to 90 degrees.
Each range is split in half at each step, and a bit is assigned depending on whether the coordinate is in the lower or upper half.
func processCoordinate(rightOfMedian bool, bits *uint32, coords []float64, bitPosition int) {
if rightOfMedian {
*bits |= 1 << bitPosition
coords[0] = coords[1]
} else {
coords[2] = coords[1]
}
coords[1] = (coords[0] + coords[2]) / 2.0
}
Interleave Bits
After determining the bits for both latitude and longitude, these bits are interleaved to create a binary string.
func interleaveBits(xBits, yBits uint32) uint64 {
var resultBits uint64
// coordBits = 30
for i := coordBits - 1; i >= 0; i-- {
resultBits |= uint64((xBits>>i)&1) << (2*i + 1)
resultBits |= uint64((yBits>>i)&1) << (2 * i)
}
return resultBits
}
Encode Bits Using Base32
Finally, this binary string is split into 5-bit chunks and each chunk is mapped to a character in the base32 encoding scheme, resulting in the final geohash string.
func encodeGeohash(resultBits uint64) string {
var geohash strings.Builder
// totalBits = 60
// bitsPerChar = 5
geohash.Grow(totalBits / bitsPerChar)
// bitsMask = 0x1F, 00011111 in binary
// base32Codes = 0123456789bcdefghjkmnpqrstuvwxyz
for i := bitsPerChar; i <= totalBits; i += bitsPerChar {
pos := (resultBits >> (totalBits - i)) & bitsMask
geohash.WriteByte(base32Codes[pos])
}
return geohash.String()
}
Conclusion
Geohashing is a powerful tool that bridges the gap between spatial data and efficient query performance. It is widely used in geographic information systems (GIS), geolocation services, and databases for efficient spatial indexing and querying.
If you want to understand it better, you can check the visualization here: https://geohash.softeng.co/
Source code is available on GitHub.