CiviCRM can use OpenStreetMap or Google as a geoencoder (i.e. looking up latitude, longitude from address). With this correctly set up each contact you add will have a longitude and latitude value and can be placed on a map.
I had a large set of addresses (about 10,000) that had been entered without this and wanted to retro fit the data. There is a job you can use for this, but it's way too slow. Here's how I did it.
Tools used:
- the open source postcodes.io service.
- jq - for json manipulation
- bash, curl, sed, wc.
First, get the UK postcodes out of the database where we're missing the geocoding:
mysql -e 'select CONCAT("\"", postal_code, "\"")
from civicrm_address
where geo_code_1 is null
and country_id = 1226
and postal_code is not null and postal_code NOT LIKE "%\"%"
group by postal_code;' -B mycivicrmdb >postcodes
This created a file with one post code per line, surrounded by "quotes".
I found that postcodes.io doesn't like to give out more than 100 at a time, so I then batched them in 100s. Not all the postcodes would be valid so I had to root out ones without a result.
while [ -s postcodes ]
do
head -n100 postcodes | jq -s '{postcodes:.}' >postcode-body
curl -s -XPOST -H "Content-Type: application/json" -d@postcode-body https://api.postcodes.io/postcodes | \
jq -c '.result[] | select(.result)| [ .query, .result.postcode, .result.longitude, .result.latitude ] ' >>geo.json
sed -i '1,100d' postcodes;
echo -n "postcodes: " ; wc -l postcodes ; echo -n "coded: "; wc -l geo.json ;
done
This ran through and gave me a file geo.json with lines like:
["rh175da","RH17 5DA",-0.145082,51.014794]
The first item is the value from CiviCRM; the next is the corrected postcode; the next is the longitude and the last is the latitude.
I was able to transform these into SQL updates with jq again:
jq -r '"UPDATE civicrm_address SET postal_code=\"" + .[1] + "\", geo_code_2=" + (.[2]|tostring) + ", geo_code_1=" + (.[3]|tostring) + " WHERE country_id = 1226 AND geo_code_1 IS NULL AND postal_code = \"" + .[0] + "\";"' geo.json >address-updates.sql
Finally I visually checked these updates looked OK (in theory someone could have planted some malicious SQL injection in a postcode field), top-and-tailed them with BEGIN; and COMMIT; and applied them to the database.
I found that it was taking ages because there is no index on postal_code, so I added one and then it only took a few seconds to apply the updates.
Now my data is cleaner (postcodes are corrected) and also has geolocation data. Yay.
Add new comment