I can fetch a JSON list of subscribers like this:
curl -u $API_LOGIN:$API_PASSWORD https://example.org/subscribers
Then there’s jq, which is basically AWK for JSON.
The following invocation simply breaks a top-level JSON array into an object per line.
cat subscribers.json | jq '.[]' -c
And then there’s Postgres, which has a COPY command that will (by default) accept a row per line, with columns separated by tabs. In my case I only care about one column.
(I specify the port number because I have both Postgres 9.1 and 9.3 servers running)
cat subscribers.txt | psql -p 5433 -c 'DELETE FROM scratch; COPY scratch (subscriber) FROM STDIN;'
Put it all together:
curl -u $API_LOGIN:$API_PASSWORD https://example.org/subscribers | jq '.[]' -c | psql -p 5433 -c 'DELETE FROM scratch; COPY scratch (subscriber) FROM STDIN;'
Bam, I’ve refreshed my entire local subscriber database in one shell command and 4 seconds (most of which was the initial HTTP request). This could easily go in a cron job.
OK, so now I have a bunch of rows with raw JSON data in them, what’s the big deal? Ah, but it isn’t raw. I declared the subscriber column as type json.
Let’s count active subscriptions.
psql -p 5433 -c "SELECT COUNT(*) from scratch where subscriber#>>'{subscription, status}' = 'ACTIVE'"
Thanks to Postgres’ JSON querying operators, I just selected into the JSON data in the subscriber column in order to query on a nested property of that data.
How cool is that?!
pretty cool
Sweet one-liner. I wish PG would use friendlier json “selectors”.. this hurts my brain:
subscriber#>>'{subscription, status}’ = ‘ACTIVE’
This would be more what I am used to:
subscriber[‘subscription’][‘status’] = ‘ACTIVE’
I’m pretty sure that would conflict with existing postgres syntax.