Re: Question regarding querying some JSON/JSONB

From: Weston Weems <wweems(at)gmail(dot)com>
To: Steve Midgley <science(at)misuse(dot)org>
Cc: pgsql-sql(at)postgresql(dot)org
Subject: Re: Question regarding querying some JSON/JSONB
Date: 2015-04-27 18:50:48
Message-ID: CAHcN2MxWFjJHrRTEFvT2_iGC5dGRAxsBJqRzvYXddCUe3JF6pA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

sorry if I was unclear... all I meant to say about the dates were that they
were the first key I use in the lookup in the json columns... they could be
anything...

So basically I'd like to say go pull these keys and perform the aggregation
(lets say status is 1-10, I want a count of each, so I have the count
queries that do that) this works, and provides OverallStatus1Count,
OverallStatus2Count etc

Now I'm just trying to figure out how to expand on this so I don't have to
run the query 12 times to effectively build aggregates on the fly for each
month.

Grouping may be the key, but being I'm looking for a count of values equal
to 1, then 2, then 3... a simple sum/group by may not be able to achieve
what I'm looking for... but may be the right clue as to how to get where I
need to go.

In any case, I'll definitely respond with anything I manage to get working,
to see if theres anything hugely alarming with it.

Weston

On Mon, Apr 27, 2015 at 11:42 AM, Steve Midgley <science(at)misuse(dot)org> wrote:

> I'm not sure I understand this part of your question: "what I'd like to
> do is pass in a number of dates and have those dates substituted in for the
> date keys"
>
> Are you trying to aggregate your query to add up all the "1" outputs in a
> sum from a large number of data structures featuring different dates? If
> so, I would think that you should use your current query as a dynamic table
> and then write a group by aggregation on it.
>
> If that assessment is right, then it seems like your real problem is that
> your date field is a key not a value? Can you restructure the json so you
> have the date as a value as well:
>
> ["struct":{"date": “2014-10-10”, “overall_status”:10, “avg_response:20},
> "struct":{"date": “2014-11-10”, “overall_status”:10, “avg_response:20}]
>
> Wouldn't that let you aggregate dates via a dynamic table? I'm writing
> free-hand (untested) but something like:
>
> select date, sum(overall_status) as StatusZeroCount, sum(avg_response) as
> Response10Count
> from
> select
> data->'struct'->'date' as date
> coalesce(data->'overall_status' as overall_status, 1)
> coalesce(data->'avg_response' as avg_response, 1)
> from some_table
> group by date
>
> I'm riffing there, but maybe that idea is useful?
>
> Steve
>
>
> On Mon, Apr 27, 2015 at 7:19 AM, Weston Weems <wweems(at)gmail(dot)com> wrote:
>
>> I'm having troubles with this query (or rather getting it to work exactly
>> as I expected to. I've also summarized here:
>>
>> http://pastecode.org/index.php/view/47785912
>>
>>
>> data
>> [“2014-10-10”:{“overall_status”:10, “avg_response:20},
>> “2014-11-10”:{“overall_status”:10, “avg_response:20}]
>> [“2014-10-10”:{“overall_status”:10, “avg_response:20},
>> “2014-11-10”:{“overall_status”:10, “avg_response:20}]
>>
>>
>> select
>> count(case when data->’2014-10-10’->’overall_status’ = 0 then 1) as
>> StatusZeroCount,
>> count(case when data->’2014-10-10’->’avg_response’ = 10 then 1) as
>> Response10Count,
>> from some_table where ...
>>
>>
>> This works, even for cases where records dont actually even have a key
>> for that date (which is what I want)
>>
>> Ideally what I'd like to do is pass in a number of dates and have those
>> dates substituted in for the date keys (and return counts even if the keys
>> dont exist in the json) in the count queries and return data like:
>>
>>
>> date | StatusZeroCount | Response10Count
>> 2014-10-10 | 10 | 2
>>
>>
>> The problems I see is first of all, how to say substittute in these
>> dates... seems like a subselect, where the date keys are pulled from the
>> parent query would work, but then I'd get one record with a ton of columns.
>>
>> Seems like there would probably be a way to group by key and group by
>> overall_status, avg_response and get counts of each or something too, but I
>> dont know.
>>
>>
>> I guess worst case scenario, I could just query the data 12 times (since
>> I'm basically saying get some counts for the last 12 mo) so conceivably 12
>> records, and the aggregates I can build against that data, but it seems
>> like that would be hugely wasteful.
>>
>>
>> Thanks for any advice in advance!
>>
>>
>>
>>
>>
>

In response to

Browse pgsql-sql by date

  From Date Subject
Next Message Suresh Raja 2015-05-05 20:13:42 documenting tables version control
Previous Message Steve Midgley 2015-04-27 18:42:45 Re: Question regarding querying some JSON/JSONB