From: | Matija Lesar <matija(dot)lesar(at)gmail(dot)com> |
---|---|
To: | Tom Smith <tomsmith1989sk(at)gmail(dot)com> |
Cc: | Vik Fearing <vik(at)2ndquadrant(dot)fr>, PostgreSQL General <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: TABLESAMPLE usage |
Date: | 2016-01-25 09:44:33 |
Message-ID: | CAPx3hmPVfnaw6kZzTSSseLqiDwiF174MMJ0vj8RdRu70FipTnA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
On 25 January 2016 at 09:55, Tom Smith <tomsmith1989sk(at)gmail(dot)com> wrote:
> Thanks, the solution would work for fixed interval timestamp.
> But the data I am dealing with has irregular timestamp so can not be
> generated with exact steps.
>
> I would consider this a special case/method of random sampling, evenly
> distributed sampling according to the defined timestamp index.
>
> On Mon, Jan 25, 2016 at 3:48 AM, Vik Fearing <vik(at)2ndquadrant(dot)fr> wrote:
>
>> On 01/25/2016 05:09 AM, Tom Smith wrote:
>> > Hello:
>> >
>> > I have a big table with that is always appended with new data with a
>> unique
>> > sequence id (always incremented, or timestamp as unique index) each
>> row.
>> > I'd like to sample, say 100 rows out of say 1000 rows evently across all
>> > the rows,
>> > so that it would return rows of1, 101, 201, 301 you get idea.
>> > can TABLESAMPLE get one row for every 100 rows, based on the order
>> > of the rows added to table using the timestamp as already indexed/sorted
>> > sequence
>>
>> No, TABLESAMPLE is intended to take a random sampling of the data using
>> various methods.
>>
>> You're looking for something more like this:
>>
>> select t.*
>> from generate_series(1, (select max(id) from t), 100) g
>> join t on t.id = g;
>> --
>> Vik Fearing +33 6 46 75 15 36
>> http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
>>
>
>
Hi,
you can accomplish this with row_number()
<http://www.postgresql.org/docs/8.4/static/functions-window.html#FUNCTIONS-WINDOW-TABLE>
:
WITH data_cte as (
SELECT
id,
clock_timestamp() as ctimestamp
FROM generate_series(1,1000) as id
)
SELECT
*
FROM
(SELECT
id,
ctimestamp,
row_number() OVER (ORDER BY ctimestamp) as rownum
FROM data_cte
) as data_withrownumbers
WHERE
rownum%100=1;
Bye,
Matija Lesar
From | Date | Subject | |
---|---|---|---|
Next Message | Brian Cardarella | 2016-01-25 10:16:23 | Possible to dump/load a database from within psql? |
Previous Message | David Rowley | 2016-01-25 09:44:06 | Re: Performance options for CPU bound multi-SUM query |