From: | Andreas Joseph Krogh <andreas(at)visena(dot)com> |
---|---|
To: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: Updated RUM-index and support for bigint as part of index |
Date: | 2016-08-11 20:33:50 |
Message-ID: | VisenaEmail.bf.b23b7ec74f28713.1567b293dcc@tc7-visena |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
På torsdag 11. august 2016 kl. 19:13:10, skrev Artur Zakirov <
a(dot)zakirov(at)postgrespro(dot)ru <mailto:a(dot)zakirov(at)postgrespro(dot)ru>>:
On 07.08.2016 11:05, Andreas Joseph Krogh wrote:
> På søndag 07. august 2016 kl. 08:27:06, skrev Oleg Bartunov
> <obartunov(at)gmail(dot)com <mailto:obartunov(at)gmail(dot)com>>:
>
> [snip]
> have you considered <=| and |=> operators ? <=> in ORDER BY works
> like KNN.
>
>
> I don't get how these operators should work. Neither give me the
> expected results.
>
> Using <=>
>
>
> SELECT del.entity_id, del.folder_id, del.received_timestamp FROM
> origo_email_delivery del WHERE del.fts_all @@ to_tsquery('simple',
> 'andreas:*&jose:*') ORDER BY '2000-01-01' :: TIMESTAMP <=>
> del.received_timestamp LIMIT 10;
>
> entity_id | folder_id | received_timestamp
> -----------+-----------+-------------------------
> 1224278 | 1068087 | 2015-08-17 23:53:26
> 1224382 | 1068087 | 2015-08-18 03:07:55
> 1224404 | 1068087 | 2015-08-18 03:49:02
> 1505713 | 48496 | 2015-10-27 14:51:45
> 142132 | 66658 | 2012-12-03 14:14:05.488
> 122565 | 90115 | 2012-11-20 15:41:04.936
> 200744 | 66655 | 2013-01-28 21:47:44.561
> 1445927 | 888665 | 2015-09-29 00:26:56
> 123671 | 83509 | 2012-11-21 14:16:26.448
> 1129928 | 66658 | 2015-05-09 08:39:14.128
> (10 rows)
>
>
> Using <=|
>
> SELECT del.entity_id, del.folder_id, del.received_timestamp FROM
> origo_email_delivery del WHERE del.fts_all @@ to_tsquery('simple',
> 'andreas:*&jose:*') ORDER BY '2000-01-01' :: TIMESTAMP <=|
> del.received_timestamp LIMIT 10;
>
>
> entity_id | folder_id | received_timestamp
> -----------+-----------+-------------------------
> 1224278 | 1068087 | 2015-08-17 23:53:26
> 1224382 | 1068087 | 2015-08-18 03:07:55
> 1224404 | 1068087 | 2015-08-18 03:49:02
> 1505713 | 48496 | 2015-10-27 14:51:45
> 142132 | 66658 | 2012-12-03 14:14:05.488
> 122565 | 90115 | 2012-11-20 15:41:04.936
> 200744 | 66655 | 2013-01-28 21:47:44.561
> 1445927 | 888665 | 2015-09-29 00:26:56
> 123671 | 83509 | 2012-11-21 14:16:26.448
> 1129928 | 66658 | 2015-05-09 08:39:14.128
> (10 rows)
>
>
> Neither are ordered by received_timestamp
>
> Can you explain how to get ORDER BY received_timestamp DESC?
>
> Thanks.
>
> --
> *Andreas Joseph Krogh*
> CTO / Partner - Visena AS
> Mobile: +47 909 56 963
> andreas(at)visena(dot)com <mailto:andreas(at)visena(dot)com>
> www.visena.com <https://www.visena.com>
> <https://www.visena.com>
>
Do you need simple ordering by received_timestamp column? Not ordering
by distance between received_timestamp and some date?
Then you can use simple "ORDER BY received_timestamp". For example, we
have data:
=# SELECT * FROM test;
id | fts | received
----+-------------+-------------------------
1 | 'andreas':1 | 2015-08-17 23:53:26
2 | 'andreas':1 | 2015-08-18 03:07:55
3 | 'andreas':1 | 2015-08-18 03:49:02
4 | 'andreas':1 | 2012-12-03 14:14:05.488
5 | 'andreas':1 | 2012-11-20 15:41:04.936
6 | 'andreas':1 | 2013-01-28 21:47:44.561
6 | 'andreas':1 | 2015-09-29 00:26:56
7 | 'andreas':1 | 2012-11-21 14:16:26.448
8 | 'andreas':1 | 2015-05-09 08:39:14.128
(9 rows)
I created index:
CREATE INDEX rumidx ON test USING rum (fts rum_tsvector_timestamp_ops,
received) WITH (attach = 'received', to = 'fts');
Then we can execute queries:
=# SELECT id, received FROM test WHERE fts @@ to_tsquery('simple',
'andreas') ORDER BY received LIMIT 8;
id | received
----+-------------------------
5 | 2012-11-20 15:41:04.936
7 | 2012-11-21 14:16:26.448
4 | 2012-12-03 14:14:05.488
6 | 2013-01-28 21:47:44.561
8 | 2015-05-09 08:39:14.128
1 | 2015-08-17 23:53:26
2 | 2015-08-18 03:07:55
3 | 2015-08-18 03:49:02
(8 rows)
=# SELECT id, received FROM test WHERE fts @@ to_tsquery('simple',
'andreas') ORDER BY received DESC LIMIT 8;
id | received
----+-------------------------
6 | 2015-09-29 00:26:56
3 | 2015-08-18 03:49:02
2 | 2015-08-18 03:07:55
1 | 2015-08-17 23:53:26
8 | 2015-05-09 08:39:14.128
6 | 2013-01-28 21:47:44.561
4 | 2012-12-03 14:14:05.488
7 | 2012-11-21 14:16:26.448
(8 rows)
Yes, this gives the correct result, but the whole motivation for using
RUM-index is for the query to use the same index for ORDER BY, as it seems to
do using the <=> operator.
The query you gave above does not the index for sorting AFAIU.
Operators <=>, |=>, <=| you can use to order by nearest date to specific
date:
[snip]
I hope this is what you want.
I still don't understand how my query which had ORDER BY '2000-01-01' ::
TIMESTAMP <=> del.received_timestamp
can produce the following ordering:
entity_id | folder_id | received_timestamp
-----------+-----------+-------------------------
1224278 | 1068087 | 2015-08-17 23:53:26
1224382 | 1068087 | 2015-08-18 03:07:55
1224404 | 1068087 | 2015-08-18 03:49:02
1505713 | 48496 | 2015-10-27 14:51:45
142132 | 66658 | 2012-12-03 14:14:05.488
122565 | 90115 | 2012-11-20 15:41:04.936
200744 | 66655 | 2013-01-28 21:47:44.561
1445927 | 888665 | 2015-09-29 00:26:56
123671 | 83509 | 2012-11-21 14:16:26.448
1129928 | 66658 | 2015-05-09 08:39:14.128
How can "nearest date to specific date" produce this ordering when the
specific date si 2000-01-01?
Thanks for explaining.
-- Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
andreas(at)visena(dot)com <mailto:andreas(at)visena(dot)com>
www.visena.com <https://www.visena.com>
<https://www.visena.com>
From | Date | Subject | |
---|---|---|---|
Next Message | Jeff Janes | 2016-08-11 20:48:27 | Re: Postgres Pain Points: 1 pg_hba conf |
Previous Message | Andreas Joseph Krogh | 2016-08-11 20:20:24 | Re: Postgres Pain Points 2 ruby / node language drivers |