Re: Effective query for listing flags in use by messages in a folder

From: Hector Vass <hector(dot)vass(at)metametrics(dot)co(dot)uk>
To: Andreas Joseph Krogh <andreas(at)visena(dot)com>, "pgsql-sql(at)postgresql(dot)org" <pgsql-sql(at)postgresql(dot)org>
Subject: Re: Effective query for listing flags in use by messages in a folder
Date: 2015-03-17 13:31:28
Message-ID: 1426599087812.95161@metametrics.co.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

A couple of ideas to try ... the approach you take will depend on volume of records, how quickly/how much they change and the overheads of updating/maintaining this data

/** you could consider holding the flags as integers as you can do more stuff with simple math **/

drop table if exists message;
create table message(
folder_id integer not NULL,
is_seen int not null default 0,
is_replied int not null default 0,
is_forwarded int not null default 0,
is_deleted int not null default 0,
is_draft int not null default 0,
is_flagged int not null default 0
);
insert into message values
(1,0,0,0,0,0,0),
(2,0,0,0,0,0,1),
(3,0,0,0,0,1,1),
(4,0,0,0,1,0,1)
;
select folder_id from message where is_seen+is_replied+is_forwarded+is_deleted+is_draft+is_flagged>0;

/** of course holding flags as columns can be inefficient and not as flexible as holding them as key value pairs **/
drop table if exists message;
create table message(
folder_id integer not NULL,
is_flag char(1),
is_val int
);
insert into message values
(1,'s',0),(1,'r',0),(1,'f',0),(1,'d',0),(1,'a',0),(1,'f',0),
(2,'s',0),(2,'r',0),(2,'f',0),(2,'d',0),(2,'a',0),(2,'f',1),
(3,'s',0),(3,'r',0),(3,'f',0),(3,'d',0),(3,'a',1),(3,'f',1),
(4,'s',0),(4,'r',0),(4,'f',0),(4,'d',1),(4,'a',0),(4,'f',1)
;
select folder_id from message where is_val>0 group by 1;

/** key value pairs can use a lot of space as the folder_id has to be repeated ... **/
/** so why bother holding value of Zero at all just hold those with a flag**/
/** key value approach has advantages as you simply add flags so no update or delete operations **/
delete from message where is_val=0;
select folder_id from message group by 1;

/** Commonly the bottleneck comes down to the speed at which you can store/update the flags sql not the best
at generating bit map fields but you can or you can consider using a stored procedure in C ... then this would be my preferred approach **/

drop table if exists message;
create table message(
folder_id integer not NULL,
is_flag int
);
insert into message values
(1,0),
(2,1),
(3,3),
(4,5)
;
select folder_id from message where is_flag>0;
select folder_id,(is_flag::bit(7))::char(7) from message where is_flag>0;
--or can do bit level operations which are v fast
select 'deleted flag set',folder_id from message where is_flag&4>0;
select 'deleted flag set or seen draft set ',folder_id from message where is_flag&6>0;

/** if you can compress the data into bit field you may not need to bother with overhead of indexes as scanning whole table or partition can be very fast **/

Hector Vass

+44(0)7773 352 559
* Metametrics, International House, 107 Gloucester Road, Malmesbury, Wiltshire, SN16 0AJ
* www.metametrics.co.uk<http://www.metametrics.co.uk/>

________________________________
From: pgsql-sql-owner(at)postgresql(dot)org <pgsql-sql-owner(at)postgresql(dot)org> on behalf of Andreas Joseph Krogh <andreas(at)visena(dot)com>
Sent: 17 March 2015 07:45
To: pgsql-sql(at)postgresql(dot)org
Subject: [SQL] Effective query for listing flags in use by messages in a folder

Hi all.

On PG-9.3 (no JSONB),

For an IMAP-like system; I'm trying to figure out an effective way to query for "what flags are in use in a folder". A flag is considered used when one or more messages in that folder has the value=true.

The schema is like this:

create table message(
folder_id integer not NULL,
is_seen boolean NOT NULL default false,
is_replied boolean not null default false,
is_forwarded boolean not null default false,
is_deleted boolean not null default false,
is_draft boolean not null default false,
is_flagged boolean not null default false
);

I need the "flags" to be in the message-table for other queries to be as efficient as possible (no JOIN'ing), the system contains millions of messages.

create index message_folder_id_deleted_idx ON message(folder_id) where is_deleted = TRUE;
create index message_folder_id_forwarded_idx ON message(folder_id) where is_forwarded = TRUE;
create index message_folder_id_replied_idx ON message(folder_id) where is_replied = TRUE;
create index message_folder_id_seen_idx ON message(folder_id) where is_seen = TRUE;
create index message_folder_id_flagged_idx ON message(folder_id) where is_flagged = TRUE;
create index message_folder_id_draft_idx ON message(folder_id) where is_draft = TRUE;

select 'is_deleted' from (select * from message where folder_id = 3 AND is_deleted limit 1) as q
UNION
select 'is_forwarded' from (select * from message where folder_id = 3 AND is_forwarded limit 1) as q
UNION
select 'isreplied' from (select * from message where folder_id = 3 AND is_replied limit 1) as q
UNION
select 'is_seen' from (select * from message where folder_id = 3 AND is_seen limit 1) as q
UNION
select 'is_flagged' from (select * from message where folder_id = 3 AND is_flagged limit 1) as q
UNION
select 'is_draft' from (select * from message where folder_id = 3 AND is_draft limit 1) as q
;

Are there better ways to do this?

Thanks.

--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
andreas(at)visena(dot)com<mailto:andreas(at)visena(dot)com>
www.visena.com<https://www.visena.com>
[cid:part_F12291144747920LR13D]<https://www.visena.com>

In response to

Responses

Browse pgsql-sql by date

  From Date Subject
Next Message Andreas Joseph Krogh 2015-03-17 21:02:48 Re: Effective query for listing flags in use by messages in a folder
Previous Message Andreas Joseph Krogh 2015-03-17 07:45:17 Effective query for listing flags in use by messages in a folder