From: | hubert depesz lubaczewski <depesz(at)depesz(dot)com> |
---|---|
To: | Ivan Voras <ivoras(at)gmail(dot)com> |
Cc: | postgres performance list <pgsql-performance(at)postgresql(dot)org> |
Subject: | Re: Indexes for hashes |
Date: | 2016-06-15 13:38:33 |
Message-ID: | 20160615133833.GA29449@depesz.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
On Wed, Jun 15, 2016 at 11:34:18AM +0200, Ivan Voras wrote:
> I have an application which stores a large amounts of hex-encoded hash
> strings (nearly 100 GB of them), which means:
Why do you keep them hex encoded, and not use bytea?
I made a sample table with 1 million rows, looking like this:
Table "public.new"
Column | Type | Modifiers
---------+-------+-----------
texthex | text |
a_bytea | bytea |
values are like:
$ select * from new limit 10;
texthex | a_bytea
------------------------------------------------------------------+--------------------------------------------------------------------
c968f64426b941bc9a8f6d4e87fc151c7a7192679837618570b7989c67c31e2f | \xc968f64426b941bc9a8f6d4e87fc151c7a7192679837618570b7989c67c31e2f
61dffbf002d7fc3db9df5953aca7f2e434a78d4c5fdd0db6f90f43ee8c4371db | \x61dffbf002d7fc3db9df5953aca7f2e434a78d4c5fdd0db6f90f43ee8c4371db
757acf228adf2357356fd38d03b80529771f211e0ad3b35b66a23d6de53e5033 | \x757acf228adf2357356fd38d03b80529771f211e0ad3b35b66a23d6de53e5033
fba35b8b33a7fccc2ac7d96389d43be509ff17636fe3c0f8a33af6d009f84f15 | \xfba35b8b33a7fccc2ac7d96389d43be509ff17636fe3c0f8a33af6d009f84f15
ecd8587a8b9acae650760cea8683f8e1c131c4054c0d64b1d7de0ff269ccc61a | \xecd8587a8b9acae650760cea8683f8e1c131c4054c0d64b1d7de0ff269ccc61a
11782c73bb3fc9f281b41d3eff8c1e7907b3494b3abe7b6982c1e88f49dad2ea | \x11782c73bb3fc9f281b41d3eff8c1e7907b3494b3abe7b6982c1e88f49dad2ea
5862bd8d645e4d44997a485c616bc18f1acabeaec5df3c3b09b9d4c08643e852 | \x5862bd8d645e4d44997a485c616bc18f1acabeaec5df3c3b09b9d4c08643e852
2d09a5cca2c03153a55faa3aff13df0f0593f4355a1b2cfcf9237c2931b4918c | \x2d09a5cca2c03153a55faa3aff13df0f0593f4355a1b2cfcf9237c2931b4918c
2186eb2bcc12319ee6e00f7e08a1d61e379a75c01c579c29d0338693bc31c7c7 | \x2186eb2bcc12319ee6e00f7e08a1d61e379a75c01c579c29d0338693bc31c7c7
2061bd05049c51bd1162e4d77f72a37f06d2397fc522ef587ed172a5ad8d57aa | \x2061bd05049c51bd1162e4d77f72a37f06d2397fc522ef587ed172a5ad8d57aa
(10 rows)
created two indexes:
create index i1 on new (texthex);
create index i2 on new (a_bytea);
i1 is 91MB, and i2 is 56MB.
Index creation was also much faster - best out of 3 runs for i1 was 4928.982
ms, best out of 3 runs for i2 was 2047.648 ms
Best regards,
depesz
From | Date | Subject | |
---|---|---|---|
Next Message | Ivan Voras | 2016-06-15 13:54:07 | Re: Indexes for hashes |
Previous Message | ktm@rice.edu | 2016-06-15 13:16:14 | Re: Indexes for hashes |