From: | AJ Weber <aweber(at)comcast(dot)net> |
---|---|
To: | Jeff Janes <jeff(dot)janes(at)gmail(dot)com> |
Cc: | "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org> |
Subject: | Re: Partition table in 9.0.x? |
Date: | 2013-01-08 18:04:55 |
Message-ID: | 50EC5FC7.7000902@comcast.net |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
> It does if you use it without an argument, to display all the tables
> in the search path:
>
> jjanes=# \d+
> List of relations
> Schema | Name | Type | Owner | Size | Description
> --------+------------------+-------+--------+---------+-------------
> public | pgbench_accounts | table | jjanes | 128 MB |
> public | pgbench_branches | table | jjanes | 40 kB |
> public | pgbench_history | table | jjanes | 0 bytes |
> public | pgbench_tellers | table | jjanes | 40 kB |
> (4 rows)
>
> It rather annoys me that you actually get less information (no size,
> no owner) when you use \d+ on a named table. I don't know if there is
> a reason for that feature, or if it was just an oversight.
That is rather peculiar. Sorry for that.
Table in question is 9284MB
(Parent table is 621MB)
>
> The current constraint exclusion code is quite simple-minded and
> doesn't know how to make use of check constraints that use the mod
> function, so the indexes of all partitions would have to be searched
> for each order_num-driven query, even though we know the data could
> only exist in one of them. The constraint exclusion codes does
> understand check constraints that involve ranges.
Hmm. That's a bit of a limitation I didn't know about. I assume it
doesn't understand the percent (mod operator) just the same as not
understanding the MOD() function? Either way, I guess this strategy
does not pan-out.
> There could still be some benefit as the table data would be
> concentrated, even if the index data is not.
I'm reaching way, way back in my head, but I think _some_ RDBMS I worked
with previously had a way to "cluster" the rows around a single one of
the indexes on the table, thus putting the index and the row-data
"together" and reducing the number of IO's to retrieve the row if that
index was used. Am I understanding that PG's "cluster" is strictly to
group like rows together logically -- table data only, not to coordinate
the table row with the index upon which you clustered them?
>
>> and all rows for the same order would stay
>> within the same partition-table.
> But usually a given order_num would only be of interest for a fraction
> of a second before moving on to some other order_num of interest, so
> by the time the relevant partition become fully cached, it would no
> longer be hot. Or, if the partitions were small enough, you could
> assume that all rows would be dragged into memory when the first one
> was requested because they lay so close to each other. But it is not
> feasible to have a large enough number of partitions to make that
> happen. But if the table is clustered, this is exactly what you would
> get--the trouble would be keeping it clustered. If most of the
> line-items are inserted at the same time as each other, they probably
> should be fairly well clustered to start with.
Does decreasing the fill to like 90 help keep it clustered in-between
times that I could shutdown the app and perform a (re-) cluster on the
overall table? Problem is, with a table that size, and the hardware I'm
"blessed with", the cluster takes quite a bit of time. :(
From | Date | Subject | |
---|---|---|---|
Next Message | Midge Brown | 2013-01-08 18:25:52 | Re: Two Necessary Kernel Tweaks for Linux Systems |
Previous Message | Tom Lane | 2013-01-08 17:51:40 | Re: Partition table in 9.0.x? |