From: | Soumyadeep Chakraborty <soumyadeep2007(at)gmail(dot)com> |
---|---|
To: | gkokolatos(at)pm(dot)me |
Cc: | Daniel Gustafsson <daniel(at)yesql(dot)se>, David Zhang <david(dot)zhang(at)highgo(dot)ca>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Michael Paquier <michael(at)paquier(dot)xyz> |
Subject: | Re: PATCH: Attempt to make dbsize a bit more consistent |
Date: | 2020-11-11 17:53:48 |
Message-ID: | CAE-ML+_kh4p519d+67UoJKaBQUgrpBWzkHxRjZiktGm6BMC3QQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hey Georgios,
On Tue, Nov 10, 2020 at 6:20 AM <gkokolatos(at)pm(dot)me> wrote:
>
>
>
>
>
>
> ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
> On Monday, November 9, 2020 7:50 PM, Soumyadeep Chakraborty <soumyadeep2007(at)gmail(dot)com> wrote:
>
> > Hey Georgios,
> >
> > Thanks for looking for more avenues to invoke tableAM APIS! Please find
> > my review below:
>
> A great review Soumyadeep, it is much appreciated.
> Please remember to add yourself as a reviewer in the commitfest
> [https://commitfest.postgresql.org/30/2701/]
Ah yes. Sorry, I forgot that!
> >
> > On Tue, Oct 13, 2020 at 6:28 AM gkokolatos(at)pm(dot)me wrote:
> >
> > 1.
> >
> > > /*
> > >
> > > - - heap size, including FSM and VM
> > >
> > > - - table size, including FSM and VM
> > > */
> > >
> >
> > We should not mention FSM and VM in dbsize.c at all as these are
> > heap AM specific. We can say:
> > table size, excluding TOAST relation
>
> Yeah, I was thinking that the notion that FSM and VM are still taken
> into account should be stated. We are iterating over ForkNumber
> after all.
>
> How about phrasing it as:
>
> + table size, including all implemented forks from the AM (e.g. FSM, VM)
> + excluding TOAST relations
>
> Thoughts?
Yes, I was thinking along the same lines. The concept of a "fork" forced
should not be forced down into the tableAM. But that is a discussion for
another day. We can probably say:
+ table size, including all implemented forks from the AM (e.g. FSM, VM
+ for the heap AM) excluding TOAST relations
> >
> > 2.
> >
> > > /*
> > >
> > > - Size of toast relation
> > > */
> > > if (OidIsValid(rel->rd_rel->reltoastrelid))
> > >
> > >
> > > - size += calculate_toast_table_size(rel->rd_rel->reltoastrelid);
> > >
> > > - {
> > > - Relation toastRel;
> > > -
> > > - toastRel = relation_open(rel->rd_rel->reltoastrelid, AccessShareLock);
> >
> > We can replace the OidIsValid check with relation_needs_toast_table()
> > and then have the OidIsValid() check in an Assert. Perhaps, that will
> > make things more readable.
>
> Please, allow me to kindly disagree.
>
> Relation is already open at this stage. Even create_toast_table(), the
> internal workhorse for creating toast relations, does check reltoastrelid
> to test if the relation is already toasted.
>
> Furthermore, we do call:
>
> + toastRel = relation_open(rel->rd_rel->reltoastrelid, AccessShareLock);
>
> and in order to avoid elog() errors underneath us, we ought to have
> verified the validity of reltoastrelid.
>
> In short, I think that the code in the proposal is not too unreadable
> nor that it breaks the coding patterns throughout the codebase.
>
> Am I too wrong?
No not at all. The code in the proposal is indeed adhering to the
codebase. What I was going for here was to increase the usage of
relation_needs_toast_table(). What I meant was:
if (table_relation_needs_toast_table(rel))
{
if (!OidIsValid(rel->rd_rel->reltoastrelid))
{
elog(ERROR, <errmsg that toast table wasn't found>);
}
//size calculation here..
}
We want to error out if the toast table can't be found and the relation
is expected to have one, which the existing code doesn't handle.
>
> >
> > 3.
> >
> > > - if (rel->rd_rel->relkind == RELKIND_RELATION ||
> > > - rel->rd_rel->relkind == RELKIND_TOASTVALUE ||
> > > - rel->rd_rel->relkind == RELKIND_MATVIEW)
> > > - size = calculate_table_size(rel);
> > > - else
> > > - {
> > > - relation_close(rel, AccessShareLock);
> > > - PG_RETURN_NULL();
> > > - }
> >
> > This leads to behavioral changes:
> >
> > I am talking about the case where one calls pg_table_size() on an index.
> > W/o your patch, it returns the size of the index. W/ your patch it
> > returns NULL. Judging by the documentation, this function should not
> > ideally apply to indexes but it does! I have a sinking feeling that lots
> > of users use it for this purpose, as there is no function to calculate
> > the size of a single specific index (except pg_relation_size()).
> > The same argument I have made above applies to sequences. Users may have
> > trial-and-errored their way into finding that pg_table_size() can tell
> > them the size of a specific index/sequence! I don't know how widespread
> > the use is in the user community, so IMO maybe we should be conservative
> > and not introduce this change? Alternatively, we could call out that
> > pg_table_size() is only for tables by throwing an error if anything
> > other than a table is passed in.
> >
> > If we decide to preserve the existing behavior of the pg_table_size():
> > It seems that for things not backed by the tableAM (indexes and
> > sequences), they should still go through calculate_relation_size().
> > We can call table_relation_size() based on if relkind is
> > RELKIND_RELATION, RELKIND_TOASTVALUE or RELKIND_MATVIEW. Perhaps it
> > might be worthwhile making a new macro RELKIND_HAS_TABLE_STORAGE to
> > capture these three cases (See RELKIND_HAS_STORAGE). This also ensures
> > that we return 0 for things that don't qualify as RELKIND_HAS_STORAGE,
> > such as a partitioned table (Currently w/ the patch applied, we return
> > NULL for those cases, which is another behavior change)
>
>
> Excellent point. This is the discussion I was longing to have.
>
> I stand by the decision coded in the patch, that pg_table_size() should
> return NULL for other kinds of relations, such as indexes, sequences
> etc.
>
> It is a conscious decision based on the following:
>
> * Supported by the documentation, pg_table_size() applies to tables only.
> For other uses the higher-level functions pg_total_relation_size() or
> pg_relation_size() should be used.
> * Commit fa352d662e taught pg_relation_size() and friends to return NULL if the object doesn't exist. This makes perfect sense for the
> scenarios described in the commit:
>
> That avoids errors when the functions are used in queries like
> "SELECT pg_relation_size(oid) FROM pg_class",
> and a table is dropped concurrently.
>
> IMHO: It is more consistent to return NULL when the relation does exist
> OR it is not a table kind.
> * Returning 0 for things that do not have storage, is nonsensical because
> it implies that it can be NON zero at some point. Things that do not
> have storage have an unknown size.
Fair. We will have to document the behavior change.
>
> As far as for the argument that users might have trialed and errored
> their way into undocumented behaviour, I do not think it is strong
> enough to stop us from implementing the documented behaviour.
Fair. I would strongly vote for having two additional functions
(pg_index_size() and pg_sequence_size()) to strongly signal our intent
of banning that kind of use of pg_table_size(). I think it would help
users a lot. It is not easy to find what function to call when you want
the size of a single index/sequence.
> >
> > 4.
> >
> > > @@ -3776,10 +3776,24 @@ listTables(const char *tabtypes, const char *pattern, bool verbose, bool showSys
> > > gettext_noop("Access Method"));
> > >
> > > /*
> > >
> > >
> > > - * As of PostgreSQL 14, do not use pg_table_size() for indexes and
> > >
> > >
> > > - * sequences as it does not behave sanely for those.
> > >
> > >
> > > - *
> > > * As of PostgreSQL 9.0, use pg_table_size() to show a more accurate
> > > * size of a table, including FSM, VM and TOAST tables.
> > > */
> > >
> > >
> > >
> > > - if (pset.sversion >= 90000)
> > >
> > >
> > >
> > > - if (pset.sversion >= 140000)
> > >
> > >
> > > - appendPQExpBuffer(&buf,
> > >
> > >
> > > - ",\\n CASE"
> > >
> > >
> > > - " WHEN c.relkind in ("CppAsString2(RELKIND_INDEX)", "
> > >
> > >
> > > - CppAsString2(RELKIND_PARTITIONED_INDEX)", "
> > >
> > >
> > > - CppAsString2(RELKIND_SEQUENCE)") THEN"
> > >
> > >
> > > - " pg_catalog.pg_size_pretty(pg_catalog.pg_relation_size(c.oid))"
> > >
> > >
> > > - " ELSE"
> > >
> > >
> > > - " pg_catalog.pg_size_pretty(pg_catalog.pg_table_size(c.oid))"
> > >
> > >
> > > - " END as \\"%s\\"",
> > >
> > >
> > > - gettext_noop("Size"));
> > >
> > >
> > > - else if (pset.sversion >= 90000)
> > > appendPQExpBuffer(&buf,
> > > ",\\n pg_catalog.pg_size_pretty(pg_catalog.pg_table_size(c.oid)) as \\"%s\\"",
> > > gettext_noop("Size"));
> > >
> > >
> >
> > Following on from point 3, if we decide to preserve the existing behavior,
> > we wouldn't need this change, as it would be internalized within
> > pg_table_size().
>
> We really should not decide to preserve the existing behaviour.
>
> I will reiterate my point: Returning 0 for things that do not have
> storage, implies that it can be NON zero at some point. We should not
> treat pg_table_size() as an alias for pg_relation_size().
+1
> >
> > 4.
> >
> > > > - return size;
> > > >
> > > > - Assert(size < PG_INT64_MAX);
> > > >
> > > > -
> > > > - return (int64)size;
> > > > I assume that this change, and the other ones like that, aim to handle int64
> > > > overflow? Using the extra legroom of uint64 can still lead to an overflow,
> > > > however theoretical it may be. Wouldn't it be better to check for overflow
> > > > before adding to size rather than after the fact? Something along the lines of
> > > > checking for headroom left:
> > > > rel_size = table_relation_size(..);
> > > > if (rel_size > (PG_INT64_MAX - total_size))
> > > > < error codepath >
> > > >
> > > >
> > > > total_size += rel_size;
> > >
> > > Actually not, the intention is not to handle overflow. The table_relation_size() returns a uint64 and the calling function returns int64.
> > > The Assert() has been placed in order to denote that it is acknowledged that the two functions return different types. I was of the opinion that a run time check will not be needed as even the > smaller type can cover more than 9200 PetaByte tables.
> > > If we were to change anything, then I would prefer to change the signature of the pg_*_size family of functions to return uint64 instead.
> >
> > Changing the signature would be the ideal change for all of this here.
> > But Postgres does not have support for an unsigned 64bit integer (bigint
> > is signed). One would need to turn to extensions such as [1]. Thus, +1
> > to what Daniel said above.
>
> Apologies, I do not follow. Are you suggesting that we should
> introduce overflow tests?
Yes, to introduce the same overflow test that Daniel suggested above as
returning a uint64 in PG does not really return a uint64 AFAIU (since
the pg_**size() functions all return bigint which is signed and there
is no uint64 user-facing type).
Regards,
Soumyadeep (VMware)
From | Date | Subject | |
---|---|---|---|
Next Message | Jacob Champion | 2020-11-11 18:17:03 | Re: Support for NSS as a libpq TLS backend |
Previous Message | vignesh C | 2020-11-11 17:12:24 | Re: Parallel copy |