xlog viewer prototype and new proposal

From: "Diogo Biazus" <diogob(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: xlog viewer prototype and new proposal
Date: 2006-07-07 05:35:52
Message-ID: eca519a10607062235t270b9345led5fb56666ae3a0a@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I've worked on a prototype (attached to this email) of the SRF function and
I can query the xlog files for some useful info.
I know that the error codes are still incorrect and the tests are missing,
but this is only a proof of concept.

Examples of usage:
Query for committed transactions on the xlog segment:

postgres=# SELECT * from
xlogviewer('/usr/local/pgsql/data/pg_xlog/000000010000000000000003') where
trim(infoname) = 'commit';
rmid | xid | rmname | info | len | total_len | infoname
------+-----+--------+------+-----+-----------+----------
1 | 4 | XACT | 0 | 12 | 40 | commit
1 | 5 | XACT | 0 | 12 | 40 | commit
...

Query for the total length of transactions in the segment:
postgres=# SELECT xid, sum(total_len) from
xlogviewer('/usr/local/pgsql/data/pg_xlog/000000010000000000000003') group
by xid;
xid | sum
-----+---------
499 | 69054
497 | 1460
...

I'll sumarize some relevant points of our previous thread (can be viewed at:
http://archives.postgresql.org/pgsql-hackers/2006-06/msg01069.php)

I exposed the idea of bringing the xlogdump functionality to a backend
module.
The main drawback is the use case where the database is down.
But the access to a failed cluster isn't impossible, just a little bit more
dificult, requiring another cluster to be initialized.
I don't think that this is a no-go for the design, of course for the case
where the database is down is still better to have a standalone tool.
So there were suggestions to code something that could be used both on the
frontend and the backend.
Tom said it would be difficult to use the same functions on backend and
frontend.
I think that I could use the same library and pass pointers to memory
allocation and error reporting functions.

Advantages in the SRF design:
- Using SQL to agregate and transform data in any way from the logs.
- It's easier for the DBA in the other use cases where the cluster is still
active.
- I already have a database connection in cases where I want to translate
oid to names.
- I can connect directly to the postgresql server if I want to query xlogs
in a remote machine (don't need remote access to the system).
- Easier to integrate with existing admin tools, like PgAdmin.

In any case I need to start ASAP, to have something useful till the end of
the google soc.
So if the way to go will be the standalone program, I think that I can
enhance it by adding:
- option to translate OIDs to names given a database connection
- find loser transactions (not commited to the end of the log)
- have an options to output only the transactions with their status and some
aggregate data (transaction size).
- a CSV output (to read it's data on another programs, including the
backend)

There's one functionality I would like to work on but I don't how dificult
it would be to acomplish in time:
- Extract the exact operation done in cases of xlog generated by
insert/update/delete.

--
Diogo Biazus - diogob(at)gmail(dot)com
Móvel Consultoria
http://www.movelinfo.com.br
http://www.postgresql.org.br

Attachment Content-Type Size
xlogviewer.tar.gz application/x-gzip 6.5 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Dunstan 2006-07-07 10:33:14 Re: request for feature: psql 'DSN' option
Previous Message Christopher Browne 2006-07-07 04:49:14 request for feature: psql "DSN" option