From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Herouth Maoz <herouth(at)unicell(dot)co(dot)il>
Cc: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Question about optimizing access to a table.
Date: 2013-12-10 18:55:45
Message-ID: CAMkU=1xZAtKddZngO4=YW7okHD06weYKBg53hEuEcGzXteKwyw@mail.gmail.com
Lists: pgsql-general
On Tue, Dec 10, 2013 at 8:23 AM, Herouth Maoz <herouth(at)unicell(dot)co(dot)il> wrote:
>
> Hello.
>
> I have one particular table with very specialized use. I am sending
> messages to some partner. The partner processes them asynchronously, and
> then returns the status report to me. The table is used to store a
> serialized version of the message object, together with a few identifiers,
> expiry date, and a reference from the partner. When the status report comes
> back from the partner, we:
>
>
> 1. Select the record using the partner's reference number.
> 2. Reconstruct and queue the object.
> 3. Delete the record from the database using the serial primary key ID.
>
>
Where do you "queue" the object? Back into a different table within the
same database? Why not use "DELETE ... RETURNING"?
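For example, a single statement along these lines (just a sketch: the table
name transient is guessed from the index names quoted below, and the column
holding the serialized message, message_object here, is hypothetical) would
fetch and delete the row in one step:

    DELETE FROM transient
     WHERE msisdn = '972541234567'      -- placeholder values
       AND smsc_reference = 123456
     RETURNING id, message_object;

That saves a round trip and guarantees the row you reconstruct is exactly the
one you remove.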
>
> Every once in a while we run a garbage collection process which selects
> expired messages, reconstructs and queues them for processing, and deletes
> the record using the primary key.
>
> This works wonderfully as long as the table remains small - a few thousand
> records waiting for their status report, and that's it. The table is set to
> have frequent auto-analyze runs.
>
You might want to turn up autovacuum for this table as well, not just
auto-analyze; analyze alone does not reclaim the dead rows that all of those
deletes leave behind.
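As a minimal sketch of per-table settings (the table name transient and the
thresholds are assumptions to be tuned, not recommendations):

    ALTER TABLE transient SET (
        autovacuum_vacuum_scale_factor  = 0.0,
        autovacuum_vacuum_threshold     = 1000,
        autovacuum_analyze_scale_factor = 0.0,
        autovacuum_analyze_threshold    = 1000
    );

With the scale factors at zero, vacuum and analyze fire after a fixed number
of changed rows rather than a fraction of the table, which suits a table
whose size swings this much.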
>
> The problem starts when our partner has some glitch, under high load, and
> fails to send back a few hundred thousand reports. In that case, the table
> grows to a few hundred thousand records, and they are not deleted until they
> hit
> their expiry date, at which point the "garbage collector" takes care of
> them and everything goes back to normal. When it contains hundreds of
> thousands of records, performance deteriorates considerably.
>
There is no inherent reason the performance needs to degrade. Can you give
the specific queries that perform worse?
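For example, running the lookup under EXPLAIN when the table is small and
again when it is bloated (table and column names as quoted below; the literal
values are placeholders) would show where the time goes:

    EXPLAIN (ANALYZE, BUFFERS)
    SELECT *
      FROM transient
     WHERE msisdn = '972541234567'
       AND smsc_reference = 123456;

Comparing the two plans would show whether the slowdown is in the index
scans, the deletes, or somewhere else entirely.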
> I am trying to figure out a solution that will keep the system working
> well even when there is a large number of records in the table. At first I
> thought of partitioning the table on the partner's reference field. But the
> DELETE would actually slow down if I do this, right?
>
Not necessarily, but partitioning should be your last resort, not your first,
and I don't see any reason it would be needed here.
> Indexes:
> "transient_pkey" PRIMARY KEY, btree (id)
> "transient_msisdn_ref" btree (msisdn, smsc_reference)
> "transient_sm_vp" btree (validity)
>
What is the partner reference? If it is smsc_reference, then you probably
need an index in which that is the lead (or only) column.
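If that is the case, a sketch of such an index (again assuming the table is
named transient, as the existing index names suggest):

    CREATE INDEX CONCURRENTLY transient_smsc_reference_idx
        ON transient (smsc_reference);

CONCURRENTLY lets the index build without blocking writes, which matters if
status reports keep arriving while it is created.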
Cheers,
Jeff