Re: Fsync IO issue

From: ProfiVPS Support <support(at)profivps(dot)hu>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Re: Fsync IO issue
Date: 2023-05-04 20:37:22
Message-ID: b99c6e70d03bb18394c343dc94e5d968@profivps.hu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi,

thank you for your response :)

Yes, that's exactly what's happening and I understand the issue with
fsync in these cases. But I see no workaround about this as the data is
ingested one-by-one (sent by collectd) and a db function handles it (it
has to do lookup and set state in a different table based on the
incoming value).

I feel like ANYTHING would be better than this. Even risking loosing
_some_ of the latest data in case of a server crash (if it crashes we
lose data anyways until restart, ofc we could have HA I know and we will
when there'll be a need) .

Around 100 times the write need for wall than the useful data! That's
insane. This is actually endangering the whole project we've been
working on for the last 1.5 years and I face this issue after 100k
devices have been added for a client. So I'm between a rock and a hard
place :(

Ye, I think this is called "experience", but I must be honest, I was
not expecting this at all :(

However, collectd's plugin does have an option to increase commit
interval, but that kept the records locked and it caused strange issues,
that's why I disabled it. I tried now to add that setting back and it
does ease the situation somewhat with write spikes on commit.

All in all, thank you for your help. Honestly, after todays journey I
thought that's the issue, but didn't want to believe it.

Thanks,

András

2023-05-04 21:21 időpontban Andres Freund ezt írta:

> Hi,
>
> On 2023-05-04 19:31:45 +0200, ProfiVPS Support wrote:
>
>> We are collecting around 400k values each 5 minutes into a hypertable.
>> (We
>> use timescaledb extension, I also shared this on timescale forum but
>> then I
>> realised the issue is postgresql related.)
>
> I don't know how timescale does its storage - how did you conclude this
> is
> about postgres, not about timescale? Obviously WAL write patterns
> depend on
> the way records are inserted and flushed.
>
>> I also found that the 16MB WAL segment got 80+ MB written into it
>> before
>> being closed. So what's happening here? Does fsync cause the whole
>> file to
>> be written out again and again?
>
> One possible reason for this is that you are committing small
> transactions
> very frequently. When a transaction commits, the commit records needs
> to be
> flushed to disk. If the transactions are small, the next commit might
> reside
> on the same page - which needs to be written out again. Which of course
> can
> increase the write rate considerably.
>
> Your workload does not sound like it actually needs to commit in tiny
> transactions? Some larger batching / using longer lived transactions
> might
> help a lot.
>
> Another possibility is that timescale does flush WAL too frequently for
> some
> reason...
>
> Greetings,
>
> Andres Freund

---
Olcsó Virtuális szerver:
http://www.ProfiVPS.hu

Támogatás: Support(at)ProfiVPS(dot)hu

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Thomas Munro 2023-05-04 21:23:07 Re: Fsync IO issue
Previous Message Andres Freund 2023-05-04 19:21:46 Re: Fsync IO issue