From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr> |
Cc: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: checkpointer continuous flushing |
Date: | 2015-08-17 11:41:38 |
Message-ID: | 20150817114138.GG3522@awork2.anarazel.de |
Lists: | pgsql-hackers |
On 2015-08-11 17:15:22 +0200, Fabien COELHO wrote:
> +void
> +PerformFileFlush(FileFlushContext * context)
> +{
> + if (context->ncalls != 0)
> + {
> + int rc;
> +
> +#if defined(HAVE_SYNC_FILE_RANGE)
> +
> + /* Linux: tell the memory manager to move these blocks to io so
> + * that they are considered for being actually written to disk.
> + */
> + rc = sync_file_range(context->fd, context->offset, context->nbytes,
> + SYNC_FILE_RANGE_WRITE);
> +
> +#elif defined(HAVE_POSIX_FADVISE)
> +
> + /* Others: say that data should not be kept in memory...
> + * This is not exactly what we want to say, because we want to write
> + * the data for durability but we may need it later nevertheless.
> + * It seems that Linux would free the memory *if* the data has
> + * already been written to disk, else the "dontneed" call is ignored.
> + * For FreeBSD this may have the desired effect of moving the
> + * data to the io layer, although the system does not seem to
> + * take into account the provided offset & size, so it is rather
> + * rough...
> + */
> + rc = posix_fadvise(context->fd, context->offset, context->nbytes,
> + POSIX_FADV_DONTNEED);
> +
> +#endif
> +
> + if (rc < 0)
> + ereport(ERROR,
> + (errcode_for_file_access(),
> + errmsg("could not flush block " INT64_FORMAT
> + " on " INT64_FORMAT " blocks in file \"%s\": %m",
> + context->offset / BLCKSZ,
> + context->nbytes / BLCKSZ,
> + context->filename)));
> + }
I'm a bit wary that this might cause significant regressions on
platforms that don't support sync_file_range() but do support
posix_fadvise(), for workloads that are bigger than shared_buffers.
Consider what happens if the workload does *not* fit into shared_buffers
but *does* fit into the OS's buffer cache: POSIX_FADV_DONTNEED asks the
kernel to drop those pages, so reads that used to be served from the OS
cache will suddenly go to disk again, no?
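To make the scenario concrete, here is a minimal standalone sketch (not part of the patch) of the sequence in question: write a block, get it to disk, issue POSIX_FADV_DONTNEED, then read the same block back. The function name write_drop_and_reread() and the DEMO_BLCKSZ constant are made up for illustration. The read still returns the right data; the point is only that, on a kernel that honors the advice, it may have to come from disk instead of the page cache. The sketch does not measure that.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define DEMO_BLCKSZ 8192		/* hypothetical block size for this sketch */

/*
 * Write one block, force it to disk, advise the kernel that the range is
 * no longer needed, then read it back.
 *
 * Returns 0 if the data read back matches what was written, -1 on error.
 * The file at "path" is created and removed by this function.
 */
int
write_drop_and_reread(const char *path)
{
	char		wbuf[DEMO_BLCKSZ];
	char		rbuf[DEMO_BLCKSZ];
	int			fd;
	int			rc;
	int			result = -1;

	memset(wbuf, 'x', sizeof(wbuf));
	memset(rbuf, 0, sizeof(rbuf));

	fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
	if (fd < 0)
		return -1;

	if (pwrite(fd, wbuf, sizeof(wbuf), 0) != (ssize_t) sizeof(wbuf))
		goto done;

	/* Make the pages clean, so the kernel is free to drop them. */
	if (fsync(fd) != 0)
		goto done;

	/*
	 * Note: posix_fadvise() returns an error number directly on failure,
	 * not -1 with errno set, so the check is "!= 0" rather than "< 0".
	 */
	rc = posix_fadvise(fd, 0, (off_t) sizeof(wbuf), POSIX_FADV_DONTNEED);
	if (rc != 0)
		goto done;

	/* If the pages were dropped, this read has to go to storage. */
	if (pread(fd, rbuf, sizeof(rbuf), 0) != (ssize_t) sizeof(rbuf))
		goto done;

	if (memcmp(wbuf, rbuf, sizeof(wbuf)) == 0)
		result = 0;

done:
	close(fd);
	unlink(path);
	return result;
}
```

Calling write_drop_and_reread("/tmp/some_scratch_file") should return 0 on a platform that provides posix_fadvise(). Whether the final pread() actually hits the disk depends on the kernel and filesystem.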
Greetings,
Andres Freund