From: | Nathan Bossart <nathandbossart(at)gmail(dot)com> |
---|---|
To: | Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com> |
Cc: | Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: Improve WALRead() to suck data directly from WAL buffers when possible |
Date: | 2023-03-06 22:00:27 |
Message-ID: | 20230306220027.GA3122138@nathanxps13 |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
+void
+XLogReadFromBuffers(XLogRecPtr startptr,
+ TimeLineID tli,
+ Size count,
+ char *buf,
+ Size *read_bytes)
Since this function presently doesn't return anything, can we have it
return the number of bytes read instead of storing it in a pointer
variable?
+ ptr = startptr;
+ nbytes = count;
+ dst = buf;
These variables seem superfluous.
+ /*
+ * Requested WAL isn't available in WAL buffers, so return with
+ * what we have read so far.
+ */
+ break;
nitpick: I'd move this to the top so that you can save a level of
indentation.
if (expectedEndPtr != endptr)
break;
... logic for when the data is found in the WAL buffers ...
+ /*
+ * All the bytes are not in one page. Read available bytes on
+ * the current page, copy them over to output buffer and
+ * continue to read remaining bytes.
+ */
Is it possible to memcpy more than a page at a time?
+ /*
+ * The fact that we acquire WALBufMappingLock while reading the WAL
+ * buffer page itself guarantees that no one else initializes it or
+ * makes it ready for next use in AdvanceXLInsertBuffer().
+ *
+ * However, we perform basic page header checks for ensuring that
+ * we are not reading a page that just got initialized. Callers
+ * will anyway perform extensive page-level and record-level
+ * checks.
+ */
Hm. I wonder if we should make these assertions instead.
+ elog(DEBUG1, "read %zu bytes out of %zu bytes from WAL buffers for given LSN %X/%X, Timeline ID %u",
+ *read_bytes, count, LSN_FORMAT_ARGS(startptr), tli);
I definitely don't think we should put an elog() in this code path.
Perhaps this should be guarded behind WAL_DEBUG.
+ /*
+ * Check if we have read fully (hit), partially (partial hit) or
+ * nothing (miss) from WAL buffers. If we have read either partially or
+ * nothing, then continue to read the remaining bytes the usual way,
+ * that is, read from WAL file.
+ */
+ if (count == read_bytes)
+ {
+ /* Buffer hit, so return. */
+ return true;
+ }
+ else if (read_bytes > 0 && count > read_bytes)
+ {
+ /*
+ * Buffer partial hit, so reset the state to count the read bytes
+ * and continue.
+ */
+ buf += read_bytes;
+ startptr += read_bytes;
+ count -= read_bytes;
+ }
+
+ /* Buffer miss i.e., read_bytes = 0, so continue */
I think we can simplify this. We effectively take the same action any time
"count" doesn't equal "read_bytes", so there's no need for the "else if".
if (count == read_bytes)
return true;
buf += read_bytes;
startptr += read_bytes;
count -= read_bytes;
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
From | Date | Subject | |
---|---|---|---|
Next Message | Melanie Plageman | 2023-03-06 22:09:58 | Re: add PROCESS_MAIN to VACUUM |
Previous Message | David Rowley | 2023-03-06 21:52:08 | Re: using memoize in in paralel query decreases performance |