From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org |
Cc: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Subject: | pgbench randomness initialization |
Date: | 2016-04-07 08:27:11 |
Message-ID: | 20160407082711.q7iq3ykffqxcszkv@alap3.anarazel.de |
Lists: | pgsql-hackers |
Hi,
While pondering
http://archives.postgresql.org/message-id/CA%2BTgmoZJdA6K7-17K4A48rVB0UPR98HVuaNcfNNLrGsdb1uChg%40mail.gmail.com
et al., I was wondering why it's a good idea for pgbench to do
    INSTR_TIME_SET_CURRENT(start_time);
    srandom((unsigned int) INSTR_TIME_GET_MICROSEC(start_time));
to initialize randomness and then
    for (i = 0; i < nthreads; i++)
    {
        thread->random_state[0] = random();
        thread->random_state[1] = random();
        thread->random_state[2] = random();
    }
to initialize the individual thread random state which is then used by
pg_erand48().
To me it seems better to instead seed srandom() with a known value
(say, uh, 0). Or even better, don't use random() at all: fill a
global pg_erand48() state with known values, and use pg_erand48() to
initialize the thread states.
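A minimal sketch of that second idea, not taken from the pgbench source: it
assumes the POSIX nrand48()/erand48() functions as stand-ins for
pg_erand48(), and the names ThreadState, init_thread_states and base_seed
are hypothetical, chosen only for illustration.

```c
#define _XOPEN_SOURCE 600       /* expose nrand48()/erand48() in <stdlib.h> */
#include <stdlib.h>

/* Hypothetical per-thread struct; pgbench's real thread state has more fields. */
typedef struct ThreadState
{
    unsigned short random_state[3];     /* 48-bit state consumed by erand48() */
} ThreadState;

/*
 * Derive every thread's 48-bit state from one global generator that starts
 * from a fixed, known state, so no wall-clock time is involved.
 */
static void
init_thread_states(ThreadState *threads, int nthreads, unsigned short base_seed)
{
    /* Global generator state, filled with known values only. */
    unsigned short global_state[3] = {base_seed, 0, 0};
    int         i;
    int         j;

    for (i = 0; i < nthreads; i++)
    {
        for (j = 0; j < 3; j++)
        {
            /* nrand48() returns a non-negative long; keep its low 16 bits. */
            threads[i].random_state[j] =
                (unsigned short) (nrand48(global_state) & 0xFFFF);
        }
    }
}
```

Calling init_thread_states(threads, nthreads, 0) on two separate runs should
give each thread the same starting state, so erand48(threads[i].random_state)
yields the same sequence per thread in both runs.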
Obviously that doesn't make pgbench entirely reproducible, but it seems
a lot better than now. Individual threads would do work in a
reproducible order.
I see very little reason to keep the current behaviour, or at the very
least not as the default.
Greetings,
Andres Freund