Re: Parallel copy

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: vignesh C <vignesh21(at)gmail(dot)com>
Cc: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Parallel copy
Date: 2020-10-30 20:37:30
Message-ID: 20201030203730.eicjk6542pwoicvb@development
Lists: pgsql-hackers

Hi,

I've done a bit more testing today, and I think the parsing is busted in
some way. Consider this:

test=# create extension random;
CREATE EXTENSION

test=# create table t (a text);
CREATE TABLE

test=# insert into t select random_string(random_int(10, 256*1024)) from generate_series(1,10000);
INSERT 0 10000

test=# copy t to '/mnt/data/t.csv';
COPY 10000

test=# truncate t;
TRUNCATE TABLE

test=# copy t from '/mnt/data/t.csv';
COPY 10000

test=# truncate t;
TRUNCATE TABLE

test=# copy t from '/mnt/data/t.csv' with (parallel 2);
ERROR: invalid byte sequence for encoding "UTF8": 0x00
CONTEXT: COPY t, line 485: "m&\nh%_a"%r]>qtCl:Q5ltvF~;2oS6@HB>F>og,bD$Lw'nZY\tYl#BH\t{(j~ryoZ08"SGU~.}8CcTRk1\ts$@U3szCC+U1U3i@P..."
parallel worker

The functions come from an extension I use to generate random data; I've
pushed it to GitHub [1]. The random_string() function generates a random
string with ASCII characters, symbols and a couple of special characters
(\r\n\t). The intent was to try loading data where a field may span
multiple 64kB blocks and may contain newlines etc.
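
For anyone who wants similar input without installing the extension, here is a rough Python sketch that writes comparable data directly in COPY text format. It is not the extension's code: the alphabet, the inclusive length range, and the names random_field, copy_text_escape and write_copy_file are all assumptions made for illustration. The only part taken from PostgreSQL itself is the text-format escaping of backslash, tab, newline and carriage return.

```python
import random
import string

# Assumed alphabet: ASCII letters, digits, symbols, plus \r \n \t.
ALPHABET = string.ascii_letters + string.digits + string.punctuation + "\r\n\t"

# Escapes COPY text format applies to the characters that can occur here.
COPY_ESCAPES = {"\\": "\\\\", "\t": "\\t", "\n": "\\n", "\r": "\\r"}


def random_field(rng, min_len=10, max_len=256 * 1024):
    """Return a random string with a length in [min_len, max_len] (assumed inclusive)."""
    length = rng.randint(min_len, max_len)
    return "".join(rng.choices(ALPHABET, k=length))


def copy_text_escape(value):
    """Escape one field value the way COPY text format writes it."""
    return "".join(COPY_ESCAPES.get(ch, ch) for ch in value)


def write_copy_file(path, rows, seed=0):
    """Write `rows` single-column lines in COPY text format to `path`."""
    rng = random.Random(seed)
    # newline="" keeps Python from translating the "\n" row terminator.
    with open(path, "w", encoding="utf-8", newline="") as f:
        for _ in range(rows):
            f.write(copy_text_escape(random_field(rng)) + "\n")
```

Something like write_copy_file("/mnt/data/t.csv", 10000) should produce a file of roughly the same shape as the one above, with many rows longer than 64kB.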

The non-parallel copy works fine; the parallel one fails. I haven't
investigated the details, but I guess it gets confused about where a
string starts/ends, or something like that.
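
One way to rule out the file itself is to scan it outside the server. The sketch below counts rows, lists lines containing a 0x00 byte, and counts lines that cross a 64kB boundary. The function name scan_copy_file is made up, and it is only an assumption that parallel copy splits the input at exactly these offsets; this checks the data, not the patch.

```python
def scan_copy_file(data: bytes, block_size: int = 64 * 1024):
    """Summarize a COPY text-format file passed in as bytes."""
    lines = data.split(b"\n")
    if lines and lines[-1] == b"":
        lines.pop()  # drop the empty piece after the final newline

    # 1-based numbers of lines that contain a NUL byte.
    nul_lines = [i + 1 for i, ln in enumerate(lines) if b"\x00" in ln]

    # Count lines whose bytes (including the terminator) cross a block boundary.
    spanning = 0
    offset = 0
    for ln in lines:
        end = offset + len(ln)  # offset of the terminating newline
        if offset // block_size != end // block_size:
            spanning += 1
        offset = end + 1

    return {"rows": len(lines), "nul_lines": nul_lines, "spanning": spanning}
```

If scan_copy_file(open("/mnt/data/t.csv", "rb").read()) reports 10000 rows and no NUL lines, the file is fine and the 0x00 is presumably introduced while the parallel workers parse it.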

[1] https://github.com/tvondra/random

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
