Re: Making psql error out on output failures

From: David Zhang <david(dot)zhang(at)highgo(dot)ca>
To: Daniel Verite <daniel(at)manitou-mail(dot)org>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Making psql error out on output failures
Date: 2020-02-18 18:28:26
Message-ID: a950d923-057c-ae22-2c0e-634160d02ead@highgo.ca
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

hi,

I did some further research on this bug. Here is the summary:

1. Tried to wrap "fputs" similar to "fprintf" redefined by "pg_fprintf",
but it ended up with too much error due to "fputs" is called everywhere
in "print_unaligned_text". If add an extra static variable to track the
output status, then it will be an overkill and potentially more bugs may
be introduced.

2. Investigated some other libraries such as libexplain
(http://libexplain.sourceforge.net/) but it definitely will introduce
the complexity.

3. I think a better way to resolve this issue will still be the solution
with an extra %m, which can make the error message much more informative
to the end users. The reason is that,

3.1 Providing the reasons for errors is required by PostgreSQL document,
https://www.postgresql.org/docs/12/error-style-guide.html
    "Reasons for Errors
    Messages should always state the reason why an error occurred. For
example:

    BAD:    could not open file %s
    BETTER: could not open file %s (I/O failure)
    If no reason is known you better fix the code."

3.2 Adding "%m" can provide a common and easy to understand reasons
crossing many operating systems. The "%m" fix has been tested on
platforms: CentOS 7, RedHat 7, Ubuntu 18.04, Windows 7/10, and macOS
Mojave 10.14, and it works.

3.3 If multiple errors happened after the root cause "No space left on
device", such as "No such file or directory" and "Permission denied",
then it make sense to report the latest one. The end users suppose to
know the latest error and solve it first. Eventually, "No space left on
device" error will be showing up.

3.4 Test log on different operating systems.

### CentOS 7
[postgres(at)localhost ~]$ uname -a
Linux localhost.localdomain 3.10.0-1062.9.1.el7.x86_64 #1 SMP Fri Dec 6
15:49:49 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

[postgres(at)localhost ~]$ sudo mount -t tmpfs -o rw,size=1M tmpfs /mnt/ramdisk
[postgres(at)localhost ~]$ df -h
tmpfs                    1.0M     0  1.0M   0% /mnt/ramdisk

[postgres(at)localhost ~]$ psql
psql (12.2)
Type "help" for help.

postgres=# select repeat('111', 1000000) \g /mnt/ramdisk/file
Error printing tuples (No space left on device)

[postgres(at)localhost ~]$ psql -d postgres  -At -c "select repeat('111',
1000000)" -o /mnt/ramdisk/file
Error printing tuples (No space left on device)

[postgres(at)localhost ~]$ psql -d postgres  -At -c "select repeat('111',
1000000)" > /mnt/ramdisk/file
Error printing tuples (No space left on device)

### RedHat 7
[root(at)rh7 postgres]# uname -a
Linux rh7 3.10.0-1062.el7.x86_64 #1 SMP Thu Jul 18 20:25:13 UTC 2019
x86_64 x86_64 x86_64 GNU/Linux
[postgres(at)rh7 ~]$ sudo mkdir -p /mnt/ramdisk

We trust you have received the usual lecture from the local System
Administrator. It usually boils down to these three things:

    #1) Respect the privacy of others.
    #2) Think before you type.
    #3) With great power comes great responsibility.

[sudo] password for postgres:
[postgres(at)rh7 ~]$ sudo mount -t tmpfs -o rw,size=1M tmpfs /mnt/ramdisk

[postgres(at)rh7 ~]$ psql -d postgres
psql (12.2)
Type "help" for help.

postgres=# select repeat('111', 1000000) \g /mnt/ramdisk/file
Error printing tuples (No space left on device)

[postgres(at)rh7 ~]$ psql -d postgres  -At -c "select repeat('111',
1000000)" > /mnt/ramdisk/file
Error printing tuples (No space left on device)

[postgres(at)rh7 ~]$ psql -d postgres  -At -c "select repeat('111',
1000000)" -o /mnt/ramdisk/file
Error printing tuples (No space left on device)

### Ubuntu 18.04
postgres=# select repeat('111', 10000000) \g /mnt/ramdisk/file
Error printing tuples (No space left on device)
postgres=# \q

david(at)david-VM:~$ psql -d postgres  -At -c "select repeat('111',
1000000)" > /mnt/ramdisk/file
Error printing tuples (No space left on device)

david(at)david-VM:~$ psql -d postgres  -At -c "select repeat('111',
1000000)" -o /mnt/ramdisk/file
Error printing tuples (No space left on device)

### Windows 7
postgres=# select repeat('111', 1000000) \g E:/file
Error printing tuples (No space left on device)
postgres=# \q

C:\pg12.1\bin>psql -d postgres -U postgres -h 172.20.14.29 -At -c
"select repeat
('111', 100000)" >> E:/file
Error printing tuples (No space left on device)

C:\pg12.1\bin>psql -d postgres -U postgres -h 172.20.14.29 -At -c
"select repeat
('111', 100000)" -o E:/file
Error printing tuples (No space left on device)

### Windows 10
postgres=# select repeat('111', 1000000) \g E:/file
Error printing tuples (No space left on device)
postgres=# \q

C:\>psql -d postgres -U postgres -h 192.168.0.19 -At -c "select
repeat('111', 10000000)" -o E:/file
Error printing tuples (No space left on device)

C:\>psql -d postgres -U postgres -h 192.168.0.19 -At -c "select
repeat('111', 2000000)" >> E:/file
Error printing tuples (No space left on device)

### macOS Mojave 10.14
postgres=# select repeat('111', 1000000) \g  /Volumes/sdcard/file
Error printing tuples (No space left on device)
postgres=# \q

MacBP:bin david$ psql -d postgres -h 192.168.0.10 -At -c "select
repeat('111', 3000000)" > /Volumes/sdcard/file
Error printing tuples (No space left on device)

MacBP:bin david$ psql -d postgres -h 192.168.0.10 -At -c "select
repeat('111', 3000000)" -o /Volumes/sdcard/file
Error printing tuples (No space left on device)

On 2020-01-29 1:51 a.m., Daniel Verite wrote:
> David Zhang wrote:
>
>>> Are you sure? I don't find that redefinition. Besides
>>> print_aligned_text() also calls putc and puts.
>> Yes, below is the gdb debug message when psql first time detects the
>> error "No space left on device". Test case, "postgres=# select
>> repeat('111', 1000000) \g /mnt/ramdisk/file"
>> bt
>> #0 flushbuffer (target=0x7ffd6a709ad0) at snprintf.c:313
> Indeed. For some reason gdb won't let me step into these fprintf()
> calls, but you're right they're redefined (through include/port.h):
>
> #define vsnprintf pg_vsnprintf
> #define snprintf pg_snprintf
> #define vsprintf pg_vsprintf
> #define sprintf pg_sprintf
> #define vfprintf pg_vfprintf
> #define fprintf pg_fprintf
> #define vprintf pg_vprintf
> #define printf(...) pg_printf(__VA_ARGS__)
>
> Anyway, I don't see it leading to an actionable way to reliably keep
> errno, as discussed upthread.
>
> Best regards,
--
David

Software Engineer
Highgo Software Inc. (Canada)
www.highgo.ca

Attachment Content-Type Size
Fix-psql-to-report-IO-errors-on-writing-query-results_26-2400.patch text/plain 961 bytes

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Stephen Frost 2020-02-18 18:33:33 Re: DB running out of memory issues after upgrade
Previous Message Nagaraj Raj 2020-02-18 18:10:08 Re: DB running out of memory issues after upgrade