Re: BUG #18624: Memory Leak Issue with PostgreSQL Connection During COPY Command Execution.

From: Junwang Zhao <zhjwpku(at)gmail(dot)com>
To: Abhishek Mittal <abhmittal(at)gmail(dot)com>
Cc: pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #18624: Memory Leak Issue with PostgreSQL Connection During COPY Command Execution.
Date: 2024-09-19 12:31:24
Message-ID: CAEG8a3+T8o78GdmMrZ3mZdo01S_M+tjYE13iaRQTezVwdLgNNA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Hi Abhishek,

On Thu, Sep 19, 2024 at 7:04 PM Abhishek Mittal <abhmittal(at)gmail(dot)com> wrote:
>
> Hi Junwang,
>
> Please find the table schema below along with the attached sample CSV file.
> test=> \d my_table
> Table "public.my_table"
> Column | Type | Collation | Nullable | Default
> ----------------+-------------------------+-----------+----------+---------
> id | integer | | |
> uindex | integer | | |
> pindex | integer | | |
> hid | integer | | |
> hcode | bigint | | |
> len | integer | | |
> uname | character varying(4096) | | |
> plen | integer | | |
> pname | character varying(8192) | | |
>
> Note: There are no indexes on any of the columns.
>
> Regards
> Abhishek Mittal
>
> On Thu, Sep 19, 2024 at 2:57 PM Junwang Zhao <zhjwpku(at)gmail(dot)com> wrote:
>>
>> On Thu, Sep 19, 2024 at 4:28 PM PG Bug reporting form
>> <noreply(at)postgresql(dot)org> wrote:
>> >
>> > The following bug has been logged on the website:
>> >
>> > Bug reference: 18624
>> > Logged by: Abhishek Mittal
>> > Email address: abhmittal(at)gmail(dot)com
>> > PostgreSQL version: 12.16
>> > Operating system: ubuntu0.20.04.1
>> > Description:
>> >
>> > We are currently experiencing a memory leak issue with the PostgreSQL
>> > connection process while executing the COPY command multiple times on the
>> > same connection.
>> >
>> > Issue Details:
>> > 1. We establish a connection using the following command:
>> > psql <DB_NAME> <USER_NAME>
>> >
>> > 2. We retrieve the process ID associated with this connection from the
>> > pg_stat_activity table using the query:
>> > SELECT pid, query FROM pg_stat_activity;
>> >
>> > 3. We then monitor this process ID using the top command:
>> > top -p <PID>
>> >
>> > 4. We observe that the RES (resident) memory reported by the top command
>> > continuously increases. This memory increase only occurs when we run queries
>> > on this connection. If no queries are executed, the RES memory remains
>> > stable.
>> >
>> > 5. We attempted to clear the cache by executing the DISCARD ALL command
>> > after each query, but the issue persists.
>> >
>> > Additionally, we tested the process using a C program with libpg, and the
>> > issue remains the same.
>> >
>> > Additional Information:
>> > We are executing only the COPY command to copy a CSV file into a database
>> > table.
>> > In our scenario, the RES memory usage has reached approximately 10 GB.
>> >
>> > Could you please provide guidance or recommend a solution to address this
>> > memory leak issue?
>> >
>> >
>> > Please find the sample C program below.
>> > int db_copy_into_db(PGconn *conn, char *buf, int read_bytes)
>> > {
>> > PGresult *res;
>> > char copy_cmd[1024];
>> > int ret;
>> >
>> > /*If read_bytes are 0 then no need for further processing */
>> > if(!read_bytes)
>> > return 0;
>> >
>> > sprintf(copy_cmd, "COPY %s FROM STDIN WITH DELIMITER ',' NULL AS '' CSV",
>> > "my_table");
>> > res = PQexec(conn, copy_cmd);
>> > if (PQresultStatus(res) != PGRES_COPY_IN)
>> > {
>> > fprintf(stderr,"COPY command for %s table failed: %s\n",
>> > "urltable_1114", PQerrorMessage(conn));
>> > conn = NULL;
>> > return -1;
>> > }
>> > //fprintf(stdout ,"COPY command is started\n");
>> >
>> > ret = PQputCopyData(conn, buf, read_bytes);
>> > //fprintf(stdout, "ret = %d\n", ret);
>> > if(ret == -1)
>> > {
>> > fprintf(stderr, "PQputCopyData command failed: %s\n",
>> > PQerrorMessage(conn));
>> > conn = NULL;
>> > return -1;
>> > }
>> >
>> > /* Ends the COPY_IN operation successfully if errormsg is NULL.
>> > * If errormsg is not NULL then the COPY is forced to fail,
>> > * with the string pointed to by errormsg used as the error message. */
>> > if(PQputCopyEnd(conn, NULL) != 1)
>> > {
>> > fprintf(stderr,"PQputCopyEnd() command for %s table failed: %s\n",
>> > "urltable_1114", PQerrorMessage(conn));
>> > conn = NULL;
>> > }
>> > PQclear(res);
>> > res = PQexec(conn, "DISCARD ALL");
>> >
>> > if (PQresultStatus(res) != PGRES_COMMAND_OK)
>> > {
>> > fprintf(stderr, "DISCARD ALL command failed: %s",
>> > PQerrorMessage(conn));
>> > }
>> > PQclear(res);
>> > return 0;
>> > }
>> >
>> > void read_csv_and_copy_inio_db( char *csv_file, PGconn *conn)
>> > {
>> > char* read_buf = malloc(sizeof(char) * 1024 * 1024 * 20);
>> > int read_bytes;
>> >
>> > /* 1. open file
>> > * read bufer
>> > */
>> > int fd;
>> > int status;
>> > fd = open(csv_file, O_RDONLY);
>> > while (1)
>> > {
>> > //seek to byke 0
>> > status = lseek(fd, 0, SEEK_SET);
>> > if(status == -1)
>> > fprintf(stderr, "Failed to seek offset file\n");
>> > else
>> > printf("Seeking to 0\n");
>> > read_bytes = read(fd, read_buf, 1024 * 1024 * 20);
>> > db_copy_into_db(conn, read_buf, read_bytes);
>> >
>> > getc(stdin);
>> > }
>> > }
>> >
>> > int main()
>> > {
>> > char csv_file[1024 * 10];
>> > char *conn_info = "dbname=my_db user=abhi";
>> > PGconn *conn = PQconnectdb(conn_info);
>> > if(PQstatus(conn) != CONNECTION_OK)
>> > {
>> > fprintf(stderr, "Connection to database failed: %s\n",
>> > PQerrorMessage(conn));
>> > if(conn)
>> > {
>> > PQfinish(conn);
>> > conn = NULL;
>> > }
>> > return -1;
>> > }
>> > sprintf(csv_file,"/tmp/my_table.csv");
>> > read_csv_and_copy_inio_db(csv_file, conn);
>> > }
>> >
>>
>> Can you please provide the table schema and my_table.csv file?
>>
>> --
>> Regards
>> Junwang Zhao
>
>
>
> --
> amittal

I tried to reproduce the bug you reported against postgresql master
but no luck, I built from source with asan, I didn't observe any leak.

I did not test against 12.16 yet because I got some compiling issues.

I use the following command btw, not sure if this is enough to
reproduce the bug.

select 'COPY my_table from ''/tmp/my_table.csv'' CSV' FROM
generate_series(0,10000) \gexec

--
Regards
Junwang Zhao

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Abhishek Mittal 2024-09-19 13:55:18 Re: BUG #18624: Memory Leak Issue with PostgreSQL Connection During COPY Command Execution.
Previous Message Abhishek Mittal 2024-09-19 11:04:01 Re: BUG #18624: Memory Leak Issue with PostgreSQL Connection During COPY Command Execution.