Re: BUG #18624: Memory Leak Issue with PostgreSQL Connection During COPY Command Execution.

From: Abhishek Mittal <abhmittal(at)gmail(dot)com>
To: Junwang Zhao <zhjwpku(at)gmail(dot)com>
Cc: pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #18624: Memory Leak Issue with PostgreSQL Connection During COPY Command Execution.
Date: 2024-09-19 13:55:18
Message-ID: CAEHmedAWf-7LBuKNA_CDnuQq+NQo0=uzzcMyk017EAf71kxMdg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Hi Junwangm,

Could you please try with the program, which was provided by me.

Regards
Abhishek Mittal

On Thu, Sep 19, 2024 at 6:01 PM Junwang Zhao <zhjwpku(at)gmail(dot)com> wrote:

> Hi Abhishek,
>
> On Thu, Sep 19, 2024 at 7:04 PM Abhishek Mittal <abhmittal(at)gmail(dot)com>
> wrote:
> >
> > Hi Junwang,
> >
> > Please find the table schema below along with the attached sample CSV
> file.
> > test=> \d my_table
> > Table "public.my_table"
> > Column | Type | Collation | Nullable |
> Default
> >
> ----------------+-------------------------+-----------+----------+---------
> > id | integer | | |
> > uindex | integer | | |
> > pindex | integer | | |
> > hid | integer | | |
> > hcode | bigint | | |
> > len | integer | | |
> > uname | character varying(4096) | | |
> > plen | integer | | |
> > pname | character varying(8192) | | |
> >
> > Note: There are no indexes on any of the columns.
> >
> > Regards
> > Abhishek Mittal
> >
> > On Thu, Sep 19, 2024 at 2:57 PM Junwang Zhao <zhjwpku(at)gmail(dot)com> wrote:
> >>
> >> On Thu, Sep 19, 2024 at 4:28 PM PG Bug reporting form
> >> <noreply(at)postgresql(dot)org> wrote:
> >> >
> >> > The following bug has been logged on the website:
> >> >
> >> > Bug reference: 18624
> >> > Logged by: Abhishek Mittal
> >> > Email address: abhmittal(at)gmail(dot)com
> >> > PostgreSQL version: 12.16
> >> > Operating system: ubuntu0.20.04.1
> >> > Description:
> >> >
> >> > We are currently experiencing a memory leak issue with the PostgreSQL
> >> > connection process while executing the COPY command multiple times on
> the
> >> > same connection.
> >> >
> >> > Issue Details:
> >> > 1. We establish a connection using the following command:
> >> > psql <DB_NAME> <USER_NAME>
> >> >
> >> > 2. We retrieve the process ID associated with this connection from the
> >> > pg_stat_activity table using the query:
> >> > SELECT pid, query FROM pg_stat_activity;
> >> >
> >> > 3. We then monitor this process ID using the top command:
> >> > top -p <PID>
> >> >
> >> > 4. We observe that the RES (resident) memory reported by the top
> command
> >> > continuously increases. This memory increase only occurs when we run
> queries
> >> > on this connection. If no queries are executed, the RES memory remains
> >> > stable.
> >> >
> >> > 5. We attempted to clear the cache by executing the DISCARD ALL
> command
> >> > after each query, but the issue persists.
> >> >
> >> > Additionally, we tested the process using a C program with libpg, and
> the
> >> > issue remains the same.
> >> >
> >> > Additional Information:
> >> > We are executing only the COPY command to copy a CSV file into a
> database
> >> > table.
> >> > In our scenario, the RES memory usage has reached approximately 10 GB.
> >> >
> >> > Could you please provide guidance or recommend a solution to address
> this
> >> > memory leak issue?
> >> >
> >> >
> >> > Please find the sample C program below.
> >> > int db_copy_into_db(PGconn *conn, char *buf, int read_bytes)
> >> > {
> >> > PGresult *res;
> >> > char copy_cmd[1024];
> >> > int ret;
> >> >
> >> > /*If read_bytes are 0 then no need for further processing */
> >> > if(!read_bytes)
> >> > return 0;
> >> >
> >> > sprintf(copy_cmd, "COPY %s FROM STDIN WITH DELIMITER ',' NULL AS ''
> CSV",
> >> > "my_table");
> >> > res = PQexec(conn, copy_cmd);
> >> > if (PQresultStatus(res) != PGRES_COPY_IN)
> >> > {
> >> > fprintf(stderr,"COPY command for %s table failed: %s\n",
> >> > "urltable_1114", PQerrorMessage(conn));
> >> > conn = NULL;
> >> > return -1;
> >> > }
> >> > //fprintf(stdout ,"COPY command is started\n");
> >> >
> >> > ret = PQputCopyData(conn, buf, read_bytes);
> >> > //fprintf(stdout, "ret = %d\n", ret);
> >> > if(ret == -1)
> >> > {
> >> > fprintf(stderr, "PQputCopyData command failed: %s\n",
> >> > PQerrorMessage(conn));
> >> > conn = NULL;
> >> > return -1;
> >> > }
> >> >
> >> > /* Ends the COPY_IN operation successfully if errormsg is NULL.
> >> > * If errormsg is not NULL then the COPY is forced to fail,
> >> > * with the string pointed to by errormsg used as the error
> message. */
> >> > if(PQputCopyEnd(conn, NULL) != 1)
> >> > {
> >> > fprintf(stderr,"PQputCopyEnd() command for %s table failed: %s\n",
> >> > "urltable_1114", PQerrorMessage(conn));
> >> > conn = NULL;
> >> > }
> >> > PQclear(res);
> >> > res = PQexec(conn, "DISCARD ALL");
> >> >
> >> > if (PQresultStatus(res) != PGRES_COMMAND_OK)
> >> > {
> >> > fprintf(stderr, "DISCARD ALL command failed: %s",
> >> > PQerrorMessage(conn));
> >> > }
> >> > PQclear(res);
> >> > return 0;
> >> > }
> >> >
> >> > void read_csv_and_copy_inio_db( char *csv_file, PGconn *conn)
> >> > {
> >> > char* read_buf = malloc(sizeof(char) * 1024 * 1024 * 20);
> >> > int read_bytes;
> >> >
> >> > /* 1. open file
> >> > * read bufer
> >> > */
> >> > int fd;
> >> > int status;
> >> > fd = open(csv_file, O_RDONLY);
> >> > while (1)
> >> > {
> >> > //seek to byke 0
> >> > status = lseek(fd, 0, SEEK_SET);
> >> > if(status == -1)
> >> > fprintf(stderr, "Failed to seek offset file\n");
> >> > else
> >> > printf("Seeking to 0\n");
> >> > read_bytes = read(fd, read_buf, 1024 * 1024 * 20);
> >> > db_copy_into_db(conn, read_buf, read_bytes);
> >> >
> >> > getc(stdin);
> >> > }
> >> > }
> >> >
> >> > int main()
> >> > {
> >> > char csv_file[1024 * 10];
> >> > char *conn_info = "dbname=my_db user=abhi";
> >> > PGconn *conn = PQconnectdb(conn_info);
> >> > if(PQstatus(conn) != CONNECTION_OK)
> >> > {
> >> > fprintf(stderr, "Connection to database failed: %s\n",
> >> > PQerrorMessage(conn));
> >> > if(conn)
> >> > {
> >> > PQfinish(conn);
> >> > conn = NULL;
> >> > }
> >> > return -1;
> >> > }
> >> > sprintf(csv_file,"/tmp/my_table.csv");
> >> > read_csv_and_copy_inio_db(csv_file, conn);
> >> > }
> >> >
> >>
> >> Can you please provide the table schema and my_table.csv file?
> >>
> >> --
> >> Regards
> >> Junwang Zhao
> >
> >
> >
> > --
> > amittal
>
> I tried to reproduce the bug you reported against postgresql master
> but no luck, I built from source with asan, I didn't observe any leak.
>
> I did not test against 12.16 yet because I got some compiling issues.
>
> I use the following command btw, not sure if this is enough to
> reproduce the bug.
>
> select 'COPY my_table from ''/tmp/my_table.csv'' CSV' FROM
> generate_series(0,10000) \gexec
>
> --
> Regards
> Junwang Zhao
>

--
amittal

In response to

Browse pgsql-bugs by date

  From Date Subject
Next Message Andrei Lepikhov 2024-09-19 16:47:30 Volatile functions under Memoize node
Previous Message Junwang Zhao 2024-09-19 12:31:24 Re: BUG #18624: Memory Leak Issue with PostgreSQL Connection During COPY Command Execution.