Re: strange sql behavior

From: Yu Nie <nieyu04(at)gmail(dot)com>
To: Melvin Davidson <melvin6925(at)gmail(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: strange sql behavior
Date: 2016-02-02 03:47:50
Message-ID: CAPQGz4Lnk4Hfqvx-rhvXLDJDqmnkbssJG9X2sD0OCd+P0eYSfw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi Melvin,

Many thanks for your help. I just deleted the small table,along with
another large unused table (with about 1.2 billion rows) in the same
database, It frees up about 100 Gb space on the disk. I am reloading the
small table again and it will take a few more hours to complete. Once it
is done, I will let you know if the problem persists. I will then report
the info at your suggestion. By the way, I also did not disk
defragmentation after deleting the tables, not sure it helps...

Best, Marco

On Mon, Feb 1, 2016 at 8:50 PM, Melvin Davidson <melvin6925(at)gmail(dot)com>
wrote:

> OK, one more thing to try.
>
> Please send the output of the following query to a file and attach so I
> can review.
>
> SELECT c.relname as idxname,
> i.indisclustered,
> i.indisvalid,
> i.indcollation
> FROM pg_index i
> JOIN pg_class c ON i.indexrelid = c.oid
> WHERE i.indrelid IN (SELECT oid
> FROM pg_class
> WHERE relkind = 'r'
> AND relname = 'data2011_01'
> OR relname = 'data2013_01w'
> )
> ORDER BY 1;
>
>
> On Mon, Feb 1, 2016 at 5:51 PM, Yu Nie <nieyu04(at)gmail(dot)com> wrote:
>
>> Yeah, it is really puzzling. :(
>>
>> No, I have never attempted to cluster the large table - as I mentioned
>> before the two tables were created almost exactly the same way (through a
>> c++ api), except with different data files. I only use theme for query, and
>> had never done any update/insert/delete since the creation.
>>
>>
>>
>> On Mon, Feb 1, 2016 at 4:41 PM, Melvin Davidson <melvin6925(at)gmail(dot)com>
>> wrote:
>>
>>> At this point I can see no possible explanation why this would occur.
>>>
>>> Just a thought though, is it possible that data2011_01 was clustered on
>>> the index at some point but data2013_01w has not been clustered?
>>> If you cluster data2013_01w on the index, does the performance change?
>>>
>>> On Mon, Feb 1, 2016 at 4:03 PM, Yu Nie <nieyu04(at)gmail(dot)com> wrote:
>>>
>>>> Melvin,
>>>>
>>>> Please see attached for the requests results. I ran two queries (each
>>>> with a different taxiid that is next to each other) for each table. Note
>>>> that for the large table one is much faster than the other because the
>>>> shared cache was used for the second query. This does not work however for
>>>> the small table.
>>>>
>>>> Many thanks for your willingness to help!
>>>>
>>>> Best, Marco
>>>>
>>>> On Mon, Feb 1, 2016 at 2:25 PM, melvin6925 <melvin6925(at)gmail(dot)com>
>>>> wrote:
>>>>
>>>>> Fine. Please rerun both explains and my queries and send ALL output to
>>>>> a file (F8 in PGADMIN Sql). Then attach the file.
>>>>>
>>>>>
>>>>>
>>>>> Sent via the Samsung Galaxy S® 6, an AT&T 4G LTE smartphone
>>>>> -------- Original message --------
>>>>> From: Yu Nie <nieyu04(at)gmail(dot)com>
>>>>> Date: 2/1/2016 15:17 (GMT-05:00)
>>>>> To: melvin6925 <melvin6925(at)gmail(dot)com>
>>>>> Subject: Re: [GENERAL] strange sql behavior
>>>>>
>>>>> Yes, absolutely.
>>>>>
>>>>> On Mon, Feb 1, 2016 at 2:12 PM, melvin6925 <melvin6925(at)gmail(dot)com>
>>>>> wrote:
>>>>>
>>>>>> And were _your_ queries run on the same day at the same time within a
>>>>>> few seconds of each other?
>>>>>>
>>>>>> Sent via the Samsung Galaxy S® 6, an AT&T 4G LTE smartphone
>>>>>> -------- Original message --------
>>>>>> From: Yu Nie <nieyu04(at)gmail(dot)com>
>>>>>> Date: 2/1/2016 15:01 (GMT-05:00)
>>>>>> To: melvin6925 <melvin6925(at)gmail(dot)com>
>>>>>> Cc: Bill Moran <wmoran(at)potentialtech(dot)com>,
>>>>>> pgsql-general(at)postgresql(dot)org
>>>>>> Subject: Re: [GENERAL] strange sql behavior
>>>>>>
>>>>>> Thanks, Bill and Melvin!
>>>>>>
>>>>>> Just some quick note/answers before I absorb all the information
>>>>>> provided by Bill.
>>>>>>
>>>>>> 1. I don't expect many users running queries against the tables,
>>>>>> especially for the small table - since I just created it this morning, and
>>>>>> nobody know about it except myself.
>>>>>>
>>>>>> 2. The setting in the config:
>>>>>>
>>>>>>
>>>>>> shared_buffers = 512MB # min 128kB
>>>>>> work_mem = 128MB # min 64kB
>>>>>>
>>>>>> 3. I am running a Windows 7 with 24 GB RAM. and my postgresal is
>>>>>> 9.4.
>>>>>>
>>>>>> 4. here is the query I ran:
>>>>>>
>>>>>> SELECT n.nspname,
>>>>>> s.relname,
>>>>>> c.reltuples::bigint,
>>>>>> n_tup_ins,
>>>>>> n_tup_upd,
>>>>>> n_tup_del,
>>>>>> date_trunc('second', last_vacuum) as last_vacuum,
>>>>>> date_trunc('second', last_autovacuum) as last_autovacuum,
>>>>>> date_trunc('second', last_analyze) as last_analyze,
>>>>>> date_trunc('second', last_autoanalyze) as last_autoanalyze
>>>>>> ,
>>>>>> round( current_setting('autovacuum_vacuum_threshold')::integer
>>>>>> + current_setting('autovacuum_vacuum_scale_factor')::numeric * C.reltuples)
>>>>>> AS av_threshold
>>>>>> FROM pg_stat_all_tables s
>>>>>> JOIN pg_class c ON c.oid = s.relid
>>>>>> JOIN pg_namespace n ON (n.oid = c.relnamespace)
>>>>>> WHERE s.relname NOT LIKE 'pg_%'
>>>>>> AND s.relname NOT LIKE 'sql_%'
>>>>>> AND s.relname IN ('data2013_01w', 'data2011_01')
>>>>>> ORDER by 1, 2;
>>>>>>
>>>>>> I copied the result from PGAdmin directly, here it is again.:
>>>>>>
>>>>>> public;"data2011_01";784236864;784236885;0;0;"";"";"";"2016-01-19
>>>>>> 17:31:08-06";156847423
>>>>>> public;"data2013_01w";300786432;300786444;0;0;"";"";"2016-02-01
>>>>>> 08:57:24-06";"2016-02-01 04:01:04-06";60157336
>>>>>>
>>>>>> On Mon, Feb 1, 2016 at 1:45 PM, melvin6925 <melvin6925(at)gmail(dot)com>
>>>>>> wrote:
>>>>>>
>>>>>>> Thanks Bill.
>>>>>>> Also, it's very important to include the headers with the queries!
>>>>>>>
>>>>>>> Marco,
>>>>>>> There is no top secret information that is requested, so please do
>>>>>>> not edit the output.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Sent via the Samsung Galaxy S® 6, an AT&T 4G LTE smartphone
>>>>>>> -------- Original message --------
>>>>>>> From: Bill Moran <wmoran(at)potentialtech(dot)com>
>>>>>>> Date: 2/1/2016 14:41 (GMT-05:00)
>>>>>>> To: Yu Nie <nieyu04(at)gmail(dot)com>
>>>>>>> Cc: Melvin Davidson <melvin6925(at)gmail(dot)com>,
>>>>>>> pgsql-general(at)postgresql(dot)org
>>>>>>> Subject: Re: [GENERAL] strange sql behavior
>>>>>>>
>>>>>>>
>>>>>>> Came a little late to the thread, see many comments inline below:
>>>>>>>
>>>>>>> On Mon, 1 Feb 2016 13:16:13 -0600
>>>>>>> Yu Nie <nieyu04(at)gmail(dot)com> wrote:
>>>>>>>
>>>>>>> > Thanks a lot for your reply. I ran the query you suggested and
>>>>>>> here are
>>>>>>> > the results
>>>>>>> >
>>>>>>> > Large table:
>>>>>>> "public";"data2011_01";784236885;0;0;"";"";"";"2016-01-19
>>>>>>> > 17:31:08-06";156847423
>>>>>>> > Small table:
>>>>>>> "public";"data2013_01w";300786444;0;0;"";"";"2016-02-01
>>>>>>> > 08:57:24-06";"2016-02-01 04:01:04-06";60157336
>>>>>>>
>>>>>>> You didn't do Melvin's query correctly. He specified 11 columns to
>>>>>>> select, but you only show 10. Since you don't show the query you
>>>>>>> actually ran, we don't know which of the numeric columns is missing.
>>>>>>>
>>>>>>> More information inline below:
>>>>>>>
>>>>>>> >
>>>>>>> > On Mon, Feb 1, 2016 at 1:00 PM, Melvin Davidson <
>>>>>>> melvin6925(at)gmail(dot)com>
>>>>>>> > wrote:
>>>>>>> >
>>>>>>> > > One thing to look at is the last time both tables were
>>>>>>> vacuumed/analyzed.
>>>>>>> > >
>>>>>>> > > SELECT n.nspname,
>>>>>>> > > s.relname,
>>>>>>> > > c.reltuples::bigint,
>>>>>>> > > n_tup_ins,
>>>>>>> > > n_tup_upd,
>>>>>>> > > n_tup_del,
>>>>>>> > > date_trunc('second', last_vacuum) as last_vacuum,
>>>>>>> > > date_trunc('second', last_autovacuum) as last_autovacuum,
>>>>>>> > > date_trunc('second', last_analyze) as last_analyze,
>>>>>>> > > date_trunc('second', last_autoanalyze) as last_autoanalyze
>>>>>>> > > ,
>>>>>>> > > round(
>>>>>>> current_setting('autovacuum_vacuum_threshold')::integer +
>>>>>>> > > current_setting('autovacuum_vacuum_scale_factor')::numeric *
>>>>>>> C.reltuples)
>>>>>>> > > AS av_threshold
>>>>>>> > > FROM pg_stat_all_tables s
>>>>>>> > > JOIN pg_class c ON c.oid = s.relid
>>>>>>> > > JOIN pg_namespace n ON (n.oid = c.relnamespace)
>>>>>>> > > WHERE s.relname NOT LIKE 'pg_%'
>>>>>>> > > AND s.relname NOT LIKE 'sql_%'
>>>>>>> > > AND s.relname IN ' "your_small_table", "your_large_table"'
>>>>>>> > > ORDER by 1, 2;
>>>>>>> > >
>>>>>>> > >
>>>>>>> > > Also, please confirm the indexes for both tables are using the
>>>>>>> same method
>>>>>>> > > (btree?).
>>>>>>> > >
>>>>>>> > > On Mon, Feb 1, 2016 at 1:35 PM, Yu Nie <nieyu04(at)gmail(dot)com>
>>>>>>> wrote:
>>>>>>> > >
>>>>>>> > >> Hi there,
>>>>>>> > >>
>>>>>>> > >> Recently I am working with a large amount of taxis GIS data and
>>>>>>> had
>>>>>>> > >> encountered some weird performance issues. I am hoping someone
>>>>>>> in this
>>>>>>> > >> community can help me figure it out.
>>>>>>> > >>
>>>>>>> > >> The taxi data were loaded in 5 minute block into a table. I
>>>>>>> have two
>>>>>>> > >> separate such tables, one stores a month of data with about 700
>>>>>>> million
>>>>>>> > >> rows, another stores about 10 days of data with about 300
>>>>>>> million rows.
>>>>>>> > >> The two tables have the exactly same schema and indexes. There
>>>>>>> are two
>>>>>>> > >> indexes: one on taxiid (text), and the other on the time stamp
>>>>>>> (date
>>>>>>> > >> time). In order to process the data, I need to get all points
>>>>>>> for a single
>>>>>>> > >> taxis; to do that, I use something like:
>>>>>>> > >> select * from table1 where taxiid = 'SZB00S41' order by time;
>>>>>>> > >> What puzzled me greatly is that this query runs consistently
>>>>>>> much faster
>>>>>>> > >> for the large table than for the small table, which seems to
>>>>>>> contradict
>>>>>>> > >> with intuition. At the end of message you may find explain
>>>>>>> (analyze
>>>>>>> > >> buffer) results of two particular queries for the same taxiid
>>>>>>> (one for each
>>>>>>> > >> table). You can see that it took much longer (more than 20
>>>>>>> times) to get
>>>>>>> > >> 20k rows from the small table than to get 44 k rows from the
>>>>>>> large table.
>>>>>>> > >> Interestingly it seems that the planner does expect about 1/3
>>>>>>> work for the
>>>>>>> > >> small table query - yet for some reason, it took much longer to
>>>>>>> fetch the
>>>>>>> > >> rows from the small table. Why there is such a huge
>>>>>>> performance between
>>>>>>> > >> the two seemingly identical queries executed on two different
>>>>>>> tables?
>>>>>>> > >>
>>>>>>> > >> Is is because the data on the second table is on some
>>>>>>> mysteriously
>>>>>>> > >> "broken part" of the disk? what else could explain such a
>>>>>>> bizarre
>>>>>>> > >> behavior? Your help is greatly appreciated.
>>>>>>> > >>
>>>>>>> > >> The above behavior is consistent through all queries. Another
>>>>>>> issue I
>>>>>>> > >> identified is that for the large table, the query can use the
>>>>>>> shared buffer
>>>>>>> > >> more effectively. For example, after you query one taxiid and
>>>>>>> immediately
>>>>>>> > >> following that query run the same query for another taxi whose
>>>>>>> id ranks
>>>>>>> > >> right behind the first id, then shared hit buffers would be
>>>>>>> quite high (and
>>>>>>> > >> the query would run much faster); this however never works for
>>>>>>> the small
>>>>>>> > >> table.
>>>>>>> > >>
>>>>>>> > >> Thanks a lot!
>>>>>>> > >>
>>>>>>> > >> Best, Marco
>>>>>>> > >>
>>>>>>> > >>
>>>>>>> > >> Results for the small table: it took 141 seconds to finish.
>>>>>>> The planning
>>>>>>> > >> time is 85256.31
>>>>>>> > >>
>>>>>>> > >> "Sort (cost=85201.05..85256.31 rows=22101 width=55) (actual
>>>>>>> > >> time=141419.499..141420.025 rows=20288 loops=1)"
>>>>>>> > >> " Sort Key: "time""
>>>>>>> > >> " Sort Method: quicksort Memory: 3622kB"
>>>>>>> > >> " Buffers: shared hit=92 read=19816"
>>>>>>> > >> " -> Bitmap Heap Scan on data2013_01w (cost=515.86..83606.27
>>>>>>> > >> rows=22101 width=55) (actual time=50.762..141374.777 rows=20288
>>>>>>> loops=1)"
>>>>>>> > >> " Recheck Cond: ((taxiid)::text = 'SZB00S41'::text)"
>>>>>>> > >> " Heap Blocks: exact=19826"
>>>>>>> > >> " Buffers: shared hit=92 read=19816"
>>>>>>>
>>>>>>> ^^
>>>>>>> Note that despite this table being smaller, Postgres had to read
>>>>>>> 19816
>>>>>>> blocks from disk, which is 2.5x more than the larger table.
>>>>>>>
>>>>>>> > >> " -> Bitmap Index Scan on data2013_01w_ixtaxiid
>>>>>>> > >> (cost=0.00..510.33 rows=22101 width=0) (actual
>>>>>>> time=26.053..26.053
>>>>>>> > >> rows=20288 loops=1)"
>>>>>>> > >> " Index Cond: ((taxiid)::text = 'SZB00S41'::text)"
>>>>>>> > >> " Buffers: shared hit=4 read=78"
>>>>>>> > >> "Planning time: 0.144 ms"
>>>>>>> > >> "Execution time: 141421.154 ms"
>>>>>>> > >>
>>>>>>> > >> Results for the large table: it took 5 seconds to finish. The
>>>>>>> planning
>>>>>>> > >> time is 252077.10
>>>>>>> > >> "Sort (cost=251913.32..252077.10 rows=65512 width=55) (actual
>>>>>>> > >> time=5038.571..5039.765 rows=44204 loops=1)"
>>>>>>> > >> " Sort Key: "time""
>>>>>>> > >> " Sort Method: quicksort Memory: 7753kB"
>>>>>>> > >> " Buffers: shared hit=2 read=7543"
>>>>>>> > >> " -> Bitmap Heap Scan on data2011_01 (cost=1520.29..246672.53
>>>>>>> > >> rows=65512 width=55) (actual time=36.935..5017.463 rows=44204
>>>>>>> loops=1)"
>>>>>>> > >> " Recheck Cond: ((taxiid)::text = 'SZB00S41'::text)"
>>>>>>> > >> " Heap Blocks: exact=7372"
>>>>>>> > >> " Buffers: shared hit=2 read=7543"
>>>>>>> > >> " -> Bitmap Index Scan on data2011_01_ixtaxiid
>>>>>>> > >> (cost=0.00..1503.92 rows=65512 width=0) (actual
>>>>>>> time=35.792..35.792
>>>>>>> > >> rows=44204 loops=1)"
>>>>>>>
>>>>>>> ^^
>>>>>>> Note that the larger table took LONGER to do the index work than the
>>>>>>> smaller table, which probably means there's nothing wrong with your
>>>>>>> disks or anything ... that's the behavior I would expect.
>>>>>>>
>>>>>>> > >> " Index Cond: ((taxiid)::text = 'SZB00S41'::text)"
>>>>>>> > >> " Buffers: shared hit=2 read=171"
>>>>>>> > >> "Planning time: 0.127 ms"
>>>>>>> > >> "Execution time: 5042.134 ms"
>>>>>>>
>>>>>>> So, what I'm seeing, is that Postgres is able to figure out _which_
>>>>>>> rows
>>>>>>> to fetch faster on the small table than the large table, which is
>>>>>>> what
>>>>>>> you would expect, since a smaller index should be faster than a
>>>>>>> large one.
>>>>>>>
>>>>>>> However, when it goes to actually fetch the row data, it takes
>>>>>>> significantly longer on the small table, despite the fact that it's
>>>>>>> only fetching 1/3 as many rows. It is, however, doing 2.5x as many
>>>>>>> disk reads to get those rows: For the large table, it reads 61MB from
>>>>>>> disk, but it reads 160MB to get all the data for the smaller table.
>>>>>>>
>>>>>>> How the data was inserted into each table could lead to similar data
>>>>>>> being clustered on common pages on the large table, while it's spread
>>>>>>> across many more pages on the small table.
>>>>>>>
>>>>>>> That still doesn't explain it all, though. 2.5x the disk
>>>>>>> activity normally wouldn't equate to 28x the time required. Unless
>>>>>>> you're disks are horrifically slow? Does this server have a lot of
>>>>>>> other activity against the disks? I.e. are other people running
>>>>>>> queries that you would have to contend with, or is the server a VM
>>>>>>> sharing storage with other VMs, or even a combined use server that
>>>>>>> has to share disk access with (for example) a web or mail server
>>>>>>> as well? Is the performance difference consistently ~28x?
>>>>>>>
>>>>>>> Other things: what is shared_buffers set to? The queries would seem
>>>>>>> to indicate that this server has less than 1M of those two tables
>>>>>>> cached in memory at the time you ran those queries, which seems to
>>>>>>> suggest that either you've got shared_buffers set very low, or that
>>>>>>> there are a lot of other tables that other queries are accessing at
>>>>>>> the time you're running these. Perhaps installing pg_buffercache to
>>>>>>> have a look at what's using your shared_buffers would be helpful.
>>>>>>>
>>>>>>> --
>>>>>>> Bill Moran
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>>
>>> --
>>> *Melvin Davidson*
>>> I reserve the right to fantasize. Whether or not you
>>> wish to share my fantasy is entirely up to you.
>>>
>>
>>
>
>
> --
> *Melvin Davidson*
> I reserve the right to fantasize. Whether or not you
> wish to share my fantasy is entirely up to you.
>

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Heine Ferreira 2016-02-02 08:01:40 Postgres 9.5 and windows and desktop heap?
Previous Message drum.lucas@gmail.com 2016-02-02 01:02:49 Re: pg_dump - ERROR - PostgreSQL 9.2