From: | PT <wmoran(at)potentialtech(dot)com> |
---|---|
To: | David Rowley <david(dot)rowley(at)2ndquadrant(dot)com> |
Cc: | Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Alexander Farber <alexander(dot)farber(at)gmail(dot)com>, pgsql-general <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: Not sure if I should CREATE INDEX for text columns on which I plan to filter later |
Date: | 2018-02-23 11:02:13 |
Message-ID: | 20180223060213.be2cca4ceddbdc8bc18bfbe5@potentialtech.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
On Fri, 23 Feb 2018 17:10:56 +1300
David Rowley <david(dot)rowley(at)2ndquadrant(dot)com> wrote:
> On 23 February 2018 at 04:00, Bill Moran <wmoran(at)potentialtech(dot)com> wrote:
> > 2) The negative impact of an unused index is tiny compared to the negative
> > impact of a missing index.
>
> I'd say that would entirely depend on the workload of the table and
> the entire cluster. Disk space and I/O to write WAL and index pages to
> is surely a finite resource. Not to mention the possibility of
> disallowing HOT-Updates in the heap.
I feel like you missed my point. You're absolutely right, but the
disagreement was not on whether or not an unused index could cause
problems, but on the likelihood that the OP was going to build the
simulation code to actually determine whether the index is needed
or not. Information from the original question led me to believe
that simulation was either beyond his skill level or beyond his
time allocation; so I provided a less good, but more likely to be
helpful (in my opinion) answer.
The pattern that almost ALL new ventures I've seen follow is:
1) Hack something together based on an idea for a product
2) If the product actually succeeds, experience tons of issues
related to scaling and performance
3) Run around like a crazy person fixing all the scaling and
performance issues
4) Sell the company to someone else who ultimately becomes responsible
for maturing the software
In theory, there's no reason this pattern _has_ to be so prevalent,
yet it is. Probably becuase it appears to minimize the up front cost,
which the people footing the bill just love.
Given that process, "shotgun" indexing is part of step 1 or step 3.
Whereas the building of load simulations and _real_ tuning of the
system is relegated to step 4.
Since investers tend to want to get out quick if #2 isn't going to
happen, they don't want people doing work that they consider part
of step #4.
I'm a pragmatist. I'd love to see everyone build software in a sane,
well-considered manner. I'd also love to see government without
corruption. However, in the world I _actually_ see, those things aren't
prevalent.
> It would seem to me that anyone using the "shotgun" indexing method
> may end up having to learn more about indexing the hard way. Learning
> the hard way is something I like to try to avoid, personally. Probably
> it all has to come down to how important it is that your app actually
> can handle the load vs devs/dba experience level divided by time, both
> of the latter two are also a finite resource. So, it probably all has
> to be carefully balanced and quite possibly a person's opinion
> strongly relates to their experience. If you were in the air traffic
> control business, perhaps your opinion might not be the same!? ...
> Sorry, the aeroplane crashed because the replica instance lagged and
> the plane's location wasn't updated... Analysis shows that the DBA
> indexed every column in the table and the WAL volume was more than the
> network's bandwidth could handle over the holiday period. (Note: I
> know nothing about air traffic control, but it does seem like
> something you'd want to make stable systems for, games on the
> internet, probably less so...).
I really hope that people writing ATC software have the experience to
do really good testing (including load simulation, etc) but the 3 mile
island accident happened partially because of faulty sensor design, so
there's no guarantee that's the case.
Interesting discussion.
--
Bill Moran
From | Date | Subject | |
---|---|---|---|
Next Message | mariusz | 2018-02-23 14:10:00 | Re: Given a set of daterange, finding the continuous range that includes a particular date |
Previous Message | David G. Johnston | 2018-02-23 05:58:48 | Re: Postgres hangs for the query "lock table <Tablename> in exclusive mode" |