Re: Naming-scheme for db-files

From: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
To: Markus Wollny <Markus(dot)Wollny(at)computec(dot)de>
Cc: Martijn van Oosterhout <kleptog(at)svana(dot)org>, <pgsql-general(at)postgresql(dot)org>
Subject: Re: Naming-scheme for db-files
Date: 2002-08-29 10:41:41
Message-ID: Pine.GSO.4.44.0208291335200.8881-100000@ra.sai.msu.su
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

I didn't follow this thread, but I'm a little bit surprised !
txtdix itself is just a parsed data from corresponding column !
and txtidx stores only unique lexems, so size of txtidx column should
be lesser than original one ! I'd be interested to see more details
(if you could send me your data and sql script). Also,
did you use oid2name from contrib to see sizes of tables and indices ?
What is an output of 'select version()' ?
100,000 rows is not a big deal.

regards,

Oleg

On Thu, 29 Aug 2002, Markus Wollny wrote:

> Just a try on using txtidx for a speedy search over the e-mail-fields.
> No success, though - fti would have been better for the task (like
> finding all users who have got an aol.com-adress).
>
> > -----UrsprNngliche Nachricht-----
> > Von: Martijn van Oosterhout [mailto:kleptog(at)svana(dot)org]
> > Gesendet: Donnerstag, 29. August 2002 01:18
> > An: Markus Wollny
> > Cc: pgsql-general(at)postgresql(dot)org
> > Betreff: Re: [GENERAL] Naming-scheme for db-files
> >
> >
> > On Wed, Aug 28, 2002 at 11:25:44PM +0200, Markus Wollny wrote:
> > > After dropping the mentioned column emailindex of type txtidx (via
> > > copying the data to a temporary table and renaming this table
> > > afterwards), disk usage of that table evaporated from
> > 1862MB to a humble
> > > 61MB. And emailindex was completely empty... Very strange,
> > but ~60MB is
> > > very much more more like what I had estimated the size of
> > that table to
> > > be.
> >
> > What on earth is:
> >
> > emailindex txtidx,
> >
> > Anyway?
> >
> > > -----UrsprУМngliche Nachricht-----
> > > Von: Markus Wollny
> > > Gesendet: Mi 28.08.2002 17:04
> > > An: Martijn van Oosterhout
> > > Cc: pgsql-general(at)postgresql(dot)org
> > > Betreff: Re: [GENERAL] Naming-scheme for db-files
> > >
> > >
> > >
> > > Hi!
> > >
> > > Yes, I run vacuum every night - and log-output indicates no
> > > errors at
> > > all. Yet I am indeed quite puzzled about the size of this table.
> > > Is
> > > there some way of finding out which column consumes so much
> > > space?
> > >
> > > select count(*) from ct_com_user return 95858 rows.
> > > The CREATE-statement for this table is as follows:
> > >
> > > CREATE TABLE ct_com_user (
> > > user_id numeric(10, 0),
> > > login varchar(1000),
> > > password varchar(1000),
> > > status numeric(10, 0),
> > > rights varchar(20) DEFAULT 'r',
> > > firstname varchar(1000),
> > > firstname_visible numeric(1, 0) DEFAULT 0,
> > > lastname varchar(1000),
> > > lastname_visible numeric(1, 0) DEFAULT 0,
> > > clan varchar(1000),
> > > clan_visible numeric(1, 0) DEFAULT 0,
> > > street varchar(1000),
> > > street_visible numeric(1, 0) DEFAULT 0,
> > > zipcode varchar(1000),
> > > zipcode_visible numeric(1, 0) DEFAULT 0,
> > > city varchar(1000),
> > > city_visible numeric(1, 0) DEFAULT 0,
> > > country varchar(1000),
> > > country_visible numeric(1, 0) DEFAULT 0,
> > > phone1 varchar(1000),
> > > phone1_visible numeric(1, 0) DEFAULT 0,
> > > phone2 varchar(1000),
> > > phone2_visible numeric(1, 0) DEFAULT 0,
> > > mobile varchar(1000),
> > > mobile_visible numeric(1, 0) DEFAULT 0,
> > > fax varchar(1000),
> > > fax_visible numeric(1, 0) DEFAULT 0,
> > > email1 varchar(1000),
> > > email1_visible numeric(1, 0) DEFAULT 0,
> > > email2 varchar(1000),
> > > email2_visible numeric(1, 0) DEFAULT 0,
> > > icq varchar(1000),
> > > icq_visible numeric(1, 0) DEFAULT 0,
> > > homepage varchar(1000),
> > > homepage_visible numeric(1, 0) DEFAULT 0,
> > > description varchar(1000),
> > > description_visible numeric(1, 0) DEFAULT 0,
> > > hobbies varchar(1000),
> > > hobbies_visible numeric(1, 0) DEFAULT 0,
> > > signature1 varchar(4000),
> > > signature2 varchar(4000),
> > > signature3 varchar(4000),
> > > forum_view varchar(20) DEFAULT 'flat',
> > > temp_password varchar(100),
> > > registered timestamptz,
> > > last_login timestamptz,
> > > created timestamptz DEFAULT ('now'::text)::timestamp(6) with
> > > time
> > > zone,
> > > lines numeric(3, 0) DEFAULT 400,
> > > lines_forum numeric(3, 0) DEFAULT 20,
> > > forum_lines numeric(3, 0) DEFAULT 20,
> > > forum_smileys varchar(50) DEFAULT 'enhanced',
> > > site_id numeric(10, 0) DEFAULT 0,
> > > flag_id numeric(10, 0) DEFAULT 0,
> > > forum_quoting varchar(50) DEFAULT 'enhanced',
> > > forum_flatpostings numeric(3, 0) DEFAULT 10,
> > > forum_images numeric(1, 0) DEFAULT 2,
> > > user_image numeric(1, 0) DEFAULT 0,
> > > user_image_visible numeric(1, 0) DEFAULT 0,
> > > chat_password varchar(50),
> > > chat_last_visit timestamptz,
> > > authorpoints_visible numeric(1, 0) DEFAULT 0,
> > > emailindex txtidx,
> > > CONSTRAINT idx_pk_ct_com_user UNIQUE (user_id),
> > > CONSTRAINT idx_u_ct_com_user_login UNIQUE (login)
> > > ) WITH OIDS;
> > >
> > > As hardly anyone actually uses a signature, hobby or
> > > description, I
> > > cannot imagine how this should amount to almost 2 GB of data
> > > (1862MB) -
> > > this would mean roughly 20k of data per row average - and no way
> > > have we
> > > got this amount of data - the absolute maximum data per row as
> > > defined
> > > would be about 30k, I guess, but that would mean that nearly all
> > > of our
> > > users would use nearly all available space and I know that this
> > > is quite
> > > far from true. I suspect it may have something to do with
> > > emailindex of
> > > the txtidx-type... As it's not absolutely necessary, I'll try
> > > and drop
> > > this column and see what I get...
> > >
> > > Regards,
> > >
> > > Markus
> > >
> > > > -----UrsprУМngliche Nachricht-----
> > > > Von: Martijn van Oosterhout [mailto:kleptog(at)svana(dot)org]
> > > > Gesendet: Mittwoch, 28. August 2002 16:30
> > > > An: Markus Wollny
> > > > Cc: pgsql-general(at)postgresql(dot)org
> > > > Betreff: Re: [GENERAL] Naming-scheme for db-files
> > > >
> > > >
> > > > On Wed, Aug 28, 2002 at 03:54:45PM +0200, Markus Wollny wrote:
> > > > > Hi!
> > > > >
> > > > > As I was just checking disk-usage of a database (PostgreSQL
> > > > 7.2.1), I
> > > > > stumbled over some files named with a trailing .1 added to
> > > the usual
> > > > > oid. Now if a table 'example' with oid 12345 exists, what
> > > > does the file
> > > > > 12345.1 contain exactly? I didn't find anything about
> > > > .1-files in the
> > > > > documentation...
> > > >
> > > > Postgres splits files at 1GB. The .1 file would be the second
> > > > part of the
> > > > file. When that also reaches 1GB, you'll get a .2 file.
> > > >
> > > > You're doing VACUUM [FULL] regularly, right?
> > > >
> > > > --
> > > > Martijn van Oosterhout <kleptog(at)svana(dot)org>
> > > > http://svana.org/kleptog/
> > > > > There are 10 kinds of people in the world, those that can do
> > > binary
> > > > > arithmetic and those that can't.
> > > >
> > >
> > > ---------------------------(end of
> > > broadcast)---------------------------
> > > TIP 2: you can get off all lists at once with the unregister
> > > command
> > > (send "unregister YourEmailAddressHere" to
> > > majordomo(at)postgresql(dot)org)
> > >
> > >
> > >
> > > ---------------------------(end of
> > broadcast)---------------------------
> > > TIP 1: subscribe and unsubscribe commands go to
> > majordomo(at)postgresql(dot)org
> >
> > --
> > Martijn van Oosterhout <kleptog(at)svana(dot)org>
> > http://svana.org/kleptog/
> > > There are 10 kinds of people in the world, those that can do binary
> > > arithmetic and those that can't.
> >
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/users-lounge/docs/faq.html
>

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Martijn van Oosterhout 2002-08-29 10:46:56 Re: Free space mapping (was Re: Multi-Versions and Vacuum)
Previous Message Markus Wollny 2002-08-29 10:20:04 Re: Naming-scheme for db-files