From: | Scott Marlowe <smarlowe(at)g2switchworks(dot)com> |
---|---|
To: | George Nychis <gnychis(at)cmu(dot)edu> |
Cc: | pgsql general <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: insert only unique values in to a table, ignore rest? |
Date: | 2007-01-08 22:03:53 |
Message-ID: | 1168293832.20602.148.camel@state.g2switchworks.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
On Mon, 2007-01-08 at 15:59, George Nychis wrote:
> Scott Marlowe wrote:
> > On Mon, 2007-01-08 at 15:52, George Nychis wrote:
> >> Scott Marlowe wrote:
> >>> On Mon, 2007-01-08 at 14:58, George Nychis wrote:
> >>>> Hi,
> >>>>
> >>>> I have approximately 2 billion data entries that I would like to insert into a database.
> >>>> Each entry consists of:
> >>>> INT BOOLEAN INT BOOLEAN
> >>>>
SNIP
> >>> Assuming you're loading into an empty table, the load to temp, select
> >>> distinct out and into the final table seems reasonable, should run
> >>> reasonably fast. If you need to load to an existing table, it might get
> >>> a little more complex.
> >>>
> >> The goal is not to run queries while the data is being inserted....I am wondering if the
> >> postgresql method I have mentioned to actually insert and get only distinct values is most
> >> optimal, which would produce the same results method I explained in mysql.
> >
> > Did I fail to answer your question?
> >
> > Sorry if I gave you more information than you needed. Please feel free
> > to ask someone else next time.
> >
>
> ahhh i missed your last paragraph... so much text. Actually yeah that answers my
> question, thank you. I guess its more a single run through than the mysql method which
> was piece-wise.
Note that things will go faster if you do your initial data load using
"copy from stdin" for the initial bulk data load. individual inserts in
postgresql are quite costly compared to mysql. It's the transactional
overhead. by grouping them together you can make things much faster.
copy from stdin does all the inserts in one big transaction.
If you use insert statements, wrap them in a begin; end; pair to make
them be one transaction. not as fast as copy, due to parsing, but still
much faster than individual transactions.
From | Date | Subject | |
---|---|---|---|
Next Message | Jeremy Haile | 2007-01-08 22:09:34 | Re: insert only unique values in to a table, ignore rest? |
Previous Message | George Nychis | 2007-01-08 21:59:21 | Re: insert only unique values in to a table, ignore rest? |