Quick Links

A speed comparison with sqlite

From:	Shane Ambler <pgsql(at)Sheeky(dot)Biz>
To:	PostgreSQL Advocacy <pgsql-advocacy(at)postgresql(dot)org>
Subject:	A speed comparison with sqlite
Date:	2008-01-18 05:55:45
Message-ID:	47903F61.8000904@Sheeky.Biz
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-advocacy

Just thought I would share some rough numbers here.
A bit of an unusual edge case but a big time difference...

A guy using RealBasic (GUI development IDE) which uses sqlite as it's
inbuilt db engine wanted to generate a series of unique codes and
decided sql will help (wanting 30 million codes starts to rule out ram
based solutions)

This is a 7 character alphanumeric code.

His program generated 30M codes in about 15 minutes into the sqlite db
file without any duplicate checks.

Select distinct(pincode) from codes; returned the results (that is to
his client not out to file) after 22 hours and he was after a faster
solution.

Using a unique index to check as he went, inserts dropped from 18,000
per second to about 200 a second after 8 hours (without completing)

(the following times are taken from psql's timing option)

With a P4 3Ghz - 80GB IDE drive - running XP pro - pg 8.2.6 -

Using postgresql (with psql not Realbasic) I created the table and
inserted 30M codes in 5.9 mins

select count(distinct(pincode)) from codes; took about 7.2 minutes
(29,993,182 unique codes of the 30M created)

setting output to a text file - select distinct(pincode) from codes took
18.45 minutes to send them all out (283MB file).

The same hardware running FreeBSD 7.0RC1 GENERIC Kernel -

create the table and insert 30M codes in 5.6 mins

select count(distinct(pincode)) from codes; took 5.9 minutes
(29,993,279 unique codes of the 30M created)

setting output to a text file - select distinct(pincode) from codes took
9.7 minutes to send them all out to file.

It would appear that the method of locating distinct results is the key
here - postgresql sorts the distinct column allowing it to identify a
duplicate or new value quicker - sqlite returns the distinct columns in
the order they are inserted - so it must be scanning all previous
entries in the output to ensure distinct.

Shane Ambler
pgSQL (at) Sheeky (dot) Biz

Get Sheeky @ http://Sheeky.Biz

Responses

Re: A speed comparison with sqlite at 2008-01-24 19:56:26 from Shane Ambler

Browse pgsql-advocacy by date

	From	Date	Subject
Next Message	rotellaro	2008-01-18 08:29:42	Re: developer.com: PostgreSQL is the "Database of the year"
Previous Message	Shane Ambler	2008-01-18 05:30:17	Re: developer.com: PostgreSQL is the "Database of the year"