Re: Extracting hostname from URI column

From: ogjunk-pgjedan(at)yahoo(dot)com
To: pgsql-sql(at)postgresql(dot)org
Subject: Re: Extracting hostname from URI column
Date: 2007-09-12 03:58:16
Message-ID: 247444.36947.qm@web50311.mail.re2.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

Ah, I figured out what to look for and found my uniq -c solution:

select substring( href from '.*://([^/]*)' ) as hostname, count(substring( href from '.*://([^/]*)' )) from url where id<10 group by hostname order by count desc;
hostname | count
--------------------------+-------
texturizer.net | 2
www.google.com | 2
dictionary.reference.com | 1
www.mozillazine.org | 1
devedge.netscape.com | 1
groups.google.com | 1
forums.mozillazine.org | 1

Thanks for the quick help with substring func, people!

Otis

----- Original Message ----
From: "ogjunk-pgjedan(at)yahoo(dot)com" <ogjunk-pgjedan(at)yahoo(dot)com>
To: pgsql-sql(at)postgresql(dot)org
Sent: Tuesday, September 11, 2007 11:16:15 PM
Subject: Re: [SQL] Extracting hostname from URI column

Hi,

Thanks, perfect! (though I'll have to look into the regex warning):

=> select substring( href from '.*://\([^/]*)' ) as hostname from url where id<10;
WARNING: nonstandard use of escape in a string literal at character 29
HINT: Use the escape string syntax for escapes, e.g., E'\r\n'.

So now I have this:
hostname
--------------------------
texturizer.net
texturizer.net
forums.mozillazine.org
www.mozillazine.org
devedge.netscape.com
www.google.com
groups.google.com
www.google.com
dictionary.reference.com

And what I'd like is something that would give me the counts for the number of occurrences of each unique hostname. Something much like `uniq -c'. Can anyone tell me how that's done or where I should look for info? (I'm not sure what to look for, that's the problem).

Thanks,
Otis

----- Original Message ----
From: chester c young <chestercyoung(at)yahoo(dot)com>
To: ogjunk-pgjedan(at)yahoo(dot)com
Cc: sql pgsql <pgsql-sql(at)postgresql(dot)org>
Sent: Tuesday, September 11, 2007 8:42:46 PM
Subject: Re: [SQL] Extracting hostname from URI column

> I'm trying to use substr() and position() functions to extract the
> full host name (and later a domain) from a column that holds URLs.

substring( href from '.*://\([^/]*)' );


____________________________________________________________________________________
Pinpoint customers who are looking for what you sell.
http://searchmarketing.yahoo.com/

---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

Browse pgsql-sql by date

  From Date Subject
Next Message Paul Lambert 2007-09-12 04:03:07 Re: Extracting hostname from URI column
Previous Message chester c young 2007-09-12 03:56:19 Re: Extracting hostname from URI column