Text searching HTML

From: "Campbell, Lance" <lance(at)illinois(dot)edu>
To: "pgsql-sql(at)postgresql(dot)org" <pgsql-sql(at)postgresql(dot)org>
Subject: Text searching HTML
Date: 2014-11-03 17:15:58
Message-ID: B75CD08C73BD3543B97E4EF3964B7D701FC741C9@CITESMBX1.ad.uillinois.edu
Lists: pgsql-sql

PostgreSQL 9.3
Is there a preferred way to search text within an HTML document? I have been reading up on searching via to_tsvector. You can pass to_tsvector two parameters: the first appears to be a dictionary and the second is the text. Is there by chance an English HTML dictionary, so that HTML tags and attributes would be ignored?

If not, then what would be the suggested dictionary: simple, English, or something else?
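
One way to make the question concrete is to strip the markup on the client side before the text ever reaches to_tsvector, so the choice of first argument only has to deal with plain words. The following is a minimal sketch of that idea in Python using the standard library's html.parser. The names TextExtractor, html_to_text, and SEARCH_SQL are hypothetical and chosen for illustration; the SQL string is only an assumed shape for a query and is not executed here. Note that in the two-argument form of to_tsvector, the first argument is the name of a text search configuration such as 'english'.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects character data and drops tags, attributes, and comments."""

    # Contents of these elements are code or styling, not searchable text.
    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.parts.append(data)


def html_to_text(html):
    """Return the visible text of an HTML fragment as one normalized string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    # Join pieces with a space so adjacent block elements do not run together,
    # then collapse every run of whitespace into a single space.
    return " ".join(" ".join(parser.parts).split())


# Hypothetical query shape (an assumption, not run here): the cleaned text
# would be bound as the first parameter and the search terms as the second.
SEARCH_SQL = "SELECT to_tsvector('english', %s) @@ to_tsquery('english', %s)"

sample = '<p class="intro">Searching <b>HTML</b> documents</p><script>var x = 1;</script>'
print(html_to_text(sample))
```

Joining the pieces with a space is a deliberate trade-off: it keeps words in neighbouring elements apart, at the cost of splitting a word that has inline markup in the middle of it.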

Thanks for your assistance.

Thanks,

Lance Campbell <http://illinois.edu/person/lance>
Software Architect
Web Services at Public Affairs
217-333-0382
