Error inserting RFC1738-encoded URLs

From: Javier Amor garcia <jamor(at)zentyal(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Error inserting RFC1738-encoded URLs
Date: 2011-10-24 07:27:36
Message-ID: 4EA51368.7000202@zentyal.com
Lists: pgsql-general

Hello,
Sometimes I get encoding errors when inserting an encoded URL into a
text field.

The database uses UTF8 encoding, with both collation and ctype set to
en_US.UTF-8, and the URL field itself is defined as VARCHAR(1024). If
a URL is longer than 1024 characters, the software truncates it.
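
The truncation code itself is not shown in this message. A minimal sketch of one way to truncate safely, assuming the URL is already a decoded Python string (the names truncate_url and MAX_URL_LENGTH are hypothetical):

```python
import re

MAX_URL_LENGTH = 1024  # assumed to match the VARCHAR(1024) column

def truncate_url(url: str, limit: int = MAX_URL_LENGTH) -> str:
    """Truncate a decoded URL string to at most `limit` characters.

    Slicing a Python str counts code points, so a multi-byte UTF-8
    character is never split in half. A partial percent-escape
    ("%" or "%2") left at the end by the cut is dropped as well.
    """
    if len(url) <= limit:
        return url
    cut = url[:limit]
    # Remove an incomplete "%XX" escape at the very end, if any.
    return re.sub(r"%[0-9A-Fa-f]?\Z", "", cut)
```

Truncating raw bytes instead of characters could cut a multi-byte sequence and leave invalid UTF-8 at the end of the value, which is why the sketch works on the decoded string.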

The inserted URLs are extracted from the Squid proxy log file, which is
encoded in UTF8.
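
Whether a given log line really is valid UTF-8 can be checked before the insert. A minimal sketch, assuming the line is read from the log as raw bytes (the helper name is_valid_utf8 is hypothetical):

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Return True if `raw` decodes cleanly as UTF-8, False otherwise."""
    try:
        raw.decode("utf-8")  # strict decoding raises on invalid sequences
    except UnicodeDecodeError:
        return False
    return True
```

For example, the bytes b"ESPA\xc3\x91A" pass the check, while b"ESPA\xd1A" (the same text in a single-byte encoding such as Latin-1) does not.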

The URL is encoded with RFC 1738 encoding of all non-ASCII characters in
the path and query sections, and Punycode encoding of characters in the
host authority section.
RFC 1738 -> http://www.ietf.org/rfc/rfc1738.txt
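
As a rough illustration of the percent-encoding described above (not Squid's actual code), the sketch below escapes each non-ASCII character as its UTF-8 bytes in %XX form. The function name escape_non_ascii is hypothetical, and the sketch assumes the host part is already ASCII (for example, Punycode) and that UTF-8 is the intended byte encoding:

```python
import re
from urllib.parse import quote

def escape_non_ascii(url: str) -> str:
    """Percent-encode every non-ASCII character as its UTF-8 bytes.

    ASCII characters, including existing %XX escapes, are left untouched.
    """
    return re.sub(
        r"[^\x00-\x7F]+",                      # runs of non-ASCII characters
        lambda m: quote(m.group(0), safe=""),  # UTF-8 encode, then %XX-escape
        url,
    )

# Fragment taken from one of the example URLs in this message.
print(escape_non_ascii("search=VUELTA%20A%20ESPAÑA"))
# prints: search=VUELTA%20A%20ESPA%C3%91A
```

A string processed this way is pure ASCII, so it is valid in any server encoding.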

Examples of URLs that raise the error:

http://www.formacion.aimplas.es/_Documentos/2011/FORMACIÓN%20ABIERTA/Folleto%20Especialistas%20Universitarios%20Polímeros%20ok.pdf

http://ads.prisacom.com/RealMedia/ads/adstream_mjx.ads/www.elpais.es/edicionimpresa/deportes/articulos/1452867580(at)Middle,Middle1,Top,Top2,TopRight,x02,x20?search=VUELTA%20A%20ESPAÑA,Ciclismo,Deportes

http://ads.prisacom.com/RealMedia/ads/adstream_nx.ads/www.elpais.es/edicionimpresa/deportes/articulos/1452867580(at)Middle,Middle1,Top,Top2,TopRight,x02,x20!Middle?search=VUELTA%20A%20ESPAÑA,Ciclismo,Deportes

http://www.t-a-o.com/ES/moda-bebe-nino/pantalón/flash/zoom.swf?image_lien=52905_C1057_A_zoom.jpg&lang=ES

http://static.slidesharecdn.com/swf/menu.swf?embedCode=<div%20style="width:425px"%20id="__ss_1320169">%20<strong%20style="display:block;margin:12px%200%204px"><a%20href="http://www.slideshare.net/raimonesteve/que-es-openerp"%20title="¿Que%20es%20Openerp?"%20target="_blank">¿Que%20es%20Openerp?</a></strong>%20<iframe%20src="http://www.slideshare.net/slideshow/embed_code/1320169"%20width="425"%20height="355"%20frameborder="0"%20marginwidth="0"%20marginheight="0"%20scrolling="no"></iframe>%20<div%20style="padding:5px%200%2012px">%20View%20more%20<a%20href="http://www.slideshare.net/"%20target="_blank">presentations</a>%20from%20<a%20href="http://www.slideshare.net/raimonesteve"%20target="_blank">raimonesteve</a>%20</div>%20</div>&showID=1320169&showURL=http://www.slideshare.net/raimonesteve/que-es-openerp

---------------- End URL examples --------------------------------

Does anyone know what I must do to be able to safely insert any HTTP URL?

Thanks for your time,
Javier
