From: | Jan-Erik <jan-erik(dot)larka(at)os2world(dot)com> |
---|---|
To: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: split string by special characters |
Date: | 2009-07-29 05:06:46 |
Message-ID: | 1c319957-477d-43d7-8041-498479ef36b4@d4g2000yqa.googlegroups.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Current design of function
---text_to_array( sen text, col collection )---
DECLARE
delimiter_pos int[] := '{}';
return_val text[] := '{}';
delimiters_chr RECORD;
remaining_text text;
delimiter_curr text;
i int := 0;
j int := 0;
prev_pos int := 1;
curr_pos int := 0;
delimiter_len int;
BEGIN
FOR delimiters_chr IN SELECT upper( chars.chars ) AS chars FROM
chars, word WHERE chars.chars_id = word.chars_id AND
word.classification_id = clss.classification_id LOOP
delimiter_len := character_length( delimiters_chr.chars ); --
Length of the current delimiter we're looking at
IF prev_pos > 0 THEN -- Don't add text if it hasn't been altered
(previous delimiter not found)
remaining_text := upper( sen ); -- Temporary copy of text to
search through
END IF;
prev_pos := 0;
curr_pos := position( delimiters_chr.chars IN sen ); -- Delimiter
position in text (sentence)
WHILE curr_pos > 0 LOOP -- Found delimiter in text
delimiter_pos := array_append( delimiter_pos, curr_pos +
prev_pos ); -- Add position of delimiter
delimiter_pos := array_append( delimiter_pos, curr_pos +
prev_pos + delimiter_len ); -- Add position where delimiter ends
remaining_text := substring( remaining_text FROM curr_pos +
delimiter_len ); -- Look only at text still
prev_pos := curr_pos + prev_pos; -- Need to keep track of how
much text has been stripped off
curr_pos := position( delimiters_chr.chars IN remaining_text );
-- Calculate the next occurance
END LOOP;
END LOOP;
remaining_text := sen; -- Temporary copy of text to extract text
from
SELECT DISTINCT * INTO delimiter_pos FROM array_append( array_prepend
( 1, sort( delimiter_pos ) ), character_length( sen ) + 1 ); -- Add
first character pos and last to the sorted array.
curr_pos := array_upper( delimiter_pos, 1 ); -- Calculate the size
of the array
WHILE j < curr_pos LOOP -- Look through the array
j := i + 1;
return_val := array_append( return_val, substring
( remaining_text FROM delimiter_pos[i] FOR (delimiter_pos[j] -
delimiter_pos[i]) ) ); -- Add the parts to a new array with text
i := i + 1;
END LOOP;
RETURN return_val; -- Return the result
END;
----
Need an additional function as well, to sort the array.
The delimiters has to be placed in another table.
Please note that it can't handle variations to a delimiter such as
<table style="..."> due to the present design as I couldn't use
regexp. <table> (case doesn't matter) without any other extras, work
fine though. I initially hoped it would be possible to allow matchings
such as <td(.*)> as well as pairs <td>(.*)</td> and variations
thereof, such as <td(.*)>(.*)</td>, but that's something I need a bit
of help with (hint, hint).
Written to run on eComStation v2.0 (successor of OS/2)
//Jan-Erik
From | Date | Subject | |
---|---|---|---|
Next Message | tomrevam | 2009-07-29 05:19:34 | Re: synchronous_commit=off doesn't always return immediately |
Previous Message | Scott Marlowe | 2009-07-29 01:52:22 | Re: Idle processes chewing up CPU? |