From: | Felipe Santos <felipepts(at)gmail(dot)com> |
---|---|
To: | Shmagi Kavtaradze <kavtaradze(dot)s(at)gmail(dot)com> |
Cc: | Hans Ginzel <hans(at)matfyz(dot)cz>, pgsql-novice(at)postgresql(dot)org |
Subject: | Re: truncate data before importing |
Date: | 2015-11-18 18:18:50 |
Message-ID: | CAPYcRiUHHsQ9Pw3hCzguVs+K=s0WJ0kjCrsuhDsRtPysPj6M-g@mail.gmail.com |
Lists: | pgsql-novice |
2015-11-18 16:08 GMT-02:00 Shmagi Kavtaradze <kavtaradze(dot)s(at)gmail(dot)com>:
> I was not able to find any details about "\S+\s+ ", can you explain them?
> Also thanks a lot, this worked perfectly!
>
> On Wed, Nov 18, 2015 at 4:36 PM, Hans Ginzel <hans(at)matfyz(dot)cz> wrote:
>
>> On Wed, Nov 18, 2015 at 01:49:35PM +0100, Shmagi Kavtaradze wrote:
>>
>>> I am importing sentences from txt file. They look like:
>>> "0,170 A recent statistical analysis by
>>> David
>>> Barton graphically illustrates how America has
>>> plummeted from righteous living , prosperity
>>> and success in the last quarter century
>>> .
>>> Each Sentence starts with coordinates and each word is delimited with
>>> tab.
>>> I want to import data to tables without coordinates, just text and if
>>> possible to convert tab delimited space with just 'space', not to have
>>> such a gap between words. Any solutions how to do it? maybe with shell
>>> script?
>>>
>>
>> You can use the 'PROGRAM' in COPY syntax
>> http://www.postgresql.org/docs/current/static/sql-copy.html
>>
>> -- DROP TABLE IF EXISTS Sentence;
>> CREATE TABLE IF NOT EXISTS Sentence (s text);
>> COPY Sentence
>> FROM PROGRAM 'sed -re ''s/\t/ /g; s/^\S+\s+//'' file.txt'
>> WITH (FORMAT text, NULL '');
>>
>> Take care of escape sequences – backslashes in the file.
>>
>> If the file is on the client side see the \copy command of psql client
>> instead.
>>
>> http://www.postgresql.org/docs/9.4/static/app-psql.html#APP-PSQL-META-COMMANDS-COPY
>>
>> https://www.gnu.org/software/sed/manual/sed.html
>>
>> H.
>>
>>
>
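These ("\S+" and "\s+") are regular expressions (regex): "\S+" matches one
or more non-whitespace characters, and "\s+" matches one or more whitespace
characters. So "s/^\S+\s+//" deletes the first word on each line (your
coordinates) together with the whitespace that follows it.
They are used here by sed, a stream editor that you can run from the shell
on Linux.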
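
To see what the sed script from the COPY command above does, here is a small sketch you can run in a shell before importing anything. It assumes GNU sed (the \S and \s classes are GNU extensions), and the sample line is a shortened, made-up version of the sentence from the original question; the variable names are only for illustration.

```shell
# Sample line shaped like the input file: coordinates first, then tab-delimited words.
line=$(printf '0,170\tA\trecent\tstatistical\tanalysis')

# Same sed script as in the COPY ... FROM PROGRAM command:
#   s/\t/ /g      replaces every tab with a single space
#   s/^\S+\s+//   removes the first word (the coordinates) and the whitespace after it
# -r turns on extended regular expressions, so "+" means "one or more".
cleaned=$(printf '%s\n' "$line" | sed -re 's/\t/ /g; s/^\S+\s+//')

printf '%s\n' "$cleaned"
```

With this sample line the last command prints "A recent statistical analysis", which is the form each row should have after the import.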