Re: truncate data before importing

From: Shmagi Kavtaradze <kavtaradze(dot)s(at)gmail(dot)com>
To: Felipe Santos <felipepts(at)gmail(dot)com>
Cc: pgsql-novice(at)postgresql(dot)org
Subject: Re: truncate data before importing
Date: 2015-11-18 18:34:50
Message-ID: CAHY6maxP9Df5iwfLJKmRJTCTBuCvEhQaCYVNV5Goeex33V8=3A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-novice

Yes, it helped. Thanks a lot!

On Wed, Nov 18, 2015 at 7:32 PM, Felipe Santos <felipepts(at)gmail(dot)com> wrote:

> You´re almost there, here it is what they mean:
>
> "''s/\t/ /g" = Replaces tab (\t) for white space (" ")
>
> "''s/^\S+\s+//''" = Replaces everything that starts (^) + with a
> non-white space (S+) + and is then followed by a white space (s+) for
> nothing (//).
>
> Hope it helps
>
> 2015-11-18 16:24 GMT-02:00 Shmagi Kavtaradze <kavtaradze(dot)s(at)gmail(dot)com>:
>
>> I searched and found they are for whitespace characters. But they deleted
>> sentence coordinates from text for me. Still no idea how.
>>
>> On Wed, Nov 18, 2015 at 7:18 PM, Felipe Santos <felipepts(at)gmail(dot)com>
>> wrote:
>>
>>>
>>>
>>> 2015-11-18 16:08 GMT-02:00 Shmagi Kavtaradze <kavtaradze(dot)s(at)gmail(dot)com>:
>>>
>>>> I was not able to find any details about "\S+\s+ ", can u explain
>>>> them? Also thanks a lot, this worked perfectly!
>>>>
>>>> On Wed, Nov 18, 2015 at 4:36 PM, Hans Ginzel <hans(at)matfyz(dot)cz> wrote:
>>>>
>>>>> On Wed, Nov 18, 2015 at 01:49:35PM +0100, Shmagi Kavtaradze wrote:
>>>>>
>>>>>> I am importing sentences from txt file. They look like:
>>>>>> "0,170 A recent statistical analysis by
>>>>>> David
>>>>>> Barton graphically illustrates how America has
>>>>>> plummeted from righteous living , prosperity
>>>>>> and success in the last quarter
>>>>>> century
>>>>>> .
>>>>>> Each Sentence starts with coordinates and each word is delimited
>>>>>> with tab.
>>>>>> I want to import data to tables without coordinates, just text and
>>>>>> if
>>>>>> possible to convert tab delimited space with just 'space', not to
>>>>>> have
>>>>>> such a gap between words. Any solutions how to do it? maybe with
>>>>>> shell
>>>>>> script?
>>>>>>
>>>>>
>>>>> You can use the 'PROGRAM' in COPY syntax
>>>>> http://www.postgresql.org/docs/current/static/sql-copy.html
>>>>>
>>>>> -- DROP TABLE IF EXISTS Sentence;
>>>>> CREATE TABLE IF NOT EXISTS Sentence (s text);
>>>>> COPY Sentence
>>>>> FROM PROGRAM 'sed -re ''s/\t/ /g; s/^\S+\s+//'' file.txt'
>>>>> WITH (FORMAT text, NULL '');
>>>>>
>>>>> Take care of escape sequences – backslashes in the file.
>>>>>
>>>>> If the file is on the client side see the \copy command of psql client
>>>>> instaed.
>>>>>
>>>>> http://www.postgresql.org/docs/9.4/static/app-psql.html#APP-PSQL-META-COMMANDS-COPY
>>>>>
>>>>> https://www.gnu.org/software/sed/manual/sed.html
>>>>>
>>>>> H.
>>>>>
>>>>>
>>>>
>>>
>>> These ("S+" and "s+") are regular expressions (regex).
>>>
>>> They are being used as part of the SED tool, which is available from
>>> your Linux bash.
>>>
>>>
>>>
>>
>

In response to

Browse pgsql-novice by date

  From Date Subject
Next Message Shmagi Kavtaradze 2015-11-20 10:21:36 Combine Top-k with similarity search extensions
Previous Message Felipe Santos 2015-11-18 18:32:55 Re: truncate data before importing