Re: PG vs ElasticSearch for Logs

From: Thomas Güttler <guettliml(at)thomas-guettler(dot)de>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: PG vs ElasticSearch for Logs
Date: 2016-08-19 10:06:05
Message-ID: 65f7b4fe-35bb-af1e-6580-7073ca15fd15@thomas-guettler.de
Lists: pgsql-general

On 19.08.2016 at 11:21, Sameer Kumar wrote:
>
>
> On Fri, Aug 19, 2016 at 4:58 PM Thomas Güttler <guettliml(at)thomas-guettler(dot)de> wrote:
>
>
>
> On 19.08.2016 at 09:42, John R Pierce wrote:
> > On 8/19/2016 12:32 AM, Thomas Güttler wrote:
> >> What do you think?
> >
> > I store most of my logs in flat text files, syslog style, and use grep for ad hoc querying.
> >
> > 200K rows/day, that's 1.4 million/week, 6 million/month, pretty soon you're talking big tables.
> >
> > in fact that's several rows/second on a 24/7 basis
>
> There is no need to store them more than 6 weeks in my current use case.
>
> I think indexing in postgres is much faster than grep.
>
> And queries including json data are not possible with grep (or at least very hard to type)
>
> My concern is which DB (or indexing) to use ...
>
>
> How will you be using the logs? What kind of queries? What kind of searches?
> Correlating events and logs from various sources could be really easy with joins, count and summary operations.

Wishes grow with the possibilities. First I want to do simple queries about hosts and timestamps. Then some simple
substring matches.
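
A minimal sketch of what these first queries could look like, assuming a hypothetical table `logs` with columns `host`, `created_at` and `message` (none of these names come from this thread). The helper only builds a parametrized SQL string in the `%s` placeholder style used by psycopg2; it does not connect to a database.

```python
from datetime import datetime


def escape_like(term):
    """Escape LIKE wildcards so the term is matched literally."""
    return term.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")


def build_log_query(host, start, end, substring=None):
    """Return (sql, params) for a host + time range search.

    Table and column names (logs, host, created_at, message) are
    hypothetical placeholders, not taken from a real schema.
    """
    sql = (
        "SELECT created_at, host, message FROM logs "
        "WHERE host = %s AND created_at >= %s AND created_at < %s"
    )
    params = [host, start, end]
    if substring is not None:
        # ILIKE is PostgreSQL's case-insensitive LIKE.
        sql += " AND message ILIKE %s"
        params.append("%" + escape_like(substring) + "%")
    sql += " ORDER BY created_at"
    return sql, params


sql, params = build_log_query(
    "web01", datetime(2016, 8, 18), datetime(2016, 8, 19), "time_out 100%"
)
```

With psycopg2 the pair would be passed as `cursor.execute(sql, params)`. A substring match with a leading wildcard cannot use a plain B-tree index; the `pg_trgm` extension provides trigram indexes for that case.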

Up to now no structured logging (the json column) gets created. But once it gets filled, we will find use cases
for it where we use ssh+grep up to now.
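
Once the json column is filled, a query on it might look like the following sketch. It assumes a hypothetical `jsonb` column named `data` on a hypothetical table `logs`, and again only builds the SQL and parameters rather than running them. The `@>` operator is PostgreSQL's jsonb containment operator.

```python
import json


def build_json_query(fields):
    """Return (sql, params) matching rows whose jsonb column contains `fields`.

    Table and column names (logs, data) are hypothetical placeholders.
    """
    sql = "SELECT created_at, host, data FROM logs WHERE data @> %s::jsonb"
    # sort_keys makes the serialized parameter deterministic.
    return sql, [json.dumps(fields, sort_keys=True)]


sql, params = build_json_query({"level": "error", "user_id": 42})
```

This kind of key/value lookup is hard to type as a grep pattern, and containment queries of this form can be supported by a GIN index on the jsonb column.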

So far we need neither stemming nor language support.

> The kind of volume you are anticipating should be fine with Postgres but before you really decide which one, you need to
> figure out what you would want to do with this data once it is in Postgres.

The goal is still a bit fuzzy: a better overview.

Thank you for your feedback ("The kind of volume you are anticipating should be fine with Postgres").

I guess I will use Postgres, especially since the Django ORM supports JSON in Postgres:

https://docs.djangoproject.com/en/1.10/ref/contrib/postgres/fields/#jsonfield

Regards,
Thomas Güttler

--
Thomas Guettler http://www.thomas-guettler.de/
