Re: PG vs ElasticSearch for Logs

From: Terry Schmitt <terry(dot)schmitt(at)gmail(dot)com>
To: Andy Colson <andy(at)squeakycode(dot)net>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: PG vs ElasticSearch for Logs
Date: 2016-08-23 20:42:19
Message-ID: CAOOcyswtD_anQXPD6Ru9r-FRMU7dxii436CxbjO8qShAF9S6xw@mail.gmail.com
Lists: pgsql-general

Certainly Postgres is capable of handling this volume just fine. Throw in
some partition rotation handling and you have a solution (a rough sketch of
what I mean is below).
If you want to play with something different, check out Graylog, which is
backed by Elasticsearch. A bit more work to set up than a single Postgres
table, but it has been a success for us storing syslog, app logs, and
Postgres logs from several hundred network devices, Windows and Linux
servers. Rotation is handled based on your requirements, and drilling down
to the details is trivial. Alerting is baked in as well. It could well be
overkill for your needs, but I don't know what your environment looks like.
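
A rough sketch of the partitioning idea (my own illustration, made-up names,
columns roughly matching the ones Thomas listed, classic inheritance +
CHECK-constraint partitioning):

    CREATE TABLE logs (
        id        bigserial,
        ts        timestamptz NOT NULL,
        host      text        NOT NULL,
        service   text        NOT NULL,
        loglevel  text        NOT NULL,
        msg       text        NOT NULL,
        details   jsonb
    );

    -- one child table per day, created ahead of time by a cron job;
    -- the loader inserts into the current day's child (or a trigger
    -- on logs routes the rows)
    CREATE TABLE logs_2016_08_23 (
        CHECK (ts >= '2016-08-23' AND ts < '2016-08-24')
    ) INHERITS (logs);

    CREATE INDEX ON logs_2016_08_23 (ts);

    -- "rotation" is then just dropping the oldest child, e.g.
    -- DROP TABLE logs_2016_06_01;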

T

On Mon, Aug 22, 2016 at 7:03 AM, Andy Colson <andy(at)squeakycode(dot)net> wrote:

> On 8/22/2016 2:39 AM, Thomas Güttler wrote:
>
>>
>>
>> Am 19.08.2016 um 19:59 schrieb Andy Colson:
>>
>>> On 8/19/2016 2:32 AM, Thomas Güttler wrote:
>>>
>>>> I want to store logs in a simple table.
>>>>
>>>> Here my columns:
>>>>
>>>> Primary-key (auto generated)
>>>> timestamp
>>>> host
>>>> service-on-host
>>>> loglevel
>>>> msg
>>>> json (optional)
>>>>
>>>> I am unsure which DB to choose: Postgres, ElasticSearch or ...?
>>>>
>>>> We don't have high traffic. About 200k rows per day.
>>>>
>>>> My heart beats for Postgres. We have been using it for several years.
>>>>
>>>> On the other hand, the sentence "Don't store logs in a DB" is
>>>> somewhere in my head.....
>>>>
>>>> What do you think?
>>>>
>>>>
>>>>
>>>>
>>> I played with ElasticSearch a little, mostly because I wanted to use
>>> Kibana which looks really pretty. I dumped a ton
>>> of logs into it, and made a pretty dashboard ... but in the end it
>>> didn't really help me, and wasn't that useful. My
>>> problem is, I don't want to have to go look at it. If something goes
>>> bad, then I want an email alert, at which point
>>> I'm going to go run top, and tail the logs.
>>>
>>> Another problem I had with kibana/ES is that the search syntax is
>>> different from what I'm used to. That made it hard to find stuff in
>>> kibana.
>>>
>>> Right now, I have a Perl script that reads apache logs and fires off
>>> updates into PG to keep stats. But it's an hourly summary, and the
>>> website queries those stats to show pretty usage graphs.
>>>
>>
>> You use Perl to read apache logs. Does this work?
>>
>> Forwarding logs reliably is not easy. Logs are streams, files in unix
>> are not streams. Sooner or later
>> the files get rotated. RELP exists, but AFAIK its usage is not widespread:
>>
>> https://en.wikipedia.org/wiki/Reliable_Event_Logging_Protocol
>>
>> Let's see how to get the logs into postgres ....
>>
>>> In the end, PG or ES, all depends on what you want.
>>
>>
>> Most of my logs start from an HTTP request. I want a unique id per request
>> in every log line that gets created. This way I can trace the request,
>> even if its impact spans several hosts and systems that never receive the
>> HTTP request themselves.
>>
>> Regards,
>> Thomas Güttler
>>
>>
>>
> I don't read the file. In apache.conf:
>
> # v, countyia, ip, sess, ts, url, query, status
> LogFormat "3,%{countyName}e,%a,%{VCSID}C,%{%Y-%m-%dT%H:%M:%S%z}t,\"%U\",\"%q\",%>s" csv3
>
> CustomLog "|/usr/local/bin/statSender.pl -r 127.0.0.1" csv3
>
> I think I read somewhere that if you pipe to a script (like above) and you
> don't read fast enough, it can slow apache down. That's why the script
> above dumps to redis first. That way I can move processes around, restart
> the database, etc., and not break apache in any way.
>
> The important part of the script:
>
> while (my $x = <>)
> {
>     chomp($x);
>     next unless ($x);
> try_again:
>     if ($redis)
>     {
>         # push the raw log line onto the redis list; reconnect and retry on failure
>         eval {
>             $redis->lpush($qname, $x);
>         };
>         if ($@)
>         {
>             $redis = redis_connect();
>             goto try_again;
>         }
>         # keep the queue from growing unbounded; just silence this one
>         eval {
>             $redis->ltrim($qname, 0, 1000);
>         };
>     }
> }
>
> Any other machine, or even multiple, then reads from redis and inserts
> into PG.
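>
> For completeness, a rough sketch of what such a reader could look like
> (made-up table and queue names, using the Redis, DBI, and Text::CSV
> modules; not the actual script):
>
>     use strict;
>     use warnings;
>     use Redis;
>     use DBI;
>     use Text::CSV;
>
>     my $qname = 'weblog';           # same list name the apache side lpush'es to
>     my $redis = Redis->new(server => '127.0.0.1:6379');
>     my $dbh   = DBI->connect('dbi:Pg:dbname=stats', 'stats', '',
>                              { AutoCommit => 1, RaiseError => 1 });
>     my $csv   = Text::CSV->new({ binary => 1 });
>
>     # columns follow the LogFormat above: v, county, ip, sess, ts, url, query, status
>     my $ins = $dbh->prepare(
>         'insert into hits (v, county, ip, sess, ts, url, query, status)
>          values (?, ?, ?, ?, ?, ?, ?, ?)');
>
>     while (1)
>     {
>         my $x = $redis->rpop($qname);   # lpush on one end, rpop on the other = FIFO
>         if (!defined $x)
>         {
>             sleep 1;                    # queue empty, wait a bit
>             next;
>         }
>         next unless $csv->parse($x);
>         $ins->execute($csv->fields);
>     }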
>
> You can see, in my script, I trim the queue to 1000 items, but that's
> because I'm not as worried about losing results. Your setup would
> probably be different. I also set up redis to not save anything to disk,
> again, because I don't mind if I lose a few hits here or there. But you
> get the idea.
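>
> (For reference, turning off persistence in redis.conf is basically just:
>
>     save ""
>     appendonly no
>
> i.e. no RDB snapshots and no append-only file.)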
>
> -Andy
>
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>
