Re: GSoC project: K-medoids clustering in Madlib

From: Maxence AHLOUCHE <maxence(dot)ahlouche(at)gmail(dot)com>
To: Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, hellerstein(at)cs(dot)berkeley(dot)edu
Cc: "Iyer, Rahul" <Rahul(dot)Iyer(at)emc(dot)com>, "pgsql-students(at)postgresql(dot)org" <pgsql-students(at)postgresql(dot)org>, "devel(at)madlib(dot)net" <devel(at)madlib(dot)net>, "Philip, Sujit" <Sujit(dot)Philip(at)emc(dot)com>
Subject: Re: GSoC project: K-medoids clustering in Madlib
Date: 2013-04-21 21:21:14
Message-ID: CAJeaomWVrTvP5O3oYePCnmTnMQYkz2_JkC_8MgrrDf6ui+z1uA@mail.gmail.com
Lists: pgsql-students

2013/4/21 Atri Sharma <atri(dot)jiit(at)gmail(dot)com>

>
> Interesting! Good work!
>
> Could you draw up a summary, giving your findings about the performance of
> different algorithms, and which one should be implemented, or both (k-means++
> vs k-medoids)?
>
> Regards,
>
> Atri
>

From the few articles I've read so far, I've found that K-medoids
clustering usually runs faster on standard datasets (such as the ones I
generate). But I'll look for more detailed information during the week and
report what I find here!
By the way, do you have any ideas for other kinds of datasets that would be
useful to test?
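
For readers unfamiliar with the method under discussion, here is a minimal sketch of k-medoids on a generated dataset. It is not the code from this thread and not MADlib's implementation. It assumes the simple alternating (Voronoi iteration) variant with Euclidean distance, and the names `generate_blobs`, `distance` and `k_medoids` are hypothetical helpers chosen for illustration.

```python
import math
import random


def distance(a, b):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def generate_blobs(centers, points_per_center, spread, seed=0):
    """Generate Gaussian blobs around the given centers (a hypothetical test dataset)."""
    rng = random.Random(seed)
    return [
        tuple(rng.gauss(coord, spread) for coord in center)
        for center in centers
        for _ in range(points_per_center)
    ]


def k_medoids(points, k, max_iter=100, seed=0):
    """Alternating k-medoids: assign points to the nearest medoid, then
    replace each medoid by the cluster member with the lowest total
    distance to the other members. Returns the list of medoids."""
    rng = random.Random(seed)
    medoids = rng.sample(points, k)
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster of its nearest medoid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: distance(p, medoids[i]))
            clusters[nearest].append(p)
        # Update step: unlike k-means, the new center must be an actual data point.
        new_medoids = []
        for i, cluster in enumerate(clusters):
            if not cluster:
                # Keep the old medoid if its cluster ended up empty.
                new_medoids.append(medoids[i])
                continue
            best = min(cluster, key=lambda c: sum(distance(c, q) for q in cluster))
            new_medoids.append(best)
        if new_medoids == medoids:
            break  # converged: no medoid changed
        medoids = new_medoids
    return medoids


if __name__ == "__main__":
    data = generate_blobs([(0.0, 0.0), (10.0, 10.0), (0.0, 10.0)], 30, 1.0)
    print(k_medoids(data, 3))
```

This variant can stop in a local optimum that depends on the random initial medoids, so any timing or quality comparison against k-means++ should be repeated over several seeds.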

2013/4/21 <hellerstein(at)cs(dot)berkeley(dot)edu>

> Very cool!
>
> May I suggest generating a visualization in a web toolkit? Perhaps the
> new vega library would be simplest (http://trifacta.github.io/vega/) or
> the more popular but lower-level D3.js?
>
> More generally, a project to connect MADlib outputs to vega vis
> specifications seems like it would be enormously useful!
>
> Joe
>

I'll take a look at it during my holidays, in a week! It would indeed be nice
if one just had to open a web page to test my work!
Regarding your other idea, aren't MADlib outputs PostgreSQL/Greenplum
outputs? If so, only a database connector is required, which probably
already exists (I may be wrong: I had never heard of D3.js or Vega before,
and I don't know the MADlib project well yet).

--
Maxence Ahlouche
06 06 66 97 00
93 avenue Paul DOUMER
24100 Bergerac

