From: | Stephen Frost <sfrost(at)snowman(dot)net> |
---|---|
To: | Jacob Champion <pchampion(at)vmware(dot)com> |
Cc: | "peter(dot)eisentraut(at)enterprisedb(dot)com" <peter(dot)eisentraut(at)enterprisedb(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: [PoC] Delegating pg_ident to a third party |
Date: | 2022-01-05 03:24:58 |
Message-ID: | CAOuzzgpFpuroNRabEvB9kST_TSyS2jFicBNoXvW7G2pZFixyBw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Greetings,
On Tue, Jan 4, 2022 at 18:56 Jacob Champion <pchampion(at)vmware(dot)com> wrote:
> On Mon, 2022-01-03 at 19:42 -0500, Stephen Frost wrote:
> > * Jacob Champion (pchampion(at)vmware(dot)com) wrote:
> > >
> > > That last point was my motivation for the authn_id patch [1] -- so that
> > > auditing could see the actual user _and_ the generic role. The
> > > information is already there to be used, it's just not exposed to the
> > > stats framework yet.
> >
> > While that helps, and I generally support adding that information to the
> > logs, it's certainly not nearly as good or useful as having the actual
> > user known to the database.
>
> Could you talk more about the use cases for which having the "actual
> user" is better? From an auditing perspective I don't see why
> "authenticated as jacob(at)example(dot)net, logged in as admin" is any worse
> than "logged in as jacob".
The above case isn’t what we are talking about, as far as I understand
anyway. You’re suggesting “authenticated as jacob(at)example(dot)net, logged in as
sales” where the user in the database is “sales”. Consider triggers which
only have access to “sales”, or a tool like pgaudit which only has access
to “sales”. Who was it in sales that updated that record though? We don’t
know- we would have to go try to figure it out from the logs, but even if
we had time stamps on the row update, there could be 50 sales people logged
in at overlapping times.
> > Forcing one role per individual end user is wasteful and isn't really
> > > making good use of the role-based system that you already have.
> > > Generally speaking, when administering hundreds or thousands of users,
> > > people start dividing them up into groups as opposed to dealing with
> > > them individually. So I don't think new features should be taking away
> > > flexibility in this area -- if one role per user already works well for
> > > you, great, but don't make everyone do the same.
> >
> > Using the role system we have to assign privileges certainly is useful
> > and sensible, of course, though I don't see where you've actually made
> > an argument for why one role per individual is somehow wasteful or
> > somehow takes away from the role system that we have for granting
> > rights.
>
> I was responding more to your statement that "Being able to have roles
> and memberships automatically created is much more the direction that
> I'd say we should be going in". It's not that one-role-per-user is
> inherently wasteful, but forcing role proliferation where it's not
> needed is. If all users have the same set of permissions, there doesn't
> need to be more than one role. But see below.
Just saying it’s wasteful isn’t actually saying what is wasteful about it.
> I'm also not suggesting that we make everyone do the same
> > thing, indeed, later on I was supportive of having an external system
> > provide the mapping. Here, I'm just making the point that we should
> > also be looking at automatic role/membership creation.
>
> Gotcha. Agreed; that would open up the ability to administer role
> privileges externally too, which would be cool. That could be used in
> tandem with something like this patchset.
Not sure exactly what you’re referring to here by “administer role
privileges externally too”..? Curious to hear what you are imagining
specifically.
> > > I'd go a step further and suggest that the way to do this is with a
> > > > background worker that's started up and connects to an LDAP
> > > > infrastructure and listens for changes, allowing the system to pick
> up
> > > > on new roles/memberships as soon as they're created in the LDAP
> > > > environment. That would then be controlled by appropriate settings
> in
> > > > postgresql.conf/.auto.conf.
> > >
> > > This is roughly what you can already do with existing (third-party)
> > > tools, and that approach isn't scaling out in practice for some of our
> > > existing customers. The load on the central server, for thousands of
> > > idle databases dialing in just to see if there are any new users, is
> > > huge.
> >
> > If you're referring specifically to cron-based tools which are
> > constantly hammering on the LDAP servers running the same queries over
> > and over, sure, I agree that that's creating load on the LDAP
> > infrastructure (though, well, it was kind of designed to be very
> > scalable for exactly that kind of load, no? So I'm not really sure why
> > that's such an issue..).
>
> I don't have hands-on experience here -- just going on what I've been
> told via field/product teams -- but it seems to me that there's a big
> difference between asking an LDAP server to give you information on a
> user at the time that user logs in, and asking it to give a list of
> _all_ users to every single Postgres instance you have on a regular
> timer. The latter is what seems to be problematic.
And to be clear, I agree that’s not good (though, again, really, your ldap
infrastructure shouldn’t be having all that much trouble with it- you can
scale those out verryyyy far, and far more easily than a relational
database..).
I’d also point out though that having to do an ldap lookup on every login
to PG is *already* an issue in some environments, having to do multiple
amplifies that. Not to mention that when the ldap servers can’t be reached
for some reason, no one can log into the database and that’s rather
unfortunate too. These are, of course, arguments for moving away from
methods that require checking with some other system synchronously during
login- which is another reason why it’s better to have the authentication
credentials easily map to the PG role, without the need for external checks
at login time. That’s done with today’s pg_ident, but this patch would
change that.
Consider the approach I continue to advocate- GSSAPI based authentication,
where a user only needs to contact the Kerberos server perhaps every 8
hours or so for an updated ticket but otherwise can authorize directly to
PG using their existing ticket and credentials, where their role was
previously created and their memberships already exist thanks to a
background worker whose job it is to handle that and which deals with
transient network failures or other issues. In this world, most logins to
PG don’t require any other system to be involved besides the client, the PG
server, and the networking between them; perhaps DNS if things aren’t
cached on the client.
On the other hand, to use ldap authentication (which also happens to be
demonstrable insecure without any reasonable way to fix that), with an ldap
mapping setup, requires two logins to an ldap server every single time a
user logs into PG and if the ldap environment is offline or overloaded for
whatever reason, the login fails or takes an excessively long amount of
time.
> That's also why I specifically wasn't
> > suggesting that and was instead suggesting that we have something that's
> > connected to one of the (hopefully, many, many) LDAP servers and is
> > doing change monitoring, allowing changes to be pushed down to PG,
> > rather than cronjobs constantly running the same queries and re-checking
> > things over and over. I appreciate that that's also not free, but I
> > don't believe it's nearly as bad as the cron-based approach and it's
> > certainly something that an LDAP infrastructure should be really rather
> > good at.
>
> I guess I'd have to see an implementation -- I was under the impression
> that persistent search wasn't widely implemented?
I mean … let’s talk about the one that really matters here:
OpenLDAP has an audit log system which can be used though it’s certainly
not as nice and would require code specific to it.
This talks a bit about other directories:
https://docs.informatica.com/data-integration/powerexchange-adapters-for-powercenter/10-1/powerexchange-for-ldap-user-guide-for-powercenter/ldap-sessions/configuring-change-data-capture/methods-for-tracking-changes-in-different-directories.html
I do wish they all supported it cleanly in the same way.
> > > All that said, I do see how having the ability to call out to another
> > > > system for mappings may be useful, so I'm not sure that we shouldn't
> > > > consider this specific change and have it be specifically just for
> > > > mappings, in which case pg_ident seems appropriate.
> > >
> > > Yeah, this PoC was mostly an increment on the functionality that
> > > already existed. The division between what goes in pg_hba and what goes
> > > in pg_ident is starting to blur with this patchset, though, and I think
> > > Peter's point is sound.
> >
> > This part I tend to disagree with- pg_ident for mappings and for ways to
> > call out to other systems to provide those mappings strikes me as
> > entirely appropriate and doesn't blur the lines and that's really what
> > this patch seems to be primarily about. Peter noted that there might be
> > other things we want to do and argued that those might not be
> > appropriate in pg_ident, which I tend to agree with, but I don't think
> > we need to invent something entirely new for mappings when we have
> > pg_ident already.
>
> The current patchset here has pieces of what is usually contained in
> HBA (the LDAP host/port/base/filter/etc.) effectively moved into
> pg_ident, while other pieces (TLS settings) remain in the HBA and the
> environment. That's what I'm referring to. If that is workable for you
> in the end, that's fine, but for me it'd be much easier to maintain if
> the mapping query and the LDAP connection settings for that mapping
> query were next to each other.
I can agree with the point that it would be nicer to have the ldap
host/port/base/filter be in the hba instead, if there is a way to
accomplish that reasonably. Did you have a suggestion in mind for how to do
that..? If there’s an alternative approach to consider, it’d be useful to
see them next to each other and then we could all contemplate which is
better.
> When it comes to the question of "how to connect to an LDAP server for
> > $whatever", it seems like it'd be nice to be able to configure that once
> > and reuse that configuration. Not sure I have a great suggestion for
> > how to do that. The approach this patch takes of adding options to
> > pg_hba for that, just like other options in pg_hba do, strikes me as
> > pretty reasonable.
>
> Right. That part seems less reasonable to me, given the current format
> of the HBA. YMMV.
If the ldap connection info and filters and such could all exist in the
hba, then perhaps a way to define those credentials in one place in the hba
file and then use them on other lines would be possible..? Seems like that
would be easier than having them also in the ident or having the ident
refer to something defined elsewhere.
Consider in the hba having:
LDAPSERVER[ldap1]=“ldaps://whatever other options go here”
Then later:
hostssl all all ::0/0 ldap ldapserver=ldap1 ldapmapserver=ldap1
map=myldapmap
Clearly needs more thought needed due to different requirements for ldap
authentication vs. the map, but still, the general idea being to have all
of it in the hba and then a way to define ldap server configuration in the
hba once and then reused.
> I would advocate for other methods to work when it comes to
> > authenticating to LDAP from PG though (such as GSSAPI, in particular,
> > of course...).
>
> I can take a look at the Cyrus requirements for the GSSAPI mechanism.
> Might be tricky to add tests for it, though. Any others you're
> interested in?
GSSAPI is the main one … I suppose client side certificates would be nice
too if that’s possible. I suspect some would like a way to have
username/pw ldap credentials in some other file besides the hba, but that
isn’t as interesting to me, at least.
> > > I certainly don't think we should have this be limited to LDAP auth-
> > > > such an external mapping ability is suitable for any authentication
> > > > method that supports a mapping (thinking specifically of GSSAPI, of
> > > > course..). Not sure if that's what was meant above but did want to
> > > > make sure that was clear.
> > >
> > > You can't use usermaps with LDAP auth yet, so no, that's not what I
> > > meant. (I have another patch for that feature in commitfest, which
> > > would allow these two things to be used together.)
> >
> > Yes, I'm aware of the other patch, just wanted to make sure the intent
> > is for this to work for all map-supporting auth methods. Figured that
> > was the case but the examples in the prior email had me concerned and
> > just wanted to make sure.
>
> Correct. The new tests use cert auth, for instance.
Great.
Thanks!
Stephen
From | Date | Subject | |
---|---|---|---|
Next Message | Amit Kapila | 2022-01-05 03:30:57 | Re: Skipping logical replication transactions on subscriber side |
Previous Message | Corey Huinker | 2022-01-05 03:17:07 | Re: Suggestion: optionally return default value instead of error on failed cast |