
Hello,
I have inherited responsibility for an LDAP user management system written in Python/Django: https://github.com/VPAC/django-placard
In my rewrite I have come across a sticky problem: how do I reliably and efficiently allocate a unique uidNumber for new users and gidNumber for new groups?
The current solution, as used by my predecessor, is to list every user or group in the system, sort or scan through the list looking for the highest id, add 1, and use that.
Unfortunately, this seems to be lacking in efficiency (especially if there are a lot of users) and relies on the hope that two users will never be created at the same time. Race conditions could occur.
The only solution I can think of is to create a special server that atomically allocates ids. Seems a lot of work for something so simple.
Any ideas for other solutions?
(not sure if this is on-topic or not for luv-main, so posting here just to be safe)
Thanks
-- Brian May <brian@microcomaustralia.com.au>

Brian May <brian@microcomaustralia.com.au> wrote:
In my rewrite I have come across a sticky problem: how do I reliably and efficiently allocate a unique uidNumber for new users and gidNumber for new groups?
The current solution, as used by my predecessor, is to list every user or group in the system, sort or scan through the list looking for the highest id, add 1, and use that.
Unfortunately, this seems to be lacking in efficiency (especially if there are a lot of users) and relies on the hope that two users will never be created at the same time. Race conditions could occur.
If you're mainly worried that your tool could end up creating two users at once, you could use locking to prevent it, I suppose. If you're concerned that someone might run useradd/groupadd at the same time as your tool is operating, I don't see how this could be prevented easily other than opening and locking /etc/passwd or /etc/group. Is there a good reason not simply to spawn useradd/groupadd and let them allocate the ids? Is it safe to assume that uids/gids appear in ascending order in /etc/passwd and /etc/group? In general, probably not, especially as an administrator might edit those files, so that's out.

On 22 November 2012 10:42, Jason White <jason@jasonjgw.net> wrote:
Is it safe to assume that uids/gids appear in ascending order in /etc/passwd and /etc/group? In general, probably not, especially as an administrator might edit those files, so that's out.
This is LDAP, the order the server gives the results is undefined. There is RFC2891 for server side sorting, looks like OpenLDAP doesn't support it yet. At least that is what Google said last week. Today it seems to suggest it is supported. http://www.openldap.org/software/man.cgi?query=slapo-sssvlv&apropos=0&sektio... -- Brian May <brian@microcomaustralia.com.au>

Brian May <brian@microcomaustralia.com.au> wrote:
This is LDAP, the order the server gives the results is undefined.
Can you look up a user/group by ID? If so, one idea would be:
1. Perform an initial scan of the users/groups.
2. Cache (i.e., memoize) the greatest uid/gid.
3. To add a new user/group, retrieve the highest uid/gid from the database/file/whatever, then probe consecutive uids/gids until an unused one is found, then (locking if necessary) allocate a new user/group and update the stored value accordingly.
This way you only perform the expensive scan once.
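The steps above could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: `IdAllocator`, `is_taken` and `first_id` are hypothetical names, and an in-memory set stands in for whatever lookup-by-uidNumber the directory offers. It does no locking, so on its own it does not remove the race condition.

```python
from typing import Callable, Iterable


class IdAllocator:
    """Remember the highest ID seen, then probe upward for a free one."""

    def __init__(self, existing_ids: Iterable[int],
                 is_taken: Callable[[int], bool],
                 first_id: int = 1000) -> None:
        # Steps 1 and 2: one expensive scan, with the result cached.
        self.highest = max(existing_ids, default=first_id - 1)
        self.is_taken = is_taken

    def allocate(self) -> int:
        # Step 3: probe consecutive IDs above the cached value.
        candidate = self.highest + 1
        while self.is_taken(candidate):
            candidate += 1
        self.highest = candidate
        return candidate


# Stand-in for the directory: a set of uidNumbers already in use.
directory = {1000, 1001, 1002}
allocator = IdAllocator(directory, lambda n: n in directory)

new_id = allocator.allocate()
directory.add(new_id)        # pretend the user was created
directory.add(new_id + 1)    # someone else adds an entry behind our back
second_id = allocator.allocate()
```

The probe in `allocate` is what lets the cached value lag behind the directory without handing out an ID that is already in use.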

On 22 November 2012 11:03, Jason White <jason@jasonjgw.net> wrote:
This way you only perform the expensive scan once.
Yes, was thinking of this. As it is a web based (Django) service, perhaps the only downside is that two administrators could be logged in simultaneously and create users at the same time. As it should be faster though, maybe that is less of a risk? -- Brian May <brian@microcomaustralia.com.au>

Brian May <brian@microcomaustralia.com.au> wrote:
As it is a web based (Django) service, perhaps the only downside is that two administrators could be logged in simultaneously and create users at the same time. As it should be faster though, maybe that is less of a risk?
That's why I mentioned locking - you need to lock something so this doesn't happen. I don't know what the locking mechanism is in your Web development environment, but surely there is one.
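The thread does not name a specific locking mechanism. One generic option on a Unix host, independent of Django, is an advisory file lock taken with the standard library's `fcntl.flock`. The sketch below assumes all web workers run on the same machine and all go through this code path; `allocation_lock` and the lock file name are made up for illustration.

```python
import fcntl
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def allocation_lock(path):
    """Hold an exclusive advisory lock on `path` for the duration of the block."""
    with open(path, "w") as handle:
        fcntl.flock(handle, fcntl.LOCK_EX)   # blocks while another process holds it
        try:
            yield
        finally:
            fcntl.flock(handle, fcntl.LOCK_UN)


# Hypothetical lock file; any path that every worker can open would do.
LOCK_PATH = os.path.join(tempfile.gettempdir(), "placard-id-allocation.lock")

used_ids = {1000, 1001}   # stand-in for the IDs found in the directory

with allocation_lock(LOCK_PATH):
    # Pick the ID and create the entry while still holding the lock.
    new_id = max(used_ids) + 1
    used_ids.add(new_id)
```

An advisory lock only protects against other processes that take the same lock, so anything that edits the directory by another route is not covered.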

Jason White wrote:
Brian May <brian@microcomaustralia.com.au> wrote:
This is LDAP, the order the server gives the results is undefined.
Can you look up a user/group by ID? If so, one idea would be:
1. Perform an initial scan of the users/groups.
2. Cache (i.e., memoize) the greatest uid/gid.
3. To add a new user/group, retrieve the highest uid/gid from the database/file/whatever, then probe consecutive uids/gids until an unused one is found, then (locking if necessary) allocate a new user/group and update the stored value accordingly.
This way you only perform the expensive scan once.
This breaks as soon as anything else edits the LDAP objects, because your variable that remembers max-curr-uid will be out of sync.
Are your LDAP queries so slow that it matters? I don't think mine are...
    $ time ldapsearch -xLLL objectClass=person uidNumber >/dev/null
    real 0m0.006s
    user 0m0.010s
    sys 0m0.000s

Jason White wrote:
Is there a good reason not simply to spawn useradd/groupadd and let them allocate the ids?
I think you missed the part where he said it's LDAP. useradd only knows about the flat files backend. A great annoyance of LDAP/krb is the lack of solid, portable management solutions. There's e.g. a webmin module, but yecch. So, you solve it... with Yet Another site-specific LDAP management UI.
If you're concerned that someone might run useradd/groupadd at the same time as your tool is operating, I don't see how this could be prevented easily other than opening and locking /etc/passwd or /etc/group.
You could dpkg-divert --rename /usr/sbin/useradd, and replace it with a little script that prints "YOU IDIOT! You're supposed to add accounts to LDAP!" Of course, that'll break package installs that try to create system groups, so then you extend your wrapper to parse useradd arguments and look for --system, and if it's there pass "$@" to useradd.distrib. That way lies madness...

On 2012-11-22 10:42, Jason White wrote:
If you're mainly worried that your tool could end up creating two users at once, you could use locking to prevent it, I suppose.
If you're concerned that someone might run useradd/groupadd at the same time as your tool is operating, I don't see how this could be prevented easily other than opening and locking /etc/passwd or /etc/group.
Is there a good reason not simply to spawn useradd/groupadd and let them allocate the ids?
Is it safe to assume that uids/gids appear in ascending order in /etc/passwd and /etc/group? In general, probably not, especially as an administrator might edit those files, so that's out.
Jason, Brian's talking about LDAP; /etc/passwd and /etc/group are irrelevant, as, probably, are useradd/groupadd.
Brian, I suspect that your predecessor's solution may be the most straightforward. Assuming you don't have a monstrous number of users, and have the uidNumber attribute indexed in the LDAP database, listing current UIDs shouldn't be too expensive. If you were using shell, you could do something like this:
    ldapsearch -LLLx uidNumber | grep uidNumber | sort -nk2 | tail -2 | head -1
That's ugly and hackish; the tail/head at the end excludes the nobody user whose uidNumber (on my system) is 65534. There are probably slightly more elegant ways to do it, especially if you're not using shell.
The way we do it at work is by iterating through getent passwd, incrementing the UID until we find one that doesn't match an existing entry[1]. This would be somewhat slow with lots of users though.
[1]
    while getent passwd $id || getent group $id
    do
        [ $id = 9999 ] && die "Ran out of IDs." || :
        ((id++))
    done >/dev/null
-- Regards, Matthew Cengia
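For comparison, the "highest uidNumber, ignoring nobody" pipeline could be written in Python along these lines. This is a sketch only: `highest_uid` is a hypothetical helper, the sample text is made up to resemble `ldapsearch -LLLx uidNumber` output, and 65534 is skipped explicitly rather than by position.

```python
def highest_uid(ldif_text, ignore=frozenset({65534})):
    """Return the largest uidNumber in LDIF output, skipping reserved values."""
    uids = []
    for line in ldif_text.splitlines():
        if line.startswith("uidNumber:"):
            value = int(line.split(":", 1)[1])
            if value not in ignore:
                uids.append(value)
    return max(uids)


# Made-up sample in the shape of LDIF search results.
sample = """\
dn: uid=alice,ou=People,dc=example,dc=com
uidNumber: 1000

dn: uid=bob,ou=People,dc=example,dc=com
uidNumber: 1002

dn: uid=nobody,ou=People,dc=example,dc=com
uidNumber: 65534
"""

next_uid = highest_uid(sample) + 1
```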

Brian May wrote:
In my rewrite I have come across a sticky problem: how do I reliably and efficiently allocate a unique uidNumber for new users and gidNumber for new groups?
The current solution, as used by my predecessor, is to list every user or group in the system, sort or scan through the list looking for the highest id, add 1, and use that.
That strategy renders a lot of UIDs unreachable when some enterprising fellow manually creates an account with a high UID. I start at the bottom and count up until I find an unused one. Both strategies should be linear with the number of existing users (I think). http://www.cyber.com.au/~twb/snarf/ldapadduser
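Counting up from the bottom could look like this in Python. It is a minimal sketch, not the linked script: `lowest_free_id` and the range bounds 1000 and 60000 are assumptions chosen for illustration.

```python
def lowest_free_id(used, start=1000, stop=60000):
    """Return the smallest ID in [start, stop) that is not in `used`."""
    used = set(used)
    for candidate in range(start, stop):
        if candidate not in used:
            return candidate
    raise RuntimeError("Ran out of IDs.")


# 1001 has been deleted at some point, and someone hand-created 50000.
in_use = [1000, 1002, 1003, 50000]
free_id = lowest_free_id(in_use)
```

The hand-created 50000 does not push new accounts up to 50001, but note that the gap left by a deleted account is filled again.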
Unfortunately, this seems to be lacking in efficiency (especially if there are a lot of users) and relies on the hope that two users will never be created at the same time. Race conditions could occur.
I address that by wishful thinking (well, we do not get new staff very often). Since mine is a simple shell script, and it only works on the LDAP server itself (-Hldapi:/// -YEXTERNAL -- the LDAP server deliberately has no superuser dn), I could probably bolt in lockfile-progs.
Only solution I can think of is to create a special server that atomically allocates ids. Seems a lot of work for something so simple.
Any ideas for other solutions?
Solve it at the policy level? E.g. tell everyone to create new users by telling Fred to do it, and telling Fred not to compulsively hit refresh when the page is slow to load.
IME trying to avoid such a race condition is VERY likely to introduce bugs that crop up FAR MORE OFTEN than the original hypothetical race condition. Has the race actually happened yet?
Maybe a quick kludge would be something like
    @hourly nobody getent passwd | cut -d: -f3 | sort | uniq -d | sed 's/^/ERROR: duplicate UID detected: /'
    @hourly nobody getent group | cut -d: -f3 | sort | uniq -d | sed 's/^/ERROR: duplicate GID detected: /'
so the race can still happen, but at least you'll get an email about it before too much damage can occur.
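The same after-the-fact check could be done from Python, which may be handy if the report should come from the Django side. This sketch assumes a Unix host where the standard `pwd` module sees the LDAP accounts through NSS, as `getent passwd` does; `duplicated` is a hypothetical helper name.

```python
import pwd
from collections import Counter


def duplicated(ids):
    """Return the sorted IDs that occur more than once."""
    return sorted(i for i, n in Counter(ids).items() if n > 1)


# pwd.getpwall() enumerates the password database, much like `getent passwd`.
for uid in duplicated(entry.pw_uid for entry in pwd.getpwall()):
    print(f"ERROR: duplicate UID detected: {uid}")
```

Like the cron job, this only reports a collision after it has happened.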

On 22 November 2012 10:54, Trent W. Buck <trentbuck@gmail.com> wrote:
That strategy renders a lot of UIDs unreachable when some enterprising fellow manually creates an account with a high UID. I start at the bottom and count up until I find an unused one. Both strategies should be linear with the number of existing users (I think).
Hmm. Problem with that strategy is that you risk reusing IDs, if you have a policy of deleting old users as they leave. This ID might still own resources on random computer systems. (of course another solution to this is to lock accounts, not delete them when staff leave. Locking accounts is another can of worms when you consider ssh key auth logins, locking the password is insufficient to block access to the account - have found setting the shell to an invalid value seems to work - ssh checks if the shell exists or not even with -N) -- Brian May <brian@microcomaustralia.com.au>

Brian May wrote:
On 22 November 2012 10:54, Trent W. Buck <trentbuck@gmail.com> wrote:
That strategy renders a lot of UIDs unreachable when some enterprising fellow manually creates an account with a high UID. I start at the bottom and count up until I find an unused one. Both strategies should be linear with the number of existing users (I think).
Hmm. Problem with that strategy is that you risk reusing IDs, if you have a policy of deleting old users as they leave. This ID might still own resources on random computer systems.
(of course another solution to this is to lock accounts, not delete them when staff leave. Locking accounts is another can of worms when you consider ssh key auth logins, locking the password is insufficient to block access to the account - have found setting the shell to an invalid value seems to work - ssh checks if the shell exists or not even with -N)
Correct on all points. You've pretty well convinced me that your approach is better, but inertia will probably keep my code as-is for the current deployment :-)

On Thu, 22 Nov 2012, Brian May <brian@microcomaustralia.com.au> wrote:
In my rewrite I have come across a sticky problem: how do I reliably and efficiently allocate a unique uidNumber for new users and gidNumber for new groups?
The current solution, as used by my predecessor, is to list every user or group in the system, sort or scan through the list looking for the highest id, add 1, and use that.
Unfortunately, this seems to be lacking in efficiency (especially if there are a lot of users) and relies on the hope that two users will never be created at the same time. Race conditions could occur.
It seems to me that there are two ways of avoiding race conditions. One is to create the object and then search for other objects with the same UID. The other is to include the UID but not the user-name in the dn: as the dn MUST be unique, an attempt to add a second object with the same UID will fail at the LDAP protocol level.
On Thu, 22 Nov 2012, "Trent W. Buck" <trentbuck@gmail.com> wrote:
The current solution, as used by my predecessor, is to list every user or group in the system, sort or scan through the list looking for the highest id, add 1, and use that.
That strategy renders a lot of UIDs unreachable when some enterprising fellow manually creates an account with a high UID. I start at the bottom and count up until I find an unused one. Both strategies should be linear with the number of existing users (I think).
UIDs are 32-bit nowadays, so there will still be a lot of spare space. The problem you are likely to encounter if you use large UIDs is the way some programs handle them. Last time I tried that (which was some time ago) I had problems with database files for things like last login, which were flat files with the UID as an index. When a user with a large UID logged in, the files became large sparse files, which caused problems with backup programs.
-- My Main Blog http://etbe.coker.com.au/ My Documents Blog http://doc.coker.com.au/

Russell Coker wrote:
The other is to include the UID but not the user-name in the dn, as the dn MUST be unique an attempt to add a second object with the same UID will fail at the LDAP protocol level.
Is this allowed by the RFC2307 / RFC2307bis schemas? Otherwise, I have no strong objection to that approach.
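The DN-based idea quoted above can be illustrated with a small Python sketch. It runs against an in-memory stand-in rather than a real server: `FakeDirectory`, `EntryAlreadyExists`, `add_user` and the base DN are all hypothetical, and the only property being modelled is that adding an entry whose DN already exists is rejected.

```python
class EntryAlreadyExists(Exception):
    """Stand-in for the 'entry already exists' error a server returns."""


class FakeDirectory:
    """In-memory stand-in for an LDAP server: DNs are unique keys."""

    def __init__(self):
        self.entries = {}

    def add(self, dn, attrs):
        if dn in self.entries:
            raise EntryAlreadyExists(dn)
        self.entries[dn] = attrs


def add_user(directory, login, start_id, base="ou=People,dc=example,dc=com"):
    """Try successive uidNumbers until an add succeeds; return the one used."""
    uid_number = start_id
    while True:
        dn = f"uidNumber={uid_number},{base}"
        try:
            directory.add(dn, {"uid": login, "uidNumber": uid_number})
            return uid_number
        except EntryAlreadyExists:
            uid_number += 1   # someone else got there first; try the next one


directory = FakeDirectory()
first = add_user(directory, "alice", 1000)
second = add_user(directory, "bob", 1000)   # same starting guess, as in a race
```

The point of the design is that the uniqueness check and the creation are one operation, so no separate lock is needed.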
UIDs are 32bit nowadays. So there will still be a lot of spare space.
On Linux, at least. IIRC I ran into problems with negative UIDs when backing up to opensolaris[0]. Other than that, I agree.
[0] whatever the hell this is -- osol9 IIRC: SunOS zhug 5.11 snv_111b i86pc i386 i86pc Solaris

I might have missed some of this thread. Did anyone mention the solution given here? http://www.rexconsulting.net/ldap-protocol-uidnumber.html Store the next number to use in an objectclass in the directory, then use atomic operations to increment it. Would that help?
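A rough Python sketch of the counter idea follows. It assumes, without quoting the linked page, that the increment is done as a single modify request that deletes the old value and adds the new one, so that the request fails if another client changed the value in between. `FakeCounterEntry`, `ValueMismatch` and `allocate_id` are hypothetical names, and the in-memory object stands in for the directory entry.

```python
class ValueMismatch(Exception):
    """Stand-in for the error returned when the value to delete is absent."""


class FakeCounterEntry:
    """In-memory stand-in for a directory entry holding the next free ID."""

    def __init__(self, next_id):
        self.next_id = next_id

    def read(self):
        return self.next_id

    def replace_if_equal(self, old, new):
        # Models one modify that deletes `old` and adds `new`;
        # it fails as a whole if the stored value is no longer `old`.
        if self.next_id != old:
            raise ValueMismatch(old)
        self.next_id = new


def allocate_id(counter, max_attempts=10):
    """Claim the current counter value by bumping it with compare-and-swap."""
    for _ in range(max_attempts):
        current = counter.read()
        try:
            counter.replace_if_equal(current, current + 1)
            return current
        except ValueMismatch:
            continue   # another client won the race; re-read and retry
    raise RuntimeError("could not allocate an ID")


counter = FakeCounterEntry(next_id=1000)
first = allocate_id(counter)
second = allocate_id(counter)
```

IDs handed out this way are never reused, even if the account they were meant for is later deleted.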

On Thu, Nov 22, 2012 at 4:51 PM, Andrew Spiers <7andrew@gmail.com> wrote:
I might have missed some of this thread. Did anyone mention the solution given here? http://www.rexconsulting.net/ldap-protocol-uidnumber.html Store the next number to use in an objectclass in the directory, then use atomic operations to increment it. Would that help?
Although I suppose you need additional controls to ensure that the objectclass which stores the next uid to use is always used, and no other method is ever used to allocate a UID.

Hi all, have you thought about using MySQL as the LDAP backend? If so, a user table with an auto-increment, unique uid column would do what you want: you get the uid back on insert and have no race condition problem. Here is a setup how-to: http://www.wingfoss.com/content/how-to-install-openldap-with-mysql-on-debian... I have never tried it, but it seems an obvious solution for a database with "lots of users". Regards Peter

On Fri, 23 Nov 2012, Peter Ross <Peter.Ross@bogen.in-berlin.de> wrote:
thought about using MySQL as the LDAP backend.
In that case why not just use MySQL for account data directly? -- My Main Blog http://etbe.coker.com.au/ My Documents Blog http://doc.coker.com.au/

On Fri, 23 Nov 2012, Russell Coker wrote:
On Fri, 23 Nov 2012, Peter Ross <Peter.Ross@bogen.in-berlin.de> wrote:
thought about using MySQL as the LDAP backend.
In that case why not just use MySQL for account data directly?
If you can, yes. But there are more authentication solutions with an LDAP backend than with MySQL (including the Windows world), and changing the backend does not break what's running. Regards Peter

On Fri, 23 Nov 2012, Peter Ross wrote:
On Fri, 23 Nov 2012, Russell Coker wrote:
On Fri, 23 Nov 2012, Peter Ross <Peter.Ross@bogen.in-berlin.de> wrote:
thought about using MySQL as the LDAP backend.
In that case why not just use MySQL for account data directly?
If you can, yes. But there are more authentication solutions with an LDAP backend than with MySQL (including the Windows world), and changing the backend does not break what's running.
P.S. OpenLDAP is using Berkeley DB as a default, I think (my OpenLDAP under Zimbra does). I don't know how it structures the LDAP data but I doubt that you can create the same UID twice (BDB is a simple key=value database, so it should baulk at the same key). If so, one of the admins would be unlucky if he tries to insert the same UID. Just check success in the script and inform the admin in question. Regards Peter

P.P.S. You can create and delete a temporary lease object in LDAP that gets queried before you add a user, and prevents race condition problems. Give it an expiry date to prevent loose ends, and the ID of the creator to allow you to send a message: "Admin 4711 is just creating a new user entry. Tell him to hurry up!" Regards Peter
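The lease idea might be modelled like this. It is a sketch with made-up names (`Lease`, `LeaseTable`, the admin IDs), and the in-memory table is not itself safe against concurrent callers; in a directory the create step would need to be atomic, for example by relying on the add of a fixed DN failing when the entry already exists.

```python
import time


class Lease:
    """A lock record with an owner and an expiry time (seconds since epoch)."""

    def __init__(self, owner, expires_at):
        self.owner = owner
        self.expires_at = expires_at


class LeaseTable:
    """In-memory stand-in for the place in the directory where the lease lives."""

    def __init__(self):
        self.lease = None

    def acquire(self, owner, ttl=60, now=None):
        """Return None on success, or the current holder's ID if still leased."""
        now = time.time() if now is None else now
        if self.lease is not None and self.lease.expires_at > now:
            return self.lease.owner
        self.lease = Lease(owner, now + ttl)   # free or expired: take it over
        return None

    def release(self, owner):
        if self.lease is not None and self.lease.owner == owner:
            self.lease = None


table = LeaseTable()
blocked_by_1 = table.acquire("admin4711", ttl=60, now=0)    # lease is free
blocked_by_2 = table.acquire("admin0815", ttl=60, now=30)   # still held
blocked_by_3 = table.acquire("admin0815", ttl=60, now=61)   # old lease expired
```

The value returned on failure is what would let the interface say who is currently creating a user.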

On 23 November 2012 10:35, Peter Ross <Peter.Ross@bogen.in-berlin.de> wrote:
P.S. OpenLDAP is using Berkeley DB as a default, I think (my OpenLDAP under Zimbra does).
Berkeley DB is the default backend. However, OpenLDAP still has to support a hierarchical database somehow. This implies that there isn't a simple one to one mapping between keys and values to get from LDAP to Berkeley DB.
I don't know how it structures the LDAP data but I doubt that you can create the same UID twice (BDB is a simple key=value database, so it should baulk at the same key).
s/UID/uidNumber/
I don't think there is anything in the LDAP standards that prevents duplicate uidNumbers, unless uidNumber is an RDN. Will test this out tomorrow. -- Brian May <brian@microcomaustralia.com.au>

On 23 November 2012 09:36, Peter Ross <Peter.Ross@bogen.in-berlin.de> wrote:
thought about using MySQL as the LDAP backend.
That is also an interesting idea. Anyone know how well it works in practice? E.g. is performance acceptable? Is it scalable? I have a vague recollection of investigating something similar some time ago, and finding it was poorly supported. Maybe that has changed now... -- Brian May <brian@microcomaustralia.com.au>

On 22 November 2012 12:03, Russell Coker <russell@coker.com.au> wrote:
It seems to me that there are two ways of avoiding race conditions, one is to create the object and then search for other objects with the same UID. The other is to include the UID but not the user-name in the dn, as the dn MUST be unique an attempt to add a second object with the same UID will fail at the LDAP protocol level.
As we are talking about LDAP I assume you mean uidNumber here. uid is the LDAP field for the user's login name (was known as userid in X.500). De facto standard practice is to make uid= an RDN value (i.e. include it in the DN), for precisely this reason, not the uidNumber. Another standard I have seen is to use cn= in the RDN. As far as I can tell, after a very quick glance, none of the standards, e.g. http://www.ietf.org/rfc/rfc2253.txt, care about what value you use in the RDN, although I may have missed something. So, yes, having uidNumber as an RDN might be OK, however (a) this ideally needs to be done before the database is created, or you end up with inconsistent DNs (not that this really matters) and (b) you lose the ability to keep the uid unique. -- Brian May <brian@microcomaustralia.com.au>
participants (7)
- Andrew Spiers
- Brian May
- Jason White
- Matthew Cengia
- Peter Ross
- Russell Coker
- Trent W. Buck