Wednesday, March 09, 2005

What's the deal on the 16 group id limitation in NFS?

So the executive summary here is:
Now I'll provide the deeper explanation for why.

NFS is built on ONC RPC (Sun RPC). NFS depends on RPC for authentication and identification of users. Most NFS deployments use an RPC authentication flavor called AUTH_SYS (originally called AUTH_UNIX, but renamed to AUTH_SYS).

AUTH_SYS sends 3 important things:
  • A 32 bit numeric user identifier (what you'd see in the UNIX /etc/passwd file)
  • A 32 bit primary numeric group identifier (ditto)
  • A variable length list of up to 16 32-bit numeric supplemental group identifiers (what you'd see in the /etc/group file)
So the 16 group id limit actually refers to the supplemental group identifiers, and it is specific to AUTH_SYS, not NFS. It is just that NFS (i.e. Not For Security :) has historically been deployed with AUTH_SYS. It doesn't help either that most, if not all, NFS clients and servers use AUTH_SYS by default, even if they support better forms of authentication like AUTH_DH (AUTH_DES) or RPCSEC_GSS (both AUTH_DH and RPCSEC_GSS rely on cryptography to authenticate users).
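
To make the wire format concrete, here is a minimal Python sketch of how an AUTH_SYS credential body could be laid out in XDR. The uid, gid and supplemental gid list are the three items above; the stamp and machine name fields come from the ONC RPC specification rather than from this post, and the function names are made up for illustration. It is a sketch, not a full RPC client.

```python
import struct

MAX_AUTH_SYS_GIDS = 16  # the AUTH_SYS cap on supplemental group ids


def xdr_opaque(data: bytes) -> bytes:
    """XDR variable-length item: 4-byte length, the data, zero padding to a multiple of 4."""
    pad = (4 - len(data) % 4) % 4
    return struct.pack(">I", len(data)) + data + b"\x00" * pad


def encode_auth_sys(stamp, machine, uid, gid, gids):
    """Encode an AUTH_SYS credential body (big-endian, 4-byte XDR units)."""
    if len(gids) > MAX_AUTH_SYS_GIDS:
        # A conforming client must not send more than 16 supplemental gids.
        raise ValueError("AUTH_SYS allows at most 16 supplemental gids")
    body = struct.pack(">I", stamp)               # arbitrary client-chosen stamp
    body += xdr_opaque(machine.encode("ascii"))   # client machine name
    body += struct.pack(">II", uid, gid)          # user id and primary group id
    body += struct.pack(">I", len(gids))          # count of supplemental gids
    body += struct.pack(f">{len(gids)}I", *gids)  # the supplemental gids themselves
    return body


cred = encode_auth_sys(stamp=0, machine="client1", uid=1000, gid=100, gids=[10, 20, 30])
print(len(cred))  # 40
```

Note that nothing in the encoding itself stops at 16; the cap is a rule of the flavor, which is why the check has to be explicit.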

It turns out that with 800 (someday I'll talk about why that limit is there) available bytes of authentication stuff in the variable length ONC RPC header for credentials and verifier, we could actually support nearly 200 supplemental group identifiers. So why don't NFS clients and servers do that?
  • The standard (yes, AUTH_SYS is part of an IETF standard) says 16. An NFS client that sends more is breaking the standard, and if it did send more, and the server rejected it (per the standard), what would the client do? It would have to truncate the number of supplemental group identifiers. Which 16 would it pick?
  • An NFS server could be forgiving and accept more than 16 supplemental group identifiers, but that then begs the question as to which client is going to send more given the first bullet item.
So why does the standard limit us to 16 group identifiers? The value 16 is a reflection of what UNIX operating systems supported at the time (the 1980s). Indeed, when Sun owned and controlled ONC RPC (before graciously giving IETF control), my foggy recollection (and I'm really dating myself here) is that AUTH_SYS started off with 8, then went to 12, and finally settled on 16 supplemental group identifiers. Since then, most AUTH_SYS clients and servers live in operating environments and file systems that support at least 32 supplemental group identifiers. Which is great if you don't have to use NFS to access data. Even if an NFS client's operating environment supports more than 16 supplemental groups, in every case I know of, the NFS client will refuse to violate the AUTH_SYS standard and so it will not send more than 16 supplemental groups. Some clients will truncate the number of supplemental groups to 16, and others will simply refuse to issue the NFS/AUTH_SYS request. So even if an NFS server wanted to be forgiving and accept AUTH_SYS requests that had more than 16 supplemental groups, this would be in vain.
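
As a rough check on the "nearly 200" figure mentioned earlier, the arithmetic can be sketched in Python. This assumes the whole 800-byte budget is available to an AUTH_SYS-style credential, and that the fixed fields are a stamp, a machine name, a uid, a gid and a gid count; the function name and defaults are illustrative only.

```python
XDR_UNIT = 4  # every XDR item occupies a multiple of 4 bytes


def max_supplemental_gids(budget_bytes=800, machine_name_len=0):
    """How many 32-bit gids fit in an AUTH_SYS-style credential of budget_bytes."""
    # Machine name bytes are padded up to the next multiple of 4.
    padded_name = -(-machine_name_len // XDR_UNIT) * XDR_UNIT
    # stamp + name length word + uid + gid + gid count, 4 bytes each.
    fixed = 5 * XDR_UNIT + padded_name
    return max(0, (budget_bytes - fixed) // XDR_UNIT)


print(max_supplemental_gids())         # 195
print(max_supplemental_gids(800, 16))  # 191
```

A smaller budget can be passed in to see how quickly the headroom shrinks.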

So how do we get out of this?
  1. One possible answer is to create an RPC authentication flavor like AUTH_SYS but with no limit on the number of group identifiers. The trouble is, AUTH_SYS is really bad. It isn't rocket science to exploit it. The 'Net is a much more dangerous place today than in the 1980s, and so it would be unethical if IETF published an AUTH_SYS_PLUS standard. In theory, nothing prevents someone from asking IANA for a new ONC RPC flavor number, and building their own authentication flavor that does just that, and publishing it. But I think it would be unethical for vendors of NFS software to support it. But the free market often trumps ethics so we'll see if any vendor cracks first. And gee, why stop at ~200 group identifiers? Just ignore the 800 byte limitation in the ONC RPC header, and send as many as the client wants. But as we will see later, supporting nearly 200 supplemental group identifiers has other issues beyond ONC RPC and NFS.
  2. Another way is to use a flavor like RPCSEC_GSS which doesn't send group identifiers. Instead, it lets the NFS server decide what groups the user is in (server determining access controls; what a novel concept!) based on the local /etc/group file or group tables in NIS or LDAP. Since there is no group id array in the RPC message, only internal NFS server limitations get in the way. NetApp's ONTAP server for example supports 32 supplemental group identifiers. Last I checked Solaris was either unlimited or up to 64, but it was subject to a tunable parameter. A side benefit of RPCSEC_GSS is that, if used over something like Kerberos V5 or public key certificates, it gives you true authentication.
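
The server-side lookup described in the second option can be sketched as follows. This is only an illustration: the group table is a hypothetical in-memory stand-in for /etc/group (or a NIS or LDAP group map), and groups_for_user is a made-up name, not part of any NFS server.

```python
def groups_for_user(user, primary_gid, group_table):
    """Return the sorted gids a server would derive for `user` from its own table.

    group_table maps group name -> (gid, [member names]), like /etc/group.
    """
    gids = {primary_gid}
    for gid, members in group_table.values():
        if user in members:
            gids.add(gid)
    return sorted(gids)


# Hypothetical group table held by the server, not sent by the client.
table = {
    "staff": (50, ["alice", "bob"]),
    "eng": (60, ["alice"]),
    "ops": (70, ["bob"]),
}
print(groups_for_user("alice", 100, table))  # [50, 60, 100]
```

The point is that the client supplies only an authenticated identity; the list of groups, however long, never crosses the wire. On a real UNIX system, Python's os.getgrouplist(user, gid) performs a comparable lookup against whatever group sources the host is configured to use.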
Does RPCSEC_GSS completely get you out of the 16 group id tangle? Not quite. As my colleague Chuck Lever pointed out to me recently, there is this side band protocol called NLM used for advisory byte range locking. I've seen just one NLM client use RPCSEC_GSS, and it wasn't Linux or Solaris. And not all NLM servers support RPCSEC_GSS. Practically speaking, this means that you have to either not use locking (for example use the llock mount option in Solaris, or use the nolock option in Linux), or you'll have to use NFS version 4.

NFS version 4 combines locking and filing (and mounting) in one single protocol. So use NFS version 4 with RPCSEC_GSS to blast past the 16 group identifier limitation.

Some caveats:
  • Most people use a directory service like NIS or LDAP to store their supplemental group identifier information. If you establish more than 16 supplemental groups in NIS or LDAP for your users, you'll want to make sure that all your other NFS clients support NFSv4 and support RPCSEC_GSS, and of course are configured to use Kerberos V5.
  • For a similar reason, make sure your NFS clients can support more than 16 group identifiers per user. When a user logs into his desktop system, the operating system will establish his credentials. If the user is in more than 16 groups, he may well be denied login access if his home directory is NFS mounted.
So you might ask, this is great but why am I limited to 32 or 64 group identifiers? The reason relates to how operating systems set up their in-kernel credentials. Usually the supplemental group identifiers are a simple array of integers. This means that each access attempt can require searching the entire array of integers. This is one thing if the array is 16-64 group identifiers, but get into 100s to 1000s or more, and the performance impact of that many group identifiers might start to get in the way. An answer might be to organize in-kernel credentials as hash tables or trees, but this has costs too. Not to mention that as each in-kernel credential gets bigger, the impact on kernel memory usage, which takes away from applications, becomes important.
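
To illustrate the trade-off, here is a small Python sketch contrasting a linear scan of an unsorted gid array with a binary search over a sorted one. It is not kernel code; the function names are invented, and the sorted variant assumes the credential pays the cost of sorting once when it is set up.

```python
from bisect import bisect_left


def in_group_linear(gids, target):
    """Scan every entry: cost grows in proportion to the number of groups."""
    for g in gids:
        if g == target:
            return True
    return False


def in_group_sorted(sorted_gids, target):
    """Binary search: cost grows with log2 of the number of groups."""
    i = bisect_left(sorted_gids, target)
    return i < len(sorted_gids) and sorted_gids[i] == target


gids = [4000, 12, 977, 31, 20500]
sorted_gids = sorted(gids)  # one-time cost when the credential is built
print(in_group_linear(gids, 977), in_group_sorted(sorted_gids, 977))  # True True
print(in_group_linear(gids, 5), in_group_sorted(sorted_gids, 5))      # False False
```

Either way the memory cost per credential stays at one integer per group, which is the other half of the concern above.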

Another approach to consider is ACLs. NFSv4 has them. An ACL (Access Control List) is a list of ACEs (Access Control Entries). In NFSv4 an ACE is basically:
  • user name or group name
  • permission bits
  • whether the named user or group is being denied or allowed access
How does this solve the problem that having lots of groups solves? For a given file, you can list a bunch of users that are allowed access, and there is no over-the-network specification that limits how many user ACEs you can have in an ACL. The limits are purely on the server. So for a given set of files, you can let lots of users and lots of different sets of users access each file. Compare that to what lots of supplemental groups do for you. Each file has a single group id assigned to it, and you can then assign a lot of users to the group id in /etc/group or the group table in NIS or LDAP. You can assign a different group id to each file. So for a set of files, you can grant access to lots of users, and lots of different sets of users. Semantically the same.
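
Here is a simplified Python sketch of evaluating an ACL built from the three ACE fields listed above. It walks the ACEs in order and lets the first matching entry decide each requested permission bit, which is in the spirit of NFSv4 but leaves out most of the real protocol. The bit values, the is_group flag and all names are assumptions made for illustration.

```python
from dataclasses import dataclass

READ, WRITE, EXECUTE = 0x1, 0x2, 0x4  # illustrative bits, not the NFSv4 mask values


@dataclass
class Ace:
    who: str        # user name or group name
    is_group: bool  # hypothetical flag: does `who` name a group?
    mask: int       # permission bits this entry covers
    allow: bool     # True to allow, False to deny


def access_allowed(acl, user, user_groups, wanted):
    """Return True only if every bit in `wanted` is allowed before being denied."""
    undecided = wanted
    for ace in acl:
        matches = (ace.who in user_groups) if ace.is_group else (ace.who == user)
        if not matches:
            continue
        hit = ace.mask & undecided
        if hit and not ace.allow:
            return False          # a still-undecided bit was denied
        undecided &= ~hit         # these bits are now settled as allowed
        if undecided == 0:
            return True
    return False                  # bits never explicitly allowed are denied


acl = [
    Ace("mallory", False, READ | WRITE, False),
    Ace("alice", False, READ | WRITE, True),
    Ace("eng", True, READ, True),
]
print(access_allowed(acl, "alice", {"eng"}, READ | WRITE))  # True
print(access_allowed(acl, "bob", {"eng"}, WRITE))           # False
```

Adding another user to the file is one more Ace in the list, with no change to anyone's group membership.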

So what ACLs do for the NFS community is make extended access purely a server problem in terms of flexibility and performance. Of course, there needs to be a way to edit the ACLs on a given file, which is what NFSv4 does for you.

15 Comments:

Anonymous Anonymous said...

I wrote up the Solaris side of this when OpenSolaris was launched. Check my blog.

The Sun bug number is 4088757 (Customer would like to increase ngroups_max more than 32) though it's currently marked as an RFE (request for enhancement).

-- Peter

Tuesday, March 07, 2006 1:31:00 AM  
Anonymous Anonymous said...

For Linux NFSv2/v3 clients it is possible to bypass the 16 groups limitation without breaking the NFS/RPC protocol. It requires a kernel patch available from http://www.frankvm.com/nfs-ngroups/

Thursday, January 04, 2007 1:09:00 PM  
Blogger John Jarocki said...

You say, "So use NFS version 4 with RPCSEC_GSS to blast past the 16 group identifier limitation." My question is: Will doing so get us to at least 32 groups per user (or are we still tied to individual client and server implementations)? Also, does using ACLs and NFSv4 automatically get rid of any 16, 32, or other group count limit (since ACLs are completely different beasts)? Finally, does RPCSEC_GSS + NFSv3 get us past the 16 group limit except for the NLM issue you mentioned?
Thanks in advance!
--john

Tuesday, March 13, 2007 3:57:00 PM  
Blogger Mike Eisler said...

> You say, "So use NFS version 4 with RPCSEC_GSS to blast past the 16 group identifier limitation." My question is: Will doing so get us to at least 32 groups per user (or are we still tied to individual client and server implementations)?

The use of more than 16 groups is certainly going to depend on the client and server.

> Also, does using ACLs and NFSv4 automatically get rid of any 16, 32, or other group count limit (since ACLs are completely different beasts)?

Using ACLs doesn't change any group count limits. What ACLs do is make group count limits moot: instead of putting a user in a zillion groups, you add a zillion users (and/or groups) to a file's ACL.

> Finally, does RPCSEC_GSS + NFSv3 get us past the 16 group limit except for the NLM issue you mentioned?
Thanks in advance!

Yes.

Tuesday, March 13, 2007 6:17:00 PM  
Blogger David said...

Uff, thank you so much for this article! I just spent around 5h of debugging why I couldn't get access to a particular directory even though I was in the correct group, and it turns out it was the 16 groups problem!

I can't believe this isn't put visibly into more tutorials and documents - my Debian system is set up with so many groups that I just added my user to all of them that seemed useful -- which was about 18 of them!

Saturday, December 01, 2007 2:35:00 AM  
Blogger Unknown said...

> So you might ask, this is great but why am I limited to 32 or 64 group identifiers? The reason relates to how operating systems set up their in-kernel credentials. Usually the supplemental group identifiers are a simple array of integers. This means that each access attempt can require searching the entire array of integers. This is one thing if the array is 16-64 group identifiers, but get into 100s to 1000s or more, and the performance impact of that many group identifiers might start to get in the way. An answer might be to organize in-kernels as hash tables or trees, but this has costs too. Not to mention that as each in-kernel credential gets bigger the impact on kernel memory usage, which takes away from applications, becomes important.

This is probably a stupid question, but why couldn't kernels store supplemental group permissions for processes in bit fields? The maximum number of groups could be a kernel parameter, settable at boot time maybe, and it could default to, say, 128 or 256. Storing flags for 128 groups as bits would fit in a total of 16 bytes, and looking up an access permission would be something on the order of divide by 8 (usually a shift), a bit shift, and an "and". On many machines it should be faster than searching 16 integers. Of course I know nothing of the kernel code that would do this, so maybe I'm being hopelessly naive. I guess there would have to be some kind of mapping if group IDs are allowed to be 32 bits and sparse though.

I see what you mean about ACLs shifting the problem to the server, but I'm not so sure that's practical in all cases. Wouldn't searching a list of hundreds or thousands of users on the server likely be slow as well, and be a lot more work to maintain?

In any case, thanks much for the article!

Wednesday, December 03, 2008 11:23:00 PM  
Anonymous Anonymous said...

worth looking at the resolution of
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=493059
dated Nov 2008 ;)

Monday, March 16, 2009 8:11:00 PM  
