* Kerberized NFS (v3 and v4) does not work if Kerberos token is >= 2048 bytes (if PAGE_SIZE == 4096)
@ 2010-08-04 14:38 Jonathan Manton
  2010-08-04 15:28 ` Andy Adamson
  0 siblings, 1 reply; 4+ messages in thread
From: Jonathan Manton @ 2010-08-04 14:38 UTC (permalink / raw)
  To: linux-nfs

Kerberized NFS does not work for users who have a Kerberos token that  
is 2048 bytes or larger (assuming PAGE_SIZE == 4096).  This can  
(easily) happen in enterprise environments that use Active Directory  
as their Kerberos source, due to the additional Privilege Attribute
Certificate (PAC) added by AD to the token.

When a user with a Kerberos token of 2048 bytes or larger attempts to  
access a filesystem mounted using Kerberized NFS, the NFS server locks  
up for 30 seconds, and ultimately the call fails.

The root cause of this problem is the interface between the sunrpc  
layer and the rpc.svcgssd daemon.  When an NFS NULL request comes in  
from a client to establish the initial security context, information  
is passed via the rpc cache mechanism through a named pipe  
(sunrpc_cache_pipe_upcall() in net/sunrpc/cache.c), to be consumed by  
the rpc.svcgssd daemon.  This results in the upcall data being  
formatted via rsi_request() in net/sunrpc/auth_gss/svcauth_gss.c.
rsi_request() uses the qword_addhex() routine (implemented in
net/sunrpc/cache.c) to encode the upcall data as ASCII for the named pipe.
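
To make the encoding step concrete, here is a rough userspace sketch of
appending one hex-encoded field to a bounded buffer. It is not the kernel
routine: add_hex_field() is a hypothetical name, and the framing (a "\x"
prefix, two hex digits per byte, a trailing space) is my understanding of
what qword_addhex() emits, so treat those details as assumptions.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Userspace sketch only (hypothetical helper, not kernel code).
 * Appends one field as "\x" + two hex digits per byte + ' ' at *bpp
 * and reduces *remaining by the characters used.  If the field does
 * not fit, nothing is written and *remaining is set to -1.
 */
static void add_hex_field(char **bpp, long *remaining,
                          const unsigned char *data, size_t len)
{
    static const char digits[] = "0123456789abcdef";
    char *bp;
    long need;

    assert(bpp != NULL && *bpp != NULL && remaining != NULL);
    assert(len == 0 || data != NULL);

    if (*remaining < 0)
        return;                       /* an earlier field already overflowed */

    need = 2 + 2 * (long)len + 1;     /* prefix + hex digits + separator */
    if (need > *remaining) {
        *remaining = -1;              /* flag overflow for the caller */
        return;
    }

    bp = *bpp;
    *bp++ = '\\';
    *bp++ = 'x';
    for (size_t i = 0; i < len; i++) {
        *bp++ = digits[data[i] >> 4];     /* high nibble */
        *bp++ = digits[data[i] & 0x0f];   /* low nibble */
    }
    *bp++ = ' ';

    *bpp = bp;
    *remaining -= need;
}
```

A caller would invoke this once per field and check for a negative
*remaining afterwards. The point is only that every raw byte costs two
characters of buffer space, plus a few characters of framing per field.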

The issue is that the upcall data is limited to PAGE_SIZE bytes (this  
buffer is allocated in sunrpc_cache_pipe_upcall).  On my kernel at  
least, this is 4096 bytes.  The upcall data is encoded as ASCII  
characters.  It takes two ASCII characters to encode each byte of  
upcall data, meaning that any token over 2047 bytes will fill the  
buffer and result in an error condition.
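
The arithmetic behind that limit can be written down as a small helper. This
is only a sketch: max_hex_payload() and its overhead parameter are
illustrative names, and the amount of framing overhead is an assumption. The
2047-byte figure corresponds to at least one character of overhead in a
4096-byte buffer; any prefixes, separators, or other fields sharing the
buffer would lower the practical limit further.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Largest raw payload, in bytes, that still fits after hex encoding
 * (two characters per byte) in a buffer of buf_size characters, of
 * which `overhead` characters are taken up by framing.
 */
static size_t max_hex_payload(size_t buf_size, size_t overhead)
{
    assert(overhead <= buf_size);
    return (buf_size - overhead) / 2;   /* integer division rounds down */
}
```

With buf_size = 4096 and one character of overhead this gives 2047, so a
2048-byte token is the first size that cannot fit.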

When that happens, sunrpc_cache_pipe_upcall() returns -EAGAIN, which
implies (according to Documentation/filesystems/nfs/rpc-cache.txt)
that the upcall is pending, even though sunrpc_cache_pipe_upcall() has
in fact freed the buffer and never added the request to the cache
request queue.

The result is that all nfsd kernel processes keep retrying the request
continuously for 30 seconds, each attempt trying and failing to
enqueue the upcall.
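
The failure mode can be modeled in a few lines of userspace C. This is a toy
model under stated assumptions, not the kernel logic: model_upcall() and
model_retry() are hypothetical names, MODEL_BUF_SIZE stands in for PAGE_SIZE,
and an attempt budget stands in for the 30-second window.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MODEL_BUF_SIZE 4096   /* stands in for PAGE_SIZE (assumption) */

/*
 * Toy stand-in for the upcall.  It "queues" the request only if the
 * hex-encoded token leaves room in the buffer.  On overflow it returns
 * -EAGAIN without queuing anything, so the caller cannot tell
 * "pending" apart from "can never succeed".
 */
static int model_upcall(size_t token_len, int *queued)
{
    assert(queued != NULL);
    if (2 * token_len >= MODEL_BUF_SIZE)
        return -EAGAIN;       /* nothing was queued */
    *queued = 1;
    return 0;
}

/*
 * Toy caller: retries for as long as it sees -EAGAIN, up to
 * max_attempts.  Returns the number of attempts made.
 */
static int model_retry(size_t token_len, int max_attempts, int *queued)
{
    int attempts = 0;

    assert(queued != NULL);
    *queued = 0;
    while (attempts < max_attempts) {
        attempts++;
        if (model_upcall(token_len, queued) != -EAGAIN)
            break;
    }
    return attempts;
}
```

In this model a small token is queued on the first attempt, while a
2048-byte token burns the entire attempt budget and is never queued.
Returning a distinct, non-retryable error for the overflow case would let
the caller give up immediately.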


* Re: Kerberized NFS (v3 and v4) does not work if Kerberos token is >= 2048 bytes (if PAGE_SIZE == 4096)
  2010-08-04 14:38 Kerberized NFS (v3 and v4) does not work if Kerberos token is >= 2048 bytes (if PAGE_SIZE == 4096) Jonathan Manton
@ 2010-08-04 15:28 ` Andy Adamson
  2010-08-04 15:45   ` Jim Rees
  0 siblings, 1 reply; 4+ messages in thread
From: Andy Adamson @ 2010-08-04 15:28 UTC (permalink / raw)
  To: Jonathan Manton; +Cc: linux-nfs

Yes, this limitation has been known for a long time. We ran into this same issue using X.509 certs and spkm3. I imagine PKINIT will also hit this limitation.

-->Andy



* Re: Kerberized NFS (v3 and v4) does not work if Kerberos token is >= 2048 bytes (if PAGE_SIZE == 4096)
  2010-08-04 15:28 ` Andy Adamson
@ 2010-08-04 15:45   ` Jim Rees
  2010-08-04 16:22     ` Jonathan Manton
  0 siblings, 1 reply; 4+ messages in thread
From: Jim Rees @ 2010-08-04 15:45 UTC (permalink / raw)
  To: Andy Adamson; +Cc: Jonathan Manton, linux-nfs

Andy Adamson wrote:

  > When a user with a Kerberos token of 2048 bytes or larger attempts to
  > access a filesystem mounted using Kerberized NFS, the NFS server locks up
  > for 30 seconds, and ultimately the call fails.

  Yes, this limitation has been known for a long time. We ran into this same
  issue using X.509 certs and spkm3. I imagine PKINIT will also hit this
  limitation.

But shouldn't it fail right away instead of locking up for 30 seconds?

Does the entire server lock up, or just that one rpc?

Can a malicious client use this as a DOS?  Does it require a valid ticket,
or will any ticket >= 2048 do?


* Re: Kerberized NFS (v3 and v4) does not work if Kerberos token is >= 2048 bytes (if PAGE_SIZE == 4096)
  2010-08-04 15:45   ` Jim Rees
@ 2010-08-04 16:22     ` Jonathan Manton
  0 siblings, 0 replies; 4+ messages in thread
From: Jonathan Manton @ 2010-08-04 16:22 UTC (permalink / raw)
  To: Jim Rees; +Cc: Andy Adamson, linux-nfs


On Aug 4, 2010, at 10:45 AM, Jim Rees wrote:

> Andy Adamson wrote:
>
>> When a user with a Kerberos token of 2048 bytes or larger attempts to
>> access a filesystem mounted using Kerberized NFS, the NFS server locks up
>> for 30 seconds, and ultimately the call fails.
>
>  Yes, this limitation has been known for a long time. We ran into this same
>  issue using X.509 certs and spkm3. I imagine PKINIT will also hit this
>  limitation.
>
> But shouldn't it fail right away instead of locking up for 30 seconds?
>

It seems to me that it should error out with a log message, rather  
than simply trying over and over again.

> Does the entire server lock up, or just that one rpc?
>

The concrete manifestation of this is that all of the NFS kernel  
processes run continuously.  So on a single-processor system, it takes  
100% of the CPU for those 30 seconds.  On a multiprocessor system (at  
least my RHEL system), the NFS kernel processes keep affinity with a  
CPU, so it just consumes one processor.  I have not tested if other  
NFS requests can be processed during that window on a multiprocessor  
system.  It does not really "lock up", but rather monopolizes the CPU  
with high-priority kernel threads.

Related to this, it was a real pain for me to debug, since setting any
of the rpcdebug flags for the rpc module simply flooded the logging
subsystem.  I had to put an ssleep() in svcauth_gss_handle_init() in
order to get debugging output I could use from rpcdebug.

> Can a malicious client use this as a DOS?

Yes.

> Does it require a valid ticket,
> or will any ticket >= 2048 do?

I believe that all of the validity checking of the token is done by
the rpc.svcgssd daemon that services the upcall, not in the sunrpc
kernel code.  I am a kernel newbie though, so I am not sure.
