* NFS client pNFS handling of NFS4ERR_NOSPC
From: Rick Macklem @ 2021-11-07  0:03 UTC (permalink / raw)
  To: linux-nfs

Hi,

I ran a simple test using a Linux 5.12 client NFSv4.2 mount
against a FreeBSD pNFS server, where the DS is out of space
(intentionally, by creating a large file on it).

I tried to write a file on the Linux NFS client mount and the
mount point gets "stuck" (neither <ctrl>C nor "umount -f" works).
--> The client is attempting writes against the DS repeatedly,
       with the DS replying NFS4ERR_NOSPC. (Same byte offsets,
       over and over and over again.)
--> The client is repeatedly sending RPCs with LayoutError in
       them to the MDS, reporting the NFS4ERR_NOSPC.

I'll leave it up to others, but failing the program trying to
write the file with ENOSPC would seem preferable to the
"stuck" mount?
--> Removing the large file from the DS so that the Writes
      can succeed does cause the client to recover.

rick


* Re: NFS client pNFS handling of NFS4ERR_NOSPC
From: Trond Myklebust @ 2021-11-07  0:16 UTC (permalink / raw)
  To: linux-nfs, rmacklem

On Sun, 2021-11-07 at 00:03 +0000, Rick Macklem wrote:
> Hi,
> 
> I ran a simple test using a Linux 5.12 client NFSv4.2 mount
> against a FreeBSD pNFS server, where the DS is out of space
> (intentionally, by creating a large file on it).
> 
> I tried to write a file on the Linux NFS client mount and the
> mount point gets "stuck" (will not <ctrl>C nor "umount -f").
> --> The client is attempting writes against the DS repeatedly,
>        with the DS replying NFS4ERR_NOSPC. (Same byte offsets,
>        over and over and over again.)
> --> The client is repeatedly sending RPCs with LayoutError in
>        them to the MDS, reporting the NFS4ERR_NOSPC.
> 
> I'll leave it up to others, but failing the program trying to
> write the file with ENOSPC would seem preferable to the
> "stuck" mount?
> --> Removing the large file from the DS so that the Writes
>       can succeed does cause the client to recover.
> 

The client expectation is that the MDS will either remedy the
situation, or it will return an appropriate application-level error to
the LAYOUTGET.

What we do not expect is for the client to have to handle DS level
errors.

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com




* Re: NFS client pNFS handling of NFS4ERR_NOSPC
From: Rick Macklem @ 2021-11-08  2:27 UTC (permalink / raw)
  To: Trond Myklebust, linux-nfs

Trond Myklebust wrote:
> On Sun, 2021-11-07 at 00:03 +0000, Rick Macklem wrote:
> > Hi,
> >
> > I ran a simple test using a Linux 5.12 client NFSv4.2 mount
> > against a FreeBSD pNFS server, where the DS is out of space
> > (intentionally, by creating a large file on it).
> >
> > I tried to write a file on the Linux NFS client mount and the
> > mount point gets "stuck" (will not <ctrl>C nor "umount -f").
> > --> The client is attempting writes against the DS repeatedly,
> >        with the DS replying NFS4ERR_NOSPC. (Same byte offsets,
> >        over and over and over again.)
> > --> The client is repeatedly sending RPCs with LayoutError in
> >        them to the MDS, reporting the NFS4ERR_NOSPC.
> >
> > I'll leave it up to others, but failing the program trying to
> > write the file with ENOSPC would seem preferable to the
> > "stuck" mount?
> > --> Removing the large file from the DS so that the Writes
> >       can succeed does cause the client to recover.
> >
> 
> The client expectation is that the MDS will either remedy the
> situation, or it will return an appropriate application-level error to
> the LAYOUTGET.
Thanks Trond, that worked fine for NFSv4.2. I tweaked the pNFS server
to reply NFS4ERR_NOSPC to LayoutGet and that worked fine.
(This is triggered by the LayoutError.)
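
The server-side tweak described above could be sketched roughly as follows. This is only an illustration, not the FreeBSD pNFS server code: the class name, the per-DS dictionary, and the `hold_seconds` expiry are all assumptions made for the example. The one protocol fact used is that NFS4ERR_NOSPC has the value 28 in the NFSv4 error table.

```python
import time

NFS4_OK = 0
NFS4ERR_NOSPC = 28  # NFSv4 status code for "no space left on device"


class MdsLayoutPolicy:
    """Hypothetical MDS bookkeeping: remember NOSPC reports per DS."""

    def __init__(self, hold_seconds=30.0, clock=time.monotonic):
        # hold_seconds is an assumed expiry so the DS is retried eventually.
        self.hold_seconds = hold_seconds
        self.clock = clock
        self.full_until = {}  # DS id -> time until which it is treated as full

    def on_layouterror(self, ds_id, status):
        """Record a client LayoutError report against a DS."""
        if status == NFS4ERR_NOSPC:
            self.full_until[ds_id] = self.clock() + self.hold_seconds

    def on_layoutget(self, ds_id):
        """Return the status the MDS would put in its LayoutGet reply."""
        deadline = self.full_until.get(ds_id)
        if deadline is not None and self.clock() < deadline:
            return NFS4ERR_NOSPC
        # The report has expired (or never existed): hand out layouts again.
        self.full_until.pop(ds_id, None)
        return NFS4_OK
```

With this shape, the first LayoutGet after a LayoutError carrying NFS4ERR_NOSPC fails with the same error, which is what lets the client pass ENOSPC up to the application instead of retrying the DS writes.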

For NFSv4.1, things don't work as well, since there is no LayoutError
operation. The LayoutReturn has the NFS4ERR_NOSPC error in it,
but that doesn't happen until it finishes (which doesn't happen until
I free up space on the DS).
But I can live with only 4.2 working well. I can't be bothered endlessly
probing the DSs to see if they are out of space.

rick


> What we do not expect is for the client to have to handle DS level
> errors.




* Re: NFS client pNFS handling of NFS4ERR_NOSPC
From: Trond Myklebust @ 2021-11-08  2:41 UTC (permalink / raw)
  To: linux-nfs, rmacklem

On Mon, 2021-11-08 at 02:27 +0000, Rick Macklem wrote:
> Trond Myklebust wrote:
> > On Sun, 2021-11-07 at 00:03 +0000, Rick Macklem wrote:
> > > Hi,
> > > 
> > > I ran a simple test using a Linux 5.12 client NFSv4.2 mount
> > > against a FreeBSD pNFS server, where the DS is out of space
> > > (intentionally, by creating a large file on it).
> > > 
> > > I tried to write a file on the Linux NFS client mount and the
> > > mount point gets "stuck" (will not <ctrl>C nor "umount -f").
> > > --> The client is attempting writes against the DS repeatedly,
> > >        with the DS replying NFS4ERR_NOSPC. (Same byte offsets,
> > >        over and over and over again.)
> > > --> The client is repeatedly sending RPCs with LayoutError in
> > >        them to the MDS, reporting the NFS4ERR_NOSPC.
> > > 
> > > I'll leave it up to others, but failing the program trying to
> > > write the file with ENOSPC would seem preferable to the
> > > "stuck" mount?
> > > --> Removing the large file from the DS so that the Writes
> > >       can succeed does cause the client to recover.
> > > 
> > 
> > The client expectation is that the MDS will either remedy the
> > situation, or it will return an appropriate application-level error
> > to
> > the LAYOUTGET.
> Thanks Trond, that worked fine for NFSv4.2. I tweaked the pNFS server
> to reply NFS4ERR_NOSPC to LayoutGet and that worked fine.
> (This is triggered by the LayoutError.)
> 
> For NFSv4.1, things don't work as well, since there is no LayoutError
> operation. The LayoutReturn has the NFS4ERR_NOSPC error in it,
> but that doesn't happen until it finishes (which doesn't happen until
> I free up space on the DS).

Hmm... The ENOSPC error from the DS should in principle be marking the
layout for return. You're saying that the return isn't happening?

Does a newer client fix the issue?

> But I can live with only 4.2 working well. I can't be bothered
> endlessly
> probing the DSs to see if they are out of space.

Agreed. Your server should be able to rely on the layout error reports
from the client (either in LAYOUTERROR or in the LAYOUTRETURN) in order
to figure out when the DS might be out of space.
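
As a rough sketch of that idea, the server could treat error reports from both operations the same way. The `(op, ds_id, status)` tuple layout and the function name below are invented for illustration; a real server would decode these fields from the layout-type-specific error entries in the request.

```python
NFS4ERR_NOSPC = 28  # NFSv4 status code for "no space left on device"


def data_servers_reported_full(reports):
    """Return the set of DS ids with at least one NFS4ERR_NOSPC report.

    `reports` is an iterable of (op, ds_id, status) tuples, a made-up
    representation of the error entries carried by either operation.
    """
    accepted_ops = {"LAYOUTERROR", "LAYOUTRETURN"}
    return {
        ds_id
        for op, ds_id, status in reports
        if op in accepted_ops and status == NFS4ERR_NOSPC
    }
```

Accepting both sources matters because NFSv4.1 has no LayoutError operation, so on 4.1 mounts the LayoutReturn is the only place the report can arrive.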

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com




* Re: NFS client pNFS handling of NFS4ERR_NOSPC
From: Rick Macklem @ 2021-11-08 16:26 UTC (permalink / raw)
  To: Trond Myklebust, linux-nfs

Trond Myklebust wrote:
> On Mon, 2021-11-08 at 02:27 +0000, Rick Macklem wrote:
> > Trond Myklebust wrote:
> > > On Sun, 2021-11-07 at 00:03 +0000, Rick Macklem wrote:
> > > > Hi,
> > > >
> > > > I ran a simple test using a Linux 5.12 client NFSv4.2 mount
> > > > against a FreeBSD pNFS server, where the DS is out of space
> > > > (intentionally, by creating a large file on it).
> > > >
> > > > I tried to write a file on the Linux NFS client mount and the
> > > > mount point gets "stuck" (will not <ctrl>C nor "umount -f").
> > > > --> The client is attempting writes against the DS repeatedly,
> > > >        with the DS replying NFS4ERR_NOSPC. (Same byte offsets,
> > > >        over and over and over again.)
> > > > --> The client is repeatedly sending RPCs with LayoutError in
> > > >        them to the MDS, reporting the NFS4ERR_NOSPC.
> > > >
> > > > I'll leave it up to others, but failing the program trying to
> > > > write the file with ENOSPC would seem preferable to the
> > > > "stuck" mount?
> > > > --> Removing the large file from the DS so that the Writes
> > > >       can succeed does cause the client to recover.
> > > >
> > >
> > > The client expectation is that the MDS will either remedy the
> > > situation, or it will return an appropriate application-level error
> > > to
> > > the LAYOUTGET.
> > Thanks Trond, that worked fine for NFSv4.2. I tweaked the pNFS server
> > to reply NFS4ERR_NOSPC to LayoutGet and that worked fine.
> > (This is triggered by the LayoutError.)
> >
> > For NFSv4.1, things don't work as well, since there is no LayoutError
> > operation. The LayoutReturn has the NFS4ERR_NOSPC error in it,
> > but that doesn't happen until it finishes (which doesn't happen until
> > I free up space on the DS).
>
> Hmm... The ENOSPC error from the DS should in principle be marking the
> layout for return. You're saying that the return isn't happening?
Not until the end, after I have deleted the large file, so there is space on the
DS for the writes. It is in the same compound as Close.
The packet capture is here, in case you are interested:
https://people.freebsd.org/~rmacklem/linux-ds-out-of-space.pcap
(Taken at the MDS, so it doesn't show the DS RPCs, but they're just
 a lot of writes that fail with NFS4ERR_NOSPC until near the end.)

If you look, you'll see it gets a layout for the entire file first,
then it repeatedly does LayoutGets that are a little weird.
- For 4K only, but always for an offset that is an exact multiple
   of 1 Mbyte.
--> Then, once I free up space on the DS, it does the compound
      that includes both Close and LayoutReturn (which has the
      NFS4ERR_NOSPC error report in it).
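
For anyone going through the capture, the repeated LayoutGet pattern described above can be checked with a small helper like the one below. It assumes that "4K" means 4096 bytes and "1 Mbyte" means 2**20 bytes, and that the (offset, length) pairs have already been pulled out of the capture by some other means; the function name is made up.

```python
FOUR_K = 4096       # assumed meaning of "4K"
ONE_MBYTE = 1 << 20  # assumed meaning of "1 Mbyte"


def matches_observed_pattern(offset, length):
    """True if a LayoutGet range is 4K long and starts on a 1 Mbyte boundary."""
    return length == FOUR_K and offset % ONE_MBYTE == 0


def odd_layoutgets(ranges):
    """Return the (offset, length) pairs that do NOT fit the pattern."""
    return [(off, ln) for off, ln in ranges if not matches_observed_pattern(off, ln)]
```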

> Does a newer client fix the issue?
This was 5.12. I'll build/test a newer kernel in the next couple of
days and report back (it's an old single core i386, so it takes a while;-).

rick

> > But I can live with only 4.2 working well. I can't be bothered
> > endlessly
> > probing the DSs to see if they are out of space.
>
> Agreed. Your server should be able to rely on the layout error reports
> from the client (either in LAYOUTERROR or in the LAYOUTRETURN) in order
> to figure out when the DS might be out of space.




* Re: NFS client pNFS handling of NFS4ERR_NOSPC
From: Rick Macklem @ 2021-11-09 23:03 UTC (permalink / raw)
  To: Trond Myklebust, linux-nfs

Rick Macklem wrote:
> Trond Myklebust wrote:
> > On Mon, 2021-11-08 at 02:27 +0000, Rick Macklem wrote:
> > > Trond Myklebust wrote:
> > > > On Sun, 2021-11-07 at 00:03 +0000, Rick Macklem wrote:
> > > > > Hi,
> > > > >
> > > > > I ran a simple test using a Linux 5.12 client NFSv4.2 mount
> > > > > against a FreeBSD pNFS server, where the DS is out of space
> > > > > (intentionally, by creating a large file on it).
> > > > >
> > > > > I tried to write a file on the Linux NFS client mount and the
> > > > > mount point gets "stuck" (will not <ctrl>C nor "umount -f").
> > > > > --> The client is attempting writes against the DS repeatedly,
> > > > >        with the DS replying NFS4ERR_NOSPC. (Same byte offsets,
> > > > >        over and over and over again.)
> > > > > --> The client is repeatedly sending RPCs with LayoutError in
> > > > >        them to the MDS, reporting the NFS4ERR_NOSPC.
> > > > >
> > > > > I'll leave it up to others, but failing the program trying to
> > > > > write the file with ENOSPC would seem preferable to the
> > > > > "stuck" mount?
> > > > > --> Removing the large file from the DS so that the Writes
> > > > >       can succeed does cause the client to recover.
> > > > >
> > > >
> > > > The client expectation is that the MDS will either remedy the
> > > > situation, or it will return an appropriate application-level error
> > > > to
> > > > the LAYOUTGET.
> > > Thanks Trond, that worked fine for NFSv4.2. I tweaked the pNFS server
> > > to reply NFS4ERR_NOSPC to LayoutGet and that worked fine.
> > > (This is triggered by the LayoutError.)
> > >
> > > For NFSv4.1, things don't work as well, since there is no LayoutError
> > > operation. The LayoutReturn has the NFS4ERR_NOSPC error in it,
> > > but that doesn't happen until it finishes (which doesn't happen until
> > > I free up space on the DS).
> >
> > Hmm... The ENOSPC error from the DS should in principle be marking the
> > layout for return. You're saying that the return isn't happening?
> Not until the end, after I have deleted the large file, so there is space on the
> DS for the writes. It is in the same compound as Close.
> The packet capture is here, in case you are interested:
> https://people.freebsd.org/~rmacklem/linux-ds-out-of-space.pcap
> (Taken at the MDS, so it doesn't show the DS RPCs, but they're just
>  a lot of writes that fail with NFS4ERR_NOSPC until near the end.)
> 
> If you look, you'll see it gets a layout for the entire file first,
> then it repeatedly does LayoutGets that are a little weird.
> - For 4K only, but always on for an offset that is an exact multiple
>    of 1Mbyte.
> --> Then, once I free up space on the DS, it does the compound
>       that includes both Close and LayoutReturn (which has the
>       NFS4ERR_NOSPC error report in it).
>
> > Does a newer client fix the issue?
> This was 5.12. I'll build/test a newer kernel in the next couple of
> days and report back (it's an old single core i386, so it takes a while;-).
5.15.1 exhibits the same behaviour. The only difference is that LayoutReturn
was in a separate RPC from Close, but still didn't happen until the
end, after I freed up space on the DS and the writes to the DS
succeeded. (This time I had delegations enabled, which might be why
the LayoutReturn wasn't in the same compound RPC as Close?)

rick

> rick
>
> > > But I can live with only 4.2 working well. I can't be bothered
> > > endlessly
> > > probing the DSs to see if they are out of space.
>
> > Agreed. Your server should be able to rely on the layout error reports
> > from the client (either in LAYOUTERROR or in the LAYOUTRETURN) in order
> > to figure out when the DS might be out of space.




