* Callback slot table overflowed
@ 2021-08-05  0:00 Timothy Pearson
  2021-08-05  0:04 ` Timothy Pearson
  0 siblings, 1 reply; 8+ messages in thread
From: Timothy Pearson @ 2021-08-05  0:00 UTC (permalink / raw)
  To: linux-nfs

All,

We've hit an odd issue after upgrading a main NFS server from Debian Stretch to Debian Buster.  The 5.13.4 kernel was used in both cases; however, after the upgrade none of our ARM thin clients can mount their root filesystems.  Early in the boot process, I/O errors are returned immediately following "Callback slot table overflowed" in the client dmesg.

I am unable to find any useful information on this "Callback slot table overflowed" message, and have no idea why it is only impacting our ARM (armel) clients.  Both 4.14 and 5.3 on the client side show the issue; other client kernel versions were not tested.

Curiously, increasing the rsize/wsize values to 65536 or higher reduces (but does not eliminate) the number of callback overflow messages.
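
For anyone reproducing this, the rsize/wsize values actually in effect can be read from the client's mount table instead of being assumed from the mount command, since the server may negotiate them down. A minimal Python sketch, assuming the usual /proc/mounts layout (device, mount point, type, comma-separated options); the helper name and the sample line are hypothetical and not taken from the affected systems:

```python
def nfs_io_sizes(mounts_line):
    """Return (rsize, wsize) from one /proc/mounts-style line; None where absent."""
    fields = mounts_line.split()
    # The fourth field (index 3) is the comma-separated option list.
    options = fields[3].split(",") if len(fields) > 3 else []
    sizes = {"rsize": None, "wsize": None}
    for opt in options:
        key, sep, value = opt.partition("=")
        if key in sizes and sep and value.isdigit():
            sizes[key] = int(value)
    return sizes["rsize"], sizes["wsize"]


# Hypothetical sample line; real lines come from /proc/mounts on the client.
sample = "server:/export/root / nfs4 rw,relatime,rsize=4096,wsize=4096,proto=tcp 0 0"
print(nfs_io_sizes(sample))
```

Comparing these values before and after changing the mount options shows whether a larger rsize/wsize was really applied.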

The server is a ppc64el 64k page host, and none of our ppc64el or amd64 thin clients are experiencing any problems.  Nothing of interest appears in the server message log.

Any troubleshooting hints would be most welcome.

Thank you!

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Callback slot table overflowed
  2021-08-05  0:00 Callback slot table overflowed Timothy Pearson
@ 2021-08-05  0:04 ` Timothy Pearson
  2021-08-05  0:37   ` Timothy Pearson
  0 siblings, 1 reply; 8+ messages in thread
From: Timothy Pearson @ 2021-08-05  0:04 UTC (permalink / raw)
  To: linux-nfs

Other information that may be helpful:

All clients are using TCP
arm64 clients are unaffected by the bug
The armel clients use very small (4k) rsize/wsize buffers
Prior to the upgrade from Debian Stretch, everything was working perfectly

* Re: Callback slot table overflowed
  2021-08-05  0:04 ` Timothy Pearson
@ 2021-08-05  0:37   ` Timothy Pearson
  2021-08-06 19:53     ` Olga Kornievskaia
  0 siblings, 1 reply; 8+ messages in thread
From: Timothy Pearson @ 2021-08-05  0:37 UTC (permalink / raw)
  To: linux-nfs

On further investigation, the working server had already been rolled back to 4.19.0.  Apparently the issue was insurmountable in 5.x.

It should be simple enough to set up a test environment out of production for 5.x, if you have any debug tips / would like to see any debug options compiled in.

Thanks!

* Re: Callback slot table overflowed
  2021-08-05  0:37   ` Timothy Pearson
@ 2021-08-06 19:53     ` Olga Kornievskaia
  2021-08-06 20:37       ` Timothy Pearson
  0 siblings, 1 reply; 8+ messages in thread
From: Olga Kornievskaia @ 2021-08-06 19:53 UTC (permalink / raw)
  To: Timothy Pearson; +Cc: linux-nfs

On Thu, Aug 5, 2021 at 12:15 AM Timothy Pearson
<tpearson@raptorengineering.com> wrote:
>
> On further investigation, the working server had already been rolled back to 4.19.0.  Apparently the issue was insurmountable in 5.x.
>
> It should be simple enough to set up a test environment out of production for 5.x, if you have any debug tips / would like to see any debug options compiled in.
>
> Thanks!

A network trace would be useful.

5.3 should have this patch: "SUNRPC: Fix up backchannel slot table
accounting". I believe "callback slot table overflowed" is hit when
the server sends more requests than the client can handle (i.e. the
client doesn't have a free slot to handle the request). A network trace
would show that. However, you said this happens when the client is
trying to mount, and besides cb_null requests I'm not sure what could
be happening.
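
The condition described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the kernel's SUNRPC code: the class and method names are made up for illustration, and the slot count is a plain parameter.

```python
class BackchannelSlots:
    """Toy model of a fixed pool of preallocated callback slots (illustrative only)."""

    def __init__(self, nslots):
        self.free = nslots
        self.overflows = 0

    def receive_callback(self):
        """A callback arrives: take a slot, or record an overflow if none is free."""
        if self.free == 0:
            self.overflows += 1  # where the client would log the overflow message
            return False
        self.free -= 1
        return True

    def reply_sent(self):
        """The client finished handling a callback; its slot is free again."""
        self.free += 1


slots = BackchannelSlots(nslots=1)
first = slots.receive_callback()    # takes the only slot
second = slots.receive_callback()   # arrives before the first reply: no free slot
slots.reply_sent()
third = slots.receive_callback()    # the slot is available again
print(first, second, third, slots.overflows)
```

In this model an overflow only occurs when the sender has more callbacks outstanding than the receiver has slots, which is what a network trace would confirm or rule out.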

* Re: Callback slot table overflowed
  2021-08-06 19:53     ` Olga Kornievskaia
@ 2021-08-06 20:37       ` Timothy Pearson
  0 siblings, 0 replies; 8+ messages in thread
From: Timothy Pearson @ 2021-08-06 20:37 UTC (permalink / raw)
  To: Olga Kornievskaia; +Cc: linux-nfs



----- Original Message -----
> From: "Olga Kornievskaia" <aglo@umich.edu>
> To: "Timothy Pearson" <tpearson@raptorengineering.com>
> Cc: "linux-nfs" <linux-nfs@vger.kernel.org>
> Sent: Friday, August 6, 2021 2:53:19 PM
> Subject: Re: Callback slot table overflowed

> 
> A network trace would be useful.
> 
> 5.3 should have this patch: "SUNRPC: Fix up backchannel slot table
> accounting". I believe "callback slot table overflowed" is hit when
> the server sends more requests than the client can handle (i.e. the
> client doesn't have a free slot to handle the request). A network trace
> would show that. However, you said this happens when the client is
> trying to mount, and besides cb_null requests I'm not sure what could
> be happening.

I'll work to get a network trace out of the test environment once it's set up.  I should, however, clarify that this happens immediately *after* mount, when the diskless ARM device is attempting to run early startup (i.e. reading /etc/init.d and such).

* Re: Callback slot table overflowed
  2011-01-20  3:07 Jim Rees
  2011-01-20 14:01 ` Halevy, Benny
@ 2011-01-20 15:19 ` Andy Adamson
  1 sibling, 0 replies; 8+ messages in thread
From: Andy Adamson @ 2011-01-20 15:19 UTC (permalink / raw)
  To: Jim Rees; +Cc: linux-nfs, peter honeyman

Examine the CREATE_SESSION that created the fore channel; the back channel limits are set there.  Also note that, since it is the back channel server (i.e. the NFS client) that sets the CREATE_SESSION back channel attributes, the back channel client (i.e. the NFS server) cannot increase any of the limits.
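
The rule above amounts to a clamp: whatever the side sending callbacks would like to use, the limits in effect never exceed what was offered in CREATE_SESSION. A minimal Python sketch of that idea, assuming a single limit (the maximum number of outstanding requests); the function name and the dictionary key are hypothetical and are not taken from the kernel or the protocol definition:

```python
def negotiate_backchannel(client_offer, server_wish):
    """Return the limits in effect: an offer may be lowered, never raised."""
    return {name: min(limit, server_wish.get(name, limit))
            for name, limit in client_offer.items()}


offer = {"max_requests": 1}    # what the NFS client says it can handle
wish = {"max_requests": 16}    # what the NFS server would like to use
print(negotiate_backchannel(offer, wish))
```

If the sender then keeps more requests outstanding than the negotiated value, the receiver has no slot for the extra ones.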

-->Andy

On Jan 19, 2011, at 10:07 PM, Jim Rees wrote:

> Is this bad?
> Jan 19 21:53:27 pdsi7 kernel: Callback slot table overflowed
> 
> All I can tell from the code is that xprt_alloc_bc_request() failed, and the
> slots are "preallocated during the backchannel setup".  Is there a hard
> limit on these slots?  Where is it set?

* RE: Callback slot table overflowed
  2011-01-20  3:07 Jim Rees
@ 2011-01-20 14:01 ` Halevy, Benny
  2011-01-20 15:19 ` Andy Adamson
  1 sibling, 0 replies; 8+ messages in thread
From: Halevy, Benny @ 2011-01-20 14:01 UTC (permalink / raw)
  To: Jim Rees, linux-nfs; +Cc: peter honeyman

This shouldn't happen, as the slot table size is negotiated on session
establishment.  Currently, the client supports NFS41_BC_MIN_CALLBACKS (1)
request(s).

Benny


-----Original Message-----
From: linux-nfs-owner@vger.kernel.org on behalf of Jim Rees
Sent: Thu 2011-01-20 05:07
To: linux-nfs@vger.kernel.org
Cc: peter honeyman
Subject: Callback slot table overflowed
 
Is this bad?
Jan 19 21:53:27 pdsi7 kernel: Callback slot table overflowed

All I can tell from the code is that xprt_alloc_bc_request() failed, and the
slots are "preallocated during the backchannel setup".  Is there a hard
limit on these slots?  Where is it set?

* Callback slot table overflowed
@ 2011-01-20  3:07 Jim Rees
  2011-01-20 14:01 ` Halevy, Benny
  2011-01-20 15:19 ` Andy Adamson
  0 siblings, 2 replies; 8+ messages in thread
From: Jim Rees @ 2011-01-20  3:07 UTC (permalink / raw)
  To: linux-nfs; +Cc: peter honeyman

Is this bad?
Jan 19 21:53:27 pdsi7 kernel: Callback slot table overflowed

All I can tell from the code is that xprt_alloc_bc_request() failed, and the
slots are "preallocated during the backchannel setup".  Is there a hard
limit on these slots?  Where is it set?
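
To judge whether the message is a one-off or a recurring condition, it helps to count how often it is logged per host. A minimal Python sketch, assuming the traditional syslog layout of the line quoted above (month, day, time, host, "kernel:", message); the function name is hypothetical and the second sample line is made up:

```python
from collections import Counter

MESSAGE = "Callback slot table overflowed"


def count_overflows(lines):
    """Count overflow messages per host in syslog-style lines."""
    counts = Counter()
    for line in lines:
        if MESSAGE not in line:
            continue
        fields = line.split()
        # Assumed layout: month, day, time, host, "kernel:", message...
        if len(fields) >= 5 and fields[4] == "kernel:":
            counts[fields[3]] += 1
    return counts


log = [
    "Jan 19 21:53:27 pdsi7 kernel: Callback slot table overflowed",
    "Jan 19 21:53:28 pdsi7 kernel: some unrelated message",  # made-up line
]
print(count_overflows(log))
```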
