* rpc_exit_task warning.
@ 2012-07-24 18:58 Taylan Develioglu
  2012-07-25 16:52 ` J. Bruce Fields
  0 siblings, 1 reply; 6+ messages in thread
From: Taylan Develioglu @ 2012-07-24 18:58 UTC
  To: linux-nfs

Hi,

We just deployed a new NFS server with about a hundred clients connected, and we are getting repeated kernel warnings on the server (trace below).

The clients run kernel 3.2.18 and the server runs 3.2.20. We do not use any security options.

I don't really have time to debug this, but I felt I should report it.

- Client
  ii  libevent-1.4-2                      1.4.13-stable-1
  ii  util-linux                          2.17.2-9
  ii  linux-image-3.2.0-0.bpo.2-amd64     3.2.18-1~bpo60+1
  ii  nfs-common                          1:1.2.2-4squeeze2

- Server
  ii  libevent-1.4-2                      1.4.13-stable-1
  ii  util-linux                          2.17.2-9
  ii  linux-image-3.2.0-0.bpo.2-amd64     3.2.20-1~bpo60+1
  ii  libnfsidmap2                        0.24-1~bpo60+1               
  ii  nfs-common                          1:1.2.5-4~bpo60+1            
  ii  nfs-kernel-server                   1:1.2.5-4~bpo60+1            

exportfs -v
/var/www/pictures
                x.x.x.x/22(rw,async,wdelay,root_squash,all_squash,no_subtree_check,anonuid=33,anongid=33)
/var/www/pictures
                10.40.0.0/23(rw,async,wdelay,root_squash,all_squash,no_subtree_check,anonuid=33,anongid=33)

----------------------------------------------------------------------------------------
[ 1913.662849] WARNING: at /build/buildd-linux_3.2.20-1~bpo60+1-amd64-tQMw4f/linux-3.2.20/net/sunrpc/sched.c:630 rpc_exit_task+0x40/0x7a [sunrpc]()
[ 1913.662851] Hardware name: X8STi
[ 1913.662852] Modules linked in: nfsd nfs lockd fscache auth_rpcgss nfs_acl sunrpc hmac drbd lru_cache cn ipmi_si ipmi_devintf ipmi_msghandler loop tpm_tis tpm parport_pc i2c_i801 i7core_edac i2c_core snd_pcm snd_timer snd ioatdma soundcore tpm_bios snd_page_alloc parport edac_core dca pcspkr psmouse processor serio_raw thermal_sys evdev joydev button ext4 mbcache jbd2 crc16 dm_mod sg sr_mod cdrom ses enclosure sd_mod crc_t10dif usb_storage usbhid hid uas uhci_hcd mptsas mptscsih mptbase scsi_transport_sas ahci libahci ehci_hcd libata usbcore aacraid usb_common scsi_mod e1000e [last unloaded: scsi_wait_scan]
[ 1913.662902] Pid: 11, comm: kworker/0:1 Tainted: G        W    3.2.0-0.bpo.2-amd64 #1
[ 1913.662904] Call Trace:
[ 1913.662909]  [<ffffffff810498ac>] ? warn_slowpath_common+0x78/0x8c
[ 1913.662916]  [<ffffffffa0327871>] ? rpc_exit_task+0x40/0x7a [sunrpc]
[ 1913.662922]  [<ffffffffa0327ddb>] ? __rpc_execute+0x71/0x23f [sunrpc]
[ 1913.662928]  [<ffffffffa0327fe1>] ? rpc_execute+0x38/0x38 [sunrpc]
[ 1913.662981]  [<ffffffff8105f96c>] ? process_one_work+0x1cc/0x2ea
[ 1913.662985]  [<ffffffff8105fbb7>] ? worker_thread+0x12d/0x247
[ 1913.662987]  [<ffffffff8105fa8a>] ? process_one_work+0x2ea/0x2ea
[ 1913.662990]  [<ffffffff8105fa8a>] ? process_one_work+0x2ea/0x2ea
[ 1913.662993]  [<ffffffff810633c5>] ? kthread+0x7a/0x82
[ 1913.662998]  [<ffffffff8136ca74>] ? kernel_thread_helper+0x4/0x10
[ 1913.663000]  [<ffffffff8106334b>] ? kthread_worker_fn+0x147/0x147
[ 1913.663003]  [<ffffffff8136ca70>] ? gs_change+0x13/0x13
[ 1913.663005] ---[ end trace 7cee9f1fd80fe6ac ]---
----------------------------------------------------------------------------------------

Regards,

Taylan


* Re: rpc_exit_task warning.
  2012-07-24 18:58 rpc_exit_task warning Taylan Develioglu
@ 2012-07-25 16:52 ` J. Bruce Fields
  2012-07-25 17:01   ` Taylan Develioglu
  2012-07-26 10:43   ` Taylan Develioglu
  0 siblings, 2 replies; 6+ messages in thread
From: J. Bruce Fields @ 2012-07-25 16:52 UTC
  To: Taylan Develioglu; +Cc: linux-nfs, Trond Myklebust

On Tue, Jul 24, 2012 at 08:58:43PM +0200, Taylan Develioglu wrote:
> We just deployed a new nfs server and have about a hundred clients connected but are getting repeated kernel warnings on the server:

Did this replace an old server that didn't see these warnings?

> [ 1913.662849] WARNING: at /build/buildd-linux_3.2.20-1~bpo60+1-amd64-tQMw4f/linux-3.2.20/net/sunrpc/sched.c:630 rpc_exit_task+0x40/0x7a [sunrpc]()

That's a warning from rpc_exit_task that both tk_action and RPC_TASK_KILLED
were set on exit from rpc_call_done.

Couldn't that happen if there's a race between rpc_killall and
rpc_call_done trying to restart the task?  rpc_restart_call{_prepare}
check RPC_TASK_KILLED before setting the action, but does anything
prevent the flag being set after that check?
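
The window asked about above can be modeled in user space. What follows is a minimal sketch, not kernel code: `model_task`, `MODEL_TASK_KILLED`, and every `model_*` helper are hypothetical names invented for illustration, and the interleaving is forced deterministically instead of being raced between threads. The only assumption taken from the paragraph above is that the restart path checks the killed flag and then sets the action as two separate steps.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the "task was killed" flag. */
#define MODEL_TASK_KILLED 0x1u

/* Hypothetical, heavily reduced model of a task: a flag word and a next action. */
struct model_task {
    unsigned int flags;
    void (*action)(struct model_task *);
};

/* Placeholder for "the step that restarts the call". */
static void model_call_start(struct model_task *t) { (void)t; }

/* Step 1 of the restart path: check that the task has not been killed. */
static bool model_restart_check(const struct model_task *t)
{
    return (t->flags & MODEL_TASK_KILLED) == 0;
}

/* Step 2 of the restart path: schedule the next action. */
static void model_restart_set_action(struct model_task *t)
{
    t->action = model_call_start;
}

/* What a concurrent "kill all tasks" pass would do to this task. */
static void model_kill(struct model_task *t)
{
    t->flags |= MODEL_TASK_KILLED;
}

/*
 * Run the restart path once. If kill_in_window is true, the kill is
 * injected between the check and the set. Returns true when the task
 * ends up with both an action and the killed flag set, which is the
 * state the warning reports.
 */
bool model_run(bool kill_in_window)
{
    struct model_task t = { 0u, NULL };

    assert(t.action == NULL); /* a finished task starts with no action */

    if (model_restart_check(&t)) {
        if (kill_in_window)
            model_kill(&t); /* flag is set after the check has passed */
        model_restart_set_action(&t);
    }
    return t.action != NULL && (t.flags & MODEL_TASK_KILLED) != 0;
}
```

In this model, `model_run(false)` returns false and `model_run(true)` returns true: the flagged state is reached only when the kill lands between the check and the set. Whether the real code has such a window is exactly the open question in this message.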

--b.



* RE: rpc_exit_task warning.
  2012-07-25 16:52 ` J. Bruce Fields
@ 2012-07-25 17:01   ` Taylan Develioglu
  2012-07-26 10:43   ` Taylan Develioglu
  1 sibling, 0 replies; 6+ messages in thread
From: Taylan Develioglu @ 2012-07-25 17:01 UTC
  To: J. Bruce Fields; +Cc: linux-nfs, Trond Myklebust

Yes, but the previous server was using kernel version 2.6.39, so it looks like this is a regression.



* RE: rpc_exit_task warning.
  2012-07-25 16:52 ` J. Bruce Fields
  2012-07-25 17:01   ` Taylan Develioglu
@ 2012-07-26 10:43   ` Taylan Develioglu
  2012-07-26 11:50     ` J. Bruce Fields
  1 sibling, 1 reply; 6+ messages in thread
From: Taylan Develioglu @ 2012-07-26 10:43 UTC
  To: Taylan Develioglu, J. Bruce Fields; +Cc: linux-nfs, Trond Myklebust

I was wrong.

I changed the kernel version to 2.6.39 and it's still happening.

The only other difference I can think of is that we now use LVM instead of a regular partition.



* Re: rpc_exit_task warning.
  2012-07-26 10:43   ` Taylan Develioglu
@ 2012-07-26 11:50     ` J. Bruce Fields
  2012-07-26 12:00       ` Taylan Develioglu
  0 siblings, 1 reply; 6+ messages in thread
From: J. Bruce Fields @ 2012-07-26 11:50 UTC
  To: Taylan Develioglu; +Cc: linux-nfs, Trond Myklebust

On Thu, Jul 26, 2012 at 12:43:46PM +0200, Taylan Develioglu wrote:
> I was wrong.
> 
> Changed kernel version to 2.6.39 and it's still happening.
> 
> Only other difference I can think of is the fact we now use lvm instead of a regular partition.

Did you change any of the clients at the same time you upgraded the
server?  (E.g. is it possible any of them are using NFSv4 now and
weren't before?)

--b.



* RE: rpc_exit_task warning.
  2012-07-26 11:50     ` J. Bruce Fields
@ 2012-07-26 12:00       ` Taylan Develioglu
  0 siblings, 0 replies; 6+ messages in thread
From: Taylan Develioglu @ 2012-07-26 12:00 UTC
  To: J. Bruce Fields; +Cc: linux-nfs, Trond Myklebust

No, the clients were the same before and after.

It's difficult to ascertain exactly which command is triggering this. I could load up a patched kernel with some extra debugging in the function if that would help.

Of course, you would have to tell me what to add :)

