* Inconsistent error codes between NFSv4 and v3 on network issues
@ 2013-03-01 11:43 Jan Engelhardt
  2013-03-04 14:16 ` J. Bruce Fields
  2013-03-04 17:47 ` Chuck Lever
  0 siblings, 2 replies; 8+ messages in thread
From: Jan Engelhardt @ 2013-03-01 11:43 UTC
  To: J. Bruce Fields; +Cc: linux-nfs

Hi.


I had here a case with a Linux 3.7.9 system, a virtual machine in a
RFC1918 range, that did not want to mount NFS.

linux-3lzm:~ # strace -fe mount mount -t nfs 134.76.12.5:/X /mnt
Process 1477 attached
[pid 1515] mount("134.76.12.5:/X", "/mnt", "nfs", 0, "vers=4,addr=134.76.12.5,clientaddr=0.0.0.0") = -1 EIO (Input/output error)
mount.nfs: mount system call failed
[pid 1477] +++ exited with 32 +++
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=1477, si_status=32, si_utime=0, si_stime=0} ---
+++ exited with 32 +++

Nothing in dmesg...

[ 84.202243] RPC: Registered named UNIX socket transport module.
[ 84.202246] RPC: Registered udp transport module.
[ 84.202248] RPC: Registered tcp transport module.
[ 84.202249] RPC: Registered tcp NFSv4.1 backchannel transport module.
[ 84.205909] FS-Cache: Loaded
[ 84.208782] FS-Cache: Netfs 'nfs' registered for caching
[ 84.215733] NFS: Registering the id_resolver key type
[ 84.215762] Key type id_resolver registered
[ 84.215763] Key type id_legacy registered

When mounting with NFSv3, the error became clear:

# strace -fe mount -s 65536 mount -t nfs 134.76.12.5:/X /mnt -o nfsvers=3,nolock
Process 1550 attached
mount.nfs: Network is unreachable
[pid 1550] +++ exited with 32 +++
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=1550, si_status=32, si_utime=0, si_stime=0} ---
+++ exited with 32 +++

Can NFSv4 be made to return -ENETUNREACH as well?
* Re: Inconsistent error codes between NFSv4 and v3 on network issues
  2013-03-01 11:43 Inconsistent error codes between NFSv4 and v3 on network issues Jan Engelhardt
@ 2013-03-04 14:16 ` J. Bruce Fields
  2013-03-04 14:16   ` J. Bruce Fields
  2013-03-04 17:47 ` Chuck Lever
  1 sibling, 1 reply; 8+ messages in thread
From: J. Bruce Fields @ 2013-03-04 14:16 UTC
  To: Jan Engelhardt; +Cc: linux-nfs

Adding Chuck to cc:, as he's probably done the most mucking with the
mount code in recent years?

--b.

On Fri, Mar 01, 2013 at 12:43:58PM +0100, Jan Engelhardt wrote:
> Hi.
>
>
> I had here a case with a Linux 3.7.9 system, a virtual machine in a
> RFC1918 range, that did not want to mount NFS.
>
> linux-3lzm:~ # strace -fe mount mount -t nfs 134.76.12.5:/X /mnt
> Process 1477 attached
> [pid 1515] mount("134.76.12.5:/X", "/mnt", "nfs", 0, "vers=4,addr=134.76.12.5,clientaddr=0.0.0.0") = -1 EIO (Input/output error)
> mount.nfs: mount system call failed
> [pid 1477] +++ exited with 32 +++
> --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=1477, si_status=32, si_utime=0, si_stime=0} ---
> +++ exited with 32 +++
>
> Nothing in dmesg...
>
> [ 84.202243] RPC: Registered named UNIX socket transport module.
> [ 84.202246] RPC: Registered udp transport module.
> [ 84.202248] RPC: Registered tcp transport module.
> [ 84.202249] RPC: Registered tcp NFSv4.1 backchannel transport module.
> [ 84.205909] FS-Cache: Loaded
> [ 84.208782] FS-Cache: Netfs 'nfs' registered for caching
> [ 84.215733] NFS: Registering the id_resolver key type
> [ 84.215762] Key type id_resolver registered
> [ 84.215763] Key type id_legacy registered
>
> When mounting with NFSv3, the error became clear:
>
> # strace -fe mount -s 65536 mount -t nfs 134.76.12.5:/X /mnt -o nfsvers=3,nolock
> Process 1550 attached
> mount.nfs: Network is unreachable
> [pid 1550] +++ exited with 32 +++
> --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=1550, si_status=32, si_utime=0, si_stime=0} ---
> +++ exited with 32 +++
>
> Can NFSv4 be made to return -ENETUNREACH as well?
* Re: Inconsistent error codes between NFSv4 and v3 on network issues
  2013-03-04 14:16 ` J. Bruce Fields
@ 2013-03-04 14:16   ` J. Bruce Fields
  0 siblings, 0 replies; 8+ messages in thread
From: J. Bruce Fields @ 2013-03-04 14:16 UTC
  To: Jan Engelhardt; +Cc: linux-nfs, chuck.lever

Um, for real this time.

On Mon, Mar 04, 2013 at 09:16:00AM -0500, J. Bruce Fields wrote:
> Adding Chuck to cc:, as he's probably done the most mucking with the
> mount code in recent years?
>
> --b.
>
> On Fri, Mar 01, 2013 at 12:43:58PM +0100, Jan Engelhardt wrote:
> > Hi.
> >
> >
> > I had here a case with a Linux 3.7.9 system, a virtual machine in a
> > RFC1918 range, that did not want to mount NFS.
> >
> > linux-3lzm:~ # strace -fe mount mount -t nfs 134.76.12.5:/X /mnt
> > Process 1477 attached
> > [pid 1515] mount("134.76.12.5:/X", "/mnt", "nfs", 0, "vers=4,addr=134.76.12.5,clientaddr=0.0.0.0") = -1 EIO (Input/output error)
> > mount.nfs: mount system call failed
> > [pid 1477] +++ exited with 32 +++
> > --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=1477, si_status=32, si_utime=0, si_stime=0} ---
> > +++ exited with 32 +++
> >
> > Nothing in dmesg...
> >
> > [ 84.202243] RPC: Registered named UNIX socket transport module.
> > [ 84.202246] RPC: Registered udp transport module.
> > [ 84.202248] RPC: Registered tcp transport module.
> > [ 84.202249] RPC: Registered tcp NFSv4.1 backchannel transport module.
> > [ 84.205909] FS-Cache: Loaded
> > [ 84.208782] FS-Cache: Netfs 'nfs' registered for caching
> > [ 84.215733] NFS: Registering the id_resolver key type
> > [ 84.215762] Key type id_resolver registered
> > [ 84.215763] Key type id_legacy registered
> >
> > When mounting with NFSv3, the error became clear:
> >
> > # strace -fe mount -s 65536 mount -t nfs 134.76.12.5:/X /mnt -o nfsvers=3,nolock
> > Process 1550 attached
> > mount.nfs: Network is unreachable
> > [pid 1550] +++ exited with 32 +++
> > --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=1550, si_status=32, si_utime=0, si_stime=0} ---
> > +++ exited with 32 +++
> >
> > Can NFSv4 be made to return -ENETUNREACH as well?
* Re: Inconsistent error codes between NFSv4 and v3 on network issues
  2013-03-01 11:43 Inconsistent error codes between NFSv4 and v3 on network issues Jan Engelhardt
  2013-03-04 14:16 ` J. Bruce Fields
@ 2013-03-04 17:47 ` Chuck Lever
  2013-03-04 19:10   ` Jan Engelhardt
  1 sibling, 1 reply; 8+ messages in thread
From: Chuck Lever @ 2013-03-04 17:47 UTC
  To: Jan Engelhardt; +Cc: J. Bruce Fields, linux-nfs

On Mar 1, 2013, at 6:43 AM, Jan Engelhardt <jengelh@inai.de> wrote:

> Hi.
>
>
> I had here a case with a Linux 3.7.9 system, a virtual machine in a
> RFC1918 range, that did not want to mount NFS.
>
> linux-3lzm:~ # strace -fe mount mount -t nfs 134.76.12.5:/X /mnt
> Process 1477 attached
> [pid 1515] mount("134.76.12.5:/X", "/mnt", "nfs", 0, "vers=4,addr=134.76.12.5,clientaddr=0.0.0.0") = -1 EIO (Input/output error)

"clientaddr=0.0.0.0" is interesting: perhaps the mount.nfs command should have failed right there.

But, let's find out why the kernel is returning EIO. Enter:

  # rpcdebug -m nfs -s all
  # rpcdebug -m rpc -s call xprt

Try your NFSv4 mount command again, then post relevant excerpts from the kernel log.

By the way, does this happen on older kernels?

> mount.nfs: mount system call failed
> [pid 1477] +++ exited with 32 +++
> --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=1477, si_status=32, si_utime=0, si_stime=0} ---
> +++ exited with 32 +++
>
> Nothing in dmesg...
>
> [ 84.202243] RPC: Registered named UNIX socket transport module.
> [ 84.202246] RPC: Registered udp transport module.
> [ 84.202248] RPC: Registered tcp transport module.
> [ 84.202249] RPC: Registered tcp NFSv4.1 backchannel transport module.
> [ 84.205909] FS-Cache: Loaded
> [ 84.208782] FS-Cache: Netfs 'nfs' registered for caching
> [ 84.215733] NFS: Registering the id_resolver key type
> [ 84.215762] Key type id_resolver registered
> [ 84.215763] Key type id_legacy registered

-- 
Chuck Lever
chuck[dot]lever[at]oracle[dot]com
* Re: Inconsistent error codes between NFSv4 and v3 on network issues
  2013-03-04 17:47 ` Chuck Lever
@ 2013-03-04 19:10   ` Jan Engelhardt
  2013-03-04 20:43     ` Myklebust, Trond
  0 siblings, 1 reply; 8+ messages in thread
From: Jan Engelhardt @ 2013-03-04 19:10 UTC
  To: Chuck Lever; +Cc: J. Bruce Fields, linux-nfs

On Monday 2013-03-04 18:47, Chuck Lever wrote:
>>
>> linux-3lzm:~ # strace -fe mount mount -t nfs 134.76.12.5:/X /mnt
>> Process 1477 attached
>> [pid 1515] mount("134.76.12.5:/X", "/mnt", "nfs", 0, "vers=4,addr=134.76.12.5,clientaddr=0.0.0.0") = -1 EIO (Input/output error)
>
>"clientaddr=0.0.0.0" is interesting: perhaps the mount.nfs command
>should have failed right there.

Maybe (since there is no route). Maybe not: I would interpret 0.0.0.0
as "kernel will pick something".

This kernel 3.7.9 system has rpcbind-0.2.0_git201103171419-7.1.1.x86_64
and nfs-client-1.2.7-2.1.1.x86_64.


>But, let's find out why the kernel is returning EIO. Enter:
>
> # rpcdebug -m nfs -s all
> # rpcdebug -m rpc -s call xprt
>
>Try your NFSv4 mount command again, then post relevant excerpts from the kernel log.

[205196.949572] NFS: nfs mount opts='vers=4,addr=134.76.12.5,clientaddr=0.0.0.0'
[205196.949580] NFS: parsing nfs mount option 'vers=4'
[205196.949586] NFS: parsing nfs mount option 'addr=134.76.12.5'
[205196.949592] NFS: parsing nfs mount option 'clientaddr=0.0.0.0'
[205196.949597] NFS: MNTPATH: '/X'
[205196.949600] --> nfs4_try_mount()
[205196.949607] --> nfs4_create_server()
[205196.949639] --> nfs4_init_server()
[205196.949641] --> nfs4_set_client()
[205196.949644] --> nfs_get_client(134.76.12.5,v4)
[205196.949651] NFS: get client cookie (0xffff88007abe3800/0xffff88007ba29420)
[205196.949666] RPC: created transport ffff88007ba50800 with 65536 slots
[205196.949670] RPC: creating nfs client for 134.76.12.5 (xprt ffff88007ba50800)
[205196.949693] RPC: 2 call_start nfs4 proc NULL (sync)
[205196.949705] RPC: 2 call_reserve (status 0)
[205196.949709] RPC: 2 reserved req ffff88007b342800 xid 11fca56c
[205196.949712] RPC: 2 call_reserveresult (status 0)
[205196.949714] RPC: 2 call_refresh (status 0)
[205196.949716] RPC: 2 call_refreshresult (status 0)
[205196.949718] RPC: 2 call_allocate (status 0)
[205196.949722] RPC: 2 call_bind (status 0)
[205196.949724] RPC: 2 call_connect xprt ffff88007ba50800 is not connected
[205196.949727] RPC: 2 xprt_connect xprt ffff88007ba50800 is not connected
[205196.949765] RPC: 2 xprt_connect_status: error 101 connecting to server 134.76.12.5
[205196.949779] RPC: 2 call_connect_status (status -5)
[205196.949791] RPC: 2 release request ffff88007b342800
[205196.949794] RPC: rpc_release_client(ffff88007b342e00)
[205196.949797] RPC: shutting down nfs client for 134.76.12.5
[205196.949798] RPC: rpc_release_client(ffff88007b342e00)
[205196.949801] RPC: destroying nfs client for 134.76.12.5
[205196.949803] RPC: destroying transport ffff88007ba50800
[205196.949812] RPC: disconnected transport ffff88007ba50800
[205196.949816] nfs_create_rpc_client: cannot create RPC client. Error = -5
[205196.949818] --> nfs_put_client({1})
[205196.949820] --> nfs_free_client(4)
[205196.949822] NFS: releasing client cookie (0xffff88007abe3800/0xffff88007ba29420)
[205196.949824] <-- nfs_free_client()
[205196.949826] <-- nfs4_init_client() = xerror -5
[205196.949828] <-- nfs4_set_client() = xerror -5
[205196.949829] <-- nfs4_init_server() = -5
[205196.949831] --> nfs_free_server()
[205196.949850] <-- nfs_free_server()
[205196.949852] <-- nfs4_create_server() = error -5
[205196.949856] <-- nfs4_try_mount() = -5 [error]

Just nuke your default route, and it should be easily reproducible.

>By the way, does this happen on older kernels?

On kernel 3.0.51 with rpcbind-0.1.6+git20080930-6.18.1 and
nfs-utils-1.2.3, I observe that

rtsnode1:~ # strace -fe mount -s 65535 mount -t nfs 134.76.12.5:/X /mnt -o nfsvers=4
Process 24358 attached
Process 24357 suspended
[pid 24358] mount("134.76.12.5:/X", "/mnt", "nfs", 0, "nfsvers=4,addr=134.76.12.5,clientaddr=0.0.0.0"
<hang, but interruptible>

rpcdebug/dmesg:

[790206.304549] NFS: nfs mount opts='nfsvers=4,addr=134.76.12.5,clientaddr=0.0.0.0'
[790206.304549] NFS: parsing nfs mount option 'nfsvers=4'
[790206.304549] NFS: parsing nfs mount option 'addr=134.76.12.5'
[790206.304549] NFS: parsing nfs mount option 'clientaddr=0.0.0.0'
[790206.304549] NFS: MNTPATH: '/X'
[790206.304549] --> nfs4_try_mount()
[790206.304549] --> nfs4_create_server()
[790206.304549] --> nfs4_init_server()
[790206.304549] --> nfs4_set_client()
[790206.304549] --> nfs_get_client(134.76.12.5,v4)
[790206.304549] NFS: get client cookie (0xffff880079a16400/0xffff880079cd41e0)
[790206.304549] RPC: created transport ffff880078f26800 with 16 slots
[790206.304549] RPC: creating nfs client for 134.76.12.5 (xprt ffff880078f26800)
[790206.304550] RPC: 45 call_start nfs4 proc NULL (sync)
[790206.304550] RPC: 45 call_reserve (status 0)
[790206.304550] RPC: 45 reserved req ffff880037a16000 xid 68604dfd
[790206.304550] RPC: 45 call_reserveresult (status 0)
[790206.304550] RPC: 45 call_refresh (status 0)
[790206.304550] RPC: 45 call_refreshresult (status 0)
[790206.304550] RPC: 45 call_allocate (status 0)
[790206.304550] RPC: 45 call_bind (status 0)
[790206.304550] RPC: 45 call_connect xprt ffff880078f26800 is not connected
[790206.304550] RPC: 45 xprt_connect xprt ffff880078f26800 is not connected
<no further output until Ctrl-C issued>
[790320.020951] RPC: 45 xprt_connect_status: error 512 connecting to server 134.76.12.5
[790320.020960] RPC: 45 release request ffff880037a16000
[790320.020963] RPC: rpc_release_client(ffff880079bc0200)
[790320.020968] RPC: shutting down nfs client for 134.76.12.5
[790320.020970] RPC: rpc_release_client(ffff880079bc0200)
[790320.020973] RPC: destroying nfs client for 134.76.12.5
[790320.024752] RPC: destroying transport ffff880078f26800
[790320.024767] RPC: disconnected transport ffff880078f26800
[790320.024772] nfs_create_rpc_client: cannot create RPC client. Error = -5
[790320.024775] <-- nfs4_init_client() = xerror -5
[790320.024777] --> nfs_put_client({1})
[790320.024781] --> nfs_free_client(4)
[790320.024783] NFS: releasing client cookie (0xffff880079a16400/0xffff880079cd41e0)
[790320.024788] <-- nfs_free_client()
[790320.024789] <-- nfs4_set_client() = xerror -5
[790320.024791] <-- nfs4_init_server() = -5
[790320.024810] --> nfs_free_server()
[790320.024816] <-- nfs_free_server()
[790320.024818] <-- nfs4_create_server() = error -5
[790320.024822] <-- nfs4_try_mount() = -5 [error]

And, if network connectivity is present, this occurs:

# strace -fe mount -s 65535 mount -t nfs 134.76.12.5:/X /mnt -o nfsvers=4
Process 24383 attached
Process 24382 suspended
[pid 24383] mount("134.76.12.5:/X", "/mnt", "nfs", 0, "nfsvers=4,addr=134.76.12.5,clientaddr=10.10.7.142"
<short wait>
) = 0
Process 24382 resumed
Process 24383 detached
--- SIGCHLD (Child exited) @ 0 (0) ---

And that then probably answers your clientaddr= concern.

So what 3.7.x does better is that it immediately identifies
EHOSTUNREACH/ENETUNREACH while 3.0.x waits for Godot.
* Re: Inconsistent error codes between NFSv4 and v3 on network issues
  2013-03-04 19:10   ` Jan Engelhardt
@ 2013-03-04 20:43     ` Myklebust, Trond
  2013-03-04 22:37       ` Myklebust, Trond
  0 siblings, 1 reply; 8+ messages in thread
From: Myklebust, Trond @ 2013-03-04 20:43 UTC
  To: Jan Engelhardt; +Cc: Chuck Lever, J. Bruce Fields, linux-nfs

On Mon, 2013-03-04 at 20:10 +0100, Jan Engelhardt wrote:
> On Monday 2013-03-04 18:47, Chuck Lever wrote:
> >>
> >> linux-3lzm:~ # strace -fe mount mount -t nfs 134.76.12.5:/X /mnt
> >> Process 1477 attached
> >> [pid 1515] mount("134.76.12.5:/X", "/mnt", "nfs", 0, "vers=4,addr=134.76.12.5,clientaddr=0.0.0.0") = -1 EIO (Input/output error)
> >
> >"clientaddr=0.0.0.0" is interesting: perhaps the mount.nfs command
> >should have failed right there.
>
> Maybe (since there is no route). Maybe not: I would interpret 0.0.0.0
> as "kernel will pick something".
>
> This kernel 3.7.9 system has rpcbind-0.2.0_git201103171419-7.1.1.x86_64
> and nfs-client-1.2.7-2.1.1.x86_64.
>
>
> >But, let's find out why the kernel is returning EIO. Enter:
> >
> > # rpcdebug -m nfs -s all
> > # rpcdebug -m rpc -s call xprt
> >
> >Try your NFSv4 mount command again, then post relevant excerpts from the kernel log.
>
> [205196.949572] NFS: nfs mount opts='vers=4,addr=134.76.12.5,clientaddr=0.0.0.0'
> [205196.949580] NFS: parsing nfs mount option 'vers=4'
> [205196.949586] NFS: parsing nfs mount option 'addr=134.76.12.5'
> [205196.949592] NFS: parsing nfs mount option 'clientaddr=0.0.0.0'
> [205196.949597] NFS: MNTPATH: '/X'
> [205196.949600] --> nfs4_try_mount()
> [205196.949607] --> nfs4_create_server()
> [205196.949639] --> nfs4_init_server()
> [205196.949641] --> nfs4_set_client()
> [205196.949644] --> nfs_get_client(134.76.12.5,v4)
> [205196.949651] NFS: get client cookie (0xffff88007abe3800/0xffff88007ba29420)
> [205196.949666] RPC: created transport ffff88007ba50800 with 65536 slots
> [205196.949670] RPC: creating nfs client for 134.76.12.5 (xprt ffff88007ba50800)
> [205196.949693] RPC: 2 call_start nfs4 proc NULL (sync)
> [205196.949705] RPC: 2 call_reserve (status 0)
> [205196.949709] RPC: 2 reserved req ffff88007b342800 xid 11fca56c
> [205196.949712] RPC: 2 call_reserveresult (status 0)
> [205196.949714] RPC: 2 call_refresh (status 0)
> [205196.949716] RPC: 2 call_refreshresult (status 0)
> [205196.949718] RPC: 2 call_allocate (status 0)
> [205196.949722] RPC: 2 call_bind (status 0)
> [205196.949724] RPC: 2 call_connect xprt ffff88007ba50800 is not connected
> [205196.949727] RPC: 2 xprt_connect xprt ffff88007ba50800 is not connected
> [205196.949765] RPC: 2 xprt_connect_status: error 101 connecting to server 134.76.12.5
> [205196.949779] RPC: 2 call_connect_status (status -5)
> [205196.949791] RPC: 2 release request ffff88007b342800
> [205196.949794] RPC: rpc_release_client(ffff88007b342e00)
> [205196.949797] RPC: shutting down nfs client for 134.76.12.5
> [205196.949798] RPC: rpc_release_client(ffff88007b342e00)
> [205196.949801] RPC: destroying nfs client for 134.76.12.5
> [205196.949803] RPC: destroying transport ffff88007ba50800
> [205196.949812] RPC: disconnected transport ffff88007ba50800
> [205196.949816] nfs_create_rpc_client: cannot create RPC client. Error = -5
> [205196.949818] --> nfs_put_client({1})
> [205196.949820] --> nfs_free_client(4)
> [205196.949822] NFS: releasing client cookie (0xffff88007abe3800/0xffff88007ba29420)
> [205196.949824] <-- nfs_free_client()
> [205196.949826] <-- nfs4_init_client() = xerror -5
> [205196.949828] <-- nfs4_set_client() = xerror -5
> [205196.949829] <-- nfs4_init_server() = -5
> [205196.949831] --> nfs_free_server()
> [205196.949850] <-- nfs_free_server()
> [205196.949852] <-- nfs4_create_server() = error -5
> [205196.949856] <-- nfs4_try_mount() = -5 [error]
>
> Just nuke your default route, and it should be easily reproducible.
>

The problem is that call_connect_status() is converting that ENETUNREACH
into a EIO. We shouldn't be doing that, but should leave it up to the
caller (i.e. the NFS layer) to perform that kind of mapping.

Cheers
  Trond
* Re: Inconsistent error codes between NFSv4 and v3 on network issues
  2013-03-04 20:43     ` Myklebust, Trond
@ 2013-03-04 22:37       ` Myklebust, Trond
  2013-03-13 14:47         ` Jan Engelhardt
  0 siblings, 1 reply; 8+ messages in thread
From: Myklebust, Trond @ 2013-03-04 22:37 UTC
  To: Jan Engelhardt; +Cc: Chuck Lever, J. Bruce Fields, linux-nfs

[-- Attachment #1: Type: text/plain, Size: 576 bytes --]

On Mon, 2013-03-04 at 20:43 +0000, Myklebust, Trond wrote:
> On Mon, 2013-03-04 at 20:10 +0100, Jan Engelhardt wrote:
> > Just nuke your default route, and it should be easily reproducible.
> >
>
> The problem is that call_connect_status() is converting that ENETUNREACH
> into a EIO. We shouldn't be doing that, but should leave it up to the
> caller (i.e. the NFS layer) to perform that kind of mapping.

Could you please check if the attached patch helps.

-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@netapp.com
www.netapp.com

[-- Attachment #2: 0001-SUNRPC-Report-network-connection-errors-correctly-fo.patch --]
[-- Type: text/x-patch; name="0001-SUNRPC-Report-network-connection-errors-correctly-fo.patch", Size: 2560 bytes --]

From 724bf7a71b44145811a1fec3b20a26bd1edae51b Mon Sep 17 00:00:00 2001
From: Trond Myklebust <Trond.Myklebust@netapp.com>
Date: Mon, 4 Mar 2013 17:29:33 -0500
Subject: [PATCH] SUNRPC: Report network/connection errors correctly for SOFTCONN rpc tasks

In the case of a SOFTCONN rpc task, we really want to ensure that it
reports errors like ENETUNREACH back to the caller. Currently, only
some of these errors are being reported back (connect errors are not),
and they are being converted by the RPC layer into EIO.

Reported-by: Jan Engelhardt <jengelh@inai.de>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
---
 net/sunrpc/clnt.c     | 24 ++++++++++++++----------
 net/sunrpc/xprtsock.c |  8 ++++----
 2 files changed, 18 insertions(+), 14 deletions(-)

diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
index dcc446e..b95a0a2 100644
--- a/net/sunrpc/clnt.c
+++ b/net/sunrpc/clnt.c
@@ -1644,22 +1644,26 @@ call_connect_status(struct rpc_task *task)
 
 	dprint_status(task);
 
-	task->tk_status = 0;
-	if (status >= 0 || status == -EAGAIN) {
-		clnt->cl_stats->netreconn++;
-		task->tk_action = call_transmit;
-		return;
-	}
-
 	trace_rpc_connect_status(task, status);
 	switch (status) {
 		/* if soft mounted, test if we've timed out */
 	case -ETIMEDOUT:
 		task->tk_action = call_timeout;
-		break;
-	default:
-		rpc_exit(task, -EIO);
+		return;
+	case -ECONNREFUSED:
+	case -ECONNRESET:
+	case -ENETUNREACH:
+		if (RPC_IS_SOFTCONN(task))
+			break;
+		/* retry with existing socket, after a delay */
+	case 0:
+	case -EAGAIN:
+		task->tk_status = 0;
+		clnt->cl_stats->netreconn++;
+		task->tk_action = call_transmit;
+		return;
 	}
+	rpc_exit(task, status);
 }
 
 /*
diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index c1d8476..3081620 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -2202,10 +2202,6 @@ static void xs_tcp_setup_socket(struct work_struct *work)
 		 */
 		xs_tcp_force_close(xprt);
 		break;
-	case -ECONNREFUSED:
-	case -ECONNRESET:
-	case -ENETUNREACH:
-		/* retry with existing socket, after a delay */
 	case 0:
 	case -EINPROGRESS:
 	case -EALREADY:
@@ -2216,6 +2212,10 @@ static void xs_tcp_setup_socket(struct work_struct *work)
 		/* Happens, for instance, if the user specified a link
 		 * local IPv6 address without a scope-id.
 		 */
+	case -ECONNREFUSED:
+	case -ECONNRESET:
+	case -ENETUNREACH:
+		/* retry with existing socket, after a delay */
 		goto out;
 	}
 out_eagain:
-- 
1.8.1.4
* Re: Inconsistent error codes between NFSv4 and v3 on network issues
  2013-03-04 22:37       ` Myklebust, Trond
@ 2013-03-13 14:47         ` Jan Engelhardt
  0 siblings, 0 replies; 8+ messages in thread
From: Jan Engelhardt @ 2013-03-13 14:47 UTC
  To: Myklebust, Trond; +Cc: Chuck Lever, J. Bruce Fields, linux-nfs

On Monday 2013-03-04 23:37, Myklebust, Trond wrote:
>On Mon, 2013-03-04 at 20:43 +0000, Myklebust, Trond wrote:
>> On Mon, 2013-03-04 at 20:10 +0100, Jan Engelhardt wrote:
>> > Just nuke your default route, and it should be easily reproducible.
>> >
>>
>> The problem is that call_connect_status() is converting that ENETUNREACH
>> into a EIO. We shouldn't be doing that, but should leave it up to the
>> caller (i.e. the NFS layer) to perform that kind of mapping.
>
>Could you please check if the attached patch helps.

With the patch, I still get EIO from the mount syscall.