* Linux-next-20190823: x86_64/i386: prot_hsymlinks.c:325: Failed to run cmd: useradd hsym @ 2019-08-26 9:47 Naresh Kamboju 2019-08-26 10:41 ` Cyril Hrubis 0 siblings, 1 reply; 12+ messages in thread From: Naresh Kamboju @ 2019-08-26 9:47 UTC (permalink / raw) To: ltp, Linux-Next Mailing List, open list, alexey.kodanev, the_hoang0709 Cc: Jan Stancek, chrubis Do you see this LTP prot_hsymlinks failure on linux next 20190823 on x86_64 and i386 devices? test output log, useradd: failure while writing changes to /etc/passwd useradd: /home/hsym was created, but could not be removed userdel: user 'hsym' does not exist prot_hsymlinks 1 TBROK : prot_hsymlinks.c:325: Failed to run cmd: useradd hsym prot_hsymlinks 2 TBROK : prot_hsymlinks.c:325: Remaining cases broken prot_hsymlinks 3 TBROK : prot_hsymlinks.c:325: Failed to run cmd: userdel -r hsym prot_hsymlinks 4 TBROK : tst_sig.c:234: unexpected signal SIGIOT/SIGABRT(6) received (pid = 8324). prot_hsymlinks 5 TBROK : tst_sig.c:234: unexpected signal SIGIOT/SIGABRT(6) received (pid = 8324). prot_hsymlinks 6 TBROK : tst_sig.c:234: unexpected signal SIGIOT/SIGABRT(6) received (pid = 8324). prot_hsymlinks 7 TBROK : tst_sig.c:234: unexpected signal SIGIOT/SIGABRT(6) received (pid = 8324). prot_hsymlinks 8 TBROK : tst_sig.c:234: unexpected signal SIGIOT/SIGABRT(6) received (pid = 8324). prot_hsymlinks 9 TBROK : tst_sig.c:234: unexpected signal SIGIOT/SIGABRT(6) received (pid = 8324). prot_hsymlinks 10 TBROK : tst_sig.c:234: unexpected signal SIGIOT/SIGABRT(6) received (pid = 8324). Full test log, https://qa-reports.linaro.org/lkft/linux-next-oe/build/next-20190823/testrun/886412/log Linux version: Linux version 5.3.0-rc5-next-20190823 (oe-user@oe-host) (gcc version 7.3.0 (GCC)) #1 SMP Fri Aug 23 09:35:54 UTC 2019 steps to reproduce: cd /opt/ltp ./runltp -s prot_hsymlinks metadata: metadata: git branch: master git repo: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git git commit: 9733a7c62c66722bcfdb1a6fe4d35c497312d59a git describe: next-20190823 make_kernelversion: 5.3.0-rc5 kernel-config: http://snapshots.linaro.org/openembedded/lkft/lkft/sumo/intel-corei7-64/lkft/linux-next/591/config build-location: http://snapshots.linaro.org/openembedded/lkft/lkft/sumo/intel-corei7-64/lkft/linux-next/591 toolchain: x86_64-linaro-linux 7.% series: lkft ltp-syscalls-tests__url: git://github.com/linux-test-project/ltp.git ltp-syscalls-tests__version: '20190517' Best regards Naresh Kamboju ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: Linux-next-20190823: x86_64/i386: prot_hsymlinks.c:325: Failed to run cmd: useradd hsym 2019-08-26 9:47 Linux-next-20190823: x86_64/i386: prot_hsymlinks.c:325: Failed to run cmd: useradd hsym Naresh Kamboju @ 2019-08-26 10:41 ` Cyril Hrubis 2019-08-26 11:05 ` Jan Stancek 0 siblings, 1 reply; 12+ messages in thread From: Cyril Hrubis @ 2019-08-26 10:41 UTC (permalink / raw) To: Naresh Kamboju Cc: ltp, Linux-Next Mailing List, open list, alexey.kodanev, the_hoang0709, Jan Stancek Hi! > Do you see this LTP prot_hsymlinks failure on linux next 20190823 on > x86_64 and i386 devices? > > test output log, > useradd: failure while writing changes to /etc/passwd > useradd: /home/hsym was created, but could not be removed This looks like an unrelated problem, failure to write to /etc/passwd probably means that filesystem is full or some problem happend and how is remounted RO. I do not see the kernel messages from this job anywhere at the job pages, is it stored somewhere? -- Cyril Hrubis chrubis@suse.cz ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: Linux-next-20190823: x86_64/i386: prot_hsymlinks.c:325: Failed to run cmd: useradd hsym 2019-08-26 10:41 ` Cyril Hrubis @ 2019-08-26 11:05 ` Jan Stancek 2019-08-26 13:50 ` Naresh Kamboju 0 siblings, 1 reply; 12+ messages in thread From: Jan Stancek @ 2019-08-26 11:05 UTC (permalink / raw) To: Cyril Hrubis, Naresh Kamboju Cc: ltp, Linux-Next Mailing List, open list, alexey kodanev, the hoang0709 ----- Original Message ----- > Hi! > > Do you see this LTP prot_hsymlinks failure on linux next 20190823 on > > x86_64 and i386 devices? > > > > test output log, > > useradd: failure while writing changes to /etc/passwd > > useradd: /home/hsym was created, but could not be removed > > This looks like an unrelated problem, failure to write to /etc/passwd > probably means that filesystem is full or some problem happend and how > is remounted RO. In Naresh' example, root is on NFS: root=/dev/nfs rw nfsroot=10.66.16.123:/var/lib/lava/dispatcher/tmp/886412/extract-nfsrootfs-tyuevoxm,tcp,hard,intr 10.66.16.123:/var/lib/lava/dispatcher/tmp/886412/extract-nfsrootfs-tyuevoxm on / type nfs (rw,relatime,vers=2,rsize=4096,wsize=4096,namlen=255,hard,nolock,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=10.66.16.123,mountvers=1,mountproto=tcp,local_lock=all,addr=10.66.16.123) devtmpfs on /dev type devtmpfs (rw,relatime,size=3977640k,nr_inodes=994410,mode=755) Following message repeats couple times in logs: NFS: Server wrote zero bytes, expected XXX Naresh, can you check if there are any errors on NFS server side? Maybe run NFS cthon against that server with client running next-20190822 and next-20190823. > > I do not see the kernel messages from this job anywhere at the job > pages, is it stored somewhere? It appears to be mixed in same log file: https://qa-reports.linaro.org/lkft/linux-next-oe/build/next-20190823/testrun/886412/log ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: Linux-next-20190823: x86_64/i386: prot_hsymlinks.c:325: Failed to run cmd: useradd hsym 2019-08-26 11:05 ` Jan Stancek @ 2019-08-26 13:50 ` Naresh Kamboju 2019-08-26 14:38 ` Jan Stancek 0 siblings, 1 reply; 12+ messages in thread From: Naresh Kamboju @ 2019-08-26 13:50 UTC (permalink / raw) To: Jan Stancek Cc: Cyril Hrubis, ltp, Linux-Next Mailing List, open list, alexey kodanev, the hoang0709 Hi Jan and Cyril, On Mon, 26 Aug 2019 at 16:35, Jan Stancek <jstancek@redhat.com> wrote: > > > > ----- Original Message ----- > > Hi! > > > Do you see this LTP prot_hsymlinks failure on linux next 20190823 on > > > x86_64 and i386 devices? > > > > > > test output log, > > > useradd: failure while writing changes to /etc/passwd > > > useradd: /home/hsym was created, but could not be removed > > > > This looks like an unrelated problem, failure to write to /etc/passwd > > probably means that filesystem is full or some problem happend and how > > is remounted RO. > > In Naresh' example, root is on NFS: > root=/dev/nfs rw nfsroot=10.66.16.123:/var/lib/lava/dispatcher/tmp/886412/extract-nfsrootfs-tyuevoxm,tcp,hard,intr Right ! root is mounted on NFS. > > 10.66.16.123:/var/lib/lava/dispatcher/tmp/886412/extract-nfsrootfs-tyuevoxm on / type nfs (rw,relatime,vers=2,rsize=4096,wsize=4096,namlen=255,hard,nolock,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=10.66.16.123,mountvers=1,mountproto=tcp,local_lock=all,addr=10.66.16.123) > devtmpfs on /dev type devtmpfs (rw,relatime,size=3977640k,nr_inodes=994410,mode=755) > > Following message repeats couple times in logs: > NFS: Server wrote zero bytes, expected XXX > > Naresh, can you check if there are any errors on NFS server side? I have re-tested the failed tests on next-20190822 and all get pass which is also using same NFS server [1] [2]. > Maybe run NFS cthon against that server with client running next-20190822 and next-20190823. Thanks for the pointers. I will setup and run NFS cthon on next-20190822 and next-20190823. > > > > > I do not see the kernel messages from this job anywhere at the job > > pages, is it stored somewhere? > > It appears to be mixed in same log file: > https://qa-reports.linaro.org/lkft/linux-next-oe/build/next-20190823/testrun/886412/log For the record the following tests failed on linux -next-20190823 on x86_64 and i386. The filesystem is mounted on NFS and tests are using locally mounted hard drive ( with -d /scratch ). Juno-r2 device filesystem mounted on NFS and did not see these errors and test getting pass on -next-20190823. These failures are reproducible all time on next-20190823 kernel on x86_64 and i386 device with root mounted on NFS [3] [4] [5] [6]. I will git bisect to find out which is bad commit. prot_hsymlinks: [3] ------------------ useradd: failure while writing changes to /etc/passwd useradd: /home/hsym was created, but could not be removed userdel: user 'hsym' does not exist prot_hsymlinks 1 TBROK : prot_hsymlinks.c:325: Failed to run cmd: useradd hsym prot_hsymlinks 2 TBROK : prot_hsymlinks.c:325: Remaining cases broken prot_hsymlinks 3 TBROK : prot_hsymlinks.c:325: Failed to run cmd: userdel -r hsym prot_hsymlinks 4 TBROK : tst_sig.c:234: unexpected signal SIGIOT/SIGABRT(6) received (pid = 8324). logrotate01: [4] ------------- compressing log with: /bin/gzip error: error creating temp state file /var/lib/logrotate.status.tmp: Input/output error logrotate01 1 TFAIL : ltpapicmd.c:154: Test #1: logrotate command exited with 1 return code. Output: sem_unlink_2-2: [5] ------------------ make[3]: Entering directory '/opt/ltp/testcases/open_posix_testsuite/conformance/interfaces/sem_unlink' cat: write error: Input/output error conformance/interfaces/sem_unlink/sem_unlink_2-2: execution: FAILED syslog{01 ...10} [6] ------------------- cp: failed to close '/etc/syslog.conf.ltpback': Input/output error syslog01 1 TBROK : ltpapicmd.c:188: failed to backup /etc/syslog.conf cp: failed to close '/etc/syslog.conf.ltpback': Input/output error syslog02 1 TBROK : ltpapicmd.c:188: failed to backup /etc/syslog.conf ... cp: failed to close '/etc/syslog.conf.ltpback': Input/output error syslog10 1 TBROK : ltpapicmd.c:188: failed to backup /etc/syslog.conf ref: PASS on 20190222: [1] https://lkft.validation.linaro.org/scheduler/job/890446#L1232 [2] https://lkft.validation.linaro.org/scheduler/job/890454 FAILED on 20190823: [3] https://lkft.validation.linaro.org/scheduler/job/890404#L1245 [4] https://lkft.validation.linaro.org/scheduler/job/886408#L2544 [5] https://lkft.validation.linaro.org/scheduler/job/886409#L3088 [6] https://lkft.validation.linaro.org/scheduler/job/890400#L1234 - Naresh ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: Linux-next-20190823: x86_64/i386: prot_hsymlinks.c:325: Failed to run cmd: useradd hsym 2019-08-26 13:50 ` Naresh Kamboju @ 2019-08-26 14:38 ` Jan Stancek 2019-08-26 15:58 ` Trond Myklebust 0 siblings, 1 reply; 12+ messages in thread From: Jan Stancek @ 2019-08-26 14:38 UTC (permalink / raw) To: Naresh Kamboju Cc: Cyril Hrubis, ltp, Linux-Next Mailing List, open list, alexey kodanev, the hoang0709, trond.myklebust ----- Original Message ----- > Hi Jan and Cyril, > > On Mon, 26 Aug 2019 at 16:35, Jan Stancek <jstancek@redhat.com> wrote: > > > > > > > > ----- Original Message ----- > > > Hi! > > > > Do you see this LTP prot_hsymlinks failure on linux next 20190823 on > > > > x86_64 and i386 devices? > > > > > > > > test output log, > > > > useradd: failure while writing changes to /etc/passwd > > > > useradd: /home/hsym was created, but could not be removed > > > > > > This looks like an unrelated problem, failure to write to /etc/passwd > > > probably means that filesystem is full or some problem happend and how > > > is remounted RO. > > > > In Naresh' example, root is on NFS: > > root=/dev/nfs rw > > nfsroot=10.66.16.123:/var/lib/lava/dispatcher/tmp/886412/extract-nfsrootfs-tyuevoxm,tcp,hard,intr > > Right ! > root is mounted on NFS. > > > > > 10.66.16.123:/var/lib/lava/dispatcher/tmp/886412/extract-nfsrootfs-tyuevoxm > > on / type nfs > > (rw,relatime,vers=2,rsize=4096,wsize=4096,namlen=255,hard,nolock,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=10.66.16.123,mountvers=1,mountproto=tcp,local_lock=all,addr=10.66.16.123) > > devtmpfs on /dev type devtmpfs > > (rw,relatime,size=3977640k,nr_inodes=994410,mode=755) > > > > Following message repeats couple times in logs: > > NFS: Server wrote zero bytes, expected XXX > > > > Naresh, can you check if there are any errors on NFS server side? > > I have re-tested the failed tests on next-20190822 and all get pass > which is also > using same NFS server [1] [2]. Thanks, that suggests some client side change between next-20190822 and next-20190823 might introduced it. > > > Maybe run NFS cthon against that server with client running next-20190822 > > and next-20190823. > > Thanks for the pointers. > I will setup and run NFS cthon on next-20190822 and next-20190823. I'll try to reproduce too. > > > > > > > > > I do not see the kernel messages from this job anywhere at the job > > > pages, is it stored somewhere? > > > > It appears to be mixed in same log file: > > https://qa-reports.linaro.org/lkft/linux-next-oe/build/next-20190823/testrun/886412/log > > For the record the following tests failed on linux -next-20190823 on x86_64 > and i386. The filesystem is mounted on NFS and tests are using > locally mounted hard drive ( with -d /scratch ). > > Juno-r2 device filesystem mounted on NFS and did not see these errors > and test getting pass on -next-20190823. > > These failures are reproducible all time on next-20190823 kernel on x86_64 > and i386 device with root mounted on NFS [3] [4] [5] [6]. > > I will git bisect to find out which is bad commit. > > prot_hsymlinks: [3] > ------------------ > useradd: failure while writing changes to /etc/passwd > useradd: /home/hsym was created, but could not be removed > userdel: user 'hsym' does not exist > prot_hsymlinks 1 TBROK : prot_hsymlinks.c:325: Failed to run > cmd: useradd hsym > prot_hsymlinks 2 TBROK : prot_hsymlinks.c:325: Remaining cases broken > prot_hsymlinks 3 TBROK : prot_hsymlinks.c:325: Failed to run > cmd: userdel -r hsym > prot_hsymlinks 4 TBROK : tst_sig.c:234: unexpected signal > SIGIOT/SIGABRT(6) received (pid = 8324). > > logrotate01: [4] > ------------- > compressing log with: /bin/gzip > error: error creating temp state file /var/lib/logrotate.status.tmp: > Input/output error > logrotate01 1 TFAIL : ltpapicmd.c:154: Test #1: logrotate > command exited with 1 return code. Output: > > sem_unlink_2-2: [5] > ------------------ > make[3]: Entering directory > '/opt/ltp/testcases/open_posix_testsuite/conformance/interfaces/sem_unlink' > cat: write error: Input/output error > conformance/interfaces/sem_unlink/sem_unlink_2-2: execution: FAILED > > syslog{01 ...10} [6] > ------------------- > cp: failed to close '/etc/syslog.conf.ltpback': Input/output error > syslog01 1 TBROK : ltpapicmd.c:188: failed to backup /etc/syslog.conf > > cp: failed to close '/etc/syslog.conf.ltpback': Input/output error > syslog02 1 TBROK : ltpapicmd.c:188: failed to backup /etc/syslog.conf > > ... > cp: failed to close '/etc/syslog.conf.ltpback': Input/output error > syslog10 1 TBROK : ltpapicmd.c:188: failed to backup /etc/syslog.conf > > ref: > PASS on 20190222: > [1] https://lkft.validation.linaro.org/scheduler/job/890446#L1232 > [2] https://lkft.validation.linaro.org/scheduler/job/890454 > > FAILED on 20190823: > [3] https://lkft.validation.linaro.org/scheduler/job/890404#L1245 > [4] https://lkft.validation.linaro.org/scheduler/job/886408#L2544 > [5] https://lkft.validation.linaro.org/scheduler/job/886409#L3088 > [6] https://lkft.validation.linaro.org/scheduler/job/890400#L1234 > > - Naresh > ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: Linux-next-20190823: x86_64/i386: prot_hsymlinks.c:325: Failed to run cmd: useradd hsym 2019-08-26 14:38 ` Jan Stancek @ 2019-08-26 15:58 ` Trond Myklebust 2019-08-26 23:12 ` Jan Stancek 0 siblings, 1 reply; 12+ messages in thread From: Trond Myklebust @ 2019-08-26 15:58 UTC (permalink / raw) To: naresh.kamboju, jstancek Cc: the_hoang0709, linux-next, ltp, linux-kernel, chrubis, alexey.kodanev On Mon, 2019-08-26 at 10:38 -0400, Jan Stancek wrote: > ----- Original Message ----- > > Hi Jan and Cyril, > > > > On Mon, 26 Aug 2019 at 16:35, Jan Stancek <jstancek@redhat.com> > > wrote: > > > > > > > > > ----- Original Message ----- > > > > Hi! > > > > > Do you see this LTP prot_hsymlinks failure on linux next > > > > > 20190823 on > > > > > x86_64 and i386 devices? > > > > > > > > > > test output log, > > > > > useradd: failure while writing changes to /etc/passwd > > > > > useradd: /home/hsym was created, but could not be removed > > > > > > > > This looks like an unrelated problem, failure to write to > > > > /etc/passwd > > > > probably means that filesystem is full or some problem happend > > > > and how > > > > is remounted RO. > > > > > > In Naresh' example, root is on NFS: > > > root=/dev/nfs rw > > > > > > nfsroot=10.66.16.123:/var/lib/lava/dispatcher/tmp/886412/extract- > > > nfsrootfs-tyuevoxm,tcp,hard,intr > > > > Right ! > > root is mounted on NFS. > > > > > 10.66.16.123:/var/lib/lava/dispatcher/tmp/886412/extract- > > > nfsrootfs-tyuevoxm > > > on / type nfs > > > (rw,relatime,vers=2,rsize=4096,wsize=4096,namlen=255,hard,nolock, > > > proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=10.66.16.123,moun > > > tvers=1,mountproto=tcp,local_lock=all,addr=10.66.16.123) > > > devtmpfs on /dev type devtmpfs > > > (rw,relatime,size=3977640k,nr_inodes=994410,mode=755) > > > The only thing I can think of that might cause an EIO on NFSv2 would be this patch http://git.linux-nfs.org/?p=trondmy/linux-nfs.git;a=commitdiff;h=627d48e597ec5993c4abb3b81dc75e554a07c7c0 assuming that a bind-related error is leaking through. I'd suggest something like the following to fix it up: 8<--------------------------------------- From 1e9336ac5363914dfcc1f49bf091409edbf36f8d Mon Sep 17 00:00:00 2001 From: Trond Myklebust <trond.myklebust@hammerspace.com> Date: Mon, 26 Aug 2019 11:44:04 -0400 Subject: [PATCH] fixup! SUNRPC: Don't handle errors if the bind/connect succeeded Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> --- net/sunrpc/clnt.c | 15 +++++++++++---- 1 file changed, 11 insertions(+), 4 deletions(-) diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c index f13ec73c8299..a07b516e503a 100644 --- a/net/sunrpc/clnt.c +++ b/net/sunrpc/clnt.c @@ -1980,9 +1980,11 @@ call_bind_status(struct rpc_task *task) dprint_status(task); trace_rpc_bind_status(task); - if (task->tk_status >= 0 || xprt_bound(xprt)) { - task->tk_action = call_connect; - return; + if (task->tk_status >= 0) + goto out_next; + if (xprt_bound(xprt)) { + task->tk_status = 0; + goto out_next; } switch (task->tk_status) { @@ -2045,6 +2047,9 @@ call_bind_status(struct rpc_task *task) rpc_call_rpcerror(task, status); return; +out_next: + task->tk_action = call_connect; + return; retry_timeout: task->tk_status = 0; task->tk_action = call_bind; @@ -2107,8 +2112,10 @@ call_connect_status(struct rpc_task *task) clnt->cl_stats->netreconn++; goto out_next; } - if (xprt_connected(xprt)) + if (xprt_connected(xprt)) { + task->tk_status = 0; goto out_next; + } task->tk_status = 0; switch (status) { -- 2.21.0 -- Trond Myklebust Linux NFS client maintainer, Hammerspace trond.myklebust@hammerspace.com ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: Linux-next-20190823: x86_64/i386: prot_hsymlinks.c:325: Failed to run cmd: useradd hsym 2019-08-26 15:58 ` Trond Myklebust @ 2019-08-26 23:12 ` Jan Stancek 2019-08-27 0:59 ` Trond Myklebust 2019-08-27 6:34 ` Naresh Kamboju 0 siblings, 2 replies; 12+ messages in thread From: Jan Stancek @ 2019-08-26 23:12 UTC (permalink / raw) To: Trond Myklebust Cc: naresh kamboju, the hoang0709, linux-next, ltp, linux-kernel, chrubis, alexey kodanev ----- Original Message ----- > On Mon, 2019-08-26 at 10:38 -0400, Jan Stancek wrote: > > ----- Original Message ----- > > > Hi Jan and Cyril, > > > > > > On Mon, 26 Aug 2019 at 16:35, Jan Stancek <jstancek@redhat.com> > > > wrote: > > > > > > > > > > > > ----- Original Message ----- > > > > > Hi! > > > > > > Do you see this LTP prot_hsymlinks failure on linux next > > > > > > 20190823 on > > > > > > x86_64 and i386 devices? > > > > > > > > > > > > test output log, > > > > > > useradd: failure while writing changes to /etc/passwd > > > > > > useradd: /home/hsym was created, but could not be removed > > > > > > > > > > This looks like an unrelated problem, failure to write to > > > > > /etc/passwd > > > > > probably means that filesystem is full or some problem happend > > > > > and how > > > > > is remounted RO. > > > > > > > > In Naresh' example, root is on NFS: > > > > root=/dev/nfs rw > > > > > > > > nfsroot=10.66.16.123:/var/lib/lava/dispatcher/tmp/886412/extract- > > > > nfsrootfs-tyuevoxm,tcp,hard,intr > > > > > > Right ! > > > root is mounted on NFS. > > > > > > > 10.66.16.123:/var/lib/lava/dispatcher/tmp/886412/extract- > > > > nfsrootfs-tyuevoxm > > > > on / type nfs > > > > (rw,relatime,vers=2,rsize=4096,wsize=4096,namlen=255,hard,nolock, > > > > proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=10.66.16.123,moun > > > > tvers=1,mountproto=tcp,local_lock=all,addr=10.66.16.123) > > > > devtmpfs on /dev type devtmpfs > > > > (rw,relatime,size=3977640k,nr_inodes=994410,mode=755) > > > > > > The only thing I can think of that might cause an EIO on NFSv2 would be > this patch > http://git.linux-nfs.org/?p=trondmy/linux-nfs.git;a=commitdiff;h=627d48e597ec5993c4abb3b81dc75e554a07c7c0 > assuming that a bind-related error is leaking through. > > I'd suggest something like the following to fix it up: No change with that patch, but following one fixes it for me: diff --git a/fs/nfs/pagelist.c b/fs/nfs/pagelist.c index 20b3717cd7ca..56cefa0ab804 100644 --- a/fs/nfs/pagelist.c +++ b/fs/nfs/pagelist.c @@ -590,7 +590,7 @@ static void nfs_pgio_rpcsetup(struct nfs_pgio_header *hdr, } hdr->res.fattr = &hdr->fattr; - hdr->res.count = 0; + hdr->res.count = count; hdr->res.eof = 0; hdr->res.verf = &hdr->verf; nfs_fattr_init(&hdr->fattr); which is functionally revert of "NFS: Fix initialisation of I/O result struct in nfs_pgio_rpcsetup". This hunk caught my eye, could res.eof == 0 explain those I/O errors? /* Emulate the eof flag, which isn't normally needed in NFSv2 * as it is guaranteed to always return the file attributes */ if (hdr->args.offset + hdr->res.count >= hdr->res.fattr->size) hdr->res.eof = 1; ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: Linux-next-20190823: x86_64/i386: prot_hsymlinks.c:325: Failed to run cmd: useradd hsym 2019-08-26 23:12 ` Jan Stancek @ 2019-08-27 0:59 ` Trond Myklebust 2019-08-27 10:25 ` Jan Stancek 2019-08-27 6:34 ` Naresh Kamboju 1 sibling, 1 reply; 12+ messages in thread From: Trond Myklebust @ 2019-08-27 0:59 UTC (permalink / raw) To: jstancek Cc: naresh.kamboju, the_hoang0709, linux-next, ltp, linux-kernel, chrubis, alexey.kodanev On Mon, 2019-08-26 at 19:12 -0400, Jan Stancek wrote: > ----- Original Message ----- > > On Mon, 2019-08-26 at 10:38 -0400, Jan Stancek wrote: > > > ----- Original Message ----- > > > > Hi Jan and Cyril, > > > > > > > > On Mon, 26 Aug 2019 at 16:35, Jan Stancek <jstancek@redhat.com> > > > > wrote: > > > > > > > > > > ----- Original Message ----- > > > > > > Hi! > > > > > > > Do you see this LTP prot_hsymlinks failure on linux next > > > > > > > 20190823 on > > > > > > > x86_64 and i386 devices? > > > > > > > > > > > > > > test output log, > > > > > > > useradd: failure while writing changes to /etc/passwd > > > > > > > useradd: /home/hsym was created, but could not be removed > > > > > > > > > > > > This looks like an unrelated problem, failure to write to > > > > > > /etc/passwd > > > > > > probably means that filesystem is full or some problem > > > > > > happend > > > > > > and how > > > > > > is remounted RO. > > > > > > > > > > In Naresh' example, root is on NFS: > > > > > root=/dev/nfs rw > > > > > > > > > > nfsroot=10.66.16.123:/var/lib/lava/dispatcher/tmp/886412/extr > > > > > act- > > > > > nfsrootfs-tyuevoxm,tcp,hard,intr > > > > > > > > Right ! > > > > root is mounted on NFS. > > > > > > > > > 10.66.16.123:/var/lib/lava/dispatcher/tmp/886412/extract- > > > > > nfsrootfs-tyuevoxm > > > > > on / type nfs > > > > > (rw,relatime,vers=2,rsize=4096,wsize=4096,namlen=255,hard,nol > > > > > ock, > > > > > proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=10.66.16.123, > > > > > moun > > > > > tvers=1,mountproto=tcp,local_lock=all,addr=10.66.16.123) > > > > > devtmpfs on /dev type devtmpfs > > > > > (rw,relatime,size=3977640k,nr_inodes=994410,mode=755) > > > > > > > > > The only thing I can think of that might cause an EIO on NFSv2 > > would be > > this patch > > http://git.linux-nfs.org/?p=trondmy/linux-nfs.git;a=commitdiff;h=627d48e597ec5993c4abb3b81dc75e554a07c7c0 > > assuming that a bind-related error is leaking through. > > > > I'd suggest something like the following to fix it up: > > No change with that patch, > but following one fixes it for me: > > diff --git a/fs/nfs/pagelist.c b/fs/nfs/pagelist.c > index 20b3717cd7ca..56cefa0ab804 100644 > --- a/fs/nfs/pagelist.c > +++ b/fs/nfs/pagelist.c > @@ -590,7 +590,7 @@ static void nfs_pgio_rpcsetup(struct > nfs_pgio_header *hdr, > } > > hdr->res.fattr = &hdr->fattr; > - hdr->res.count = 0; > + hdr->res.count = count; > hdr->res.eof = 0; > hdr->res.verf = &hdr->verf; > nfs_fattr_init(&hdr->fattr); > > which is functionally revert of "NFS: Fix initialisation of I/O > result struct in nfs_pgio_rpcsetup". > > This hunk caught my eye, could res.eof == 0 explain those I/O errors? Interesting hypothesis. It could if res.count ends up being 0. So does the following also fix the problem? 8<---------------------------------------- From b5bc0812350e94f8c9331174d22f24692411aef9 Mon Sep 17 00:00:00 2001 From: Trond Myklebust <trond.myklebust@hammerspace.com> Date: Mon, 26 Aug 2019 20:41:16 -0400 Subject: [PATCH] NFSv2: Fix eof handling If we received a reply from the server with a zero length read and no error, then that implies we are at eof. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> --- fs/nfs/proc.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/fs/nfs/proc.c b/fs/nfs/proc.c index 5552fa8b6e12..5919878549d2 100644 --- a/fs/nfs/proc.c +++ b/fs/nfs/proc.c @@ -594,7 +594,8 @@ static int nfs_read_done(struct rpc_task *task, struct nfs_pgio_header *hdr) /* Emulate the eof flag, which isn't normally needed in NFSv2 * as it is guaranteed to always return the file attributes */ - if (hdr->args.offset + hdr->res.count >= hdr->res.fattr->size) + if (hdr->res.count == 0 && hdr->args.count > 0 || + hdr->args.offset + hdr->res.count >= hdr->res.fattr->size) hdr->res.eof = 1; } return 0; -- 2.21.0 -- Trond Myklebust Linux NFS client maintainer, Hammerspace trond.myklebust@hammerspace.com ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: Linux-next-20190823: x86_64/i386: prot_hsymlinks.c:325: Failed to run cmd: useradd hsym 2019-08-27 0:59 ` Trond Myklebust @ 2019-08-27 10:25 ` Jan Stancek 2019-08-27 12:58 ` Trond Myklebust 0 siblings, 1 reply; 12+ messages in thread From: Jan Stancek @ 2019-08-27 10:25 UTC (permalink / raw) To: Trond Myklebust Cc: naresh kamboju, the hoang0709, linux-next, ltp, linux-kernel, chrubis, alexey kodanev ----- Original Message ----- > On Mon, 2019-08-26 at 19:12 -0400, Jan Stancek wrote: > > ----- Original Message ----- > > > On Mon, 2019-08-26 at 10:38 -0400, Jan Stancek wrote: > > > > ----- Original Message ----- > > > > > Hi Jan and Cyril, > > > > > > > > > > On Mon, 26 Aug 2019 at 16:35, Jan Stancek <jstancek@redhat.com> > > > > > wrote: > > > > > > > > > > > > ----- Original Message ----- > > > > > > > Hi! > > > > > > > > Do you see this LTP prot_hsymlinks failure on linux next > > > > > > > > 20190823 on > > > > > > > > x86_64 and i386 devices? > > > > > > > > > > > > > > > > test output log, > > > > > > > > useradd: failure while writing changes to /etc/passwd > > > > > > > > useradd: /home/hsym was created, but could not be removed > > > > > > > > > > > > > > This looks like an unrelated problem, failure to write to > > > > > > > /etc/passwd > > > > > > > probably means that filesystem is full or some problem > > > > > > > happend > > > > > > > and how > > > > > > > is remounted RO. > > > > > > > > > > > > In Naresh' example, root is on NFS: > > > > > > root=/dev/nfs rw > > > > > > > > > > > > nfsroot=10.66.16.123:/var/lib/lava/dispatcher/tmp/886412/extr > > > > > > act- > > > > > > nfsrootfs-tyuevoxm,tcp,hard,intr > > > > > > > > > > Right ! > > > > > root is mounted on NFS. > > > > > > > > > > > 10.66.16.123:/var/lib/lava/dispatcher/tmp/886412/extract- > > > > > > nfsrootfs-tyuevoxm > > > > > > on / type nfs > > > > > > (rw,relatime,vers=2,rsize=4096,wsize=4096,namlen=255,hard,nol > > > > > > ock, > > > > > > proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=10.66.16.123, > > > > > > moun > > > > > > tvers=1,mountproto=tcp,local_lock=all,addr=10.66.16.123) > > > > > > devtmpfs on /dev type devtmpfs > > > > > > (rw,relatime,size=3977640k,nr_inodes=994410,mode=755) > > > > > > > > > > > > The only thing I can think of that might cause an EIO on NFSv2 > > > would be > > > this patch > > > http://git.linux-nfs.org/?p=trondmy/linux-nfs.git;a=commitdiff;h=627d48e597ec5993c4abb3b81dc75e554a07c7c0 > > > assuming that a bind-related error is leaking through. > > > > > > I'd suggest something like the following to fix it up: > > > > No change with that patch, > > but following one fixes it for me: > > > > diff --git a/fs/nfs/pagelist.c b/fs/nfs/pagelist.c > > index 20b3717cd7ca..56cefa0ab804 100644 > > --- a/fs/nfs/pagelist.c > > +++ b/fs/nfs/pagelist.c > > @@ -590,7 +590,7 @@ static void nfs_pgio_rpcsetup(struct > > nfs_pgio_header *hdr, > > } > > > > hdr->res.fattr = &hdr->fattr; > > - hdr->res.count = 0; > > + hdr->res.count = count; > > hdr->res.eof = 0; > > hdr->res.verf = &hdr->verf; > > nfs_fattr_init(&hdr->fattr); > > > > which is functionally revert of "NFS: Fix initialisation of I/O > > result struct in nfs_pgio_rpcsetup". > > > > This hunk caught my eye, could res.eof == 0 explain those I/O errors? > > Interesting hypothesis. It could if res.count ends up being 0. So does > the following also fix the problem? It didn't fix it. That theory is probably not correct for this case, since EIO I see appears to originate from write and nfs_writeback_result(). This function also produces message we saw in logs from Naresh. I can't find where/how is resp->count updated on WRITE reply in NFSv2. Issue also goes away with patch below, though I can't speak about its correctness: NFS version Type Test Return code nfsvers=2 tcp -b:base 0 nfsvers=2 tcp -g:general 0 nfsvers=2 tcp -s:special 0 nfsvers=2 tcp -l:lock 0 Total time: 141 diff --git a/fs/nfs/nfs2xdr.c b/fs/nfs/nfs2xdr.c index cbc17a203248..4913c6da270b 100644 --- a/fs/nfs/nfs2xdr.c +++ b/fs/nfs/nfs2xdr.c @@ -897,6 +897,16 @@ static int nfs2_xdr_dec_writeres(struct rpc_rqst *req, struct xdr_stream *xdr, void *data) { struct nfs_pgio_res *result = data; + struct rpc_task *rq_task = req->rq_task; + + if (rq_task) { + struct nfs_pgio_args *args = rq_task->tk_msg.rpc_argp; + + if (args) { + result->count = args->count; + } + } /* All NFSv2 writes are "file sync" writes */ result->verf->committed = NFS_FILE_SYNC; ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: Linux-next-20190823: x86_64/i386: prot_hsymlinks.c:325: Failed to run cmd: useradd hsym 2019-08-27 10:25 ` Jan Stancek @ 2019-08-27 12:58 ` Trond Myklebust 2019-08-27 13:20 ` Jan Stancek 0 siblings, 1 reply; 12+ messages in thread From: Trond Myklebust @ 2019-08-27 12:58 UTC (permalink / raw) To: jstancek Cc: naresh.kamboju, the_hoang0709, linux-next, ltp, linux-kernel, chrubis, alexey.kodanev On Tue, 2019-08-27 at 06:25 -0400, Jan Stancek wrote: > That theory is probably not correct for this case, since EIO I see > appears > to originate from write and nfs_writeback_result(). This function > also > produces message we saw in logs from Naresh. > > I can't find where/how is resp->count updated on WRITE reply in > NFSv2. > Issue also goes away with patch below, though I can't speak about its > correctness: > > NFS version Type Test Return code > nfsvers=2 tcp -b:base 0 > nfsvers=2 tcp -g:general 0 > nfsvers=2 tcp -s:special 0 > nfsvers=2 tcp -l:lock 0 > Total time: 141 > > diff --git a/fs/nfs/nfs2xdr.c b/fs/nfs/nfs2xdr.c > index cbc17a203248..4913c6da270b 100644 > --- a/fs/nfs/nfs2xdr.c > +++ b/fs/nfs/nfs2xdr.c > @@ -897,6 +897,16 @@ static int nfs2_xdr_dec_writeres(struct rpc_rqst > *req, struct xdr_stream *xdr, > void *data) > { > struct nfs_pgio_res *result = data; > + struct rpc_task *rq_task = req->rq_task; > + > + if (rq_task) { > + struct nfs_pgio_args *args = rq_task- > >tk_msg.rpc_argp; > + > + if (args) { > + result->count = args->count; > + } > + } > > /* All NFSv2 writes are "file sync" writes */ > result->verf->committed = NFS_FILE_SYNC; Thanks! I've moved the above to nfs_write_done() so that we do it only on success (see http://git.linux-nfs.org/?p=trondmy/linux-nfs.git;a=commitdiff;h=3ba5688da709dd0f7d917029c206bc1848a6ae74 ) -- Trond Myklebust Linux NFS client maintainer, Hammerspace trond.myklebust@hammerspace.com ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: Linux-next-20190823: x86_64/i386: prot_hsymlinks.c:325: Failed to run cmd: useradd hsym 2019-08-27 12:58 ` Trond Myklebust @ 2019-08-27 13:20 ` Jan Stancek 0 siblings, 0 replies; 12+ messages in thread From: Jan Stancek @ 2019-08-27 13:20 UTC (permalink / raw) To: Trond Myklebust, naresh kamboju Cc: the hoang0709, linux-next, ltp, linux-kernel, chrubis, alexey kodanev ----- Original Message ----- > On Tue, 2019-08-27 at 06:25 -0400, Jan Stancek wrote: > > That theory is probably not correct for this case, since EIO I see > > appears > > to originate from write and nfs_writeback_result(). This function > > also > > produces message we saw in logs from Naresh. > > > > I can't find where/how is resp->count updated on WRITE reply in > > NFSv2. > > Issue also goes away with patch below, though I can't speak about its > > correctness: > > > > NFS version Type Test Return code > > nfsvers=2 tcp -b:base 0 > > nfsvers=2 tcp -g:general 0 > > nfsvers=2 tcp -s:special 0 > > nfsvers=2 tcp -l:lock 0 > > Total time: 141 > > > > diff --git a/fs/nfs/nfs2xdr.c b/fs/nfs/nfs2xdr.c > > index cbc17a203248..4913c6da270b 100644 > > --- a/fs/nfs/nfs2xdr.c > > +++ b/fs/nfs/nfs2xdr.c > > @@ -897,6 +897,16 @@ static int nfs2_xdr_dec_writeres(struct rpc_rqst > > *req, struct xdr_stream *xdr, > > void *data) > > { > > struct nfs_pgio_res *result = data; > > + struct rpc_task *rq_task = req->rq_task; > > + > > + if (rq_task) { > > + struct nfs_pgio_args *args = rq_task- > > >tk_msg.rpc_argp; > > + > > + if (args) { > > + result->count = args->count; > > + } > > + } > > > > /* All NFSv2 writes are "file sync" writes */ > > result->verf->committed = NFS_FILE_SYNC; > > Thanks! I've moved the above to nfs_write_done() so that we do it only > on success (see > http://git.linux-nfs.org/?p=trondmy/linux-nfs.git;a=commitdiff;h=3ba5688da709dd0f7d917029c206bc1848a6ae74 > ) Thanks, retested with 3ba5688da, all PASS: NFS version Type Test Return code nfsvers=2 tcp -b:base 0 nfsvers=2 tcp -g:general 0 nfsvers=2 tcp -s:special 0 nfsvers=2 tcp -l:lock 0 NFS version Type Test Return code nfsvers=3 tcp -b:base 0 nfsvers=3 tcp -g:general 0 nfsvers=3 tcp -s:special 0 nfsvers=3 tcp -l:lock 0 nfsvers=3 tcp6 -b:base 0 nfsvers=3 tcp6 -g:general 0 nfsvers=3 tcp6 -s:special 0 nfsvers=3 tcp6 -l:lock 0 NFS version Type Test Return code nfsvers=4 tcp -b:base 0 nfsvers=4 tcp -g:general 0 nfsvers=4 tcp -s:special 0 nfsvers=4 tcp -l:lock 0 nfsvers=4 tcp6 -b:base 0 nfsvers=4 tcp6 -g:general 0 nfsvers=4 tcp6 -s:special 0 nfsvers=4 tcp6 -l:lock 0 Feel free to add also: Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org> Tested-by: Jan Stancek <jstancek@redhat.com> ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: Linux-next-20190823: x86_64/i386: prot_hsymlinks.c:325: Failed to run cmd: useradd hsym 2019-08-26 23:12 ` Jan Stancek 2019-08-27 0:59 ` Trond Myklebust @ 2019-08-27 6:34 ` Naresh Kamboju 1 sibling, 0 replies; 12+ messages in thread From: Naresh Kamboju @ 2019-08-27 6:34 UTC (permalink / raw) To: Jan Stancek Cc: Trond Myklebust, the hoang0709, Linux-Next Mailing List, ltp, open list, chrubis, alexey kodanev On Tue, 27 Aug 2019 at 04:42, Jan Stancek <jstancek@redhat.com> wrote: > > > ----- Original Message ----- > > On Mon, 2019-08-26 at 10:38 -0400, Jan Stancek wrote: > > No change with that patch, Same for me. > but following one fixes it for me: Works for me. Thanks for the fix patch. > > diff --git a/fs/nfs/pagelist.c b/fs/nfs/pagelist.c > index 20b3717cd7ca..56cefa0ab804 100644 > --- a/fs/nfs/pagelist.c > +++ b/fs/nfs/pagelist.c > @@ -590,7 +590,7 @@ static void nfs_pgio_rpcsetup(struct nfs_pgio_header *hdr, > } > > hdr->res.fattr = &hdr->fattr; > - hdr->res.count = 0; > + hdr->res.count = count; > hdr->res.eof = 0; > hdr->res.verf = &hdr->verf; > nfs_fattr_init(&hdr->fattr); > > which is functionally revert of "NFS: Fix initialisation of I/O result struct in nfs_pgio_rpcsetup". > > This hunk caught my eye, could res.eof == 0 explain those I/O errors? > /* Emulate the eof flag, which isn't normally needed in NFSv2 > * as it is guaranteed to always return the file attributes > */ > if (hdr->args.offset + hdr->res.count >= hdr->res.fattr->size) > hdr->res.eof = 1; - Naresh ^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2019-08-27 13:20 UTC | newest] Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2019-08-26 9:47 Linux-next-20190823: x86_64/i386: prot_hsymlinks.c:325: Failed to run cmd: useradd hsym Naresh Kamboju 2019-08-26 10:41 ` Cyril Hrubis 2019-08-26 11:05 ` Jan Stancek 2019-08-26 13:50 ` Naresh Kamboju 2019-08-26 14:38 ` Jan Stancek 2019-08-26 15:58 ` Trond Myklebust 2019-08-26 23:12 ` Jan Stancek 2019-08-27 0:59 ` Trond Myklebust 2019-08-27 10:25 ` Jan Stancek 2019-08-27 12:58 ` Trond Myklebust 2019-08-27 13:20 ` Jan Stancek 2019-08-27 6:34 ` Naresh Kamboju
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).