linux-next.vger.kernel.org archive mirror
* Linux-next-20190823: x86_64/i386: prot_hsymlinks.c:325: Failed to run cmd: useradd hsym
@ 2019-08-26  9:47 Naresh Kamboju
  2019-08-26 10:41 ` Cyril Hrubis
  0 siblings, 1 reply; 12+ messages in thread
From: Naresh Kamboju @ 2019-08-26  9:47 UTC (permalink / raw)
  To: ltp, Linux-Next Mailing List, open list, alexey.kodanev, the_hoang0709
  Cc: Jan Stancek, chrubis

Do you see this LTP prot_hsymlinks failure on linux next 20190823 on
x86_64 and i386 devices?

test output log,
useradd: failure while writing changes to /etc/passwd
useradd: /home/hsym was created, but could not be removed
userdel: user 'hsym' does not exist
prot_hsymlinks    1  TBROK  :  prot_hsymlinks.c:325: Failed to run
cmd: useradd hsym
prot_hsymlinks    2  TBROK  :  prot_hsymlinks.c:325: Remaining cases broken
prot_hsymlinks    3  TBROK  :  prot_hsymlinks.c:325: Failed to run
cmd: userdel -r hsym
prot_hsymlinks    4  TBROK  :  tst_sig.c:234: unexpected signal
SIGIOT/SIGABRT(6) received (pid = 8324).
prot_hsymlinks    5  TBROK  :  tst_sig.c:234: unexpected signal
SIGIOT/SIGABRT(6) received (pid = 8324).
prot_hsymlinks    6  TBROK  :  tst_sig.c:234: unexpected signal
SIGIOT/SIGABRT(6) received (pid = 8324).
prot_hsymlinks    7  TBROK  :  tst_sig.c:234: unexpected signal
SIGIOT/SIGABRT(6) received (pid = 8324).
prot_hsymlinks    8  TBROK  :  tst_sig.c:234: unexpected signal
SIGIOT/SIGABRT(6) received (pid = 8324).
prot_hsymlinks    9  TBROK  :  tst_sig.c:234: unexpected signal
SIGIOT/SIGABRT(6) received (pid = 8324).
prot_hsymlinks   10  TBROK  :  tst_sig.c:234: unexpected signal
SIGIOT/SIGABRT(6) received (pid = 8324).

Full test log,
https://qa-reports.linaro.org/lkft/linux-next-oe/build/next-20190823/testrun/886412/log

Linux version:
Linux version 5.3.0-rc5-next-20190823 (oe-user@oe-host) (gcc version
7.3.0 (GCC)) #1 SMP Fri Aug 23 09:35:54 UTC 2019

steps to reproduce:
   cd /opt/ltp
   ./runltp -s prot_hsymlinks

metadata:
  git branch: master
  git repo: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
  git commit: 9733a7c62c66722bcfdb1a6fe4d35c497312d59a
  git describe: next-20190823
  make_kernelversion: 5.3.0-rc5
  kernel-config:
http://snapshots.linaro.org/openembedded/lkft/lkft/sumo/intel-corei7-64/lkft/linux-next/591/config
  build-location:
http://snapshots.linaro.org/openembedded/lkft/lkft/sumo/intel-corei7-64/lkft/linux-next/591
  toolchain: x86_64-linaro-linux 7.%
  series: lkft
  ltp-syscalls-tests__url: git://github.com/linux-test-project/ltp.git
  ltp-syscalls-tests__version: '20190517'

Best regards
Naresh Kamboju

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Linux-next-20190823: x86_64/i386: prot_hsymlinks.c:325: Failed to run cmd: useradd hsym
  2019-08-26  9:47 Linux-next-20190823: x86_64/i386: prot_hsymlinks.c:325: Failed to run cmd: useradd hsym Naresh Kamboju
@ 2019-08-26 10:41 ` Cyril Hrubis
  2019-08-26 11:05   ` Jan Stancek
  0 siblings, 1 reply; 12+ messages in thread
From: Cyril Hrubis @ 2019-08-26 10:41 UTC (permalink / raw)
  To: Naresh Kamboju
  Cc: ltp, Linux-Next Mailing List, open list, alexey.kodanev,
	the_hoang0709, Jan Stancek

Hi!
> Do you see this LTP prot_hsymlinks failure on linux next 20190823 on
> x86_64 and i386 devices?
> 
> test output log,
> useradd: failure while writing changes to /etc/passwd
> useradd: /home/hsym was created, but could not be removed

This looks like an unrelated problem. A failure to write to /etc/passwd
probably means that the filesystem is full, or that some problem happened
and it is now remounted RO.

I do not see the kernel messages from this job anywhere on the job
pages. Are they stored somewhere?

-- 
Cyril Hrubis
chrubis@suse.cz


* Re: Linux-next-20190823: x86_64/i386: prot_hsymlinks.c:325: Failed to run cmd: useradd hsym
  2019-08-26 10:41 ` Cyril Hrubis
@ 2019-08-26 11:05   ` Jan Stancek
  2019-08-26 13:50     ` Naresh Kamboju
  0 siblings, 1 reply; 12+ messages in thread
From: Jan Stancek @ 2019-08-26 11:05 UTC (permalink / raw)
  To: Cyril Hrubis, Naresh Kamboju
  Cc: ltp, Linux-Next Mailing List, open list, alexey kodanev, the hoang0709



----- Original Message -----
> Hi!
> > Do you see this LTP prot_hsymlinks failure on linux next 20190823 on
> > x86_64 and i386 devices?
> > 
> > test output log,
> > useradd: failure while writing changes to /etc/passwd
> > useradd: /home/hsym was created, but could not be removed
> 
> This looks like an unrelated problem, failure to write to /etc/passwd
> probably means that filesystem is full or some problem happend and how
> is remounted RO.

In Naresh's example, root is on NFS:
  root=/dev/nfs rw nfsroot=10.66.16.123:/var/lib/lava/dispatcher/tmp/886412/extract-nfsrootfs-tyuevoxm,tcp,hard,intr

10.66.16.123:/var/lib/lava/dispatcher/tmp/886412/extract-nfsrootfs-tyuevoxm on / type nfs (rw,relatime,vers=2,rsize=4096,wsize=4096,namlen=255,hard,nolock,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=10.66.16.123,mountvers=1,mountproto=tcp,local_lock=all,addr=10.66.16.123)
devtmpfs on /dev type devtmpfs (rw,relatime,size=3977640k,nr_inodes=994410,mode=755)
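
The mount line above reports vers=2, i.e. the root filesystem is mounted with NFSv2. A minimal sketch of pulling that value out of a mount option string follows; the helper name nfs_vers_from_opts() is hypothetical and is not part of any NFS tooling.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: return the value of the "vers=" option in a
 * comma-separated mount option string, or -1 if it is absent.
 * Matching only at the start of an option keeps "mountvers=" from
 * being mistaken for "vers=".
 */
int nfs_vers_from_opts(const char *opts)
{
	const char *p = opts;

	while (p && *p) {
		if (strncmp(p, "vers=", 5) == 0)
			return atoi(p + 5);
		p = strchr(p, ',');
		if (p)
			p++;	/* step past the comma to the next option */
	}
	return -1;
}
```

Fed the option string from the mount output above, this returns 2. Mounts that spell the option as nfsvers= would need an extra comparison.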

The following message repeats a couple of times in the logs:
  NFS: Server wrote zero bytes, expected XXX

Naresh, can you check whether there are any errors on the NFS server side?
Maybe run the NFS cthon tests against that server with a client running next-20190822 and next-20190823.

> 
> I do not see the kernel messages from this job anywhere at the job
> pages, is it stored somewhere?

They appear to be mixed into the same log file:
  https://qa-reports.linaro.org/lkft/linux-next-oe/build/next-20190823/testrun/886412/log


* Re: Linux-next-20190823: x86_64/i386: prot_hsymlinks.c:325: Failed to run cmd: useradd hsym
  2019-08-26 11:05   ` Jan Stancek
@ 2019-08-26 13:50     ` Naresh Kamboju
  2019-08-26 14:38       ` Jan Stancek
  0 siblings, 1 reply; 12+ messages in thread
From: Naresh Kamboju @ 2019-08-26 13:50 UTC (permalink / raw)
  To: Jan Stancek
  Cc: Cyril Hrubis, ltp, Linux-Next Mailing List, open list,
	alexey kodanev, the hoang0709

Hi Jan and Cyril,

On Mon, 26 Aug 2019 at 16:35, Jan Stancek <jstancek@redhat.com> wrote:
>
>
>
> ----- Original Message -----
> > Hi!
> > > Do you see this LTP prot_hsymlinks failure on linux next 20190823 on
> > > x86_64 and i386 devices?
> > >
> > > test output log,
> > > useradd: failure while writing changes to /etc/passwd
> > > useradd: /home/hsym was created, but could not be removed
> >
> > This looks like an unrelated problem, failure to write to /etc/passwd
> > probably means that filesystem is full or some problem happend and how
> > is remounted RO.
>
> In Naresh' example, root is on NFS:
>   root=/dev/nfs rw nfsroot=10.66.16.123:/var/lib/lava/dispatcher/tmp/886412/extract-nfsrootfs-tyuevoxm,tcp,hard,intr

Right !
root is mounted on NFS.

>
> 10.66.16.123:/var/lib/lava/dispatcher/tmp/886412/extract-nfsrootfs-tyuevoxm on / type nfs (rw,relatime,vers=2,rsize=4096,wsize=4096,namlen=255,hard,nolock,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=10.66.16.123,mountvers=1,mountproto=tcp,local_lock=all,addr=10.66.16.123)
> devtmpfs on /dev type devtmpfs (rw,relatime,size=3977640k,nr_inodes=994410,mode=755)
>
> Following message repeats couple times in logs:
>   NFS: Server wrote zero bytes, expected XXX
>
> Naresh, can you check if there are any errors on NFS server side?

I have re-tested the failed tests on next-20190822, which uses the same
NFS server, and they all pass [1] [2].

> Maybe run NFS cthon against that server with client running next-20190822 and next-20190823.

Thanks for the pointers.
I will set up and run NFS cthon on next-20190822 and next-20190823.

>
> >
> > I do not see the kernel messages from this job anywhere at the job
> > pages, is it stored somewhere?
>
> It appears to be mixed in same log file:
>   https://qa-reports.linaro.org/lkft/linux-next-oe/build/next-20190823/testrun/886412/log

For the record, the following tests failed on linux-next-20190823 on x86_64
and i386. The root filesystem is mounted over NFS, and the tests use a
locally mounted hard drive (with -d /scratch).

On the Juno-r2 device the root filesystem is also mounted over NFS, but these
errors do not occur there and the tests pass on next-20190823.

These failures are reproducible every time on the next-20190823 kernel on
x86_64 and i386 devices with root mounted over NFS [3] [4] [5] [6].

I will git bisect to find the bad commit.

prot_hsymlinks: [3]
------------------
useradd: failure while writing changes to /etc/passwd
useradd: /home/hsym was created, but could not be removed
userdel: user 'hsym' does not exist
prot_hsymlinks    1  TBROK  :  prot_hsymlinks.c:325: Failed to run
cmd: useradd hsym
prot_hsymlinks    2  TBROK  :  prot_hsymlinks.c:325: Remaining cases broken
prot_hsymlinks    3  TBROK  :  prot_hsymlinks.c:325: Failed to run
cmd: userdel -r hsym
prot_hsymlinks    4  TBROK  :  tst_sig.c:234: unexpected signal
SIGIOT/SIGABRT(6) received (pid = 8324).

logrotate01: [4]
-------------
compressing log with: /bin/gzip
error: error creating temp state file /var/lib/logrotate.status.tmp:
Input/output error
logrotate01    1  TFAIL  :  ltpapicmd.c:154: Test #1: logrotate
command exited with 1 return code. Output:

sem_unlink_2-2: [5]
------------------
make[3]: Entering directory
'/opt/ltp/testcases/open_posix_testsuite/conformance/interfaces/sem_unlink'
cat: write error: Input/output error
conformance/interfaces/sem_unlink/sem_unlink_2-2: execution: FAILED

syslog{01 ...10} [6]
-------------------
cp: failed to close '/etc/syslog.conf.ltpback': Input/output error
syslog01    1  TBROK  :  ltpapicmd.c:188: failed to backup /etc/syslog.conf

cp: failed to close '/etc/syslog.conf.ltpback': Input/output error
syslog02    1  TBROK  :  ltpapicmd.c:188: failed to backup /etc/syslog.conf

...
cp: failed to close '/etc/syslog.conf.ltpback': Input/output error
syslog10    1  TBROK  :  ltpapicmd.c:188: failed to backup /etc/syslog.conf

ref:
PASS on 20190822:
[1] https://lkft.validation.linaro.org/scheduler/job/890446#L1232
[2] https://lkft.validation.linaro.org/scheduler/job/890454

FAILED on 20190823:
[3] https://lkft.validation.linaro.org/scheduler/job/890404#L1245
[4] https://lkft.validation.linaro.org/scheduler/job/886408#L2544
[5] https://lkft.validation.linaro.org/scheduler/job/886409#L3088
[6] https://lkft.validation.linaro.org/scheduler/job/890400#L1234

 - Naresh
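
Several of the logs above report the error at close time ("cp: failed to close ... Input/output error"). On an NFS mount, dirty data is flushed to the server on fsync() or close(), so a failed WRITE can surface there rather than in write() itself. A minimal sketch of a user-space check that reports which step fails; the function name write_and_check() and any path passed to it are illustrative only.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Illustrative helper: write len bytes to path, then fsync() and
 * close() the file. Returns 0 on success, or the errno value of the
 * first step that failed (open, write, fsync or close).
 */
int write_and_check(const char *path, const char *data, size_t len)
{
	int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
	size_t done = 0;
	int err = 0;

	if (fd < 0)
		return errno;

	while (done < len) {
		ssize_t n = write(fd, data + done, len - done);

		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, retry */
			err = errno;
			break;
		}
		done += (size_t)n;
	}

	/* Force the data out so a server-side failure is reported here. */
	if (!err && fsync(fd) < 0)
		err = errno;

	/* Always close; keep the earliest error if there was one. */
	if (close(fd) < 0 && !err)
		err = errno;

	return err;
}
```

Calling it on a file under the NFS-mounted root (for example a scratch file in /etc) on an affected kernel would be expected to return EIO, while the same call on a local filesystem should return 0.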


* Re: Linux-next-20190823: x86_64/i386: prot_hsymlinks.c:325: Failed to run cmd: useradd hsym
  2019-08-26 13:50     ` Naresh Kamboju
@ 2019-08-26 14:38       ` Jan Stancek
  2019-08-26 15:58         ` Trond Myklebust
  0 siblings, 1 reply; 12+ messages in thread
From: Jan Stancek @ 2019-08-26 14:38 UTC (permalink / raw)
  To: Naresh Kamboju
  Cc: Cyril Hrubis, ltp, Linux-Next Mailing List, open list,
	alexey kodanev, the hoang0709, trond.myklebust


----- Original Message -----
> Hi Jan and Cyril,
> 
> On Mon, 26 Aug 2019 at 16:35, Jan Stancek <jstancek@redhat.com> wrote:
> >
> >
> >
> > ----- Original Message -----
> > > Hi!
> > > > Do you see this LTP prot_hsymlinks failure on linux next 20190823 on
> > > > x86_64 and i386 devices?
> > > >
> > > > test output log,
> > > > useradd: failure while writing changes to /etc/passwd
> > > > useradd: /home/hsym was created, but could not be removed
> > >
> > > This looks like an unrelated problem, failure to write to /etc/passwd
> > > probably means that filesystem is full or some problem happend and how
> > > is remounted RO.
> >
> > In Naresh' example, root is on NFS:
> >   root=/dev/nfs rw
> >   nfsroot=10.66.16.123:/var/lib/lava/dispatcher/tmp/886412/extract-nfsrootfs-tyuevoxm,tcp,hard,intr
> 
> Right !
> root is mounted on NFS.
> 
> >
> > 10.66.16.123:/var/lib/lava/dispatcher/tmp/886412/extract-nfsrootfs-tyuevoxm
> > on / type nfs
> > (rw,relatime,vers=2,rsize=4096,wsize=4096,namlen=255,hard,nolock,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=10.66.16.123,mountvers=1,mountproto=tcp,local_lock=all,addr=10.66.16.123)
> > devtmpfs on /dev type devtmpfs
> > (rw,relatime,size=3977640k,nr_inodes=994410,mode=755)
> >
> > Following message repeats couple times in logs:
> >   NFS: Server wrote zero bytes, expected XXX
> >
> > Naresh, can you check if there are any errors on NFS server side?
> 
> I have re-tested the failed tests on next-20190822 and all get pass
> which is also
> using same NFS server [1] [2].

Thanks, that suggests some client-side change between next-20190822 and next-20190823
might have introduced it.

> 
> > Maybe run NFS cthon against that server with client running next-20190822
> > and next-20190823.
> 
> Thanks for the pointers.
> I will setup and run NFS cthon on next-20190822 and next-20190823.

I'll try to reproduce too.

> 
> >
> > >
> > > I do not see the kernel messages from this job anywhere at the job
> > > pages, is it stored somewhere?
> >
> > It appears to be mixed in same log file:
> >   https://qa-reports.linaro.org/lkft/linux-next-oe/build/next-20190823/testrun/886412/log
> 
> For the record the following tests failed on linux -next-20190823 on x86_64
> and i386. The filesystem is mounted on NFS and tests are using
> locally mounted hard drive ( with -d /scratch ).
> 
> Juno-r2 device filesystem mounted on NFS and did not see these errors
> and test getting pass on -next-20190823.
> 
> These failures are reproducible all time on next-20190823 kernel on x86_64
> and i386 device with root mounted on NFS [3] [4] [5] [6].
> 
> I will git bisect to find out which is bad commit.
> 
> prot_hsymlinks: [3]
> ------------------
> useradd: failure while writing changes to /etc/passwd
> useradd: /home/hsym was created, but could not be removed
> userdel: user 'hsym' does not exist
> prot_hsymlinks    1  TBROK  :  prot_hsymlinks.c:325: Failed to run
> cmd: useradd hsym
> prot_hsymlinks    2  TBROK  :  prot_hsymlinks.c:325: Remaining cases broken
> prot_hsymlinks    3  TBROK  :  prot_hsymlinks.c:325: Failed to run
> cmd: userdel -r hsym
> prot_hsymlinks    4  TBROK  :  tst_sig.c:234: unexpected signal
> SIGIOT/SIGABRT(6) received (pid = 8324).
> 
> logrotate01: [4]
> -------------
> compressing log with: /bin/gzip
> error: error creating temp state file /var/lib/logrotate.status.tmp:
> Input/output error
> logrotate01    1  TFAIL  :  ltpapicmd.c:154: Test #1: logrotate
> command exited with 1 return code. Output:
> 
> sem_unlink_2-2: [5]
> ------------------
> make[3]: Entering directory
> '/opt/ltp/testcases/open_posix_testsuite/conformance/interfaces/sem_unlink'
> cat: write error: Input/output error
> conformance/interfaces/sem_unlink/sem_unlink_2-2: execution: FAILED
> 
> syslog{01 ...10} [6]
> -------------------
> cp: failed to close '/etc/syslog.conf.ltpback': Input/output error
> syslog01    1  TBROK  :  ltpapicmd.c:188: failed to backup /etc/syslog.conf
> 
> cp: failed to close '/etc/syslog.conf.ltpback': Input/output error
> syslog02    1  TBROK  :  ltpapicmd.c:188: failed to backup /etc/syslog.conf
> 
> ...
> cp: failed to close '/etc/syslog.conf.ltpback': Input/output error
> syslog10    1  TBROK  :  ltpapicmd.c:188: failed to backup /etc/syslog.conf
> 
> ref:
> PASS on 20190222:
> [1] https://lkft.validation.linaro.org/scheduler/job/890446#L1232
> [2] https://lkft.validation.linaro.org/scheduler/job/890454
> 
> FAILED on 20190823:
> [3] https://lkft.validation.linaro.org/scheduler/job/890404#L1245
> [4] https://lkft.validation.linaro.org/scheduler/job/886408#L2544
> [5] https://lkft.validation.linaro.org/scheduler/job/886409#L3088
> [6] https://lkft.validation.linaro.org/scheduler/job/890400#L1234
> 
>  - Naresh
> 


* Re: Linux-next-20190823: x86_64/i386: prot_hsymlinks.c:325: Failed to run cmd: useradd hsym
  2019-08-26 14:38       ` Jan Stancek
@ 2019-08-26 15:58         ` Trond Myklebust
  2019-08-26 23:12           ` Jan Stancek
  0 siblings, 1 reply; 12+ messages in thread
From: Trond Myklebust @ 2019-08-26 15:58 UTC (permalink / raw)
  To: naresh.kamboju, jstancek
  Cc: the_hoang0709, linux-next, ltp, linux-kernel, chrubis, alexey.kodanev

On Mon, 2019-08-26 at 10:38 -0400, Jan Stancek wrote:
> ----- Original Message -----
> > Hi Jan and Cyril,
> > 
> > On Mon, 26 Aug 2019 at 16:35, Jan Stancek <jstancek@redhat.com>
> > wrote:
> > > 
> > > 
> > > ----- Original Message -----
> > > > Hi!
> > > > > Do you see this LTP prot_hsymlinks failure on linux next
> > > > > 20190823 on
> > > > > x86_64 and i386 devices?
> > > > > 
> > > > > test output log,
> > > > > useradd: failure while writing changes to /etc/passwd
> > > > > useradd: /home/hsym was created, but could not be removed
> > > > 
> > > > This looks like an unrelated problem, failure to write to
> > > > /etc/passwd
> > > > probably means that filesystem is full or some problem happend
> > > > and how
> > > > is remounted RO.
> > > 
> > > In Naresh' example, root is on NFS:
> > >   root=/dev/nfs rw
> > >  
> > > nfsroot=10.66.16.123:/var/lib/lava/dispatcher/tmp/886412/extract-
> > > nfsrootfs-tyuevoxm,tcp,hard,intr
> > 
> > Right !
> > root is mounted on NFS.
> > 
> > > 10.66.16.123:/var/lib/lava/dispatcher/tmp/886412/extract-
> > > nfsrootfs-tyuevoxm
> > > on / type nfs
> > > (rw,relatime,vers=2,rsize=4096,wsize=4096,namlen=255,hard,nolock,
> > > proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=10.66.16.123,moun
> > > tvers=1,mountproto=tcp,local_lock=all,addr=10.66.16.123)
> > > devtmpfs on /dev type devtmpfs
> > > (rw,relatime,size=3977640k,nr_inodes=994410,mode=755)
> > > 

The only thing I can think of that might cause an EIO on NFSv2 would be
this patch 
http://git.linux-nfs.org/?p=trondmy/linux-nfs.git;a=commitdiff;h=627d48e597ec5993c4abb3b81dc75e554a07c7c0
assuming that a bind-related error is leaking through.

I'd suggest something like the following to fix it up:

8<---------------------------------------
From 1e9336ac5363914dfcc1f49bf091409edbf36f8d Mon Sep 17 00:00:00 2001
From: Trond Myklebust <trond.myklebust@hammerspace.com>
Date: Mon, 26 Aug 2019 11:44:04 -0400
Subject: [PATCH] fixup! SUNRPC: Don't handle errors if the bind/connect
 succeeded

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
---
 net/sunrpc/clnt.c | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
index f13ec73c8299..a07b516e503a 100644
--- a/net/sunrpc/clnt.c
+++ b/net/sunrpc/clnt.c
@@ -1980,9 +1980,11 @@ call_bind_status(struct rpc_task *task)
 
 	dprint_status(task);
 	trace_rpc_bind_status(task);
-	if (task->tk_status >= 0 || xprt_bound(xprt)) {
-		task->tk_action = call_connect;
-		return;
+	if (task->tk_status >= 0)
+		goto out_next;
+	if (xprt_bound(xprt)) {
+		task->tk_status = 0;
+		goto out_next;
 	}
 
 	switch (task->tk_status) {
@@ -2045,6 +2047,9 @@ call_bind_status(struct rpc_task *task)
 
 	rpc_call_rpcerror(task, status);
 	return;
+out_next:
+	task->tk_action = call_connect;
+	return;
 retry_timeout:
 	task->tk_status = 0;
 	task->tk_action = call_bind;
@@ -2107,8 +2112,10 @@ call_connect_status(struct rpc_task *task)
 		clnt->cl_stats->netreconn++;
 		goto out_next;
 	}
-	if (xprt_connected(xprt))
+	if (xprt_connected(xprt)) {
+		task->tk_status = 0;
 		goto out_next;
+	}
 
 	task->tk_status = 0;
 	switch (status) {
-- 
2.21.0



-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com




* Re: Linux-next-20190823: x86_64/i386: prot_hsymlinks.c:325: Failed to run cmd: useradd hsym
  2019-08-26 15:58         ` Trond Myklebust
@ 2019-08-26 23:12           ` Jan Stancek
  2019-08-27  0:59             ` Trond Myklebust
  2019-08-27  6:34             ` Naresh Kamboju
  0 siblings, 2 replies; 12+ messages in thread
From: Jan Stancek @ 2019-08-26 23:12 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: naresh kamboju, the hoang0709, linux-next, ltp, linux-kernel,
	chrubis, alexey kodanev


----- Original Message -----
> On Mon, 2019-08-26 at 10:38 -0400, Jan Stancek wrote:
> > ----- Original Message -----
> > > Hi Jan and Cyril,
> > > 
> > > On Mon, 26 Aug 2019 at 16:35, Jan Stancek <jstancek@redhat.com>
> > > wrote:
> > > > 
> > > > 
> > > > ----- Original Message -----
> > > > > Hi!
> > > > > > Do you see this LTP prot_hsymlinks failure on linux next
> > > > > > 20190823 on
> > > > > > x86_64 and i386 devices?
> > > > > > 
> > > > > > test output log,
> > > > > > useradd: failure while writing changes to /etc/passwd
> > > > > > useradd: /home/hsym was created, but could not be removed
> > > > > 
> > > > > This looks like an unrelated problem, failure to write to
> > > > > /etc/passwd
> > > > > probably means that filesystem is full or some problem happend
> > > > > and how
> > > > > is remounted RO.
> > > > 
> > > > In Naresh' example, root is on NFS:
> > > >   root=/dev/nfs rw
> > > >  
> > > > nfsroot=10.66.16.123:/var/lib/lava/dispatcher/tmp/886412/extract-
> > > > nfsrootfs-tyuevoxm,tcp,hard,intr
> > > 
> > > Right !
> > > root is mounted on NFS.
> > > 
> > > > 10.66.16.123:/var/lib/lava/dispatcher/tmp/886412/extract-
> > > > nfsrootfs-tyuevoxm
> > > > on / type nfs
> > > > (rw,relatime,vers=2,rsize=4096,wsize=4096,namlen=255,hard,nolock,
> > > > proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=10.66.16.123,moun
> > > > tvers=1,mountproto=tcp,local_lock=all,addr=10.66.16.123)
> > > > devtmpfs on /dev type devtmpfs
> > > > (rw,relatime,size=3977640k,nr_inodes=994410,mode=755)
> > > > 
> 
> The only thing I can think of that might cause an EIO on NFSv2 would be
> this patch
> http://git.linux-nfs.org/?p=trondmy/linux-nfs.git;a=commitdiff;h=627d48e597ec5993c4abb3b81dc75e554a07c7c0
> assuming that a bind-related error is leaking through.
> 
> I'd suggest something like the following to fix it up:

No change with that patch,
but the following one fixes it for me:

diff --git a/fs/nfs/pagelist.c b/fs/nfs/pagelist.c
index 20b3717cd7ca..56cefa0ab804 100644
--- a/fs/nfs/pagelist.c
+++ b/fs/nfs/pagelist.c
@@ -590,7 +590,7 @@ static void nfs_pgio_rpcsetup(struct nfs_pgio_header *hdr,
        }
 
        hdr->res.fattr   = &hdr->fattr;
-       hdr->res.count   = 0;
+       hdr->res.count   = count;
        hdr->res.eof     = 0;
        hdr->res.verf    = &hdr->verf;
        nfs_fattr_init(&hdr->fattr);

which is functionally a revert of "NFS: Fix initialisation of I/O result struct in nfs_pgio_rpcsetup".

This hunk caught my eye, could res.eof == 0 explain those I/O errors?
                /* Emulate the eof flag, which isn't normally needed in NFSv2
                 * as it is guaranteed to always return the file attributes
                 */
                if (hdr->args.offset + hdr->res.count >= hdr->res.fattr->size)
                        hdr->res.eof = 1;
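
To make the question concrete, here is a small user-space model of the condition quoted above. The struct and function names are hypothetical stand-ins for the kernel's hdr->args.offset, hdr->res.count and hdr->res.fattr->size; this is a sketch of the arithmetic only, not kernel code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the fields used by the quoted check. */
struct read_model {
	uint64_t offset;	/* models hdr->args.offset */
	uint32_t res_count;	/* models hdr->res.count */
	uint64_t file_size;	/* models hdr->res.fattr->size */
};

/* Same comparison as the quoted hunk: eof once offset + count reaches size. */
bool nfs2_emulated_eof(const struct read_model *r)
{
	return r->offset + r->res_count >= r->file_size;
}
```

With res_count left at 0 for a read at offset 0 of an 8192-byte file, the model returns false, so eof would stay 0. With res_count equal to the bytes actually read, a read of the final 4096 bytes at offset 4096 returns true.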


* Re: Linux-next-20190823: x86_64/i386: prot_hsymlinks.c:325: Failed to run cmd: useradd hsym
  2019-08-26 23:12           ` Jan Stancek
@ 2019-08-27  0:59             ` Trond Myklebust
  2019-08-27 10:25               ` Jan Stancek
  2019-08-27  6:34             ` Naresh Kamboju
  1 sibling, 1 reply; 12+ messages in thread
From: Trond Myklebust @ 2019-08-27  0:59 UTC (permalink / raw)
  To: jstancek
  Cc: naresh.kamboju, the_hoang0709, linux-next, ltp, linux-kernel,
	chrubis, alexey.kodanev

On Mon, 2019-08-26 at 19:12 -0400, Jan Stancek wrote:
> ----- Original Message -----
> > On Mon, 2019-08-26 at 10:38 -0400, Jan Stancek wrote:
> > > ----- Original Message -----
> > > > Hi Jan and Cyril,
> > > > 
> > > > On Mon, 26 Aug 2019 at 16:35, Jan Stancek <jstancek@redhat.com>
> > > > wrote:
> > > > > 
> > > > > ----- Original Message -----
> > > > > > Hi!
> > > > > > > Do you see this LTP prot_hsymlinks failure on linux next
> > > > > > > 20190823 on
> > > > > > > x86_64 and i386 devices?
> > > > > > > 
> > > > > > > test output log,
> > > > > > > useradd: failure while writing changes to /etc/passwd
> > > > > > > useradd: /home/hsym was created, but could not be removed
> > > > > > 
> > > > > > This looks like an unrelated problem, failure to write to
> > > > > > /etc/passwd
> > > > > > probably means that filesystem is full or some problem
> > > > > > happend
> > > > > > and how
> > > > > > is remounted RO.
> > > > > 
> > > > > In Naresh' example, root is on NFS:
> > > > >   root=/dev/nfs rw
> > > > >  
> > > > > nfsroot=10.66.16.123:/var/lib/lava/dispatcher/tmp/886412/extr
> > > > > act-
> > > > > nfsrootfs-tyuevoxm,tcp,hard,intr
> > > > 
> > > > Right !
> > > > root is mounted on NFS.
> > > > 
> > > > > 10.66.16.123:/var/lib/lava/dispatcher/tmp/886412/extract-
> > > > > nfsrootfs-tyuevoxm
> > > > > on / type nfs
> > > > > (rw,relatime,vers=2,rsize=4096,wsize=4096,namlen=255,hard,nol
> > > > > ock,
> > > > > proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=10.66.16.123,
> > > > > moun
> > > > > tvers=1,mountproto=tcp,local_lock=all,addr=10.66.16.123)
> > > > > devtmpfs on /dev type devtmpfs
> > > > > (rw,relatime,size=3977640k,nr_inodes=994410,mode=755)
> > > > > 
> > 
> > The only thing I can think of that might cause an EIO on NFSv2
> > would be
> > this patch
> > http://git.linux-nfs.org/?p=trondmy/linux-nfs.git;a=commitdiff;h=627d48e597ec5993c4abb3b81dc75e554a07c7c0
> > assuming that a bind-related error is leaking through.
> > 
> > I'd suggest something like the following to fix it up:
> 
> No change with that patch,
> but following one fixes it for me:
> 
> diff --git a/fs/nfs/pagelist.c b/fs/nfs/pagelist.c
> index 20b3717cd7ca..56cefa0ab804 100644
> --- a/fs/nfs/pagelist.c
> +++ b/fs/nfs/pagelist.c
> @@ -590,7 +590,7 @@ static void nfs_pgio_rpcsetup(struct
> nfs_pgio_header *hdr,
>         }
>  
>         hdr->res.fattr   = &hdr->fattr;
> -       hdr->res.count   = 0;
> +       hdr->res.count   = count;
>         hdr->res.eof     = 0;
>         hdr->res.verf    = &hdr->verf;
>         nfs_fattr_init(&hdr->fattr);
> 
> which is functionally revert of "NFS: Fix initialisation of I/O
> result struct in nfs_pgio_rpcsetup".
> 
> This hunk caught my eye, could res.eof == 0 explain those I/O errors?

Interesting hypothesis. It could if res.count ends up being 0. So does
the following also fix the problem?
8<----------------------------------------
From b5bc0812350e94f8c9331174d22f24692411aef9 Mon Sep 17 00:00:00 2001
From: Trond Myklebust <trond.myklebust@hammerspace.com>
Date: Mon, 26 Aug 2019 20:41:16 -0400
Subject: [PATCH] NFSv2: Fix eof handling

If we received a reply from the server with a zero length read and
no error, then that implies we are at eof.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
---
 fs/nfs/proc.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/nfs/proc.c b/fs/nfs/proc.c
index 5552fa8b6e12..5919878549d2 100644
--- a/fs/nfs/proc.c
+++ b/fs/nfs/proc.c
@@ -594,7 +594,8 @@ static int nfs_read_done(struct rpc_task *task, struct nfs_pgio_header *hdr)
 		/* Emulate the eof flag, which isn't normally needed in NFSv2
 		 * as it is guaranteed to always return the file attributes
 		 */
-		if (hdr->args.offset + hdr->res.count >= hdr->res.fattr->size)
+		if (hdr->res.count == 0 && hdr->args.count > 0 ||
+		    hdr->args.offset + hdr->res.count >= hdr->res.fattr->size)
 			hdr->res.eof = 1;
 	}
 	return 0;
-- 
2.21.0
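
For reference, the condition proposed in the patch above can be modelled in user space as follows. The function name and parameters are hypothetical; they stand in for hdr->args.offset, hdr->args.count, hdr->res.count and hdr->res.fattr->size. Because && binds tighter than ||, the patch's expression groups as shown by the explicit parentheses.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Model of the patched eof test: a zero-length reply to a non-empty
 * read request counts as eof, in addition to the original
 * "offset + count reaches the file size" rule.
 */
bool nfs2_emulated_eof_patched(uint64_t offset, uint32_t args_count,
			       uint32_t res_count, uint64_t file_size)
{
	return (res_count == 0 && args_count > 0) ||
	       offset + res_count >= file_size;
}
```

In this model a 4096-byte request at offset 0 that comes back with zero bytes is treated as eof, while a full 4096-byte reply in the middle of an 8192-byte file is not.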

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com




* Re: Linux-next-20190823: x86_64/i386: prot_hsymlinks.c:325: Failed to run cmd: useradd hsym
  2019-08-26 23:12           ` Jan Stancek
  2019-08-27  0:59             ` Trond Myklebust
@ 2019-08-27  6:34             ` Naresh Kamboju
  1 sibling, 0 replies; 12+ messages in thread
From: Naresh Kamboju @ 2019-08-27  6:34 UTC (permalink / raw)
  To: Jan Stancek
  Cc: Trond Myklebust, the hoang0709, Linux-Next Mailing List, ltp,
	open list, chrubis, alexey kodanev

On Tue, 27 Aug 2019 at 04:42, Jan Stancek <jstancek@redhat.com> wrote:
>
>
> ----- Original Message -----
> > On Mon, 2019-08-26 at 10:38 -0400, Jan Stancek wrote:
>
> No change with that patch,

Same for me.

> but following one fixes it for me:

Works for me.
Thanks for the fix patch.

>
> diff --git a/fs/nfs/pagelist.c b/fs/nfs/pagelist.c
> index 20b3717cd7ca..56cefa0ab804 100644
> --- a/fs/nfs/pagelist.c
> +++ b/fs/nfs/pagelist.c
> @@ -590,7 +590,7 @@ static void nfs_pgio_rpcsetup(struct nfs_pgio_header *hdr,
>         }
>
>         hdr->res.fattr   = &hdr->fattr;
> -       hdr->res.count   = 0;
> +       hdr->res.count   = count;
>         hdr->res.eof     = 0;
>         hdr->res.verf    = &hdr->verf;
>         nfs_fattr_init(&hdr->fattr);
>
> which is functionally revert of "NFS: Fix initialisation of I/O result struct in nfs_pgio_rpcsetup".
>
> This hunk caught my eye, could res.eof == 0 explain those I/O errors?
>                 /* Emulate the eof flag, which isn't normally needed in NFSv2
>                  * as it is guaranteed to always return the file attributes
>                  */
>                 if (hdr->args.offset + hdr->res.count >= hdr->res.fattr->size)
>                         hdr->res.eof = 1;


- Naresh


* Re: Linux-next-20190823: x86_64/i386: prot_hsymlinks.c:325: Failed to run cmd: useradd hsym
  2019-08-27  0:59             ` Trond Myklebust
@ 2019-08-27 10:25               ` Jan Stancek
  2019-08-27 12:58                 ` Trond Myklebust
  0 siblings, 1 reply; 12+ messages in thread
From: Jan Stancek @ 2019-08-27 10:25 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: naresh kamboju, the hoang0709, linux-next, ltp, linux-kernel,
	chrubis, alexey kodanev


----- Original Message -----
> On Mon, 2019-08-26 at 19:12 -0400, Jan Stancek wrote:
> > ----- Original Message -----
> > > On Mon, 2019-08-26 at 10:38 -0400, Jan Stancek wrote:
> > > > ----- Original Message -----
> > > > > Hi Jan and Cyril,
> > > > > 
> > > > > On Mon, 26 Aug 2019 at 16:35, Jan Stancek <jstancek@redhat.com>
> > > > > wrote:
> > > > > > 
> > > > > > ----- Original Message -----
> > > > > > > Hi!
> > > > > > > > Do you see this LTP prot_hsymlinks failure on linux next
> > > > > > > > 20190823 on
> > > > > > > > x86_64 and i386 devices?
> > > > > > > > 
> > > > > > > > test output log,
> > > > > > > > useradd: failure while writing changes to /etc/passwd
> > > > > > > > useradd: /home/hsym was created, but could not be removed
> > > > > > > 
> > > > > > > This looks like an unrelated problem, failure to write to
> > > > > > > /etc/passwd
> > > > > > > probably means that filesystem is full or some problem
> > > > > > > happend
> > > > > > > and how
> > > > > > > is remounted RO.
> > > > > > 
> > > > > > In Naresh' example, root is on NFS:
> > > > > >   root=/dev/nfs rw
> > > > > >  
> > > > > > nfsroot=10.66.16.123:/var/lib/lava/dispatcher/tmp/886412/extr
> > > > > > act-
> > > > > > nfsrootfs-tyuevoxm,tcp,hard,intr
> > > > > 
> > > > > Right !
> > > > > root is mounted on NFS.
> > > > > 
> > > > > > 10.66.16.123:/var/lib/lava/dispatcher/tmp/886412/extract-
> > > > > > nfsrootfs-tyuevoxm
> > > > > > on / type nfs
> > > > > > (rw,relatime,vers=2,rsize=4096,wsize=4096,namlen=255,hard,nol
> > > > > > ock,
> > > > > > proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=10.66.16.123,
> > > > > > moun
> > > > > > tvers=1,mountproto=tcp,local_lock=all,addr=10.66.16.123)
> > > > > > devtmpfs on /dev type devtmpfs
> > > > > > (rw,relatime,size=3977640k,nr_inodes=994410,mode=755)
> > > > > > 
> > > 
> > > The only thing I can think of that might cause an EIO on NFSv2
> > > would be
> > > this patch
> > > http://git.linux-nfs.org/?p=trondmy/linux-nfs.git;a=commitdiff;h=627d48e597ec5993c4abb3b81dc75e554a07c7c0
> > > assuming that a bind-related error is leaking through.
> > > 
> > > I'd suggest something like the following to fix it up:
> > 
> > No change with that patch,
> > but following one fixes it for me:
> > 
> > diff --git a/fs/nfs/pagelist.c b/fs/nfs/pagelist.c
> > index 20b3717cd7ca..56cefa0ab804 100644
> > --- a/fs/nfs/pagelist.c
> > +++ b/fs/nfs/pagelist.c
> > @@ -590,7 +590,7 @@ static void nfs_pgio_rpcsetup(struct nfs_pgio_header *hdr,
> >         }
> >  
> >         hdr->res.fattr   = &hdr->fattr;
> > -       hdr->res.count   = 0;
> > +       hdr->res.count   = count;
> >         hdr->res.eof     = 0;
> >         hdr->res.verf    = &hdr->verf;
> >         nfs_fattr_init(&hdr->fattr);
> > 
> > which is functionally a revert of "NFS: Fix initialisation of I/O
> > result struct in nfs_pgio_rpcsetup".
> > 
> > This hunk caught my eye, could res.eof == 0 explain those I/O errors?
> 
> Interesting hypothesis. It could if res.count ends up being 0. So does
> the following also fix the problem?

It didn't fix it.

That theory is probably not correct for this case, since the EIO I see appears
to originate from write and nfs_writeback_result(). This function also
produces the message we saw in the logs from Naresh.

I can't find where/how resp->count is updated on a WRITE reply in NFSv2.
The issue also goes away with the patch below, though I can't speak to its correctness:

NFS version     Type    Test    Return code
nfsvers=2       tcp     -b:base         0
nfsvers=2       tcp     -g:general      0
nfsvers=2       tcp     -s:special      0
nfsvers=2       tcp     -l:lock         0
Total time: 141

diff --git a/fs/nfs/nfs2xdr.c b/fs/nfs/nfs2xdr.c
index cbc17a203248..4913c6da270b 100644
--- a/fs/nfs/nfs2xdr.c
+++ b/fs/nfs/nfs2xdr.c
@@ -897,6 +897,16 @@ static int nfs2_xdr_dec_writeres(struct rpc_rqst *req, struct xdr_stream *xdr,
                                 void *data)
 {
        struct nfs_pgio_res *result = data;
+       struct rpc_task *rq_task  = req->rq_task;
+
+       if (rq_task) {
+               struct nfs_pgio_args *args = rq_task->tk_msg.rpc_argp;
+
+               if (args) {
+                       result->count = args->count;
+               }
+       }
 
        /* All NFSv2 writes are "file sync" writes */
        result->verf->committed = NFS_FILE_SYNC;
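
To illustrate the failure mode discussed above: the NFSv2 WRITE reply does not
seem to give the client a byte count (resp->count is not updated anywhere on
that path), so a result count initialised to 0 stays 0 even when the write
succeeded. Below is a minimal, self-contained C sketch of that situation. It is
not kernel code. The names model_write_args, model_write_res,
model_writeback_result and model_dec_writeres are hypothetical stand-ins, and
the rule that a zero count against a nonzero request becomes -EIO is an
assumption about nfs_writeback_result(), based only on the EIO seen in this
thread.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for nfs_pgio_args / nfs_pgio_res. */
struct model_write_args { uint32_t count; };	/* bytes the client asked to write */
struct model_write_res  { uint32_t count; };	/* bytes the client believes were written */

/*
 * Assumed model of the completion check: a reply that reports zero bytes
 * written for a nonzero request is turned into -EIO.
 */
static int model_writeback_result(const struct model_write_args *args,
				  const struct model_write_res *res)
{
	if (args->count != 0 && res->count == 0)
		return -EIO;
	return 0;
}

/*
 * Model of the decoder-side change in the patch above: the reply carries no
 * count, so take it from the request when the arguments are available.
 */
static void model_dec_writeres(const struct model_write_args *args,
			       struct model_write_res *res)
{
	if (args)
		res->count = args->count;
}
```

With res.count left at 0, model_writeback_result() reports -EIO for a 4096-byte
request; after model_dec_writeres() copies the requested count, the same check
returns 0.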


* Re: Linux-next-20190823: x86_64/i386: prot_hsymlinks.c:325: Failed to run cmd: useradd hsym
  2019-08-27 10:25               ` Jan Stancek
@ 2019-08-27 12:58                 ` Trond Myklebust
  2019-08-27 13:20                   ` Jan Stancek
  0 siblings, 1 reply; 12+ messages in thread
From: Trond Myklebust @ 2019-08-27 12:58 UTC (permalink / raw)
  To: jstancek
  Cc: naresh.kamboju, the_hoang0709, linux-next, ltp, linux-kernel,
	chrubis, alexey.kodanev

On Tue, 2019-08-27 at 06:25 -0400, Jan Stancek wrote:
> That theory is probably not correct for this case, since EIO I see
> appears
> to originate from write and nfs_writeback_result(). This function
> also
> produces message we saw in logs from Naresh.
> 
> I can't find where/how is resp->count updated on WRITE reply in
> NFSv2.
> Issue also goes away with patch below, though I can't speak about its
> correctness:
> 
> NFS version     Type    Test    Return code
> nfsvers=2       tcp     -b:base         0
> nfsvers=2       tcp     -g:general      0
> nfsvers=2       tcp     -s:special      0
> nfsvers=2       tcp     -l:lock         0
> Total time: 141
> 
> diff --git a/fs/nfs/nfs2xdr.c b/fs/nfs/nfs2xdr.c
> index cbc17a203248..4913c6da270b 100644
> --- a/fs/nfs/nfs2xdr.c
> +++ b/fs/nfs/nfs2xdr.c
> @@ -897,6 +897,16 @@ static int nfs2_xdr_dec_writeres(struct rpc_rqst *req, struct xdr_stream *xdr,
>                                  void *data)
>  {
>         struct nfs_pgio_res *result = data;
> +       struct rpc_task *rq_task  = req->rq_task;
> +
> +       if (rq_task) {
> +               struct nfs_pgio_args *args = rq_task->tk_msg.rpc_argp;
> +
> +               if (args) {
> +                       result->count = args->count;
> +               }
> +       }
>  
>         /* All NFSv2 writes are "file sync" writes */
>         result->verf->committed = NFS_FILE_SYNC;

Thanks! I've moved the above to nfs_write_done() so that we do it only
on success (see 
http://git.linux-nfs.org/?p=trondmy/linux-nfs.git;a=commitdiff;h=3ba5688da709dd0f7d917029c206bc1848a6ae74
)
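
The difference between doing this in the XDR decoder and doing it "only on
success" can be sketched as follows. This is a hedged, standalone C model and
not the code in the commit linked above. The names sketch_args, sketch_res and
sketch_write_done are hypothetical, and treating a non-negative status as
success is an assumption made for the illustration.

```c
#include <stdint.h>

/* Hypothetical, simplified request/result pair for one WRITE. */
struct sketch_args { uint32_t count; };	/* bytes requested */
struct sketch_res  { uint32_t count; };	/* bytes reported as written */

/*
 * Model of a completion hook: only when the call succeeded (status >= 0,
 * an assumption of this sketch) is the requested count copied into the
 * result. On failure the result count is left untouched.
 */
static void sketch_write_done(int status, const struct sketch_args *args,
			      struct sketch_res *res)
{
	if (status >= 0)
		res->count = args->count;
}
```

A failed call therefore keeps res.count at its initial value, while a
successful one ends up with res.count equal to the requested size.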
-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com




* Re: Linux-next-20190823: x86_64/i386: prot_hsymlinks.c:325: Failed to run cmd: useradd hsym
  2019-08-27 12:58                 ` Trond Myklebust
@ 2019-08-27 13:20                   ` Jan Stancek
  0 siblings, 0 replies; 12+ messages in thread
From: Jan Stancek @ 2019-08-27 13:20 UTC (permalink / raw)
  To: Trond Myklebust, naresh kamboju
  Cc: the hoang0709, linux-next, ltp, linux-kernel, chrubis, alexey kodanev



----- Original Message -----
> On Tue, 2019-08-27 at 06:25 -0400, Jan Stancek wrote:
> > That theory is probably not correct for this case, since EIO I see
> > appears
> > to originate from write and nfs_writeback_result(). This function
> > also
> > produces message we saw in logs from Naresh.
> > 
> > I can't find where/how is resp->count updated on WRITE reply in
> > NFSv2.
> > Issue also goes away with patch below, though I can't speak about its
> > correctness:
> > 
> > NFS version     Type    Test    Return code
> > nfsvers=2       tcp     -b:base         0
> > nfsvers=2       tcp     -g:general      0
> > nfsvers=2       tcp     -s:special      0
> > nfsvers=2       tcp     -l:lock         0
> > Total time: 141
> > 
> > diff --git a/fs/nfs/nfs2xdr.c b/fs/nfs/nfs2xdr.c
> > index cbc17a203248..4913c6da270b 100644
> > --- a/fs/nfs/nfs2xdr.c
> > +++ b/fs/nfs/nfs2xdr.c
> > @@ -897,6 +897,16 @@ static int nfs2_xdr_dec_writeres(struct rpc_rqst *req, struct xdr_stream *xdr,
> >                                  void *data)
> >  {
> >         struct nfs_pgio_res *result = data;
> > +       struct rpc_task *rq_task  = req->rq_task;
> > +
> > +       if (rq_task) {
> > +               struct nfs_pgio_args *args = rq_task->tk_msg.rpc_argp;
> > +
> > +               if (args) {
> > +                       result->count = args->count;
> > +               }
> > +       }
> >  
> >         /* All NFSv2 writes are "file sync" writes */
> >         result->verf->committed = NFS_FILE_SYNC;
> 
> Thanks! I've moved the above to nfs_write_done() so that we do it only
> on success (see
> http://git.linux-nfs.org/?p=trondmy/linux-nfs.git;a=commitdiff;h=3ba5688da709dd0f7d917029c206bc1848a6ae74
> )

Thanks, retested with 3ba5688da, all PASS:

NFS version     Type    Test    Return code
nfsvers=2       tcp     -b:base         0
nfsvers=2       tcp     -g:general      0
nfsvers=2       tcp     -s:special      0
nfsvers=2       tcp     -l:lock         0

NFS version     Type    Test    Return code
nfsvers=3       tcp     -b:base         0
nfsvers=3       tcp     -g:general      0
nfsvers=3       tcp     -s:special      0
nfsvers=3       tcp     -l:lock         0
nfsvers=3       tcp6    -b:base         0
nfsvers=3       tcp6    -g:general      0
nfsvers=3       tcp6    -s:special      0
nfsvers=3       tcp6    -l:lock         0

NFS version     Type    Test    Return code
nfsvers=4       tcp     -b:base         0
nfsvers=4       tcp     -g:general      0
nfsvers=4       tcp     -s:special      0
nfsvers=4       tcp     -l:lock         0
nfsvers=4       tcp6    -b:base         0
nfsvers=4       tcp6    -g:general      0
nfsvers=4       tcp6    -s:special      0
nfsvers=4       tcp6    -l:lock         0

Feel free to add also:

Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Tested-by: Jan Stancek <jstancek@redhat.com>


Thread overview: 12+ messages
2019-08-26  9:47 Linux-next-20190823: x86_64/i386: prot_hsymlinks.c:325: Failed to run cmd: useradd hsym Naresh Kamboju
2019-08-26 10:41 ` Cyril Hrubis
2019-08-26 11:05   ` Jan Stancek
2019-08-26 13:50     ` Naresh Kamboju
2019-08-26 14:38       ` Jan Stancek
2019-08-26 15:58         ` Trond Myklebust
2019-08-26 23:12           ` Jan Stancek
2019-08-27  0:59             ` Trond Myklebust
2019-08-27 10:25               ` Jan Stancek
2019-08-27 12:58                 ` Trond Myklebust
2019-08-27 13:20                   ` Jan Stancek
2019-08-27  6:34             ` Naresh Kamboju
