Linux-NFS Archive on lore.kernel.org
From: Trond Myklebust <trondmy@hammerspace.com>
To: "aglo@umich.edu" <aglo@umich.edu>
Cc: "linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>,
	"neilb@suse.com" <neilb@suse.com>
Subject: Re: multipath patches
Date: Fri, 12 Jul 2019 17:18:17 +0000
Message-ID: <b220fa1ef8b73c99aacb28285af9025d5c7a55fd.camel@hammerspace.com>
In-Reply-To: <CAN-5tyHkdmTJZAYwAcfUy4hYOz=KHgdBeTrYMNYiWQaKp7UrJA@mail.gmail.com>

On Fri, 2019-07-12 at 12:39 -0400, Olga Kornievskaia wrote:
> On Thu, Jul 11, 2019 at 5:13 PM Trond Myklebust <trondmy@hammerspace.com> wrote:
> > On Thu, 2019-07-11 at 16:33 -0400, Olga Kornievskaia wrote:
> > > On Thu, Jul 11, 2019 at 3:29 PM Trond Myklebust <trondmy@hammerspace.com> wrote:
> > > > On Thu, 2019-07-11 at 15:06 -0400, Olga Kornievskaia wrote:
> > > > > Hi Trond,
> > > > > 
> > > > > I see that you have nconnect patches in your testing branch
> > > > > (as well as your linux-next and I assume they are the same).
> > > > > There is something wrong with that version. A mount hangs the
> > > > > machine.
> > > > > 
> > > > > [  132.143379] watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [mount.nfs:2624]
> > > > > 
> > > > > I don't have such problems with the patch series that Neil has
> > > > > posted.
> > > > > 
> > > > > Thank you.
> > > > 
> > > > How are the patchsets different? As far as I know, all I did was
> > > > apply the 3 patches that Neil added to my existing branch.
> > > 
> > > I'm not sure. I had a problem with your "multipath" branch before,
> > > and I recall that what I did was go back and re-download your posted
> > > patches. That was when I was testing performance. So if you haven't
> > > touched that branch and just used it, I think it's the same problem.
> > > 
> > > In the current testing branch I don't see several patches that Neil
> > > has added (posted) to the mailing list. So I'm not sure what you
> > > mean when you say you added 3 of his patches on top of yours. At
> > > most I can say maybe you added 2 of his (one that allows for v2 and
> > > v3, and another that does state operations on a single connection).
> > > The sunrpc stats patches that were posted are not there.
> > > 
> > > What I know is that if I revert your branch to
> > > bf11fbdb20b385157b046ea7781f04d0c62554a3 (before the patches) and
> > > apply Neil's patches, all is fine. I really don't want to debug a
> > > non-working version when there is one that works.
> > 
> > Sure, but that is not really an option given the rules for how trees
> > in linux-next are supposed to work. They are considered to be more or
> > less stable.
> > 
> > Anyhow, I think I've found the bug. Neil had silently fixed it in one
> > of my patches, so I've added an incremental patch that does more or
> > less what he did.
> 
> I just pulled and I still have a problem with the nconnect mount.
> Machine still hangs.
> 
> The stack trace isn't in NFS, but I'm betting it's somehow related:
> 
> [  235.756747] general protection fault: 0000 [#1] SMP PTI
> [  235.765187] CPU: 0 PID: 2780 Comm: pool Tainted: G        W 5.2.0-rc7+ #29
> [  235.768555] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 04/13/2018
> [  235.774368] RIP: 0010:kmem_cache_alloc_node_trace+0x10b/0x1e0
> [  235.777576] Code: 4d 89 e1 41 f6 44 24 0b 04 0f 84 5f ff ff ff 4c 89 e7 e8 08 b6 01 00 49 89 c1 e9 4f ff ff ff 41 8b 41 20 49 8b 39 48 8d 4a 01 <49> 8b 1c 06 4c 89 f0 65 48 0f c7 0f 0f 94 c0 84 c0 0f 84 36 ff ff
> [  235.786811] RSP: 0018:ffffbc7c4200fe58 EFLAGS: 00010246
> [  235.789778] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000002b7c
> [  235.793204] RDX: 0000000000002b7b RSI: 0000000000000dc0 RDI: 000000000002d96
> [  235.796182] RBP: 0000000000000dc0 R08: ffff9c7bfa82d960 R09: ffff9c7bcfc06d00
> [  235.799135] R10: ffff9c7bfddf0240 R11: 0000000000000001 R12: ffff9c7bcfc06d00
> [  235.802094] R13: 0000000000000000 R14: f000ff53f000ff53 R15: ffffffffbe2d4d71
> [  235.805072] FS:  00007fd7f1d48700(0000) GS:ffff9c7bfa800000(0000) knlGS:0000000000000000
> [  235.808430] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  235.810762] CR2: 00007fd7f0eb65a4 CR3: 0000000012046005 CR4: 00000000001606f0
> [  235.813662] Call Trace:
> [  235.814694]  alloc_rt_sched_group+0xf1/0x250
> [  235.816439]  sched_create_group+0x59/0x70
> [  235.818094]  sched_autogroup_create_attach+0x3a/0x160
> [  235.820148]  ksys_setsid+0xeb/0x100
> [  235.821645]  __ia32_sys_setsid+0xa/0x10
> [  235.823216]  do_syscall_64+0x55/0x1a0
> [  235.824710]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> 

Ah.. Missing xprt_get(). Fixed in the 'testing' branch now. I'll send
out a patch for review.

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com


