* multipath patches
From: Olga Kornievskaia @ 2019-07-11 19:06 UTC
To: trond.myklebust, NeilBrown; +Cc: linux-nfs

Hi Trond,

I see that you have nconnect patches in your testing branch (as well
as your linux-next and I assume they are the same). There is
something wrong with that version. A mount hangs the machine.

[ 132.143379] watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [mount.nfs:2624]

I don't have such problems with the patch series that Neil has
posted.

Thank you.
* Re: multipath patches
From: Trond Myklebust @ 2019-07-11 19:29 UTC
To: aglo, neilb; +Cc: linux-nfs

On Thu, 2019-07-11 at 15:06 -0400, Olga Kornievskaia wrote:
> Hi Trond,
>
> I see that you have nconnect patches in your testing branch (as well
> as your linux-next and I assume they are the same). There is
> something wrong with that version. A mount hangs the machine.
>
> [ 132.143379] watchdog: BUG: soft lockup - CPU#0 stuck for 23s!
> [mount.nfs:2624]
>
> I don't have such problems with the patch series that Neil has
> posted.
>
> Thank you.

How are the patchsets different? As far as I know, all I did was apply
the 3 patches that Neil added to my existing branch.

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com
* Re: multipath patches
From: Olga Kornievskaia @ 2019-07-11 20:33 UTC
To: Trond Myklebust; +Cc: neilb, linux-nfs

On Thu, Jul 11, 2019 at 3:29 PM Trond Myklebust <trondmy@hammerspace.com> wrote:
> How are the patchsets different? As far as I know, all I did was apply
> the 3 patches that Neil added to my existing branch.

I'm not sure. I had a problem with your "multipath" branch before, and
I recall that what I did was go back and re-download your posted patches.
That was when I was testing performance. So if you haven't touched
that branch and just used it, I think it's the same problem.

In the current testing branch I don't see several patches that Neil
has added (posted) to the mailing list. So I'm not sure what you mean
by saying you added 3 of his patches on top of yours. At most I can say
maybe you added 2 of his (one that allows for v2 and v3 and another that
does state operations on a single connection). There are no patches for
sunrpc stats that were posted.

What I know is that if I revert your branch to
bf11fbdb20b385157b046ea7781f04d0c62554a3, before the patches, and apply
Neil's patches, all is fine. I really don't want to debug a non-working
version when there is one that works.
* Re: multipath patches
From: Trond Myklebust @ 2019-07-11 21:13 UTC
To: aglo; +Cc: linux-nfs, neilb

On Thu, 2019-07-11 at 16:33 -0400, Olga Kornievskaia wrote:
> What I know is that if I revert your branch to
> bf11fbdb20b385157b046ea7781f04d0c62554a3, before the patches, and apply
> Neil's patches, all is fine. I really don't want to debug a non-working
> version when there is one that works.

Sure, but that is not really an option given the rules for how trees in
linux-next are supposed to work. They are considered to be more or less
stable.

Anyhow, I think I've found the bug. Neil had silently fixed it in one
of my patches, so I've added an incremental patch that does more or
less what he did.

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com
* Re: multipath patches
From: Olga Kornievskaia @ 2019-07-12 16:39 UTC
To: Trond Myklebust; +Cc: linux-nfs, neilb

On Thu, Jul 11, 2019 at 5:13 PM Trond Myklebust <trondmy@hammerspace.com> wrote:
> Anyhow, I think I've found the bug. Neil had silently fixed it in one
> of my patches, so I've added an incremental patch that does more or
> less what he did.

I just pulled and I still have a problem with the nconnect mount.
Machine still hangs.

Stack trace isn't in NFS but I'm betting it's somehow related

[ 235.756747] general protection fault: 0000 [#1] SMP PTI
[ 235.765187] CPU: 0 PID: 2780 Comm: pool Tainted: G W 5.2.0-rc7+ #29
[ 235.768555] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 04/13/2018
[ 235.774368] RIP: 0010:kmem_cache_alloc_node_trace+0x10b/0x1e0
[ 235.777576] Code: 4d 89 e1 41 f6 44 24 0b 04 0f 84 5f ff ff ff 4c 89 e7 e8 08 b6 01 00 49 89 c1 e9 4f ff ff ff 41 8b 41 20 49 8b 39 48 8d 4a 01 <49> 8b 1c 06 4c 89 f0 65 48 0f c7 0f 0f 94 c0 84 c0 0f 84 36 ff ff
[ 235.786811] RSP: 0018:ffffbc7c4200fe58 EFLAGS: 00010246
[ 235.789778] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000002b7c
[ 235.793204] RDX: 0000000000002b7b RSI: 0000000000000dc0 RDI: 000000000002d96
[ 235.796182] RBP: 0000000000000dc0 R08: ffff9c7bfa82d960 R09: ffff9c7bcfc06d00
[ 235.799135] R10: ffff9c7bfddf0240 R11: 0000000000000001 R12: ffff9c7bcfc06d00
[ 235.802094] R13: 0000000000000000 R14: f000ff53f000ff53 R15: ffffffffbe2d4d71
[ 235.805072] FS: 00007fd7f1d48700(0000) GS:ffff9c7bfa800000(0000) knlGS:0000000000000000
[ 235.808430] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 235.810762] CR2: 00007fd7f0eb65a4 CR3: 0000000012046005 CR4: 00000000001606f0
[ 235.813662] Call Trace:
[ 235.814694] alloc_rt_sched_group+0xf1/0x250
[ 235.816439] sched_create_group+0x59/0x70
[ 235.818094] sched_autogroup_create_attach+0x3a/0x160
[ 235.820148] ksys_setsid+0xeb/0x100
[ 235.821645] __ia32_sys_setsid+0xa/0x10
[ 235.823216] do_syscall_64+0x55/0x1a0
[ 235.824710] entry_SYSCALL_64_after_hwframe+0x44/0xa9
* Re: multipath patches
From: Trond Myklebust @ 2019-07-12 17:18 UTC
To: aglo; +Cc: linux-nfs, neilb

On Fri, 2019-07-12 at 12:39 -0400, Olga Kornievskaia wrote:
> I just pulled and I still have a problem with the nconnect mount.
> Machine still hangs.
>
> Stack trace isn't in NFS but I'm betting it's somehow related
>
> [ 235.756747] general protection fault: 0000 [#1] SMP PTI
> [ 235.774368] RIP: 0010:kmem_cache_alloc_node_trace+0x10b/0x1e0

Ah.. Missing xprt_get(). Fixed in the 'testing' branch now. I'll send
out a patch for review.

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com
* Re: multipath patches
From: Olga Kornievskaia @ 2019-07-12 18:02 UTC
To: Trond Myklebust; +Cc: linux-nfs, neilb

On Fri, Jul 12, 2019 at 1:18 PM Trond Myklebust <trondmy@hammerspace.com> wrote:
> Ah.. Missing xprt_get(). Fixed in the 'testing' branch now. I'll send
> out a patch for review.

Hi Trond,

With the latest patch in the testing branch, I can mount.
* Re: multipath patches
From: Trond Myklebust @ 2019-07-12 19:09 UTC
To: aglo; +Cc: linux-nfs, neilb

On Fri, 2019-07-12 at 14:02 -0400, Olga Kornievskaia wrote:
> Hi Trond,
>
> With the latest patch in the testing branch, I can mount.

Excellent! Thanks for your patience and for testing.

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com