nvdimm.lists.linux.dev archive mirror
* [BUG] Injecting bad blocks in the nfit_test module does not work with kernel v4.17.4 + ndctl v61.1
@ 2018-07-04 12:45 Dorau, Lukasz
  2018-07-04 14:32 ` Dan Williams
  0 siblings, 1 reply; 6+ messages in thread
From: Dorau, Lukasz @ 2018-07-04 12:45 UTC (permalink / raw)
  To: linux-nvdimm; +Cc: Li, ZhijianX, Balcer, , Marcin

Hi,

This is a bug report:  injecting bad blocks in the nfit_test module does not work with the kernel v4.17.4 (the latest stable at this moment) and ndctl v61.1:

# ndctl inject-error --block=11 --count=12 namespace0.0
# ndctl list -M
{
"dev":"namespace0.0",
"mode":"devdax",
"map":"dev",
"size":30412800,
"uuid":"610d117f-eb1c-4560-9934-283e4ab44c9a",
"raw_uuid":"2dcfe1c1-38b2-4d38-81fe-9ea6b035ddad",
"chardev":"dax0.0"
}

More details are available at this issue:
https://github.com/pmem/issues/issues/897

Regards,
Lukasz

_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm


* Re: [BUG] Injecting bad blocks in the nfit_test module does not work with kernel v4.17.4 + ndctl v61.1
  2018-07-04 12:45 [BUG] Injecting bad blocks in the nfit_test module does not work with kernel v4.17.4 + ndctl v61.1 Dorau, Lukasz
@ 2018-07-04 14:32 ` Dan Williams
  2018-07-05  7:51   ` Dorau, Lukasz
  2018-07-05  8:29   ` Dorau, Lukasz
  0 siblings, 2 replies; 6+ messages in thread
From: Dan Williams @ 2018-07-04 14:32 UTC (permalink / raw)
  To: Dorau, Lukasz
  Cc: Li, ZhijianX, Slusarz, , Piotr  <piotr.balcer@intel.com>,
	linux-nvdimm@lists.01.org

On Wed, Jul 4, 2018 at 5:45 AM, Dorau, Lukasz <lukasz.dorau@intel.com> wrote:
> Hi,
>
>
>
> This is a bug report:  injecting bad blocks in the nfit_test module does not
> work with the kernel v4.17.4 (the latest stable at this moment) and ndctl
> v61.1:
>

Sorry, yes, this is a known issue. There is no fix on the near
horizon; this is a casualty of the ARS rework we did in the kernel
between v4.16 and v4.17. More details here:

https://github.com/pmem/issues/issues/897#issuecomment-402495154


* RE: [BUG] Injecting bad blocks in the nfit_test module does not work with kernel v4.17.4 + ndctl v61.1
  2018-07-04 14:32 ` Dan Williams
@ 2018-07-05  7:51   ` Dorau, Lukasz
  2018-07-05  8:29   ` Dorau, Lukasz
  1 sibling, 0 replies; 6+ messages in thread
From: Dorau, Lukasz @ 2018-07-05  7:51 UTC (permalink / raw)
  To: Williams, Dan J
  Cc: Li, ZhijianX, Slusarz, , Piotr  <piotr.balcer@intel.com>,
	linux-nvdimm@lists.01.org

On Wed, July 4, 2018 4:33 PM, Dan Williams <dan.j.williams@intel.com> wrote:
> On Wed, Jul 4, 2018 at 5:45 AM, Dorau, Lukasz <lukasz.dorau@intel.com> wrote:
> > Hi,
> >
> >
> >
> > This is a bug report:  injecting bad blocks in the nfit_test module does not
> > work with the kernel v4.17.4 (the latest stable at this moment) and ndctl
> > v61.1:
> >
> 
> Sorry, yes, this is a known issue. There is no fix on the near
> horizon, this is a casualty of the ARS rework we did in the kernel
> between v4.16 and v4.17. More details here:
> 
> https://github.com/pmem/issues/issues/897#issuecomment-402495154
>

Hi Dan,

Thanks for the fast reply and the explanation.

I tested the workaround (ndctl start-scrub ; ndctl wait-scrub) but it does not work for me (with kernel v4.17.4 and ndctl v61.1):

# ndctl inject-error --block=16838 --count=512 namespace4.0
# ndctl start-scrub
# ndctl wait-scrub      #       <---       I waited 30 minutes for 'ndctl wait-scrub', but it did not finish...

--
Lukasz




* RE: [BUG] Injecting bad blocks in the nfit_test module does not work with kernel v4.17.4 + ndctl v61.1
  2018-07-04 14:32 ` Dan Williams
  2018-07-05  7:51   ` Dorau, Lukasz
@ 2018-07-05  8:29   ` Dorau, Lukasz
  2018-07-05 10:54     ` Dorau, Lukasz
  2018-07-05 16:57     ` Verma, Vishal L
  1 sibling, 2 replies; 6+ messages in thread
From: Dorau, Lukasz @ 2018-07-05  8:29 UTC (permalink / raw)
  To: Williams, Dan J
  Cc: Li, ZhijianX, Slusarz, , Piotr  <piotr.balcer@intel.com>,
	linux-nvdimm@lists.01.org

On Thu, July 5, 2018 9:51 AM, Dorau, Lukasz <lukasz.dorau@intel.com> wrote:
> 
> I tested the workaround (ndctl start-scrub ; ndctl wait-scrub) but it does not work
> for me (with kernel v4.17.4 and ndctl v61.1):
> 
> # ndctl inject-error --block=16838 --count=512 namespace4.0
> # ndctl start-scrub
> # ndctl wait-scrub      #       <---       I waited 30 minutes for 'ndctl wait-scrub', but it did not finish...
> 

It is polling forever in ndctl_bus_wait_for_scrub_completion() at libndctl.c:1239:

Program received signal SIGINT, Interrupt.
0x00007ffff72d2a40 in __poll_nocancel () at ../sysdeps/unix/syscall-template.S:84
84	T_PSEUDO (SYSCALL_SYMBOL, SYSCALL_NAME, SYSCALL_NARGS)
(gdb) bt
#0  0x00007ffff72d2a40 in __poll_nocancel () at ../sysdeps/unix/syscall-template.S:84
#1  0x00007ffff7bc5e96 in poll (__timeout=-1, __nfds=1, __fds=0x7fffffffde38) at /usr/include/bits/poll2.h:46
#2  ndctl_bus_wait_for_scrub_completion (bus=bus@entry=0x62c6d0) at libndctl.c:1239
#3  0x00000000004075c2 in scrub_action (action=ACTION_WAIT, bus=0x62c6d0) at bus.c:30
#4  bus_action (argc=<optimized out>, argv=<optimized out>, usage=usage@entry=0x4161c8 "ndctl wait-scrub [<bus-id> <bus-id2> ... <bus-idN>] [<options>]", action=action@entry=ACTION_WAIT, ctx=0x621340, 
    options=0x416260 <bus_options>) at bus.c:77
#5  0x0000000000407735 in cmd_wait_scrub (argc=<optimized out>, argv=<optimized out>, ctx=<optimized out>) at bus.c:119
#6  0x0000000000414590 in run_builtin (p=0x61faf0 <commands+304>, ctx=0x621340, argv=0x7fffffffe4f0, argc=1) at util/main.c:90
#7  main_handle_internal_command (argc=1, argv=0x7fffffffe4f0, ctx=0x621340, cmds=cmds@entry=0x61f9c0 <commands>, num_cmds=num_cmds@entry=23) at util/main.c:137
#8  0x00000000004071fd in main (argc=<optimized out>, argv=<optimized out>) at ndctl.c:125
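
Note the __timeout=-1 argument in frame #1: poll() with a negative timeout blocks until an event is reported on the descriptor, so if no event ever arrives the caller waits indefinitely. The sketch below is illustrative only and is not the libndctl implementation; the helper name wait_for_event and the choice of POLLIN are assumptions made for the example. It shows how a finite timeout lets a caller tell "nothing happened yet" apart from "an event arrived".

```c
#include <poll.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of libndctl): wait until fd becomes
 * readable or timeout_ms milliseconds elapse.
 *
 * Returns 1 if an event was reported, 0 on timeout, -1 on error.
 * Passing a negative timeout_ms reproduces the "block forever"
 * behaviour visible in the backtrace above.
 */
int wait_for_event(int fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	int rc = poll(&pfd, 1, timeout_ms);

	if (rc < 0)
		return -1;	/* poll() itself failed, errno is set */
	return rc > 0 ? 1 : 0;	/* 0 means the timeout expired */
}
```

For example, calling wait_for_event(fd, 5000) on the read end of an empty pipe returns 0 after about five seconds instead of hanging.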

--
Lukasz



* RE: [BUG] Injecting bad blocks in the nfit_test module does not work with kernel v4.17.4 + ndctl v61.1
  2018-07-05  8:29   ` Dorau, Lukasz
@ 2018-07-05 10:54     ` Dorau, Lukasz
  2018-07-05 16:57     ` Verma, Vishal L
  1 sibling, 0 replies; 6+ messages in thread
From: Dorau, Lukasz @ 2018-07-05 10:54 UTC (permalink / raw)
  To: Williams, Dan J; +Cc: Li, ZhijianX, linux-nvdimm, Balcer, , Marcin

On Thu, July 5, 2018 10:29 AM, Dorau, Lukasz <lukasz.dorau@intel.com> wrote:
> On Thu, July 5, 2018 9:51 AM, Dorau, Lukasz <lukasz.dorau@intel.com> wrote:
> >
> > I tested the workaround (ndctl start-scrub ; ndctl wait-scrub) but it does not work
> > for me (with kernel v4.17.4 and ndctl v61.1):
> >
> > # ndctl inject-error --block=16838 --count=512 namespace4.0
> > # ndctl start-scrub
> > # ndctl wait-scrub      #       <---       I waited 30 minutes for 'ndctl wait-scrub', but it did not finish...
> >
> 
> It is polling forever in ndctl_bus_wait_for_scrub_completion() at libndctl.c:1239:
> 

When I replaced '--count=512' with '--count=1', it works well ...
 
--
Lukasz



* Re: [BUG] Injecting bad blocks in the nfit_test module does not work with kernel v4.17.4 + ndctl v61.1
  2018-07-05  8:29   ` Dorau, Lukasz
  2018-07-05 10:54     ` Dorau, Lukasz
@ 2018-07-05 16:57     ` Verma, Vishal L
  1 sibling, 0 replies; 6+ messages in thread
From: Verma, Vishal L @ 2018-07-05 16:57 UTC (permalink / raw)
  To: Williams, Dan J, Dorau, Lukasz
  Cc: Li, ZhijianX, Slusarz, , Piotr  <piotr.balcer@intel.com>,
	linux-nvdimm@lists.01.org


On Thu, 2018-07-05 at 08:29 +0000, Dorau, Lukasz wrote:
> On Thu, July 5, 2018 9:51 AM, Dorau, Lukasz <lukasz.dorau@intel.com>
> wrote:
> > 
> > I tested the workaround (ndctl start-scrub ; ndctl wait-scrub) but
> > it does not work
> > for me (with kernel v4.17.4 and ndctl v61.1):
> > 
> > # ndctl inject-error --block=16838 --count=512 namespace4.0
> > # ndctl start-scrub
> > # ndctl wait-scrub      #       <---       I waited 30 minutes for 'ndctl wait-scrub', but it did not finish...
> > 
> 
> It is polling forever in ndctl_bus_wait_for_scrub_completion() at
> libndctl.c:1239:

I've seen this happen sometimes on my test setup, but not frequently
enough to be able to debug it reliably.
Can you try inserting a sleep 5 after the start-scrub?

I'll also try injecting a large number of blocks and see if I can
reproduce it.

> 
> Program received signal SIGINT, Interrupt.
> 0x00007ffff72d2a40 in __poll_nocancel () at ../sysdeps/unix/syscall-template.S:84
> 84	T_PSEUDO (SYSCALL_SYMBOL, SYSCALL_NAME, SYSCALL_NARGS)
> (gdb) bt
> #0  0x00007ffff72d2a40 in __poll_nocancel () at ../sysdeps/unix/syscall-template.S:84
> #1  0x00007ffff7bc5e96 in poll (__timeout=-1, __nfds=1, __fds=0x7fffffffde38) at /usr/include/bits/poll2.h:46
> #2  ndctl_bus_wait_for_scrub_completion (bus=bus@entry=0x62c6d0) at libndctl.c:1239
> #3  0x00000000004075c2 in scrub_action (action=ACTION_WAIT, bus=0x62c6d0) at bus.c:30
> #4  bus_action (argc=<optimized out>, argv=<optimized out>, usage=usage@entry=0x4161c8 "ndctl wait-scrub [<bus-id> <bus-id2> ... <bus-idN>] [<options>]", action=action@entry=ACTION_WAIT, ctx=0x621340,
>     options=0x416260 <bus_options>) at bus.c:77
> #5  0x0000000000407735 in cmd_wait_scrub (argc=<optimized out>, argv=<optimized out>, ctx=<optimized out>) at bus.c:119
> #6  0x0000000000414590 in run_builtin (p=0x61faf0 <commands+304>, ctx=0x621340, argv=0x7fffffffe4f0, argc=1) at util/main.c:90
> #7  main_handle_internal_command (argc=1, argv=0x7fffffffe4f0, ctx=0x621340, cmds=cmds@entry=0x61f9c0 <commands>, num_cmds=num_cmds@entry=23) at util/main.c:137
> #8  0x00000000004071fd in main (argc=<optimized out>, argv=<optimized out>) at ndctl.c:125
> 
> --
> Lukasz
> 
