linux-block.vger.kernel.org archive mirror
* [REGRESSION] commit c2b3c170db610 causes blktests block/002 failure
From: Theodore Ts'o @ 2019-06-09 18:14 UTC (permalink / raw)
  To: Omar Sandoval, Andi Kleen; +Cc: linux-block

I recently noticed that block/002 from blktests started failing:

root@kvm-xfstests:~# cd blktests/
root@kvm-xfstests:~/blktests# ./check block/002
block/002 (remove a device while running blktrace)          
    runtime  ...
[   12.598314] run blktests block/002 at 2019-06-09 13:09:00
[   12.621298] scsi host0: scsi_debug: version 0188 [20190125]
[   12.621298]   dev_size_mb=8, opts=0x0, submit_queues=1, statistics=0
[   12.625578] scsi 0:0:0:0: Direct-Access     Linux    scsi_debug       0188 PQ: 0 ANSI: 7
[   12.627109] sd 0:0:0:0: Power-on or device reset occurred
[   12.630322] sd 0:0:0:0: Attached scsi generic sg0 type 0
[   12.634693] sd 0:0:0:0: [sda] 16384 512-byte logical blocks: (8.39 MB/8.00 MiB)
[   12.638881] sd 0:0:0:0: [sda] Write Protect is off
[   12.639464] sd 0:0:0:0: [sda] Mode Sense: 73 00 10 08
[   12.646951] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, supports DPO and FUA
[   12.658210] sd 0:0:0:0: [sda] Optimal transfer size 524288 bytes
[   12.722771] sd 0:0:0:0: [sda] Attached SCSI disk
block/002 (remove a device while running blktrace)           [failed]
    runtime  ...  0.945s
    --- tests/block/002.out	2019-05-27 13:52:17.000000000 -0400
    +++ /root/blktests/results/nodev/block/002.out.bad	2019-06-09 13:09:01.034094065 -0400
    @@ -1,2 +1,3 @@
     Running block/002
    +debugfs directory leaked
     Test complete
root@kvm-xfstests:~/blktests# 

The git bisect log (see attached) pointed at this commit:

commit c2b3c170db610896e4e633cba2135045333811c2 (HEAD, refs/bisect/bad)
Author: Andi Kleen <ak@linux.intel.com>
Date:   Tue Mar 26 15:18:20 2019 -0700

    perf stat: Revert checks for duration_time
    
    This reverts e864c5ca145e ("perf stat: Hide internal duration_time
    counter") but doing it manually since the code has now moved to a
    different file.
    
    The next patch will properly implement duration_time as a full event, so
    no need to hide it anymore.
    
    Signed-off-by: Andi Kleen <ak@linux.intel.com>
    Acked-by: Jiri Olsa <jolsa@kernel.org>
    Link: http://lkml.kernel.org/r/20190326221823.11518-2-andi@firstfloor.org
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

Is this a known issue?

Thanks,

						- Ted

git bisect start
# good: [e93c9c99a629c61837d5a7fc2120cd2b6c70dbdd] Linux 5.1
git bisect good e93c9c99a629c61837d5a7fc2120cd2b6c70dbdd
# bad: [cd6c84d8f0cdc911df435bb075ba22ce3c605b07] Linux 5.2-rc2
git bisect bad cd6c84d8f0cdc911df435bb075ba22ce3c605b07
# bad: [f4d9a23d3dad0252f375901bf4ff6523a2c97241] sparc64: simplify reduce_memory() function
git bisect bad f4d9a23d3dad0252f375901bf4ff6523a2c97241
# bad: [67a242223958d628f0ba33283668e3ddd192d057] Merge tag 'for-5.2/block-20190507' of git://git.kernel.dk/linux-block
git bisect bad 67a242223958d628f0ba33283668e3ddd192d057
# bad: [8ff468c29e9a9c3afe9152c10c7b141343270bf3] Merge branch 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect bad 8ff468c29e9a9c3afe9152c10c7b141343270bf3
# bad: [8f5e823f9131a430b12f73e9436d7486e20c16f5] Merge tag 'pm-5.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
git bisect bad 8f5e823f9131a430b12f73e9436d7486e20c16f5
# bad: [0bc40e549aeea2de20fc571749de9bbfc099fb34] Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect bad 0bc40e549aeea2de20fc571749de9bbfc099fb34
# good: [007dc78fea62610bf06829e38f1d8c69b6ea5af6] Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect good 007dc78fea62610bf06829e38f1d8c69b6ea5af6
# bad: [a0e928ed7c603a47dca8643e58db224a799ff2c5] Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect bad a0e928ed7c603a47dca8643e58db224a799ff2c5
# bad: [f447e4eb3ad1e60d173ca997fcb2ef2a66f12574] perf/x86/intel: Force resched when TFA sysctl is modified
git bisect bad f447e4eb3ad1e60d173ca997fcb2ef2a66f12574
# bad: [8313fe2d685da168b732421f85714cfd702d2141] perf vendor events intel: Update Broadwell events to v23
git bisect bad 8313fe2d685da168b732421f85714cfd702d2141
# bad: [70df6a7311186a7ab0b19f481dee4ca540a73837] tools lib traceevent: Add more debugging to see various internal ring buffer entries
git bisect bad 70df6a7311186a7ab0b19f481dee4ca540a73837
# bad: [c2b3c170db610896e4e633cba2135045333811c2] perf stat: Revert checks for duration_time
git bisect bad c2b3c170db610896e4e633cba2135045333811c2
# good: [59f3bd7802d3ff7e6ddcce600f361bed288a97dd] perf augmented_raw_syscalls: Use a PERCPU_ARRAY map to copy more string bytes
git bisect good 59f3bd7802d3ff7e6ddcce600f361bed288a97dd
# good: [514c54039da970f953164c1960d0284f87db969d] perf tools: Add header defining used namespace struct to event.h
git bisect good 514c54039da970f953164c1960d0284f87db969d
# good: [7fcfa9a2d9a7c1b428d61992c2deaa9e37a437b0] perf list: Fix s390 counter long description for L1D_RO_EXCL_WRITES
git bisect good 7fcfa9a2d9a7c1b428d61992c2deaa9e37a437b0
# first bad commit: [c2b3c170db610896e4e633cba2135045333811c2] perf stat: Revert checks for duration_time

* Re: [REGRESSION] commit c2b3c170db610 causes blktests block/002 failure
From: Bart Van Assche @ 2019-06-18 23:09 UTC (permalink / raw)
  To: Theodore Ts'o, Omar Sandoval, Andi Kleen; +Cc: linux-block

On 6/9/19 11:14 AM, Theodore Ts'o wrote:
> Is this a known issue?

Hi Ted,

Test block/002 removes a SCSI device by writing into the "delete" sysfs
attribute. As one can see in __scsi_remove_device(), that triggers a
synchronous call of blk_cleanup_queue(). The "debugfs directory leaked"
message is reported if the request queue debugfs directory is still
present after SCSI device deletion has finished. Request queue debugfs
directory deletion happens upon the final put of the request queue (see
also __blk_release_queue()). I don't think that there is any guarantee
that the debugfs directory disappears immediately after SCSI device
deletion has finished. In other words, I think that this is a bug in test
block/002. Omar, are you the author of that test script?

Bart.
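
A minimal sketch of the race described above, using a temporary directory
as a stand-in for the request queue debugfs directory. Everything here is
hypothetical: report_leak, the "debugfs directory gone" message, and the
one-second delay that imitates the deferred final put are illustrations
only, not code from blktests or the kernel.

```shell
#!/bin/sh
# Hypothetical stand-in for the test's check: report a leak if the
# directory still exists at the moment we look.
report_leak() {
    if [ -d "$1" ]; then
        echo "debugfs directory leaked"
    else
        echo "debugfs directory gone"
    fi
}

# Simulated deferred cleanup (a temp dir, not real debugfs): the directory
# is removed about one second after "device deletion" has returned.
demo_dir=$(mktemp -d)
( sleep 1; rmdir "$demo_dir" ) &
cleanup_pid=$!

first=$(report_leak "$demo_dir")    # checked right after "deletion"
wait "$cleanup_pid"                 # let the deferred cleanup finish
second=$(report_leak "$demo_dir")   # checked again afterwards

echo "immediate check: $first"
echo "later check:     $second"
```

The immediate check sees the directory only because the cleanup has not
run yet; nothing is actually leaked in this simulation.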


* Re: [REGRESSION] commit c2b3c170db610 causes blktests block/002 failure
From: Omar Sandoval @ 2019-06-27 17:14 UTC (permalink / raw)
  To: Bart Van Assche; +Cc: Theodore Ts'o, Omar Sandoval, Andi Kleen, linux-block

On Tue, Jun 18, 2019 at 04:09:26PM -0700, Bart Van Assche wrote:
> Hi Ted,
> 
> Test block/002 removes a SCSI device by writing into the "delete" sysfs
> attribute. As one can see in __scsi_remove_device() that triggers a
> synchronous call of blk_cleanup_queue(). The "debugfs directory leaked"
> message is reported if the request queue debugfs directory is found after
> SCSI device deletion has finished. Request queue debugfs directory deletion
> happens upon the final put of the request queue (see also
> __blk_release_queue()). I don't think that there is any guarantee that the
> debugfs directory disappears immediately after SCSI device deletion has
> finished. In other words, I think that this is a bug in test block/002.
> Omar, are you the author of that test script?

Hi Bart, yes, I wrote this test. I can reproduce the failure. I'll try
to find a reliable way to wait; otherwise I'll probably just toss a
sleep in here.
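
One possible shape for such a wait, sketched under assumptions: poll for
the directory to disappear and give up after a timeout, rather than
sleeping for a fixed time. wait_for_removal is a hypothetical helper, not
an existing blktests function, and the path in the usage comment assumes
debugfs is mounted at /sys/kernel/debug and the disk is sda.

```shell
#!/bin/sh
# wait_for_removal PATH [TIMEOUT_SECONDS]
# Poll once per second until PATH no longer exists.
# Returns 0 once PATH is gone, 1 if it still exists after the timeout.
wait_for_removal() {
    local path=$1
    local timeout=${2:-5}
    local waited=0

    while [ -e "$path" ]; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Possible use in a test (path is an assumption, see above):
#   if ! wait_for_removal /sys/kernel/debug/block/sda 5; then
#       echo "debugfs directory leaked"
#   fi
```

With this shape the test only reports a leak when the directory outlives
the timeout, so a short delay before the final put no longer fails it.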


* Re: [REGRESSION] commit c2b3c170db610 causes blktests block/002 failure
From: Andi Kleen @ 2019-06-27 20:24 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: Omar Sandoval, linux-block

> The git bisect log (see attached) pointed at this commit:
> 
> commit c2b3c170db610896e4e633cba2135045333811c2 (HEAD, refs/bisect/bad)
> Author: Andi Kleen <ak@linux.intel.com>
> Date:   Tue Mar 26 15:18:20 2019 -0700

>     perf stat: Revert checks for duration_time

You must have misbisected. The commit only changes the perf user tool,
not the kernel.

-Andi
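
A quick way to check a claim like this is to list the paths a commit
touches and test whether they all sit under one directory. The sketch
below is illustrative only: only_under is a hypothetical helper, while
git diff-tree with --no-commit-id, --name-only and -r is a standard git
command that prints the changed paths of a commit.

```shell
#!/bin/sh
# only_under PREFIX
# Read file paths on stdin; succeed only if at least one path was read
# and every path starts with PREFIX.
only_under() {
    local prefix=$1
    local seen=0
    local path

    while IFS= read -r path; do
        seen=1
        case $path in
            "$prefix"*) ;;      # path is under the prefix, keep going
            *) return 1 ;;      # path outside the prefix
        esac
    done
    [ "$seen" -eq 1 ]
}

# Possible use inside a kernel git tree (commit id from this thread):
#   git diff-tree --no-commit-id --name-only -r c2b3c170db610 |
#       only_under tools/ && echo "commit only touches tools/"
```

When the suspect area is known in advance, git bisect start also accepts
path arguments after "--", which limits the bisection to commits that
touch those paths.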


* Re: [REGRESSION] commit c2b3c170db610 causes blktests block/002 failure
From: Bart Van Assche @ 2019-06-28 16:20 UTC (permalink / raw)
  To: Omar Sandoval; +Cc: Theodore Ts'o, Omar Sandoval, Andi Kleen, linux-block

On 6/27/19 10:14 AM, Omar Sandoval wrote:
> I can reproduce the failure. I'll try to find a reliable way to wait,
> otherwise I'll probably just toss a sleep in here.

Thanks!

Bart.

* Re: [REGRESSION] commit c2b3c170db610 causes blktests block/002 failure
From: Andi Kleen @ 2019-08-16 17:24 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: Omar Sandoval, linux-block

> The git bisect log (see attached) pointed at this commit:

The bisect must be wrong. The commit only changes the perf tool, so it
cannot break anything in the kernel.

-Andi

