From: Ming Lei <ming.lei@redhat.com>
To: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>,
Ming Lei <tom.leiming@gmail.com>,
"Naveen N. Rao" <naveen.n.rao@linux.ibm.com>,
Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
"David S. Miller" <davem@davemloft.net>,
linux-block <linux-block@vger.kernel.org>
Subject: Re: kprobe: __blkdev_put probe is missed
Date: Sat, 20 Jun 2020 07:28:20 +0800 [thread overview]
Message-ID: <20200619232820.GE353853@T590> (raw)
In-Reply-To: <20200620003509.9521053fbd384f4f5d23408f@kernel.org>
Hi Masami,
On Sat, Jun 20, 2020 at 12:35:09AM +0900, Masami Hiramatsu wrote:
> Hi Ming,
>
> On Fri, 19 Jun 2020 21:32:40 +0800
> Ming Lei <ming.lei@redhat.com> wrote:
>
> > On Fri, Jun 19, 2020 at 08:19:54AM -0400, Steven Rostedt wrote:
> > > On Fri, 19 Jun 2020 15:28:59 +0800
> > > Ming Lei <ming.lei@redhat.com> wrote:
> > >
> > > > >
> > > > > OK, then let's make events (for sure)
> > > > >
> > > > > root@devnote2:/sys/kernel/debug/tracing# echo p __blkdev_put >> kprobe_events
> > > > > root@devnote2:/sys/kernel/debug/tracing# echo r __blkdev_put >> kprobe_events
> > > > > root@devnote2:/sys/kernel/debug/tracing# echo p blkdev_put >> kprobe_events
> > >
> > > Hi Ming,
> > >
> > > Do you have the kprobe_events file?
> > >
> > > > > root@devnote2:/sys/kernel/debug/tracing# echo 1 > events/kprobes/enable
> > > >
> > > > I can't find 'events/kprobes' in my VM with upstream kernel, also not found
> > > > the dir under fedora31(5.5.15-200) & rhel8(v4.18 based).
> > >
> > > The events/kprobes directly will be created when you create a
> > > kprobe_event. It wont exist until then.
> >
> > Hi Steven and Masami,
> >
> > Got it, thanks for your help, now I can run the test, follows the steps
> > and results, and there is still one __blkdev_put probed.
>
> Hmm, strange...
>
> > And it is observed in my VM reliably with 5.7+ or Fedora kernel reliably,
> > kernel config is attached.
>
> Thanks for sharing it.
>
> >
> > [root@ktest-01 tracing]# uname -a
> > Linux ktest-01 5.7.0+ #1900 SMP Fri Jun 19 16:26:47 CST 2020 x86_64 x86_64 x86_64 GNU/Linux
> > [root@ktest-01 tracing]#
> > [root@ktest-01 tracing]# cat kprobe_events
> > [root@ktest-01 tracing]#
> > [root@ktest-01 tracing]# echo p blkdev_put >> kprobe_events
> > [root@ktest-01 tracing]# echo p __blkdev_put >> kprobe_events
> > [root@ktest-01 tracing]# echo r __blkdev_put >> kprobe_events
> > [root@ktest-01 tracing]#
> > [root@ktest-01 tracing]# echo 1 > events/kprobes/enable
> > [root@ktest-01 tracing]# blockdev --getbsz /dev/sda1
> > 4096
> > [root@ktest-01 tracing]# echo 0 > events/kprobes/enable
> > [root@ktest-01 tracing]# cat trace
> > # tracer: nop
> > #
> > # entries-in-buffer/entries-written: 3/3 #P:8
> > #
> > # _-----=> irqs-off
> > # / _----=> need-resched
> > # | / _---=> hardirq/softirq
> > # || / _--=> preempt-depth
> > # ||| / delay
> > # TASK-PID CPU# |||| TIMESTAMP FUNCTION
> > # | | | |||| | |
> > blockdev-970 [005] .... 17603.447236: p_blkdev_put_0: (blkdev_put+0x0/0xb4)
> > blockdev-970 [005] .... 17603.447244: p___blkdev_put_0: (__blkdev_put+0x0/0x19d)
> > blockdev-970 [005] d... 17603.447251: r___blkdev_put_0: (blkdev_close+0x22/0x25 <- __blkdev_put)
>
> This shows __blkdev_put() is a tail-call. It is possible that the
> internal (nested) __blkdev_put() call becomes a goto inside the
> function by the gcc optimization.
>
> Ah, after all it is as expected. With your kconfig, the kernel is
> very agressively optimized.
>
> $ objdump -dS vmlinux | less
> ...
> ffffffff81256dc3 <__blkdev_put>:
> {
> ffffffff81256dc3: e8 98 85 df ff callq ffffffff8104f360 <__fentry__>
> ffffffff81256dc8: 41 57 push %r15
> ffffffff81256dca: 41 56 push %r14
> ffffffff81256dcc: 41 55 push %r13
> ...
> ffffffff81256f05: 75 02 jne ffffffff81256f09 <__blkdev_put+0x146>
> struct block_device *victim = NULL;
> ffffffff81256f07: 31 db xor %ebx,%ebx
> bdev->bd_contains = NULL;
> ffffffff81256f09: 48 c7 45 60 00 00 00 movq $0x0,0x60(%rbp)
> ffffffff81256f10: 00
> put_disk_and_module(disk);
> ffffffff81256f11: 4c 89 f7 mov %r14,%rdi
> ffffffff81256f14: e8 c6 3d 11 00 callq ffffffff8136acdf <put_disk_and_module>
> mutex_unlock(&bdev->bd_mutex);
> ffffffff81256f19: 4c 89 ff mov %r15,%rdi
> __blkdev_put(victim, mode, 1);
> ffffffff81256f1c: 41 bc 01 00 00 00 mov $0x1,%r12d
> mutex_unlock(&bdev->bd_mutex);
> ffffffff81256f22: e8 8d d7 48 00 callq ffffffff816e46b4 <mutex_unlock>
> bdput(bdev);
> ffffffff81256f27: 48 89 ef mov %rbp,%rdi
> ffffffff81256f2a: e8 f0 e9 ff ff callq ffffffff8125591f <bdput>
> if (victim)
> ffffffff81256f2f: 48 85 db test %rbx,%rbx
> ffffffff81256f32: 74 08 je ffffffff81256f3c <__blkdev_put+0x179>
> ffffffff81256f34: 48 89 dd mov %rbx,%rbp
> ffffffff81256f37: e9 b4 fe ff ff jmpq ffffffff81256df0 <__blkdev_put+0x2d> <<-----THIS!!
> }
> ffffffff81256f3c: 48 8b 44 24 28 mov 0x28(%rsp),%rax
> ffffffff81256f41: 65 48 33 04 25 28 00 xor %gs:0x28,%rax
> ffffffff81256f48: 00 00
> ffffffff81256f4a: 74 05 je ffffffff81256f51 <__blkdev_put+0x18e>
> ffffffff81256f4c: e8 5a 4e 48 00 callq ffffffff816dbdab <__stack_chk_fail>
> ffffffff81256f51: 48 83 c4 30 add $0x30,%rsp
> ffffffff81256f55: 5b pop %rbx
> ffffffff81256f56: 5d pop %rbp
> ffffffff81256f57: 41 5c pop %r12
> ffffffff81256f59: 41 5d pop %r13
> ffffffff81256f5b: 41 5e pop %r14
> ffffffff81256f5d: 41 5f pop %r15
> ffffffff81256f5f: c3 retq
>
>
> As you can see, the nested __blkdev_put() is coverted to a loop.
> If you put kprobe on __blkdev_put+0x2d, you'll see the event twice.
Thanks for your investigation.
Some trace tools can just trace on function entry, such as bcc, and some
user script always trace on function entry.
I guess the issue should belong to kprobe implementation:
1) __blkdev_put() is capable of being kprobed, so from user view, the
probe on entry of __blkdev_put() should be triggered
2) from implementation view, I understand exception should be trapped
on the entry of __blkdev_put(), looks it isn't done.
Correct me if the above is wrong, and is it possible to fix it in kprobe?
Thanks,
Ming
next prev parent reply other threads:[~2020-06-19 23:28 UTC|newest]
Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-06-17 10:30 krobe: __blkdev_put probe is missed Ming Lei
2020-06-18 12:54 ` kprobe: " Ming Lei
2020-06-18 13:56 ` Masami Hiramatsu
2020-06-18 23:19 ` Ming Lei
2020-06-19 5:12 ` Masami Hiramatsu
2020-06-19 7:28 ` Ming Lei
2020-06-19 12:19 ` Steven Rostedt
2020-06-19 13:32 ` Ming Lei
2020-06-19 15:35 ` Masami Hiramatsu
2020-06-19 23:28 ` Ming Lei [this message]
2020-06-20 0:59 ` Steven Rostedt
2020-06-20 1:37 ` Masami Hiramatsu
2020-06-22 0:27 ` Ming Lei
2020-06-22 1:34 ` Masami Hiramatsu
2020-06-22 13:01 ` Steven Rostedt
2020-06-22 23:47 ` Masami Hiramatsu
2020-06-23 0:38 ` Masami Hiramatsu
2020-06-23 5:28 ` Masami Hiramatsu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200619232820.GE353853@T590 \
--to=ming.lei@redhat.com \
--cc=anil.s.keshavamurthy@intel.com \
--cc=davem@davemloft.net \
--cc=linux-block@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mhiramat@kernel.org \
--cc=naveen.n.rao@linux.ibm.com \
--cc=rostedt@goodmis.org \
--cc=tom.leiming@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).