From: Masami Hiramatsu <mhiramat@kernel.org>
To: Ming Lei <ming.lei@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>,
Ming Lei <tom.leiming@gmail.com>,
"Naveen N. Rao" <naveen.n.rao@linux.ibm.com>,
Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
"David S. Miller" <davem@davemloft.net>,
linux-block <linux-block@vger.kernel.org>
Subject: Re: kprobe: __blkdev_put probe is missed
Date: Sat, 20 Jun 2020 10:37:47 +0900
Message-ID: <20200620103747.fb83f804083ef9956740acee@kernel.org>
In-Reply-To: <20200619232820.GE353853@T590>
Hi Ming,
On Sat, 20 Jun 2020 07:28:20 +0800
Ming Lei <ming.lei@redhat.com> wrote:
> >
> > Ah, after all it is as expected. With your kconfig, the kernel is
> > very aggressively optimized.
> >
> > $ objdump -dS vmlinux | less
> > ...
> > ffffffff81256dc3 <__blkdev_put>:
> > {
> > ffffffff81256dc3: e8 98 85 df ff callq ffffffff8104f360 <__fentry__>
> > ffffffff81256dc8: 41 57 push %r15
> > ffffffff81256dca: 41 56 push %r14
> > ffffffff81256dcc: 41 55 push %r13
> > ...
> > ffffffff81256f05: 75 02 jne ffffffff81256f09 <__blkdev_put+0x146>
> > struct block_device *victim = NULL;
> > ffffffff81256f07: 31 db xor %ebx,%ebx
> > bdev->bd_contains = NULL;
> > ffffffff81256f09: 48 c7 45 60 00 00 00 movq $0x0,0x60(%rbp)
> > ffffffff81256f10: 00
> > put_disk_and_module(disk);
> > ffffffff81256f11: 4c 89 f7 mov %r14,%rdi
> > ffffffff81256f14: e8 c6 3d 11 00 callq ffffffff8136acdf <put_disk_and_module>
> > mutex_unlock(&bdev->bd_mutex);
> > ffffffff81256f19: 4c 89 ff mov %r15,%rdi
> > __blkdev_put(victim, mode, 1);
> > ffffffff81256f1c: 41 bc 01 00 00 00 mov $0x1,%r12d
> > mutex_unlock(&bdev->bd_mutex);
> > ffffffff81256f22: e8 8d d7 48 00 callq ffffffff816e46b4 <mutex_unlock>
> > bdput(bdev);
> > ffffffff81256f27: 48 89 ef mov %rbp,%rdi
> > ffffffff81256f2a: e8 f0 e9 ff ff callq ffffffff8125591f <bdput>
> > if (victim)
> > ffffffff81256f2f: 48 85 db test %rbx,%rbx
> > ffffffff81256f32: 74 08 je ffffffff81256f3c <__blkdev_put+0x179>
> > ffffffff81256f34: 48 89 dd mov %rbx,%rbp
> > ffffffff81256f37: e9 b4 fe ff ff jmpq ffffffff81256df0 <__blkdev_put+0x2d> <<-----THIS!!
> > }
> > ffffffff81256f3c: 48 8b 44 24 28 mov 0x28(%rsp),%rax
> > ffffffff81256f41: 65 48 33 04 25 28 00 xor %gs:0x28,%rax
> > ffffffff81256f48: 00 00
> > ffffffff81256f4a: 74 05 je ffffffff81256f51 <__blkdev_put+0x18e>
> > ffffffff81256f4c: e8 5a 4e 48 00 callq ffffffff816dbdab <__stack_chk_fail>
> > ffffffff81256f51: 48 83 c4 30 add $0x30,%rsp
> > ffffffff81256f55: 5b pop %rbx
> > ffffffff81256f56: 5d pop %rbp
> > ffffffff81256f57: 41 5c pop %r12
> > ffffffff81256f59: 41 5d pop %r13
> > ffffffff81256f5b: 41 5e pop %r14
> > ffffffff81256f5d: 41 5f pop %r15
> > ffffffff81256f5f: c3 retq
> >
> >
> > As you can see, the nested __blkdev_put() is converted to a loop.
> > If you put kprobe on __blkdev_put+0x2d, you'll see the event twice.
>
> Thanks for your investigation.
>
> Some trace tools can just trace on function entry, such as bcc, and some
> user script always trace on function entry.
>
> I guess the issue should belong to kprobe implementation:
>
> 1) __blkdev_put() is capable of being kprobed, so from user view, the
> probe on entry of __blkdev_put() should be triggered
Yes, it is correctly triggered.
>
> 2) from implementation view, I understand exception should be trapped
> on the entry of __blkdev_put(), looks it isn't done.
No, the probe correctly traps at the function entry address. The problem
is that gcc optimized the nested function call into a jump to the
beginning of the function body, skipping the prologue.

Usually, a function is compiled as below:

  func()      (1) the entry address (func:)
  {           (2) the function prologue (setup stackframe)
    int a     (3) the beginning of function body
    ...
    func()    (4) the nested function call

In this case, gcc optimized (4) into a jump to (3) instead of an
actual function call instruction. Thus, for the nested case, (1) and
(2) are skipped.

IOW, the code flow becomes

  (1)->(2)->(3)->(4)->(3)

instead of

  (1)->(2)->(3)->(4)->(1)->(2)->(3)

So a probe placed on (1) or (2) is not hit by the nested call. If you
put a probe on (3) instead (e.g. 'perf probe __blkdev_put:2'), you'll
see the event twice.

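To make the two code flows concrete, here is a small userspace sketch in C.
It is only an illustration, not kernel code: struct node, struct hits,
put_as_written(), put_as_optimized() and count_hits() are hypothetical names,
and the counters simply stand in for probes placed at (1)/(2) and at (3).
The second function writes out by hand, with a goto, what the compiler
effectively did to the nested call.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical counters standing in for probe hits (not a kernel API). */
struct hits {
    int entry;  /* probe at (1)/(2): entry address and prologue */
    int body;   /* probe at (3): beginning of the function body */
};

/* Toy stand-in for a device that may have a parent to release next. */
struct node {
    struct node *parent;
};

/* As written in the source: (4) is a real call, so each nesting level
 * goes through the entry point again. */
static void put_as_written(struct node *n, struct hits *h)
{
    assert(n != NULL);
    h->entry++;                         /* (1) + (2) */
    h->body++;                          /* (3) */
    if (n->parent)
        put_as_written(n->parent, h);   /* (4): real nested call */
}

/* Hand-written equivalent of the optimized code: (4) becomes a jump
 * back to (3), so (1) and (2) run only once. */
static void put_as_optimized(struct node *n, struct hits *h)
{
    assert(n != NULL);
    h->entry++;                         /* (1) + (2): executed once */
body:
    h->body++;                          /* (3) */
    if (n->parent) {
        n = n->parent;                  /* reuse the argument register */
        goto body;                      /* (4): jump instead of call */
    }
}

/* Run one variant on a two-level chain (partition -> whole disk). */
static struct hits count_hits(void (*fn)(struct node *, struct hits *))
{
    struct node whole = { NULL };
    struct node part = { &whole };
    struct hits h = { 0, 0 };

    fn(&part, &h);
    return h;
}
```

With a two-level chain, count_hits(put_as_written) counts the entry twice and
the body twice, while count_hits(put_as_optimized) counts the entry once and
the body twice. That matches the missed entry probe and the doubled event on
the body probe.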
Thank you,
--
Masami Hiramatsu <mhiramat@kernel.org>