linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
To: Song Liu <liu.song.a23@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>,
	srikar@linux.vnet.ibm.com, rostedt@goodmis.org,
	mhiramat@kernel.org, Peter Zijlstra <peterz@infradead.org>,
	mingo@redhat.com, acme@kernel.org,
	alexander.shishkin@linux.intel.com, jolsa@redhat.com,
	namhyung@kernel.org, open list <linux-kernel@vger.kernel.org>,
	corbet@lwn.net, linux-doc@vger.kernel.org,
	ananth@linux.vnet.ibm.com, alexis.berlemont@gmail.com,
	naveen.n.rao@linux.vnet.ibm.com,
	linux-arm-kernel@lists.infradead.org, linux-mips@linux-mips.org,
	linux@armlinux.org.uk, ralf@linux-mips.org, paul.burton@mips.com,
	Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Subject: Re: [PATCH v5 06/10] Uprobes: Support SDT markers having reference count (semaphore)
Date: Fri, 13 Jul 2018 13:25:16 +0530	[thread overview]
Message-ID: <9b062a19-d9a7-b360-26b1-d28b8dfc35a3@linux.ibm.com> (raw)
In-Reply-To: <CAPhsuW6kCtn-tWSj5eKf+kGt8ZEjVwJKLxj9C6zdaqMZByRytg@mail.gmail.com>

Hi Song,

On 07/13/2018 01:23 AM, Song Liu wrote:
> I guess I got to the party late. I found this thread after I started developing
> the same feature...
> 
> On Thu, Jul 12, 2018 at 7:58 AM, Oleg Nesterov <oleg@redhat.com> wrote:
>> On 07/11, Ravi Bangoria wrote:
>>>
>>>> However, I still think it would be better to avoid uprobe exporting and modifying
>>>> set_swbp/set_orig_insn. May be we can simply kill both set_swbp() and set_orig_insn(),
>>>> I'll re-check...
>>>
>>> Good that you bring this up. Actually, we can implement same logic
>>> without exporting uprobe. We can do "uprobe = container_of(arch_uprobe)"
>>> in uprobe_write_opcode(). No need to export struct uprobe outside,
>>> no need to change set_swbp() / set_orig_insn() syntax. Just that we
>>> need to pass arch_uprobe object to uprobe_write_opcode().
>>
>> Yes, but you still need to modify set_swbp/set_orig_insn to pass the new
>> arg to uprobe_write_opcode(). OK, this is fine.
>>
>>
>>> But, I wanted to discuss about making ref_ctr_offset a uprobe property
>>> or a consumer property, before posting v6:
>>>
>>> If we make it a consumer property, the design becomes flexible for
>>> user. User will have an option to either depend on kernel to handle
>>> reference counter or he can create normal uprobe and manipulate
>>> reference counter on his own. This will not require any changes to
>>> existing tools. With this approach we need to increment / decrement
>>> reference counter for each consumer. But, because of the fact that our
>>> install_breakpoint() / remove_breakpoint() are not balanced, we have
>>> to keep track of which reference counter have been updated in which
>>> mm, for which uprobe and for which consumer. I.e. Maintain a list of
>>> {uprobe, consumer, mm}.
> 
> Is it possible to maintain balanced refcount by modifying callers of
> install_breakpoint() and remove_breakpoint()? I am actually working
> toward this direction. And I found some imbalance between
>      register_for_each_vma(uprobe, uc)
> and
>      register_for_each_vma(uprobe, NULL)
> 
> From reading the thread, I think there are other sources of imbalance.
> But I think it is still possible to fix it? Please let me know if this is not
> realistic...


I don't think so. It all depends on memory layout of the process, the
execution sequence of tracer vs target, how binary is loaded or how mmap()s
are called. To achieve a balance you need to change current uprobe
implementation. (I haven't explored to change current implementation because
I personally think there is no need to). Let me show you a simple example on
my Ubuntu 18.04 (powerpc vm) with upstream kernel:

-------------
  $ cat loop.c
    #include <stdio.h>
    #include <unistd.h>
    
    void foo(int i)
    {
            printf("Hi: %d\n", i);
            sleep(1);
    }
    
    void main()
    {
            int i;
            for (i = 0; i < 100; i++)
                    foo(i);
    }
  
  $ sudo ./perf probe -x ~/loop foo
  $ sudo ./perf probe install_breakpoint uprobe mm vaddr
  $ sudo ./perf probe remove_breakpoint uprobe mm vaddr
  
  term1~$ ./loop
  
  term2~$ sudo ./perf record -a -e probe:* -o perf.data.kprobe
  
  term3~$ sudo ./perf record -a -e probe_loop:foo
          ^C
  
  term2~$ ...
          ^C[ perf record: Woken up 1 times to write data ]
          [ perf record: Captured and wrote 0.217 MB perf.data.probe (10 samples) ]
  
  term2~$ sudo ./perf script -i perf.data.kprobe
    probe:install_breakpoint: (c00000000032e4e8) uprobe=0xc0000000a40d0e00 mm=0xc0000000b5072900 vaddr=0x1108e0844
    probe:install_breakpoint: (c00000000032e4e8) uprobe=0xc0000000a40d0e00 mm=0xc0000000b5072900 vaddr=0x1108d0844
    probe:install_breakpoint: (c00000000032e4e8) uprobe=0xc0000000a40d0e00 mm=0xc0000000b5055500 vaddr=0x7fffa2620844
    probe:install_breakpoint: (c00000000032e4e8) uprobe=0xc0000000a40d0e00 mm=0xc0000000b5055500 vaddr=0x7fffa2620844
     probe:remove_breakpoint: (c00000000032e938) uprobe=0xc0000000a40d0e00 mm=0xc0000000b5072900 vaddr=0x1108e0844
     probe:remove_breakpoint: (c00000000032e938) uprobe=0xc0000000a40d0e00 mm=0xc0000000b5072900 vaddr=0x1108d0844
     probe:remove_breakpoint: (c00000000032e938) uprobe=0xc0000000a40d0e00 mm=0xc0000000b5072900 vaddr=0x1108d0844
     probe:remove_breakpoint: (c00000000032e938) uprobe=0xc0000000a40d0e00 mm=0xc0000000b5072900 vaddr=0x1108e0844
     probe:remove_breakpoint: (c00000000032e938) uprobe=0xc0000000a40d0e00 mm=0xc0000000b5072900 vaddr=0x1108e0844
     probe:remove_breakpoint: (c00000000032e938) uprobe=0xc0000000a40d0e00 mm=0xc0000000b5072900 vaddr=0x1108d0844
-------------

Here install_breakpoint() for our target (mm: 0xc0000000b5072900) was
called 2 times where as remove_breakpoint() was called 6 times.

Because, there is an imbalance, and if you make reference counter a
consumer property, you have two options. Either you have to fix
current uprobe infrastructure to solve this imbalance. Or maintain
a list of already updated counter as I've explained(in reply to Oleg).

Now,

  uprobe_register()
    register_for_each_vma()
      install_breakpoint()

gets called for each consumer, but

  uprobe_mmap()
    install_breakpoint()

gets called only once. Now, if you make ref_ctr_offset a consumer
property, you have to increment reference counter for each consumer
in case of uprobe_mmap(). Also, you have to make sure you update
reference counter only once for each consumer because install/
remove_breakpoint() are not balanced. Now, what if reference
counter increment fails for any one consumer? You have to rollback
already updated ones, which brings more complication.

Now, other complication is, generally vma holding reference counter
won't be present when install_breakpoint() gets called from
uprobe_mmap(). I've introduced delayed_uprobes for this. This is
anyway needed with any approach.

The only advantage I was seeing by making reference counter a
consumer property was a user flexibility to update reference counter
on his own. But I've already proposed a solution for that.

So, I personally don't suggest to make ref_ctr_offset a consumer
property because I, again personally, don't think it's a consumer
property.

Please feel free to say if this all looks crap to you :)


> 
> 
> About race conditions, I think both install_breakpoint() and remove_breakpoint()
> are called with
> 
>    down_write(&mm->mmap_sem);
> 


No. Not about updating in one mm. I meant, races in maintaining
all complication I've mentioned above.


> 
> As long as we do the same when modifying the reference counter,
> it should be fine, right?
> 
> Wait... sometimes we only down_read(). Is this by design?


Didn't get this.

Thanks,
Ravi


  reply	other threads:[~2018-07-13  7:55 UTC|newest]

Thread overview: 50+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-06-28  5:21 [PATCH v5 00/10] Uprobes: Support SDT markers having reference count (semaphore) Ravi Bangoria
2018-06-28  5:22 ` [PATCH v5 01/10] Uprobes: Move uprobe structure to uprobe.h Ravi Bangoria
2018-06-28  5:22 ` [PATCH v5 02/10] Uprobes: Simplify uprobe_register() body Ravi Bangoria
2018-06-28  5:22 ` [PATCH v5 03/10] Uprobe: Change set_swbp definition Ravi Bangoria
2018-06-28  5:22 ` [PATCH v5 04/10] Uprobe: Change set_orig_insn definition Ravi Bangoria
2018-06-28  5:22 ` [PATCH v5 05/10] Uprobe: Change uprobe_write_opcode definition Ravi Bangoria
2018-06-28  5:22 ` [PATCH v5 06/10] Uprobes: Support SDT markers having reference count (semaphore) Ravi Bangoria
2018-06-28 19:51   ` Oleg Nesterov
2018-06-29  3:23     ` Ravi Bangoria
2018-07-01 21:09   ` Oleg Nesterov
2018-07-02  5:16     ` Ravi Bangoria
2018-07-02 18:01       ` Oleg Nesterov
2018-07-03  5:30         ` Ravi Bangoria
2018-07-03  6:16           ` Srikar Dronamraju
2018-07-03  7:43             ` Ravi Bangoria
2018-07-04  9:16               ` Srikar Dronamraju
2018-07-04  9:24                 ` Ravi Bangoria
2018-07-03  8:11           ` Ravi Bangoria
2018-07-03 16:36           ` Oleg Nesterov
2018-07-03 17:25             ` Oleg Nesterov
2018-07-04  4:53               ` Ravi Bangoria
2018-07-10 15:25                 ` Oleg Nesterov
2018-07-11  8:44                   ` Ravi Bangoria
2018-07-11  9:52                     ` Ravi Bangoria
2018-07-12 14:58                     ` Oleg Nesterov
2018-07-12 19:53                       ` Song Liu
2018-07-13  7:55                         ` Ravi Bangoria [this message]
2018-07-13 23:50                           ` Song Liu
2018-07-16  8:20                             ` Ravi Bangoria
2018-07-16  8:51                             ` Ravi Bangoria
2018-07-13  5:39                       ` Ravi Bangoria
2018-07-04  4:49             ` Ravi Bangoria
2018-07-03 17:12           ` Oleg Nesterov
2018-07-03 18:23             ` Oleg Nesterov
2018-07-04  5:25               ` Ravi Bangoria
2018-07-02 16:01   ` Srikar Dronamraju
2018-07-02 18:05     ` Oleg Nesterov
2018-07-03  6:29     ` Ravi Bangoria
2018-07-03 19:26       ` Oleg Nesterov
2018-07-04  5:26         ` Ravi Bangoria
2018-07-04  6:07     ` Ravi Bangoria
2018-06-28  5:22 ` [PATCH v5 07/10] trace_uprobe/sdt: Prevent multiple reference counter for same uprobe Ravi Bangoria
2018-06-28  5:22 ` [PATCH v5 08/10] Uprobes/sdt: " Ravi Bangoria
2018-06-28  5:22 ` [PATCH v5 09/10] Uprobes/sdt: Document about reference counter Ravi Bangoria
2018-07-02 14:54   ` Srikar Dronamraju
2018-07-03  7:50     ` Ravi Bangoria
2018-06-28  5:22 ` [PATCH v5 10/10] perf probe: Support SDT markers having reference counter (semaphore) Ravi Bangoria
2018-07-02 14:45   ` Srikar Dronamraju
2018-07-02 14:57   ` Srikar Dronamraju
2018-07-03  8:00     ` Ravi Bangoria

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=9b062a19-d9a7-b360-26b1-d28b8dfc35a3@linux.ibm.com \
    --to=ravi.bangoria@linux.ibm.com \
    --cc=acme@kernel.org \
    --cc=alexander.shishkin@linux.intel.com \
    --cc=alexis.berlemont@gmail.com \
    --cc=ananth@linux.vnet.ibm.com \
    --cc=corbet@lwn.net \
    --cc=jolsa@redhat.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mips@linux-mips.org \
    --cc=linux@armlinux.org.uk \
    --cc=liu.song.a23@gmail.com \
    --cc=mhiramat@kernel.org \
    --cc=mingo@redhat.com \
    --cc=namhyung@kernel.org \
    --cc=naveen.n.rao@linux.vnet.ibm.com \
    --cc=oleg@redhat.com \
    --cc=paul.burton@mips.com \
    --cc=peterz@infradead.org \
    --cc=ralf@linux-mips.org \
    --cc=rostedt@goodmis.org \
    --cc=srikar@linux.vnet.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).