bpf.vger.kernel.org archive mirror
From: "Leizhen (ThunderTown)" <thunder.leizhen@huawei.com>
To: Jiri Olsa <olsajiri@gmail.com>
Cc: Petr Mladek <pmladek@suse.com>,
	Josh Poimboeuf <jpoimboe@kernel.org>,
	Jiri Kosina <jikos@kernel.org>, Miroslav Benes <mbenes@suse.cz>,
	Joe Lawrence <joe.lawrence@redhat.com>,
	Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Andrii Nakryiko <andrii@kernel.org>,
	Martin KaFai Lau <martin.lau@linux.dev>,
	Song Liu <song@kernel.org>, Yonghong Song <yhs@fb.com>,
	John Fastabend <john.fastabend@gmail.com>,
	KP Singh <kpsingh@kernel.org>,
	Stanislav Fomichev <sdf@google.com>, Hao Luo <haoluo@google.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Masami Hiramatsu <mhiramat@kernel.org>,
	Mark Rutland <mark.rutland@arm.com>, <bpf@vger.kernel.org>,
	<linux-trace-kernel@vger.kernel.org>,
	<live-patching@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
	Luis Chamberlain <mcgrof@kernel.org>,
	<linux-modules@vger.kernel.org>
Subject: Re: [PATCH 2/3] bpf: Optimize get_modules_for_addrs()
Date: Mon, 9 Jan 2023 16:51:37 +0800
Message-ID: <652e0eea-1ab2-a4fd-151a-e634bcb4e1da@huawei.com>
In-Reply-To: <Y7ftxIiV35Wd75lZ@krava>



On 2023/1/6 17:45, Jiri Olsa wrote:
> On Thu, Jan 05, 2023 at 10:31:12PM +0100, Jiri Olsa wrote:
>> On Wed, Jan 04, 2023 at 05:25:08PM +0100, Petr Mladek wrote:
>>> On Fri 2022-12-30 19:27:28, Zhen Lei wrote:
>>>> Function __module_address() can quickly return the pointer of the module
>>>> to which an address belongs. We do not need to traverse the symbols of all
>>>> modules to check whether each address in addrs[] is the start address of
>>>> the corresponding symbol, because register_fprobe_ips() will do this check
>>>> later.
>>
>> hum, for some reason I can see only replies to this patch and
>> not the actual patch.. I'll dig it out of the lore I guess
>>
>>>>
>>>> Assuming that there are m modules, each module has n symbols on average,
>>>> and the number of addresses 'addrs_cnt' is abbreviated as K. Then the time
>>>> complexity of the original method is O(K * log(K)) + O(m * n * log(K)),
>>>> and the time complexity of current method is O(K * (log(m) + M)), M <= m.
>>>> (m * n * log(K)) / (K * m) ==> n / log2(K). Even if n is 10 and K is 128,
>>>> the ratio is still greater than 1. Therefore, the new method will
>>>> generally have better performance.
>>
>> could you try to benchmark that? I tried something similar but was not
>> able to get better performance
> 
> hm, looks like I tried a similar thing (below) to what you did,

Yes. I just found out you're working on this improvement, too.

> but wasn't able to get better performance

Your implementation below is already about as far as this code can be
optimized. If the performance does not improve, that indicates this
code is not the bottleneck.

> 
> I guess your goal is to get rid of the module arg in
> module_kallsyms_on_each_symbol callback that we use?

Keeping the 'mod' argument for module_kallsyms_on_each_symbol() is not
a bad thing, but for kallsyms_on_each_symbol() it is completely
redundant. These two functions often share the same hook function, so
I carefully analyzed get_modules_for_addrs(), which is the only place
that uses the 'mod' parameter. It looks possible to eliminate the
'mod' parameter there.
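
To make the idea concrete, here is a minimal userspace sketch of an
address-driven lookup that needs no 'mod' argument from a symbol
callback. It is only an illustration, not the kernel code: struct
mod_range, find_range() and collect_ranges() are hypothetical names,
find_range() is a linear stand-in for __module_text_address(), and the
sketch assumes each module occupies a single contiguous range and that
addrs[] is already sorted.

```c
#include <stddef.h>

/* Hypothetical stand-in for one module's text range [start, end). */
struct mod_range {
	unsigned long start;
	unsigned long end;
	int id;
};

/*
 * Linear stand-in for __module_text_address(): return the range that
 * contains addr, or NULL if no range contains it.
 */
static const struct mod_range *find_range(const struct mod_range *mods,
					  size_t nmods, unsigned long addr)
{
	for (size_t i = 0; i < nmods; i++) {
		if (addr >= mods[i].start && addr < mods[i].end)
			return &mods[i];
	}
	return NULL;
}

/*
 * Walk the sorted addrs[] and store each owning range once. Because the
 * addresses are sorted and each range is assumed contiguous, comparing
 * against the last stored pointer is enough to skip duplicates.
 * Returns the number of ranges stored, or -1 if out[] is too small.
 */
static int collect_ranges(const struct mod_range *mods, size_t nmods,
			  const unsigned long *addrs, size_t naddrs,
			  const struct mod_range **out, size_t out_cap)
{
	size_t cnt = 0;

	for (size_t i = 0; i < naddrs; i++) {
		const struct mod_range *m = find_range(mods, nmods, addrs[i]);

		if (!m)
			continue;	/* address is not in any module */
		if (cnt && out[cnt - 1] == m)
			continue;	/* same module as the previous hit */
		if (cnt == out_cap)
			return -1;
		out[cnt++] = m;
	}
	return (int)cnt;
}
```

The sketch leaves out module reference counting and locking entirely,
and the last-pointer check is only valid under the contiguity
assumption stated above.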

> I'm ok with the change if the performance is not worse

OK, thanks.

> 
> jirka
> 
> 
> ---
> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> index 5b9008bc597b..3280c22009f1 100644
> --- a/kernel/trace/bpf_trace.c
> +++ b/kernel/trace/bpf_trace.c
> @@ -2692,23 +2692,16 @@ struct module_addr_args {
>  	int mods_cap;
>  };
>  
> -static int module_callback(void *data, const char *name,
> -			   struct module *mod, unsigned long addr)
> +static int add_module(struct module_addr_args *args, struct module *mod)
>  {
> -	struct module_addr_args *args = data;
>  	struct module **mods;
>  
> -	/* We iterate all modules symbols and for each we:
> -	 * - search for it in provided addresses array
> -	 * - if found we check if we already have the module pointer stored
> -	 *   (we iterate modules sequentially, so we can check just the last
> -	 *   module pointer)
> +	/* We iterate sorted addresses and for each within module we:
> +	 * - check if we already have the module pointer stored for it
> +	 *   (we iterate sorted addresses sequentially, so we can check
> +	 *   just the last module pointer)
>  	 * - take module reference and store it
>  	 */
> -	if (!bsearch(&addr, args->addrs, args->addrs_cnt, sizeof(addr),
> -		       bpf_kprobe_multi_addrs_cmp))
> -		return 0;
> -
>  	if (args->mods && args->mods[args->mods_cnt - 1] == mod)
>  		return 0;

This check will run into the problems Petr mentioned:

https://lkml.org/lkml/2023/1/5/191

>  
> @@ -2734,10 +2727,24 @@ static int get_modules_for_addrs(struct module ***mods, unsigned long *addrs, u3
>  		.addrs     = addrs,
>  		.addrs_cnt = addrs_cnt,
>  	};
> -	int err;
> +	u32 i, err = 0;
> +
> +	for (i = 0; !err && i < addrs_cnt; i++) {
> +		struct module *mod;
> +		bool found = false;
> +
> +		preempt_disable();
> +		mod = __module_text_address(addrs[i]);
> +		found = mod && try_module_get(mod);
> +		preempt_enable();
> +
> +		if (found) {
> +			err = add_module(&args, mod);
> +			module_put(mod);
> +		}
> +	}
>  
>  	/* We return either err < 0 in case of error, ... */
> -	err = module_kallsyms_on_each_symbol(module_callback, &args);
>  	if (err) {
>  		kprobe_multi_put_modules(args.mods, args.mods_cnt);
>  		kfree(args.mods);
> @@ -2862,7 +2869,8 @@ int bpf_kprobe_multi_link_attach(const union bpf_attr *attr, struct bpf_prog *pr
>  	} else {
>  		/*
>  		 * We need to sort addrs array even if there are no cookies
> -		 * provided, to allow bsearch in get_modules_for_addrs.
> +		 * provided, to allow sequential address walk in
> +		 * get_modules_for_addrs.
>  		 */
>  		sort(addrs, cnt, sizeof(*addrs),
>  		       bpf_kprobe_multi_addrs_cmp, NULL);
> .
> 

-- 
Regards,
  Zhen Lei
