All of lore.kernel.org
 help / color / mirror / Atom feed
From: Yury Norov <yury.norov@gmail.com>
To: "Michał Mirosław" <mirq-linux@rere.qmqm.pl>
Cc: linux-kernel@vger.kernel.org,
	"James E.J. Bottomley" <jejb@linux.ibm.com>,
	"Paul E. McKenney" <paulmck@kernel.org>,
	"Martin K. Petersen" <martin.petersen@oracle.com>,
	"Rafael J. Wysocki" <rafael@kernel.org>,
	Russell King <linux@armlinux.org.uk>,
	Amitkumar Karwar <amitkarwar@gmail.com>,
	Alexey Klimov <aklimov@redhat.com>,
	linux-alpha@vger.kernel.org,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Andy Gross <agross@kernel.org>,
	Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>,
	Petr Mladek <pmladek@suse.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Andrew Lunn <andrew@lunn.ch>, Andi Kleen <ak@linux.intel.com>,
	Tejun Heo <tj@kernel.org>, Ard Biesheuvel <ardb@kernel.org>,
	Vlastimil Babka <vbabka@suse.cz>, Anup Patel <anup.patel@wdc.com>,
	linux-ia64@vger.kernel.org, Andy Shevchenko <andy@infradead.org>,
	Andy Lutomirski <luto@kernel.org>,
	Matti Vaittinen <mazziesaccount@gmail.com>,
	Mel Gorman <mgorman@suse.de>, Christoph Hellwig <hch@lst.de>,
	Palmer Dabbelt <palmer@dabbelt.com>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Rasmus Villemoes <linux@rasmusvillemoes.dk>,
	Borislav Petkov <bp@alien8.de>, Arnd Bergmann <arnd@arndb.de>,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	Stephen Rothwell <sfr@canb.auug.org.au>,
	David Laight <David.Laight@aculab.com>,
	Sunil Goutham <sgoutham@marvell.com>,
	David Airlie <airlied@linux.ie>,
	Thomas Gleixner <tglx@linutronix.de>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Viresh Kumar <viresh.kumar@linaro.org>,
	Daniel Vetter <daniel@ffwll.ch>,
	bcm-kernel-feedback-list@broadcom.com,
	Christoph Lameter <cl@linux.com>,
	linux-crypto@vger.kernel.org, Hans de Goede <hdegoede@redhat.com>,
	linux-mm@kvack.org, Guo Ren <guoren@kernel.org>,
	linux-snps-arc@lists.infradead.org,
	Geetha sowjanya <gakula@marvell.com>,
	Mark Rutland <mark.rutland@arm.com>,
	Dinh Nguyen <dinguyen@kernel.org>,
	Mauro Carvalho Chehab <mchehab@kernel.org>,
	Dennis Zhou <dennis@kernel.org>,
	Michael Ellerman <mpe@ellerman.id.au>,
	Heiko Carstens <hca@linux.ibm.com>,
	Nicholas Piggin <npiggin@gmail.com>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Geert Uytterhoeven <geert@linux-m68k.org>,
	Randy Dunlap <rdunlap@infradead.org>,
	Roy Pledge <Roy.Pledge@nxp.com>,
	Saeed Mahameed <saeedm@nvidia.com>, Jens Axboe <axboe@fb.com>,
	Jason Wessel <jason.wessel@windriver.com>,
	Jakub Kicinski <kuba@kernel.org>,
	Sergey Senozhatsky <senozhatsky@chromium.org>,
	Ingo Molnar <mingo@redhat.com>, Stephen Boyd <sboyd@kernel.org>,
	Ian Rogers <irogers@google.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Sagi Grimberg <sagi@grimberg.me>,
	Sudeep Holla <sudeep.holla@arm.com>,
	Kalle Valo <kvalo@codeaurora.org>,
	Tariq Toukan <tariqt@nvidia.com>,
	Juri Lelli <juri.lelli@redhat.com>,
	Thomas Bogendoerfer <tsbogend@alpha.franken.de>,
	Jonathan Cameron <jic23@kernel.org>,
	Ulf Hansson <ulf.hansson@linaro.org>,
	Jiri Olsa <jolsa@redhat.com>, Vineet Gupta <vgupta@kernel.org>,
	Solomon Peachy <pizza@shaftnet.org>,
	Vivien Didelot <vivien.didelot@gmail.com>,
	Lee Jones <lee.jones@linaro.org>, Will Deacon <will@kernel.org>,
	Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>,
	kvm@vger.kernel.org, Kees Cook <keescook@chromium.org>,
	linux-arm-kernel@lists.infradead.org,
	Subbaraya Sundeep <sbhatta@marvell.com>,
	linux-csky@vger.kernel.org, Marcin Wojtas <mw@semihalf.com>,
	linux-mips@vger.kernel.org, Marc Zyngier <maz@kernel.org>,
	linux-perf-users@vger.kernel.org,
	Vincent Guittot <vincent.guittot@linaro.org>,
	linux-s390@vger.kernel.org, Mark Gross <markgross@kernel.org>,
	linux-riscv@lists.infradead.org, linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH 0/9] lib/bitmap: optimize bitmap_weight() usage
Date: Wed, 1 Dec 2021 16:31:40 -0800	[thread overview]
Message-ID: <20211202003140.GA430494@lapt> (raw)
In-Reply-To: <3CD9ECD8-901E-497B-9AE1-0DDB02346892@rere.qmqm.pl>

On Mon, Nov 29, 2021 at 04:34:07PM +0000, Michał Mirosław wrote:
> Dnia 29 listopada 2021 06:38:39 UTC, Yury Norov <yury.norov@gmail.com> napisał/a:
> >On Sun, Nov 28, 2021 at 07:03:41PM +0100, mirq-test@rere.qmqm.pl wrote:
> >> On Sat, Nov 27, 2021 at 07:56:55PM -0800, Yury Norov wrote:
> >> > In many cases people use bitmap_weight()-based functions like this:
> >> > 
> >> > 	if (num_present_cpus() > 1)
> >> > 		do_something();
> >> > 
> >> > This may take considerable amount of time on many-cpus machines because
> >> > num_present_cpus() will traverse every word of underlying cpumask
> >> > unconditionally.
> >> > 
> >> > We can significantly improve on it for many real cases if stop traversing
> >> > the mask as soon as we count present cpus to any number greater than 1:
> >> > 
> >> > 	if (num_present_cpus_gt(1))
> >> > 		do_something();
> >> > 
> >> > To implement this idea, the series adds bitmap_weight_{eq,gt,le}
> >> > functions together with corresponding wrappers in cpumask and nodemask.
> >> 
> >> Having slept on it I have more structured thoughts:
> >> 
> >> First, I like substituting bitmap_empty/full where possible - I think
> >> the change stands on its own, so could be split and sent as is.
> >
> >Ok, I can do it.
> >
> >> I don't like the proposed API very much. One problem is that it hides
> >> the comparison operator and makes call sites less readable:
> >> 
> >> 	bitmap_weight(...) > N
> >> 
> >> becomes:
> >> 
> >> 	bitmap_weight_gt(..., N)
> >> 
> >> and:
> >> 	bitmap_weight(...) <= N
> >> 
> >> becomes:
> >> 
> >> 	bitmap_weight_lt(..., N+1)
> >> or:
> >> 	!bitmap_weight_gt(..., N)
> >> 
> >> I'd rather see something resembling memcmp() API that's known enough
> >> to be easier to grasp. For above examples:
> >> 
> >> 	bitmap_weight_cmp(..., N) > 0
> >> 	bitmap_weight_cmp(..., N) <= 0
> >> 	...
> >
> >bitmap_weight_cmp() cannot be efficient. Consider this example:
> >
> >bitmap_weight_lt(1000 0000 0000 0000, 1) == false
> >                 ^
> >                 stop here
> >
> >bitmap_weight_cmp(1000 0000 0000 0000, 1) == 0
> >                                 ^
> >                                 stop here
> >
> >I agree that '_gt' is less verbose than '>', but the advantage of 
> >'_gt' over '>' is proportional to length of bitmap, and it means
> >that this API should exist.
> 
> Thank you for the example. Indeed, for less-than to be efficient here you would need to replace
>  bitmap_weight_cmp(..., N) < 0
> with
>  bitmap_weight_cmp(..., N-1) <= 0

Indeed, thanks for pointing to it.
 
> It would still be more readable, I think.

To be honest, I'm not sure that
        bitmap_weight_cmp(..., N-1) <= 0
would be an obvious replacement for the original
        bitmap_weight(...) < N
comparing to 
        bitmap_weight_lt(..., N)

I think the best thing I can do is to add bitmap_weight_cmp() as
you suggested, and turn lt and others to be wrappers on it. This
will let people choose a better function in each case.

I also think that for v2 it would be better to drop the conversion
for short bitmaps, except for switching to bitmap_empty(), because
in that case readability wins over performance; if no objections. 

Thanks,
Yury

WARNING: multiple messages have this Message-ID (diff)
From: Yury Norov <yury.norov@gmail.com>
To: "Michał Mirosław" <mirq-linux@rere.qmqm.pl>
Cc: linux-kernel@vger.kernel.org,
	"James E.J. Bottomley" <jejb@linux.ibm.com>,
	"Paul E. McKenney" <paulmck@kernel.org>,
	"Martin K. Petersen" <martin.petersen@oracle.com>,
	"Rafael J. Wysocki" <rafael@kernel.org>,
	Russell King <linux@armlinux.org.uk>,
	Amitkumar Karwar <amitkarwar@gmail.com>,
	Alexey Klimov <aklimov@redhat.com>,
	linux-alpha@vger.kernel.org,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Andy Gross <agross@kernel.org>,
	Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>,
	Petr Mladek <pmladek@suse.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Andrew Lunn <andrew@lunn.ch>, Andi Kleen <ak@linux.intel.com>,
	Tejun Heo <tj@kernel.org>, Ard Biesheuvel <ardb@kernel.org>,
	Vlastimil Babka <vbabka@suse.cz>, Anup Patel <anup.patel@wdc.com>,
	linux-ia64@vger.kernel.org, Andy Shevchenko <andy@infradead.org>,
	Andy Lutomirski <luto@kernel.org>,
	Matti Vaittinen <mazziesaccount@gmail.com>,
	Mel Gorman <mgorman@suse.de>, Christoph Hellwig <hch@lst.de>,
	Palmer Dabbelt <palmer@dabbelt.com>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Rasmus Villemoes <linux@rasmusvillemoes.dk>,
	Borislav Petkov <bp@alien8.de>, Arnd Bergmann <arnd@arndb.de>,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	Stephen Rothwell <sfr@canb.auug.org.au>,
	David Laight <David.Laight@aculab.com>,
	Sunil Goutham <sgoutham@marvell.com>,
	David Airlie <airlied@linux.ie>,
	Thomas Gleixner <tglx@linutronix.de>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Viresh Kumar <viresh.kumar@linaro.org>,
	Daniel Vetter <daniel@ffwll.ch>,
	bcm-kernel-feedback-list@broadcom.com,
	Christoph Lameter <cl@linux.com>,
	linux-crypto@vger.kernel.org, Hans de Goede <hdegoede@redhat.com>,
	linux-mm@kvack.org, Guo Ren <guoren@kernel.org>,
	linux-snps-arc@lists.infradead.org,
	Geetha sowjanya <gakula@marvell.com>,
	Mark Rutland <mark.rutland@arm.com>,
	Dinh Nguyen <dinguyen@kernel.org>,
	Mauro Carvalho Chehab <mchehab@kernel.org>,
	Dennis Zhou <dennis@kernel.org>,
	Michael Ellerman <mpe@ellerman.id.au>,
	Heiko Carstens <hca@linux.ibm.com>,
	Nicholas Piggin <npiggin@gmail.com>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Geert Uytterhoeven <geert@linux-m68k.org>,
	Randy Dunlap <rdunlap@infradead.org>,
	Roy Pledge <Roy.Pledge@nxp.com>,
	Saeed Mahameed <saeedm@nvidia.com>, Jens Axboe <axboe@fb.com>,
	Jason Wessel <jason.wessel@windriver.com>,
	Jakub Kicinski <kuba@kernel.org>,
	Sergey Senozhatsky <senozhatsky@chromium.org>,
	Ingo Molnar <mingo@redhat.com>, Stephen Boyd <sboyd@kernel.org>,
	Ian Rogers <irogers@google.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Sagi Grimberg <sagi@grimberg.me>,
	Sudeep Holla <sudeep.holla@arm.com>,
	Kalle Valo <kvalo@codeaurora.org>,
	Tariq Toukan <tariqt@nvidia.com>,
	Juri Lelli <juri.lelli@redhat.com>,
	Thomas Bogendoerfer <tsbogend@alpha.franken.de>,
	Jonathan Cameron <jic23@kernel.org>,
	Ulf Hansson <ulf.hansson@linaro.org>,
	Jiri Olsa <jolsa@redhat.com>, Vineet Gupta <vgupta@kernel.org>,
	Solomon Peachy <pizza@shaftnet.org>,
	Vivien Didelot <vivien.didelot@gmail.com>,
	Lee Jones <lee.jones@linaro.org>, Will Deacon <will@kernel.org>,
	Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>,
	kvm@vger.kernel.org, Kees Cook <keescook@chromium.org>,
	linux-arm-kernel@lists.infradead.org,
	Subbaraya Sundeep <sbhatta@marvell.com>,
	linux-csky@vger.kernel.org, Marcin Wojtas <mw@semihalf.com>,
	linux-mips@vger.kernel.org, Marc Zyngier <maz@kernel.org>,
	linux-perf-users@vger.kernel.org,
	Vincent Guittot <vincent.guittot@linaro.org>,
	linux-s390@vger.kernel.org, Mark Gross <markgross@kernel.org>,
	linux-riscv@lists.infradead.org, linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH 0/9] lib/bitmap: optimize bitmap_weight() usage
Date: Wed, 1 Dec 2021 16:31:40 -0800	[thread overview]
Message-ID: <20211202003140.GA430494@lapt> (raw)
In-Reply-To: <3CD9ECD8-901E-497B-9AE1-0DDB02346892@rere.qmqm.pl>

On Mon, Nov 29, 2021 at 04:34:07PM +0000, Michał Mirosław wrote:
> Dnia 29 listopada 2021 06:38:39 UTC, Yury Norov <yury.norov@gmail.com> napisał/a:
> >On Sun, Nov 28, 2021 at 07:03:41PM +0100, mirq-test@rere.qmqm.pl wrote:
> >> On Sat, Nov 27, 2021 at 07:56:55PM -0800, Yury Norov wrote:
> >> > In many cases people use bitmap_weight()-based functions like this:
> >> > 
> >> > 	if (num_present_cpus() > 1)
> >> > 		do_something();
> >> > 
> >> > This may take considerable amount of time on many-cpus machines because
> >> > num_present_cpus() will traverse every word of underlying cpumask
> >> > unconditionally.
> >> > 
> >> > We can significantly improve on it for many real cases if stop traversing
> >> > the mask as soon as we count present cpus to any number greater than 1:
> >> > 
> >> > 	if (num_present_cpus_gt(1))
> >> > 		do_something();
> >> > 
> >> > To implement this idea, the series adds bitmap_weight_{eq,gt,le}
> >> > functions together with corresponding wrappers in cpumask and nodemask.
> >> 
> >> Having slept on it I have more structured thoughts:
> >> 
> >> First, I like substituting bitmap_empty/full where possible - I think
> >> the change stands on its own, so could be split and sent as is.
> >
> >Ok, I can do it.
> >
> >> I don't like the proposed API very much. One problem is that it hides
> >> the comparison operator and makes call sites less readable:
> >> 
> >> 	bitmap_weight(...) > N
> >> 
> >> becomes:
> >> 
> >> 	bitmap_weight_gt(..., N)
> >> 
> >> and:
> >> 	bitmap_weight(...) <= N
> >> 
> >> becomes:
> >> 
> >> 	bitmap_weight_lt(..., N+1)
> >> or:
> >> 	!bitmap_weight_gt(..., N)
> >> 
> >> I'd rather see something resembling memcmp() API that's known enough
> >> to be easier to grasp. For above examples:
> >> 
> >> 	bitmap_weight_cmp(..., N) > 0
> >> 	bitmap_weight_cmp(..., N) <= 0
> >> 	...
> >
> >bitmap_weight_cmp() cannot be efficient. Consider this example:
> >
> >bitmap_weight_lt(1000 0000 0000 0000, 1) == false
> >                 ^
> >                 stop here
> >
> >bitmap_weight_cmp(1000 0000 0000 0000, 1) == 0
> >                                 ^
> >                                 stop here
> >
> >I agree that '_gt' is less verbose than '>', but the advantage of 
> >'_gt' over '>' is proportional to length of bitmap, and it means
> >that this API should exist.
> 
> Thank you for the example. Indeed, for less-than to be efficient here you would need to replace
>  bitmap_weight_cmp(..., N) < 0
> with
>  bitmap_weight_cmp(..., N-1) <= 0

Indeed, thanks for pointing to it.
 
> It would still be more readable, I think.

To be honest, I'm not sure that
        bitmap_weight_cmp(..., N-1) <= 0
would be an obvious replacement for the original
        bitmap_weight(...) < N
comparing to 
        bitmap_weight_lt(..., N)

I think the best thing I can do is to add bitmap_weight_cmp() as
you suggested, and turn lt and others to be wrappers on it. This
will let people choose a better function in each case.

I also think that for v2 it would be better to drop the conversion
for short bitmaps, except for switching to bitmap_empty(), because
in that case readability wins over performance; if no objections. 

Thanks,
Yury

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

WARNING: multiple messages have this Message-ID (diff)
From: Yury Norov <yury.norov@gmail.com>
To: "Michał Mirosław" <mirq-linux@rere.qmqm.pl>
Cc: linux-kernel@vger.kernel.org,
	"James E.J. Bottomley" <jejb@linux.ibm.com>,
	"Paul E. McKenney" <paulmck@kernel.org>,
	"Martin K. Petersen" <martin.petersen@oracle.com>,
	"Rafael J. Wysocki" <rafael@kernel.org>,
	Russell King <linux@armlinux.org.uk>,
	Amitkumar Karwar <amitkarwar@gmail.com>,
	Alexey Klimov <aklimov@redhat.com>,
	linux-alpha@vger.kernel.org,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Andy Gross <agross@kernel.org>,
	Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>,
	Petr Mladek <pmladek@suse.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Andrew Lunn <andrew@lunn.ch>, Andi Kleen <ak@linux.intel.com>,
	Tejun Heo <tj@kernel.org>, Ard Biesheuvel <ardb@kernel.org>,
	Vlastimil Babka <vbabka@suse.cz>, Anup Patel <anup.patel@wdc.com>,
	linux-ia64@vger.kernel.org, Andy Shevchenko <andy@infradead.org>,
	Andy Lutomirski <luto@kernel.org>,
	Matti Vaittinen <mazziesaccount@gmail.com>,
	Mel Gorman <mgorman@suse.de>, Christoph Hellwig <hch@lst.de>,
	Palmer Dabbelt <palmer@dabbelt.com>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Rasmus Villemoes <linux@rasmusvillemoes.dk>,
	Borislav Petkov <bp@alien8.de>, Arnd Bergmann <arnd@arndb.de>,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	Stephen Rothwell <sfr@canb.auug.org.au>,
	David Laight <David.Laight@aculab.com>,
	Sunil Goutham <sgoutham@marvell.com>,
	David Airlie <airlied@linux.ie>,
	Thomas Gleixner <tglx@linutronix.de>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Viresh Kumar <viresh.kumar@linaro.org>,
	Daniel Vetter <daniel@ffwll.ch>,
	bcm-kernel-feedback-list@broadcom.com,
	Christoph Lameter <cl@linux.com>,
	linux-crypto@vger.kernel.org, Hans de Goede <hdegoede@redhat.com>,
	linux-mm@kvack.org, Guo Ren <guoren@kernel.org>,
	linux-snps-arc@lists.infradead.org,
	Geetha sowjanya <gakula@marvell.com>,
	Mark Rutland <mark.rutland@arm.com>,
	Dinh Nguyen <dinguyen@kernel.org>,
	Mauro Carvalho Chehab <mchehab@kernel.org>,
	Dennis Zhou <dennis@kernel.org>,
	Michael Ellerman <mpe@ellerman.id.au>,
	Heiko Carstens <hca@linux.ibm.com>,
	Nicholas Piggin <npiggin@gmail.com>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Geert Uytterhoeven <geert@linux-m68k.org>,
	Randy Dunlap <rdunlap@infradead.org>,
	Roy Pledge <Roy.Pledge@nxp.com>,
	Saeed Mahameed <saeedm@nvidia.com>, Jens Axboe <axboe@fb.com>,
	Jason Wessel <jason.wessel@windriver.com>,
	Jakub Kicinski <kuba@kernel.org>,
	Sergey Senozhatsky <senozhatsky@chromium.org>,
	Ingo Molnar <mingo@redhat.com>, Stephen Boyd <sboyd@kernel.org>,
	Ian Rogers <irogers@google.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Sagi Grimberg <sagi@grimberg.me>,
	Sudeep Holla <sudeep.holla@arm.com>,
	Kalle Valo <kvalo@codeaurora.org>,
	Tariq Toukan <tariqt@nvidia.com>,
	Juri Lelli <juri.lelli@redhat.com>,
	Thomas Bogendoerfer <tsbogend@alpha.franken.de>,
	Jonathan Cameron <jic23@kernel.org>,
	Ulf Hansson <ulf.hansson@linaro.org>,
	Jiri Olsa <jolsa@redhat.com>, Vineet Gupta <vgupta@kernel.org>,
	Solomon Peachy <pizza@shaftnet.org>,
	Vivien Didelot <vivien.didelot@gmail.com>,
	Lee Jones <lee.jones@linaro.org>, Will Deacon <will@kernel.org>,
	Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>,
	kvm@vger.kernel.org, Kees Cook <keescook@chromium.org>,
	linux-arm-kernel@lists.infradead.org,
	Subbaraya Sundeep <sbhatta@marvell.com>,
	linux-csky@vger.kernel.org, Marcin Wojtas <mw@semihalf.com>,
	linux-mips@vger.kernel.org, Marc Zyngier <maz@kernel.org>,
	linux-perf-users@vger.kernel.org,
	Vincent Guittot <vincent.guittot@linaro.org>,
	linux-s390@vger.kernel.org, Mark Gross <markgross@kernel.org>,
	linux-riscv@lists.infradead.org, linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH 0/9] lib/bitmap: optimize bitmap_weight() usage
Date: Wed, 1 Dec 2021 16:31:40 -0800	[thread overview]
Message-ID: <20211202003140.GA430494@lapt> (raw)
In-Reply-To: <3CD9ECD8-901E-497B-9AE1-0DDB02346892@rere.qmqm.pl>

On Mon, Nov 29, 2021 at 04:34:07PM +0000, Michał Mirosław wrote:
> Dnia 29 listopada 2021 06:38:39 UTC, Yury Norov <yury.norov@gmail.com> napisał/a:
> >On Sun, Nov 28, 2021 at 07:03:41PM +0100, mirq-test@rere.qmqm.pl wrote:
> >> On Sat, Nov 27, 2021 at 07:56:55PM -0800, Yury Norov wrote:
> >> > In many cases people use bitmap_weight()-based functions like this:
> >> > 
> >> > 	if (num_present_cpus() > 1)
> >> > 		do_something();
> >> > 
> >> > This may take considerable amount of time on many-cpus machines because
> >> > num_present_cpus() will traverse every word of underlying cpumask
> >> > unconditionally.
> >> > 
> >> > We can significantly improve on it for many real cases if stop traversing
> >> > the mask as soon as we count present cpus to any number greater than 1:
> >> > 
> >> > 	if (num_present_cpus_gt(1))
> >> > 		do_something();
> >> > 
> >> > To implement this idea, the series adds bitmap_weight_{eq,gt,le}
> >> > functions together with corresponding wrappers in cpumask and nodemask.
> >> 
> >> Having slept on it I have more structured thoughts:
> >> 
> >> First, I like substituting bitmap_empty/full where possible - I think
> >> the change stands on its own, so could be split and sent as is.
> >
> >Ok, I can do it.
> >
> >> I don't like the proposed API very much. One problem is that it hides
> >> the comparison operator and makes call sites less readable:
> >> 
> >> 	bitmap_weight(...) > N
> >> 
> >> becomes:
> >> 
> >> 	bitmap_weight_gt(..., N)
> >> 
> >> and:
> >> 	bitmap_weight(...) <= N
> >> 
> >> becomes:
> >> 
> >> 	bitmap_weight_lt(..., N+1)
> >> or:
> >> 	!bitmap_weight_gt(..., N)
> >> 
> >> I'd rather see something resembling memcmp() API that's known enough
> >> to be easier to grasp. For above examples:
> >> 
> >> 	bitmap_weight_cmp(..., N) > 0
> >> 	bitmap_weight_cmp(..., N) <= 0
> >> 	...
> >
> >bitmap_weight_cmp() cannot be efficient. Consider this example:
> >
> >bitmap_weight_lt(1000 0000 0000 0000, 1) == false
> >                 ^
> >                 stop here
> >
> >bitmap_weight_cmp(1000 0000 0000 0000, 1) == 0
> >                                 ^
> >                                 stop here
> >
> >I agree that '_gt' is less verbose than '>', but the advantage of 
> >'_gt' over '>' is proportional to length of bitmap, and it means
> >that this API should exist.
> 
> Thank you for the example. Indeed, for less-than to be efficient here you would need to replace
>  bitmap_weight_cmp(..., N) < 0
> with
>  bitmap_weight_cmp(..., N-1) <= 0

Indeed, thanks for pointing to it.
 
> It would still be more readable, I think.

To be honest, I'm not sure that
        bitmap_weight_cmp(..., N-1) <= 0
would be an obvious replacement for the original
        bitmap_weight(...) < N
comparing to 
        bitmap_weight_lt(..., N)

I think the best thing I can do is to add bitmap_weight_cmp() as
you suggested, and turn lt and others to be wrappers on it. This
will let people choose a better function in each case.

I also think that for v2 it would be better to drop the conversion
for short bitmaps, except for switching to bitmap_empty(), because
in that case readability wins over performance; if no objections. 

Thanks,
Yury

_______________________________________________
linux-snps-arc mailing list
linux-snps-arc@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-snps-arc

WARNING: multiple messages have this Message-ID (diff)
From: Yury Norov <yury.norov@gmail.com>
To: "Michał Mirosław" <mirq-linux@rere.qmqm.pl>
Cc: Juri Lelli <juri.lelli@redhat.com>, Andrew Lunn <andrew@lunn.ch>,
	"Rafael J. Wysocki" <rafael@kernel.org>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Guo Ren <guoren@kernel.org>, Christoph Lameter <cl@linux.com>,
	Christoph Hellwig <hch@lst.de>, Andi Kleen <ak@linux.intel.com>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Ingo Molnar <mingo@redhat.com>,
	Geert Uytterhoeven <geert@linux-m68k.org>,
	Mel Gorman <mgorman@suse.de>,
	Viresh Kumar <viresh.kumar@linaro.org>,
	Petr Mladek <pmladek@suse.com>,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	Jens Axboe <axboe@fb.com>, Andy Lutomirski <luto@kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Randy Dunlap <rdunlap@infradead.org>,
	linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org,
	Sergey Senozhatsky <senozhatsky@chromium.org>,
	linux-crypto@vger.kernel.org, Tejun Heo <tj@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Mark Rutland <mark.rutland@arm.com>,
	Anup Patel <anup.patel@wdc.com>,
	linux-ia64@vger.kernel.org, David Airlie <airlied@linux.ie>,
	Roy Pledge <Roy.Pledge@nxp.com>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Solomon Peachy <pizza@shaftnet.org>,
	Stephen Rothwell <sfr@canb.auug.org.au>,
	Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>,
	Dennis Zhou <dennis@kernel.org>,
	Matti Vaittinen <mazziesaccount@gmail.com>,
	linux-alpha@vger.kernel.org, Kalle Valo <kvalo@codeaurora.org>,
	Stephen Boyd <sboyd@kernel.org>, Tariq Toukan <tariqt@nvidia.com>,
	Dinh Nguyen <dinguyen@kernel.org>,
	Jonathan Cameron <jic23@kernel.org>,
	Ulf Hansson <ulf.hansson@linaro.org>,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>,
	Rasmus Villemoes <linux@rasmusvillemoes.dk>,
	Subbaraya Sundeep <sbhatta@marvell.com>,
	Will Deacon <will@kernel.org>, Sagi Grimberg <sagi@grimberg.me>,
	linux-csky@vger.kernel.org,
	bcm-kernel-feedback-list@broadcom.com,
	linux-arm-kernel@lists.infradead.org,
	linux-snps-arc@lists.infradead.org,
	Kees Cook <keescook@chromium.org>, Arnd Bergmann <arnd@arndb.de>,
	"James E.J. Bottomley" <jejb@linux.ibm.com>,
	Vineet Gupta <vgupta@kernel.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Mark Gross <markgross@kernel.org>, Borislav Petkov <bp@alien8.de>,
	Mauro Carvalho Chehab <mchehab@kernel.org>,
	Thomas Bogendoerfer <tsbogend@alpha.franken.de>,
	"Martin K. Petersen" <martin.petersen@oracle.com>,
	David Laight <David.Laight@aculab.com>,
	Sudeep Holla <sudeep.holla@arm.com>,
	Geetha sowjanya <gakula@marvell.com>,
	Ian Rogers <irogers@google.com>,
	kvm@vger.kernel.org, Peter Zijlstra <peterz@infradead.org>,
	Amitkumar Karwar <amitkarwar@gmail.com>,
	linux-mm@kvack.org, linux-riscv@lists.infradead.org,
	Lee Jones <lee.jones@linaro.org>,
	Ard Biesheuvel <ardb@kernel.org>, Marc Zyngier <maz@kernel.org>,
	Jiri Olsa <jolsa@redhat.com>,
	Russell King <linux@armlinux.org.uk>,
	Andy Gross <agross@kernel.org>, Jakub Kicinski <kuba@kernel.org>,
	Vivien Didelot <vivien.didelot@gmail.com>,
	Sunil Goutham <sgoutham@marvell.com>,
	"Paul E. McKenney" <paulmck@kernel.org>,
	linux-s390@vger.kernel.org, Alexey Klimov <aklimov@redhat.com>,
	Heiko Carstens <hca@linux.ibm.com>,
	Hans de Goede <hdegoede@redhat.com>,
	Nicholas Piggin <npiggin@gmail.com>,
	Marcin Wojtas <mw@semihalf.com>, Vlastimil Babka <vbabka@suse.cz>,
	linuxppc-dev@lists.ozlabs.org, linux-mips@vger.kernel.org,
	Palmer Dabbelt <palmer@dabbelt.com>,
	Daniel Vetter <daniel@ffwll.ch>,
	Jason Wessel <jason.wessel@windriver.com>,
	Saeed Mahameed <saeedm@nvidia.com>,
	Andy Shevchenko <andy@infradead.org>
Subject: Re: [PATCH 0/9] lib/bitmap: optimize bitmap_weight() usage
Date: Wed, 1 Dec 2021 16:31:40 -0800	[thread overview]
Message-ID: <20211202003140.GA430494@lapt> (raw)
In-Reply-To: <3CD9ECD8-901E-497B-9AE1-0DDB02346892@rere.qmqm.pl>

On Mon, Nov 29, 2021 at 04:34:07PM +0000, Michał Mirosław wrote:
> Dnia 29 listopada 2021 06:38:39 UTC, Yury Norov <yury.norov@gmail.com> napisał/a:
> >On Sun, Nov 28, 2021 at 07:03:41PM +0100, mirq-test@rere.qmqm.pl wrote:
> >> On Sat, Nov 27, 2021 at 07:56:55PM -0800, Yury Norov wrote:
> >> > In many cases people use bitmap_weight()-based functions like this:
> >> > 
> >> > 	if (num_present_cpus() > 1)
> >> > 		do_something();
> >> > 
> >> > This may take considerable amount of time on many-cpus machines because
> >> > num_present_cpus() will traverse every word of underlying cpumask
> >> > unconditionally.
> >> > 
> >> > We can significantly improve on it for many real cases if stop traversing
> >> > the mask as soon as we count present cpus to any number greater than 1:
> >> > 
> >> > 	if (num_present_cpus_gt(1))
> >> > 		do_something();
> >> > 
> >> > To implement this idea, the series adds bitmap_weight_{eq,gt,le}
> >> > functions together with corresponding wrappers in cpumask and nodemask.
> >> 
> >> Having slept on it I have more structured thoughts:
> >> 
> >> First, I like substituting bitmap_empty/full where possible - I think
> >> the change stands on its own, so could be split and sent as is.
> >
> >Ok, I can do it.
> >
> >> I don't like the proposed API very much. One problem is that it hides
> >> the comparison operator and makes call sites less readable:
> >> 
> >> 	bitmap_weight(...) > N
> >> 
> >> becomes:
> >> 
> >> 	bitmap_weight_gt(..., N)
> >> 
> >> and:
> >> 	bitmap_weight(...) <= N
> >> 
> >> becomes:
> >> 
> >> 	bitmap_weight_lt(..., N+1)
> >> or:
> >> 	!bitmap_weight_gt(..., N)
> >> 
> >> I'd rather see something resembling memcmp() API that's known enough
> >> to be easier to grasp. For above examples:
> >> 
> >> 	bitmap_weight_cmp(..., N) > 0
> >> 	bitmap_weight_cmp(..., N) <= 0
> >> 	...
> >
> >bitmap_weight_cmp() cannot be efficient. Consider this example:
> >
> >bitmap_weight_lt(1000 0000 0000 0000, 1) == false
> >                 ^
> >                 stop here
> >
> >bitmap_weight_cmp(1000 0000 0000 0000, 1) == 0
> >                                 ^
> >                                 stop here
> >
> >I agree that '_gt' is less verbose than '>', but the advantage of 
> >'_gt' over '>' is proportional to length of bitmap, and it means
> >that this API should exist.
> 
> Thank you for the example. Indeed, for less-than to be efficient here you would need to replace
>  bitmap_weight_cmp(..., N) < 0
> with
>  bitmap_weight_cmp(..., N-1) <= 0

Indeed, thanks for pointing to it.
 
> It would still be more readable, I think.

To be honest, I'm not sure that
        bitmap_weight_cmp(..., N-1) <= 0
would be an obvious replacement for the original
        bitmap_weight(...) < N
comparing to 
        bitmap_weight_lt(..., N)

I think the best thing I can do is to add bitmap_weight_cmp() as
you suggested, and turn lt and others to be wrappers on it. This
will let people choose a better function in each case.

I also think that for v2 it would be better to drop the conversion
for short bitmaps, except for switching to bitmap_empty(), because
in that case readability wins over performance; if no objections. 

Thanks,
Yury

WARNING: multiple messages have this Message-ID (diff)
From: Yury Norov <yury.norov@gmail.com>
To: "Michał Mirosław" <mirq-linux@rere.qmqm.pl>
Cc: linux-kernel@vger.kernel.org,
	"James E.J. Bottomley" <jejb@linux.ibm.com>,
	"Paul E. McKenney" <paulmck@kernel.org>,
	"Martin K. Petersen" <martin.petersen@oracle.com>,
	"Rafael J. Wysocki" <rafael@kernel.org>,
	Russell King <linux@armlinux.org.uk>,
	Amitkumar Karwar <amitkarwar@gmail.com>,
	Alexey Klimov <aklimov@redhat.com>,
	linux-alpha@vger.kernel.org,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Andy Gross <agross@kernel.org>,
	Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>,
	Petr Mladek <pmladek@suse.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Andrew Lunn <andrew@lunn.ch>, Andi Kleen <ak@linux.intel.com>,
	Tejun Heo <tj@kernel.org>, Ard Biesheuvel <ardb@kernel.org>,
	Vlastimil Babka <vbabka@suse.cz>, Anup Patel <anup.patel@wdc.com>,
	linux-ia64@vger.kernel.org, Andy Shevchenko <andy@infradead.org>,
	Andy Lutomirski <luto@kernel.org>,
	Matti Vaittinen <mazziesaccount@gmail.com>,
	Mel Gorman <mgorman@suse.de>, Christoph Hellwig <hch@lst.de>,
	Palmer Dabbelt <palmer@dabbelt.com>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Rasmus Villemoes <linux@rasmusvillemoes.dk>,
	Borislav Petkov <bp@alien8.de>, Arnd Bergmann <arnd@arndb.de>,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	Stephen Rothwell <sfr@canb.auug.org.au>,
	David Laight <David.Laight@aculab.com>,
	Sunil Goutham <sgoutham@marvell.com>,
	David Airlie <airlied@linux.ie>,
	Thomas Gleixner <tglx@linutronix.de>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Viresh Kumar <viresh.kumar@linaro.org>,
	Daniel Vetter <daniel@ffwll.ch>,
	bcm-kernel-feedback-list@broadcom.com,
	Christoph Lameter <cl@linux.com>,
	linux-crypto@vger.kernel.org, Hans de Goede <hdegoede@redhat.com>,
	linux-mm@kvack.org, Guo Ren <guoren@kernel.org>,
	linux-snps-arc@lists.infradead.org,
	Geetha sowjanya <gakula@marvell.com>,
	Mark Rutland <mark.rutland@arm.com>,
	Dinh Nguyen <dinguyen@kernel.org>,
	Mauro Carvalho Chehab <mchehab@kernel.org>,
	Dennis Zhou <dennis@kernel.org>,
	Michael Ellerman <mpe@ellerman.id.au>,
	Heiko Carstens <hca@linux.ibm.com>,
	Nicholas Piggin <npiggin@gmail.com>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Geert Uytterhoeven <geert@linux-m68k.org>,
	Randy Dunlap <rdunlap@infradead.org>,
	Roy Pledge <Roy.Pledge@nxp.com>,
	Saeed Mahameed <saeedm@nvidia.com>, Jens Axboe <axboe@fb.com>,
	Jason Wessel <jason.wessel@windriver.com>,
	Jakub Kicinski <kuba@kernel.org>,
	Sergey Senozhatsky <senozhatsky@chromium.org>,
	Ingo Molnar <mingo@redhat.com>, Stephen Boyd <sboyd@kernel.org>,
	Ian Rogers <irogers@google.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Sagi Grimberg <sagi@grimberg.me>,
	Sudeep Holla <sudeep.holla@arm.com>,
	Kalle Valo <kvalo@codeaurora.org>,
	Tariq Toukan <tariqt@nvidia.com>,
	Juri Lelli <juri.lelli@redhat.com>,
	Thomas Bogendoerfer <tsbogend@alpha.franken.de>,
	Jonathan Cameron <jic23@kernel.org>,
	Ulf Hansson <ulf.hansson@linaro.org>,
	Jiri Olsa <jolsa@redhat.com>, Vineet Gupta <vgupta@kernel.org>,
	Solomon Peachy <pizza@shaftnet.org>,
	Vivien Didelot <vivien.didelot@gmail.com>,
	Lee Jones <lee.jones@linaro.org>, Will Deacon <will@kernel.org>,
	Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>,
	kvm@vger.kernel.org, Kees Cook <keescook@chromium.org>,
	linux-arm-kernel@lists.infradead.org,
	Subbaraya Sundeep <sbhatta@marvell.com>,
	linux-csky@vger.kernel.org, Marcin Wojtas <mw@semihalf.com>,
	linux-mips@vger.kernel.org, Marc Zyngier <maz@kernel.org>,
	linux-perf-users@vger.kernel.org,
	Vincent Guittot <vincent.guittot@linaro.org>,
	linux-s390@vger.kernel.org, Mark Gross <markgross@kernel.org>,
	linux-riscv@lists.infradead.org, linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH 0/9] lib/bitmap: optimize bitmap_weight() usage
Date: Thu, 02 Dec 2021 00:31:40 +0000	[thread overview]
Message-ID: <20211202003140.GA430494@lapt> (raw)
In-Reply-To: <3CD9ECD8-901E-497B-9AE1-0DDB02346892@rere.qmqm.pl>

On Mon, Nov 29, 2021 at 04:34:07PM +0000, Michał Mirosław wrote:
> Dnia 29 listopada 2021 06:38:39 UTC, Yury Norov <yury.norov@gmail.com> napisał/a:
> >On Sun, Nov 28, 2021 at 07:03:41PM +0100, mirq-test@rere.qmqm.pl wrote:
> >> On Sat, Nov 27, 2021 at 07:56:55PM -0800, Yury Norov wrote:
> >> > In many cases people use bitmap_weight()-based functions like this:
> >> > 
> >> > 	if (num_present_cpus() > 1)
> >> > 		do_something();
> >> > 
> >> > This may take considerable amount of time on many-cpus machines because
> >> > num_present_cpus() will traverse every word of underlying cpumask
> >> > unconditionally.
> >> > 
> >> > We can significantly improve on it for many real cases if stop traversing
> >> > the mask as soon as we count present cpus to any number greater than 1:
> >> > 
> >> > 	if (num_present_cpus_gt(1))
> >> > 		do_something();
> >> > 
> >> > To implement this idea, the series adds bitmap_weight_{eq,gt,le}
> >> > functions together with corresponding wrappers in cpumask and nodemask.
> >> 
> >> Having slept on it I have more structured thoughts:
> >> 
> >> First, I like substituting bitmap_empty/full where possible - I think
> >> the change stands on its own, so could be split and sent as is.
> >
> >Ok, I can do it.
> >
> >> I don't like the proposed API very much. One problem is that it hides
> >> the comparison operator and makes call sites less readable:
> >> 
> >> 	bitmap_weight(...) > N
> >> 
> >> becomes:
> >> 
> >> 	bitmap_weight_gt(..., N)
> >> 
> >> and:
> >> 	bitmap_weight(...) <= N
> >> 
> >> becomes:
> >> 
> >> 	bitmap_weight_lt(..., N+1)
> >> or:
> >> 	!bitmap_weight_gt(..., N)
> >> 
> >> I'd rather see something resembling memcmp() API that's known enough
> >> to be easier to grasp. For above examples:
> >> 
> >> 	bitmap_weight_cmp(..., N) > 0
> >> 	bitmap_weight_cmp(..., N) <= 0
> >> 	...
> >
> >bitmap_weight_cmp() cannot be efficient. Consider this example:
> >
> >bitmap_weight_lt(1000 0000 0000 0000, 1) = false
> >                 ^
> >                 stop here
> >
> >bitmap_weight_cmp(1000 0000 0000 0000, 1) = 0
> >                                 ^
> >                                 stop here
> >
> >I agree that '_gt' is less verbose than '>', but the advantage of 
> >'_gt' over '>' is proportional to length of bitmap, and it means
> >that this API should exist.
> 
> Thank you for the example. Indeed, for less-than to be efficient here you would need to replace
>  bitmap_weight_cmp(..., N) < 0
> with
>  bitmap_weight_cmp(..., N-1) <= 0

Indeed, thanks for pointing to it.
 
> It would still be more readable, I think.

To be honest, I'm not sure that
        bitmap_weight_cmp(..., N-1) <= 0
would be an obvious replacement for the original
        bitmap_weight(...) < N
comparing to 
        bitmap_weight_lt(..., N)

I think the best thing I can do is to add bitmap_weight_cmp() as
you suggested, and turn lt and others to be wrappers on it. This
will let people choose a better function in each case.

I also think that for v2 it would be better to drop the conversion
for short bitmaps, except for switching to bitmap_empty(), because
in that case readability wins over performance; if no objections. 

Thanks,
Yury

WARNING: multiple messages have this Message-ID (diff)
From: Yury Norov <yury.norov@gmail.com>
To: "Michał Mirosław" <mirq-linux@rere.qmqm.pl>
Cc: linux-kernel@vger.kernel.org,
	"James E.J. Bottomley" <jejb@linux.ibm.com>,
	"Paul E. McKenney" <paulmck@kernel.org>,
	"Martin K. Petersen" <martin.petersen@oracle.com>,
	"Rafael J. Wysocki" <rafael@kernel.org>,
	Russell King <linux@armlinux.org.uk>,
	Amitkumar Karwar <amitkarwar@gmail.com>,
	Alexey Klimov <aklimov@redhat.com>,
	linux-alpha@vger.kernel.org,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Andy Gross <agross@kernel.org>,
	Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>,
	Petr Mladek <pmladek@suse.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Andrew Lunn <andrew@lunn.ch>, Andi Kleen <ak@linux.intel.com>,
	Tejun Heo <tj@kernel.org>, Ard Biesheuvel <ardb@kernel.org>,
	Vlastimil Babka <vbabka@suse.cz>, Anup Patel <anup.patel@wdc.c>
Subject: Re: [PATCH 0/9] lib/bitmap: optimize bitmap_weight() usage
Date: Wed, 1 Dec 2021 16:31:40 -0800	[thread overview]
Message-ID: <20211202003140.GA430494@lapt> (raw)
In-Reply-To: <3CD9ECD8-901E-497B-9AE1-0DDB02346892@rere.qmqm.pl>

On Mon, Nov 29, 2021 at 04:34:07PM +0000, Michał Mirosław wrote:
> Dnia 29 listopada 2021 06:38:39 UTC, Yury Norov <yury.norov@gmail.com> napisał/a:
> >On Sun, Nov 28, 2021 at 07:03:41PM +0100, mirq-test@rere.qmqm.pl wrote:
> >> On Sat, Nov 27, 2021 at 07:56:55PM -0800, Yury Norov wrote:
> >> > In many cases people use bitmap_weight()-based functions like this:
> >> > 
> >> > 	if (num_present_cpus() > 1)
> >> > 		do_something();
> >> > 
> >> > This may take considerable amount of time on many-cpus machines because
> >> > num_present_cpus() will traverse every word of underlying cpumask
> >> > unconditionally.
> >> > 
> >> > We can significantly improve on it for many real cases if stop traversing
> >> > the mask as soon as we count present cpus to any number greater than 1:
> >> > 
> >> > 	if (num_present_cpus_gt(1))
> >> > 		do_something();
> >> > 
> >> > To implement this idea, the series adds bitmap_weight_{eq,gt,le}
> >> > functions together with corresponding wrappers in cpumask and nodemask.
> >> 
> >> Having slept on it I have more structured thoughts:
> >> 
> >> First, I like substituting bitmap_empty/full where possible - I think
> >> the change stands on its own, so could be split and sent as is.
> >
> >Ok, I can do it.
> >
> >> I don't like the proposed API very much. One problem is that it hides
> >> the comparison operator and makes call sites less readable:
> >> 
> >> 	bitmap_weight(...) > N
> >> 
> >> becomes:
> >> 
> >> 	bitmap_weight_gt(..., N)
> >> 
> >> and:
> >> 	bitmap_weight(...) <= N
> >> 
> >> becomes:
> >> 
> >> 	bitmap_weight_lt(..., N+1)
> >> or:
> >> 	!bitmap_weight_gt(..., N)
> >> 
> >> I'd rather see something resembling memcmp() API that's known enough
> >> to be easier to grasp. For above examples:
> >> 
> >> 	bitmap_weight_cmp(..., N) > 0
> >> 	bitmap_weight_cmp(..., N) <= 0
> >> 	...
> >
> >bitmap_weight_cmp() cannot be efficient. Consider this example:
> >
> >bitmap_weight_lt(1000 0000 0000 0000, 1) == false
> >                 ^
> >                 stop here
> >
> >bitmap_weight_cmp(1000 0000 0000 0000, 1) == 0
> >                                 ^
> >                                 stop here
> >
> >I agree that '_gt' is less verbose than '>', but the advantage of 
> >'_gt' over '>' is proportional to length of bitmap, and it means
> >that this API should exist.
> 
> Thank you for the example. Indeed, for less-than to be efficient here you would need to replace
>  bitmap_weight_cmp(..., N) < 0
> with
>  bitmap_weight_cmp(..., N-1) <= 0

Indeed, thanks for pointing to it.
 
> It would still be more readable, I think.

To be honest, I'm not sure that
        bitmap_weight_cmp(..., N-1) <= 0
would be an obvious replacement for the original
        bitmap_weight(...) < N
comparing to 
        bitmap_weight_lt(..., N)

I think the best thing I can do is to add bitmap_weight_cmp() as
you suggested, and turn lt and others to be wrappers on it. This
will let people choose a better function in each case.

I also think that for v2 it would be better to drop the conversion
for short bitmaps, except for switching to bitmap_empty(), because
in that case readability wins over performance; if no objections. 

Thanks,
Yury

  reply	other threads:[~2021-12-02  0:31 UTC|newest]

Thread overview: 183+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-11-28  3:56 [PATCH 0/9] lib/bitmap: optimize bitmap_weight() usage Yury Norov
2021-11-28  3:56 ` Yury Norov
2021-11-28  3:56 ` Yury Norov
2021-11-28  3:56 ` Yury Norov
2021-11-28  3:56 ` [PATCH 1/9] lib/bitmap: add bitmap_weight_{eq,gt,le} Yury Norov
2021-11-28  3:56 ` Yury Norov
2021-11-28  3:56   ` Yury Norov
2021-11-28  3:56   ` Yury Norov
2021-11-28  3:56   ` Yury Norov
2021-11-28  3:56 ` [PATCH 2/9] lib/bitmap: implement bitmap_{empty, full} with bitmap_weight_eq() Yury Norov
2021-11-28  3:56   ` Yury Norov
2021-11-28  3:56   ` Yury Norov
2021-11-28  3:56   ` [PATCH 2/9] lib/bitmap: implement bitmap_{empty,full} " Yury Norov
2021-11-28  4:37   ` Michał Mirosław
2021-11-28  4:37     ` Michał Mirosław
2021-11-28  4:37     ` Michał Mirosław
2021-11-28  4:37     ` Michał Mirosław
2021-11-28  4:37     ` Michał Mirosław
2021-11-28  6:27     ` Yury Norov
2021-11-28  6:27       ` Yury Norov
2021-11-28  6:27       ` Yury Norov
2021-11-28  6:27       ` Yury Norov
2021-11-28  6:27       ` Yury Norov
2021-11-28 18:10   ` Michał Mirosław
2021-11-28 18:10     ` Michał Mirosław
2021-11-28 18:10     ` Michał Mirosław
2021-11-28 18:10     ` Michał Mirosław
2021-11-28 18:10     ` Michał Mirosław
2021-11-28 18:10     ` Michał Mirosław
2021-12-14 19:43     ` Yury Norov
2021-12-14 19:43       ` Yury Norov
2021-12-14 19:43       ` Yury Norov
2021-12-14 19:43       ` [PATCH 2/9] lib/bitmap: implement bitmap_{empty, full} " Yury Norov
2021-12-14 19:43       ` Yury Norov
2021-12-14 19:43       ` Yury Norov
2021-12-15  8:40       ` [PATCH 2/9] lib/bitmap: implement bitmap_{empty,full} " David Laight
2021-12-15  8:40         ` David Laight
2021-12-15  8:40         ` David Laight
2021-12-15  8:40         ` David Laight
2021-12-15  8:40         ` David Laight
2021-12-15  8:40         ` David Laight
2021-12-15 17:45         ` Yury Norov
2021-12-15 17:45           ` Yury Norov
2021-12-15 17:45           ` Yury Norov
2021-12-15 17:45           ` [PATCH 2/9] lib/bitmap: implement bitmap_{empty, full} " Yury Norov
2021-12-15 17:45           ` Yury Norov
2021-12-15 17:45           ` Yury Norov
2021-11-28  3:56 ` [PATCH 2/9] lib/bitmap: implement bitmap_{empty,full} " Yury Norov
2021-11-28  3:56 ` [PATCH 3/9] all: replace bitmap_weigth() with bitmap_{empty, full, eq, gt, le} Yury Norov
2021-11-28  3:56   ` Yury Norov
2021-11-28  3:56   ` [PATCH 3/9] all: replace bitmap_weigth() with bitmap_{empty,full,eq,gt,le} Yury Norov
2021-11-28  3:56   ` [PATCH 3/9] all: replace bitmap_weigth() with bitmap_{empty, full, eq, gt, le} Yury Norov
2021-11-28  4:47   ` [PATCH 3/9] all: replace bitmap_weigth() with bitmap_{empty,full,eq,gt,le} Michał Mirosław
2021-11-28  4:47     ` Michał Mirosław
2021-11-28  4:47     ` Michał Mirosław
2021-11-28  4:47     ` Michał Mirosław
2021-11-28  4:47     ` Michał Mirosław
2021-11-28  4:47     ` Michał Mirosław
2021-11-28  8:01   ` Greg Kroah-Hartman
2021-11-28  8:01     ` Greg Kroah-Hartman
2021-11-28  8:01     ` Greg Kroah-Hartman
2021-11-28  8:01     ` Greg Kroah-Hartman
2021-11-28  8:01     ` Greg Kroah-Hartman
2021-11-28  8:01     ` Greg Kroah-Hartman
2021-11-28  3:56 ` Yury Norov
2021-11-28  3:56 ` [PATCH 4/9] tools: sync bitmap_weight() usage with the kernel Yury Norov
2021-11-28  3:56   ` Yury Norov
2021-11-28  3:56   ` Yury Norov
2021-11-28  3:56   ` Yury Norov
2021-11-28  3:56 ` Yury Norov
2021-11-28  3:57 ` [PATCH 5/9] lib/cpumask: add cpumask_weight_{eq,gt,le} Yury Norov
2021-11-28  3:57   ` Yury Norov
2021-11-28  3:57   ` Yury Norov
2021-11-28  3:57   ` Yury Norov
2021-11-28  3:57 ` Yury Norov
2021-11-28  3:57 ` [PATCH 6/9] lib/nodemask: add nodemask_weight_{eq,gt,le} Yury Norov
2021-11-28  3:57 ` Yury Norov
2021-11-28  3:57   ` Yury Norov
2021-11-28  3:57   ` Yury Norov
2021-11-28  3:57   ` Yury Norov
2021-11-28  3:57 ` [PATCH 7/9] lib/cpumask: add num_{possible, present, active}_cpus_{eq, gt, le} Yury Norov
2021-11-28  3:57   ` Yury Norov
2021-11-28  3:57   ` [PATCH 7/9] lib/cpumask: add num_{possible,present,active}_cpus_{eq,gt,le} Yury Norov
2021-11-28  3:57   ` [PATCH 7/9] lib/cpumask: add num_{possible, present, active}_cpus_{eq, gt, le} Yury Norov
2021-11-28  4:56   ` [PATCH 7/9] lib/cpumask: add num_{possible,present,active}_cpus_{eq,gt,le} Michał Mirosław
2021-11-28  4:56     ` Michał Mirosław
2021-11-28  4:56     ` Michał Mirosław
2021-11-28  4:56     ` Michał Mirosław
2021-11-28  4:56     ` Michał Mirosław
2021-11-28  4:56     ` Michał Mirosław
2021-11-28  5:09     ` Michał Mirosław
2021-11-28  5:09       ` Michał Mirosław
2021-11-28  5:09       ` Michał Mirosław
2021-11-28  5:09       ` Michał Mirosław
2021-11-28  5:09       ` Michał Mirosław
2021-11-28  5:09       ` Michał Mirosław
2021-11-28  6:34     ` Yury Norov
2021-11-28  6:34       ` Yury Norov
2021-11-28  6:34       ` Yury Norov
2021-11-28  6:34       ` Yury Norov
2021-11-28  6:34       ` Yury Norov
2021-11-28  6:34       ` Yury Norov
2021-11-28 17:07   ` Joe Perches
2021-11-28 17:07   ` Joe Perches
2021-11-28 17:07     ` Joe Perches
2021-11-28 17:07     ` Joe Perches
2021-11-28 17:07     ` Joe Perches
2021-11-28 17:43     ` Yury Norov
2021-11-28 17:43       ` Yury Norov
2021-11-28 17:43       ` Yury Norov
2021-11-28 17:43       ` Yury Norov
2021-11-28 17:43       ` Yury Norov
2021-11-28 17:43       ` Yury Norov
2021-11-28 17:54       ` Dennis Zhou
2021-11-28 17:54         ` Dennis Zhou
2021-11-28 17:54         ` Dennis Zhou
2021-11-28 17:54         ` Dennis Zhou
2021-11-28 17:54         ` Dennis Zhou
2021-11-28 18:47         ` Yury Norov
2021-11-28 18:47           ` Yury Norov
2021-11-28 18:47           ` Yury Norov
2021-11-28 18:47           ` Yury Norov
2021-11-28 18:47           ` Yury Norov
2021-11-28 18:47           ` Yury Norov
2021-11-28 17:56       ` Emil Renner Berthing
2021-11-28 17:56         ` Emil Renner Berthing
2021-11-28 17:56         ` Emil Renner Berthing
2021-11-28 17:56         ` [PATCH 7/9] lib/cpumask: add num_{possible, present, active}_cpus_{eq, gt, le} Emil Renner Berthing
2021-11-28 17:56         ` Emil Renner Berthing
2021-11-28 17:56         ` Emil Renner Berthing
2021-11-28 17:57       ` [PATCH 7/9] lib/cpumask: add num_{possible,present,active}_cpus_{eq,gt,le} Joe Perches
2021-11-28 17:57         ` Joe Perches
2021-11-28 17:57         ` Joe Perches
2021-11-28 17:57         ` Joe Perches
2021-11-28 17:57         ` Joe Perches
2021-11-28 17:57         ` Joe Perches
2021-11-28  3:57 ` Yury Norov
2021-11-28  3:57 ` [PATCH 8/9] lib/nodemask: add num_node_state_eq() Yury Norov
2021-11-28  3:57 ` Yury Norov
2021-11-28  3:57   ` Yury Norov
2021-11-28  3:57   ` Yury Norov
2021-11-28  3:57   ` Yury Norov
2021-11-28  3:57 ` [PATCH 9/9] MAINTAINERS: add cpumask and nodemask files to BITMAP_API Yury Norov
2021-11-28  3:57   ` Yury Norov
2021-11-28  3:57   ` Yury Norov
2021-11-28  3:57   ` Yury Norov
2021-11-28  3:57 ` Yury Norov
2021-11-28 11:08 ` [PATCH 0/9] lib/bitmap: optimize bitmap_weight() usage Nicholas Piggin
2021-11-28 11:08   ` Nicholas Piggin
2021-11-28 11:08   ` Nicholas Piggin
2021-11-28 23:36   ` Yury Norov
2021-11-28 23:36     ` Yury Norov
2021-11-28 23:36     ` Yury Norov
2021-11-28 23:36     ` Yury Norov
2021-11-28 23:36     ` Yury Norov
2021-11-28 23:36     ` Yury Norov
2021-11-28 11:08 ` Nicholas Piggin
2021-11-28 18:03 ` mirq-test
2021-11-28 18:05   ` Michał Mirosław
2021-11-28 18:05   ` Michał Mirosław
2021-11-28 18:05   ` Michał Mirosław
2021-11-28 18:03   ` mirq-test
2021-11-28 18:03   ` mirq-test
2021-11-28 18:03   ` mirq-test
2021-11-28 18:03   ` mirq-test
2021-11-29  6:38   ` Yury Norov
2021-11-29  6:38     ` Yury Norov
2021-11-29  6:38     ` Yury Norov
2021-11-29  6:38     ` Yury Norov
2021-11-29  6:38     ` Yury Norov
2021-11-29  6:38     ` Yury Norov
2021-11-29 16:34     ` Michał Mirosław
2021-11-29 16:34       ` Michał Mirosław
2021-11-29 16:34       ` Michał Mirosław
2021-11-29 16:34       ` Michał Mirosław
2021-11-29 16:34       ` Michał Mirosław
2021-12-02  0:31       ` Yury Norov [this message]
2021-12-02  0:31         ` Yury Norov
2021-12-02  0:31         ` Yury Norov
2021-12-02  0:31         ` Yury Norov
2021-12-02  0:31         ` Yury Norov
2021-12-02  0:31         ` Yury Norov
2021-11-28  3:56 Yury Norov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20211202003140.GA430494@lapt \
    --to=yury.norov@gmail.com \
    --cc=David.Laight@aculab.com \
    --cc=Roy.Pledge@nxp.com \
    --cc=acme@kernel.org \
    --cc=agross@kernel.org \
    --cc=airlied@linux.ie \
    --cc=ak@linux.intel.com \
    --cc=aklimov@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=alexander.shishkin@linux.intel.com \
    --cc=amitkarwar@gmail.com \
    --cc=andrew@lunn.ch \
    --cc=andy@infradead.org \
    --cc=anup.patel@wdc.com \
    --cc=ardb@kernel.org \
    --cc=arnd@arndb.de \
    --cc=axboe@fb.com \
    --cc=bcm-kernel-feedback-list@broadcom.com \
    --cc=bp@alien8.de \
    --cc=catalin.marinas@arm.com \
    --cc=cl@linux.com \
    --cc=daniel@ffwll.ch \
    --cc=dave.hansen@linux.intel.com \
    --cc=dennis@kernel.org \
    --cc=dinguyen@kernel.org \
    --cc=gakula@marvell.com \
    --cc=geert@linux-m68k.org \
    --cc=gregkh@linuxfoundation.org \
    --cc=guoren@kernel.org \
    --cc=hca@linux.ibm.com \
    --cc=hch@lst.de \
    --cc=hdegoede@redhat.com \
    --cc=irogers@google.com \
    --cc=jason.wessel@windriver.com \
    --cc=jejb@linux.ibm.com \
    --cc=jic23@kernel.org \
    --cc=jolsa@redhat.com \
    --cc=juri.lelli@redhat.com \
    --cc=keescook@chromium.org \
    --cc=krzysztof.kozlowski@canonical.com \
    --cc=kuba@kernel.org \
    --cc=kvalo@codeaurora.org \
    --cc=kvm@vger.kernel.org \
    --cc=lee.jones@linaro.org \
    --cc=linux-alpha@vger.kernel.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-crypto@vger.kernel.org \
    --cc=linux-csky@vger.kernel.org \
    --cc=linux-ia64@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mips@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-perf-users@vger.kernel.org \
    --cc=linux-riscv@lists.infradead.org \
    --cc=linux-s390@vger.kernel.org \
    --cc=linux-snps-arc@lists.infradead.org \
    --cc=linux@armlinux.org.uk \
    --cc=linux@rasmusvillemoes.dk \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=luto@kernel.org \
    --cc=mark.rutland@arm.com \
    --cc=markgross@kernel.org \
    --cc=martin.petersen@oracle.com \
    --cc=maz@kernel.org \
    --cc=mazziesaccount@gmail.com \
    --cc=mchehab@kernel.org \
    --cc=mgorman@suse.de \
    --cc=mike.marciniszyn@cornelisnetworks.com \
    --cc=mingo@redhat.com \
    --cc=mirq-linux@rere.qmqm.pl \
    --cc=mpe@ellerman.id.au \
    --cc=mw@semihalf.com \
    --cc=npiggin@gmail.com \
    --cc=palmer@dabbelt.com \
    --cc=paulmck@kernel.org \
    --cc=peterz@infradead.org \
    --cc=pizza@shaftnet.org \
    --cc=pmladek@suse.com \
    --cc=rafael@kernel.org \
    --cc=rdunlap@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=saeedm@nvidia.com \
    --cc=sagi@grimberg.me \
    --cc=sbhatta@marvell.com \
    --cc=sboyd@kernel.org \
    --cc=senozhatsky@chromium.org \
    --cc=sfr@canb.auug.org.au \
    --cc=sgoutham@marvell.com \
    --cc=sudeep.holla@arm.com \
    --cc=tariqt@nvidia.com \
    --cc=tglx@linutronix.de \
    --cc=tj@kernel.org \
    --cc=tsbogend@alpha.franken.de \
    --cc=ulf.hansson@linaro.org \
    --cc=vbabka@suse.cz \
    --cc=vgupta@kernel.org \
    --cc=vincent.guittot@linaro.org \
    --cc=viresh.kumar@linaro.org \
    --cc=vivien.didelot@gmail.com \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.