All of lore.kernel.org
 help / color / mirror / Atom feed
From: Sander Vanheule <sander@svanheule.net>
To: Peter Zijlstra <peterz@infradead.org>,
	Yury Norov <yury.norov@gmail.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Valentin Schneider <vschneid@redhat.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Marco Elver <elver@google.com>,
	Barry Song <song.bao.hua@hisilicon.com>
Cc: linux-kernel@vger.kernel.org,
	Andy Shevchenko <andriy.shevchenko@linux.intel.com>,
	Sander Vanheule <sander@svanheule.net>
Subject: [PATCH v1 0/2] cpumask: Fix invalid uniprocessor assumptions
Date: Thu,  2 Jun 2022 23:04:18 +0200	[thread overview]
Message-ID: <cover.1654201862.git.sander@svanheule.net> (raw)

On uniprocessor builds, it is currently assumed that any cpumask will
contain the single CPU: cpu0. This assumption is used to provide
optimised implementations.

The current assumption also appears to be wrong, by ignoring the fact
that users can provide empty cpumask-s. This can result in bugs as
explained in [1].

This series seeks to fix this assumption, allowing for masks that
contain at most cpu0, i.e. are "1" or "0".

[1] https://lore.kernel.org/all/20220530082552.46113-1-sander@svanheule.net/

Best,
Sander

---
To test these changes, I used the following code:

static void cpumask_test(const struct cpumask *mask)
{
        int cpu;

        if (cpumask_empty(mask))
                pr_info("testing empty mask\n");
        else
                pr_info("testing non-empty mask\n");

        pr_info("first cpu: %d\n", cpumask_first(mask));
        pr_info("first zero: %d\n", cpumask_first_zero(mask));
        pr_info("first and: %d\n", cpumask_first_and(mask, cpu_possible_mask));
        pr_info("last cpu: %d\n", cpumask_last(mask));
        pr_info("next cpu at -1: %d\n", cpumask_next(-1, mask));
        pr_info("next cpu at 0: %d\n", cpumask_next(0, mask));
        pr_info("next zero at -1: %d\n", cpumask_next_zero(-1, mask));
        pr_info("next zero at 0: %d\n", cpumask_next_zero(0, mask));
        pr_info("next and at -1: %d\n", cpumask_next_and(-1, mask, cpu_possible_mask));
        pr_info("next and at 0: %d\n", cpumask_next_and(0, mask, cpu_possible_mask));
        pr_info("next wrap at -1,false: %d\n", cpumask_next_wrap(-1, mask, 0, false));
        pr_info("next wrap at 0,false: %d\n", cpumask_next_wrap(-1, mask, 0, false));
        pr_info("next wrap at -1,true: %d\n", cpumask_next_wrap(-1, mask, 0, true));
        pr_info("next wrap at 0,true: %d\n", cpumask_next_wrap(-1, mask, 0, true));

        for_each_cpu(cpu, mask)
                pr_info("for each: %d\n", cpu);

        for_each_cpu_not(cpu, mask)
                pr_info("for each not: %d\n", cpu);

        for_each_cpu_wrap(cpu, mask, 0)
                pr_info("for each wrap: %d\n", cpu);

        for_each_cpu_and(cpu, mask, cpu_possible_mask)
                pr_info("for each and: %d\n", cpu);
}

static void run_tests()
{
        cpumask_clear(&empty_mask);
        cpumask_test(&empty_mask);
        cpumask_test(cpu_possible_mask);
}

On an unpatched build, with NR_CPUS=1:
    [...] testing empty mask
    [...] first cpu: 0
    [...] first zero: 0
    [...] first and: 0
    [...] last cpu: 0
    [...] next cpu at -1: 0
    [...] next cpu at 0: 1
    [...] next zero at -1: 0
    [...] next zero at 0: 1
    [...] next and at -1: 0
    [...] next and at 0: 1
    [...] next wrap at -1,false: 0
    [...] next wrap at 0,false: 0
    [...] next wrap at -1,true: 0
    [...] next wrap at 0,true: 0
    [...] for each: 0
    [...] for each not: 0
    [...] for each wrap: 0
    [...] for each and: 0
    [...] testing non-empty mask
    [...] first cpu: 0
    [...] first zero: 0
    [...] first and: 0
    [...] last cpu: 0
    [...] next cpu at -1: 0
    [...] next cpu at 0: 1
    [...] next zero at -1: 0
    [...] next zero at 0: 1
    [...] next and at -1: 0
    [...] next and at 0: 1
    [...] next wrap at -1,false: 0
    [...] next wrap at 0,false: 0
    [...] next wrap at -1,true: 0
    [...] next wrap at 0,true: 0
    [...] for each: 0
    [...] for each not: 0
    [...] for each wrap: 0
    [...] for each and: 0

On a patched build, with NR_CPUS=1:
    [...] testing empty mask
    [...] first cpu: 1
    [...] first zero: 0
    [...] first and: 1
    [...] last cpu: 1
    [...] next cpu at -1: 1
    [...] next cpu at 0: 1
    [...] next zero at -1: 0
    [...] next zero at 0: 1
    [...] next and at -1: 1
    [...] next and at 0: 1
    [...] next wrap at -1,false: 1
    [...] next wrap at 0,false: 1
    [...] next wrap at -1,true: 1
    [...] next wrap at 0,true: 1
    [...] for each not: 0
    [...] testing non-empty mask
    [...] first cpu: 0
    [...] first zero: 1
    [...] first and: 0
    [...] last cpu: 0
    [...] next cpu at -1: 0
    [...] next cpu at 0: 1
    [...] next zero at -1: 1
    [...] next zero at 0: 1
    [...] next and at -1: 0
    [...] next and at 0: 1
    [...] next wrap at -1,false: 0
    [...] next wrap at 0,false: 0
    [...] next wrap at -1,true: 0
    [...] next wrap at 0,true: 0
    [...] for each: 0
    [...] for each wrap: 0

For reference, the generic implementation with NR_CPUS=2, CONFIG_SMP=y
    [...] testing empty mask
    [...] first cpu: 2
    [...] first zero: 0
    [...] first and: 2
    [...] last cpu: 2
    [...] next cpu at -1: 2
    [...] next cpu at 0: 2
    [...] next zero at -1: 0
    [...] next zero at 0: 1
    [...] next and at -1: 2
    [...] next and at 0: 2
    [...] next wrap at -1,false: 2
    [...] next wrap at 0,false: 2
    [...] next wrap at -1,true: 2
    [...] next wrap at 0,true: 2
    [...] for each not: 0
    [...] testing non-empty mask
    [...] first cpu: 0
    [...] first zero: 1
    [...] first and: 0
    [...] last cpu: 0
    [...] next cpu at -1: 0
    [...] next cpu at 0: 2
    [...] next zero at -1: 1
    [...] next zero at 0: 1
    [...] next and at -1: 0
    [...] next and at 0: 2
    [...] next wrap at -1,false: 0
    [...] next wrap at 0,false: 0
    [...] next wrap at -1,true: 2
    [...] next wrap at 0,true: 2
    [...] for each: 0
    [...] for each wrap: 0
    [...] for each and: 0

Sander Vanheule (2):
  cpumask: Fix invalid uniprocessor mask assumption
  cpumask: Add UP optimised for_each_*_cpu loops

 include/linux/cpumask.h | 42 +++++++++++++++++++++--------------------
 1 file changed, 22 insertions(+), 20 deletions(-)

-- 
2.36.1


             reply	other threads:[~2022-06-02 21:04 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-06-02 21:04 Sander Vanheule [this message]
2022-06-02 21:04 ` [PATCH v1 1/2] cpumask: Fix invalid uniprocessor mask assumption Sander Vanheule
2022-06-02 22:48   ` Yury Norov
2022-06-03 15:13     ` Sander Vanheule
2022-06-02 21:04 ` [PATCH v1 2/2] cpumask: Add UP optimised for_each_*_cpu loops Sander Vanheule

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=cover.1654201862.git.sander@svanheule.net \
    --to=sander@svanheule.net \
    --cc=akpm@linux-foundation.org \
    --cc=andriy.shevchenko@linux.intel.com \
    --cc=elver@google.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=peterz@infradead.org \
    --cc=song.bao.hua@hisilicon.com \
    --cc=tglx@linutronix.de \
    --cc=vschneid@redhat.com \
    --cc=yury.norov@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.