All of lore.kernel.org
 help / color / mirror / Atom feed
From: Linus Torvalds <torvalds@linux-foundation.org>
To: Yury Norov <yury.norov@gmail.com>
Cc: Mateusz Guzik <mjguzik@gmail.com>,
	Alexander Potapenko <glider@google.com>,
	Al Viro <viro@zeniv.linux.org.uk>,
	Kees Cook <keescook@chromium.org>,
	Eric Biggers <ebiggers@google.com>,
	Christian Brauner <brauner@kernel.org>,
	serge@hallyn.com, paul@paul-moore.com,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-security-module@vger.kernel.org
Subject: Re: [PATCH v3 2/2] vfs: avoid duplicating creds in faccessat if possible
Date: Sat, 4 Mar 2023 13:01:15 -0800	[thread overview]
Message-ID: <CAHk-=wh2U3a7AdvekB3uyAmH+NNk-CxN-NxGzQ=GZwjaEcM-tg@mail.gmail.com> (raw)
In-Reply-To: <ZAOvUuxJP7tAKc1e@yury-laptop>

[-- Attachment #1: Type: text/plain, Size: 1512 bytes --]

On Sat, Mar 4, 2023 at 12:51 PM Yury Norov <yury.norov@gmail.com> wrote:
>
> > That particular code sequence is arguably broken to begin with.
> > setall() should really only be used as a mask, most definitely not as
> > some kind of "all possible cpus".
>
> Sorry, don't understand this.

See the example patch I sent out.

Literally just make the rule be "we play games with cpumasks in that
they have two different 'sizes', so just make sure the bits in the
bigger and faster size are always clear".

That simple rule just means that we can then use that bigger constant
size in all cases where "upper bits zero" just don't matter.

Which is basically all of them.

Your for_each_cpu_not() example is actually a great example: it should
damn well not exist at all. I hadn't even noticed how broken it was.
Exactly like the other broken case (that I *did* notice -
cpumask_complement), it has no actual valid users. It _literally_ only
exists as a pointless test-case.

So this is *literally* what I'm talking about: you are making up silly
cases that then act as "arguments" for making all the _real_ cases
slower.

Stop it.

Silly useless cases are just that - silly and useless. They should not
be arguments for the real cases then being optimized and simplified.

Updated patch to remove 'for_each_cpu_not()' attached.

It's still completely untested. Treat this very much as a "Let's make
the common cases faster, at least for !MAXSMP".

                   Linus

[-- Attachment #2: patch.diff --]
[-- Type: text/x-patch, Size: 7705 bytes --]

 .clang-format           |  1 -
 arch/ia64/kernel/acpi.c |  4 +--
 include/linux/cpumask.h | 68 ++++++++++++++++++++++++++-----------------------
 lib/cpumask_kunit.c     | 12 ---------
 4 files changed, 37 insertions(+), 48 deletions(-)

diff --git a/.clang-format b/.clang-format
index 2c61b4553374..d988e9fa9b26 100644
--- a/.clang-format
+++ b/.clang-format
@@ -226,7 +226,6 @@ ForEachMacros:
   - 'for_each_console_srcu'
   - 'for_each_cpu'
   - 'for_each_cpu_and'
-  - 'for_each_cpu_not'
   - 'for_each_cpu_wrap'
   - 'for_each_dapm_widgets'
   - 'for_each_dedup_cand'
diff --git a/arch/ia64/kernel/acpi.c b/arch/ia64/kernel/acpi.c
index 96d13cb7c19f..15f6cfddcc08 100644
--- a/arch/ia64/kernel/acpi.c
+++ b/arch/ia64/kernel/acpi.c
@@ -783,11 +783,9 @@ __init void prefill_possible_map(void)
 
 static int _acpi_map_lsapic(acpi_handle handle, int physid, int *pcpu)
 {
-	cpumask_t tmp_map;
 	int cpu;
 
-	cpumask_complement(&tmp_map, cpu_present_mask);
-	cpu = cpumask_first(&tmp_map);
+	cpu = cpumask_first_zero(cpu_present_mask);
 	if (cpu >= nr_cpu_ids)
 		return -EINVAL;
 
diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
index 10c92bd9b807..c8bb032afa7d 100644
--- a/include/linux/cpumask.h
+++ b/include/linux/cpumask.h
@@ -50,8 +50,30 @@ static inline void set_nr_cpu_ids(unsigned int nr)
 #endif
 }
 
-/* Deprecated. Always use nr_cpu_ids. */
-#define nr_cpumask_bits	nr_cpu_ids
+/*
+ * The difference between nr_cpumask_bits and nr_cpu_ids is that
+ * 'nr_cpu_ids' is the actual number of CPU ids in the system, while
+ * nr_cpumask_bits is a "reasonable upper value" that is often more
+ * efficient because it can be a fixed constant.
+ *
+ * So when clearing or traversing a cpumask, use 'nr_cpumask_bits',
+ * but when checking exact limits (and when _setting_ bits), use the
+ * tighter exact limit of 'nr_cpu_ids'.
+ *
+ * NOTE! The code depends on any exyta bits in nr_cpumask_bits a always
+ * being (a) allocated and (b) zero, so that the only effect of using
+ * 'nr_cpumask_bits' is that we might return a higher maximum CPU value
+ * (which is why we have that pattern of
+ *
+ *   Returns >= nr_cpu_ids if no cpus set.
+ *
+ * for many of the functions - they can return that higher value).
+ */
+#ifdef CONFIG_CPUMASK_OFFSTACK
+ #define nr_cpumask_bits ((unsigned int)NR_CPUS)
+#else
+ #define nr_cpumask_bits	nr_cpu_ids
+#endif
 
 /*
  * The following particular system cpumasks and operations manage
@@ -114,7 +136,7 @@ static __always_inline void cpu_max_bits_warn(unsigned int cpu, unsigned int bit
 /* verify cpu argument to cpumask_* operators */
 static __always_inline unsigned int cpumask_check(unsigned int cpu)
 {
-	cpu_max_bits_warn(cpu, nr_cpumask_bits);
+	cpu_max_bits_warn(cpu, nr_cpu_ids);
 	return cpu;
 }
 
@@ -248,16 +270,6 @@ unsigned int cpumask_next_and(int n, const struct cpumask *src1p,
 #define for_each_cpu(cpu, mask)				\
 	for_each_set_bit(cpu, cpumask_bits(mask), nr_cpumask_bits)
 
-/**
- * for_each_cpu_not - iterate over every cpu in a complemented mask
- * @cpu: the (optionally unsigned) integer iterator
- * @mask: the cpumask pointer
- *
- * After the loop, cpu is >= nr_cpu_ids.
- */
-#define for_each_cpu_not(cpu, mask)				\
-	for_each_clear_bit(cpu, cpumask_bits(mask), nr_cpumask_bits)
-
 #if NR_CPUS == 1
 static inline
 unsigned int cpumask_next_wrap(int n, const struct cpumask *mask, int start, bool wrap)
@@ -495,10 +507,14 @@ static __always_inline bool cpumask_test_and_clear_cpu(int cpu, struct cpumask *
 /**
  * cpumask_setall - set all cpus (< nr_cpu_ids) in a cpumask
  * @dstp: the cpumask pointer
+ *
+ * Note: since we set bits, we should use the tighter 'bitmap_set()' with
+ * the eact number of bits, not 'bitmap_fill()' that will fill past the
+ * end.
  */
 static inline void cpumask_setall(struct cpumask *dstp)
 {
-	bitmap_fill(cpumask_bits(dstp), nr_cpumask_bits);
+	bitmap_set(cpumask_bits(dstp), 0, nr_cpu_ids);
 }
 
 /**
@@ -569,18 +585,6 @@ static inline bool cpumask_andnot(struct cpumask *dstp,
 					  cpumask_bits(src2p), nr_cpumask_bits);
 }
 
-/**
- * cpumask_complement - *dstp = ~*srcp
- * @dstp: the cpumask result
- * @srcp: the input to invert
- */
-static inline void cpumask_complement(struct cpumask *dstp,
-				      const struct cpumask *srcp)
-{
-	bitmap_complement(cpumask_bits(dstp), cpumask_bits(srcp),
-					      nr_cpumask_bits);
-}
-
 /**
  * cpumask_equal - *src1p == *src2p
  * @src1p: the first input
@@ -648,7 +652,7 @@ static inline bool cpumask_empty(const struct cpumask *srcp)
  */
 static inline bool cpumask_full(const struct cpumask *srcp)
 {
-	return bitmap_full(cpumask_bits(srcp), nr_cpumask_bits);
+	return bitmap_full(cpumask_bits(srcp), nr_cpu_ids);
 }
 
 /**
@@ -694,7 +698,7 @@ static inline void cpumask_shift_left(struct cpumask *dstp,
 				      const struct cpumask *srcp, int n)
 {
 	bitmap_shift_left(cpumask_bits(dstp), cpumask_bits(srcp), n,
-					      nr_cpumask_bits);
+					      nr_cpu_ids);
 }
 
 /**
@@ -742,7 +746,7 @@ static inline void cpumask_copy(struct cpumask *dstp,
 static inline int cpumask_parse_user(const char __user *buf, int len,
 				     struct cpumask *dstp)
 {
-	return bitmap_parse_user(buf, len, cpumask_bits(dstp), nr_cpumask_bits);
+	return bitmap_parse_user(buf, len, cpumask_bits(dstp), nr_cpu_ids);
 }
 
 /**
@@ -757,7 +761,7 @@ static inline int cpumask_parselist_user(const char __user *buf, int len,
 				     struct cpumask *dstp)
 {
 	return bitmap_parselist_user(buf, len, cpumask_bits(dstp),
-				     nr_cpumask_bits);
+				     nr_cpu_ids);
 }
 
 /**
@@ -769,7 +773,7 @@ static inline int cpumask_parselist_user(const char __user *buf, int len,
  */
 static inline int cpumask_parse(const char *buf, struct cpumask *dstp)
 {
-	return bitmap_parse(buf, UINT_MAX, cpumask_bits(dstp), nr_cpumask_bits);
+	return bitmap_parse(buf, UINT_MAX, cpumask_bits(dstp), nr_cpu_ids);
 }
 
 /**
@@ -781,7 +785,7 @@ static inline int cpumask_parse(const char *buf, struct cpumask *dstp)
  */
 static inline int cpulist_parse(const char *buf, struct cpumask *dstp)
 {
-	return bitmap_parselist(buf, cpumask_bits(dstp), nr_cpumask_bits);
+	return bitmap_parselist(buf, cpumask_bits(dstp), nr_cpu_ids);
 }
 
 /**
diff --git a/lib/cpumask_kunit.c b/lib/cpumask_kunit.c
index d1fc6ece21f3..ab798365b7dc 100644
--- a/lib/cpumask_kunit.c
+++ b/lib/cpumask_kunit.c
@@ -23,16 +23,6 @@
 		KUNIT_EXPECT_EQ_MSG((test), mask_weight, iter, MASK_MSG(mask));	\
 	} while (0)
 
-#define EXPECT_FOR_EACH_CPU_NOT_EQ(test, mask)					\
-	do {									\
-		const cpumask_t *m = (mask);					\
-		int mask_weight = cpumask_weight(m);				\
-		int cpu, iter = 0;						\
-		for_each_cpu_not(cpu, m)					\
-			iter++;							\
-		KUNIT_EXPECT_EQ_MSG((test), nr_cpu_ids - mask_weight, iter, MASK_MSG(mask));	\
-	} while (0)
-
 #define EXPECT_FOR_EACH_CPU_OP_EQ(test, op, mask1, mask2)			\
 	do {									\
 		const cpumask_t *m1 = (mask1);					\
@@ -113,14 +103,12 @@ static void test_cpumask_next(struct kunit *test)
 static void test_cpumask_iterators(struct kunit *test)
 {
 	EXPECT_FOR_EACH_CPU_EQ(test, &mask_empty);
-	EXPECT_FOR_EACH_CPU_NOT_EQ(test, &mask_empty);
 	EXPECT_FOR_EACH_CPU_WRAP_EQ(test, &mask_empty);
 	EXPECT_FOR_EACH_CPU_OP_EQ(test, and, &mask_empty, &mask_empty);
 	EXPECT_FOR_EACH_CPU_OP_EQ(test, and, cpu_possible_mask, &mask_empty);
 	EXPECT_FOR_EACH_CPU_OP_EQ(test, andnot, &mask_empty, &mask_empty);
 
 	EXPECT_FOR_EACH_CPU_EQ(test, cpu_possible_mask);
-	EXPECT_FOR_EACH_CPU_NOT_EQ(test, cpu_possible_mask);
 	EXPECT_FOR_EACH_CPU_WRAP_EQ(test, cpu_possible_mask);
 	EXPECT_FOR_EACH_CPU_OP_EQ(test, and, cpu_possible_mask, cpu_possible_mask);
 	EXPECT_FOR_EACH_CPU_OP_EQ(test, andnot, cpu_possible_mask, &mask_empty);

  reply	other threads:[~2023-03-04 21:01 UTC|newest]

Thread overview: 67+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-01-25 15:55 [PATCH v3 1/2] capability: add cap_isidentical Mateusz Guzik
2023-01-25 15:55 ` [PATCH v3 2/2] vfs: avoid duplicating creds in faccessat if possible Mateusz Guzik
2023-02-28  0:44   ` Linus Torvalds
2023-03-02  8:30     ` Christian Brauner
2023-03-02 17:51       ` Linus Torvalds
2023-03-02 18:14         ` Mateusz Guzik
2023-03-02 18:18           ` Al Viro
2023-03-02 18:22             ` Mateusz Guzik
2023-03-02 18:43               ` Al Viro
2023-03-02 18:51                 ` Mateusz Guzik
2023-03-02 19:02                 ` Al Viro
2023-03-02 19:18                   ` Al Viro
2023-03-02 19:03               ` Linus Torvalds
2023-03-02 19:10                 ` Linus Torvalds
2023-03-02 19:19                   ` Al Viro
2023-03-02 19:54                     ` Kees Cook
2023-03-02 20:11                       ` Al Viro
2023-03-03 15:30                         ` Alexander Potapenko
2023-03-03 17:39                           ` Mateusz Guzik
2023-03-03 17:54                             ` Linus Torvalds
2023-03-03 19:37                               ` Mateusz Guzik
2023-03-03 19:38                                 ` Mateusz Guzik
2023-03-03 20:08                                 ` Linus Torvalds
2023-03-03 20:39                                   ` Mateusz Guzik
2023-03-03 20:58                                     ` Linus Torvalds
2023-03-03 21:09                                       ` Mateusz Guzik
2023-03-04 19:01                                       ` Mateusz Guzik
2023-03-04 20:31                                         ` Mateusz Guzik
2023-03-04 20:48                                           ` Linus Torvalds
2023-03-05 17:23                                             ` David Laight
2023-03-04  1:29                                     ` Linus Torvalds
2023-03-04  3:25                                       ` Yury Norov
2023-03-04  3:42                                         ` Linus Torvalds
2023-03-04  5:51                                           ` Yury Norov
2023-03-04 16:41                                             ` David Vernet
2023-03-04 19:02                                             ` Linus Torvalds
2023-03-04 19:19                                             ` Linus Torvalds
2023-03-04 20:34                                               ` Linus Torvalds
2023-03-04 20:51                                               ` Yury Norov
2023-03-04 21:01                                                 ` Linus Torvalds [this message]
2023-03-04 21:03                                                   ` Linus Torvalds
2023-03-04 21:10                                                   ` Linus Torvalds
2023-03-04 23:08                                                     ` Linus Torvalds
2023-03-04 23:52                                                       ` Linus Torvalds
2023-03-05  9:26                                                       ` Sedat Dilek
2023-03-05 18:17                                                         ` Linus Torvalds
2023-03-05 18:43                                                           ` Linus Torvalds
2023-03-06  5:43                                                       ` Yury Norov
2023-03-04 20:18                                     ` Al Viro
2023-03-04 20:42                                       ` Mateusz Guzik
2023-03-02 19:38                   ` Kees Cook
2023-03-02 19:48                     ` Eric Biggers
2023-03-02 18:41             ` Al Viro
2023-03-03 14:49         ` Christian Brauner
2023-03-02 18:11       ` Al Viro
2023-03-03 14:27         ` Christian Brauner
2023-02-28  1:14 ` [PATCH v3 1/2] capability: add cap_isidentical Linus Torvalds
2023-02-28  2:46   ` Casey Schaufler
2023-02-28 14:47     ` Mateusz Guzik
2023-02-28 19:39       ` Linus Torvalds
2023-02-28 19:51         ` Linus Torvalds
2023-02-28 20:48         ` Linus Torvalds
2023-02-28 21:21           ` Mateusz Guzik
2023-02-28 21:29             ` Linus Torvalds
2023-03-01 18:13               ` Linus Torvalds
2023-02-28 17:32     ` Serge E. Hallyn
2023-02-28 17:52       ` Casey Schaufler

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAHk-=wh2U3a7AdvekB3uyAmH+NNk-CxN-NxGzQ=GZwjaEcM-tg@mail.gmail.com' \
    --to=torvalds@linux-foundation.org \
    --cc=brauner@kernel.org \
    --cc=ebiggers@google.com \
    --cc=glider@google.com \
    --cc=keescook@chromium.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-security-module@vger.kernel.org \
    --cc=mjguzik@gmail.com \
    --cc=paul@paul-moore.com \
    --cc=serge@hallyn.com \
    --cc=viro@zeniv.linux.org.uk \
    --cc=yury.norov@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.