All of lore.kernel.org
 help / color / mirror / Atom feed
From: lkml@sdf.org
To: andriy.shevchenko@linux.intel.com, lkml@sdf.org
Cc: akpm@linux-foundation.org, daniel.wagner@siemens.com,
	dchinner@redhat.com, don.mullis@gmail.com, geert@linux-m68k.org,
	linux-kernel@vger.kernel.org, linux@rasmusvillemoes.dk,
	st5pub@yandex.ru
Subject: Re: [PATCH 1/5] lib/sort: Make swap functions more generic
Date: Sat, 9 Mar 2019 15:53:41 GMT	[thread overview]
Message-ID: <201903091553.x29FrfMR018600@sdf.org> (raw)
In-Reply-To: <20190309140653.GO9224@smile.fi.intel.com>

Thnk you for the feedback!

Andy Shevchenko wrote:
> On Thu, Feb 21, 2019 at 06:30:28AM +0000, George Spelvin wrote:
>> +#ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
> 
> Why #ifdef is better than if (IS_ENABLED()) ?

Because CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS is bool and not
tristate.  IS_ENABLED tests for 'y' or 'm' but we don't need it
for something that's only on or off.

Looking through the kernel, I see both, but #ifdef or #if defined()
are definitely in the majority:

lib/lzo/lzo1x_compress.c:#if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS) && defined(LZO_USE_CTZ64)
lib/siphash.c:#ifndef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
lib/string.c:#ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
lib/strncpy_from_user.c:#ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
lib/zlib_inflate/inffast.c:#ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS

I see a few IS_ENABLED uses in include/crypto/ and kernel/bpf/.

It makes no real difference; #ifdef is simpler to me.

>> +	(void)base;
> 
> I understand the possible weirdness of the case, but 0 pointer is also good, no?

I don't quite understand the question.  A NULL pointer is aligned
as far as alignment_ok is concerned.  In other words, word accesses to
*NULL will (not) work just as well as byte accesses.

None of this matters because the callers never pass in NULL pointers.

>> +#else
>> +	lsbits |= (unsigned int)(size_t)base;
> 
> Perhaps you meant simple (uintptr_t) here and maybe above as well.

No, I meant to truncate it to the same type as "align".  We only
care about the low few bits; I could have used (unsigned char) just
as well.

My reason or choosing unsigned int rather than unsigned char was
that both x86 and ARM are a tiny bit more efficient at 32-bit
operations (x86 avoids a REX prefix; ARM gates off the high half
of the ALU to save power), but only x86 does byte operations
natively.

Still, compilers know how to promote to word operations, so
maybe using unsigned char would make it clearer to the
reader that "yes, I meant to do that!".

I've changed it to:

static bool __attribute_const__
is_aligned(const void *base, size_t size, unsigned char align)
{
	unsigned char lsbits = (unsigned char)size;
#ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
	(void)base;
#else
	lsbits |= (unsigned char)(uintptr_t)base;
#endif
	return (lsbits & (align - 1)) == 0;
}

Although I'm thinking of:

static bool __attribute_const__
is_aligned(const void *base, size_t size, unsigned char align)
{
	unsigned char lsbits = (unsigned char)size;

	(void)base;
#ifndef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
	lsbits |= (unsigned char)(uintptr_t)base;
#endif
	return (lsbits & (align - 1)) == 0;
}

Any preference?

>> +#endif
>> +	lsbits &= align - 1;
>> +	return lsbits == 0;
> 
> and this can be one return operation.

It was just to avoid some annoying parentheses because C's precendence
rules and GCC's warnings are fighting me here.  If others prefer
"return (lsbits & (align - 1)) == 0;" I'm happy to change.

>>  static void u32_swap(void *a, void *b, int size)
> 
> For such primitives that operates on top of an arrays we usually append 's' to
> the name. Currently the name is misleading.
> 
> Perhaps u32s_swap().

I don't worry much about the naming of static helper functions.
If they were exported, it would be a whole lot more important!

I find "u32s" confusing; I keep reading the "s" as "signed" rather
than a plural.

How about one of:
swap_bytes / swap_ints / swap_longs
swap_1 / swap_4 / swap_8

> Shouldn't simple memcpy cover these case for both 32- and 64-bit architectures?

This isn't a memcpy, it's a memory *swap*.  To do it with memcpy
requires:
	memcpy(temp_buffer, a, size);
	memcpy(a, b, size);
	memcpy(b, temp_buffer, size);

This is 1.5x as much memory access, and you have to find a large
enough temp_buffer.  (I didn't think a variable-length array on
the stack would make people happy.)

Also, although it is a predictable branch, memcpy() has to check the
alignment of its inputs each call.  The reason for these helpers is
to factor that out.

  parent reply	other threads:[~2019-03-09 15:55 UTC|newest]

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-03-09  2:17 [PATCH 0/5] lib/sort & lib/list_sort: faster and smaller George Spelvin
2019-02-21  6:30 ` [PATCH 1/5] lib/sort: Make swap functions more generic George Spelvin
     [not found]   ` <20190309140653.GO9224@smile.fi.intel.com>
2019-03-09 15:53     ` lkml [this message]
2019-03-09 20:19       ` Andrey Abramov
2019-03-14  9:29       ` Andy Shevchenko
2019-03-14 10:09         ` George Spelvin
2019-03-14 10:41           ` Geert Uytterhoeven
2019-03-14 11:53             ` George Spelvin
2019-03-14 12:18               ` Andy Shevchenko
2019-03-14 19:59                 ` Andrey Abramov
2019-03-15  3:35                   ` George Spelvin
2019-03-15  8:27                     ` Geert Uytterhoeven
2019-03-14 10:11         ` George Spelvin
2019-03-09 21:02     ` George Spelvin
2019-03-13 21:23   ` Rasmus Villemoes
2019-03-13 22:02     ` Geert Uytterhoeven
2019-03-13 23:15     ` George Spelvin
2019-02-21  8:21 ` [PATCH 2/5] lib/sort: Use more efficient bottom-up heapsort variant George Spelvin
2019-03-13 22:29   ` Rasmus Villemoes
2019-03-14  0:03     ` George Spelvin
2019-03-14  0:15       ` Rasmus Villemoes
2019-02-21  8:21 ` [PATCH 3/5] lib/sort: Avoid indirect calls to built-in swap George Spelvin
2019-03-05  3:06 ` [PATCH 4/5] lib/list_sort: Simplify and remove MAX_LIST_LENGTH_BITS George Spelvin
2019-03-10 21:54   ` Rasmus Villemoes
2019-03-10 22:29     ` George Spelvin
2019-03-14  9:10   ` Andy Shevchenko
2019-03-14  9:41     ` George Spelvin
2019-03-15  4:33     ` George Spelvin
2019-03-15  8:20       ` Geert Uytterhoeven
2019-03-15 10:23         ` George Spelvin
2019-03-15 12:57           ` Geert Uytterhoeven
2019-03-15 16:59             ` George Spelvin
2019-03-15 17:47               ` Geert Uytterhoeven
2019-03-15 18:53                 ` Andrey Abramov
2019-03-15 19:06                   ` Andy Shevchenko
2019-03-15 19:23                     ` Andrey Abramov
2019-03-15 19:56                       ` Andy Shevchenko
2019-03-16  3:49                         ` George Spelvin
2019-03-05  5:58 ` [PATCH 5/5] lib/list_sort: Optimize number of calls to comparison function George Spelvin
2019-03-13 23:28   ` Rasmus Villemoes
2019-03-14  1:58     ` George Spelvin
2019-06-21 23:12       ` Rasmus Villemoes
2019-12-08  8:01         ` George Spelvin
2019-03-15 19:54 ` [PATCH 0/5] lib/sort & lib/list_sort: faster and smaller Andrey Abramov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=201903091553.x29FrfMR018600@sdf.org \
    --to=lkml@sdf.org \
    --cc=akpm@linux-foundation.org \
    --cc=andriy.shevchenko@linux.intel.com \
    --cc=daniel.wagner@siemens.com \
    --cc=dchinner@redhat.com \
    --cc=don.mullis@gmail.com \
    --cc=geert@linux-m68k.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux@rasmusvillemoes.dk \
    --cc=st5pub@yandex.ru \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.