All of lore.kernel.org
 help / color / mirror / Atom feed
From: Karl Nasrallah <knnspeed@aol.com>
To: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>,
	Rich Felker <dalias@libc.org>,
	Linux-SH <linux-sh@vger.kernel.org>,
	Linux-Renesas <linux-renesas-soc@vger.kernel.org>,
	Geert Uytterhoeven <geert+renesas@glider.be>,
	Karl Nasrallah <knnspeed@aol.com>
Subject: Re: [PATCH v2 0/3] sh: fixup strncpy()
Date: Sat, 21 Dec 2019 02:35:31 +0000	[thread overview]
Message-ID: <20191221023531.GA31@knnspeed.com> (raw)
In-Reply-To: <s59o8w3qa0l.wl-ysato@users.sourceforge.jp>

On Sat, Dec 21, 2019 at 12:42:02AM +0900, Yoshinori Sato wrote:
> On Thu, 19 Dec 2019 10:20:46 +0900,
> Kuninori Morimoto wrote:
> > 
> > 
> > Hi Sato-san, Rich
> > 
> > These are strncpy() fixup patches, but using different solutions.
> > Karl's patches are updating current strncpy(), but using 2 patterns.
> > Kuninori's patch is using generic strncpy().
> > 
> > We don't know which is the best, but we can follow
> > SH maintainer's selection.
> > 
> > Karl Nasrallah (2):
> >       sh: fixup strncpy() warning and add missing padding
> >       sh: fixup strncpy() warning and add missing padding
> > 
> > Kuninori Morimoto (4):
> >       sh: use generic strncpy()
> > 
> > Thank you for your help !!
> > Best regards
> > ---
> > Kuninori Morimoto
> 
> I think the generic version is better, but I want to decide after comparing what code is generated.
> If it is not very inefficient, I would like to make it a generic version.
> 
> --
> Yoshinori Sato

Hello,

I did some testing using the following pure C version:

static inline char *strncpy(char *__dest, const char *__src, size_t __n)
{
	register char * __xdest = __dest;
	const char * __dest_end = __dest + __n - 1;

	if (__n = 0)
		return __xdest;

	while(__dest != __dest_end)
	{
		if(!(*__src))
		{
			*__dest++ = 0;
		}
		else
		{
			*__dest++ = *__src++;
		}
	}

	return __xdest;
}

This takes 8 instructions for the while loop using Os and O2 on SH4 
under GCC 9.2.0. I challenged myself to beat GCC and was only able to
do it in 8 in-loop instructions at best. On SH2A it should be possible
to hit closer to 6, and I think it may be possible to hit 7 on SH4, but
these are the kind of numbers we are looking at.

For reference, this is what I came up with, using only instructions
common to all SH cores:

static inline char *strncpy(char *__dest, const char *__src, size_t __n)
{
	register char * __xdest = __dest;
	const char * __dest_end = __dest + __n - 1;

	if (__n = 0)
		return __xdest;

	__asm__ __volatile__(
		"strncpy_start:\n\t"
			"mov.b @%[src]+,r0\n\t"
		"strncpy_loop:\n\t"
			"cmp/eq %[dest],%[dest_end]\n\t"
			"bt.s strncpy_end\n\t"
			" mov.b r0,@%[dest]\n\t"
			"cmp/eq #0,r0\n\t"
			"bt.s strncpy_loop\n\t"
			" add #1,%[dest]\n\t"
			"bf.s strncpy_loop\n\t"
			" mov.b @%[src]+,r0\n\t"
		"strncpy_end:\n\t"
		: [dest] "+r" (__dest), [src] "+r" (__src)
		: [dest_end] "r" (__dest_end)
		: "r0","t","memory");

	return retval;
}
(In maintaining the spirit of the original work, consider these pieces of
code public domain.)

I did also discover that the m68k and xtensa architectures have similar
assembly strncpy() implementations that do not add padding.

See arch/m68k/include/asm/string.h and arch/xtensa/include/asm/string.h.

Another oddity is that it does not appear that all online documentation
notes that strncpy() is supposed to add padding if the size parameter
exceeds the string size. The official man page of strncpy(3) states it
should, but some other sources make no mention of such behavior.

Hope this helps,
Karl Nasrallah

WARNING: multiple messages have this Message-ID (diff)
From: Karl Nasrallah <knnspeed@aol.com>
To: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>,
	Rich Felker <dalias@libc.org>,
	Linux-SH <linux-sh@vger.kernel.org>,
	Linux-Renesas <linux-renesas-soc@vger.kernel.org>,
	Geert Uytterhoeven <geert+renesas@glider.be>,
	Karl Nasrallah <knnspeed@aol.com>
Subject: Re: [PATCH v2 0/3] sh: fixup strncpy()
Date: Fri, 20 Dec 2019 21:35:31 -0500	[thread overview]
Message-ID: <20191221023531.GA31@knnspeed.com> (raw)
In-Reply-To: <s59o8w3qa0l.wl-ysato@users.sourceforge.jp>

On Sat, Dec 21, 2019 at 12:42:02AM +0900, Yoshinori Sato wrote:
> On Thu, 19 Dec 2019 10:20:46 +0900,
> Kuninori Morimoto wrote:
> > 
> > 
> > Hi Sato-san, Rich
> > 
> > These are strncpy() fixup patches, but using different solutions.
> > Karl's patches are updating current strncpy(), but using 2 patterns.
> > Kuninori's patch is using generic strncpy().
> > 
> > We don't know which is the best, but we can follow
> > SH maintainer's selection.
> > 
> > Karl Nasrallah (2):
> >       sh: fixup strncpy() warning and add missing padding
> >       sh: fixup strncpy() warning and add missing padding
> > 
> > Kuninori Morimoto (4):
> >       sh: use generic strncpy()
> > 
> > Thank you for your help !!
> > Best regards
> > ---
> > Kuninori Morimoto
> 
> I think the generic version is better, but I want to decide after comparing what code is generated.
> If it is not very inefficient, I would like to make it a generic version.
> 
> --
> Yoshinori Sato

Hello,

I did some testing using the following pure C version:

static inline char *strncpy(char *__dest, const char *__src, size_t __n)
{
	register char * __xdest = __dest;
	const char * __dest_end = __dest + __n - 1;

	if (__n == 0)
		return __xdest;

	while(__dest != __dest_end)
	{
		if(!(*__src))
		{
			*__dest++ = 0;
		}
		else
		{
			*__dest++ = *__src++;
		}
	}

	return __xdest;
}

This takes 8 instructions for the while loop using Os and O2 on SH4 
under GCC 9.2.0. I challenged myself to beat GCC and was only able to
do it in 8 in-loop instructions at best. On SH2A it should be possible
to hit closer to 6, and I think it may be possible to hit 7 on SH4, but
these are the kind of numbers we are looking at.

For reference, this is what I came up with, using only instructions
common to all SH cores:

static inline char *strncpy(char *__dest, const char *__src, size_t __n)
{
	register char * __xdest = __dest;
	const char * __dest_end = __dest + __n - 1;

	if (__n == 0)
		return __xdest;

	__asm__ __volatile__(
		"strncpy_start:\n\t"
			"mov.b @%[src]+,r0\n\t"
		"strncpy_loop:\n\t"
			"cmp/eq %[dest],%[dest_end]\n\t"
			"bt.s strncpy_end\n\t"
			" mov.b r0,@%[dest]\n\t"
			"cmp/eq #0,r0\n\t"
			"bt.s strncpy_loop\n\t"
			" add #1,%[dest]\n\t"
			"bf.s strncpy_loop\n\t"
			" mov.b @%[src]+,r0\n\t"
		"strncpy_end:\n\t"
		: [dest] "+r" (__dest), [src] "+r" (__src)
		: [dest_end] "r" (__dest_end)
		: "r0","t","memory");

	return retval;
}
(In maintaining the spirit of the original work, consider these pieces of
code public domain.)

I did also discover that the m68k and xtensa architectures have similar
assembly strncpy() implementations that do not add padding.

See arch/m68k/include/asm/string.h and arch/xtensa/include/asm/string.h.

Another oddity is that it does not appear that all online documentation
notes that strncpy() is supposed to add padding if the size parameter
exceeds the string size. The official man page of strncpy(3) states it
should, but some other sources make no mention of such behavior.

Hope this helps,
Karl Nasrallah

  reply	other threads:[~2019-12-21  2:35 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-12-19  1:20 [PATCH v2 0/3] sh: fixup strncpy() Kuninori Morimoto
2019-12-19  1:20 ` Kuninori Morimoto
2019-12-19  1:23 ` [PATCH v2 1/3] sh: fixup strncpy() warning and add missing padding Kuninori Morimoto
2019-12-19  1:23   ` Kuninori Morimoto
2019-12-19  1:24 ` [PATCH v2 2/3] " Kuninori Morimoto
2019-12-19  1:24   ` Kuninori Morimoto
2019-12-19  1:24 ` [PATCH 3/3] sh: use generic strncpy() Kuninori Morimoto
2019-12-19  1:24   ` Kuninori Morimoto
2019-12-20 15:42 ` [PATCH v2 0/3] sh: fixup strncpy() Yoshinori Sato
2019-12-20 15:42   ` Yoshinori Sato
2019-12-21  2:35   ` Karl Nasrallah [this message]
2019-12-21  2:35     ` Karl Nasrallah

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20191221023531.GA31@knnspeed.com \
    --to=knnspeed@aol.com \
    --cc=dalias@libc.org \
    --cc=geert+renesas@glider.be \
    --cc=kuninori.morimoto.gx@renesas.com \
    --cc=linux-renesas-soc@vger.kernel.org \
    --cc=linux-sh@vger.kernel.org \
    --cc=ysato@users.sourceforge.jp \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.