linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Willy Tarreau <willy@w.ods.org>
To: Peter Kjellerstedt <peter.kjellerstedt@axis.com>
Cc: "'Timothy Miller'" <miller@techsource.com>,
	linux-kernel mailing list <linux-kernel@vger.kernel.org>
Subject: Re: generic strncpy - off-by-one error
Date: Thu, 14 Aug 2003 21:45:00 +0200	[thread overview]
Message-ID: <20030814194500.GA21146@alpha.home.local> (raw)
In-Reply-To: <D069C7355C6E314B85CF36761C40F9A42E20AB@mailse02.se.axis.com>

On Thu, Aug 14, 2003 at 11:34:50AM +0200, Peter Kjellerstedt wrote:
 
> char *strncpy(char *dest, const char *src, size_t count)
> {
> 	char *tmp = dest;
> 
> 	while (count) {
> 		if (*src == '\0')
> 			break;
> 
> 		*tmp++ = *src++;
> 		count--;
> 	}
> 
> 	while (count) {
> 		*tmp++ = '\0';
> 		count--;
> 	}
> 
> 	return dest;
> }
> 
> Moving the check for *src == '\0' into the first loop seems
> to let the compiler reuse the object code a little better
> (may depend on the optimizer). Also note that your version
> of the second loop needs an explicit  comparison with -1,
> whereas mine uses an implicit comparison with 0.

Well, if you're at this level of comparison, then I may object that
'count' is evaluated twice when jumping from one loop to the other one.

*Algorithmically* speaking, the most optimal code should be :

char *strncpy(char *dest, const char *src, size_t count)
{
        char *tmp = dest;
        if (unlikely(!count))
                goto end;
loop1:
        if ((*tmp = *src++) == '\0')
		goto next;
        tmp++;
        if (likely(--count))
		goto loop1;
	else
		goto end;
loop2:
        *tmp = '\0';
next:
        tmp++;
        if (likely(--count))
		goto loop2;
end:
        return dest;
}

(sorry for those who hate gotos). Look at it : no test is performed twice, no
byte is read nor written twice in any case. The assembly code produced is
optimal on x86 (well, in fact we could delete exactly one conditionnal jump
because the compiler doesn't know how to reuse them for several tests). 16
inlinable instructions (= excluding function in/out) which can be executed 2 at
a time if your CPU has 2 pipelines. about 3 cycles/copied byte, 2 cycles/filled
byte.

Unfortunately, it fools branch predictors and prefetch mechanisms found in
today's processors, so it results in slower code than yours (at least on
athlon and P3). Perhaps it would be fast on older processors, I don't know.

All that demonstrates that whatever your efforts in helping the optimizer, the
only meaningful result is the benchmark. Number of bytes and speculations on
the reusability of information between lines of codes are not what makes our
programs fast anymore :-(

> Testing on the CRIS architecture, your version is 24 instructions,
> whereas mine is 18. For comparison, Eric's one is 12 and the
> currently used implementation is 26 (when corrected for the
> off-by-one error by comparing with > 1 rather than != 0 in the
> second loop).

Just out of curiosity, can the CRIS architecture execute multiple instructions
per cycle ? have you tried to determine the dependencies between the
instructions ? Did you time the different functions ? (yours is rather fast on
x86 BTW).

To conclude, I would say that if we want to implement really fast low-level
primitives such as str*, mem*, ... there's nothing better than assembly
associated with benchmarks on different CPU architectures and models.

But I don't know if strncpy() is used enough to justify that...

Regards,
Willy

  reply	other threads:[~2003-08-14 19:45 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2003-08-14  9:34 generic strncpy - off-by-one error Peter Kjellerstedt
2003-08-14 19:45 ` Willy Tarreau [this message]
2003-08-14 20:24 ` Timothy Miller
  -- strict thread matches above, loose matches on Subject: below --
2003-08-20  7:43 Peter Kjellerstedt
2003-08-16 21:10 Peter Kjellerstedt
2003-08-18 18:41 ` Timothy Miller
2003-08-16 20:08 Peter Kjellerstedt
2003-08-16  9:19 Peter Kjellerstedt
2003-08-16 10:04 ` Daniel Forrest
2003-08-18 16:40   ` Timothy Miller
2003-08-16  8:15 Peter Kjellerstedt
2003-08-16  8:41 ` Daniel Forrest
2003-08-18 16:17   ` Timothy Miller
2003-08-18 16:06 ` Timothy Miller
2003-08-15  9:54 Peter Kjellerstedt
2003-08-15 17:52 ` Timothy Miller
2003-08-15  9:53 Peter Kjellerstedt
2003-08-15 17:47 ` Timothy Miller
2003-08-13  3:09 Anthony Truong
2003-08-13  2:18 Albert Cahalan
2003-08-13  2:47 ` Erik Andersen
2003-08-13  3:38   ` Albert Cahalan
2003-08-13  3:56     ` Nick Piggin
2003-08-13  5:18     ` Willy Tarreau
2003-08-13 19:03   ` Timothy Miller
2003-08-12 14:07 Yoshinori Sato
2003-08-12 14:39 ` Willy Tarreau
2003-08-12 14:50 ` Yoshinori Sato
2003-08-12 15:03   ` Valdis.Kletnieks
2003-08-12 15:54     ` William Gallafent
2003-08-12 16:19       ` Valdis.Kletnieks
2003-08-12 16:09     ` Andreas Schwab
2003-08-12  1:56 Anthony Truong
2003-08-12 17:14 ` Alan Cox
2003-08-12 21:53   ` Bernd Petrovitsch
2003-08-12  1:28 Anthony Truong
2003-08-12 16:24 ` Bernd Petrovitsch

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20030814194500.GA21146@alpha.home.local \
    --to=willy@w.ods.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=miller@techsource.com \
    --cc=peter.kjellerstedt@axis.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).