From: Michael Sinz <msinz@wgate.com>
To: Daniel Egger <degger@fhm.edu>
Cc: Rusty Russell <rusty@rustcorp.com.au>,
	torvalds@transmeta.com, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] streq()
Date: Wed, 25 Sep 2002 08:19:16 -0400
Message-ID: <3D91A9C4.2030409@wgate.com>
In-Reply-To: 1032953252.18004.18.camel@sonja.de.interearth.com

Daniel Egger wrote:
> On Tue, 2002-09-24 at 06:49, Rusty Russell wrote:
> 
> 
>>Embarrassing, huh?  But I just found a bug in my code caused by
>>"if (strcmp(a,b))" instead of "if (!strcmp(a,b))".
> 
> 
>>diff -urpN --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal linux-2.5.38/include/linux/string.h working-2.5.38-streq/include/linux/string.h
>>--- linux-2.5.38/include/linux/string.h	2002-06-06 21:38:40.000000000 +1000
>>+++ working-2.5.38-streq/include/linux/string.h	2002-09-24 14:43:30.000000000 +1000
>>@@ -15,7 +15,7 @@ extern "C" {
>> extern char * strpbrk(const char *,const char *);
>> extern char * strsep(char **,const char *);
>> extern __kernel_size_t strspn(const char *,const char *);
>>-
>>+#define streq(a,b) (strcmp((a),(b)) == 0)
> 
> 
> Considering most compares will only care for equality/non-equality and
> not about the type of inequality a strcmp usually returns, wouldn't it
> be wiser and faster to use an approach like memcpy for comparison
> instead of that stupid compare each character approach?
> 
> Something along the lines of:
> Start comparing by granules with the biggest type the architecture has
> to offer which will fit into the length of the string and going down
> to the size of 1 char bailing out as soon as the granules don't match.

I see two problems with this.  First, strings are not, and cannot be
assumed to be, on nice word boundaries.  While x86 and many other CPUs
can read words (longs, etc.) at non-natural boundaries, they pay for it
in performance.  Other architectures cannot do such reads at all and
raise a trap/exception that *may* be handled in software.  (Or may
not.)
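
To make the alignment point concrete, here is a minimal sketch in C.
The helper names is_word_aligned() and load_word() are made up for
illustration and are not kernel APIs.  The memcpy()-based load is one
common portable way to read a word from an arbitrary address; whether
it compiles to a fast unaligned load or to byte accesses depends on
the architecture.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: nonzero if p sits on a natural boundary
 * for an unsigned long. */
static int is_word_aligned(const void *p)
{
    return ((uintptr_t)p % sizeof(unsigned long)) == 0;
}

/* Hypothetical helper: read a word from any address.  Going through
 * memcpy() avoids dereferencing a misaligned pointer, which could
 * trap on architectures without unaligned access support. */
static unsigned long load_word(const void *p)
{
    unsigned long w;

    memcpy(&w, p, sizeof w);
    return w;
}
```

A word-at-a-time compare would need a check like is_word_aligned() on
both strings before it could use plain word loads, and two strings
rarely share the same misalignment.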

Second, unless you know for sure how long the string buffer is, you
cannot easily step through the string in larger chunks.  (Reading even
one byte beyond the end of the string may hurt, since that byte may be
on another page, etc.)
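
As a rough sketch of the page problem: the check below reports whether
a read of a given width starting at a given address would run into the
next page.  ASSUMED_PAGE_SIZE and read_crosses_page() are names
invented for this example, and 4096 is only an assumption; the real
page size depends on the architecture and configuration.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption for illustration only; real page sizes vary. */
#define ASSUMED_PAGE_SIZE 4096u

/* Hypothetical helper: nonzero if reading `width` bytes starting at
 * address `addr` would touch the following page. */
static int read_crosses_page(uintptr_t addr, size_t width)
{
    uintptr_t offset = addr % ASSUMED_PAGE_SIZE;

    return offset + width > ASSUMED_PAGE_SIZE;
}
```

If the string's terminating '\0' is the last byte of a mapped page and
the next page is not mapped, a wide read that crosses the boundary can
fault even though every byte of the string itself is readable.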

There is also the added overhead of noticing the '\0' byte at the
end of the string, since it may be in the middle of your 32-bit
(or 64-bit) data value.
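
For reference, the usual way to spot a zero byte inside a 32-bit value
is the well-known bit trick sketched below.  The function name
has_zero_byte32() is made up for this example.  It is cheap, but it is
still extra work on every word compared with a plain memory compare.

```c
#include <stdint.h>

/* Nonzero if any of the four bytes in v is 0x00.
 * Subtracting 1 from each byte sets that byte's high bit only if the
 * byte was zero (or its high bit was already set); masking with ~v
 * drops the bytes whose high bit was already set. */
static int has_zero_byte32(uint32_t v)
{
    return ((v - UINT32_C(0x01010101)) & ~v & UINT32_C(0x80808080)) != 0;
}
```

A word-wise string compare would have to run a test like this on each
word it loads before it could trust the comparison of that word.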

Given all of this, it is rather unlikely that for strings it is worth
doing anything different than the (usually) highly tuned assembly of
the strcmp() code.

(memcpy() has, by its nature and API, at least the size, which can
significantly help, along with architecture knowledge that it can
then use to pick the right)

A side note:  While memcmp() could use the same logic as memcpy(),
the greater/less-than result is specifically documented at the
byte level.  This was a real PITA to make fast on Alpha CPUs,
especially the earlier ones that did not even have native byte
operations.  (Alpha was designed with UNICODE characters in mind
and performance as its top concern.)
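
A small sketch of why the byte-level ordering matters: comparing whole
words only agrees with memcmp() if the first byte in memory is the
most significant one.  All helper names below are invented for this
example; the loads are assembled by hand so the code behaves the same
on any host.

```c
#include <stdint.h>
#include <string.h>

/* Assemble four bytes most-significant-first, so that integer order
 * matches byte-by-byte order. */
static uint32_t load_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Assemble four bytes least-significant-first, as a plain word load
 * on a little-endian machine would. */
static uint32_t load_le32(const unsigned char *p)
{
    return ((uint32_t)p[3] << 24) | ((uint32_t)p[2] << 16) |
           ((uint32_t)p[1] << 8)  |  (uint32_t)p[0];
}

/* Reduce any comparison result to -1, 0 or 1. */
static int sign(int v)
{
    return (v > 0) - (v < 0);
}

static int cmp_u32(uint32_t a, uint32_t b)
{
    return (a > b) - (a < b);
}

/* Nonzero if a big-endian word compare gives memcmp()'s answer. */
static int be_compare_matches_memcmp(const unsigned char *a,
                                     const unsigned char *b)
{
    return sign(memcmp(a, b, 4)) == cmp_u32(load_be32(a), load_be32(b));
}

/* Nonzero if a little-endian word compare gives memcmp()'s answer. */
static int le_compare_matches_memcmp(const unsigned char *a,
                                     const unsigned char *b)
{
    return sign(memcmp(a, b, 4)) == cmp_u32(load_le32(a), load_le32(b));
}
```

For the byte sequences {1, 0, 0, 2} and {2, 0, 0, 1}, memcmp() orders
the first one lower, and so does the big-endian word compare, while the
little-endian word compare gives the opposite sign.  An equality-only
test does not have this problem.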

-- 
Michael Sinz -- Director, Systems Engineering -- Worldgate Communications
A master's secrets are only as good as
	the master's ability to explain them to others.



