From: Ingo Molnar <mingo@elte.hu>
To: "H. Peter Anvin" <hpa@zytor.com>
Cc: Pavel Machek <pavel@ucw.cz>, "Ma, Ling" <ling.ma@intel.com>,
	Ingo Molnar <mingo@redhat.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	linux-kernel <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH RFC] [X86] performance improvement for memcpy_64.S by fast string.
Date: Fri, 13 Nov 2009 09:10:37 +0100
Message-ID: <20091113081037.GB18054@elte.hu>
In-Reply-To: <4AFD1326.506@zytor.com>


* H. Peter Anvin <hpa@zytor.com> wrote:

> On 11/12/2009 11:33 PM, Ingo Molnar wrote:
> > 
> > * Pavel Machek <pavel@ucw.cz> wrote:
> > 
> >>> Ling, if you are interested, could you send a user-space test-app to 
> >>> this thread that everyone could just compile and run on various older 
> >>> boxes, to gather a performance profile of hand-coded versus string ops 
> >>> performance?
> >>>
> >>> ( And i think we can make a judgement based on cache-hot performance
> >>>   alone - if then the strings ops will perform comparatively better in
> >>>   cache-cold scenarios, so the cache-hot numbers would be a conservative
> >>>   estimate. )
> >>
> >> Ugh, really? I'd expect cache-cold performance to be not helped at all 
> >> (memory bandwidth limit) and you'll get slow down from additional 
> >> i-cache misses...
> > 
> > That's my point - the new code is shorter, which will run comparatively 
> > faster in a cache-cold environment.
> > 
> 
> memcpy_c by itself is by far the shortest variant, of course.

yep. The argument i made was when a long function was compared to a 
short one. As you noted we don't actually enable the long function all 
that often - which inverts the same argument.

> The question is if it makes sense to use the long variants for short 
> (< 1024 bytes) copies.

I'd say not - the kernel executes in an icache-cold environment most of 
the time (as user-space is far more cache-intensive in the majority of 
workloads and kernel processing starts with a cold icache), so 
optimizing the kernel for code size is very important. (but numbers from 
real workloads can convince me of the opposite.)

	Ingo
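
The user-space test-app requested earlier in the thread is not part of
this message. A minimal sketch of what such a comparison could look like
follows, assuming an x86-64 machine and GCC or Clang. The names
copy_string_op, copy_unrolled, ns_per_copy and run_comparison are
hypothetical, and neither routine is the kernel's actual memcpy_64.S
code. It times cache-hot copies only, so it says nothing about the
icache-cold effects discussed above.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

#define BUF_MAX 4096

typedef void *(*copy_fn)(void *, const void *, size_t);

/* Short variant: string ops, 8 bytes at a time, then the byte tail. */
static void *copy_string_op(void *dst, const void *src, size_t n)
{
#if defined(__x86_64__) && defined(__GNUC__)
	void *d = dst;
	size_t qwords = n >> 3, tail = n & 7;

	__asm__ volatile("rep movsq"
			 : "+D"(d), "+S"(src), "+c"(qwords) : : "memory");
	__asm__ volatile("rep movsb"
			 : "+D"(d), "+S"(src), "+c"(tail) : : "memory");
	return dst;
#else
	return memcpy(dst, src, n);	/* assumption: fallback elsewhere */
#endif
}

/* Long variant: hand-unrolled loop moving 32 bytes per iteration. */
static void *copy_unrolled(void *dst, const void *src, size_t n)
{
	unsigned char *d = dst;
	const unsigned char *s = src;

	while (n >= 32) {
		uint64_t a, b, c, e;

		memcpy(&a, s, 8);
		memcpy(&b, s + 8, 8);
		memcpy(&c, s + 16, 8);
		memcpy(&e, s + 24, 8);
		memcpy(d, &a, 8);
		memcpy(d + 8, &b, 8);
		memcpy(d + 16, &c, 8);
		memcpy(d + 24, &e, 8);
		s += 32;
		d += 32;
		n -= 32;
	}
	while (n--)
		*d++ = *s++;
	return dst;
}

/* Average nanoseconds per call over iters back-to-back (cache-hot) copies. */
static double ns_per_copy(copy_fn fn, void *dst, const void *src,
			  size_t n, long iters)
{
	copy_fn volatile f = fn;	/* keeps the calls from being elided */
	struct timespec t0, t1;
	long i;

	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (i = 0; i < iters; i++)
		f(dst, src, n);
	clock_gettime(CLOCK_MONOTONIC, &t1);
	return ((t1.tv_sec - t0.tv_sec) * 1e9 +
		(t1.tv_nsec - t0.tv_nsec)) / iters;
}

/* Times both variants for an n-byte copy; returns 0 if both copied correctly. */
int run_comparison(size_t n, long iters)
{
	static unsigned char src[BUF_MAX], dst[BUF_MAX];
	double t_string, t_unrolled;
	size_t i;

	assert(n <= BUF_MAX && iters > 0);
	for (i = 0; i < n; i++)
		src[i] = (unsigned char)(i * 31 + 7);

	memset(dst, 0, n);
	t_string = ns_per_copy(copy_string_op, dst, src, n, iters);
	if (memcmp(dst, src, n) != 0)
		return -1;

	memset(dst, 0, n);
	t_unrolled = ns_per_copy(copy_unrolled, dst, src, n, iters);
	if (memcmp(dst, src, n) != 0)
		return -1;

	printf("%zu bytes: string ops %.1f ns/copy, unrolled %.1f ns/copy\n",
	       n, t_string, t_unrolled);
	return 0;
}
```

Calling run_comparison(512, 1000000) from a main() prints one line of
timings for a copy below the 1024-byte threshold mentioned above. At
higher optimization levels the compiler may turn the unrolled loop into
a library memcpy call, so the generated code is worth checking before
trusting the numbers.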

Thread overview: 33+ messages
2009-11-06  9:41 [PATCH RFC] [X86] performance improvement for memcpy_64.S by fast string ling.ma
2009-11-06 16:51 ` Andi Kleen
2009-11-08 10:18   ` Ingo Molnar
2009-11-06 17:07 ` H. Peter Anvin
2009-11-06 19:26   ` H. Peter Anvin
2009-11-09  7:24     ` Ma, Ling
2009-11-09  7:36       ` H. Peter Anvin
2009-11-09  8:08         ` Ingo Molnar
2009-11-11  7:05           ` Ma, Ling
2009-11-11  7:18             ` Ingo Molnar
2009-11-11  7:57               ` Ma, Ling
2009-11-11 23:21                 ` H. Peter Anvin
2009-11-12  2:12                   ` Ma, Ling
2009-11-11 20:34             ` Cyrill Gorcunov
2009-11-11 22:39               ` H. Peter Anvin
2009-11-12  4:28                 ` Cyrill Gorcunov
2009-11-12  4:49                   ` Ma, Ling
2009-11-12  5:26                     ` H. Peter Anvin
2009-11-12  7:42                       ` Ma, Ling
2009-11-12  9:54                     ` Cyrill Gorcunov
2009-11-12 12:16           ` Pavel Machek
2009-11-13  7:33             ` Ingo Molnar
2009-11-13  8:04               ` H. Peter Anvin
2009-11-13  8:10                 ` Ingo Molnar [this message]
2009-11-09  9:26         ` Andi Kleen
2009-11-09 16:41           ` H. Peter Anvin
2009-11-09 18:54             ` Andi Kleen
2009-11-09 22:36               ` H. Peter Anvin
2009-11-12 12:16       ` Pavel Machek
2009-11-13  5:33         ` Ma, Ling
2009-11-13  6:04           ` H. Peter Anvin
2009-11-13  7:23             ` Ma, Ling
2009-11-13  7:30               ` H. Peter Anvin
