linux-kernel.vger.kernel.org archive mirror
From: Salman Qazi <sqazi@google.com>
To: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	davem@davemloft.net, linux-kernel@vger.kernel.org,
	Ingo Molnar <mingo@elte.hu>, Thomas Gleixner <tglx@linutronix.de>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Andi Kleen <andi@firstfloor.org>
Subject: Re: Performance regression in write() syscall
Date: Mon, 23 Feb 2009 21:43:23 -0800
Message-ID: <4352991a0902232143o7c6c3c53h5f78ffc0758472e4@mail.gmail.com>
In-Reply-To: <200902241510.33911.nickpiggin@yahoo.com.au>

On Mon, Feb 23, 2009 at 8:10 PM, Nick Piggin <nickpiggin@yahoo.com.au> wrote:
>
> added some ccs
>
> On Tuesday 24 February 2009 13:03:04 Salman Qazi wrote:
> > While the introduction of __copy_from_user_nocache (see commit:
> > 0812a579c92fefa57506821fa08e90f47cb6dbdd) may have been an improvement
> > for sufficiently large writes, there is evidence to show that it is
> > detrimental for small writes.  Unixbench's fstime test gives the
> > following results for 256 byte writes with MAX_BLOCK of 2000:
> >
> >     2.6.29-rc6 ( 5 samples, each in KB/sec ):
> >     283750, 295200, 294500, 293000, 293300
> >
> >     2.6.29-rc6 + this patch (5 samples, each in KB/sec):
> >     313050, 3106750, 293350, 306300, 307900
> >
> >     2.6.18
> >     395700, 342000, 399100, 366050, 359850
>
> What does unixbench's fstime test do? If it is just writing to the
> pagecache, then this would be unexpected. If it is reading and writing,
> then perhaps this could be a problem, but how realistic is it for a
> performance critical application to read data out of the pagecache that
> it has recently written? Do you have something at google actually doing
> real work that speeds up with this patch?

It has 3 parts, each producing a number corresponding to write, read
and copy.  The first one only does writes and lseeks, and it produced
the numbers that I have provided.  We are not actually sure at this
point whether this slows down any of our real applications.  We noticed
that Unixbench fstime, which is part of our automated testing, was
slower, and upon investigation this was one of the causes.  We will be
forthcoming with at least one other regression in this exact usage of
the write() system call shortly (I am gathering relevant numbers for
upstream kernels now).  In both cases, the regression was noticed for
sub-page writes.
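
For anyone who wants to poke at the write-only part without pulling in
all of Unixbench, something along these lines should be close.  This is
a minimal sketch modelled on the loop quoted in this thread, not a copy
of src/fstime.c; the function name w_test_sketch, its parameters and the
scratch file path are made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Set by the SIGALRM handler to stop the timed loop. */
static volatile sig_atomic_t sigalarm;

static void on_alarm(int signo)
{
	(void)signo;
	sigalarm = 1;
}

/*
 * Hypothetical helper (not part of Unixbench): write max_blocks blocks of
 * block_size bytes, seek back to the start of the file, and repeat until
 * `seconds` have elapsed.  Returns the number of bytes written, or -1 on
 * error.  `path` names a scratch file that is removed afterwards.
 */
long long w_test_sketch(const char *path, unsigned seconds,
			size_t block_size, int max_blocks)
{
	char buf[4096];
	struct sigaction sa;
	long long total = 0;
	int fd, i;

	if (seconds == 0 || block_size == 0 || block_size > sizeof(buf))
		return -1;
	memset(buf, 0, sizeof(buf));

	fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
	if (fd < 0)
		return -1;

	/* SA_RESTART so the alarm does not make write() fail with EINTR. */
	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_alarm;
	sa.sa_flags = SA_RESTART;
	sigemptyset(&sa.sa_mask);
	sigaction(SIGALRM, &sa, NULL);

	sigalarm = 0;
	alarm(seconds);

	while (!sigalarm) {
		for (i = 0; i < max_blocks; ++i) {
			ssize_t n = write(fd, buf, block_size);

			if (n < 0) {
				alarm(0);
				close(fd);
				unlink(path);
				return -1;
			}
			total += n;
		}
		/* Rewind so the file stays small and pagecache-resident. */
		lseek(fd, 0L, SEEK_SET);
	}

	close(fd);
	unlink(path);
	return total;
}
```

Calling w_test_sketch("fstime.tmp", 10, 256, 2000) and dividing the
result by 10 * 1024 gives a KB/sec figure in the same spirit as the ones
I posted, though I would not expect the absolute values to match fstime.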

>
>
> >     See w_test() in src/fstime.c in unixbench version 4.1.0.  Basically,
> > the above test consists of counting how much we can write in this manner:
> >
> >     alarm(10);
> >     while (!sigalarm) {
> >             for (f_blocks = 0; f_blocks < 2000; ++f_blocks) {
> >                    write(f, buf, 256);
> >             }
> >             lseek(f, 0L, 0);
> >     }
> >
> > I realised that there are other components to the write syscall regression
> > that are not addressed here.  I will send another email shortly stating the
> > source of another one.
> >
> > Signed-off-by: Salman Qazi <sqazi@google.com>
> > ---
> > diff --git a/arch/x86/include/asm/uaccess_64.h b/arch/x86/include/asm/uaccess_64.h
> > index 84210c4..efe7315 100644
> > --- a/arch/x86/include/asm/uaccess_64.h
> > +++ b/arch/x86/include/asm/uaccess_64.h
> > @@ -192,14 +192,20 @@ static inline int __copy_from_user_nocache(void *dst,
> > const void __user *src, unsigned size)
> >  {
> >       might_sleep();
> > -     return __copy_user_nocache(dst, src, size, 1);
> > +     if (likely(size >= PAGE_SIZE))
> > +             return __copy_user_nocache(dst, src, size, 1);
> > +     else
> > +             return __copy_from_user(dst, src, size);
> >  }
> >
> >  static inline int __copy_from_user_inatomic_nocache(void *dst,
> >                                                   const void __user *src,
> >                                                   unsigned size)
> >  {
> > -     return __copy_user_nocache(dst, src, size, 0);
> > +     if (likely(size >= PAGE_SIZE))
> > +             return __copy_user_nocache(dst, src, size, 0);
> > +     else
> > +             return __copy_from_user_inatomic(dst, src, size);
> >  }
> >
> >  unsigned long
>
>
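
To make the intent of the quoted patch easier to see outside the kernel,
here is a rough userspace analogue of the size check.  It is only a
sketch under stated assumptions: SKETCH_PAGE_SIZE is hard-coded to 4096,
the function names are invented, and the return value reports which path
was taken rather than the count of uncopied bytes that the kernel
helpers return.  Where SSE2 is not available it falls back to memcpy().

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

#define SKETCH_PAGE_SIZE 4096u	/* assumption: 4 KiB pages */

/* Copy using non-temporal stores where available, memcpy() otherwise. */
void copy_nocache_sketch(void *dst, const void *src, size_t size)
{
#if defined(__SSE2__)
	unsigned char *d = dst;
	const unsigned char *s = src;

	/* Streaming stores need a 16-byte aligned destination. */
	while (size && ((uintptr_t)d & 15)) {
		*d++ = *s++;
		size--;
	}
	for (; size >= 16; size -= 16, d += 16, s += 16)
		_mm_stream_si128((__m128i *)d,
				 _mm_loadu_si128((const __m128i *)s));
	_mm_sfence();		/* order streaming stores before later accesses */
	memcpy(d, s, size);	/* remaining tail, fewer than 16 bytes */
#else
	memcpy(dst, src, size);
#endif
}

/*
 * Same shape as the check in the patch: bypass the cache only when the
 * copy is at least a page.  Returns 1 for the non-temporal path and 0
 * for the ordinary cached path (illustrative only).
 */
int copy_from_user_sketch(void *dst, const void *src, size_t size)
{
	if (size >= SKETCH_PAGE_SIZE) {
		copy_nocache_sketch(dst, src, size);
		return 1;
	}
	memcpy(dst, src, size);
	return 0;
}
```

With this, a 256-byte copy like the ones in the fstime loop takes the
cached path, and only copies of a page or more use streaming stores.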

Thread overview: 51+ messages
2009-02-24  2:03 Performance regression in write() syscall Salman Qazi
2009-02-24  4:10 ` Nick Piggin
2009-02-24  4:28   ` Linus Torvalds
2009-02-24  9:02     ` Nick Piggin
2009-02-24 15:52       ` Linus Torvalds
2009-02-24 16:24         ` Andi Kleen
2009-02-24 16:51         ` Ingo Molnar
2009-02-25  3:23         ` Nick Piggin
2009-02-25  7:25           ` [patch] x86, mm: pass in 'total' to __copy_from_user_*nocache() Ingo Molnar
2009-02-25  8:09             ` Nick Piggin
2009-02-25  8:29               ` Ingo Molnar
2009-02-25  8:59                 ` Nick Piggin
2009-02-25 12:01                   ` Ingo Molnar
2009-02-25 16:04             ` Linus Torvalds
2009-02-25 16:29               ` Ingo Molnar
2009-02-27 12:05               ` Nick Piggin
2009-02-28  8:29                 ` Ingo Molnar
2009-02-28 11:49                   ` Nick Piggin
2009-02-28 12:58                     ` Ingo Molnar
2009-02-28 17:16                       ` Linus Torvalds
2009-02-28 17:24                         ` Arjan van de Ven
2009-02-28 17:42                           ` Linus Torvalds
2009-02-28 17:53                             ` Arjan van de Ven
2009-02-28 18:05                             ` Andi Kleen
2009-02-28 18:27                             ` Ingo Molnar
2009-02-28 18:39                               ` Arjan van de Ven
2009-03-02 10:39                                 ` [PATCH] x86, mm: dont use non-temporal stores in pagecache accesses Ingo Molnar
2009-02-28 18:52                               ` [patch] x86, mm: pass in 'total' to __copy_from_user_*nocache() Linus Torvalds
2009-03-01 14:19                                 ` Nick Piggin
2009-03-01  0:06                             ` David Miller
2009-03-01  0:40                               ` Andi Kleen
2009-03-01  0:28                                 ` H. Peter Anvin
2009-03-01  0:38                                   ` Arjan van de Ven
2009-03-01  1:48                                     ` Andi Kleen
2009-03-01  1:38                                       ` Arjan van de Ven
2009-03-01  1:40                                         ` H. Peter Anvin
2009-03-01 14:06                                           ` Nick Piggin
2009-03-02  4:46                                             ` H. Peter Anvin
2009-03-02  6:18                                               ` Nick Piggin
2009-03-02 21:16                                             ` Linus Torvalds
2009-03-02 21:25                                               ` Ingo Molnar
2009-03-03  4:30                                                 ` Nick Piggin
2009-03-03  4:20                                               ` Nick Piggin
2009-03-03  9:02                                                 ` Ingo Molnar
2009-03-04  3:37                                                   ` Nick Piggin
2009-03-01  2:07                                         ` Andi Kleen
2009-02-24  5:43   ` Salman Qazi [this message]
2009-02-24 10:09 ` Performance regression in write() syscall Andi Kleen
2009-02-24 16:13   ` Ingo Molnar
2009-02-24 16:51     ` Andi Kleen
     [not found] <c8UUh-6G-3@gated-at.bofh.it>
     [not found] ` <c92fh-3uD-15@gated-at.bofh.it>
2009-02-24 11:12   ` Bodo Eggert
