linux-kernel.vger.kernel.org archive mirror
From: Daniel McNeil <daniel@osdl.org>
To: Suparna Bhattacharya <suparna@in.ibm.com>
Cc: Andrew Morton <akpm@osdl.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	linux-mm@kvack.org, "linux-aio@kvack.org" <linux-aio@kvack.org>
Subject: Re: [PATCH 2.6.0-test9-mm5] aio-dio-fallback-bio_count-race.patch
Date: 03 Dec 2003 15:14:06 -0800	[thread overview]
Message-ID: <1070493246.1969.15.camel@ibm-c.pdx.osdl.net> (raw)
In-Reply-To: <20031202152547.GA4249@in.ibm.com>

Suparna,

I did a quick test of your patch and my patch by running my aiocp
program to write zeros to a file and to read a file.  I used
a 50MB file on ext2 file system on a ramdisk.  The machine
is a 2-proc IBM box with:

model name      : Intel(R) Xeon(TM) CPU 1700MHz
stepping        : 10
cpu MHz         : 1686.033
cache size      : 256 KB

The write test was:

time aiocp -n 32 -b 1k -s 50m -z -f DIRECT file

The read test was:

time aiocp -n 32 -b 1k -s 50m -w -f DIRECT file

I ran each test more than 10 times and here are the averages:

		my patch		your patch

aiocp write	real 0.7328		real 0.7756
		user 0.01425		user 0.01221
		sys  0.716		sys  0.76157

aiocp read	real 0.7250		real 0.7456
		user 0.0144		user 0.0130
		sys  0.7149		sys  0.7307

It looks like using the spin_lock instead of the atomic inc/dec
is very close performance-wise.  The spin_lock version averages
a bit faster.  This does not exercise the fallback path, but both
patches would perform very similarly in that case.

I don't have any non-Intel hardware to test with.

Daniel



On Tue, 2003-12-02 at 07:25, Suparna Bhattacharya wrote:
> I suspect the degree to which the spin_lock_irq is costlier 
> than atomic_inc/dec would vary across architectures - cli/sti
> is probably more expensive on certain archs than others.
> 
> The patch I sent just kept things the way they were in terms of 
> locking costs, assuming that those choices were thought through
> at that time (should check with akpm). Yours changes it by 
> switching to spin_lock(unlock)_irq instead of atomic_dec in 
> the normal (common) path for finished_one_bio, for both sync 
> and async i/o. At the same time, for the sync i/o case, as 
> you observe it takes away one atomic_dec from dio_bio_end_io. 
> 
> Since these probably aren't really very hot paths ... possibly
> the difference doesn't matter that much. I do agree that your
> patch makes the locking easier to follow.
> 
> Regards
> Suparna
> 
> On Mon, Dec 01, 2003 at 05:35:18PM -0800, Daniel McNeil wrote:
> > Suparna,
> > 
> > Sorry I did not respond sooner, I was on vacation.
> > 
> > Your patch should also fix the problem.  I like mine with the
> > cleaner locking.
> > 
> > I am not sure your approach has less overhead.  At least
> > on x86, cli/sti are fairly inexpensive.  The locked xchg or locked
> > inc/dec is what is expensive (from what I understand).
> > 
> > So comparing:
> > 
> > my patch:				Your patch:
> > 
> > dio_bio_submit()		
> > 	spin_lock()			atomic_inc(bio_count);
> > 	bio_count++			atomic_inc(bios_in_flight);
> > 	bios_in_flight++
> > 	spin_unlock
> > 
> > My guess is that the spin_lock/spin_unlock is faster than 2 atomic_inc's
> > since it is only 1 locked operation (spin_lock) versus 2 (atomic_inc's).
> > 
> > finished_one_bio() (normal case)
> > 
> > My patch:
> > 	spin_lock()			atomic_dec_and_test(bio_count)
> > 	bio_count--
> > 	spin_unlock()	
> > 	
> > 1 locked instruction each, so very close -- atomic_dec_and_test() does
> > not disable interrupts, so it is probably a little bit faster.
> > 
> > finished_one_bio (fallback case):
> > 
> > 	spin_lock()			spin_lock()
> > 	bio_count--;			dio->waiter = NULL
> > 	spin_unlock()			spin_unlock()
> > 
> > Both approaches are the same.
> > 	
> > dio_bio_complete()
> > 
> > 	spin_lock()			spin_lock()
> > 	bios_in_flight--		atomic_dec()
> > 	spin_unlock			spin_unlock()
> > 
> > My patch is faster since it removed 1 locked instruction.
> > 
> > Conclusion:
> > 
> > My guess would be that both approaches are close, but my patch
> > has fewer locked instructions but does disable interrupts more.
> > My preference is for the cleaner locking approach that is easier
> > to understand and modify in the future.
> > 
> > Daniel
> > 
> > On Tue, 2003-11-25 at 23:55, Suparna Bhattacharya wrote:
> > > On Tue, Nov 25, 2003 at 03:49:31PM -0800, Daniel McNeil wrote:
> > > > Suparna,
> > > > 
> > > > Yes your patch did help.  I originally had CONFIG_DEBUG_SLAB=y which
> > > > was helping me see problems because the freed dio was getting
> > > > poisoned.  I also tested with CONFIG_DEBUG_PAGEALLOC=y which is
> > > > very good at catching these.
> > > 
> > > Ah I see - perhaps that explains why neither Janet nor I could
> > > recreate the problem that you were hitting so easily. So we 
> > > should probably try running with CONFIG_DEBUG_SLAB and
> > > CONFIG_DEBUG_PAGEALLOC as well.
> > > 
> > > > 
> > > > I updated your AIO fallback patch plus your AIO race plus I fixed
> > > > the bio_count decrement fix.  This patch has all three fixes and
> > > > it is working for me.
> > > > 
> > > > I fixed the bio_count race, by changing bio_list_lock into bio_lock
> > > > and using that for all the bio fields.  I changed bio_count and
> > > > bios_in_flight from atomics into int.  They are now protected by
> > > > the bio_lock.  I fixed the race in finished_one_bio() by
> > > > leaving the bio_count at 1 until after the dio_complete()
> > > > and then do the bio_count decrement and wakeup holding the bio_lock.
> > > > 
> > > > Take a look, give it a try, and let me know what you think.
> > > 
> > > I had been trying a slightly different kind of fix -- appended is
> > > the updated version of the patch I last posted. It uses the bio_list_lock
> > > to protect the dio->waiter field, which finished_one_bio sets back
> > > to NULL after it has issued the wakeup; and the code that waits for
> > > i/o to drain out checks the dio->waiter field instead of bio_count.
> > > This might not seem very obvious given the nomenclature of the 
> > > bio_list_lock, so I was holding back wondering if it could be 
> > > improved. 
> > > 
> > > Your approach looks clearer in that sense -- it's pretty unambiguous
> > > about what lock protects what fields. The only thing that bothers me (and
> > > this is what I was trying to avoid in my patch) is the increased
> > > use of spin_lock_irq's (overhead of turning interrupts off and on)
> > > instead of simple atomic inc/dec in most places.
> > > 
> > > Thoughts ?
> > > 
> > > Regards
> > > Suparna
> > 
> > --
> > To unsubscribe, send a message with 'unsubscribe linux-aio' in
> > the body to majordomo@kvack.org.  For more info on Linux AIO,
> > see: http://www.kvack.org/aio/
> > Don't email: aart@kvack.org

