From: Suparna Bhattacharya <suparna@in.ibm.com>
To: Daniel McNeil <daniel@osdl.org>
Cc: Andrew Morton <akpm@osdl.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	linux-mm@kvack.org, "linux-aio@kvack.org" <linux-aio@kvack.org>
Subject: Re: [PATCH 2.6.0-test9-mm5] aio-dio-fallback-bio_count-race.patch
Date: Thu, 4 Dec 2003 10:10:29 +0530	[thread overview]
Message-ID: <20031204044029.GA3353@in.ibm.com> (raw)
In-Reply-To: <1070493246.1969.15.camel@ibm-c.pdx.osdl.net>

Interesting :)
Could it be that the extra atomic_dec in dio_bio_end_io in the current code
makes the difference?  Or are spin_lock_irq and atomic_inc/dec really that
close?  It would be good to have numbers from other archs too, just to keep
in mind for the future.

As for the fallback case, performance doesn't even matter - it's not a
typical situation. So we don't need to bother with that.

Unless someone thinks otherwise, we should go with your fix.

Regards
Suparna

On Wed, Dec 03, 2003 at 03:14:06PM -0800, Daniel McNeil wrote:
> Suparna,
> 
> I did a quick test of your patch and my patch by running my aiocp
> program to write zeros to a file and to read a file.  I used
> a 50MB file on an ext2 file system on a ramdisk.  The machine
> is a 2-proc IBM box with:
> 
> model name      : Intel(R) Xeon(TM) CPU 1700MHz
> stepping        : 10
> cpu MHz         : 1686.033
> cache size      : 256 KB
> 
> The write test was:
> 
> time aiocp -n 32 -b 1k -s 50m -z -f DIRECT file
> 
> The read test was:
> 
> time aiocp -n 32 -b 1k -s 50m -w -f DIRECT file
> 
> I ran each test more than 10 times and here are the averages:
> 
> 		my patch		your patch
> 
> aiocp write	real 0.7328		real 0.7756
> 		user 0.01425		user 0.01221
> 		sys  0.716		sys  0.76157
> 
> aiocp read	real 0.7250		real 0.7456
> 		user 0.0144		user 0.0130
> 		sys  0.07149		sys  0.7307
> 
> It looks like using the spin_lock instead of the atomic inc/dec
> is very close performance-wise.  The spin_lock version averages a bit
> faster.  This does not test the fallback case, but both patches
> would be very similar in performance for that case.
> 
> I don't have any non-intel hardware to test with.
> 
> Daniel
> 
> 
> 
> On Tue, 2003-12-02 at 07:25, Suparna Bhattacharya wrote:
> > I suspect the degree to which the spin_lock_irq is costlier 
> > than atomic_inc/dec would vary across architectures - cli/sti
> > is probably more expensive on certain archs than others.
> > 
> > The patch I sent just kept things the way they were in terms of 
> > locking costs, assuming that those choices were thought through
> > at that time (should check with akpm). Yours changes it by 
> > switching to spin_lock(unlock)_irq instead of atomic_dec in 
> > the normal (common) path for finished_one_bio, for both sync 
> > and async i/o. At the same time, for the sync i/o case, as
> > you observe, it takes away one atomic_dec from dio_bio_end_io.
> > 
> > Since these probably aren't really very hot paths ... possibly
> > the difference doesn't matter that much. I do agree that your
> > patch makes the locking easier to follow.
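> > 
> > Concretely, the sync i/o point is something like the following (just a
> > sketch with simplified names, not the actual fs/direct-io.c code):
> > 
> > 	/* current code: atomic counter decremented inside the
> > 	 * bio_list_lock section that queues the completed bio --
> > 	 * one locked instruction more than necessary */
> > 	spin_lock_irqsave(&dio->bio_list_lock, flags);
> > 	/* ... queue bio on dio->bio_list, wake the waiter ... */
> > 	atomic_dec(&dio->bios_in_flight);
> > 	spin_unlock_irqrestore(&dio->bio_list_lock, flags);
> > 
> > 	/* your patch: plain int protected by bio_lock, so the
> > 	 * decrement adds nothing inside the critical section */
> > 	spin_lock_irqsave(&dio->bio_lock, flags);
> > 	/* ... queue bio on dio->bio_list, wake the waiter ... */
> > 	dio->bios_in_flight--;
> > 	spin_unlock_irqrestore(&dio->bio_lock, flags);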
> > 
> > Regards
> > Suparna
> > 
> > On Mon, Dec 01, 2003 at 05:35:18PM -0800, Daniel McNeil wrote:
> > > Suparna,
> > > 
> > > Sorry I did not respond sooner, I was on vacation.
> > > 
> > > Your patch should also fix the problem.  I like mine better because
> > > of the cleaner locking.
> > > 
> > > I am not sure your approach has less overhead.  At least
> > > on x86, cli/sti are fairly inexpensive.  The locked xchg or locked
> > > inc/dec is what is expensive (from what I understand).
> > > 
> > > So comparing:
> > > 
> > > my patch:				Your patch:
> > > 
> > > dio_bio_submit()		
> > > 	spin_lock()			atomic_inc(bio_count);
> > > 	bio_count++			atomic_inc(bios_in_flight);
> > > 	bios_in_flight++
> > > 	spin_unlock
> > > 
> > > My guess is that the spin_lock/spin_unlock is faster than 2 atomic_inc's
> > > since it is only 1 locked operation (spin_lock) versus 2 (atomic_inc's).
> > > 
> > > finished_one_bio() (normal case)
> > > 
> > > My patch:
> > > 	spin_lock()			atomic_dec_and_test(bio_count)
> > > 	bio_count--
> > > 	spin_unlock()	
> > > 	
> > > 1 locked instruction each, so very close -- atomic_dec_and_test() does
> > > not disable interrupts, so it is probably a little bit faster.
> > > 
> > > finished_one_bio() (fallback case):
> > > 
> > > 	spin_lock()			spin_lock()
> > > 	bio_count--;			dio->waiter = null
> > > 	spin_unlock()			spin_unlock()
> > > 
> > > Both approaches are the same.
> > > 	
> > > dio_bio_complete()
> > > 
> > > 	spin_lock()			spin_lock()
> > > 	bios_in_flight--		atomic_dec()
> > > 	spin_unlock			spin_unlock()
> > > 
> > > My patch is faster since it removed 1 locked instruction.
> > > 
> > > Conclusion:
> > > 
> > > My guess would be that both approaches are close; my patch
> > > has fewer locked instructions but does disable interrupts more.
> > > My preference is for the cleaner locking approach that is easier
> > > to understand and modify in the future.
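> > > 
> > > For reference, the submit-side accounting compared above looks roughly
> > > like this in C (a sketch with simplified names, not the exact patch text):
> > > 
> > > 	/* my patch: plain counters, one critical section */
> > > 	spin_lock_irq(&dio->bio_lock);
> > > 	dio->bio_count++;
> > > 	dio->bios_in_flight++;
> > > 	spin_unlock_irq(&dio->bio_lock);
> > > 
> > > 	/* your patch: two separate locked atomic increments */
> > > 	atomic_inc(&dio->bio_count);
> > > 	atomic_inc(&dio->bios_in_flight);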
> > > 
> > > Daniel
> > > 
> > > On Tue, 2003-11-25 at 23:55, Suparna Bhattacharya wrote:
> > > > On Tue, Nov 25, 2003 at 03:49:31PM -0800, Daniel McNeil wrote:
> > > > > Suparna,
> > > > > 
> > > > > Yes, your patch did help.  I originally had CONFIG_DEBUG_SLAB=y which
> > > > > was helping me see problems because the freed dio was getting
> > > > > poisoned.  I also tested with CONFIG_DEBUG_PAGEALLOC=y which is
> > > > > very good at catching these.
> > > > 
> > > > Ah I see - perhaps that explains why neither Janet nor I could
> > > > recreate the problem that you were hitting so easily. So we 
> > > > should probably try running with CONFIG_DEBUG_SLAB and
> > > > CONFIG_DEBUG_PAGEALLOC as well.
> > > > 
> > > > > 
> > > > > I updated your AIO fallback patch plus your AIO race fix, and added
> > > > > the bio_count decrement fix.  This patch has all three fixes and
> > > > > it is working for me.
> > > > > 
> > > > > I fixed the bio_count race by changing bio_list_lock into bio_lock
> > > > > and using that for all the bio fields.  I changed bio_count and
> > > > > bios_in_flight from atomics into ints.  They are now protected by
> > > > > the bio_lock.  I fixed the race in finished_one_bio() by
> > > > > leaving the bio_count at 1 until after the dio_complete()
> > > > > and then doing the bio_count decrement and wakeup while holding
> > > > > the bio_lock.
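> > > > > 
> > > > > In code, the finished_one_bio() change is roughly like this (a sketch
> > > > > of the idea, not the exact patch; dio_complete() arguments simplified):
> > > > > 
> > > > > 	static void finished_one_bio(struct dio *dio)
> > > > > 	{
> > > > > 		unsigned long flags;
> > > > > 
> > > > > 		spin_lock_irqsave(&dio->bio_lock, flags);
> > > > > 		if (dio->bio_count == 1) {
> > > > > 			/* last bio: keep bio_count at 1 across
> > > > > 			 * dio_complete() so nobody can free the dio
> > > > > 			 * underneath us */
> > > > > 			spin_unlock_irqrestore(&dio->bio_lock, flags);
> > > > > 			dio_complete(dio, dio->result, 0);
> > > > > 			spin_lock_irqsave(&dio->bio_lock, flags);
> > > > > 		}
> > > > > 		/* decrement and wake up while holding bio_lock */
> > > > > 		if (--dio->bio_count == 0 && dio->waiter)
> > > > > 			wake_up_process(dio->waiter);
> > > > > 		spin_unlock_irqrestore(&dio->bio_lock, flags);
> > > > > 	}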
> > > > > 
> > > > > Take a look, give it a try, and let me know what you think.
> > > > 
> > > > I had been trying a slightly different kind of fix -- appended is
> > > > the updated version of the patch I last posted. It uses the bio_list_lock
> > > > to protect the dio->waiter field, which finished_one_bio sets back
> > > > to NULL after it has issued the wakeup; and the code that waits for
> > > > i/o to drain out checks the dio->waiter field instead of bio_count.
> > > > This might not seem very obvious given the nomenclature of the 
> > > > bio_list_lock, so I was holding back wondering if it could be 
> > > > improved. 
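> > > > 
> > > > Schematically the idea is roughly the following (only a sketch, not the
> > > > patch itself; the waiter sets dio->waiter = current before submitting):
> > > > 
> > > > 	/* completion side (finished_one_bio, last bio): wake and
> > > > 	 * clear the waiter under bio_list_lock */
> > > > 	spin_lock_irqsave(&dio->bio_list_lock, flags);
> > > > 	if (dio->waiter) {
> > > > 		wake_up_process(dio->waiter);
> > > > 		dio->waiter = NULL;
> > > > 	}
> > > > 	spin_unlock_irqrestore(&dio->bio_list_lock, flags);
> > > > 
> > > > 	/* drain path: wait for the completion side to clear
> > > > 	 * dio->waiter instead of watching bio_count */
> > > > 	spin_lock_irqsave(&dio->bio_list_lock, flags);
> > > > 	while (dio->waiter) {
> > > > 		set_current_state(TASK_UNINTERRUPTIBLE);
> > > > 		spin_unlock_irqrestore(&dio->bio_list_lock, flags);
> > > > 		io_schedule();
> > > > 		spin_lock_irqsave(&dio->bio_list_lock, flags);
> > > > 	}
> > > > 	spin_unlock_irqrestore(&dio->bio_list_lock, flags);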
> > > > 
> > > > Your approach looks clearer in that sense -- it's pretty unambiguous
> > > > about which lock protects which fields. The only thing that bothers me
> > > > (and this is what I was trying to avoid in my patch) is the increased
> > > > use of spin_lock_irq's (the overhead of turning interrupts off and on)
> > > > instead of simple atomic inc/dec in most places.
> > > > 
> > > > Thoughts ?
> > > > 
> > > > Regards
> > > > Suparna
> > > 
> 

-- 
Suparna Bhattacharya (suparna@in.ibm.com)
Linux Technology Center
IBM Software Lab, India

