From: Benjamin LaHaise <bcrl@kvack.org>
To: Dave Chinner <david@fromorbit.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	linux-aio@kvack.org,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Linux API <linux-api@vger.kernel.org>,
	linux-mm <linux-mm@kvack.org>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH 07/13] aio: enabled thread based async fsync
Date: Fri, 22 Jan 2016 23:50:24 -0500
Message-ID: <20160123045024.GA32488@kvack.org>
In-Reply-To: <20160123042449.GE6033@dastard>

On Sat, Jan 23, 2016 at 03:24:49PM +1100, Dave Chinner wrote:
> On Wed, Jan 20, 2016 at 04:56:30PM -0500, Benjamin LaHaise wrote:
> > On Thu, Jan 21, 2016 at 08:45:46AM +1100, Dave Chinner wrote:
> > > Filesystems *must take locks* in the IO path. We have to serialise
> > > against truncate and other operations at some point in the IO path
> > > (e.g. block mapping vs concurrent allocation and/or removal), and
> > > that can only be done sanely with sleeping locks.  There is no way
> > > of knowing in advance if we are going to block, and so either we
> > > always use threads for IO submission or we accept that occasionally
> > > the AIO submission will block.
> > 
> > I never said we don't take locks.  Still, we can be more intelligent 
> > about when and where we do so.  With the nonblocking pread() and pwrite() 
> > changes being proposed elsewhere, we can do the part of the I/O that 
> > doesn't block in the submitter, which is a huge win when possible.
> > 
> > As it stands today, *every* buffered write takes i_mutex immediately 
> > on entering ->write().  That one issue alone accounts for a nearly 10x 
> > performance difference between an O_SYNC write and an O_DIRECT write, 
> 
> Yes, that locking is for correct behaviour, not for performance
> reasons.  The i_mutex is providing the required semantics for POSIX
> write(2) functionality - writes must serialise against other reads
> and writes so that they are completed atomically w.r.t. other IO.
> i.e. writes to the same offset must not interleave, nor should reads
> be able to see partial data from a write in progress.

No, the locks are not *required* for POSIX semantics; they are a legacy
of how Linux filesystem code has been implemented and of how we provide
the internal consistency our filesystems need.  There are other ways to
achieve the required semantics that do not involve a single giant lock
for the entire file/inode.  And no, I am not saying that doing this is
simple or easy.

		-ben

> Direct IO does not conform to POSIX concurrency standards, so we
> don't have to serialise concurrent IO against each other.
> 
> > and using O_SYNC writes is a legitimate use-case for users who want 
> > caching of data by the kernel (duplicating that functionality is a huge 
> > amount of work for an application, plus if you want the cache to be 
> > persistent between runs of an app, you have to get the kernel to do it).
> 
> Yes, but you take what you get given. Buffered IO sucks in many ways;
> this is just one of them.
> 
> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com

-- 
"Thought is the essence of where you are now."
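
To illustrate the claim above that serialisation need not mean one lock
for the whole inode, here is a minimal userspace sketch of a byte-range
lock: writers to overlapping ranges exclude each other, while writers to
disjoint ranges proceed concurrently.  It is not code from this patch
series or from any kernel filesystem.  The names range_lock,
range_trylock(), range_lock_acquire(), range_unlock() and the MAX_HELD
cap are all hypothetical, and the sketch ignores readers, truncate and
fairness entirely.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

#define MAX_HELD 16	/* hypothetical cap on ranges held at once */

/* One held byte range, half-open: [start, end). */
struct byte_range {
	long start, end;
	bool used;
};

struct range_lock {
	pthread_mutex_t mu;	/* protects held[]; never held across I/O */
	pthread_cond_t cv;	/* broadcast whenever a range is released */
	struct byte_range held[MAX_HELD];
};

static void range_lock_init(struct range_lock *rl)
{
	pthread_mutex_init(&rl->mu, NULL);
	pthread_cond_init(&rl->cv, NULL);
	for (int i = 0; i < MAX_HELD; i++)
		rl->held[i].used = false;
}

/*
 * Caller holds rl->mu.  Returns a free slot index if [start, end)
 * overlaps no held range, or -1 on overlap or when the table is full.
 */
static int find_slot(const struct range_lock *rl, long start, long end)
{
	int free_slot = -1;

	for (int i = 0; i < MAX_HELD; i++) {
		const struct byte_range *r = &rl->held[i];

		if (!r->used) {
			if (free_slot < 0)
				free_slot = i;
		} else if (start < r->end && r->start < end) {
			return -1;	/* overlaps a held range */
		}
	}
	return free_slot;
}

/* Caller holds rl->mu.  Records [start, end) as held in the given slot. */
static void claim(struct range_lock *rl, int slot, long start, long end)
{
	rl->held[slot].start = start;
	rl->held[slot].end = end;
	rl->held[slot].used = true;
}

/* Non-blocking: returns a slot handle >= 0, or -1 if it would block. */
static int range_trylock(struct range_lock *rl, long start, long end)
{
	pthread_mutex_lock(&rl->mu);
	int slot = find_slot(rl, start, end);
	if (slot >= 0)
		claim(rl, slot, start, end);
	pthread_mutex_unlock(&rl->mu);
	return slot;
}

/* Blocking: sleeps until [start, end) conflicts with nothing held. */
static int range_lock_acquire(struct range_lock *rl, long start, long end)
{
	int slot;

	pthread_mutex_lock(&rl->mu);
	while ((slot = find_slot(rl, start, end)) < 0)
		pthread_cond_wait(&rl->cv, &rl->mu);
	claim(rl, slot, start, end);
	pthread_mutex_unlock(&rl->mu);
	return slot;
}

/* Releases the range in the given slot and wakes all waiters to recheck. */
static void range_unlock(struct range_lock *rl, int slot)
{
	pthread_mutex_lock(&rl->mu);
	assert(slot >= 0 && slot < MAX_HELD && rl->held[slot].used);
	rl->held[slot].used = false;
	pthread_cond_broadcast(&rl->cv);
	pthread_mutex_unlock(&rl->mu);
}
```

In this sketch a submitter could call range_trylock() first and fall
back to a helper thread calling range_lock_acquire() only when the range
is contended, which loosely mirrors the "do the non-blocking part in the
submitter" idea quoted above.  Whether anything like this is practical
inside a real filesystem is exactly the open question in this thread.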
