From: Andreas Dilger <adilger@dilger.ca>
To: Jan Kara <jack@suse.cz>
Cc: Dave Chinner <david@fromorbit.com>,
	Joseph Qi <joseph.qi@linux.alibaba.com>,
	"Theodore Y. Ts'o" <tytso@mit.edu>,
	Joseph Qi <jiangqi903@gmail.com>,
	Ext4 Developers List <linux-ext4@vger.kernel.org>,
	Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>,
	Liu Bo <bo.liu@linux.alibaba.com>
Subject: Re: [RFC] performance regression with "ext4: Allow parallel DIO reads"
Date: Mon, 26 Aug 2019 13:10:17 -0600
Message-ID: <94515D9C-045C-46EA-9F3C-E13CB2DAA1F9@dilger.ca>
In-Reply-To: <20190826083958.GA10614@quack2.suse.cz>

On Aug 26, 2019, at 2:39 AM, Jan Kara <jack@suse.cz> wrote:
> 
> On Sat 24-08-19 12:18:40, Dave Chinner wrote:
>> On Fri, Aug 23, 2019 at 09:08:53PM +0800, Joseph Qi wrote:
>>> 
>>> 
>>> On 19/8/23 18:16, Dave Chinner wrote:
>>>> On Fri, Aug 23, 2019 at 03:57:02PM +0800, Joseph Qi wrote:
>>>>> Hi Dave,
>>>>> 
>>>>> On 19/8/22 13:40, Dave Chinner wrote:
>>>>>> On Wed, Aug 21, 2019 at 09:04:57AM +0800, Joseph Qi wrote:
>>>>>>> Hi Ted,
>>>>>>> 
>>>>>>> On 19/8/21 00:08, Theodore Y. Ts'o wrote:
>>>>>>>> On Tue, Aug 20, 2019 at 11:00:39AM +0800, Joseph Qi wrote:
>>>>>>>>> 
>>>>>>>>> I've tested parallel dio reads with dioread_nolock; it
>>>>>>>>> shows no significant performance improvement and is still
>>>>>>>>> poor compared with reverting parallel dio reads. IMO, this
>>>>>>>>> is because with parallel dio reads, it takes the inode shared
>>>>>>>>> lock at the very beginning in ext4_direct_IO_read().
>>>>>>>> 
>>>>>>>> Why is that a problem?  It's a shared lock, so parallel
>>>>>>>> threads should be able to issue reads without getting
>>>>>>>> serialized?
>>>>>>>> 
>>>>>>> The above just reports the result: even when mounting with
>>>>>>> dioread_nolock, parallel dio reads still perform worse than
>>>>>>> before (w/o parallel dio reads).
>>>>>>> 
>>>>>>>> Are you using sufficiently fast storage devices that you're
>>>>>>>> worried about cache line bouncing of the shared lock?  Or do
>>>>>>>> you have some other concern, such as some other thread
>>>>>>>> taking an exclusive lock?
>>>>>>>> 
>>>>>>> The test case is random read/write described in my first
>>>>>>> mail. And
>>>>>> 
>>>>>> Regardless of dioread_nolock, ext4_direct_IO_read() is taking
>>>>>> inode_lock_shared() across the direct IO call.  And writes in
>>>>>> ext4 _always_ take the inode_lock() in ext4_file_write_iter(),
>>>>>> even though it gets dropped quite early when overwrite &&
>>>>>> dioread_nolock is set.  But just taking the lock exclusively
>>>>>> in write for a short while is enough to kill all shared
>>>>>> locking concurrency...
>>>>>> 
>>>>>>> from my preliminary investigation, the shared lock consumes
>>>>>>> more in such a scenario.
>>>>>> 
>>>>>> If the write lock is also shared, then there should not be a
>>>>>> scalability issue. The shared dio locking is only half-done in
>>>>>> ext4, so perhaps comparing your workload against XFS would be
>>>>>> an informative exercise...
>>>>> 
>>>>> I've done the same test workload on xfs, it behaves the same as
>>>>> ext4 after reverting parallel dio reads and mounting with
>>>>> dioread_lock.
>>>> 
>>>> Ok, so the problem is not shared locking scalability ('cause
>>>> that's what XFS does and it scaled fine), the problem is almost
>>>> certainly that ext4 is using exclusive locking during
>>>> writes...
>>>> 
>>> 
>>> Agree. Maybe I've misled you in my previous mails. I meant that the
>>> shared lock makes things worse in the case of mixed random
>>> read/write, since we always take the inode lock during write.  It
>>> also conflicts with dioread_nolock: previously, reads with
>>> dioread_nolock took no inode lock at all, but now they always take
>>> a shared lock.
>> 
>> No, you didn't mislead me. IIUC, the shared locking was added to the
>> direct IO read path so that it can't run concurrently with
>> operations like hole punch that free the blocks the dio read might
>> currently be operating on (use after free).
>> 
>> i.e. the shared locking fixes an actual bug, but the performance
>> regression is a result of only partially converting the direct IO
>> path to use shared locking. Only half the job was done from a
>> performance perspective. Seems to me that the two options here to
>> fix the performance regression are to either finish the shared
>> locking conversion, or remove the shared locking on read and re-open
>> a potential data exposure issue...
> 
> We actually had a separate locking mechanism in ext4 code to avoid stale
> data exposure during hole punch when unlocked DIO reads were running. But
> it was kind of ugly and made things complex. I agree we need to move the
> ext4 DIO path conversion further to avoid taking the exclusive lock when we
> don't actually need it.

It seems to me that the right solution for the short term is to revert
the patch in question, since that appears to be incomplete, and reverting
it will restore the performance.  I haven't seen any comments posted with
a counter-example that the original patch actually improved performance,
or that reverting it will cause some other performance regression.

We can then leave implementing a more complete solution to a later kernel.

Cheers, Andreas
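
For illustration only, the locking effect discussed in the quoted thread
(a brief exclusive inode_lock() in the write path stalling readers that use
inode_lock_shared()) can be sketched with a toy model. This is not kernel
code and not taken from the patch: the Request type, the finish_times()
helper, the strict FIFO granting rule, and all timings are assumptions made
up for the example. Real rwsems are not strictly FIFO, so the numbers only
show the shape of the problem.

```python
from typing import List, NamedTuple


class Request(NamedTuple):
    arrival: int     # time at which the task asks for the lock
    exclusive: bool  # True ~ inode_lock(), False ~ inode_lock_shared()
    hold: int        # how long the lock is held once granted


def finish_times(requests: List[Request]) -> List[int]:
    """Release time of each request, assuming strict FIFO lock granting."""
    writer_end = 0   # release time of the most recent exclusive holder
    readers_end = 0  # latest release time among shared holders so far
    prev_grant = 0   # FIFO: nobody is granted before an earlier request
    done = []
    for req in sorted(requests, key=lambda r: r.arrival):
        if req.exclusive:
            # An exclusive holder waits for every earlier holder to leave.
            grant = max(req.arrival, prev_grant, writer_end, readers_end)
            writer_end = grant + req.hold
            done.append(writer_end)
        else:
            # A shared holder only waits for earlier exclusive holders.
            grant = max(req.arrival, prev_grant, writer_end)
            readers_end = max(readers_end, grant + req.hold)
            done.append(grant + req.hold)
        prev_grant = grant
    return done


early_reads = [Request(0, False, 10)] * 4  # four DIO reads in flight
late_reads = [Request(2, False, 10)] * 3   # three more arrive shortly after
excl_write = Request(1, True, 1)           # short write, exclusive lock
shared_write = Request(1, False, 1)        # same write, hypothetically shared

print(max(finish_times(early_reads + late_reads)))
print(max(finish_times(early_reads + [excl_write] + late_reads)))
print(max(finish_times(early_reads + [shared_write] + late_reads)))
```

In this model the one-tick exclusive hold has to wait for the reads already
in flight, and every read queued behind it waits in turn, so the later reads
finish well after they would in the read-only case. Treating the overwrite
as shared (the hypothetical "finished conversion") brings the timeline back
to the read-only one.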






