linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Minchan Kim <minchan.kim@gmail.com>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH] mm: disallow direct reclaim page writeback
Date: Wed, 14 Apr 2010 16:54:17 +0900	[thread overview]
Message-ID: <t2r28c262361004140054t807b7edbzc69e7830f6978735@mail.gmail.com> (raw)
In-Reply-To: <20100414044458.GF2493@dastard>

On Wed, Apr 14, 2010 at 1:44 PM, Dave Chinner <david@fromorbit.com> wrote:
> On Wed, Apr 14, 2010 at 09:24:33AM +0900, Minchan Kim wrote:
>> Hi, Dave.
>>
>> On Tue, Apr 13, 2010 at 9:17 AM, Dave Chinner <david@fromorbit.com> wrote:
>> > From: Dave Chinner <dchinner@redhat.com>
>> >
>> > When we enter direct reclaim we may have used an arbitrary amount of stack
>> > space, and hence enterring the filesystem to do writeback can then lead to
>> > stack overruns. This problem was recently encountered x86_64 systems with
>> > 8k stacks running XFS with simple storage configurations.
>> >
>> > Writeback from direct reclaim also adversely affects background writeback. The
>> > background flusher threads should already be taking care of cleaning dirty
>> > pages, and direct reclaim will kick them if they aren't already doing work. If
>> > direct reclaim is also calling ->writepage, it will cause the IO patterns from
>> > the background flusher threads to be upset by LRU-order writeback from
>> > pageout() which can be effectively random IO. Having competing sources of IO
>> > trying to clean pages on the same backing device reduces throughput by
>> > increasing the amount of seeks that the backing device has to do to write back
>> > the pages.
>> >
>> > Hence for direct reclaim we should not allow ->writepages to be entered at all.
>> > Set up the relevant scan_control structures to enforce this, and prevent
>> > sc->may_writepage from being set in other places in the direct reclaim path in
>> > response to other events.
>>
>> I think your solution is rather aggressive change as Mel and Kosaki
>> already pointed out.
>
> It may be agressive, but writeback from direct reclaim is, IMO, one
> of the worst aspects of the current VM design because of it's
> adverse effect on the IO subsystem.

Tend to agree. But De we need it by last resort if flusher thread
can't catch up
write stream?
Or In my opinion, Could I/O layer have better throttle logic than now?

>
> I'd prefer to remove it completely that continue to try and patch
> around it, especially given that everyone seems to agree that it
> does have an adverse affect on IO...

Of course, If everybody agree, we can do it.
For it, we need many benchmark result which is very hard.
Maybe I will help it in embedded system.

>
>> Do flush thread aware LRU of dirty pages in system level recency not
>> dirty pages recency?
>
> It writes back in the order inodes were dirtied. i.e. the LRU is a
> coarser measure, but it it still definitely there. It also takes
> into account fairness of IO between dirty inodes, so no one dirty
> inode prevents IO beining issued on a other dirty inodes on the
> LRU...

Thanks.
It seems to be lost recency.
I am not sure how much it affects system performance.

>
>> Of course flush thread can clean dirty pages faster than direct reclaimer.
>> But if it don't aware LRUness, hot page thrashing can be happened by
>> corner case.
>> It could lost write merge.
>>
>> And non-rotation storage might be not big of seek cost.
>
> Non-rotational storage still goes faster when it is fed large, well
> formed IOs.

Agreed. I missed. Nand device is stronger than HDD about random read.
But ramdom write is very weak in performance and wear-leveling.

>
>> I think we have to consider that case if we decide to change direct reclaim I/O.
>>
>> How do we separate the problem?
>>
>> 1. stack hogging problem.
>> 2. direct reclaim random write.
>
> AFAICT, the only way to _reliably_ avoid the stack usage problem is
> to avoid writeback in direct reclaim. That has the side effect of
> fixing #2 as well, so do they really need separating?

If we can do it, it's good.
but 2. problem is not easy to fix, I think.
Compared to 2, 1 is rather easy.
So I thought we can solve 1 firstly and then focusing 2.
If your suggestion is right, then we can apply your idea.
Then we don't need to revert the patch of 1 since small stack usage is
always good
if we don't lost big performance.

>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com
>



-- 
Kind regards,
Minchan Kim

  reply	other threads:[~2010-04-14  7:54 UTC|newest]

Thread overview: 115+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-04-13  0:17 [PATCH] mm: disallow direct reclaim page writeback Dave Chinner
2010-04-13  8:31 ` KOSAKI Motohiro
2010-04-13 10:29   ` Dave Chinner
2010-04-13 11:39     ` KOSAKI Motohiro
2010-04-13 14:36       ` Dave Chinner
2010-04-14  3:12         ` Dave Chinner
2010-04-14  6:52           ` KOSAKI Motohiro
2010-04-15  1:56             ` Dave Chinner
2010-04-14  6:52         ` KOSAKI Motohiro
2010-04-14  7:36           ` Dave Chinner
2010-04-13  9:58 ` Mel Gorman
2010-04-13 11:19   ` Dave Chinner
2010-04-13 19:34     ` Mel Gorman
2010-04-13 20:20       ` Chris Mason
2010-04-14  1:40         ` Dave Chinner
2010-04-14  4:59           ` KAMEZAWA Hiroyuki
2010-04-14  5:41             ` Dave Chinner
2010-04-14  5:54               ` KOSAKI Motohiro
2010-04-14  6:13                 ` Minchan Kim
2010-04-14  7:19                   ` Minchan Kim
2010-04-14  9:42                     ` KAMEZAWA Hiroyuki
2010-04-14 10:01                       ` Minchan Kim
2010-04-14 10:07                         ` Mel Gorman
2010-04-14 10:16                           ` Minchan Kim
2010-04-14  7:06                 ` Dave Chinner
2010-04-14  6:52           ` KOSAKI Motohiro
2010-04-14  7:28             ` Dave Chinner
2010-04-14  8:51               ` Mel Gorman
2010-04-15  1:34                 ` Dave Chinner
2010-04-15  4:09                   ` KOSAKI Motohiro
2010-04-15  4:11                     ` [PATCH 1/4] vmscan: delegate pageout io to flusher thread if current is kswapd KOSAKI Motohiro
2010-04-15  8:05                       ` Suleiman Souhlal
2010-04-15  8:17                         ` KOSAKI Motohiro
2010-04-15  8:26                           ` KOSAKI Motohiro
2010-04-15 10:30                             ` Johannes Weiner
2010-04-15 17:24                               ` Suleiman Souhlal
2010-04-20  2:56                               ` Ying Han
2010-04-15  9:32                         ` Dave Chinner
2010-04-15  9:41                           ` KOSAKI Motohiro
2010-04-15 17:27                           ` Suleiman Souhlal
2010-04-15 23:33                             ` Dave Chinner
2010-04-15 23:41                               ` Suleiman Souhlal
2010-04-16  9:50                               ` Alan Cox
2010-04-17  3:06                                 ` Dave Chinner
2010-04-15  8:18                       ` KOSAKI Motohiro
2010-04-15 10:31                       ` Mel Gorman
2010-04-15 11:26                         ` KOSAKI Motohiro
2010-04-15  4:13                     ` [PATCH 2/4] vmscan: kill prev_priority completely KOSAKI Motohiro
2010-04-15  4:14                     ` [PATCH 3/4] vmscan: move priority variable into scan_control KOSAKI Motohiro
2010-04-15  4:15                     ` [PATCH 4/4] vmscan: delegate page cleaning io to flusher thread if VM pressure is low KOSAKI Motohiro
2010-04-15  4:35                     ` [PATCH] mm: disallow direct reclaim page writeback KOSAKI Motohiro
2010-04-15  6:32                       ` Dave Chinner
2010-04-15  6:44                         ` KOSAKI Motohiro
2010-04-15  6:58                           ` Dave Chinner
2010-04-15  6:20                     ` Dave Chinner
2010-04-15  6:35                       ` KOSAKI Motohiro
2010-04-15  8:54                         ` Dave Chinner
2010-04-15 10:21                           ` KOSAKI Motohiro
2010-04-15 10:23                             ` [PATCH 1/4] vmscan: simplify shrink_inactive_list() KOSAKI Motohiro
2010-04-15 13:15                               ` Mel Gorman
2010-04-15 15:01                                 ` Andi Kleen
2010-04-15 15:44                                   ` Mel Gorman
2010-04-15 16:54                                     ` Andi Kleen
2010-04-15 23:40                                       ` Dave Chinner
2010-04-16  7:13                                         ` Andi Kleen
2010-04-16 14:57                                         ` Mel Gorman
2010-04-17  2:37                                           ` Dave Chinner
2010-04-16 14:55                                       ` Mel Gorman
2010-04-15 18:22                                 ` Valdis.Kletnieks
2010-04-16  9:39                                   ` Mel Gorman
2010-04-15 10:24                             ` [PATCH 2/4] [cleanup] mm: introduce free_pages_prepare KOSAKI Motohiro
2010-04-15 13:33                               ` Mel Gorman
2010-04-15 10:24                             ` [PATCH 3/4] mm: introduce free_pages_bulk KOSAKI Motohiro
2010-04-15 13:46                               ` Mel Gorman
2010-04-15 10:26                             ` [PATCH 4/4] vmscan: replace the pagevec in shrink_inactive_list() with list KOSAKI Motohiro
2010-04-15 10:28                   ` [PATCH] mm: disallow direct reclaim page writeback Mel Gorman
2010-04-15 13:42                     ` Chris Mason
2010-04-15 17:50                       ` tytso
2010-04-16 15:05                       ` Mel Gorman
2010-04-19 15:15                         ` Mel Gorman
2010-04-16  4:14                     ` Dave Chinner
2010-04-16 15:14                       ` Mel Gorman
2010-04-18  0:32                         ` Andrew Morton
2010-04-18 19:05                           ` Christoph Hellwig
2010-04-18 16:31                             ` Andrew Morton
2010-04-18 19:35                               ` Christoph Hellwig
2010-04-18 19:11                             ` Sorin Faibish
2010-04-18 19:10                           ` Sorin Faibish
2010-04-18 21:30                             ` James Bottomley
2010-04-18 23:34                               ` Sorin Faibish
2010-04-19  3:08                               ` tytso
2010-04-19  0:35                           ` Dave Chinner
2010-04-19  0:49                             ` Arjan van de Ven
2010-04-19  1:08                               ` Dave Chinner
2010-04-19  4:32                                 ` Arjan van de Ven
2010-04-19 15:20                         ` Mel Gorman
2010-04-23  1:06                           ` Dave Chinner
2010-04-23 10:50                             ` Mel Gorman
2010-04-15 14:57                   ` Andi Kleen
2010-04-15  2:37                 ` Johannes Weiner
2010-04-15  2:43                   ` KOSAKI Motohiro
2010-04-16 23:56                     ` Johannes Weiner
2010-04-14  6:52         ` KOSAKI Motohiro
2010-04-14 10:06         ` Andi Kleen
2010-04-14 11:20           ` Chris Mason
2010-04-14 12:15             ` Andi Kleen
2010-04-14 12:32               ` Alan Cox
2010-04-14 12:34                 ` Andi Kleen
2010-04-14 13:23             ` Mel Gorman
2010-04-14 14:07               ` Chris Mason
2010-04-14  0:24 ` Minchan Kim
2010-04-14  4:44   ` Dave Chinner
2010-04-14  7:54     ` Minchan Kim [this message]
2010-04-16  1:13 ` KAMEZAWA Hiroyuki
2010-04-16  4:18   ` KAMEZAWA Hiroyuki

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=t2r28c262361004140054t807b7edbzc69e7830f6978735@mail.gmail.com \
    --to=minchan.kim@gmail.com \
    --cc=david@fromorbit.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).