From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1751549AbdINIPy (ORCPT <rfc822;w@1wt.eu>);
        Thu, 14 Sep 2017 04:15:54 -0400
Received: from LGEAMRELO13.lge.com ([156.147.23.53]:38114 "EHLO
        lgeamrelo13.lge.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1751385AbdINIPx (ORCPT
        <rfc822;linux-kernel@vger.kernel.org>);
        Thu, 14 Sep 2017 04:15:53 -0400
X-Original-SENDERIP: 156.147.1.126
X-Original-MAILFROM: minchan@kernel.org
X-Original-SENDERIP: 10.177.220.163
X-Original-MAILFROM: minchan@kernel.org
Date: Thu, 14 Sep 2017 17:15:47 +0900
From: Minchan Kim <minchan@kernel.org>
To: "Huang, Ying" <ying.huang@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>, linux-mm@kvack.org,
        linux-kernel@vger.kernel.org, Johannes Weiner <hannes@cmpxchg.org>,
        Rik van Riel <riel@redhat.com>, Shaohua Li <shli@kernel.org>,
        Hugh Dickins <hughd@google.com>, Fengguang Wu <fengguang.wu@intel.com>,
        Tim Chen <tim.c.chen@intel.com>, Dave Hansen <dave.hansen@intel.com>
Subject: Re: [PATCH -mm -v4 3/5] mm, swap: VMA based swap readahead
Message-ID: <20170914081547.GC5533@bbox>
References: <20170807054038.1843-1-ying.huang@intel.com>
 <20170807054038.1843-4-ying.huang@intel.com>
 <20170913014019.GB29422@bbox>
 <20170913140229.8a6cad6f017fa3ea8b53cefc@linux-foundation.org>
 <87lglim77z.fsf@yhuang-dev.intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <87lglim77z.fsf@yhuang-dev.intel.com>
User-Agent: Mutt/1.5.24 (2015-08-30)
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Sep 14, 2017 at 08:53:04AM +0800, Huang, Ying wrote:
> Hi, Andrew,
> 
> Andrew Morton <akpm@linux-foundation.org> writes:
> 
> > On Wed, 13 Sep 2017 10:40:19 +0900 Minchan Kim <minchan@kernel.org> wrote:
> >
> >> Every zram user, such as low-end Android devices, has set page-cluster
> >> to 0 to disable swap readahead, because zram has no seek cost and works
> >> as a synchronous IO operation, so if we read ahead multiple pages, the
> >> swap fault latency would be (4K * readahead window size). IOW,
> >> readahead is meaningful only if it doesn't hurt the faulted page's
> >> latency.
> >> 
> >> However, this patch introduces an additional knob,
> >> /sys/kernel/mm/swap/vma_ra_max_order, alongside page-cluster. It means
> >> that for existing users who have disabled swap readahead, the setting
> >> no longer works until they become aware of the new knob and modify
> >> their scripts/code to disable vma_ra_max_order as well as page-cluster.
> >> 
> >> I say it's a *regression* and wanted to fix it, but Huang's opinion
> >> is that it's not a functional regression, so userspace should fix
> >> itself.
> >> Please see the details of the discussion in
> >> http://lkml.kernel.org/r/%3C1505183833-4739-4-git-send-email-minchan@kernel.org%3E
> >
> > hm, tricky problem.  I do agree that linking the physical and virtual
> > readahead schemes in the proposed fashion is unfortunate.  I also agree
> > that breaking existing setups (a bit) is also unfortunate.
> >
> > Would it help if, when page-cluster is written to zero, we do
> >
> > printk_once("physical readahead disabled, virtual readahead still
> > enabled.  Disable virtual readahead via
> > /sys/kernel/mm/swap/vma_ra_max_order").
> >
> > Or something like that.  It's pretty lame, but it should help alert the
> > zram-readahead-disabling people to the issue?
> 
> This sounds good to me.
> 
> Hi, Minchan, what do you think about this?  I think for low-end Android
> devices, the end user may have no opportunity to upgrade to the latest
> kernel, so the device vendor should take care of this.  For desktop users,
> the warning proposed by Andrew may help to remind them of the new knob.

Yes, that would be an option. At the least, we should alert the user so
that they have a chance to fix it. However, can't we make VMA-based
readahead a new config option? Please see the details in my reply to Andrew.

With that, there is no regression for current users and, as a bonus,
users can measure both algorithms with their real workloads rather than
with an artificial benchmark. I think recency vs. spatial locality each
have pros and cons, so that kind of soft landing would be a safer option
than a sudden replacement.
After a while, we can make the new algorithm the default.
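
To make the latency argument quoted above concrete, here is a rough
userspace sketch, not kernel code. It assumes 4K pages, that page-cluster
is the log2 of the physical readahead window (as described in the vm
sysctl documentation), and that setting vma_ra_max_order to 0 turns the
VMA-based path off. All function names are made up for illustration.

```python
PAGE_SIZE = 4096  # bytes; assumes 4K pages, as in the "4K * window" estimate


def physical_window_pages(page_cluster: int) -> int:
    """Pages read per swap fault by cluster readahead: 2 ** page-cluster."""
    if page_cluster < 0:
        raise ValueError("page-cluster must be non-negative")
    return 1 << page_cluster


def sync_fault_bytes(page_cluster: int) -> int:
    """Bytes read before a synchronous (zram-like) swap-in can return."""
    return PAGE_SIZE * physical_window_pages(page_cluster)


def readahead_fully_disabled(page_cluster: int, vma_ra_max_order: int) -> bool:
    """Assumption: with the new knob, both settings must be 0 to get the
    old "page-cluster = 0" behaviour back."""
    return page_cluster == 0 and vma_ra_max_order == 0


if __name__ == "__main__":
    for pc in (0, 3):
        print(f"page-cluster={pc}: {sync_fault_bytes(pc)} bytes per fault")
    # An existing zram setup that only zeroes page-cluster:
    print(readahead_fully_disabled(page_cluster=0, vma_ra_max_order=3))
```

With page-cluster at 0 the fault reads only the faulted page, while a
setup that zeroes page-cluster but leaves vma_ra_max_order non-zero still
has readahead active, which is the case I am worried about.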