From: Alex Shi <alex.shi@linux.alibaba.com>
To: Hugh Dickins <hughd@google.com>
Cc: hannes@cmpxchg.org, Andrew Morton <akpm@linux-foundation.org>,
	cgroups@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, mgorman@techsingularity.net, tj@kernel.org,
	khlebnikov@yandex-team.ru, daniel.m.jordan@oracle.com,
	yang.shi@linux.alibaba.com, willy@infradead.org,
	shakeelb@google.com
Subject: Re: [PATCH v7 00/10] per lruvec lru_lock for memcg
Date: Mon, 13 Jan 2020 20:45:25 +0800
Message-ID: <24d671ac-36ef-8883-ad94-1bd497d46783@linux.alibaba.com>
In-Reply-To: <alpine.LSU.2.11.2001130032170.1103@eggly.anvils>



On 2020/1/13 4:48 PM, Hugh Dickins wrote:
> On Fri, 10 Jan 2020, Alex Shi wrote:
>> On 2020/1/2 6:21 PM, Alex Shi wrote:
>>> On 2020/1/1 7:05 AM, Andrew Morton wrote:
>>>> On Wed, 25 Dec 2019 17:04:16 +0800 Alex Shi <alex.shi@linux.alibaba.com> wrote:
>>>>
>>>>> This patchset moves lru_lock into lruvec, giving each lruvec its own
>>>>> lru_lock, and thus each memcg a per-node lru_lock.
>>>>
>>>> I see that there has been plenty of feedback on previous versions, but
>>>> no acked/reviewed tags as yet.
>>>>
>>>> I think I'll take a pass for now, see what the audience feedback looks
>>>> like ;)
>>>>
>>>
>>
>> Hi Johannes,
>>
>> Any comments on this version? :)
> 
> I (Hugh) tried to test it on v5.5-rc5, but did not get very far at all -
> perhaps because my particular interest tends towards tmpfs and swap,
> and swap always made trouble for lruvec lock - one of the reasons why
> our patches were more complicated than you thought necessary.
> 
> Booted a smallish kernel in mem=700M with 1.5G of swap, with intention
> of running small kernel builds in tmpfs and in ext4-on-loop-on-tmpfs
> (losetup was the last command started but I doubt it played much part):
> 
> mount -t tmpfs -o size=470M tmpfs /tst
> cp /dev/zero /tst
> losetup /dev/loop0 /tst/zero

Hi Hugh,

Many thanks for the testing!

I am trying to reproduce your test: I ran the above 3 steps and then built a kernel with 'make -j 8' in my qemu guest, but I cannot reproduce the problem with this v7 version, nor with the v8 version at https://github.com/alexshi/linux/tree/lru-next, which fixes the bug KK (Konstantin Khlebnikov) mentioned, via the commit_charge() change appended below.
My qemu guest session looks like this:

[root@debug010000002015 ~]# mount -t tmpfs -o size=470M tmpfs /tst
[root@debug010000002015 ~]# cp /dev/zero /tst
cp: error writing ‘/tst/zero’: No space left on device
cp: failed to extend ‘/tst/zero’: No space left on device
[root@debug010000002015 ~]# losetup /dev/loop0 /tst/zero
[root@debug010000002015 ~]# cat /proc/cmdline
earlyprintk=ttyS0 root=/dev/sda1 console=ttyS0 debug crashkernel=128M printk.devkmsg=on

My kernel is configured with MEMCG/MEMCG_SWAP on an xfs root image, and the kernel build runs on ext4. Could you share your kernel config and detailed reproduction steps with me? And would you mind trying my new version from the github link above, at your convenience?

Thanks a lot!
Alex 

 static void commit_charge(struct page *page, struct mem_cgroup *memcg,
                          bool lrucare)
 {
-       int isolated;
+       struct lruvec *lruvec = NULL;

        VM_BUG_ON_PAGE(page->mem_cgroup, page);

@@ -2612,8 +2617,16 @@ static void commit_charge(struct page *page, struct mem_cgroup *memcg,
         * In some cases, SwapCache and FUSE(splice_buf->radixtree), the page
         * may already be on some other mem_cgroup's LRU.  Take care of it.
         */
-       if (lrucare)
-               lock_page_lru(page, &isolated);
+       if (lrucare) {
+               lruvec = lock_page_lruvec_irq(page);
+               if (likely(PageLRU(page))) {
+                       ClearPageLRU(page);
+                       del_page_from_lru_list(page, lruvec, page_lru(page));
+               } else {
+                       unlock_page_lruvec_irq(lruvec);
+                       lruvec = NULL;
+               }
+       }

        /*
         * Nobody should be changing or seriously looking at
@@ -2631,8 +2644,15 @@ static void commit_charge(struct page *page, struct mem_cgroup *memcg,
         */
        page->mem_cgroup = memcg;

-       if (lrucare)
-               unlock_page_lru(page, isolated);
+       if (lrucare && lruvec) {
+               unlock_page_lruvec_irq(lruvec);
+               lruvec = lock_page_lruvec_irq(page);
+
+               VM_BUG_ON_PAGE(PageLRU(page), page);
+               SetPageLRU(page);
+               add_page_to_lru_list(page, lruvec, page_lru(page));
+               unlock_page_lruvec_irq(lruvec);
+       }
 }
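
For reference, the lock helpers used in the diff above behave roughly as
sketched here; this is only my shorthand for the intended behaviour, not
the exact patch code (mem_cgroup_page_lruvec() is the existing lookup
helper, and the lru_lock field is what this patchset moves into the lruvec):

static inline struct lruvec *lock_page_lruvec_irq(struct page *page)
{
	struct lruvec *lruvec;

	/* Find the lruvec of the memcg this page is charged to. */
	lruvec = mem_cgroup_page_lruvec(page, page_pgdat(page));
	spin_lock_irq(&lruvec->lru_lock);
	return lruvec;
}

static inline void unlock_page_lruvec_irq(struct lruvec *lruvec)
{
	spin_unlock_irq(&lruvec->lru_lock);
}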
> 
> and kernel crashed on the
> 
> VM_BUG_ON_PAGE(lruvec_memcg(lruvec) != page->mem_cgroup, page);
> kernel BUG at mm/memcontrol.c:1268!
> lock_page_lruvec_irqsave
> relock_page_lruvec_irqsave
> pagevec_lru_move_fn
> __pagevec_lru_add
> lru_add_drain_cpu
> lru_add_drain
> swap_cluster_readahead
> shmem_swapin
> shmem_swapin_page
> shmem_getpage_gfp
> shmem_getpage
> shmem_write_begin
> generic_perform_write
> __generic_file_write_iter
> generic_file_write_iter
> new_sync_write
> __vfs_write
> vfs_write
> ksys_write
> __x64_sys_write
> do_syscall_64
> 
> Hugh
> 
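
For context on the crash above: going by the function names in the trace,
the relock path and the debug check look roughly like the sketch below
(not the exact v7 code). The VM_BUG_ON fires when page->mem_cgroup changes
between looking up the lruvec and taking its lock, which the swapin
readahead path can apparently trigger:

static inline struct lruvec *lock_page_lruvec_irqsave(struct page *page,
		unsigned long *flags)
{
	struct lruvec *lruvec;

	lruvec = mem_cgroup_page_lruvec(page, page_pgdat(page));
	spin_lock_irqsave(&lruvec->lru_lock, *flags);
	/* The debug check at mm/memcontrol.c:1268 in the trace: */
	VM_BUG_ON_PAGE(lruvec_memcg(lruvec) != page->mem_cgroup, page);
	return lruvec;
}

static inline struct lruvec *relock_page_lruvec_irqsave(struct page *page,
		struct lruvec *locked_lruvec, unsigned long *flags)
{
	/* Keep the current lock if the page is still in the same lruvec. */
	if (locked_lruvec) {
		if (page_pgdat(page) == lruvec_pgdat(locked_lruvec) &&
		    lruvec_memcg(locked_lruvec) == page->mem_cgroup)
			return locked_lruvec;

		spin_unlock_irqrestore(&locked_lruvec->lru_lock, *flags);
	}
	return lock_page_lruvec_irqsave(page, flags);
}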


Thread overview: 29+ messages
2019-12-25  9:04 [PATCH v7 00/10] per lruvec lru_lock for memcg Alex Shi
2019-12-25  9:04 ` [PATCH v7 01/10] mm/vmscan: remove unnecessary lruvec adding Alex Shi
2020-01-10  8:39   ` Konstantin Khlebnikov
2020-01-13  7:21     ` Alex Shi
2019-12-25  9:04 ` [PATCH v7 02/10] mm/memcg: fold lru_lock in lock_page_lru Alex Shi
2020-01-10  8:49   ` Konstantin Khlebnikov
2020-01-13  9:45     ` Alex Shi
2020-01-13  9:55       ` Konstantin Khlebnikov
2020-01-13 12:47         ` Alex Shi
2020-01-13 16:34           ` Matthew Wilcox
2020-01-14  9:20             ` Alex Shi
2019-12-25  9:04 ` [PATCH v7 03/10] mm/lru: replace pgdat lru_lock with lruvec lock Alex Shi
2020-01-13 15:41   ` Daniel Jordan
2020-01-14  6:33     ` Alex Shi
2019-12-25  9:04 ` [PATCH v7 04/10] mm/lru: introduce the relock_page_lruvec function Alex Shi
2019-12-25  9:04 ` [PATCH v7 05/10] mm/mlock: optimize munlock_pagevec by relocking Alex Shi
2019-12-25  9:04 ` [PATCH v7 06/10] mm/swap: only change the lru_lock iff page's lruvec is different Alex Shi
2019-12-25  9:04 ` [PATCH v7 07/10] mm/pgdat: remove pgdat lru_lock Alex Shi
2019-12-25  9:04 ` [PATCH v7 08/10] mm/lru: revise the comments of lru_lock Alex Shi
2019-12-25  9:04 ` [PATCH v7 09/10] mm/lru: add debug checking for page memcg moving Alex Shi
2019-12-25  9:04 ` [PATCH v7 10/10] mm/memcg: add debug checking in lock_page_memcg Alex Shi
2019-12-31 23:05 ` [PATCH v7 00/10] per lruvec lru_lock for memcg Andrew Morton
2020-01-02 10:21   ` Alex Shi
2020-01-10  2:01     ` Alex Shi
2020-01-13  8:48       ` Hugh Dickins
2020-01-13 12:45         ` Alex Shi [this message]
2020-01-13 20:20           ` Hugh Dickins
2020-01-14  9:14             ` Alex Shi
2020-01-14  9:29               ` Alex Shi
