From: Yang Shi
Date: Wed, 16 Jan 2019 13:08:35 -0800
Subject: Re: memory cgroup pagecache and inode problem
To: Michal Hocko
Cc: Fam Zheng, cgroups@vger.kernel.org, Linux MM, tj@kernel.org,
    Johannes Weiner, lizefan@huawei.com, Vladimir Davydov,
    duanxiongchun@bytedance.com, 张永肃, liuxiaozhou@bytedance.com
In-Reply-To: <20190116070614.GG24149@dhcp22.suse.cz>
References: <15614FDC-198E-449B-BFAF-B00D6EF61155@bytedance.com>
    <97A4C2CA-97BA-46DB-964A-E44410BB1730@bytedance.com>
    <9B56B884-8FDD-4BB5-A6CA-AD7F84397039@bytedance.com>
    <20190116070614.GG24149@dhcp22.suse.cz>
Sender: owner-linux-mm@kvack.org

On Tue, Jan 15, 2019 at 11:06 PM Michal Hocko wrote:
>
> On Wed 16-01-19 11:52:08, Fam Zheng wrote:
> [...]
> > > This is what force_empty is supposed to do. But, as your test shows,
> > > some page cache may still remain after force_empty, which then causes
> > > offline memcgs to accumulate. I haven't figured out what happened.
> > > You may try what Michal suggested.
> >
> > None of the existing patches helped so far, but we suspect that the
> > pages cannot be locked at the force_empty moment. We have been
> > working on a “retry” patch which does solve the problem.
> > We’ll do more tracing (to have a better understanding of the issue) and
> > post the findings and/or the patch later. Thanks.
>
> Just for the record: there was a patch to remove the
> MEM_CGROUP_RECLAIM_RETRIES restriction in the path. I cannot find the
> link right now, but that is something we certainly can do. The context is
> interruptible by signal and, from my experience, any retry count can

Do you mean this one: https://lore.kernel.org/patchwork/patch/865835/ ?
I think removing retries is feasible as long as exit is handled correctly.

Yang

> lead to unexpected failures. But I guess you really want to check
> vmscan tracepoints to see why you cannot reclaim pages on memcg LRUs
> first.
> --
> Michal Hocko
> SUSE Labs
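
As a toy illustration of the point above, that the retry count can go away as long as exit is handled correctly, here is a minimal Python sketch. It is not the kernel code: `drain_until_empty` and its three callbacks are hypothetical names made up for this example, and `interrupted` only stands in for a pending-signal check.

```python
from typing import Callable


def drain_until_empty(
    usage: Callable[[], int],
    reclaim_once: Callable[[], int],
    interrupted: Callable[[], bool],
) -> bool:
    """Reclaim until usage() reaches zero or the caller is interrupted.

    There is deliberately no retry counter: a pass that frees nothing
    does not end the loop; only an empty group or an interruption does.
    Returns True if fully drained, False if interrupted first.
    """
    while usage() > 0:
        if interrupted():
            # Assumption: this models bailing out on a pending signal.
            return False
        # A pass may free zero pages, e.g. while pages are still locked.
        reclaim_once()
    return True
```

With no counter, a few passes that make no progress no longer turn into a spurious failure, and the interruption check is the only thing that keeps the loop from spinning forever, which is why the exit path has to be right.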