From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755958AbcDDOqi (ORCPT ); Mon, 4 Apr 2016 10:46:38 -0400 Received: from mx2.suse.de ([195.135.220.15]:55220 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755760AbcDDOqh (ORCPT ); Mon, 4 Apr 2016 10:46:37 -0400 Subject: Re: [PATCH v3 01/16] mm: use put_page to free page instead of putback_lru_page To: Naoya Horiguchi , Minchan Kim References: <1459321935-3655-1-git-send-email-minchan@kernel.org> <1459321935-3655-2-git-send-email-minchan@kernel.org> <56FE706D.7080507@suse.cz> <20160404013917.GC6543@bbox> <20160404044458.GA20250@hori1.linux.bs1.fc.nec.co.jp> Cc: Andrew Morton , "linux-kernel@vger.kernel.org" , "linux-mm@kvack.org" , "jlayton@poochiereds.net" , "bfields@fieldses.org" , Joonsoo Kim , "koct9i@gmail.com" , "aquini@redhat.com" , "virtualization@lists.linux-foundation.org" , Mel Gorman , Hugh Dickins , Sergey Senozhatsky , Rik van Riel , "rknize@motorola.com" , Gioh Kim , Sangseok Lee , Chan Gyun Jeong , Al Viro , YiPing Xu From: Vlastimil Babka Message-ID: <57027E47.7070909@suse.cz> Date: Mon, 4 Apr 2016 16:46:31 +0200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.7.0 MIME-Version: 1.0 In-Reply-To: <20160404044458.GA20250@hori1.linux.bs1.fc.nec.co.jp> Content-Type: text/plain; charset=iso-2022-jp Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 04/04/2016 06:45 AM, Naoya Horiguchi wrote: > On Mon, Apr 04, 2016 at 10:39:17AM +0900, Minchan Kim wrote: >> Thanks for catching it, Vlastimil. >> It was my mistake. But in this chance, I looked over hwpoison code and >> I saw other places which increases num_poisoned_pages are successful >> migration, already freed page and successful invalidated page. >> IOW, they are already successful isolated page so I guess it should >> increase the count when only successful migration is done? > > Yes, that's right. When exiting with migration's failure, we shouldn't call > test_set_page_hwpoison or num_poisoned_pages_inc, so current code checking > (rc != -EAGAIN) is simply incorrect. Your change fixes the bug in memory > error handling. Great! Ah, I see, soft onlining works differently than I thought. >> And when I read memory_failure, it bails out without killing if it >> encounters HWPoisoned page so I think it's not for catching and >> kill the poor proces. >> >>> >>> Also (but not your fault) the put_page() preceding >>> test_set_page_hwpoison(page)) IMHO deserves a comment saying which >>> pin we are releasing and which one we still have (hopefully? if I >>> read description of da1b13ccfbebe right) otherwise it looks like >>> doing something with a page that we just potentially freed. >> >> Yes, while I read the code, I had same question. I think the releasing >> refcount is for get_any_page. > > As the other callers of page migration do, soft_offline_page expects the > migration source page to be freed at this put_page() (no pin remains.) > The refcount released here is from isolate_lru_page() in __soft_offline_page(). > (the pin by get_any_page is released by put_hwpoison_page just after it.) > > .. yes, doing something just after freeing page looks weird, but that's > how PageHWPoison flag works. IOW, many other page flags are maintained > only during one "allocate-free" life span, but PageHWPoison still does > its job beyond it. But what prevents the page from being allocated again between put_page() and test_set_page_hwpoison()? In that case we would be marking page poisoned while still in use, which is the same as marking it while still in use after a failed migration? (Also, which part prevents pages with PageHWPoison to be allocated again, anyway? I can't find it and test_set_page_hwpoison() doesn't remove from buddy freelists). Thanks. > As for commenting, this put_page() is called in any MIGRATEPAGE_SUCCESS > case (regardless of callers), so what we can say here is "we free the > source page here, bypassing LRU list" or something? > > Thanks, > Naoya Horiguchi >