From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1751255AbcFCBBB (ORCPT ); Thu, 2 Jun 2016 21:01:01 -0400
Received: from mail-pa0-f67.google.com ([209.85.220.67]:33013 "EHLO
        mail-pa0-f67.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1750905AbcFCBA6 (ORCPT ); Thu, 2 Jun 2016 21:00:58 -0400
Date: Fri, 3 Jun 2016 10:00:36 +0900
From: Sergey Senozhatsky
To: Ebru Akagunduz
Cc: Vlastimil Babka, sergey.senozhatsky.work@gmail.com, Andrew Morton,
        Michal Hocko, "Kirill A. Shutemov", Stephen Rothwell,
        Andrea Arcangeli, Rik van Riel, linux-mm@kvack.org,
        linux-next@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [linux-next: Tree for Jun 1] __khugepaged_exit rwsem_down_write_failed lockup
Message-ID: <20160603010036.GA464@swordfish>
References: <20160601131122.7dbb0a65@canb.auug.org.au>
        <20160602014835.GA635@swordfish>
        <0c47a3a0-5530-b257-1c1f-28ed44ba97e6@suse.cz>
        <20160602185856.GA3854@debian>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20160602185856.GA3854@debian>
User-Agent: Mutt/1.6.1 (2016-04-27)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On (06/02/16 21:58), Ebru Akagunduz wrote:
[..]
> > I think it's this patch:
> >
> > http://ozlabs.org/~akpm/mmots/broken-out/mm-thp-make-swapin-readahead-under-down_read-of-mmap_sem.patch
> >
> > Some parts of the code in collapse_huge_page() that were under
> > down_write(mmap_sem) are under down_read() after the patch. But
> > there's "goto out" which continues via "goto out_up_write" which
> > does up_write(mmap_sem) so there's an imbalance. One path seems to
> > go via both up_read() and up_write(). I can imagine this can cause a
> > stuck down_write() among other things?
> Recently, I realized the same imbalance, it is an obvious
> inconsistency. I don't know, this issue can be related with
> mine.
> I'll send a fix patch.

a good find by Vlastimil.

Ebru, can you also re-visit __collapse_huge_page_swapin()? it's called
from collapse_huge_page() under the down_read(&mm->mmap_sem), is there
any reason to do the nested down_read(&mm->mmap_sem)?

collapse_huge_page()
...
	down_read(&mm->mmap_sem);
	result = hugepage_vma_revalidate(mm, vma, address);
	if (result)
		goto out;

	pmd = mm_find_pmd(mm, address);
	if (!pmd) {
		result = SCAN_PMD_NULL;
		goto out;
	}

	if (allocstall == curr_allocstall && swap != 0) {
		if (!__collapse_huge_page_swapin(mm, vma, address, pmd)) {

			{
			:	if (ret & VM_FAULT_RETRY) {
			:		down_read(&mm->mmap_sem);
			:		^^^^^^^^^
			:		if (hugepage_vma_revalidate(mm, vma, address))
			:			return false;
			:	}
			}

			up_read(&mm->mmap_sem);
			goto out;
		}
	}

	up_read(&mm->mmap_sem);

so if __collapse_huge_page_swapin() returns true we have:
- down_read() twice, up_read() once?

the locking rules here are a bit confusing. (I didn't have my morning
coffee yet).

	-ss
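
To make the counting above concrete, here is a small user-space model of
the quoted path. It is only a sketch, not kernel code: struct model_sem
and every model_* function are hypothetical names, and the semaphore is
reduced to a counter of read holds. Whether the fault path has already
dropped mmap_sem by the time it reports VM_FAULT_RETRY is treated as an
assumption and modeled by the retry_drops_lock flag, since the excerpt
does not show it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for mm->mmap_sem: it only counts read holds. */
struct model_sem {
	int readers;
};

static void model_down_read(struct model_sem *s)
{
	s->readers++;
}

static void model_up_read(struct model_sem *s)
{
	assert(s->readers > 0);	/* releasing a lock that is not held */
	s->readers--;
}

/*
 * Models the inlined part of __collapse_huge_page_swapin() from the
 * excerpt.  retry_drops_lock is an ASSUMPTION: it says whether the
 * fault path released the read lock before reporting a retry.
 */
static bool model_swapin(struct model_sem *s, bool fault_retry,
			 bool retry_drops_lock, bool revalidate_fails)
{
	if (fault_retry) {
		if (retry_drops_lock)
			model_up_read(s);
		model_down_read(s);	/* the down_read() marked ^^^ above */
		if (revalidate_fails)
			return false;
	}
	return true;
}

/* Returns the number of read holds still outstanding after the path. */
static int model_collapse_path(bool fault_retry, bool retry_drops_lock,
			       bool revalidate_fails)
{
	struct model_sem sem = { 0 };

	model_down_read(&sem);
	if (!model_swapin(&sem, fault_retry, retry_drops_lock,
			  revalidate_fails)) {
		model_up_read(&sem);
		return sem.readers;	/* the "goto out" branch */
	}
	model_up_read(&sem);
	return sem.readers;
}
```

In this model, model_collapse_path(true, false, false) returns 1, which
is the "down_read() twice, up_read() once" case. With retry_drops_lock
set to true the same call returns 0, so the nested down_read() is only
balanced if the retry path released the lock first.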