From: Mel Gorman <mgorman@techsingularity.net> To: Andrew Morton <akpm@linux-foundation.org> Cc: Jerome Marchand <jmarchan@redhat.com>, Trond Myklebust <trond.myklebust@primarydata.com>, Anna Schumaker <anna.schumaker@netapp.com>, Hugh Dickins <hughd@google.com>, Linux NFS Mailing List <linux-nfs@vger.kernel.org>, Linux Kernel Mailing List <linux-kernel@vger.kernel.org>, Linux-MM <linux-mm@kvack.org> Subject: [PATCH] mm: swap: Use swap_lock to prevent parallel swapon activations instead of i_mutex Date: Tue, 29 Sep 2015 15:23:47 +0100 [thread overview] Message-ID: <20150929142347.GK3068@techsingularity.net> (raw) Jerome Marchand reported a lockdep warning as follows [ 6819.501009] ================================= [ 6819.501009] [ INFO: inconsistent lock state ] [ 6819.501009] 4.2.0-rc1-shmacct-babka-v2-next-20150709+ #255 Not tainted [ 6819.501009] --------------------------------- [ 6819.501009] inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage. [ 6819.501009] kswapd0/38 [HC0[0]:SC0[0]:HE1:SE1] takes: [ 6819.501009] (&sb->s_type->i_mutex_key#17){+.+.?.}, at: [<ffffffffa03772a5>] nfs_file_direct_write+0x85/0x3f0 [nfs] [ 6819.501009] {RECLAIM_FS-ON-W} state was registered at: [ 6819.501009] [<ffffffff81107f51>] mark_held_locks+0x71/0x90 [ 6819.501009] [<ffffffff8110b775>] lockdep_trace_alloc+0x75/0xe0 [ 6819.501009] [<ffffffff81245529>] kmem_cache_alloc_node_trace+0x39/0x440 [ 6819.501009] [<ffffffff81225b8f>] __get_vm_area_node+0x7f/0x160 [ 6819.501009] [<ffffffff81226eb2>] __vmalloc_node_range+0x72/0x2c0 [ 6819.501009] [<ffffffff81227424>] vzalloc+0x54/0x60 [ 6819.501009] [<ffffffff8122c7c8>] SyS_swapon+0x628/0xfc0 [ 6819.501009] [<ffffffff81867772>] entry_SYSCALL_64_fastpath+0x12/0x76 It's due to NFS acquiring i_mutex since a9ab5e840669 ("nfs: page cache invalidation for dio") to invalidate page cache before direct I/O. Filesystems may safely acquire i_mutex during direct writes but NFS is unique in its treatment of swap files. Ordinarily swap files are supported by the core VM looking up the physical block for a given offset in advance. There is no physical block for NFS and the direct write paths are used after calling mapping->swap_activate. The lockdep warning is triggered by swapon(), which is not in reclaim context, acquiring the i_mutex to ensure a swapfile is not activated twice. swapon does not need the i_mutex for this purpose. There is a requirement that fallocate not be used on swapfiles but this is protected by the inode flag S_SWAPFILE and nothing to do with i_mutex. In fact, the current protection does nothing for block devices. This patch expands the role of swap_lock to protect against parallel activations of block devices and swapfiles and removes the use of i_mutex. This both improves the protection for swapon and avoids the lockdep warning. Reported-and-tested-by: Jerome Marchand <jmarchan@redhat.com> Signed-off-by: Mel Gorman <mgorman@techsingularity.net> --- mm/swapfile.c | 14 ++++++-------- 1 file changed, 6 insertions(+), 8 deletions(-) diff --git a/mm/swapfile.c b/mm/swapfile.c index 58877312cf6b..e55a69fd24e4 100644 --- a/mm/swapfile.c +++ b/mm/swapfile.c @@ -1970,9 +1970,9 @@ SYSCALL_DEFINE1(swapoff, const char __user *, specialfile) set_blocksize(bdev, old_block_size); blkdev_put(bdev, FMODE_READ | FMODE_WRITE | FMODE_EXCL); } else { - mutex_lock(&inode->i_mutex); + spin_lock(&swap_lock); inode->i_flags &= ~S_SWAPFILE; - mutex_unlock(&inode->i_mutex); + spin_unlock(&swap_lock); } filp_close(swap_file, NULL); @@ -2197,7 +2197,6 @@ static int claim_swapfile(struct swap_info_struct *p, struct inode *inode) p->flags |= SWP_BLKDEV; } else if (S_ISREG(inode->i_mode)) { p->bdev = inode->i_sb->s_bdev; - mutex_lock(&inode->i_mutex); if (IS_SWAPFILE(inode)) return -EBUSY; } else @@ -2426,12 +2425,15 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags) goto bad_swap; } + /* prevent parallel swapons */ + spin_lock(&swap_lock); p->swap_file = swap_file; mapping = swap_file->f_mapping; inode = mapping->host; /* If S_ISREG(inode->i_mode) will do mutex_lock(&inode->i_mutex); */ error = claim_swapfile(p, inode); + spin_unlock(&swap_lock); if (unlikely(error)) goto bad_swap; @@ -2574,10 +2576,8 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags) vfree(swap_map); vfree(cluster_info); if (swap_file) { - if (inode && S_ISREG(inode->i_mode)) { - mutex_unlock(&inode->i_mutex); + if (inode && S_ISREG(inode->i_mode)) inode = NULL; - } filp_close(swap_file, NULL); } out: @@ -2587,8 +2587,6 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags) } if (name) putname(name); - if (inode && S_ISREG(inode->i_mode)) - mutex_unlock(&inode->i_mutex); return error; }
WARNING: multiple messages have this Message-ID (diff)
From: Mel Gorman <mgorman@techsingularity.net> To: Andrew Morton <akpm@linux-foundation.org> Cc: Jerome Marchand <jmarchan@redhat.com>, Trond Myklebust <trond.myklebust@primarydata.com>, Anna Schumaker <anna.schumaker@netapp.com>, Hugh Dickins <hughd@google.com>, Linux NFS Mailing List <linux-nfs@vger.kernel.org>, Linux Kernel Mailing List <linux-kernel@vger.kernel.org>, Linux-MM <linux-mm@kvack.org> Subject: [PATCH] mm: swap: Use swap_lock to prevent parallel swapon activations instead of i_mutex Date: Tue, 29 Sep 2015 15:23:47 +0100 [thread overview] Message-ID: <20150929142347.GK3068@techsingularity.net> (raw) Jerome Marchand reported a lockdep warning as follows [ 6819.501009] ================================= [ 6819.501009] [ INFO: inconsistent lock state ] [ 6819.501009] 4.2.0-rc1-shmacct-babka-v2-next-20150709+ #255 Not tainted [ 6819.501009] --------------------------------- [ 6819.501009] inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage. [ 6819.501009] kswapd0/38 [HC0[0]:SC0[0]:HE1:SE1] takes: [ 6819.501009] (&sb->s_type->i_mutex_key#17){+.+.?.}, at: [<ffffffffa03772a5>] nfs_file_direct_write+0x85/0x3f0 [nfs] [ 6819.501009] {RECLAIM_FS-ON-W} state was registered at: [ 6819.501009] [<ffffffff81107f51>] mark_held_locks+0x71/0x90 [ 6819.501009] [<ffffffff8110b775>] lockdep_trace_alloc+0x75/0xe0 [ 6819.501009] [<ffffffff81245529>] kmem_cache_alloc_node_trace+0x39/0x440 [ 6819.501009] [<ffffffff81225b8f>] __get_vm_area_node+0x7f/0x160 [ 6819.501009] [<ffffffff81226eb2>] __vmalloc_node_range+0x72/0x2c0 [ 6819.501009] [<ffffffff81227424>] vzalloc+0x54/0x60 [ 6819.501009] [<ffffffff8122c7c8>] SyS_swapon+0x628/0xfc0 [ 6819.501009] [<ffffffff81867772>] entry_SYSCALL_64_fastpath+0x12/0x76 It's due to NFS acquiring i_mutex since a9ab5e840669 ("nfs: page cache invalidation for dio") to invalidate page cache before direct I/O. Filesystems may safely acquire i_mutex during direct writes but NFS is unique in its treatment of swap files. Ordinarily swap files are supported by the core VM looking up the physical block for a given offset in advance. There is no physical block for NFS and the direct write paths are used after calling mapping->swap_activate. The lockdep warning is triggered by swapon(), which is not in reclaim context, acquiring the i_mutex to ensure a swapfile is not activated twice. swapon does not need the i_mutex for this purpose. There is a requirement that fallocate not be used on swapfiles but this is protected by the inode flag S_SWAPFILE and nothing to do with i_mutex. In fact, the current protection does nothing for block devices. This patch expands the role of swap_lock to protect against parallel activations of block devices and swapfiles and removes the use of i_mutex. This both improves the protection for swapon and avoids the lockdep warning. Reported-and-tested-by: Jerome Marchand <jmarchan@redhat.com> Signed-off-by: Mel Gorman <mgorman@techsingularity.net> --- mm/swapfile.c | 14 ++++++-------- 1 file changed, 6 insertions(+), 8 deletions(-) diff --git a/mm/swapfile.c b/mm/swapfile.c index 58877312cf6b..e55a69fd24e4 100644 --- a/mm/swapfile.c +++ b/mm/swapfile.c @@ -1970,9 +1970,9 @@ SYSCALL_DEFINE1(swapoff, const char __user *, specialfile) set_blocksize(bdev, old_block_size); blkdev_put(bdev, FMODE_READ | FMODE_WRITE | FMODE_EXCL); } else { - mutex_lock(&inode->i_mutex); + spin_lock(&swap_lock); inode->i_flags &= ~S_SWAPFILE; - mutex_unlock(&inode->i_mutex); + spin_unlock(&swap_lock); } filp_close(swap_file, NULL); @@ -2197,7 +2197,6 @@ static int claim_swapfile(struct swap_info_struct *p, struct inode *inode) p->flags |= SWP_BLKDEV; } else if (S_ISREG(inode->i_mode)) { p->bdev = inode->i_sb->s_bdev; - mutex_lock(&inode->i_mutex); if (IS_SWAPFILE(inode)) return -EBUSY; } else @@ -2426,12 +2425,15 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags) goto bad_swap; } + /* prevent parallel swapons */ + spin_lock(&swap_lock); p->swap_file = swap_file; mapping = swap_file->f_mapping; inode = mapping->host; /* If S_ISREG(inode->i_mode) will do mutex_lock(&inode->i_mutex); */ error = claim_swapfile(p, inode); + spin_unlock(&swap_lock); if (unlikely(error)) goto bad_swap; @@ -2574,10 +2576,8 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags) vfree(swap_map); vfree(cluster_info); if (swap_file) { - if (inode && S_ISREG(inode->i_mode)) { - mutex_unlock(&inode->i_mutex); + if (inode && S_ISREG(inode->i_mode)) inode = NULL; - } filp_close(swap_file, NULL); } out: @@ -2587,8 +2587,6 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags) } if (name) putname(name); - if (inode && S_ISREG(inode->i_mode)) - mutex_unlock(&inode->i_mutex); return error; } -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next reply other threads:[~2015-09-29 14:24 UTC|newest] Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top 2015-09-29 14:23 Mel Gorman [this message] 2015-09-29 14:23 ` [PATCH] mm: swap: Use swap_lock to prevent parallel swapon activations instead of i_mutex Mel Gorman 2015-09-30 21:15 ` Andrew Morton 2015-09-30 21:15 ` Andrew Morton
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=20150929142347.GK3068@techsingularity.net \ --to=mgorman@techsingularity.net \ --cc=akpm@linux-foundation.org \ --cc=anna.schumaker@netapp.com \ --cc=hughd@google.com \ --cc=jmarchan@redhat.com \ --cc=linux-kernel@vger.kernel.org \ --cc=linux-mm@kvack.org \ --cc=linux-nfs@vger.kernel.org \ --cc=trond.myklebust@primarydata.com \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.