From: Nick Piggin <npiggin@suse.de> To: Andrew Morton <akpm@linux-foundation.org>, linux-fsdevel@vger.kernel.org, Linux Memory Management List <linux-mm@kvack.org>, Chris Mason <chris.mason@oracle.com> Subject: [patch 2/2] fs: fix page_mkwrite error cases in core code and btrfs Date: Wed, 11 Mar 2009 04:55:03 +0100 [thread overview] Message-ID: <20090311035503.GI16561@wotan.suse.de> (raw) In-Reply-To: <20090311035318.GH16561@wotan.suse.de> page_mkwrite is called with neither the page lock nor the ptl held. This means a page can be concurrently truncated or invalidated out from underneath it. Callers are supposed to prevent truncate races themselves, however previously the only thing they can do in case they hit one is to raise a SIGBUS. A sigbus is wrong for the case that the page has been invalidated or truncated within i_size (eg. hole punched). Callers may also have to perform memory allocations in this path, where again, SIGBUS would be wrong. The previous patch made it possible to properly specify errors. Convert the generic buffer.c code and btrfs to return sane error values (in the case of page removed from pagecache, VM_FAULT_NOPAGE will cause the fault handler to exit without doing anything, and the fault will be retried properly). This fixes core code, and converts btrfs as a template/example. All other filesystems defining their own page_mkwrite should be fixed in a similar manner. Acked-by: Chris Mason <chris.mason@oracle.com> Signed-off-by: Nick Piggin <npiggin@suse.de> --- fs/btrfs/inode.c | 11 +++++++---- fs/buffer.c | 12 ++++++++---- 2 files changed, 15 insertions(+), 8 deletions(-) Index: linux-2.6/fs/btrfs/inode.c =================================================================== --- linux-2.6.orig/fs/btrfs/inode.c +++ linux-2.6/fs/btrfs/inode.c @@ -4307,10 +4307,15 @@ int btrfs_page_mkwrite(struct vm_area_st u64 page_end; ret = btrfs_check_data_free_space(root, inode, PAGE_CACHE_SIZE); - if (ret) + if (ret) { + if (ret == -ENOMEM) + ret = VM_FAULT_OOM; + else /* -ENOSPC, -EIO, etc */ + ret = VM_FAULT_SIGBUS; goto out; + } - ret = -EINVAL; + ret = VM_FAULT_NOPAGE; /* make the VM retry the fault */ again: lock_page(page); size = i_size_read(inode); @@ -4363,8 +4368,6 @@ again: out_unlock: unlock_page(page); out: - if (ret) - ret = VM_FAULT_SIGBUS; return ret; } Index: linux-2.6/fs/buffer.c =================================================================== --- linux-2.6.orig/fs/buffer.c +++ linux-2.6/fs/buffer.c @@ -2473,7 +2473,7 @@ block_page_mkwrite(struct vm_area_struct struct inode *inode = vma->vm_file->f_path.dentry->d_inode; unsigned long end; loff_t size; - int ret = -EINVAL; + int ret = VM_FAULT_NOPAGE; /* make the VM retry the fault */ lock_page(page); size = i_size_read(inode); @@ -2493,10 +2493,14 @@ block_page_mkwrite(struct vm_area_struct if (!ret) ret = block_commit_write(page, 0, end); -out_unlock: - if (ret) - ret = VM_FAULT_SIGBUS; + if (unlikely(ret)) { + if (ret == -ENOMEM) + ret = VM_FAULT_OOM; + else /* -ENOSPC, -EIO, etc */ + ret = VM_FAULT_SIGBUS; + } +out_unlock: unlock_page(page); return ret; }
WARNING: multiple messages have this Message-ID (diff)
From: Nick Piggin <npiggin@suse.de> To: Andrew Morton <akpm@linux-foundation.org>, linux-fsdevel@vger.kernel.org, Linux Memory Management List <linux-mm@kvack.org>, Chris Mason <chris.mason@oracle.com> Subject: [patch 2/2] fs: fix page_mkwrite error cases in core code and btrfs Date: Wed, 11 Mar 2009 04:55:03 +0100 [thread overview] Message-ID: <20090311035503.GI16561@wotan.suse.de> (raw) In-Reply-To: <20090311035318.GH16561@wotan.suse.de> page_mkwrite is called with neither the page lock nor the ptl held. This means a page can be concurrently truncated or invalidated out from underneath it. Callers are supposed to prevent truncate races themselves, however previously the only thing they can do in case they hit one is to raise a SIGBUS. A sigbus is wrong for the case that the page has been invalidated or truncated within i_size (eg. hole punched). Callers may also have to perform memory allocations in this path, where again, SIGBUS would be wrong. The previous patch made it possible to properly specify errors. Convert the generic buffer.c code and btrfs to return sane error values (in the case of page removed from pagecache, VM_FAULT_NOPAGE will cause the fault handler to exit without doing anything, and the fault will be retried properly). This fixes core code, and converts btrfs as a template/example. All other filesystems defining their own page_mkwrite should be fixed in a similar manner. Acked-by: Chris Mason <chris.mason@oracle.com> Signed-off-by: Nick Piggin <npiggin@suse.de> --- fs/btrfs/inode.c | 11 +++++++---- fs/buffer.c | 12 ++++++++---- 2 files changed, 15 insertions(+), 8 deletions(-) Index: linux-2.6/fs/btrfs/inode.c =================================================================== --- linux-2.6.orig/fs/btrfs/inode.c +++ linux-2.6/fs/btrfs/inode.c @@ -4307,10 +4307,15 @@ int btrfs_page_mkwrite(struct vm_area_st u64 page_end; ret = btrfs_check_data_free_space(root, inode, PAGE_CACHE_SIZE); - if (ret) + if (ret) { + if (ret == -ENOMEM) + ret = VM_FAULT_OOM; + else /* -ENOSPC, -EIO, etc */ + ret = VM_FAULT_SIGBUS; goto out; + } - ret = -EINVAL; + ret = VM_FAULT_NOPAGE; /* make the VM retry the fault */ again: lock_page(page); size = i_size_read(inode); @@ -4363,8 +4368,6 @@ again: out_unlock: unlock_page(page); out: - if (ret) - ret = VM_FAULT_SIGBUS; return ret; } Index: linux-2.6/fs/buffer.c =================================================================== --- linux-2.6.orig/fs/buffer.c +++ linux-2.6/fs/buffer.c @@ -2473,7 +2473,7 @@ block_page_mkwrite(struct vm_area_struct struct inode *inode = vma->vm_file->f_path.dentry->d_inode; unsigned long end; loff_t size; - int ret = -EINVAL; + int ret = VM_FAULT_NOPAGE; /* make the VM retry the fault */ lock_page(page); size = i_size_read(inode); @@ -2493,10 +2493,14 @@ block_page_mkwrite(struct vm_area_struct if (!ret) ret = block_commit_write(page, 0, end); -out_unlock: - if (ret) - ret = VM_FAULT_SIGBUS; + if (unlikely(ret)) { + if (ret == -ENOMEM) + ret = VM_FAULT_OOM; + else /* -ENOSPC, -EIO, etc */ + ret = VM_FAULT_SIGBUS; + } +out_unlock: unlock_page(page); return ret; } -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2009-03-11 3:55 UTC|newest] Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top 2009-03-11 3:53 [patch 1/2] mm: page_mkwrite change prototype to match fault Nick Piggin 2009-03-11 3:55 ` Nick Piggin [this message] 2009-03-11 3:55 ` [patch 2/2] fs: fix page_mkwrite error cases in core code and btrfs Nick Piggin 2009-03-12 22:08 ` Trond Myklebust 2009-03-12 22:08 ` Trond Myklebust 2009-03-12 23:03 ` Sage Weil 2009-03-12 23:03 ` Sage Weil 2009-03-13 2:20 ` Nick Piggin 2009-03-13 2:20 ` Nick Piggin 2009-03-13 3:21 ` Sage Weil
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=20090311035503.GI16561@wotan.suse.de \ --to=npiggin@suse.de \ --cc=akpm@linux-foundation.org \ --cc=chris.mason@oracle.com \ --cc=linux-fsdevel@vger.kernel.org \ --cc=linux-mm@kvack.org \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.