* [PATCH] block: replace __getblk_slow misfix by grow_dev_page fix @ 2012-08-22 1:33 Hugh Dickins 2012-08-22 9:12 ` Richard W.M. Jones 2012-08-22 11:52 ` Richard W.M. Jones 0 siblings, 2 replies; 10+ messages in thread From: Hugh Dickins @ 2012-08-22 1:33 UTC (permalink / raw) To: Jeff Moyer Cc: Andrew Morton, Linus Torvalds, Jens Axboe, Torsten Hilbrich, Richard W.M. Jones, Josh Boyer, linux-kernel Jeff, Your commit 91f68c89d8f3 ("block: fix infinite loop in __getblk_slow"), already gone into 3.* stable, is not good. Could you and your testers please give this alternative a try - I think it should work, and have started it on a few days' memory load on 3.5, but not tried your case. But, see final paragraph of comment below, I'm worried whether all __getblk() users are actually ready for your new NULL case. Thanks, Hugh [PATCH] block: replace __getblk_slow misfix by grow_dev_page fix Commit 91f68c89d8f3 ("block: fix infinite loop in __getblk_slow") is not good: a successful call to grow_buffers() cannot guarantee that the page won't be reclaimed before the immediate next call to __find_get_block(), which is why there was always a loop there. Yesterday I got "EXT4-fs error (device loop0): __ext4_get_inode_loc:3595: inode #19278: block 664: comm cc1: unable to read itable block" on console, which pointed to this commit. I've been trying to bisect for weeks, why kbuild-on-ext4-on-loop-on-tmpfs sometimes fails from a missing header file, under memory pressure on ppc G5. I've never seen this on x86, and I've never seen it on 3.5-rc7 itself, despite that commit being in there: bisection pointed to an irrelevant pinctrl merge, but hard to tell when failure takes between 18 minutes and 38 hours (but so far it's happened quicker on 3.6-rc2). (I've since found such __ext4_get_inode_loc errors in /var/log/messages from previous weeks: why the message never appeared on console until yesterday morning is a mystery for another day.) Revert 91f68c89d8f3, restoring __getblk_slow() to how it was (plus a checkpatch nitfix). Avoid the infinite loop beyond end of device by instead checking buffer_mapped(bh) in grow_dev_page() (I presume that's more efficient than a repeated call to blkdev_max_block()), and returning -ENXIO to __getblk_slow() in that case. And remove akpm's ten-year-old "__getblk() cannot fail ... weird" comment, but that is worrying: are all users of __getblk() really now prepared for a NULL bh beyond end of device, or will some oops?? Signed-off-by: Hugh Dickins <hughd@google.com> Cc: stable@vger.kernel.org # 3.0 3.2 3.4 3.5 --- fs/buffer.c | 43 +++++++++++++++++++++---------------------- 1 file changed, 21 insertions(+), 22 deletions(-) --- 3.6-rc2/fs/buffer.c 2012-08-04 09:19:20.644022328 -0700 +++ linux/fs/buffer.c 2012-08-21 08:49:32.333908590 -0700 @@ -950,6 +950,7 @@ grow_dev_page(struct block_device *bdev, struct inode *inode = bdev->bd_inode; struct page *page; struct buffer_head *bh; + int err = 0; page = find_or_create_page(inode->i_mapping, index, (mapping_gfp_mask(inode->i_mapping) & ~__GFP_FS)|__GFP_MOVABLE); @@ -962,7 +963,11 @@ grow_dev_page(struct block_device *bdev, bh = page_buffers(page); if (bh->b_size == size) { init_page_buffers(page, bdev, block, size); - return page; + if (buffer_mapped(bh)) + return page; + + err = -ENXIO; + goto failed; } if (!try_to_free_buffers(page)) goto failed; @@ -984,12 +989,14 @@ grow_dev_page(struct block_device *bdev, link_dev_buffers(page, bh); init_page_buffers(page, bdev, block, size); spin_unlock(&inode->i_mapping->private_lock); - return page; + if (buffer_mapped(bh)) + return page; + err = -ENXIO; failed: unlock_page(page); page_cache_release(page); - return NULL; + return ERR_PTR(err); } /* @@ -1026,8 +1033,8 @@ grow_buffers(struct block_device *bdev, block = index << sizebits; /* Create a page with the proper size buffers.. */ page = grow_dev_page(bdev, block, index, size); - if (!page) - return 0; + if (IS_ERR_OR_NULL(page)) + return PTR_RET(page); unlock_page(page); page_cache_release(page); return 1; @@ -1036,9 +1043,6 @@ grow_buffers(struct block_device *bdev, static struct buffer_head * __getblk_slow(struct block_device *bdev, sector_t block, int size) { - int ret; - struct buffer_head *bh; - /* Size must be multiple of hard sectorsize */ if (unlikely(size & (bdev_logical_block_size(bdev)-1) || (size < 512 || size > PAGE_SIZE))) { @@ -1051,21 +1055,20 @@ __getblk_slow(struct block_device *bdev, return NULL; } -retry: - bh = __find_get_block(bdev, block, size); - if (bh) - return bh; + for (;;) { + struct buffer_head *bh; + int ret; - ret = grow_buffers(bdev, block, size); - if (ret == 0) { - free_more_memory(); - goto retry; - } else if (ret > 0) { bh = __find_get_block(bdev, block, size); if (bh) return bh; + + ret = grow_buffers(bdev, block, size); + if (ret < 0) + return NULL; + if (ret == 0) + free_more_memory(); } - return NULL; } /* @@ -1321,10 +1324,6 @@ EXPORT_SYMBOL(__find_get_block); * which corresponds to the passed block_device, block and size. The * returned buffer has its reference count incremented. * - * __getblk() cannot fail - it just keeps trying. If you pass it an - * illegal block number, __getblk() will happily return a buffer_head - * which represents the non-existent block. Very weird. - * * __getblk() will lock up the machine if grow_dev_page's try_to_free_buffers() * attempt is failing. FIXME, perhaps? */ ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] block: replace __getblk_slow misfix by grow_dev_page fix 2012-08-22 1:33 [PATCH] block: replace __getblk_slow misfix by grow_dev_page fix Hugh Dickins @ 2012-08-22 9:12 ` Richard W.M. Jones 2012-08-22 11:52 ` Richard W.M. Jones 1 sibling, 0 replies; 10+ messages in thread From: Richard W.M. Jones @ 2012-08-22 9:12 UTC (permalink / raw) To: Hugh Dickins Cc: Jeff Moyer, Andrew Morton, Linus Torvalds, Jens Axboe, Torsten Hilbrich, Josh Boyer, linux-kernel On Tue, Aug 21, 2012 at 06:33:45PM -0700, Hugh Dickins wrote: > Jeff, > > Your commit 91f68c89d8f3 ("block: fix infinite loop in __getblk_slow"), > already gone into 3.* stable, is not good. Could you and your testers > please give this alternative a try - I think it should work, and have > started it on a few days' memory load on 3.5, but not tried your case. I'll try your patch later today, but the test is very simple: - Create a 1024 byte empty (all zeroes) partition and then try to mount it. Failure was that the mount command would go into an infinite loop using 100% CPU. Success is that it gives an error. More about it here including a very simple libguestfs script to reproduce it: https://bugzilla.redhat.com/show_bug.cgi?id=835019#c0 and a virtual disk image that also demonstrates the problem without needing anything except a VM running the test kernel: https://bugzilla.redhat.com/show_bug.cgi?id=835019#c1 Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones New in Fedora 11: Fedora Windows cross-compiler. Compile Windows programs, test, and build Windows installers. Over 70 libraries supprt'd http://fedoraproject.org/wiki/MinGW http://www.annexia.org/fedora_mingw ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] block: replace __getblk_slow misfix by grow_dev_page fix 2012-08-22 1:33 [PATCH] block: replace __getblk_slow misfix by grow_dev_page fix Hugh Dickins 2012-08-22 9:12 ` Richard W.M. Jones @ 2012-08-22 11:52 ` Richard W.M. Jones 2012-08-23 4:56 ` Hugh Dickins 1 sibling, 1 reply; 10+ messages in thread From: Richard W.M. Jones @ 2012-08-22 11:52 UTC (permalink / raw) To: Hugh Dickins Cc: Jeff Moyer, Andrew Morton, Linus Torvalds, Jens Axboe, Torsten Hilbrich, Josh Boyer, linux-kernel On Tue, Aug 21, 2012 at 06:33:45PM -0700, Hugh Dickins wrote: > Jeff, > > Your commit 91f68c89d8f3 ("block: fix infinite loop in __getblk_slow"), > already gone into 3.* stable, is not good. Could you and your testers > please give this alternative a try - I think it should work, and have > started it on a few days' memory load on 3.5, but not tried your case. I have tested your patch and it does NOT fix the problem in http://bugzilla.redhat.com/835019 Kernel 3.6.0 + your patch => mount goes into a loop when mounting a small empty partition. Please do NOT apply this as it will cause a regression! Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones libguestfs lets you edit virtual machines. Supports shell scripting, bindings from many languages. http://libguestfs.org ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] block: replace __getblk_slow misfix by grow_dev_page fix 2012-08-22 11:52 ` Richard W.M. Jones @ 2012-08-23 4:56 ` Hugh Dickins 2012-08-23 10:20 ` Jens Axboe ` (2 more replies) 0 siblings, 3 replies; 10+ messages in thread From: Hugh Dickins @ 2012-08-23 4:56 UTC (permalink / raw) To: Richard W.M. Jones Cc: Jeff Moyer, Andrew Morton, Linus Torvalds, Jens Axboe, Torsten Hilbrich, Josh Boyer, linux-kernel On Wed, 22 Aug 2012, Richard W.M. Jones wrote: > On Tue, Aug 21, 2012 at 06:33:45PM -0700, Hugh Dickins wrote: > > Jeff, > > > > Your commit 91f68c89d8f3 ("block: fix infinite loop in __getblk_slow"), > > already gone into 3.* stable, is not good. Could you and your testers > > please give this alternative a try - I think it should work, and have > > started it on a few days' memory load on 3.5, but not tried your case. > > I have tested your patch and it does NOT fix the problem in > http://bugzilla.redhat.com/835019 > Kernel 3.6.0 + your patch => mount goes into a loop when mounting > a small empty partition. Please do NOT apply this as it will > cause a regression! That was all very helpful information that you provided, thank you. Sorry, I had missed how "block" is massaged as it's passed down a level. The patch below fixes it for me, though it does have to change more than I'd been hoping in such a fix. Perhaps I am just being silly to resist repeating the call to blkdev_max_block(). Does this patch work for you? Thanks, Hugh [PATCH] block: replace __getblk_slow misfix by grow_dev_page fix Commit 91f68c89d8f3 ("block: fix infinite loop in __getblk_slow") is not good: a successful call to grow_buffers() cannot guarantee that the page won't be reclaimed before the immediate next call to __find_get_block(), which is why there was always a loop there. Yesterday I got "EXT4-fs error (device loop0): __ext4_get_inode_loc:3595: inode #19278: block 664: comm cc1: unable to read itable block" on console, which pointed to this commit. I've been trying to bisect for weeks, why kbuild-on-ext4-on-loop-on-tmpfs sometimes fails from a missing header file, under memory pressure on ppc G5. I've never seen this on x86, and I've never seen it on 3.5-rc7 itself, despite that commit being in there: bisection pointed to an irrelevant pinctrl merge, but hard to tell when failure takes between 18 minutes and 38 hours (but so far it's happened quicker on 3.6-rc2). (I've since found such __ext4_get_inode_loc errors in /var/log/messages from previous weeks: why the message never appeared on console until yesterday morning is a mystery for another day.) Revert 91f68c89d8f3, restoring __getblk_slow() to how it was (plus a checkpatch nitfix). Simplify the interface between grow_buffers() and grow_dev_page(), and avoid the infinite loop beyond end of device by instead checking init_page_buffers()'s end_block there (I presume that's more efficient than a repeated call to blkdev_max_block()), returning -ENXIO to __getblk_slow() in that case. And remove akpm's ten-year-old "__getblk() cannot fail ... weird" comment, but that is worrying: are all users of __getblk() really now prepared for a NULL bh beyond end of device, or will some oops?? Signed-off-by: Hugh Dickins <hughd@google.com> Cc: stable@vger.kernel.org # 3.0 3.2 3.4 3.5 --- fs/buffer.c | 66 ++++++++++++++++++++++---------------------------- 1 file changed, 30 insertions(+), 36 deletions(-) --- 3.6-rc3/fs/buffer.c 2012-08-04 09:19:20.644022328 -0700 +++ linux/fs/buffer.c 2012-08-22 17:11:57.143827063 -0700 @@ -914,7 +914,7 @@ link_dev_buffers(struct page *page, stru /* * Initialise the state of a blockdev page's buffers. */ -static void +static sector_t init_page_buffers(struct page *page, struct block_device *bdev, sector_t block, int size) { @@ -936,33 +936,41 @@ init_page_buffers(struct page *page, str block++; bh = bh->b_this_page; } while (bh != head); + + /* + * Caller needs to validate requested block against end of device. + */ + return end_block; } /* * Create the page-cache page that contains the requested block. * - * This is user purely for blockdev mappings. + * This is used purely for blockdev mappings. */ -static struct page * +static int grow_dev_page(struct block_device *bdev, sector_t block, - pgoff_t index, int size) + pgoff_t index, int size, int sizebits) { struct inode *inode = bdev->bd_inode; struct page *page; struct buffer_head *bh; + sector_t end_block; + int ret = 0; /* Will call free_more_memory() */ page = find_or_create_page(inode->i_mapping, index, (mapping_gfp_mask(inode->i_mapping) & ~__GFP_FS)|__GFP_MOVABLE); if (!page) - return NULL; + return ret; BUG_ON(!PageLocked(page)); if (page_has_buffers(page)) { bh = page_buffers(page); if (bh->b_size == size) { - init_page_buffers(page, bdev, block, size); - return page; + end_block = init_page_buffers(page, bdev, + index << sizebits, size); + goto done; } if (!try_to_free_buffers(page)) goto failed; @@ -982,14 +990,14 @@ grow_dev_page(struct block_device *bdev, */ spin_lock(&inode->i_mapping->private_lock); link_dev_buffers(page, bh); - init_page_buffers(page, bdev, block, size); + end_block = init_page_buffers(page, bdev, index << sizebits, size); spin_unlock(&inode->i_mapping->private_lock); - return page; - +done: + ret = (block < end_block) ? 1 : -ENXIO; failed: unlock_page(page); page_cache_release(page); - return NULL; + return ret; } /* @@ -999,7 +1007,6 @@ failed: static int grow_buffers(struct block_device *bdev, sector_t block, int size) { - struct page *page; pgoff_t index; int sizebits; @@ -1023,22 +1030,14 @@ grow_buffers(struct block_device *bdev, bdevname(bdev, b)); return -EIO; } - block = index << sizebits; + /* Create a page with the proper size buffers.. */ - page = grow_dev_page(bdev, block, index, size); - if (!page) - return 0; - unlock_page(page); - page_cache_release(page); - return 1; + return grow_dev_page(bdev, block, index, size, sizebits); } static struct buffer_head * __getblk_slow(struct block_device *bdev, sector_t block, int size) { - int ret; - struct buffer_head *bh; - /* Size must be multiple of hard sectorsize */ if (unlikely(size & (bdev_logical_block_size(bdev)-1) || (size < 512 || size > PAGE_SIZE))) { @@ -1051,21 +1050,20 @@ __getblk_slow(struct block_device *bdev, return NULL; } -retry: - bh = __find_get_block(bdev, block, size); - if (bh) - return bh; + for (;;) { + struct buffer_head *bh; + int ret; - ret = grow_buffers(bdev, block, size); - if (ret == 0) { - free_more_memory(); - goto retry; - } else if (ret > 0) { bh = __find_get_block(bdev, block, size); if (bh) return bh; + + ret = grow_buffers(bdev, block, size); + if (ret < 0) + return NULL; + if (ret == 0) + free_more_memory(); } - return NULL; } /* @@ -1321,10 +1319,6 @@ EXPORT_SYMBOL(__find_get_block); * which corresponds to the passed block_device, block and size. The * returned buffer has its reference count incremented. * - * __getblk() cannot fail - it just keeps trying. If you pass it an - * illegal block number, __getblk() will happily return a buffer_head - * which represents the non-existent block. Very weird. - * * __getblk() will lock up the machine if grow_dev_page's try_to_free_buffers() * attempt is failing. FIXME, perhaps? */ ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] block: replace __getblk_slow misfix by grow_dev_page fix 2012-08-23 4:56 ` Hugh Dickins @ 2012-08-23 10:20 ` Jens Axboe 2012-08-23 20:40 ` Jeff Moyer 2012-08-28 11:14 ` Richard W.M. Jones 2 siblings, 0 replies; 10+ messages in thread From: Jens Axboe @ 2012-08-23 10:20 UTC (permalink / raw) To: Hugh Dickins Cc: Richard W.M. Jones, Jeff Moyer, Andrew Morton, Linus Torvalds, Torsten Hilbrich, Josh Boyer, linux-kernel On 08/23/2012 06:56 AM, Hugh Dickins wrote: > [PATCH] block: replace __getblk_slow misfix by grow_dev_page fix > > Commit 91f68c89d8f3 ("block: fix infinite loop in __getblk_slow") > is not good: a successful call to grow_buffers() cannot guarantee > that the page won't be reclaimed before the immediate next call to > __find_get_block(), which is why there was always a loop there. > > Yesterday I got "EXT4-fs error (device loop0): __ext4_get_inode_loc:3595: > inode #19278: block 664: comm cc1: unable to read itable block" on console, > which pointed to this commit. > > I've been trying to bisect for weeks, why kbuild-on-ext4-on-loop-on-tmpfs > sometimes fails from a missing header file, under memory pressure on > ppc G5. I've never seen this on x86, and I've never seen it on 3.5-rc7 > itself, despite that commit being in there: bisection pointed to an > irrelevant pinctrl merge, but hard to tell when failure takes between > 18 minutes and 38 hours (but so far it's happened quicker on 3.6-rc2). > > (I've since found such __ext4_get_inode_loc errors in /var/log/messages > from previous weeks: why the message never appeared on console until > yesterday morning is a mystery for another day.) > > Revert 91f68c89d8f3, restoring __getblk_slow() to how it was (plus > a checkpatch nitfix). Simplify the interface between grow_buffers() > and grow_dev_page(), and avoid the infinite loop beyond end of device > by instead checking init_page_buffers()'s end_block there (I presume > that's more efficient than a repeated call to blkdev_max_block()), > returning -ENXIO to __getblk_slow() in that case. > > And remove akpm's ten-year-old "__getblk() cannot fail ... weird" > comment, but that is worrying: are all users of __getblk() really > now prepared for a NULL bh beyond end of device, or will some oops?? Hugh, I tentatively applied this one, awaiting some test feedback before pushing it upstream this cycle. -- Jens Axboe ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] block: replace __getblk_slow misfix by grow_dev_page fix 2012-08-23 4:56 ` Hugh Dickins 2012-08-23 10:20 ` Jens Axboe @ 2012-08-23 20:40 ` Jeff Moyer 2012-08-23 22:46 ` Hugh Dickins 2012-08-28 11:14 ` Richard W.M. Jones 2 siblings, 1 reply; 10+ messages in thread From: Jeff Moyer @ 2012-08-23 20:40 UTC (permalink / raw) To: Hugh Dickins Cc: Richard W.M. Jones, Andrew Morton, Linus Torvalds, Jens Axboe, Torsten Hilbrich, Josh Boyer, linux-kernel Hugh Dickins <hughd@google.com> writes: > [PATCH] block: replace __getblk_slow misfix by grow_dev_page fix > > Commit 91f68c89d8f3 ("block: fix infinite loop in __getblk_slow") > is not good: a successful call to grow_buffers() cannot guarantee > that the page won't be reclaimed before the immediate next call to > __find_get_block(), which is why there was always a loop there. [snip] > Revert 91f68c89d8f3, restoring __getblk_slow() to how it was (plus > a checkpatch nitfix). Simplify the interface between grow_buffers() > and grow_dev_page(), and avoid the infinite loop beyond end of device > by instead checking init_page_buffers()'s end_block there (I presume > that's more efficient than a repeated call to blkdev_max_block()), > returning -ENXIO to __getblk_slow() in that case. > > And remove akpm's ten-year-old "__getblk() cannot fail ... weird" > comment, but that is worrying: are all users of __getblk() really > now prepared for a NULL bh beyond end of device, or will some oops?? Hi, Hugh, Thanks for digging into this. I had wondered whether that loop was required for other purposes, and you've verified that it was. I mostly like your fix. Returning end_block from init_page_buffers is a little strange, but I understand not wanting to redo the call to blkdev_max_block. I went ahead to re-tested with the reproducer that I had, and your patch works fine. Thanks again for tracking this down, and sorry I wasn't more diligent to begin with. Reviewed-and-Tested-by: Jeff Moyer <jmoyer@redhat.com> ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] block: replace __getblk_slow misfix by grow_dev_page fix 2012-08-23 20:40 ` Jeff Moyer @ 2012-08-23 22:46 ` Hugh Dickins 2012-08-24 13:13 ` Jeff Moyer 0 siblings, 1 reply; 10+ messages in thread From: Hugh Dickins @ 2012-08-23 22:46 UTC (permalink / raw) To: Jeff Moyer Cc: Richard W.M. Jones, Andrew Morton, Linus Torvalds, Jens Axboe, Torsten Hilbrich, Josh Boyer, linux-kernel On Thu, 23 Aug 2012, Jeff Moyer wrote: > Hugh Dickins <hughd@google.com> writes: > > > [PATCH] block: replace __getblk_slow misfix by grow_dev_page fix > > > > Commit 91f68c89d8f3 ("block: fix infinite loop in __getblk_slow") > > is not good: a successful call to grow_buffers() cannot guarantee > > that the page won't be reclaimed before the immediate next call to > > __find_get_block(), which is why there was always a loop there. > [snip] > > Revert 91f68c89d8f3, restoring __getblk_slow() to how it was (plus > > a checkpatch nitfix). Simplify the interface between grow_buffers() > > and grow_dev_page(), and avoid the infinite loop beyond end of device > > by instead checking init_page_buffers()'s end_block there (I presume > > that's more efficient than a repeated call to blkdev_max_block()), > > returning -ENXIO to __getblk_slow() in that case. > > > > And remove akpm's ten-year-old "__getblk() cannot fail ... weird" > > comment, but that is worrying: are all users of __getblk() really > > now prepared for a NULL bh beyond end of device, or will some oops?? > > Hi, Hugh, > > Thanks for digging into this. I had wondered whether that loop was > required for other purposes, and you've verified that it was. I mostly > like your fix. Returning end_block from init_page_buffers is a little > strange, It is a little strange, I agree. I wrestled a lot with the way block and end_block were known at opposite ends of the callstack, and needed to be brought together for the check. If you think it would be better to move the blkdev_max_block() lookup up a level into grow_dev_page(), then pass end_block down to init_page_buffers(), that could work as well; though we'd then still need to report "failure" from init_page_buffers(). I was inclined to leave it precisely where you had sited it, with that comment on passing it back up; but could change it around if preferred. > but I understand not wanting to redo the call to blkdev_max_block. > > I went ahead to re-tested with the reproducer that I had, and your patch > works fine. Thanks again for tracking this down, and sorry I wasn't > more diligent to begin with. > > Reviewed-and-Tested-by: Jeff Moyer <jmoyer@redhat.com> Great, thanks a lot. It (but equally its incorrect earlier version) have been running fine at home under my swapping load. I'm confident that it fixes the issues I had been seeing, although, strictly speaking, it'll take another two or three days to reach bisection confidence level. Hugh ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] block: replace __getblk_slow misfix by grow_dev_page fix 2012-08-23 22:46 ` Hugh Dickins @ 2012-08-24 13:13 ` Jeff Moyer 0 siblings, 0 replies; 10+ messages in thread From: Jeff Moyer @ 2012-08-24 13:13 UTC (permalink / raw) To: Hugh Dickins Cc: Richard W.M. Jones, Andrew Morton, Linus Torvalds, Jens Axboe, Torsten Hilbrich, Josh Boyer, linux-kernel Hugh Dickins <hughd@google.com> writes: > On Thu, 23 Aug 2012, Jeff Moyer wrote: >> Hugh Dickins <hughd@google.com> writes: >> >> > [PATCH] block: replace __getblk_slow misfix by grow_dev_page fix >> > >> > Commit 91f68c89d8f3 ("block: fix infinite loop in __getblk_slow") >> > is not good: a successful call to grow_buffers() cannot guarantee >> > that the page won't be reclaimed before the immediate next call to >> > __find_get_block(), which is why there was always a loop there. >> [snip] >> > Revert 91f68c89d8f3, restoring __getblk_slow() to how it was (plus >> > a checkpatch nitfix). Simplify the interface between grow_buffers() >> > and grow_dev_page(), and avoid the infinite loop beyond end of device >> > by instead checking init_page_buffers()'s end_block there (I presume >> > that's more efficient than a repeated call to blkdev_max_block()), >> > returning -ENXIO to __getblk_slow() in that case. >> > >> > And remove akpm's ten-year-old "__getblk() cannot fail ... weird" >> > comment, but that is worrying: are all users of __getblk() really >> > now prepared for a NULL bh beyond end of device, or will some oops?? >> >> Hi, Hugh, >> >> Thanks for digging into this. I had wondered whether that loop was >> required for other purposes, and you've verified that it was. I mostly >> like your fix. Returning end_block from init_page_buffers is a little >> strange, > > It is a little strange, I agree. I wrestled a lot with the way block > and end_block were known at opposite ends of the callstack, and needed > to be brought together for the check. > > If you think it would be better to move the blkdev_max_block() > lookup up a level into grow_dev_page(), then pass end_block down to > init_page_buffers(), that could work as well; though we'd then still > need to report "failure" from init_page_buffers(). I have no strong opinion; I can live with it as-is. > Great, thanks a lot. It (but equally its incorrect earlier version) > have been running fine at home under my swapping load. I'm confident > that it fixes the issues I had been seeing, although, strictly speaking, > it'll take another two or three days to reach bisection confidence level. Well, I bought your analysis, so I'm at least ready to see this get pushed. ;-) Cheers, Jeff ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] block: replace __getblk_slow misfix by grow_dev_page fix 2012-08-23 4:56 ` Hugh Dickins 2012-08-23 10:20 ` Jens Axboe 2012-08-23 20:40 ` Jeff Moyer @ 2012-08-28 11:14 ` Richard W.M. Jones 2012-08-28 14:35 ` Hugh Dickins 2 siblings, 1 reply; 10+ messages in thread From: Richard W.M. Jones @ 2012-08-28 11:14 UTC (permalink / raw) To: Hugh Dickins Cc: Jeff Moyer, Andrew Morton, Linus Torvalds, Jens Axboe, Torsten Hilbrich, Josh Boyer, linux-kernel On Wed, Aug 22, 2012 at 09:56:12PM -0700, Hugh Dickins wrote: > On Wed, 22 Aug 2012, Richard W.M. Jones wrote: > > On Tue, Aug 21, 2012 at 06:33:45PM -0700, Hugh Dickins wrote: > > > Jeff, > > > > > > Your commit 91f68c89d8f3 ("block: fix infinite loop in __getblk_slow"), > > > already gone into 3.* stable, is not good. Could you and your testers > > > please give this alternative a try - I think it should work, and have > > > started it on a few days' memory load on 3.5, but not tried your case. > > > > I have tested your patch and it does NOT fix the problem in > > http://bugzilla.redhat.com/835019 > > Kernel 3.6.0 + your patch => mount goes into a loop when mounting > > a small empty partition. Please do NOT apply this as it will > > cause a regression! > > That was all very helpful information that you provided, thank you. > Sorry, I had missed how "block" is massaged as it's passed down a level. > > The patch below fixes it for me, though it does have to change more than > I'd been hoping in such a fix. Perhaps I am just being silly to resist > repeating the call to blkdev_max_block(). Does this patch work for you? > > Thanks, > Hugh > > [PATCH] block: replace __getblk_slow misfix by grow_dev_page fix I noticed this (second version) went upstream already. Nevertheless I tested it today and it doesn't cause a regression of RHBZ#835019. Thanks, Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones libguestfs lets you edit virtual machines. Supports shell scripting, bindings from many languages. http://libguestfs.org ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] block: replace __getblk_slow misfix by grow_dev_page fix 2012-08-28 11:14 ` Richard W.M. Jones @ 2012-08-28 14:35 ` Hugh Dickins 0 siblings, 0 replies; 10+ messages in thread From: Hugh Dickins @ 2012-08-28 14:35 UTC (permalink / raw) To: Richard W.M. Jones Cc: Jeff Moyer, Andrew Morton, Linus Torvalds, Jens Axboe, Torsten Hilbrich, Josh Boyer, linux-kernel On Tue, 28 Aug 2012, Richard W.M. Jones wrote: > On Wed, Aug 22, 2012 at 09:56:12PM -0700, Hugh Dickins wrote: > > [PATCH] block: replace __getblk_slow misfix by grow_dev_page fix > > I noticed this (second version) went upstream already. Nevertheless I > tested it today and it doesn't cause a regression of RHBZ#835019. Thank you for testing and confirming, Rich. Once Jeff had tested it, we felt safe to go ahead while you were off. Hugh ^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2012-08-28 14:36 UTC | newest] Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2012-08-22 1:33 [PATCH] block: replace __getblk_slow misfix by grow_dev_page fix Hugh Dickins 2012-08-22 9:12 ` Richard W.M. Jones 2012-08-22 11:52 ` Richard W.M. Jones 2012-08-23 4:56 ` Hugh Dickins 2012-08-23 10:20 ` Jens Axboe 2012-08-23 20:40 ` Jeff Moyer 2012-08-23 22:46 ` Hugh Dickins 2012-08-24 13:13 ` Jeff Moyer 2012-08-28 11:14 ` Richard W.M. Jones 2012-08-28 14:35 ` Hugh Dickins
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).