All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jan Kara <jack@suse.cz>
To: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Cc: Jan Kara <jack@suse.cz>,
	Andrew Morton <akpm@linux-foundation.org>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	syzbot <syzbot+9933e4476f365f5d5a1b@syzkaller.appspotmail.com>,
	linux-mm@kvack.org, mgorman@techsingularity.net,
	Michal Hocko <mhocko@kernel.org>,
	ak@linux.intel.com, jlayton@redhat.com,
	linux-kernel@vger.kernel.org, mawilcox@microsoft.com,
	syzkaller-bugs@googlegroups.com, tim.c.chen@linux.intel.com,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>
Subject: Re: INFO: task hung in generic_file_write_iter
Date: Wed, 2 Jan 2019 15:40:15 +0100	[thread overview]
Message-ID: <20190102144015.GA23089@quack2.suse.cz> (raw)
In-Reply-To: <e8a23623-feaf-7730-5492-b329cb0daa21@i-love.sakura.ne.jp>

On Fri 28-12-18 22:34:13, Tetsuo Handa wrote:
> On 2018/08/06 19:09, Jan Kara wrote:
> > On Tue 31-07-18 00:07:22, Tetsuo Handa wrote:
> >> On 2018/07/21 5:06, Andrew Morton wrote:
> >>> On Fri, 20 Jul 2018 19:36:23 +0900 Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> wrote:
> >>>
> >>>>>
> >>>>> This report is stalling after mount() completed and process used remap_file_pages().
> >>>>> I think that we might need to use debug printk(). But I don't know what to examine.
> >>>>>
> >>>>
> >>>> Andrew, can you pick up this debug printk() patch?
> >>>> I guess we can get the result within one week.
> >>>
> >>> Sure, let's toss it in -next for a while.
> >>>
> >>>> >From 8f55e00b21fefffbc6abd9085ac503c52a302464 Mon Sep 17 00:00:00 2001
> >>>> From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> >>>> Date: Fri, 20 Jul 2018 19:29:06 +0900
> >>>> Subject: [PATCH] fs/buffer.c: add debug print for __getblk_gfp() stall problem
> >>>>
> >>>> Among syzbot's unresolved hung task reports, 18 out of 65 reports contain
> >>>> __getblk_gfp() line in the backtrace. Since there is a comment block that
> >>>> says that __getblk_gfp() will lock up the machine if try_to_free_buffers()
> >>>> attempt from grow_dev_page() is failing, let's start from checking whether
> >>>> syzbot is hitting that case. This change will be removed after the bug is
> >>>> fixed.
> >>>
> >>> I'm not sure that grow_dev_page() is hanging.  It has often been
> >>> suspected, but always is proven innocent.  Lets see.
> >>
> >> syzbot reproduced this problem ( https://syzkaller.appspot.com/text?tag=CrashLog&x=11f2fc44400000 ) .
> >> It says that grow_dev_page() is returning 1 but __find_get_block() is failing forever. Any idea?
> > 
> > Looks like some kind of a race where device block size gets changed while
> > getblk() runs (and creates buffers for underlying page). I don't have time
> > to nail it down at this moment can have a look into it later unless someone
> > beats me to it.
> 
> I feel that the frequency of hitting this problem was decreased
> by merging loop module's ioctl() serialization patches. But this
> problem is still there, and syzbot got a new line in
> https://syzkaller.appspot.com/text?tag=CrashLog&x=177f889f400000 .
> 
>   [  615.881781] __loop_clr_fd: partition scan of loop5 failed (rc=-22)
>   [  619.059920] syz-executor4(2193): getblk(): executed=cd bh_count=0 bh_state=29
>   [  622.069808] syz-executor4(2193): getblk(): executed=9 bh_count=0 bh_state=0
>   [  625.080013] syz-executor4(2193): getblk(): executed=9 bh_count=0 bh_state=0
>   [  628.089900] syz-executor4(2193): getblk(): executed=9 bh_count=0 bh_state=0
> 
> I guess that loop module is somehow related to this problem.

I had a look into this and the only good explanation for this I have is
that sb->s_blocksize is different from (1 << sb->s_bdev->bd_inode->i_blkbits).
If that would happen, we'd get exactly the behavior syzkaller observes
because grow_buffers() would populate different page than
__find_get_block() then looks up.

However I don't see how that's possible since the filesystem has the block
device open exclusively and blkdev_bszset() makes sure we also have
exclusive access to the block device before changing the block device size.
So changing block device block size after filesystem gets access to the
device should be impossible. 

Anyway, could you perhaps add to your debug patch a dump of 'size' passed
to __getblk_slow() and bdev->bd_inode->i_blkbits? That should tell us
whether my theory is right or not. Thanks!

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

  reply	other threads:[~2019-01-02 14:40 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-07-18  8:58 INFO: task hung in generic_file_write_iter syzbot
2018-07-18 10:28 ` Tetsuo Handa
2018-07-18 10:36   ` Dmitry Vyukov
2018-07-20 10:36   ` Tetsuo Handa
2018-07-20 10:36     ` Tetsuo Handa
2018-07-20 10:36     ` Tetsuo Handa
2018-07-20 20:06     ` Andrew Morton
2018-07-30 15:07       ` Tetsuo Handa
2018-08-06 10:09         ` Jan Kara
2018-08-06 11:56           ` Tetsuo Handa
2018-08-20 14:12             ` Tetsuo Handa
2018-12-28 13:34           ` Tetsuo Handa
2019-01-02 14:40             ` Jan Kara [this message]
2019-01-02 14:46               ` Dmitry Vyukov
2019-01-02 14:46                 ` Dmitry Vyukov
2019-01-02 16:07               ` Tetsuo Handa
2019-01-02 16:07                 ` Tetsuo Handa
2019-01-02 17:26                 ` Jan Kara
2019-01-03  0:46                   ` Tetsuo Handa
2019-01-03  0:46                     ` Tetsuo Handa
2019-01-08 10:04                   ` Tetsuo Handa
2019-01-08 11:24                     ` Jan Kara
2019-01-08 11:49                       ` Dmitry Vyukov
2019-01-08 11:49                         ` Dmitry Vyukov
2019-01-09 13:30                         ` Jan Kara
2019-01-14 15:11                           ` Dmitry Vyukov
2019-01-14 15:11                             ` Dmitry Vyukov
2019-01-14 15:13                             ` Dmitry Vyukov
2019-01-14 15:13                               ` Dmitry Vyukov
2019-01-15  9:29                               ` Jan Kara

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190102144015.GA23089@quack2.suse.cz \
    --to=jack@suse.cz \
    --cc=ak@linux.intel.com \
    --cc=akpm@linux-foundation.org \
    --cc=jlayton@redhat.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mawilcox@microsoft.com \
    --cc=mgorman@techsingularity.net \
    --cc=mhocko@kernel.org \
    --cc=penguin-kernel@i-love.sakura.ne.jp \
    --cc=syzbot+9933e4476f365f5d5a1b@syzkaller.appspotmail.com \
    --cc=syzkaller-bugs@googlegroups.com \
    --cc=tim.c.chen@linux.intel.com \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.