From: Jan Kara <jack@suse.cz>
To: Dmitry Vyukov <dvyukov@google.com>
Cc: Jan Kara <jack@suse.cz>,
	Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>,
	Andrew Morton <akpm@linux-foundation.org>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	syzbot <syzbot+9933e4476f365f5d5a1b@syzkaller.appspotmail.com>,
	Linux-MM <linux-mm@kvack.org>,
	Mel Gorman <mgorman@techsingularity.net>,
	Michal Hocko <mhocko@kernel.org>, Andi Kleen <ak@linux.intel.com>,
	jlayton@redhat.com, LKML <linux-kernel@vger.kernel.org>,
	Matthew Wilcox <mawilcox@microsoft.com>,
	syzkaller-bugs <syzkaller-bugs@googlegroups.com>,
	tim.c.chen@linux.intel.com,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>
Subject: Re: INFO: task hung in generic_file_write_iter
Date: Wed, 9 Jan 2019 14:30:06 +0100
Message-ID: <20190109133006.GG15397@quack2.suse.cz>
In-Reply-To: <CACT4Y+bxUJ-6dLch+orY0AcjrvJhXq1=ELvHciX5M-gd5bdPpA@mail.gmail.com>

On Tue 08-01-19 12:49:08, Dmitry Vyukov wrote:
> On Tue, Jan 8, 2019 at 12:24 PM Jan Kara <jack@suse.cz> wrote:
> >
> > On Tue 08-01-19 19:04:06, Tetsuo Handa wrote:
> > > On 2019/01/03 2:26, Jan Kara wrote:
> > > > On Thu 03-01-19 01:07:25, Tetsuo Handa wrote:
> > > >> On 2019/01/02 23:40, Jan Kara wrote:
> > > >>> I had a look into this and the only good explanation for this I have is
> > > >>> that sb->s_blocksize is different from (1 << sb->s_bdev->bd_inode->i_blkbits).
> > > >>> If that would happen, we'd get exactly the behavior syzkaller observes
> > > >>> because grow_buffers() would populate different page than
> > > >>> __find_get_block() then looks up.
> > > >>>
> > > >>> However I don't see how that's possible since the filesystem has the block
> > > >>> device open exclusively and blkdev_bszset() makes sure we also have
> > > >>> exclusive access to the block device before changing the block device size.
> > > >>> So changing block device block size after filesystem gets access to the
> > > >>> device should be impossible.
> > > >>>
> > > >>> Anyway, could you perhaps add to your debug patch a dump of 'size' passed
> > > >>> to __getblk_slow() and bdev->bd_inode->i_blkbits? That should tell us
> > > >>> whether my theory is right or not. Thanks!
> > > >>>
> > >
> > > Got two reports. 'size' is 512 while bdev->bd_inode->i_blkbits is 12.
> > >
> > > https://syzkaller.appspot.com/text?tag=CrashLog&x=1237c3ab400000
> > >
> > > [  385.723941][  T439] kworker/u4:3(439): getblk(): executed=9 bh_count=0 bh_state=0 bdev_super_blocksize=512 size=512 bdev_super_blocksize_bits=9 bdev_inode_blkbits=12
> > > (...snipped...)
> > > [  568.159544][  T439] kworker/u4:3(439): getblk(): executed=9 bh_count=0 bh_state=0 bdev_super_blocksize=512 size=512 bdev_super_blocksize_bits=9 bdev_inode_blkbits=12
> >
> > Right, so indeed the block size in the superblock and in the block device
> > gets out of sync which explains why we endlessly loop in the buffer cache
> > code. The superblock uses blocksize of 512 while the block device thinks
> > the set block size is 4096.
> >
> > And after staring into the code for some time, I finally have a trivial
> > reproducer:
> >
> > truncate -s 1G /tmp/image
> > losetup /dev/loop0 /tmp/image
> > mkfs.ext4 -b 1024 /dev/loop0
> > mount -t ext4 /dev/loop0 /mnt
> > losetup -c /dev/loop0
> > ls /mnt
> > <hangs>
> >
> > And the problem is that the LOOP_SET_CAPACITY ioctl ends up resetting the
> > block device block size to 4096 by calling bd_set_size(). I have to think
> > how to best fix this...
> >
> > Thanks for your help with debugging this!
> 
> Wow! I am very excited.
> We have 587 open "task hung" reports, I suspect this explains lots of them.
> What would be some pattern that we can use to best-effort distinguish
> most manifestations? Skimming through a few reports I see "inode_lock",
> "get_super", "blkdev_put" as common indicators. Anything else?

Well, there will always be a looping task with __getblk_gfp() on its stack
(which should be visible in the stack trace generated by the stall
detector). Then there can be lots of other processes blocked on locks and
other resources held by this task...
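
As a rough illustration of that heuristic, here is a minimal sketch of how
reports could be sorted on a best-effort basis. It assumes each report is
available as plain text with one stack frame per line; the function name
classify_report() and the three labels are hypothetical and not part of
syzbot or any kernel tooling.

```python
import re

# Strong indicator: the looping task has this function on its stack.
LOOP_MARKER = re.compile(r"\b__getblk_gfp\b")

# Weaker indicators mentioned earlier in the thread. They only show tasks
# blocked behind the looping one, so they can also appear in unrelated hangs.
BLOCKED_HINTS = ("inode_lock", "get_super", "blkdev_put")


def classify_report(text: str) -> str:
    """Return 'likely', 'possible' or 'unrelated' for one report (hypothetical labels)."""
    if LOOP_MARKER.search(text):
        return "likely"
    if any(hint in text for hint in BLOCKED_HINTS):
        return "possible"
    return "unrelated"
```

Reports labelled 'possible' would still need a manual look, since only the
trace of the looping task itself ties a hang to this bug.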

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

