linux-block.vger.kernel.org archive mirror
From: lining <lining2020x@163.com>
To: josef@toxicpanda.com, axboe@kernel.dk,
	linux-block@vger.kernel.org, yunchuan.wen@kylin-cloud.com,
	ceph-users@ceph.io
Cc: donglifekernel@126.com
Subject: [bug report] NBD: rbd-nbd + ext4 stuck after nbd resized
Date: Mon, 19 Oct 2020 11:29:44 +0800
Message-ID: <464B35DA-D889-41F8-9193-EBBC8C4F7E9D@163.com>

Hi kernel and Ceph community,

We ran into an issue that is mainly related to the (kernel) nbd driver and (ceph) rbd-nbd.
After some investigation, I found that the root cause seems to be related to a change in the block size of the nbd device.

I am not sure whether this is a bug in the nbd driver or in rbd-nbd, but the problem does exist.


What happened:
Accessing the mount point of an ext4-formatted nbd device always hangs after the nbd device has been resized.


Environment information:
- kernel:               v4.19.25 or master
- rbd-nbd(ceph):  v12.2.0 Luminous or master
- the fs of nbd:    ext4


Steps to reproduce:
1. rbd create --size 2G rbdpool/foo    # create a 2G rbd image
2. rbd-nbd map rbdpool/foo             # map the rbd image as local block device /dev/nbd0; the block size is 512 (the default set by the rbd-nbd code when the device is mapped)
3. mkfs.ext4 /dev/nbd0                 # create an ext4 filesystem on nbd0; only nbd + ext4 reproduces the problem
4. mount /dev/nbd0 /mnt                # mount nbd0 on /mnt
5. rbd resize --size 4G rbdpool/foo    # expand the backing rbd image from 2G to 4G
6. ls /mnt                             # `ls` hangs here forever

ln@ubuntu:linux>$ ps -ef |grep mnt
root        8670    7519 98 10:16 pts/5    00:28:46 ls --color=auto /mnt/
ln          9508    9293  0 10:45 pts/6    00:00:00 grep --color=auto mnt


ln@ubuntu:linux>$ sudo cat /proc/8670/stack
[<0>] io_schedule+0x1a/0x40
[<0>] __lock_page+0x105/0x150
[<0>] pagecache_get_page+0x199/0x2c0
[<0>] __getblk_gfp+0xef/0x290
[<0>] ext4_getblk+0x83/0x1a0
[<0>] ext4_bread+0x26/0xb0
[<0>] __ext4_read_dirblock+0x34/0x2c0
[<0>] htree_dirblock_to_tree+0x56/0x1c0
[<0>] ext4_htree_fill_tree+0xad/0x330
[<0>] ext4_readdir+0x6a3/0x980
[<0>] iterate_dir+0x9e/0x1a0
[<0>] ksys_getdents64+0xa0/0x130
[<0>] __x64_sys_getdents64+0x1e/0x30
[<0>] do_syscall_64+0x5e/0x110
[<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[<0>] 0xffffffffffffffff
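
One way to check from userspace whether the block size of /dev/nbd0 changes during the steps above is to query it with the BLKBSZGET ioctl, which is the value `blockdev --getbsz` reports. The sketch below is only an illustration and was not part of the original reproduction: the helper name soft_block_size is made up, the ioctl number is hard-coded for 64-bit Linux, and the caller needs read permission on the device.

```python
import fcntl
import struct

# _IOR(0x12, 112, size_t) from <linux/fs.h>.
# Assumption: 64-bit Linux, where sizeof(size_t) == 8.
BLKBSZGET = 0x80081270


def soft_block_size(path):
    """Return the block size the kernel currently reports for a block device.

    Hypothetical helper for illustration; `path` is e.g. "/dev/nbd0".
    """
    buf = bytearray(8)  # large enough for the size_t encoded in the ioctl number
    with open(path, "rb") as dev:
        fcntl.ioctl(dev.fileno(), BLKBSZGET, buf)
    # The kernel writes a C int at the start of the buffer.
    return struct.unpack_from("i", buf)[0]
```

Calling soft_block_size("/dev/nbd0") after steps 2, 4 and 5 should show whether the value really moves between 512 and 4096, assuming the block size is indeed what changes.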


Some investigations on the kernel side:
Using git bisect, I found that the problem is related to this commit: https://github.com/torvalds/linux/commit/9a9c3c02eacecf4bfde74b08ed32749a4929a2cf .
A kernel with this commit (9a9c3c02) reproduces the problem; after reverting the commit, the problem disappears.


Some logical analysis of the nbd block size change:
1. rbd-nbd map rbdpool/foo      
    => ioctl NBD_BLKSZSET 512 
      => nbd_size_set() 
        => nbd_size_update(nbd) 
          =>{
                  bdev = bdget_disk(nbd->disk, 0);
                  bd_set_size(bdev, 512)  
                  set_blocksize(bdev, 512)
               }

2. mkfs.ext4 /dev/nbd0

3. mount /dev/nbd0 /mnt    
    => vfs mount
      =>  ext4_mount() 
        => … 
          => sb_set_blocksize() 
            => set_blocksize(bdev, 4096)   <= mounting ext4 sets the nbd block size to 4096

4. rbd resize --size 4G rbdpool/foo
    => ioctl NBD_SET_SIZE 4G   <= rbd-nbd updates the total size of the nbd device
      =>  nbd_size_set() 
        => nbd_size_update(nbd) 
          =>{
                  bdev = bdget_disk(nbd->disk, 0);
                  bd_set_size(bdev, 512)  
                  set_blocksize(bdev, 512)   <= the block size is set back to 512 [code line: set_blocksize(bdev, config->blksize);]. This seems to be the root cause.
               } 
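
The sequence above can be replayed with a toy model that tracks only the block size. This is a sketch of the reasoning, not kernel code: the class ToyNbd and the function replay are invented for illustration, and the only behaviour modelled is that nbd_size_update() always re-applies the configured block size while mounting ext4 installs the filesystem block size.

```python
class ToyNbd:
    """Toy stand-in for the state touched by the traced calls (hypothetical)."""

    def __init__(self, config_blksize=512):
        self.config_blksize = config_blksize  # value set via NBD_BLKSZSET
        self.bytesize = 0                     # total device size in bytes
        self.bdev_block_size = None           # last value given to set_blocksize()

    def nbd_size_update(self):
        # Models set_blocksize(bdev, config->blksize): the configured
        # block size is re-applied on every size update.
        self.bdev_block_size = self.config_blksize

    def nbd_size_set(self, bytesize):
        self.bytesize = bytesize
        self.nbd_size_update()

    def mount_ext4(self, fs_block_size=4096):
        # Models sb_set_blocksize() -> set_blocksize(bdev, 4096).
        self.bdev_block_size = fs_block_size


def replay():
    """Replay map -> mount -> resize and record the block size after each step."""
    gib = 1 << 30
    nbd = ToyNbd()
    trace = []
    nbd.nbd_size_set(2 * gib)   # rbd-nbd map
    trace.append(("map", nbd.bdev_block_size))
    nbd.mount_ext4()            # mount /dev/nbd0 /mnt
    trace.append(("mount", nbd.bdev_block_size))
    nbd.nbd_size_set(4 * gib)   # rbd resize --size 4G
    trace.append(("resize", nbd.bdev_block_size))
    return trace


for step, blksize in replay():
    print(f"{step:7s} block size = {blksize}")
```

In this model the mounted filesystem still expects 4096-byte blocks after the resize, while the device has been switched back to 512.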



