Linux-Raid Archives on lore.kernel.org
From: Daniel Gnoutcheff <gnoutchd@softwarefreedom.org>
To: linux-raid@vger.kernel.org
Subject: "attempt to access beyond end of device" when reshaping raid10 from near=2 to offset=2
Date: Wed, 27 Jan 2021 17:15:56 -0500
Message-ID: <09505ed1-ad29-28f1-627e-8a6a0b8df3a4@softwarefreedom.org>

Greets,

Whilst experimenting with array reshaping, I've found that if I create a 
near=2 raid10 like so:

   for i in 0 1 2 3 ; do
     truncate --size=1G disk$i
     losetup /dev/loop$i disk$i
   done
   mdadm --create /dev/md0 --level=raid10 --raid-devices=4 --layout=n2 \
     --data-offset=10M /dev/loop{0,1,2,3}
   mdadm --wait /dev/md0  # wait for resync

and then try to reshape it to offset=2:

   mdadm --grow /dev/md0 --layout=o2

the reshape repeatedly fails and restarts with something like this in 
dmesg (repetitive messages snipped):

> Jan 27 15:47:07 snaptest kernel: md: reshape of RAID array md0
> Jan 27 15:47:07 snaptest kernel: md: minimum _guaranteed_  speed: 1000 KB/sec/disk.
> Jan 27 15:47:07 snaptest kernel: md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for reshape.
> Jan 27 15:47:07 snaptest kernel: md: using 128k window, over a total of 2076672k.
> Jan 27 15:47:08 snaptest kernel: attempt to access beyond end of device
> Jan 27 15:47:08 snaptest kernel: loop2: rw=536870912, want=2097280, limit=2097152
> Jan 27 15:47:08 snaptest kernel: md/raid10:md0: Disk failure on loop2, disabling device.
>                                  md/raid10:md0: Operation continuing on 3 devices.
> Jan 27 15:47:08 snaptest kernel: attempt to access beyond end of device
> Jan 27 15:47:08 snaptest kernel: loop0: rw=536870912, want=2097280, limit=2097152
> Jan 27 15:47:08 snaptest kernel: md/raid10:md0: Disk failure on loop0, disabling device.
>                                  md/raid10:md0: Operation continuing on 2 devices.
> Jan 27 15:47:08 snaptest kernel: attempt to access beyond end of device
> Jan 27 15:47:08 snaptest kernel: loop1: rw=536870912, want=2097280, limit=2097152
> Jan 27 15:47:08 snaptest kernel: attempt to access beyond end of device
> Jan 27 15:47:08 snaptest kernel: loop1: rw=536870912, want=2097408, limit=2097152
> Jan 27 15:47:08 snaptest kernel: attempt to access beyond end of device
> Jan 27 15:47:08 snaptest kernel: loop1: rw=536870912, want=2097536, limit=2097152
> Jan 27 15:47:08 snaptest kernel: attempt to access beyond end of device
> Jan 27 15:47:08 snaptest kernel: loop1: rw=536870912, want=2097664, limit=2097152
<snip>
> Jan 27 15:47:09 snaptest kernel: loop3: rw=536870912, want=2097408, limit=2097152
> Jan 27 15:47:09 snaptest kernel: attempt to access beyond end of device
> Jan 27 15:47:09 snaptest kernel: loop3: rw=536870912, want=2097536, limit=2097152
> Jan 27 15:47:09 snaptest kernel: attempt to access beyond end of device
> Jan 27 15:47:09 snaptest kernel: loop3: rw=536870912, want=2097664, limit=2097152
<snip>
> Jan 27 15:47:09 snaptest kernel: loop3: rw=536870912, want=2106368, limit=2097152
> Jan 27 15:47:09 snaptest kernel: md: md0: reshape interrupted.
> Jan 27 15:47:09 snaptest kernel: md: reshape of RAID array md0
> Jan 27 15:47:09 snaptest kernel: md: minimum _guaranteed_  speed: 1000 KB/sec/disk.
> Jan 27 15:47:09 snaptest kernel: md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for reshape.
> Jan 27 15:47:09 snaptest kernel: md: using 128k window, over a total of 2076672k.
> Jan 27 15:47:09 snaptest kernel: attempt to access beyond end of device
> Jan 27 15:47:09 snaptest kernel: loop1: rw=536870912, want=2107520, limit=2097152
<snip>

and so on until the array is stopped.
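
For reference, the numbers in the log can be checked with a little shell
arithmetic. This is only a sketch: the helper names are made up for this
note, and it assumes that "want" and "limit" in the kernel message are in
512-byte sectors, with "want" being the sector at which the rejected
request would have ended.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of mdadm or the kernel.

SECTOR_BYTES=512   # assumed unit of "want" and "limit" in the dmesg lines

# Convert a size in MiB to 512-byte sectors.
mib_to_sectors() {
  echo $(( $1 * 1024 * 1024 / SECTOR_BYTES ))
}

# Given one "attempt to access beyond end of device" detail line, print how
# many sectors past the end of the device the request would have reached.
overshoot_sectors() {
  line=$1
  want=${line##*want=}; want=${want%%,*}   # text between "want=" and the comma
  limit=${line##*limit=}                   # text after "limit="
  echo $(( want - limit ))
}

dev=$(mib_to_sectors 1024)   # each backing file is 1G
off=$(mib_to_sectors 10)     # --data-offset=10M

echo "device size:   $dev sectors"
echo "data offset:   $off sectors"
# 4 devices, 2 copies of every block, 2 sectors per KiB
echo "array size:    $(( (dev - off) * 4 / 2 / 2 )) KiB"
echo "first overrun: $(overshoot_sectors 'loop2: rw=536870912, want=2097280, limit=2097152') sectors"
```

Under those assumptions the device size comes out at 2097152 sectors (the
"limit" in the log), the array size at 2076672 KiB (the "total of 2076672k"
in the log), and the first rejected request ends 128 sectors, i.e. 64 KiB,
past the end of the loop device.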

This log came from a Debian stretch VM (mdadm v3.4 and kernel 4.9.246 as 
patched and built by Debian), but I see the same behavior in Debian 
buster (mdadm v4.1) with stock and backport kernels (4.19.160 and 
5.9.15, respectively).

I notice that if I stop, re-assemble, and add back the "failed" devices, 
e.g.:

   mdadm --assemble /dev/md0 /dev/loop{0,1,2,3}
   mdadm /dev/md0 --add /dev/loop0
   mdadm /dev/md0 --add /dev/loop2

then it recovers and reshapes without complaint.
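
One way to confirm that the layout really changed after the recovery is to
pull the "Layout" field out of the mdadm --detail output. A minimal sketch,
assuming the field is printed as "Layout : offset=2" (the md_layout helper
is made up for this note):

```shell
#!/bin/sh
# Sketch only: md_layout is a hypothetical helper, not an mdadm feature.

# Read `mdadm --detail` output on stdin and print the value of its
# "Layout" field (assumed to look like "Layout : near=2").
md_layout() {
  awk -F' : ' '/^ *Layout :/ { print $2; exit }'
}

# Usage (needs root and an assembled array):
#   mdadm --detail /dev/md0 | md_layout
```

If the reshape completed, this should report offset=2 rather than near=2.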

Have I encountered a bug?

Many thanks,
-- 
Daniel Gnoutcheff
Systems Administrator
Software Freedom Law Center

Thread overview: 3+ messages
2021-01-27 22:15 Daniel Gnoutcheff [this message]
2021-01-28  0:59 ` antlists
2021-01-28 21:50   ` Daniel Gnoutcheff
