From mboxrd@z Thu Jan 1 00:00:00 1970
From: =?utf-8?Q?=C3=89tienne?= Buira
Subject: Probable bug in md with rdev->new_data_offset
Date: Mon, 28 Mar 2016 12:31:24 +0200
Message-ID: <20160328103123.GC8633@rcKGHUlyQfVFW>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Return-path:
Content-Disposition: inline
Sender: linux-raid-owner@vger.kernel.org
To: linux-raid@vger.kernel.org
List-Id: linux-raid.ids

Hi all,

Apologies if I hit the wrong list. I searched a bit but could not find a
bug report or commit that seemed related; apologies if I'm wrong here.

I was going to grow a RAID6 array (that contained a spare) using this
command:

# mdadm --grow -n 7 /dev/mdx

But when doing so, I got a PaX message saying that a size overflow was
detected in super_1_sync on the decl new_offset. The array was then in an
unusable state (presumably because some locks were held).

After printk()ing the values of rdev->new_data_offset and rdev->data_offset
in the

	if (rdev->new_data_offset != rdev->data_offset) { ...

block of super_1_sync, I found that new_data_offset (252928 in my case)
was smaller than data_offset (258048), so the subtraction used to compute
sb->new_offset yielded an insanely high value.

For all partitions this array is made of, mdadm -E /dev/sdxy reports a
data offset of 258048 sectors (the value of rdev->data_offset).

IMHO it would be a good idea to put a BUG_ON or similar at this place for
the case where rdev->new_data_offset is smaller than rdev->data_offset,
but that would not address the real issue.

I could work around my problem by setting mdadm's --backup-file= option.

Kernel version was Gentoo hardened v4.4.2.

Full PaX size overflow detection line:

size overflow detected in function super_1_sync drivers/md/md.c:1683
cicus.1522_314 min, count: 158, decl: new_offset; num: 0;
context: mdp_superblock_1

Call stack (without addresses):

dump_stack
report_size_overflow
super_1_sync
? sched_clock_cpu
md_update_sb
? account_entity_dequeue
? dequeue_task_fair
? mutex_lock
? bitmap_daemon_work
md_check_recovery
raid5d
? try_to_del_timer_sync
? del_timer_sync
md_thread
? wait_woken
? find_pers
kthread
? kthread_create_on_node
ret_from_fork
? kthread_create_on_node

I am not familiar with kernel coding, so I won't create a patch, but I'm
willing to give more information if needed to track down this issue.

Regards.