* [GIT PULL] md-next 20220921
From: Song Liu @ 2022-09-21 21:33 UTC
To: Jens Axboe, linux-raid
Cc: Logan Gunthorpe, David Sloan, Yu Kuai, Mateusz Grzonka, Saurabh Sengar, XU pengfei, Guoqing Jiang, Zhou nan

Hi Jens,

Please consider pulling the following changes for md-next on top of your
for-6.1/block branch (for-6.1/drivers branch doesn't exist yet).

The major changes are:

1. Various raid5 fix and clean up, by Logan Gunthorpe and David Sloan.
2. Raid10 performance optimization, by Yu Kuai.
3. Generate CHANGE uevents for md device, by Mateusz Grzonka.

Thanks,
Song


The following changes since commit 8c5035dfbb9475b67c82b3fdb7351236525bf52b:

  blk-wbt: call rq_qos_add() after wb_normal is initialized (2022-09-21 08:36:13 -0600)

are available in the Git repository at:

  https://git.kernel.org/pub/scm/linux/kernel/git/song/md.git md-next

for you to fetch changes up to 9859e343daaf8b08bbb4bed63a378a05535bcb47:

  md: Fix spelling mistake in comments of r5l_log (2022-09-21 14:22:17 -0700)

----------------------------------------------------------------
David Sloan (1):
      md/raid5: Remove unnecessary bio_put() in raid5_read_one_chunk()

Guoqing Jiang (1):
      md/raid10: fix compile warning

Logan Gunthorpe (7):
      md/raid5: Refactor raid5_get_active_stripe()
      md/raid5: Drop extern on function declarations in raid5.h
      md/raid5: Cleanup prototype of raid5_get_active_stripe()
      md/raid5: Don't read ->active_stripes if it's not needed
      md/raid5: Ensure stripe_fill happens on non-read IO with journal
      md: Remove extra mddev_get() in md_seq_start()
      md/raid5: Wait for MD_SB_CHANGE_PENDING in raid5d

Mateusz Grzonka (1):
      md: generate CHANGE uevents for md device

Saurabh Sengar (1):
      md: Replace snprintf with scnprintf

Song Liu (1):
      Merge branch 'md-next-raid10-optimize' into md-next

XU pengfei (1):
      md/raid5: Fix spelling mistakes in comments

Yu Kuai (5):
      md/raid10: factor out code from wait_barrier() to stop_waiting_barrier()
      md/raid10: don't modify 'nr_waitng' in wait_barrier() for the case nowait
      md/raid10: prevent unnecessary calls to wake_up() in fast path
      md/raid10: fix improper BUG_ON() in raise_barrier()
      md/raid10: convert resync_lock to use seqlock

Zhou nan (1):
      md: Fix spelling mistake in comments of r5l_log

 drivers/md/md.c          |  32 ++++++++++++++++----------------
 drivers/md/md.h          |   2 +-
 drivers/md/raid0.c       |   2 +-
 drivers/md/raid10.c      | 153 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++--------------------------------------------------------
 drivers/md/raid10.h      |   2 +-
 drivers/md/raid5-cache.c |  11 ++++++-----
 drivers/md/raid5.c       | 149 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++--------------------------------------------------------------------
 drivers/md/raid5.h       |  32 ++++++++++++++++++++------------
 8 files changed, 223 insertions(+), 160 deletions(-)
* Re: [GIT PULL] md-next 20220921
From: Logan Gunthorpe @ 2022-09-21 22:37 UTC
To: Song Liu, Jens Axboe, linux-raid
Cc: David Sloan, Yu Kuai, Mateusz Grzonka, Saurabh Sengar, XU pengfei, Guoqing Jiang, Zhou nan

On 2022-09-21 15:33, Song Liu wrote:
> Hi Jens,
>
> Please consider pulling the following changes for md-next on top of your
> for-6.1/block branch (for-6.1/drivers branch doesn't exist yet).
>
> The major changes are:
>
> 1. Various raid5 fix and clean up, by Logan Gunthorpe and David Sloan.
> 2. Raid10 performance optimization, by Yu Kuai.
> 3. Generate CHANGE uevents for md device, by Mateusz Grzonka.

I may have hit a bug with my tests on the latest md-next branch. Still
trying to hit it again. The last tests I ran for several days with some
patches on the previous md-next branch, but I didn't have Mateusz's
changes, and it also looks like the branch was rebased today so it could
be caused by either of those things. I'll let you know when I know more.

Logan
* Re: [GIT PULL] md-next 20220921
From: Logan Gunthorpe @ 2022-09-21 23:44 UTC
To: Song Liu, Jens Axboe, linux-raid
Cc: David Sloan, Yu Kuai, Mateusz Grzonka, Saurabh Sengar, XU pengfei, Guoqing Jiang, Zhou nan

On 2022-09-21 16:37, Logan Gunthorpe wrote:
>
>
> On 2022-09-21 15:33, Song Liu wrote:
>> Hi Jens,
>>
>> Please consider pulling the following changes for md-next on top of your
>> for-6.1/block branch (for-6.1/drivers branch doesn't exist yet).
>>
>> The major changes are:
>>
>> 1. Various raid5 fix and clean up, by Logan Gunthorpe and David Sloan.
>> 2. Raid10 performance optimization, by Yu Kuai.
>> 3. Generate CHANGE uevents for md device, by Mateusz Grzonka.
>
> I may have hit a bug with my tests on the latest md-next branch. Still
> trying to hit it again. The last tests I ran for several days with some
> patches on the previous md-next branch, but I didn't have Mateusz's
> changes, and it also looks like the branch was rebased today so it could
> be caused by either of those things. I'll let you know when I know more.

Yes, ok, I've found two separate issues and both are fixed by reverting

  21023a82bff7 ("md: generate CHANGE uevents for md device")

I suggest we drop that patch for this cycle so we can sort them out.

The issues are:

1) The concrete issue comes when running mdadm test 01r1fail. I get the
kernel bugs at the end of this email. It seems we cannot call
kobject_uevent() in at least one of the contexts that md_new_event() is
called in because it sleeps in a critical section.

2) With our custom test suite that creates and destroys arrays, adds and
removes disks, and runs data through them repeatedly, I randomly start
seeing these warnings:

  mdadm: Fail to create md0 when using /sys/module/md_mod/parameters/new_array, fallback to creation via node

And then very occasionally get that warning paired with this error:

  mdadm: unexpected failure opening /dev/md0

Which stops the test because it fails to create an array. I also see a
lot of the same bugs as below so it may be related.

Logan

--

BUG: sleeping function called from invalid context at include/linux/sched/mm.h:274
in_atomic(): 0, irqs_disabled(): 0, non_block: 0, pid: 853, name: mdadm
preempt_count: 0, expected: 0
RCU nest depth: 1, expected: 0
1 lock held by mdadm/853:
 #0: ffffffff98c623c0 (rcu_read_lock){....}-{1:2}, at: md_ioctl+0x8f0/0x2670
CPU: 2 PID: 853 Comm: mdadm Not tainted 6.0.0-rc2-eid-vmlocalyes-dbg-00096-g9859e343daaf #2680
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-2 04/01/2014
Call Trace:
 <TASK>
 dump_stack_lvl+0x5a/0x74
 dump_stack+0x10/0x12
 __might_resched.cold+0x146/0x17e
 __might_sleep+0x66/0xc0
 kmem_cache_alloc_trace+0x2f8/0x400
 kobject_uevent_env+0x121/0xa30
 kobject_uevent+0xb/0x10
 md_new_event+0x6b/0x80
 md_error+0x168/0x1b0
 md_ioctl+0x989/0x2670
 blkdev_ioctl+0x24d/0x450
 __x64_sys_ioctl+0xc0/0x100
 do_syscall_64+0x35/0x80
 entry_SYSCALL_64_after_hwframe+0x46/0xb0

=============================
[ BUG: Invalid wait context ]
6.0.0-rc2-eid-vmlocalyes-dbg-00096-g9859e343daaf #2680 Tainted: G W
-----------------------------
mdadm/853 is trying to lock:
ffffffff990e4950 (uevent_sock_mutex){+.+.}-{3:3}, at: kobject_uevent_env+0x460/0xa30
other info that might help us debug this:
context-{4:4}
1 lock held by mdadm/853:
 #0: ffffffff98c623c0 (rcu_read_lock){....}-{1:2}, at: md_ioctl+0x8f0/0x2670
stack backtrace:
CPU: 2 PID: 853 Comm: mdadm Tainted: G W 6.0.0-rc2-eid-vmlocalyes-dbg-00096-g9859e343daaf #2680
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-2 04/01/2014
Call Trace:
 <TASK>
 dump_stack_lvl+0x5a/0x74
 dump_stack+0x10/0x12
 __lock_acquire.cold+0x2f2/0x31a
 lock_acquire+0x183/0x440
 __mutex_lock+0x125/0xe20
 mutex_lock_nested+0x1b/0x20
 kobject_uevent_env+0x460/0xa30
 kobject_uevent+0xb/0x10
 md_new_event+0x6b/0x80
 md_error+0x168/0x1b0
 md_ioctl+0x989/0x2670
 blkdev_ioctl+0x24d/0x450
 __x64_sys_ioctl+0xc0/0x100
 do_syscall_64+0x35/0x80
 entry_SYSCALL_64_after_hwframe+0x46/0xb0
* Re: [GIT PULL] md-next 20220921
From: Song Liu @ 2022-09-22 0:40 UTC
To: Logan Gunthorpe
Cc: Jens Axboe, linux-raid, David Sloan, Yu Kuai, Mateusz Grzonka, Saurabh Sengar, XU pengfei, Guoqing Jiang, Zhou nan

Hi Logan,

> On Sep 21, 2022, at 4:44 PM, Logan Gunthorpe <logang@deltatee.com> wrote:
>
> On 2022-09-21 16:37, Logan Gunthorpe wrote:
>>
>>
>> On 2022-09-21 15:33, Song Liu wrote:
>>> Hi Jens,
>>>
>>> Please consider pulling the following changes for md-next on top of your
>>> for-6.1/block branch (for-6.1/drivers branch doesn't exist yet).
>>>
>>> The major changes are:
>>>
>>> 1. Various raid5 fix and clean up, by Logan Gunthorpe and David Sloan.
>>> 2. Raid10 performance optimization, by Yu Kuai.
>>> 3. Generate CHANGE uevents for md device, by Mateusz Grzonka.
>>
>> I may have hit a bug with my tests on the latest md-next branch. Still
>> trying to hit it again. The last tests I ran for several days with some
>> patches on the previous md-next branch, but I didn't have Mateusz's
>> changes, and it also looks like the branch was rebased today so it could
>> be caused by either of those things. I'll let you know when I know more.
>
> Yes, ok, I've found two separate issues and both are fixed by reverting
>
>   21023a82bff7 ("md: generate CHANGE uevents for md device")
>
> I suggest we drop that patch for this cycle so we can sort them out.
>
> The issues are:
>
> 1) The concrete issue comes when running mdadm test 01r1fail. I get the
> kernel bugs at the end of this email. It seems we cannot call
> kobject_uevent() in at least one of the contexts that md_new_event() is
> called in because it sleeps in a critical section.
>
> 2) With our custom test suite that creates and destroys arrays, adds and
> removes disks, and runs data through them repeatedly, I randomly start
> seeing these warnings:
>
>   mdadm: Fail to create md0 when using
> /sys/module/md_mod/parameters/new_array, fallback to creation via node
>
> And then very occasionally get that warning paired with this error:
>
>   mdadm: unexpected failure opening /dev/md0
>
> Which stops the test because it fails to create an array. I also see a
> lot of the same bugs as below so it may be related.

Thanks for testing and debugging these issues. I also see issue 1).

Jens, please ignore this pull request. I will send v2 later.

Song
Thread overview: 4+ messages
2022-09-21 21:33 [GIT PULL] md-next 20220921 Song Liu
2022-09-21 22:37 ` Logan Gunthorpe
2022-09-21 23:44   ` Logan Gunthorpe
2022-09-22  0:40     ` Song Liu