Re: BUG: list_add corruption while doing migrate_swap -> balance_push

From: Hillf Danton @ 2022-09-07 12:00 UTC
  To: Kuyo Chang
  Cc: peterz, mgorman, Waiman Long, linux-kernel, linux-mm, jing-ting.wu

On 6 Sep 2022 20:54:58 +0800 Kuyo Chang <kuyo.chang@mediatek.com> wrote:
> Hi,
> 
> [Syndrome]
> A list_add corruption error occurs on kernel-5.15. The log shows:
> list_add corruption. prev->next should be next (ffffff81a6f08ba0), but
> was 0000000000000000. (prev=ffffff81a6f05930).
> 
> The call trace is as follows:
> ipanic_die
> notify_die
> die
> bug_handler
> brk_handler
> do_debug_exception
> el1_dbg
> el1h_64_sync_handler
> el1h_64_sync
> __list_add_valid
> cpu_stop_queue_work
> stop_one_cpu_nowait
> balance_push
> __schedule
> schedule
> do_sched_yield
> __arm64_sys_sched_yield
> invoke_syscall
> el0_svc_common
> do_el0_svc
> el0_svc
> el0t_64_sync_handler
> el0t_64_sync
> 
> [Analysis]
> From a memory dump and analysis of the stopper->works list, the failing
> code flow is as follows:
> 
> migrate_swap 
> ->stop_two_cpus
> 	->cpu_stop_queue_two_works
> 		->__cpu_stop_queue_work (add work->list to stopper->works, respectively)
> 			->list_add_tail(&work->list, &stopper->works);
> 	->wake_up_q(&wakeq);	
> ->wait_for_completion(&done.completion);
> ->wait_for_common
> ->schedule_timeout
> ->schedule
> 
> At this point, CPU hotplug is triggered.
> It registers the balance callback via the following flow:
> cpu_down(cpuid)
> ->_cpu_down
> ->cpuhp_set_state()
> ->set_cpu_dying(cpuid, true)
> ->sched_cpu_deactivate
> ->balance_push_set(cpuid, true)
> 	->rq->balance_callback = &balance_push_callback;
> 
> 
> Finally, 
> ->__schedule
> ->__balance_callbacks
> ->do_balance_callbacks(rq, __splice_balance_callbacks(rq, false));
> ->balance_push
> ->stop_one_cpu_nowait
> 	*work_buf = (struct cpu_stop_work){ .fn = fn, .arg = arg, .caller = _RET_IP_, };
> At this point the list_head's *next and *prev are initialized to NULL!!
> ->cpu_stop_queue_work
> ->__list_add_valid
> 
> Do you have any suggestion for this issue?

See if making balance_push() non-reentrant removes the chance of a
double list add in your case.
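
For illustration only, the suspected double add can be modelled in
userspace. This is a rough sketch, not kernel code: struct stop_work,
queue_stop_work(), list_add_tail_checked() and demo_double_queue() are
hypothetical names, and the checks are only loosely modelled on
__list_add_valid(). It assumes the same work item is queued a second time
while it is still linked on the list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Minimal stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

/* Hypothetical, trimmed-down analogue of struct cpu_stop_work. */
struct stop_work {
	struct list_head list;
	int (*fn)(void *arg);
	void *arg;
};

static void init_list_head(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Sanity checks loosely modelled on __list_add_valid(). */
static bool list_add_valid(struct list_head *prev, struct list_head *next)
{
	if (next->prev != prev) {
		fprintf(stderr, "list_add corruption. next->prev should be prev (%p), but was %p. (next=%p).\n",
			(void *)prev, (void *)next->prev, (void *)next);
		return false;
	}
	if (prev->next != next) {
		fprintf(stderr, "list_add corruption. prev->next should be next (%p), but was %p. (prev=%p).\n",
			(void *)next, (void *)prev->next, (void *)prev);
		return false;
	}
	return true;
}

/* Append new_node before head, refusing to touch a corrupted list. */
static bool list_add_tail_checked(struct list_head *new_node,
				  struct list_head *head)
{
	struct list_head *prev = head->prev;

	if (!list_add_valid(prev, head))
		return false;
	new_node->next = head;
	new_node->prev = prev;
	prev->next = new_node;
	head->prev = new_node;
	return true;
}

/* Mimics the stop_one_cpu_nowait() pattern quoted above. */
static bool queue_stop_work(struct stop_work *work, struct list_head *works,
			    int (*fn)(void *), void *arg)
{
	/* Members not named in the compound literal are zeroed, so
	 * work->list becomes { NULL, NULL } even if it is still linked. */
	*work = (struct stop_work){ .fn = fn, .arg = arg };
	return list_add_tail_checked(&work->list, works);
}

static int noop_fn(void *arg)
{
	(void)arg;
	return 0;
}

/* Returns 0 if the first add succeeds and the second one is rejected. */
static int demo_double_queue(void)
{
	struct list_head works;
	struct stop_work push_work;
	bool first, second;

	init_list_head(&works);
	first = queue_stop_work(&push_work, &works, noop_fn, NULL);
	assert(works.prev == &push_work.list);	/* item is the list tail */
	/* Queue the same item again before it has been dequeued. */
	second = queue_stop_work(&push_work, &works, noop_fn, NULL);
	return (first && !second) ? 0 : 1;
}
```

Calling demo_double_queue() from a small main() prints a "prev->next
should be next ... but was" message of the same shape as in your log,
because the list tail is the very item that was just zeroed.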

Hillf

--- linux-5.15/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -8815,6 +8815,7 @@ static int __balance_push_cpu_stop(void
 		cpu = select_fallback_rq(rq->cpu, p);
 		rq = __migrate_task(rq, &rf, p, cpu);
 	}
+	this_cpu_ptr(&push_work)->queued = 0;
 
 	rq_unlock(rq, &rf);
 	raw_spin_unlock_irq(&p->pi_lock);
@@ -8838,6 +8839,8 @@ static void balance_push(struct rq *rq)
 
 	lockdep_assert_rq_held(rq);
 
+	if (WARN_ON_ONCE(this_cpu_ptr(&push_work)->queued != 0))
+		return;
 	/*
 	 * Ensure the thing is persistent until balance_push_set(.on = false);
 	 */
@@ -8877,6 +8880,7 @@ static void balance_push(struct rq *rq)
 		return;
 	}
 
+	this_cpu_ptr(&push_work)->queued = 1;
 	get_task_struct(push_task);
 	/*
 	 * Temporarily drop rq->lock such that we can wake-up the stop task.
--- a/include/linux/stop_machine.h
+++ b/include/linux/stop_machine.h
@@ -27,6 +27,7 @@ struct cpu_stop_work {
 	unsigned long		caller;
 	void			*arg;
 	struct cpu_stop_done	*done;
+	unsigned		queued;
 };
 
 int stop_one_cpu(unsigned int cpu, cpu_stop_fn_t fn, void *arg);
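
The intent of the queued flag above can be sketched in userspace as well.
This is a single-threaded model under simplifying assumptions (no
rq->lock, no per-CPU data, no real stopper thread); struct
push_work_model, model_balance_push() and model_push_cpu_stop() are
hypothetical names used only for this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-CPU push_work plus the new flag. */
struct push_work_model {
	unsigned int queued;	/* 1 while a stop request is outstanding */
	unsigned int completed;	/* number of stop callbacks that have run */
};

/* Models balance_push(): refuse to queue while a request is pending. */
static bool model_balance_push(struct push_work_model *w)
{
	if (w->queued != 0)
		return false;	/* the patch warns once and returns here */
	w->queued = 1;
	/* The real code would call stop_one_cpu_nowait() at this point. */
	return true;
}

/* Models __balance_push_cpu_stop(): clear the flag once the work ran. */
static void model_push_cpu_stop(struct push_work_model *w)
{
	assert(w->queued == 1);
	w->completed++;
	w->queued = 0;
}
```

With this model, a second model_balance_push() before
model_push_cpu_stop() returns false, so the work item is never
re-initialized while it may still be linked on the stopper list.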

