* next/master boot bisection: next-20190430 on beagle-xm
From: kernelci.org bot @ 2019-04-30 20:51 UTC
  To: Tejun Heo, Sebastian Andrzej Siewior, Peter Zijlstra,
	tomeu.vizoso, guillaume.tucker, mgalka, Thomas Gleixner, broonie,
	matthew.hart, khilman, enric.balletbo, Ingo Molnar
  Cc: Peter Zijlstra, kernelci.org bot, Lai Jiangshan, Johannes Weiner,
	linux-kernel, Ingo Molnar

* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
* This automated bisection report was sent to you on the basis  *
* that you may be involved with the breaking commit it has      *
* found.  No manual investigation has been done to verify it,   *
* and the root cause of the problem may be somewhere else.      *
* Hope this helps!                                              *
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

next/master boot bisection: next-20190430 on beagle-xm

Summary:
  Start:      f43b05fd4c17 Add linux-next specific files for 20190430
  Details:    https://kernelci.org/boot/id/5cc84d7359b514b7ab55847b
  Plain log:  https://storage.kernelci.org//next/master/next-20190430/arm/multi_v7_defconfig+CONFIG_SMP=n/gcc-7/lab-baylibre/boot-omap3-beagle-xm.txt
  HTML log:   https://storage.kernelci.org//next/master/next-20190430/arm/multi_v7_defconfig+CONFIG_SMP=n/gcc-7/lab-baylibre/boot-omap3-beagle-xm.html
  Result:     6d25be5782e4 sched/core, workqueues: Distangle worker accounting from rq lock

Checks:
  revert:     PASS
  verify:     PASS

Parameters:
  Tree:       next
  URL:        git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
  Branch:     master
  Target:     beagle-xm
  CPU arch:   arm
  Lab:        lab-baylibre
  Compiler:   gcc-7
  Config:     multi_v7_defconfig+CONFIG_SMP=n
  Test suite: boot

Breaking commit found:

-------------------------------------------------------------------------------
commit 6d25be5782e482eb93e3de0c94d0a517879377d0
Author: Thomas Gleixner <tglx@linutronix.de>
Date:   Wed Mar 13 17:55:48 2019 +0100

    sched/core, workqueues: Distangle worker accounting from rq lock
    
    The worker accounting for CPU bound workers is plugged into the core
    scheduler code and the wakeup code. This is not a hard requirement and
    can be avoided by keeping track of the state in the workqueue code
    itself.
    
    Keep track of the sleeping state in the worker itself and call the
    notifier before entering the core scheduler. There might be false
    positives when the task is woken between that call and actually
    scheduling, but that's not really different from scheduling and being
    woken immediately after switching away. When nr_running is updated when
    the task is returning from schedule() then it is later compared when it
    is done from ttwu().
    
    [ bigeasy: preempt_disable() around wq_worker_sleeping() by Daniel Bristot de Oliveira ]
    
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Acked-by: Tejun Heo <tj@kernel.org>
    Cc: Daniel Bristot de Oliveira <bristot@redhat.com>
    Cc: Lai Jiangshan <jiangshanlai@gmail.com>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Link: http://lkml.kernel.org/r/ad2b29b5715f970bffc1a7026cabd6ff0b24076a.1532952814.git.bristot@redhat.com
    Signed-off-by: Ingo Molnar <mingo@kernel.org>

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 4778c48a7fda..6184a0856aab 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1685,10 +1685,6 @@ static inline void ttwu_activate(struct rq *rq, struct task_struct *p, int en_fl
 {
 	activate_task(rq, p, en_flags);
 	p->on_rq = TASK_ON_RQ_QUEUED;
-
-	/* If a worker is waking up, notify the workqueue: */
-	if (p->flags & PF_WQ_WORKER)
-		wq_worker_waking_up(p, cpu_of(rq));
 }
 
 /*
@@ -2106,56 +2102,6 @@ try_to_wake_up(struct task_struct *p, unsigned int state, int wake_flags)
 	return success;
 }
 
-/**
- * try_to_wake_up_local - try to wake up a local task with rq lock held
- * @p: the thread to be awakened
- * @rf: request-queue flags for pinning
- *
- * Put @p on the run-queue if it's not already there. The caller must
- * ensure that this_rq() is locked, @p is bound to this_rq() and not
- * the current task.
- */
-static void try_to_wake_up_local(struct task_struct *p, struct rq_flags *rf)
-{
-	struct rq *rq = task_rq(p);
-
-	if (WARN_ON_ONCE(rq != this_rq()) ||
-	    WARN_ON_ONCE(p == current))
-		return;
-
-	lockdep_assert_held(&rq->lock);
-
-	if (!raw_spin_trylock(&p->pi_lock)) {
-		/*
-		 * This is OK, because current is on_cpu, which avoids it being
-		 * picked for load-balance and preemption/IRQs are still
-		 * disabled avoiding further scheduler activity on it and we've
-		 * not yet picked a replacement task.
-		 */
-		rq_unlock(rq, rf);
-		raw_spin_lock(&p->pi_lock);
-		rq_relock(rq, rf);
-	}
-
-	if (!(p->state & TASK_NORMAL))
-		goto out;
-
-	trace_sched_waking(p);
-
-	if (!task_on_rq_queued(p)) {
-		if (p->in_iowait) {
-			delayacct_blkio_end(p);
-			atomic_dec(&rq->nr_iowait);
-		}
-		ttwu_activate(rq, p, ENQUEUE_WAKEUP | ENQUEUE_NOCLOCK);
-	}
-
-	ttwu_do_wakeup(rq, p, 0, rf);
-	ttwu_stat(p, smp_processor_id(), 0);
-out:
-	raw_spin_unlock(&p->pi_lock);
-}
-
 /**
  * wake_up_process - Wake up a specific process
  * @p: The process to be woken up.
@@ -3472,19 +3418,6 @@ static void __sched notrace __schedule(bool preempt)
 				atomic_inc(&rq->nr_iowait);
 				delayacct_blkio_start();
 			}
-
-			/*
-			 * If a worker went to sleep, notify and ask workqueue
-			 * whether it wants to wake up a task to maintain
-			 * concurrency.
-			 */
-			if (prev->flags & PF_WQ_WORKER) {
-				struct task_struct *to_wakeup;
-
-				to_wakeup = wq_worker_sleeping(prev);
-				if (to_wakeup)
-					try_to_wake_up_local(to_wakeup, &rf);
-			}
 		}
 		switch_count = &prev->nvcsw;
 	}
@@ -3544,6 +3477,20 @@ static inline void sched_submit_work(struct task_struct *tsk)
 {
 	if (!tsk->state || tsk_is_pi_blocked(tsk))
 		return;
+
+	/*
+	 * If a worker went to sleep, notify and ask workqueue whether
+	 * it wants to wake up a task to maintain concurrency.
+	 * As this function is called inside the schedule() context,
+	 * we disable preemption to avoid it calling schedule() again
+	 * in the possible wakeup of a kworker.
+	 */
+	if (tsk->flags & PF_WQ_WORKER) {
+		preempt_disable();
+		wq_worker_sleeping(tsk);
+		preempt_enable_no_resched();
+	}
+
 	/*
 	 * If we are going to sleep and we have plugged IO queued,
 	 * make sure to submit it to avoid deadlocks.
@@ -3552,6 +3499,12 @@ static inline void sched_submit_work(struct task_struct *tsk)
 		blk_schedule_flush_plug(tsk);
 }
 
+static void sched_update_worker(struct task_struct *tsk)
+{
+	if (tsk->flags & PF_WQ_WORKER)
+		wq_worker_running(tsk);
+}
+
 asmlinkage __visible void __sched schedule(void)
 {
 	struct task_struct *tsk = current;
@@ -3562,6 +3515,7 @@ asmlinkage __visible void __sched schedule(void)
 		__schedule(false);
 		sched_preempt_enable_no_resched();
 	} while (need_resched());
+	sched_update_worker(tsk);
 }
 EXPORT_SYMBOL(schedule);
 
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index ddee541ea97a..56180c9286f5 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -841,43 +841,32 @@ static void wake_up_worker(struct worker_pool *pool)
 }
 
 /**
- * wq_worker_waking_up - a worker is waking up
+ * wq_worker_running - a worker is running again
  * @task: task waking up
- * @cpu: CPU @task is waking up to
  *
- * This function is called during try_to_wake_up() when a worker is
- * being awoken.
- *
- * CONTEXT:
- * spin_lock_irq(rq->lock)
+ * This function is called when a worker returns from schedule()
  */
-void wq_worker_waking_up(struct task_struct *task, int cpu)
+void wq_worker_running(struct task_struct *task)
 {
 	struct worker *worker = kthread_data(task);
 
-	if (!(worker->flags & WORKER_NOT_RUNNING)) {
-		WARN_ON_ONCE(worker->pool->cpu != cpu);
+	if (!worker->sleeping)
+		return;
+	if (!(worker->flags & WORKER_NOT_RUNNING))
 		atomic_inc(&worker->pool->nr_running);
-	}
+	worker->sleeping = 0;
 }
 
 /**
  * wq_worker_sleeping - a worker is going to sleep
  * @task: task going to sleep
  *
- * This function is called during schedule() when a busy worker is
- * going to sleep.  Worker on the same cpu can be woken up by
- * returning pointer to its task.
- *
- * CONTEXT:
- * spin_lock_irq(rq->lock)
- *
- * Return:
- * Worker task on @cpu to wake up, %NULL if none.
+ * This function is called from schedule() when a busy worker is
+ * going to sleep.
  */
-struct task_struct *wq_worker_sleeping(struct task_struct *task)
+void wq_worker_sleeping(struct task_struct *task)
 {
-	struct worker *worker = kthread_data(task), *to_wakeup = NULL;
+	struct worker *next, *worker = kthread_data(task);
 	struct worker_pool *pool;
 
 	/*
@@ -886,13 +875,15 @@ struct task_struct *wq_worker_sleeping(struct task_struct *task)
 	 * checking NOT_RUNNING.
 	 */
 	if (worker->flags & WORKER_NOT_RUNNING)
-		return NULL;
+		return;
 
 	pool = worker->pool;
 
-	/* this can only happen on the local cpu */
-	if (WARN_ON_ONCE(pool->cpu != raw_smp_processor_id()))
-		return NULL;
+	if (WARN_ON_ONCE(worker->sleeping))
+		return;
+
+	worker->sleeping = 1;
+	spin_lock_irq(&pool->lock);
 
 	/*
 	 * The counterpart of the following dec_and_test, implied mb,
@@ -906,9 +897,12 @@ struct task_struct *wq_worker_sleeping(struct task_struct *task)
 	 * lock is safe.
 	 */
 	if (atomic_dec_and_test(&pool->nr_running) &&
-	    !list_empty(&pool->worklist))
-		to_wakeup = first_idle_worker(pool);
-	return to_wakeup ? to_wakeup->task : NULL;
+	    !list_empty(&pool->worklist)) {
+		next = first_idle_worker(pool);
+		if (next)
+			wake_up_process(next->task);
+	}
+	spin_unlock_irq(&pool->lock);
 }
 
 /**
@@ -4929,7 +4923,7 @@ static void rebind_workers(struct worker_pool *pool)
 		 *
 		 * WRITE_ONCE() is necessary because @worker->flags may be
 		 * tested without holding any lock in
-		 * wq_worker_waking_up().  Without it, NOT_RUNNING test may
+		 * wq_worker_running().  Without it, NOT_RUNNING test may
 		 * fail incorrectly leading to premature concurrency
 		 * management operations.
 		 */
diff --git a/kernel/workqueue_internal.h b/kernel/workqueue_internal.h
index cb68b03ca89a..498de0e909a4 100644
--- a/kernel/workqueue_internal.h
+++ b/kernel/workqueue_internal.h
@@ -44,6 +44,7 @@ struct worker {
 	unsigned long		last_active;	/* L: last active timestamp */
 	unsigned int		flags;		/* X: flags */
 	int			id;		/* I: worker id */
+	int			sleeping;	/* None */
 
 	/*
 	 * Opaque string set with work_set_desc().  Printed out with task
@@ -72,8 +73,8 @@ static inline struct worker *current_wq_worker(void)
  * Scheduler hooks for concurrency managed workqueue.  Only to be used from
  * sched/ and workqueue.c.
  */
-void wq_worker_waking_up(struct task_struct *task, int cpu);
-struct task_struct *wq_worker_sleeping(struct task_struct *task);
+void wq_worker_running(struct task_struct *task);
+void wq_worker_sleeping(struct task_struct *task);
 work_func_t wq_worker_last_func(struct task_struct *task);
 
 #endif /* _KERNEL_WORKQUEUE_INTERNAL_H */
-------------------------------------------------------------------------------
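
For readers following the accounting change, here is a minimal user-space sketch of the bookkeeping that the commit above moves into the workqueue code. It is a model, not kernel code: the names pool_model, worker_model, model_worker_sleeping() and model_worker_running() are hypothetical, plain integers stand in for the atomic nr_running counter and for pool->lock, and the wakeups counter stands in for wake_up_process().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's WORKER_NOT_RUNNING flag mask. */
#define MODEL_WORKER_NOT_RUNNING 0x1u

struct pool_model {
	int nr_running;       /* models atomic_t pool->nr_running */
	int pending_work;     /* non-zero models !list_empty(&pool->worklist) */
	int has_idle_worker;  /* non-zero models first_idle_worker() != NULL */
	int wakeups;          /* counts what would be wake_up_process() calls */
};

struct worker_model {
	unsigned int flags;
	int sleeping;         /* models the new worker->sleeping field */
	struct pool_model *pool;
};

/* Models wq_worker_sleeping(): runs before the task enters the scheduler. */
static void model_worker_sleeping(struct worker_model *w)
{
	assert(w != NULL && w->pool != NULL);

	if (w->flags & MODEL_WORKER_NOT_RUNNING)
		return;
	if (w->sleeping)	/* the patch uses WARN_ON_ONCE() here */
		return;

	w->sleeping = 1;
	w->pool->nr_running--;
	/* Last running worker sleeps while work is pending: wake an idle one. */
	if (w->pool->nr_running == 0 && w->pool->pending_work &&
	    w->pool->has_idle_worker)
		w->pool->wakeups++;
}

/* Models wq_worker_running(): runs when the task returns from schedule(). */
static void model_worker_running(struct worker_model *w)
{
	assert(w != NULL && w->pool != NULL);

	if (!w->sleeping)
		return;
	if (!(w->flags & MODEL_WORKER_NOT_RUNNING))
		w->pool->nr_running++;
	w->sleeping = 0;
}
```

Calling model_worker_running() twice in a row increments nr_running only once, because the first call clears the sleeping flag. That mirrors the early return on !worker->sleeping in the patch.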


Git bisection log:

-------------------------------------------------------------------------------
git bisect start
# good: [80871482fd5cb1cb396ea232237a7d9c540854f9] x86: make ZERO_PAGE() at least parse its argument
git bisect good 80871482fd5cb1cb396ea232237a7d9c540854f9
# bad: [f43b05fd4c176d42c7b3f3b99643910486fc49c8] Add linux-next specific files for 20190430
git bisect bad f43b05fd4c176d42c7b3f3b99643910486fc49c8
# good: [5581b5dd6d5a20de0a40ac8975ca66fe15324293] Merge remote-tracking branch 'crypto/master'
git bisect good 5581b5dd6d5a20de0a40ac8975ca66fe15324293
# good: [3ed1aaa4720275e2c6f94e109805472d55969148] Merge remote-tracking branch 'spi/for-next'
git bisect good 3ed1aaa4720275e2c6f94e109805472d55969148
# bad: [0606c6c8fc2478eb7d09202444412d4f9b484076] Merge remote-tracking branch 'staging/staging-next'
git bisect bad 0606c6c8fc2478eb7d09202444412d4f9b484076
# bad: [f0ca99b2ef58eb1d0509b996c9f4b16cb37780d0] Merge remote-tracking branch 'usb-serial/usb-next'
git bisect bad f0ca99b2ef58eb1d0509b996c9f4b16cb37780d0
# bad: [ded23883f168101a4de43e87f9329ee7fcdd540f] Merge branch 'locking/core'
git bisect bad ded23883f168101a4de43e87f9329ee7fcdd540f
# bad: [0dc77d22166d637e69728dde0121764b35f6d18e] Merge branch 'perf/core'
git bisect bad 0dc77d22166d637e69728dde0121764b35f6d18e
# good: [7a525b0cc661abb2d9004f619406df0fbe480106] Merge branch 'x86/asm'
git bisect good 7a525b0cc661abb2d9004f619406df0fbe480106
# good: [477f00f9617009a9a3a9271885231573b728ca4f] perf/x86/intel/ds: Extract code of event update in short period
git bisect good 477f00f9617009a9a3a9271885231573b728ca4f
# bad: [146b2c0aea6a74c5c22b6f0bb68b17f7601c3fea] Merge branch 'sched/core'
git bisect bad 146b2c0aea6a74c5c22b6f0bb68b17f7601c3fea
# bad: [ad2e379def135ebc079f89a0e0b1d987d243f949] sched/debug: Fix spelling mistake "logaritmic" -> "logarithmic"
git bisect bad ad2e379def135ebc079f89a0e0b1d987d243f949
# bad: [6d25be5782e482eb93e3de0c94d0a517879377d0] sched/core, workqueues: Distangle worker accounting from rq lock
git bisect bad 6d25be5782e482eb93e3de0c94d0a517879377d0
# good: [7ba7319f9e3898101bff5d63cbae5a6cc174c8c9] sched/core: Annotate perf_domain pointer with __rcu
git bisect good 7ba7319f9e3898101bff5d63cbae5a6cc174c8c9
# good: [d8743230c9f4e92f370ecd2a90c680ddcede6ae5] sched/topology: Fix build_sched_groups() comment
git bisect good d8743230c9f4e92f370ecd2a90c680ddcede6ae5
# good: [e2abb398115e9c33f3d1e25bf6d1d08badc58b13] sched/fair: Remove unneeded prototype of capacity_of()
git bisect good e2abb398115e9c33f3d1e25bf6d1d08badc58b13
# first bad commit: [6d25be5782e482eb93e3de0c94d0a517879377d0] sched/core, workqueues: Distangle worker accounting from rq lock
-------------------------------------------------------------------------------
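
The log above is a binary search over history. The following is a minimal sketch of that idea, assuming a linear history indexed from 0 and exactly one transition from good to bad. Real git bisect walks a commit graph that includes merges, and the names first_bad_commit() and bad_from_threshold() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Test callback: returns non-zero if the commit at idx fails the test. */
typedef int (*is_bad_fn)(size_t idx, void *ctx);

/*
 * Return the index of the first bad commit, given that commit `good`
 * passes and commit `bad` fails. If `steps` is non-NULL it is
 * incremented once per commit tested.
 */
static size_t first_bad_commit(size_t good, size_t bad, is_bad_fn is_bad,
			       void *ctx, unsigned int *steps)
{
	assert(good < bad);

	while (bad - good > 1) {
		size_t mid = good + (bad - good) / 2;

		if (is_bad(mid, ctx))
			bad = mid;	/* like "git bisect bad" */
		else
			good = mid;	/* like "git bisect good" */
		if (steps)
			(*steps)++;
	}
	return bad;
}

/* Example test: every commit at or after *ctx is bad. */
static int bad_from_threshold(size_t idx, void *ctx)
{
	size_t first_bad = *(const size_t *)ctx;

	return idx >= first_bad;
}
```

With 16 commits between the known-good and known-bad endpoints, the search needs 4 tests. A failure that does not reproduce on every run breaks the single-transition assumption and can steer the search to the wrong commit.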


* Re: next/master boot bisection: next-20190430 on beagle-xm
From: Sebastian Andrzej Siewior @ 2019-05-01 15:37 UTC
  To: kernelci.org bot
  Cc: Tejun Heo, Peter Zijlstra, tomeu.vizoso, guillaume.tucker,
	mgalka, Thomas Gleixner, broonie, matthew.hart, khilman,
	enric.balletbo, Ingo Molnar, Lai Jiangshan, Johannes Weiner,
	linux-kernel, Ingo Molnar, Tony Lindgren


On 2019-04-30 13:51:40 [-0700], kernelci.org bot wrote:
> next/master boot bisection: next-20190430 on beagle-xm
> 
> Summary:
>   Start:      f43b05fd4c17 Add linux-next specific files for 20190430
>   Details:    https://kernelci.org/boot/id/5cc84d7359b514b7ab55847b
>   Plain log:  https://storage.kernelci.org//next/master/next-20190430/arm/multi_v7_defconfig+CONFIG_SMP=n/gcc-7/lab-baylibre/boot-omap3-beagle-xm.txt
>   HTML log:   https://storage.kernelci.org//next/master/next-20190430/arm/multi_v7_defconfig+CONFIG_SMP=n/gcc-7/lab-baylibre/boot-omap3-beagle-xm.html
>   Result:     6d25be5782e4 sched/core, workqueues: Distangle worker accounting from rq lock
> 
> Checks:
>   revert:     PASS
>   verify:     PASS
> 
> Parameters:
>   Tree:       next
>   URL:        git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
>   Branch:     master
>   Target:     beagle-xm
>   CPU arch:   arm
>   Lab:        lab-baylibre
>   Compiler:   gcc-7
>   Config:     multi_v7_defconfig+CONFIG_SMP=n
>   Test suite: boot
> 
> Breaking commit found:
> 
> -------------------------------------------------------------------------------
> commit 6d25be5782e482eb93e3de0c94d0a517879377d0
> Author: Thomas Gleixner <tglx@linutronix.de>
> Date:   Wed Mar 13 17:55:48 2019 +0100
> 
>     sched/core, workqueues: Distangle worker accounting from rq lock

According to the bootlog it just stopped its output. This commit has been in
next for a week or two, so I don't understand why this pops up now.

I just revived my BBB and I can boot the commit in question. Currently
that is as close as I can get to a beagle-xm.
Looking at
	https://kernelci.org/boot/id/5cc9a64359b514a77f5584af/
it seems that the very same board managed to boot linux-next for
next-20190501.

Side note: I can't boot next-20190501 on my BBB, bisect points to commit
  1a5cd7c23cc52 ("bus: ti-sysc: Enable all clocks directly during init to read revision")

any idea?

Sebastian


* Re: next/master boot bisection: next-20190430 on beagle-xm
From: Tony Lindgren @ 2019-05-01 16:29 UTC
  To: Sebastian Andrzej Siewior
  Cc: kernelci.org bot, Tejun Heo, Peter Zijlstra, tomeu.vizoso,
	guillaume.tucker, mgalka, Thomas Gleixner, broonie, matthew.hart,
	khilman, enric.balletbo, Ingo Molnar, Lai Jiangshan,
	Johannes Weiner, linux-kernel, Ingo Molnar, Kevin Hilman,
	linux-omap

Hi,

* Sebastian Andrzej Siewior <bigeasy@linutronix.de> [190501 15:37]:
> 
> On 2019-04-30 13:51:40 [-0700], kernelci.org bot wrote:
> > next/master boot bisection: next-20190430 on beagle-xm
> > 
> > Summary:
> >   Start:      f43b05fd4c17 Add linux-next specific files for 20190430
> >   Details:    https://kernelci.org/boot/id/5cc84d7359b514b7ab55847b
> >   Plain log:  https://storage.kernelci.org//next/master/next-20190430/arm/multi_v7_defconfig+CONFIG_SMP=n/gcc-7/lab-baylibre/boot-omap3-beagle-xm.txt
> >   HTML log:   https://storage.kernelci.org//next/master/next-20190430/arm/multi_v7_defconfig+CONFIG_SMP=n/gcc-7/lab-baylibre/boot-omap3-beagle-xm.html
> >   Result:     6d25be5782e4 sched/core, workqueues: Distangle worker accounting from rq lock
> > 
> > Checks:
> >   revert:     PASS
> >   verify:     PASS
> > 
> > Parameters:
> >   Tree:       next
> >   URL:        git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
> >   Branch:     master
> >   Target:     beagle-xm
> >   CPU arch:   arm
> >   Lab:        lab-baylibre
> >   Compiler:   gcc-7
> >   Config:     multi_v7_defconfig+CONFIG_SMP=n
> >   Test suite: boot
> > 
> > Breaking commit found:
> > 
> > -------------------------------------------------------------------------------
> > commit 6d25be5782e482eb93e3de0c94d0a517879377d0
> > Author: Thomas Gleixner <tglx@linutronix.de>
> > Date:   Wed Mar 13 17:55:48 2019 +0100
> > 
> >     sched/core, workqueues: Distangle worker accounting from rq lock
> 
> According to the bootlog it just stopped its output. This commit is in
> next since a week or two so I don't understand why this pops up now.

Adding Kevin to Cc, he just confirmed on #armlinux irc that he is able to
reproduce this with CONFIG_SMP=n and root=/dev/ram0. I could not reproduce
this issue so far on omap3 with NFSroot at least.

> I just revived my BBB and I can boot that commit in question. Currently
> that as close as I get to a beagle-xm. 
> Looking at
> 	https://kernelci.org/boot/id/5cc9a64359b514a77f5584af/
> it seems that the very same board managed to boot linux-next for
> next-20190501.
> 
> Side note: I can't boot next-20190501 on my BBB, bisect points to commit
>   1a5cd7c23cc52 ("bus: ti-sysc: Enable all clocks directly during init to read revision")
> 
> any idea?

Oh interesting thanks for letting me know. Next boots fine for me here
with NFSroot on BBB.

Do you have some output on what happens so I can investigate?

Regards,

Tony



* Re: next/master boot bisection: next-20190430 on beagle-xm
From: Sebastian Andrzej Siewior @ 2019-05-01 16:44 UTC
  To: Tony Lindgren
  Cc: kernelci.org bot, Tejun Heo, Peter Zijlstra, tomeu.vizoso,
	guillaume.tucker, mgalka, Thomas Gleixner, broonie, matthew.hart,
	khilman, enric.balletbo, Ingo Molnar, Lai Jiangshan,
	Johannes Weiner, linux-kernel, Ingo Molnar, Kevin Hilman,
	linux-omap

On 2019-05-01 09:29:44 [-0700], Tony Lindgren wrote:
> Hi,
> 
> * Sebastian Andrzej Siewior <bigeasy@linutronix.de> [190501 15:37]:
> > 
> > On 2019-04-30 13:51:40 [-0700], kernelci.org bot wrote:
> > > next/master boot bisection: next-20190430 on beagle-xm
> > > 
> > > Summary:
> > >   Start:      f43b05fd4c17 Add linux-next specific files for 20190430
> > >   Details:    https://kernelci.org/boot/id/5cc84d7359b514b7ab55847b
> > >   Plain log:  https://storage.kernelci.org//next/master/next-20190430/arm/multi_v7_defconfig+CONFIG_SMP=n/gcc-7/lab-baylibre/boot-omap3-beagle-xm.txt
> > >   HTML log:   https://storage.kernelci.org//next/master/next-20190430/arm/multi_v7_defconfig+CONFIG_SMP=n/gcc-7/lab-baylibre/boot-omap3-beagle-xm.html
> > >   Result:     6d25be5782e4 sched/core, workqueues: Distangle worker accounting from rq lock
> > > 
> > > Checks:
> > >   revert:     PASS
> > >   verify:     PASS
> > > 
> > > Parameters:
> > >   Tree:       next
> > >   URL:        git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
> > >   Branch:     master
> > >   Target:     beagle-xm
> > >   CPU arch:   arm
> > >   Lab:        lab-baylibre
> > >   Compiler:   gcc-7
> > >   Config:     multi_v7_defconfig+CONFIG_SMP=n
> > >   Test suite: boot
> > > 
> > > Breaking commit found:
> > > 
> > > -------------------------------------------------------------------------------
> > > commit 6d25be5782e482eb93e3de0c94d0a517879377d0
> > > Author: Thomas Gleixner <tglx@linutronix.de>
> > > Date:   Wed Mar 13 17:55:48 2019 +0100
> > > 
> > >     sched/core, workqueues: Distangle worker accounting from rq lock
> > 
> > According to the bootlog it just stopped its output. This commit is in
> > next since a week or two so I don't understand why this pops up now.
> 
> Adding Kevin to Cc, he just confirmed on #armlinux irc that he is able to
> reproduce this with CONFIG_SMP=n and root=/dev/ram0. I could not reproduce
> this issue so far on omap3 with NFSroot at least.

So the problem remains even though today's job passed?
 
> > I just revived my BBB and I can boot that commit in question. Currently
> > that as close as I get to a beagle-xm. 
> > Looking at
> > 	https://kernelci.org/boot/id/5cc9a64359b514a77f5584af/
> > it seems that the very same board managed to boot linux-next for
> > next-20190501.
> > 
> > Side note: I can't boot next-20190501 on my BBB, bisect points to commit
> >   1a5cd7c23cc52 ("bus: ti-sysc: Enable all clocks directly during init to read revision")
> > 
> > any idea?
> 
> Oh interesting thanks for letting me know. Next boots fine for me here
> with NFSroot on BBB.
> 
> Do you have some output on what happens so I can investigate?

Nope, the console remains dark.

> Regards,
> 
> Tony

Sebastian


* Re: next/master boot bisection: next-20190430 on beagle-xm
From: Tony Lindgren @ 2019-05-01 16:52 UTC
  To: Sebastian Andrzej Siewior
  Cc: kernelci.org bot, Tejun Heo, Peter Zijlstra, tomeu.vizoso,
	guillaume.tucker, mgalka, Thomas Gleixner, broonie, matthew.hart,
	khilman, enric.balletbo, Ingo Molnar, Lai Jiangshan,
	Johannes Weiner, linux-kernel, Ingo Molnar, Kevin Hilman,
	linux-omap

* Sebastian Andrzej Siewior <bigeasy@linutronix.de> [190501 16:45]:
> On 2019-05-01 09:29:44 [-0700], Tony Lindgren wrote:
> > Hi,
> > 
> > * Sebastian Andrzej Siewior <bigeasy@linutronix.de> [190501 15:37]:
> > > 
> > > On 2019-04-30 13:51:40 [-0700], kernelci.org bot wrote:
> > > > next/master boot bisection: next-20190430 on beagle-xm
> > > > 
> > > > Summary:
> > > >   Start:      f43b05fd4c17 Add linux-next specific files for 20190430
> > > >   Details:    https://kernelci.org/boot/id/5cc84d7359b514b7ab55847b
> > > >   Plain log:  https://storage.kernelci.org//next/master/next-20190430/arm/multi_v7_defconfig+CONFIG_SMP=n/gcc-7/lab-baylibre/boot-omap3-beagle-xm.txt
> > > >   HTML log:   https://storage.kernelci.org//next/master/next-20190430/arm/multi_v7_defconfig+CONFIG_SMP=n/gcc-7/lab-baylibre/boot-omap3-beagle-xm.html
> > > >   Result:     6d25be5782e4 sched/core, workqueues: Distangle worker accounting from rq lock
> > > > 
> > > > Checks:
> > > >   revert:     PASS
> > > >   verify:     PASS
> > > > 
> > > > Parameters:
> > > >   Tree:       next
> > > >   URL:        git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
> > > >   Branch:     master
> > > >   Target:     beagle-xm
> > > >   CPU arch:   arm
> > > >   Lab:        lab-baylibre
> > > >   Compiler:   gcc-7
> > > >   Config:     multi_v7_defconfig+CONFIG_SMP=n
> > > >   Test suite: boot
> > > > 
> > > > Breaking commit found:
> > > > 
> > > > -------------------------------------------------------------------------------
> > > > commit 6d25be5782e482eb93e3de0c94d0a517879377d0
> > > > Author: Thomas Gleixner <tglx@linutronix.de>
> > > > Date:   Wed Mar 13 17:55:48 2019 +0100
> > > > 
> > > >     sched/core, workqueues: Distangle worker accounting from rq lock
> > > 
> > > According to the bootlog it just stopped its output. This commit is in
> > > next since a week or two so I don't understand why this pops up now.
> > 
> > Adding Kevin to Cc, he just confirmed on #armlinux irc that he is able to
> > reproduce this with CONFIG_SMP=n and root=/dev/ram0. I could not reproduce
> > this issue so far on omap3 with NFSroot at least.
> 
> So that problem remains even that the job for today passed?

So it seems, let's wait and see what Kevin comes up with.

> > > I just revived my BBB and I can boot that commit in question. Currently
> > > that as close as I get to a beagle-xm. 
> > > Looking at
> > > 	https://kernelci.org/boot/id/5cc9a64359b514a77f5584af/
> > > it seems that the very same board managed to boot linux-next for
> > > next-20190501.
> > > 
> > > Side note: I can't boot next-20190501 on my BBB, bisect points to commit
> > >   1a5cd7c23cc52 ("bus: ti-sysc: Enable all clocks directly during init to read revision")
> > > 
> > > any idea?
> > 
> > Oh interesting thanks for letting me know. Next boots fine for me here
> > with NFSroot on BBB.
> > 
> > Do you have some output on what happens so I can investigate?
> 
> Nope, the console remains dark.

OK. Can you please email me your .config and the kernel cmdline you're
using? I'll try to reproduce that one here.

Regards,

Tony


* Re: next/master boot bisection: next-20190430 on beagle-xm
From: Sebastian Andrzej Siewior @ 2019-05-01 17:01 UTC
  To: Tony Lindgren
  Cc: kernelci.org bot, Tejun Heo, Peter Zijlstra, tomeu.vizoso,
	guillaume.tucker, mgalka, Thomas Gleixner, broonie, matthew.hart,
	khilman, enric.balletbo, Ingo Molnar, Lai Jiangshan,
	Johannes Weiner, linux-kernel, Ingo Molnar, Kevin Hilman,
	linux-omap

On 2019-05-01 09:52:24 [-0700], Tony Lindgren wrote:
> > > Oh interesting thanks for letting me know. Next boots fine for me here
> > > with NFSroot on BBB.
> > > 
> > > Do you have some output on what happens so I can investigate?
> > 
> > Nope, the console remains dark.
> 
> OK. Can you please email me your .config and the kernel cmdline you're
> using? I'll try to reproduce that one here.

This is "multi_v7_defconfig+CONFIG_SMP=n", and my earlyprintk had vanished.
With it added back:
|[    0.000000] Booting Linux on physical CPU 0x0
|[    0.000000] Linux version 5.1.0-rc7-next-20190501 (bigeasy@flow) (gcc version 8.3.0 (Debian 8.3.0-7)) #29 Wed May 1 18:55:24 CEST 2019
|[    0.000000] CPU: ARMv7 Processor [413fc082] revision 2 (ARMv7), cr=10c5387d
|[    0.000000] CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing instruction cache
|[    0.000000] OF: fdt: Machine model: TI AM335x BeagleBone Black
|[    0.000000] printk: bootconsole [earlycon0] enabled
|[    0.000000] Memory policy: Data cache writeback
|[    0.000000] efi: Getting EFI parameters from FDT:
|[    0.000000] efi: UEFI not found.
|[    0.000000] cma: Reserved 64 MiB at 0x9b800000
|[    0.000000] CPU: All CPU(s) started in SVC mode.
|[    0.000000] AM335X ES2.0 (neon)
|[    0.000000] Built 1 zonelists, mobility grouping on.  Total pages: 129540
|[    0.000000] Kernel command line: console=ttyO0,115200n8 root=/dev/mmcblk1p2 rootwait coherent_pool=1M net.ifnames=0 earlyprintk
|[    0.000000] Dentry cache hash table entries: 65536 (order: 6, 262144 bytes)
|[    0.000000] Inode-cache hash table entries: 32768 (order: 5, 131072 bytes)
|[    0.000000] Memory: 430304K/522240K available (11264K kernel code, 1659K rwdata, 4924K rodata, 2048K init, 383K bss, 26400K reserved, 65536K cma-reserved, 0K highmem)
|[    0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
|[    0.000000] NR_IRQS: 16, nr_irqs: 16, preallocated irqs: 16
|[    0.000000] IRQ: Found an INTC at 0x(ptrval) (revision 5.0) with 128 interrupts
|[    0.000000] random: get_random_bytes called from start_kernel+0x2ec/0x478 with crng_init=0
|[    0.000000] OMAP clockevent source: timer2 at 24000000 Hz
|[    0.000015] sched_clock: 32 bits at 24MHz, resolution 41ns, wraps every 89478484971ns
|[    0.008050] clocksource: timer1: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 79635851949 ns
|[    0.017499] OMAP clocksource: timer1 at 24000000 Hz
|[    0.028449] timer_probe: no matching timers found
|[    0.033509] Console: colour dummy device 80x30
|[    0.038089] WARNING: Your 'console=ttyO0' has been replaced by 'ttyS0'
|[    0.044781] This ensures that you still see kernel messages. Please
|[    0.051197] update your kernel commandline.
|[    0.055513] Calibrating delay loop... 996.14 BogoMIPS (lpj=4980736)
|[    0.100098] pid_max: default: 32768 minimum: 301
|[    0.104992] Mount-cache hash table entries: 1024 (order: 0, 4096 bytes)
|[    0.111774] Mountpoint-cache hash table entries: 1024 (order: 0, 4096 bytes)
|[    0.119499] *** VALIDATE proc ***
|[    0.123004] *** VALIDATE cgroup1 ***
|[    0.126693] *** VALIDATE cgroup2 ***
|[    0.130358] CPU: Testing write buffer coherency: ok
|[    0.135408] CPU0: Spectre v2: using BPIALL workaround
|[    0.141140] Setting up static identity map for 0x80300000 - 0x803000ac
|[    0.153146] EFI services will not be available.
|[    0.159067] devtmpfs: initialized
|[    0.171139] VFP support v0.3: implementor 41 architecture 3 part 30 variant c rev 3
|[    0.179275] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604462750000 ns
|[    0.189383] futex hash table entries: 256 (order: -1, 3072 bytes)
|[    0.199562] pinctrl core: initialized pinctrl subsystem
|[    0.207278] DMI not present or invalid.
|[    0.211649] NET: Registered protocol family 16
|[    0.219070] DMA: preallocated 1024 KiB pool for atomic coherent allocations
|[    0.244016] l3-aon-clkctrl:0000:0: failed to disable
|[    0.277247] cpuidle: using governor menu
|[    0.302433] No ATAGs?
|[    0.302443] hw-breakpoint: debug architecture 0x4 unsupported.
|[    0.311790] omap4_sram_init:Unable to allocate sram needed to handle errata I688
|[    0.319401] omap4_sram_init:Unable to get sram pool needed to handle errata I688
|[    0.332438] Serial: AMBA PL011 UART driver
|[    0.355726] edma 49000000.edma: TI EDMA DMA engine driver
|[    0.362309] AT91: Could not find identification node
|[    0.365980] vgaarb: loaded
|[    0.374698] SCSI subsystem initialized
|[    0.379005] usbcore: registered new interface driver usbfs
|[    0.384723] usbcore: registered new interface driver hub
|[    0.390205] usbcore: registered new device driver usb
|[    0.396826] pps_core: LinuxPPS API ver. 1 registered
|[    0.401912] pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti <giometti@linux.it>
|[    0.411320] PTP clock support registered
|[    0.415536] EDAC MC: Ver: 3.0.0
|[    0.421533] clocksource: Switched to clocksource timer1
|[    0.823224] NET: Registered protocol family 2
|[    0.828261] tcp_listen_portaddr_hash hash table entries: 512 (order: 0, 4096 bytes)
|[    0.836192] TCP established hash table entries: 4096 (order: 2, 16384 bytes)
|[    0.843460] TCP bind hash table entries: 4096 (order: 2, 16384 bytes)
|[    0.850087] TCP: Hash tables configured (established 4096 bind 4096)
|[    0.856698] UDP hash table entries: 256 (order: 0, 4096 bytes)
|[    0.862700] UDP-Lite hash table entries: 256 (order: 0, 4096 bytes)
|[    0.869257] NET: Registered protocol family 1
|[    0.874257] RPC: Registered named UNIX socket transport module.
|[    0.880325] RPC: Registered udp transport module.
|[    0.885174] RPC: Registered tcp transport module.
|[    0.889989] RPC: Registered tcp NFSv4.1 backchannel transport module.
|[    0.897690] hw perfevents: enabled with armv7_cortex_a8 PMU driver, 5 counters available
|[    0.907566] Initialise system trusted keyrings
|[    0.912417] workingset: timestamp_bits=30 max_order=17 bucket_order=0
|[    0.924685] squashfs: version 4.0 (2009/01/31) Phillip Lougher
|[    0.931401] NFS: Registering the id_resolver key type
|[    0.936677] Key type id_resolver registered
|[    0.940961] Key type id_legacy registered
|[    0.945116] ntfs: driver 2.1.32 [Flags: R/O].
|[    0.950130] Key type asymmetric registered
|[    0.954380] Asymmetric key parser 'x509' registered
|[    0.959421] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 247)
|[    0.967018] io scheduler mq-deadline registered
|[    0.971667] io scheduler kyber registered
|[    0.983052] OMAP GPIO hardware version 0.1
|[    1.010470] ti-sysc 48042000.target-module: sysc_flags 00000222 != 00000022
|[    1.018713] ti-sysc 48044000.target-module: sysc_flags 00000222 != 00000022
|[    1.026947] ti-sysc 48046000.target-module: sysc_flags 00000222 != 00000022
|[    1.035182] ti-sysc 48048000.target-module: sysc_flags 00000222 != 00000022
|[    1.043393] ti-sysc 4804a000.target-module: sysc_flags 00000222 != 00000022
|[    1.073661] Unhandled fault: external abort on non-linefetch (0x1028) at 0xfa1cc000
|[    1.081518] pgd = (ptrval)
|[    1.084285] [fa1cc000] *pgd=48011452(bad)
|[    1.088398] Internal error: : 1028 [#1] ARM
|[    1.092681] Modules linked in:
|[    1.095814] CPU: 0 PID: 1 Comm: swapper Not tainted 5.1.0-rc7-next-20190501 #29
|[    1.103300] Hardware name: Generic AM33XX (Flattened Device Tree)
|[    1.109560] PC is at sysc_probe+0x958/0x10a4
|[    1.113932] LR is at sysc_probe+0x928/0x10a4
|[    1.118302] pc : [<c0644e38>]    lr : [<c0644e08>]    psr: 60000013
|[    1.124720] sp : db0b1db8  ip : 00000013  fp : c162ac60
|[    1.130069] r10: 00000000  r9 : 00000028  r8 : 00000001
|[    1.135418] r7 : 00000000  r6 : db191210  r5 : c1604048  r4 : db345940
|[    1.142103] r3 : fa1cc000  r2 : 00000000  r1 : 00000000  r0 : 00000000
|[    1.148793] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
|[    1.156103] Control: 10c5387d  Table: 80204019  DAC: 00000051
|[    1.161987] Process swapper (pid: 1, stack limit = 0x(ptrval))
|…
|[    1.331712] [<c0644e38>] (sysc_probe) from [<c08f027c>] (platform_drv_probe+0x48/0x98)
|[    1.339831] [<c08f027c>] (platform_drv_probe) from [<c08ee41c>] (really_probe+0xf0/0x2c8)
|[    1.348216] [<c08ee41c>] (really_probe) from [<c08ee754>] (driver_probe_device+0x60/0x16c)
|[    1.356688] [<c08ee754>] (driver_probe_device) from [<c08eea00>] (device_driver_attach+0x58/0x60)
|[    1.365782] [<c08eea00>] (device_driver_attach) from [<c08eea60>] (__driver_attach+0x58/0xcc)
|[    1.374521] [<c08eea60>] (__driver_attach) from [<c08ec8d8>] (bus_for_each_dev+0x74/0xb4)
|[    1.382903] [<c08ec8d8>] (bus_for_each_dev) from [<c08ed944>] (bus_add_driver+0x1b8/0x1d8)
|[    1.391374] [<c08ed944>] (bus_add_driver) from [<c08ef394>] (driver_register+0x74/0x108)
|[    1.399672] [<c08ef394>] (driver_register) from [<c0302d88>] (do_one_initcall+0x50/0x1a4)
|[    1.408064] [<c0302d88>] (do_one_initcall) from [<c1401064>] (kernel_init_freeable+0x1c4/0x25c)
|[    1.416989] [<c1401064>] (kernel_init_freeable) from [<c0dedba4>] (kernel_init+0x8/0x10c)
|[    1.425373] [<c0dedba4>] (kernel_init) from [<c03010e8>] (ret_from_fork+0x14/0x2c)
|[    1.433127] Exception stack(0xdb0b1fb0 to 0xdb0b1ff8)
|[    1.438301] 1fa0:                                     00000000 00000000 00000000 00000000
|[    1.446683] 1fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
|[    1.455063] 1fe0: 00000000 00000000 00000000 00000000 00000013 00000000
|[    1.461845] Code: e3130004 1a000126 e5943014 e0833001 (e5930000) 
|[    1.468105] ---[ end trace 5481d6c45bd9fae0 ]---
|[    1.472934] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
|[    1.480784] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b ]---

with  arm-linux-gnueabihf-addr2line -i c0644e38 -e vmlinux
| arch/arm/include/asm/io.h:117
| drivers/bus/ti-sysc.c:117
| drivers/bus/ti-sysc.c:132
| drivers/bus/ti-sysc.c:1361
| drivers/bus/ti-sysc.c:2117
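
As a cross-check on the oops itself: the bracketed word at the end of
the "Code:" line is the instruction at the faulting PC. Below is a
rough sketch of decoding it, assuming the standard ARM (A32) single
data transfer encoding. decode_ldr_str is a hypothetical helper that
only handles the plain immediate-offset form; it is not a full
disassembler.

```python
def decode_ldr_str(insn: int) -> str:
    """Decode an A32 LDR/STR with immediate offset (illustrative only)."""
    # Bits 27:26 must be 0b01 (single data transfer), bit 25 (I) must be 0.
    if (insn >> 26) & 0x3 != 0b01 or (insn >> 25) & 1:
        raise ValueError("not an immediate-offset LDR/STR")
    pre_indexed = (insn >> 24) & 1   # P bit
    writeback = (insn >> 21) & 1     # W bit
    if not pre_indexed or writeback:
        raise ValueError("only plain offset addressing is handled here")
    up = (insn >> 23) & 1            # U bit: add (1) or subtract (0) offset
    byte = (insn >> 22) & 1          # B bit: byte (1) or word (0) access
    load = (insn >> 20) & 1          # L bit: load (1) or store (0)
    rn = (insn >> 16) & 0xF          # base register
    rd = (insn >> 12) & 0xF          # destination/source register
    imm = insn & 0xFFF               # 12-bit immediate offset
    op = ("ldr" if load else "str") + ("b" if byte else "")
    sign = "" if up else "-"
    return f"{op} r{rd}, [r{rn}, #{sign}{imm}]"


# The bracketed word from the "Code:" line in the oops.
print(decode_ldr_str(0xE5930000))  # prints "ldr r0, [r3, #0]"
```

With r3 = fa1cc000 in the register dump, that is a 32-bit read from
the address reported in the abort.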

Does this help?

> Regards,
> 
> Tony

Sebastian

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: next/master boot bisection: next-20190430 on beagle-xm
  2019-05-01 17:01         ` Sebastian Andrzej Siewior
@ 2019-05-01 17:44           ` Tony Lindgren
  2019-05-01 19:03             ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 13+ messages in thread
From: Tony Lindgren @ 2019-05-01 17:44 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: kernelci.org bot, Tejun Heo, Peter Zijlstra, tomeu.vizoso,
	guillaume.tucker, mgalka, Thomas Gleixner, broonie, matthew.hart,
	khilman, enric.balletbo, Ingo Molnar, Lai Jiangshan,
	Johannes Weiner, linux-kernel, Ingo Molnar, Kevin Hilman,
	linux-omap

* Sebastian Andrzej Siewior <bigeasy@linutronix.de> [190501 17:01]:
> On 2019-05-01 09:52:24 [-0700], Tony Lindgren wrote:
> > > > Oh interesting thanks for letting me know. Next boots fine for me here
> > > > with NFSroot on BBB.
> > > > 
> > > > Do you have some output on what happens so I can investigate?
> > > 
> > > Nope, the console remains dark.
> > 
> > OK. Can you please email me your .config and the kernel cmdline you're
> > using? I'll try to reproduce that one here.
> 
> This is "multi_v7_defconfig+CONFIG_SMP=n" and my earlyprintk vanished.
> So with this added:
> |[    0.000000] Booting Linux on physical CPU 0x0
> |[    0.000000] Linux version 5.1.0-rc7-next-20190501 (bigeasy@flow) (gcc version 8.3.0 (Debian 8.3.0-7)) #29 Wed May 1 18:55:24 CEST 2019
> |[    0.000000] CPU: ARMv7 Processor [413fc082] revision 2 (ARMv7), cr=10c5387d
> |[    0.000000] CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing instruction cache
> |[    0.000000] OF: fdt: Machine model: TI AM335x BeagleBone Black
> |[    0.000000] printk: bootconsole [earlycon0] enabled
> |[    0.000000] Memory policy: Data cache writeback
> |[    0.000000] efi: Getting EFI parameters from FDT:
> |[    0.000000] efi: UEFI not found.
> |[    0.000000] cma: Reserved 64 MiB at 0x9b800000
> |[    0.000000] CPU: All CPU(s) started in SVC mode.
> |[    0.000000] AM335X ES2.0 (neon)
> |[    0.000000] Built 1 zonelists, mobility grouping on.  Total pages: 129540
> |[    0.000000] Kernel command line: console=ttyO0,115200n8 root=/dev/mmcblk1p2 rootwait coherent_pool=1M net.ifnames=0 earlyprintk

Hmm so I tried without "earlycon" in command line thinking it might be
happening with just "earlyprintk" but still no luck.

BTW, in general you might want to update your kernel command line
options to:

debug earlyprintk earlycon

That way you get early output via earlycon, which should already be
enabled by default, without needing CONFIG_DEBUG_LL=y.

> |[    1.073661] Unhandled fault: external abort on non-linefetch (0x1028) at 0xfa1cc000
> |[    1.081518] pgd = (ptrval)
> |[    1.084285] [fa1cc000] *pgd=48011452(bad)
> |[    1.088398] Internal error: : 1028 [#1] ARM
> |[    1.092681] Modules linked in:
> |[    1.095814] CPU: 0 PID: 1 Comm: swapper Not tainted 5.1.0-rc7-next-20190501 #29
> |[    1.103300] Hardware name: Generic AM33XX (Flattened Device Tree)
> |[    1.109560] PC is at sysc_probe+0x958/0x10a4
> |[    1.113932] LR is at sysc_probe+0x928/0x10a4
> |[    1.118302] pc : [<c0644e38>]    lr : [<c0644e08>]    psr: 60000013
> |[    1.124720] sp : db0b1db8  ip : 00000013  fp : c162ac60
> |[    1.130069] r10: 00000000  r9 : 00000028  r8 : 00000001
> |[    1.135418] r7 : 00000000  r6 : db191210  r5 : c1604048  r4 : db345940
> |[    1.142103] r3 : fa1cc000  r2 : 00000000  r1 : 00000000  r0 : 00000000
> |[    1.148793] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
> |[    1.156103] Control: 10c5387d  Table: 80204019  DAC: 00000051
> |[    1.161987] Process swapper (pid: 1, stack limit = 0x(ptrval))
> |…
> |[    1.331712] [<c0644e38>] (sysc_probe) from [<c08f027c>] (platform_drv_probe+0x48/0x98)
> |[    1.339831] [<c08f027c>] (platform_drv_probe) from [<c08ee41c>] (really_probe+0xf0/0x2c8)
> |[    1.348216] [<c08ee41c>] (really_probe) from [<c08ee754>] (driver_probe_device+0x60/0x16c)
> |[    1.356688] [<c08ee754>] (driver_probe_device) from [<c08eea00>] (device_driver_attach+0x58/0x60)
> |[    1.365782] [<c08eea00>] (device_driver_attach) from [<c08eea60>] (__driver_attach+0x58/0xcc)
> |[    1.374521] [<c08eea60>] (__driver_attach) from [<c08ec8d8>] (bus_for_each_dev+0x74/0xb4)
> |[    1.382903] [<c08ec8d8>] (bus_for_each_dev) from [<c08ed944>] (bus_add_driver+0x1b8/0x1d8)
> |[    1.391374] [<c08ed944>] (bus_add_driver) from [<c08ef394>] (driver_register+0x74/0x108)
> |[    1.399672] [<c08ef394>] (driver_register) from [<c0302d88>] (do_one_initcall+0x50/0x1a4)
> |[    1.408064] [<c0302d88>] (do_one_initcall) from [<c1401064>] (kernel_init_freeable+0x1c4/0x25c)
> |[    1.416989] [<c1401064>] (kernel_init_freeable) from [<c0dedba4>] (kernel_init+0x8/0x10c)
> |[    1.425373] [<c0dedba4>] (kernel_init) from [<c03010e8>] (ret_from_fork+0x14/0x2c)
> |[    1.433127] Exception stack(0xdb0b1fb0 to 0xdb0b1ff8)
> |[    1.438301] 1fa0:                                     00000000 00000000 00000000 00000000
> |[    1.446683] 1fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> |[    1.455063] 1fe0: 00000000 00000000 00000000 00000000 00000013 00000000
> |[    1.461845] Code: e3130004 1a000126 e5943014 e0833001 (e5930000) 
> |[    1.468105] ---[ end trace 5481d6c45bd9fae0 ]---
> |[    1.472934] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
> |[    1.480784] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b ]---
> 
> with  arm-linux-gnueabihf-addr2line -i c0644e38 -e vmlinux
> | arch/arm/include/asm/io.h:117
> | drivers/bus/ti-sysc.c:117
> | drivers/bus/ti-sysc.c:132
> | drivers/bus/ti-sysc.c:1361
> | drivers/bus/ti-sysc.c:2117
> 
> Does this help?

Yes, getting closer, thanks. Can you please boot one more time with
the following debug patch that should confirm which target module
triggers the abort during probing?

Looking at the oops, the fault address is 0xfa1cc000, so 0x481cc000
I guess, which is d_can0?
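
The guess above can be checked with a bit of arithmetic. The sketch
below assumes the static OMAP2+ L4 I/O mapping, where the virtual
address is the physical address plus 0xb2000000. That offset is an
assumption not stated in this thread (it does match the
0xfa1cc000 / 0x481cc000 pair), and the names L4_IO_OFFSET and
l4_virt_to_phys are made up for illustration.

```python
# Assumed offset of the static L4 I/O mapping on OMAP2+ (not from this thread).
L4_IO_OFFSET = 0xB2000000


def l4_virt_to_phys(virt: int) -> int:
    """Translate a statically mapped L4 virtual address to its physical address."""
    return (virt - L4_IO_OFFSET) & 0xFFFFFFFF


# Fault address taken from the "Unhandled fault" line of the oops.
fault_virt = 0xFA1CC000
print(hex(l4_virt_to_phys(fault_virt)))  # prints 0x481cc000
```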

Regards,

Tony

8< ------------------
diff --git a/drivers/bus/ti-sysc.c b/drivers/bus/ti-sysc.c
--- a/drivers/bus/ti-sysc.c
+++ b/drivers/bus/ti-sysc.c
@@ -2069,6 +2069,8 @@ static int sysc_probe(struct platform_device *pdev)
 	struct sysc *ddata;
 	int error;
 
+	dev_info(&pdev->dev, "probing\n");
+
 	ddata = devm_kzalloc(&pdev->dev, sizeof(*ddata), GFP_KERNEL);
 	if (!ddata)
 		return -ENOMEM;

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: next/master boot bisection: next-20190430 on beagle-xm
  2019-05-01 17:44           ` Tony Lindgren
@ 2019-05-01 19:03             ` Sebastian Andrzej Siewior
  2019-05-01 20:21               ` Tony Lindgren
  0 siblings, 1 reply; 13+ messages in thread
From: Sebastian Andrzej Siewior @ 2019-05-01 19:03 UTC (permalink / raw)
  To: Tony Lindgren
  Cc: kernelci.org bot, Tejun Heo, Peter Zijlstra, tomeu.vizoso,
	guillaume.tucker, mgalka, Thomas Gleixner, broonie, matthew.hart,
	khilman, enric.balletbo, Ingo Molnar, Lai Jiangshan,
	Johannes Weiner, linux-kernel, Ingo Molnar, Kevin Hilman,
	linux-omap

On 2019-05-01 10:44:31 [-0700], Tony Lindgren wrote:
> Hmm so I tried without "earlycon" in command line thinking it might be
> happening with just "earlyprintk" but still no luck.
> 
> BTW, in general you might want to update your kernel command line
> options to:
> 
> debug earlyprintk earlycon

debug. Let me look if I manage to hide that `debug' from systemd…

> As that way you get early output without CONFIG_DEBUG_LL=y
> with earlycon that should be enabled by default already.

yup, thx.

> Yes, getting closer, thanks. Can you please boot one more time with
> the following debug patch that should confirm which target module
> triggers the abort during probing?

|[    1.091575] ti-sysc 48042000.target-module: probing
|[    1.096694] ti-sysc 48042000.target-module: sysc_flags 00000222 != 00000022
|[    1.104784] ti-sysc 48044000.target-module: probing
|[    1.109974] ti-sysc 48044000.target-module: sysc_flags 00000222 != 00000022
|[    1.118079] ti-sysc 48046000.target-module: probing
|[    1.123206] ti-sysc 48046000.target-module: sysc_flags 00000222 != 00000022
|[    1.131331] ti-sysc 48048000.target-module: probing
|[    1.136505] ti-sysc 48048000.target-module: sysc_flags 00000222 != 00000022
|[    1.144571] ti-sysc 4804a000.target-module: probing
|[    1.149755] ti-sysc 4804a000.target-module: sysc_flags 00000222 != 00000022
|[    1.157846] ti-sysc 4804c000.target-module: probing
|[    1.164537] ti-sysc 480602fc.target-module: probing
|[    1.170858] ti-sysc 48080000.target-module: probing
|[    1.176168] ti-sysc 480c8000.target-module: probing
|[    1.182309] ti-sysc 480ca000.target-module: probing
|[    1.188357] ti-sysc 4819c000.target-module: probing
|[    1.206713] ti-sysc 481a0000.target-module: probing
|[    1.212003] ti-sysc 481a6050.target-module: probing
|[    1.217306] ti-sysc 481a8050.target-module: probing
|[    1.222546] ti-sysc 481aa050.target-module: probing
|[    1.227806] ti-sysc 481ac000.target-module: probing
|[    1.234355] ti-sysc 481ae000.target-module: probing
|[    1.240877] ti-sysc 481cc000.target-module: probing
|[    1.245976] Unhandled fault: external abort on non-linefetch (0x1028) at 0xfa1cc000
|[    1.253847] pgd = (ptrval)
|[    1.256623] [fa1cc000] *pgd=48011452(bad)
|[    1.260749] Internal error: : 1028 [#1] ARM
|[    1.265044] Modules linked in:
|[    1.268187] CPU: 0 PID: 1 Comm: swapper Not tainted 5.1.0-rc7-next-20190501-dirty #31
|[    1.276234] Hardware name: Generic AM33XX (Flattened Device Tree)
|[    1.282510] PC is at sysc_probe+0xb80/0xfc0
|[    1.286805] LR is at sysc_probe+0xb40/0xfc0
|[    1.291100] pc : [<c0644f90>]    lr : [<c0644f50>]    psr: 60000013
|[    1.297538] sp : db0b1dc0  ip : 00000013  fp : c1131648
|[    1.302904] r10: c116c9a4  r9 : c116c984  r8 : 00000028
|[    1.308270] r7 : 00000001  r6 : c0e373ac  r5 : 00000000  r4 : db345940
|[    1.314975] r3 : fa1cc000  r2 : db349700  r1 : 00000000  r0 : 00000000
|[    1.321684] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
|[    1.329017] Control: 10c5387d  Table: 80204019  DAC: 00000051
|[    1.334921] Process swapper (pid: 1, stack limit = 0x(ptrval))
…
|[    1.496788] [<c0644f90>] (sysc_probe) from [<c08f00c4>] (platform_drv_probe+0x48/0x98)
|[    1.504933] [<c08f00c4>] (platform_drv_probe) from [<c08ee264>] (really_probe+0xf0/0x2c8)
|[    1.513346] [<c08ee264>] (really_probe) from [<c08ee59c>] (driver_probe_device+0x60/0x16c)
|[    1.521843] [<c08ee59c>] (driver_probe_device) from [<c08ee848>] (device_driver_attach+0x58/0x60)
|[    1.530967] [<c08ee848>] (device_driver_attach) from [<c08ee8a8>] (__driver_attach+0x58/0xcc)
|[    1.539734] [<c08ee8a8>] (__driver_attach) from [<c08ec720>] (bus_for_each_dev+0x74/0xb4)
|[    1.548142] [<c08ec720>] (bus_for_each_dev) from [<c08ed78c>] (bus_add_driver+0x1b8/0x1d8)
|[    1.556640] [<c08ed78c>] (bus_add_driver) from [<c08ef1dc>] (driver_register+0x74/0x108)
|[    1.564963] [<c08ef1dc>] (driver_register) from [<c0302ce4>] (do_one_initcall+0x50/0x1a4)
|[    1.573384] [<c0302ce4>] (do_one_initcall) from [<c1401064>] (kernel_init_freeable+0x1c4/0x25c)
|[    1.582339] [<c1401064>] (kernel_init_freeable) from [<c0ded9e4>] (kernel_init+0x8/0x10c)
|[    1.590750] [<c0ded9e4>] (kernel_init) from [<c03010e8>] (ret_from_fork+0x14/0x2c)
|[    1.598531] Exception stack(0xdb0b1fb0 to 0xdb0b1ff8)
|[    1.603722] 1fa0:                                     00000000 00000000 00000000 00000000
|[    1.612129] 1fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
|[    1.620535] 1fe0: 00000000 00000000 00000000 00000000 00000013 00000000
|[    1.627337] Code: ebfffa8a ea000002 e5943014 e0833001 (e5930000) 
|[    1.633616] ---[ end trace d02e59e9267a59cb ]---
|[    1.638458] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b

> Looking at the oops 0xfa1cc000, so 0x481cc000 I guess which is d_can0?

That node around it I guess.
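
The same conclusion can be pulled out of the log mechanically: the
last "probing" line before the abort names the target module. A
minimal sketch, assuming the dev_info() message format from the debug
patch; last_probe_before_fault is a hypothetical helper and the sample
lines are copied from the log above.

```python
import re

# Matches the dev_info() line added by the debug patch.
PROBE_RE = re.compile(r"ti-sysc ([0-9a-f]+)\.target-module: probing")


def last_probe_before_fault(log_lines):
    """Return the address of the last module probed before the abort, or None."""
    last = None
    for line in log_lines:
        match = PROBE_RE.search(line)
        if match:
            last = int(match.group(1), 16)
        elif "Unhandled fault" in line:
            return last
    return None  # no fault seen in this log


log = [
    "|[    1.234355] ti-sysc 481ae000.target-module: probing",
    "|[    1.240877] ti-sysc 481cc000.target-module: probing",
    "|[    1.245976] Unhandled fault: external abort on non-linefetch (0x1028) at 0xfa1cc000",
]
print(hex(last_probe_before_fault(log)))  # prints 0x481cc000
```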

> Regards,
> 
> Tony

Sebastian

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: next/master boot bisection: next-20190430 on beagle-xm
  2019-05-01 19:03             ` Sebastian Andrzej Siewior
@ 2019-05-01 20:21               ` Tony Lindgren
  2019-05-01 21:13                 ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 13+ messages in thread
From: Tony Lindgren @ 2019-05-01 20:21 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: kernelci.org bot, Tejun Heo, Peter Zijlstra, tomeu.vizoso,
	guillaume.tucker, mgalka, Thomas Gleixner, broonie, matthew.hart,
	khilman, enric.balletbo, Ingo Molnar, Lai Jiangshan,
	Johannes Weiner, linux-kernel, Ingo Molnar, Kevin Hilman,
	linux-omap

Hi,

* Sebastian Andrzej Siewior <bigeasy@linutronix.de> [190501 19:03]:
> On 2019-05-01 10:44:31 [-0700], Tony Lindgren wrote:
> > Hmm so I tried without "earlycon" in command line thinking it might be
> > happening with just "earlyprintk" but still no luck.
> > 
> > BTW, in general you might want to update your kernel command line
> > options to:
> > 
> > debug earlyprintk earlycon
> 
> debug. Let me look if I manage to hide that `debug' from systemd…

Oh that.. I've been quite happy with openrc now for years :)

> > Looking at the oops 0xfa1cc000, so 0x481cc000 I guess which is d_can0?
> 
> That node around it I guess.

OK, I found two issues. It seems that d_can also needs the osc clock
on am335x. And there's no revision register for d_can. We're now
reading the CTL register unnecessarily.

Below is what I hope fixes the boot issue for you, care to boot
test?

If this helps, I'll send out proper patches for both issues.

Regards,

Tony

8< ----------------------
diff --git a/arch/arm/boot/dts/am33xx-l4.dtsi b/arch/arm/boot/dts/am33xx-l4.dtsi
--- a/arch/arm/boot/dts/am33xx-l4.dtsi
+++ b/arch/arm/boot/dts/am33xx-l4.dtsi
@@ -1762,8 +1762,9 @@
 			reg = <0xcc000 0x4>;
 			reg-names = "rev";
 			/* Domains (P, C): per_pwrdm, l4ls_clkdm */
-			clocks = <&l4ls_clkctrl AM3_L4LS_D_CAN0_CLKCTRL 0>;
-			clock-names = "fck";
+			clocks = <&l4ls_clkctrl AM3_L4LS_D_CAN0_CLKCTRL 0>,
+				 <&dcan0_fck>;
+			clock-names = "fck", "osc";
 			#address-cells = <1>;
 			#size-cells = <1>;
 			ranges = <0x0 0xcc000 0x2000>;
@@ -1785,8 +1786,9 @@
 			reg = <0xd0000 0x4>;
 			reg-names = "rev";
 			/* Domains (P, C): per_pwrdm, l4ls_clkdm */
-			clocks = <&l4ls_clkctrl AM3_L4LS_D_CAN1_CLKCTRL 0>;
-			clock-names = "fck";
+			clocks = <&l4ls_clkctrl AM3_L4LS_D_CAN1_CLKCTRL 0>,
+				 <&dcan1_fck>;
+			clock-names = "fck", "osc";
 			#address-cells = <1>;
 			#size-cells = <1>;
 			ranges = <0x0 0xd0000 0x2000>;

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: next/master boot bisection: next-20190430 on beagle-xm
  2019-05-01 20:21               ` Tony Lindgren
@ 2019-05-01 21:13                 ` Sebastian Andrzej Siewior
  2019-05-01 21:17                   ` Tony Lindgren
  0 siblings, 1 reply; 13+ messages in thread
From: Sebastian Andrzej Siewior @ 2019-05-01 21:13 UTC (permalink / raw)
  To: Tony Lindgren
  Cc: kernelci.org bot, Tejun Heo, Peter Zijlstra, tomeu.vizoso,
	guillaume.tucker, mgalka, Thomas Gleixner, broonie, matthew.hart,
	khilman, enric.balletbo, Ingo Molnar, Lai Jiangshan,
	Johannes Weiner, linux-kernel, Ingo Molnar, Kevin Hilman,
	linux-omap

On 2019-05-01 13:21:49 [-0700], Tony Lindgren wrote:
> Hi,
Hi,

> OK I found two issues. It seems that d_can also needs osc clock
> on am335x. And there's no revision register for d_can.. We're now
> reading the CTL register unnecessarily.
> 
> Below is what I hope fixes the boot issue for you, care to boot
> test?

yup, that boots.
Thanks.

> If this helps, I'll send out proper patches for both issues.
> 
> Regards,
> 
> Tony

Sebastian

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: next/master boot bisection: next-20190430 on beagle-xm
  2019-05-01 21:13                 ` Sebastian Andrzej Siewior
@ 2019-05-01 21:17                   ` Tony Lindgren
  0 siblings, 0 replies; 13+ messages in thread
From: Tony Lindgren @ 2019-05-01 21:17 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: kernelci.org bot, Tejun Heo, Peter Zijlstra, tomeu.vizoso,
	guillaume.tucker, mgalka, Thomas Gleixner, broonie, matthew.hart,
	khilman, enric.balletbo, Ingo Molnar, Lai Jiangshan,
	Johannes Weiner, linux-kernel, Ingo Molnar, Kevin Hilman,
	linux-omap

* Sebastian Andrzej Siewior <bigeasy@linutronix.de> [190501 21:14]:
> On 2019-05-01 13:21:49 [-0700], Tony Lindgren wrote:
> > Hi,
> Hi,
> 
> > OK I found two issues. It seems that d_can also needs osc clock
> > on am335x. And there's no revision register for d_can.. We're now
> > reading the CTL register unnecessarily.
> > 
> > Below is what I hope fixes the boot issue for you, care to boot
> > test?
> 
> yup, that boots.

OK good to hear and thanks a lot for testing it. I'll post two
patches shortly.

Regards,

Tony

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: next/master boot bisection: next-20190430 on beagle-xm
  2019-04-30 20:51 next/master boot bisection: next-20190430 on beagle-xm kernelci.org bot
  2019-05-01 15:37 ` Sebastian Andrzej Siewior
@ 2019-05-01 21:48 ` Kevin Hilman
  2019-05-02  7:16   ` Sebastian Andrzej Siewior
  1 sibling, 1 reply; 13+ messages in thread
From: Kevin Hilman @ 2019-05-01 21:48 UTC (permalink / raw)
  To: kernelci.org bot, Tejun Heo, Sebastian Andrzej Siewior,
	Peter Zijlstra, tomeu.vizoso, guillaume.tucker, mgalka,
	Thomas Gleixner, broonie, matthew.hart, enric.balletbo,
	Ingo Molnar
  Cc: Peter Zijlstra, kernelci.org bot, Lai Jiangshan, Johannes Weiner,
	linux-kernel, Ingo Molnar

"kernelci.org bot" <bot@kernelci.org> writes:

> * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
> * This automated bisection report was sent to you on the basis  *
> * that you may be involved with the breaking commit it has      *
> * found.  No manual investigation has been done to verify it,   *
> * and the root cause of the problem may be somewhere else.      *
> * Hope this helps!                                              *
> * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
>
> next/master boot bisection: next-20190430 on beagle-xm
>
> Summary:
>   Start:      f43b05fd4c17 Add linux-next specific files for 20190430
>   Details:    https://kernelci.org/boot/id/5cc84d7359b514b7ab55847b
>   Plain log:  https://storage.kernelci.org//next/master/next-20190430/arm/multi_v7_defconfig+CONFIG_SMP=n/gcc-7/lab-baylibre/boot-omap3-beagle-xm.txt
>   HTML log:   https://storage.kernelci.org//next/master/next-20190430/arm/multi_v7_defconfig+CONFIG_SMP=n/gcc-7/lab-baylibre/boot-omap3-beagle-xm.html
>   Result:     6d25be5782e4 sched/core, workqueues: Distangle worker accounting from rq lock

I was able to reproduce this in next-20190430, but...

I'm not sure what fixed it, but this is passing again in today's
linux-next (next-20190501)

Kevin

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: next/master boot bisection: next-20190430 on beagle-xm
  2019-05-01 21:48 ` Kevin Hilman
@ 2019-05-02  7:16   ` Sebastian Andrzej Siewior
  0 siblings, 0 replies; 13+ messages in thread
From: Sebastian Andrzej Siewior @ 2019-05-02  7:16 UTC (permalink / raw)
  To: Kevin Hilman
  Cc: kernelci.org bot, Tejun Heo, Peter Zijlstra, tomeu.vizoso,
	guillaume.tucker, mgalka, Thomas Gleixner, broonie, matthew.hart,
	enric.balletbo, Ingo Molnar, Lai Jiangshan, Johannes Weiner,
	linux-kernel, Ingo Molnar

On 2019-05-01 14:48:44 [-0700], Kevin Hilman wrote:
> "kernelci.org bot" <bot@kernelci.org> writes:
> 
> > * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
> > * This automated bisection report was sent to you on the basis  *
> > * that you may be involved with the breaking commit it has      *
> > * found.  No manual investigation has been done to verify it,   *
> > * and the root cause of the problem may be somewhere else.      *
> > * Hope this helps!                                              *
> > * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
> >
> > next/master boot bisection: next-20190430 on beagle-xm
> >
> > Summary:
> >   Start:      f43b05fd4c17 Add linux-next specific files for 20190430
> >   Details:    https://kernelci.org/boot/id/5cc84d7359b514b7ab55847b
> >   Plain log:  https://storage.kernelci.org//next/master/next-20190430/arm/multi_v7_defconfig+CONFIG_SMP=n/gcc-7/lab-baylibre/boot-omap3-beagle-xm.txt
> >   HTML log:   https://storage.kernelci.org//next/master/next-20190430/arm/multi_v7_defconfig+CONFIG_SMP=n/gcc-7/lab-baylibre/boot-omap3-beagle-xm.html
> >   Result:     6d25be5782e4 sched/core, workqueues: Distangle worker accounting from rq lock
> 
> I was able to reproduce this in next-20190430, but...
> 
> I'm not sure what fixed it, but this is passing again in today's
> linux-next (next-20190501)

Okay, thanks for the confirmation.

> Kevin

Sebastian

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2019-05-02  7:16 UTC | newest]

Thread overview: 13+ messages
2019-04-30 20:51 next/master boot bisection: next-20190430 on beagle-xm kernelci.org bot
2019-05-01 15:37 ` Sebastian Andrzej Siewior
2019-05-01 16:29   ` Tony Lindgren
2019-05-01 16:44     ` Sebastian Andrzej Siewior
2019-05-01 16:52       ` Tony Lindgren
2019-05-01 17:01         ` Sebastian Andrzej Siewior
2019-05-01 17:44           ` Tony Lindgren
2019-05-01 19:03             ` Sebastian Andrzej Siewior
2019-05-01 20:21               ` Tony Lindgren
2019-05-01 21:13                 ` Sebastian Andrzej Siewior
2019-05-01 21:17                   ` Tony Lindgren
2019-05-01 21:48 ` Kevin Hilman
2019-05-02  7:16   ` Sebastian Andrzej Siewior
