* Bad rss-counter is back on 3.14-stable
@ 2014-06-04 18:27 ` Greg KH
From: Greg KH @ 2014-06-04 18:27 UTC
  To: Dave Jones, linux-mm, Linux Kernel; +Cc: Brandon Philips

Hi all,

Dave, I saw you mention that you were seeing the "Bad rss-counter" line
on 3.15-rc1, but I couldn't find any follow-up on this to see if anyone
figured it out, or did it just "magically" go away?

I ask as Brandon is seeing this same message a lot on a 3.14.4 kernel,
causing system crashes and problems:

[16591492.449718] BUG: Bad rss-counter state mm:ffff8801ced99880 idx:0 val:-1836508
[16591492.449737] BUG: Bad rss-counter state mm:ffff8801ced99880 idx:1 val:1836508

[20783350.461716] BUG: Bad rss-counter state mm:ffff8801d2b1dc00 idx:0 val:-52518
[20783350.461734] BUG: Bad rss-counter state mm:ffff8801d2b1dc00 idx:1 val:52518

[21393387.112302] BUG: Bad rss-counter state mm:ffff8801d0104e00 idx:0 val:-1767569
[21393387.112321] BUG: Bad rss-counter state mm:ffff8801d0104e00 idx:1 val:1767569

[21430098.512837] BUG: Bad rss-counter state mm:ffff880100036680 idx:0 val:-2946
[21430098.512854] BUG: Bad rss-counter state mm:ffff880100036680 idx:1 val:2946

Anyone have any ideas of a 3.15-rc patch I should be including in
3.14-stable to resolve this?

thanks,

greg k-h
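
One thing worth noticing in the lines above: for each mm, the idx:0 and
idx:1 values are equal and opposite. That pattern is easier to spot across
a long dmesg with a small script. The following is only a sketch in Python:
the function names are made up for illustration, and the regex assumes
exactly the message format quoted in this mail.

```python
import re
from collections import defaultdict

# Matches the message format quoted in this thread; other kernels may differ.
LINE_RE = re.compile(
    r"BUG: Bad rss-counter state mm:(?P<mm>[0-9a-f]+) "
    r"idx:(?P<idx>\d+) val:(?P<val>-?\d+)"
)

def parse_rss_bugs(text):
    """Return {mm_address: {idx: val}} for every matching line in text."""
    by_mm = defaultdict(dict)
    for m in LINE_RE.finditer(text):
        by_mm[m.group("mm")][int(m.group("idx"))] = int(m.group("val"))
    return dict(by_mm)

def pair_sums(by_mm):
    """Sum the reported values per mm; 0 means the errors cancel out."""
    return {mm: sum(vals.values()) for mm, vals in by_mm.items()}

# Two of the four pairs from the report, copied verbatim.
SAMPLE = """\
[16591492.449718] BUG: Bad rss-counter state mm:ffff8801ced99880 idx:0 val:-1836508
[16591492.449737] BUG: Bad rss-counter state mm:ffff8801ced99880 idx:1 val:1836508
[21430098.512837] BUG: Bad rss-counter state mm:ffff880100036680 idx:0 val:-2946
[21430098.512854] BUG: Bad rss-counter state mm:ffff880100036680 idx:1 val:2946
"""

reports = parse_rss_bugs(SAMPLE)
sums = pair_sums(reports)
for mm, total in sums.items():
    print(mm, reports[mm], "sum:", total)
```

If idx 0 and idx 1 are the file-page and anonymous-page counters (an
assumption here; check the counter enum in the mm headers of the kernel in
question), a zero sum would suggest pages being counted under one type and
uncounted under the other, rather than pages going missing.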

* Re: Bad rss-counter is back on 3.14-stable
  2014-06-04 18:27 ` Greg KH
@ 2014-06-04 18:47   ` Dennis Mungai
From: Dennis Mungai @ 2014-06-04 18:47 UTC
  To: Greg KH; +Cc: Dave Jones, linux-mm, Linux Kernel, Brandon Philips

Hello Greg,

do_exit() and exec_mmap() call sync_mm_rss() before mm_release()
does put_user(clear_child_tid), which can update task->rss_stat
and thus leave mm->rss_stat inconsistent. This triggers the "BUG:"
printk in check_mm().

Let's fix this bug in the safest way, and optimize/clean this up later.

Reported-by: Greg KH <gregkh@linuxfoundation.org>
Signed-off-by: Dennis E. Mungai <dmngaie@gmail.com>
---
 fs/exec.c     |    2 +-
 kernel/exit.c |    1 +
 2 files changed, 2 insertions(+), 1 deletion(-)
diff --git a/fs/exec.c b/fs/exec.c
index a79786a..da27b91 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -819,10 +819,10 @@ static int exec_mmap(struct mm_struct *mm)
 	/* Notify parent that we're no longer interested in the old VM */
 	tsk = current;
 	old_mm = current->mm;
-	sync_mm_rss(old_mm);
 	mm_release(tsk, old_mm);

 	if (old_mm) {
+		sync_mm_rss(old_mm);
 		/*
 		 * Make sure that if there is a core dump in progress
 		 * for the old mm, we get out and die instead of going
diff --git a/kernel/exit.c b/kernel/exit.c
index 34867cc..c0277d3 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -643,6 +643,7 @@ static void exit_mm(struct task_struct * tsk)
 	mm_release(tsk, mm);
 	if (!mm)
 		return;
+	sync_mm_rss(mm);
 	/*
 	 * Serialize with any possible pending coredump.
 	 * We must hold mmap_sem around checking core_state

Apply that patch and see how it goes.
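
The ordering problem described in this patch can be illustrated with a toy
model. The Python sketch below is not kernel code: the classes, the
two-element counter list and the single late page fault are all simplifying
assumptions, and only the names sync_mm_rss, mm_release and check_mm are
taken from the description above.

```python
class MM:
    """Toy stand-in for mm_struct: the shared counters check_mm() inspects."""
    def __init__(self):
        self.rss_stat = [0, 0]

class Task:
    """Toy stand-in for task_struct: deltas not yet folded into the mm."""
    def __init__(self, mm):
        self.mm = mm
        self.rss_stat = [0, 0]

def fault_in_page(task, idx=1):
    # A page fault is accounted in the task-local cache only.
    task.rss_stat[idx] += 1

def sync_mm_rss(task):
    # Fold the task-local deltas into the mm and reset them.
    for i, delta in enumerate(task.rss_stat):
        task.mm.rss_stat[i] += delta
        task.rss_stat[i] = 0

def mm_release(task):
    # Assumption: the put_user(clear_child_tid) write faults in one page.
    fault_in_page(task)

def check_mm(mm):
    # Return (idx, val) for every counter that is not zero at teardown.
    return [(i, v) for i, v in enumerate(mm.rss_stat) if v != 0]

def simulate_exit(sync_after_release):
    mm = MM()
    task = Task(mm)
    for _ in range(10):              # pages touched while the task was running
        fault_in_page(task)
    mapped = 10
    if sync_after_release:           # order after the patch
        mm_release(task)
        sync_mm_rss(task)
    else:                            # order before the patch
        sync_mm_rss(task)
        mm_release(task)             # this delta is never folded in
    mapped += 1                      # the late fault mapped one more page
    mm.rss_stat[1] -= mapped         # teardown unaccounts every mapped page
    return check_mm(mm)

print("sync before mm_release:", simulate_exit(sync_after_release=False))
print("sync after mm_release: ", simulate_exit(sync_after_release=True))
```

In this toy model the wrong order leaves one counter off by a single page,
while the patched order leaves both at zero. The values in Greg's report
are far larger and cancel across idx:0 and idx:1, so the model alone does
not show that this ordering is what Brandon is hitting.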


* Re: Bad rss-counter is back on 3.14-stable
  2014-06-04 18:27 ` Greg KH
@ 2014-06-04 19:12   ` Dave Jones
From: Dave Jones @ 2014-06-04 19:12 UTC
  To: Greg KH; +Cc: linux-mm, Linux Kernel, Brandon Philips

On Wed, Jun 04, 2014 at 11:27:39AM -0700, Greg KH wrote:
 > Hi all,
 > 
 > Dave, I saw you mention that you were seeing the "Bad rss-counter" line
 > on 3.15-rc1, but I couldn't find any follow-up on this to see if anyone
 > figured it out, or did it just "magically" go away?
 > 
 > I ask as Brandon is seeing this same message a lot on a 3.14.4 kernel,
 > causing system crashes and problems:
 > 
 > [16591492.449718] BUG: Bad rss-counter state mm:ffff8801ced99880 idx:0 val:-1836508
 > [16591492.449737] BUG: Bad rss-counter state mm:ffff8801ced99880 idx:1 val:1836508
 > 
 > [20783350.461716] BUG: Bad rss-counter state mm:ffff8801d2b1dc00 idx:0 val:-52518
 > [20783350.461734] BUG: Bad rss-counter state mm:ffff8801d2b1dc00 idx:1 val:52518
 > 
 > [21393387.112302] BUG: Bad rss-counter state mm:ffff8801d0104e00 idx:0 val:-1767569
 > [21393387.112321] BUG: Bad rss-counter state mm:ffff8801d0104e00 idx:1 val:1767569
 > 
 > [21430098.512837] BUG: Bad rss-counter state mm:ffff880100036680 idx:0 val:-2946
 > [21430098.512854] BUG: Bad rss-counter state mm:ffff880100036680 idx:1 val:2946
 > 
 > Anyone have any ideas of a 3.15-rc patch I should be including in
 > 3.14-stable to resolve this?

Hard to tell if they were the same issues I was seeing without the full
backtrace. The only bad rss bugs that I recall being fixed for sure were
the ones that Hugh nailed down right before 3.14 (887843961c4b).

I've not seen anything in a while, but that may just be because I end up
hitting other bugs before they get a chance to show.

Brandon, what kind of workload is that machine doing? I wonder if I can
add something to trinity so that it provokes this.

	Dave

* Re: Bad rss-counter is back on 3.14-stable
  2014-06-04 19:12   ` Dave Jones
@ 2014-06-04 19:35     ` Brandon Philips
From: Brandon Philips @ 2014-06-04 19:35 UTC
  To: Dave Jones, Greg KH, linux-mm, Linux Kernel, Brandon Philips

On Wed, Jun 4, 2014 at 12:12 PM, Dave Jones <davej@redhat.com> wrote:
> Brandon, what kind of workload is that machine doing ? I wonder if I can
> add something to trinity to make it provoke it.

A really boring database workload (fsync() ~50ms) with a sloowww block
device with btrfs. There are occasional CPU spikes due to expensive
queries.

How can I be more helpful in my workload description?

Thanks,

Brandon

* Re: Bad rss-counter is back on 3.14-stable
  2014-06-04 19:35     ` Brandon Philips
@ 2014-06-04 22:22       ` Dave Jones
From: Dave Jones @ 2014-06-04 22:22 UTC
  To: Brandon Philips; +Cc: Greg KH, linux-mm, Linux Kernel

On Wed, Jun 04, 2014 at 12:35:45PM -0700, Brandon Philips wrote:
 > On Wed, Jun 4, 2014 at 12:12 PM, Dave Jones <davej@redhat.com> wrote:
 > > Brandon, what kind of workload is that machine doing ? I wonder if I can
 > > add something to trinity to make it provoke it.
 > 
 > A really boring database workload (fsync() ~50ms) with a sloowww block
 > device with btrfs. There are occasional CPU spikes due to expensive
 > queries.
 > 
 > How can I be more helpful in my workload description?

I feared it would be something like a database. Trying to replicate
things seen under those workloads always seems to be challenging,
in part due to the system-specific setups they seem to have.

I wonder if any of the benchmarking apps we have realistically
represent what modern databases do. It might be a fun project
to take something like that and extend it to do random queries.

	Dave
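
As a starting point for that kind of experiment, here is a rough Python
sketch of a database-like load: appended records that are each fsync()ed
(the "commits"), mixed with reads of randomly chosen earlier records (the
"queries"). The record size, read ratio and function name are arbitrary
choices for illustration, and nothing here is known to trigger the
rss-counter message.

```python
import os
import random
import tempfile
import time

RECORD = 4096  # bytes per record; arbitrary choice for this sketch

def run_workload(path, n_ops=50, read_ratio=0.5, seed=0):
    """Mix fsync()ed appends with reads of random earlier records.

    Returns (records_written, reads_done, fsync_latencies_in_seconds).
    """
    rng = random.Random(seed)
    written = reads = 0
    latencies = []
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        for _ in range(n_ops):
            if written and rng.random() < read_ratio:
                # "Query": read one previously written record at random.
                slot = rng.randrange(written)
                data = os.pread(fd, RECORD, slot * RECORD)
                assert len(data) == RECORD
                reads += 1
            else:
                # "Commit": append a record and force it to stable storage.
                os.pwrite(fd, os.urandom(RECORD), written * RECORD)
                t0 = time.perf_counter()
                os.fsync(fd)
                latencies.append(time.perf_counter() - t0)
                written += 1
    finally:
        os.close(fd)
    return written, reads, latencies

with tempfile.TemporaryDirectory() as d:
    written, reads, lat = run_workload(os.path.join(d, "toy.db"))
    print(f"{written} commits, {reads} reads, "
          f"worst fsync {max(lat) * 1000:.2f} ms")
```

Pointing the path at a file on the slow btrfs device instead of a temporary
directory would bring the fsync() times closer to what Brandon describes,
though a real database does a great deal more than this.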


* Re: Bad rss-counter is back on 3.14-stable
  2014-06-04 18:27 ` Greg KH
@ 2014-06-04 23:37   ` Andre Tomt
From: Andre Tomt @ 2014-06-04 23:37 UTC
  To: Greg KH, Dave Jones, linux-mm, Linux Kernel; +Cc: Brandon Philips

On 04. juni 2014 20:27, Greg KH wrote:
> Hi all,
> 
> Dave, I saw you mention that you were seeing the "Bad rss-counter" line
> on 3.15-rc1, but I couldn't find any follow-up on this to see if anyone
> figured it out, or did it just "magically" go away?
> 
> I ask as Brandon is seeing this same message a lot on a 3.14.4 kernel,
> causing system crashes and problems:
> 
> [16591492.449718] BUG: Bad rss-counter state mm:ffff8801ced99880 idx:0 val:-1836508
> [16591492.449737] BUG: Bad rss-counter state mm:ffff8801ced99880 idx:1 val:1836508
> 
> [20783350.461716] BUG: Bad rss-counter state mm:ffff8801d2b1dc00 idx:0 val:-52518
> [20783350.461734] BUG: Bad rss-counter state mm:ffff8801d2b1dc00 idx:1 val:52518
> 
> [21393387.112302] BUG: Bad rss-counter state mm:ffff8801d0104e00 idx:0 val:-1767569
> [21393387.112321] BUG: Bad rss-counter state mm:ffff8801d0104e00 idx:1 val:1767569
> 
> [21430098.512837] BUG: Bad rss-counter state mm:ffff880100036680 idx:0 val:-2946
> [21430098.512854] BUG: Bad rss-counter state mm:ffff880100036680 idx:1 val:2946
> 
> Anyone have any ideas of a 3.15-rc patch I should be including in
> 3.14-stable to resolve this?

I saw a bunch of similar errors on 3.14.x up to and including 3.14.4,
running Java (Tomcat) and Postgres on Xen PV. Have not seen it since
"mm: use paravirt friendly ops for NUMA hinting ptes" landed in 3.14.5.

402e194dfc5b38d99f9c65b86e2666b29adebf8c in stable,
29c7787075c92ca8af353acd5301481e6f37082f upstream

As I did not follow the original discussion I have no idea if this is
the same thing, and I'm way too lazy to look for it now. ;-)
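
For anyone checking a set of 3.14.y machines against this, the report above
reduces to a version comparison. The Python sketch below assumes the only
evidence is what is stated here (fix present from 3.14.5 on); the function
names are invented, and it deliberately answers None for other kernel
series, where this mail says nothing either way.

```python
import re

# Per the report above: the stable backport landed in 3.14.5.
FIXED_IN = (3, 14, 5)

def parse_release(release):
    """Turn a `uname -r` style string such as '3.14.4-foo' into (3, 14, 4)."""
    m = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if not m:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    major, minor, patch = m.groups()
    return int(major), int(minor), int(patch or 0)

def has_numa_pte_fix(release):
    """True/False for 3.14.y kernels; None for any other series."""
    ver = parse_release(release)
    if ver[:2] != FIXED_IN[:2]:
        return None                  # not covered by this report
    return ver >= FIXED_IN

for rel in ("3.14.4", "3.14.5", "3.14.12-generic", "3.15.0-rc1"):
    print(rel, has_numa_pte_fix(rel))
```

Vendor kernels may carry the fix under an older version number, so a False
here is a prompt to check the changelog, not proof that the fix is missing.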

* Re: Bad rss-counter is back on 3.14-stable
  2014-06-04 23:37   ` Andre Tomt
@ 2014-06-05  0:21     ` Greg KH
From: Greg KH @ 2014-06-05  0:21 UTC
  To: Andre Tomt; +Cc: Dave Jones, linux-mm, Linux Kernel, Brandon Philips

On Thu, Jun 05, 2014 at 01:37:42AM +0200, Andre Tomt wrote:
> On 04. juni 2014 20:27, Greg KH wrote:
> > Hi all,
> > 
> > Dave, I saw you mention that you were seeing the "Bad rss-counter" line
> > on 3.15-rc1, but I couldn't find any follow-up on this to see if anyone
> > figured it out, or did it just "magically" go away?
> > 
> > I ask as Brandon is seeing this same message a lot on a 3.14.4 kernel,
> > causing system crashes and problems:
> > 
> > [16591492.449718] BUG: Bad rss-counter state mm:ffff8801ced99880 idx:0 val:-1836508
> > [16591492.449737] BUG: Bad rss-counter state mm:ffff8801ced99880 idx:1 val:1836508
> > 
> > [20783350.461716] BUG: Bad rss-counter state mm:ffff8801d2b1dc00 idx:0 val:-52518
> > [20783350.461734] BUG: Bad rss-counter state mm:ffff8801d2b1dc00 idx:1 val:52518
> > 
> > [21393387.112302] BUG: Bad rss-counter state mm:ffff8801d0104e00 idx:0 val:-1767569
> > [21393387.112321] BUG: Bad rss-counter state mm:ffff8801d0104e00 idx:1 val:1767569
> > 
> > [21430098.512837] BUG: Bad rss-counter state mm:ffff880100036680 idx:0 val:-2946
> > [21430098.512854] BUG: Bad rss-counter state mm:ffff880100036680 idx:1 val:2946
> > 
> > Anyone have any ideas of a 3.15-rc patch I should be including in
> > 3.14-stable to resolve this?
> 
> I saw a bunch of similar errors on 3.14.x up to and including 3.14.4,
> running Java (Tomcat) and Postgres on Xen PV. Have not seen it since
> "mm: use paravirt friendly ops for NUMA hinting ptes" landed in 3.14.5.
> 
> 402e194dfc5b38d99f9c65b86e2666b29adebf8c in stable,
> 29c7787075c92ca8af353acd5301481e6f37082f upstream
> 
> As I did not follow the original discussion I have no idea if this is
> the same thing, and I'm way too lazy to look for it now. ;-)

Ah, nice find.

Brandon, I think 3.14.5 is in the CoreOS tree. Can you update to that on
these boxes to see if it solves the issue?

thanks,

greg k-h
