linux-kernel.vger.kernel.org archive mirror
* 0e334db6bb4b1fd1e2d72c1f3d8f004313cd9f94 (posix-timers: Fix division by zero bug).  Problems with glibc.
@ 2019-02-01 21:47 Alan Mackenzie
  2019-02-01 22:04 ` Thomas Gleixner
  0 siblings, 1 reply; 8+ messages in thread
From: Alan Mackenzie @ 2019-02-01 21:47 UTC
  To: Thomas Gleixner, linux-kernel; +Cc: 34235, Eli Zaretskii, Alex Branham

Hello, Thomas, Hello Linux.

0e334db6bb4b1fd1e2d72c1f3d8f004313cd9f94
posix-timers: Fix division by zero bug
Committed: 2018-12-17 17:35:45 +0100

With this patch in place I am seeing problems with glibc's function
timer_create.  I am an Emacs maintainer, and saw these problems whilst
investigating Emacs bug #34235 "27.0.50; lisp profiler does not work".
Full details of this bug are at
https://debbugs.gnu.org/cgi/bugreport.cgi?bug=34235.

Emacs's profiler fails in kernel 4.19.13, but works in a version of
4.19.13 with the patch reversed, otherwise unchanged.  My current version
of glibc is 2.27-r6 (I think the "-r6" comes from Gentoo, my distro).

The Emacs profiler works by a signal handler being repeatedly triggered
by the SIGPROF signal every 1 millisecond.  In the bug scenario, this
signal gets triggered precisely once each time the Emacs profiler is
started, rather than continually.

The core of the code in Emacs which initialises the glibc timer is:

      int i;
      struct sigevent sigev;
      sigev.sigev_value.sival_ptr = &profiler_timer;
      sigev.sigev_signo = SIGPROF;
      sigev.sigev_notify = SIGEV_SIGNAL;

      for (i = 0; i < ARRAYELTS (system_clock); i++)
        if (timer_create (system_clock[i], &sigev, &profiler_timer) == 0)
          {
            profiler_timer_ok = 1;
            break;
          }
    }

  if (profiler_timer_ok)
    {
      struct itimerspec ispec;
      ispec.it_value = ispec.it_interval = interval;
      if (timer_settime (profiler_timer, 0, &ispec, 0) == 0)
        return TIMER_SETTIME_RUNNING;
    }

The variable `interval' has been checked as non-zero.  This code is in
.../emacs/src/profiler.c
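
The setup described above can be reduced to a small standalone check.  What
follows is only a sketch under stated assumptions, not the Emacs code: the
names count_sigprof_ticks, on_sigprof and sigprof_ticks are made up for
illustration, and CLOCK_THREAD_CPUTIME_ID is an assumed choice of clock (the
excerpt only shows system_clock[i]).  It arms a periodic timer that raises
SIGPROF, burns CPU, and reports how many signals arrived.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <time.h>

/* Hypothetical counter, incremented once per delivered SIGPROF. */
static volatile sig_atomic_t sigprof_ticks;

static void on_sigprof(int signo)
{
  (void) signo;
  sigprof_ticks++;
}

/* Arm a periodic CPU-time timer that raises SIGPROF every interval_ns
   nanoseconds (must be below one second), then burn CPU until `wanted`
   signals have arrived or max_wall_sec seconds of wall-clock time have
   passed.  Returns the number of signals seen, or -1 if setup failed.  */
int count_sigprof_ticks(long interval_ns, int wanted, int max_wall_sec)
{
  struct sigaction sa;
  memset(&sa, 0, sizeof sa);
  sa.sa_handler = on_sigprof;
  sigemptyset(&sa.sa_mask);
  if (sigaction(SIGPROF, &sa, NULL) != 0)
    return -1;

  timer_t timer;
  struct sigevent sigev;
  memset(&sigev, 0, sizeof sigev);
  sigev.sigev_notify = SIGEV_SIGNAL;
  sigev.sigev_signo = SIGPROF;
  sigev.sigev_value.sival_ptr = &timer;
  /* Assumption: a per-thread CPU-time clock; the original tries several. */
  if (timer_create(CLOCK_THREAD_CPUTIME_ID, &sigev, &timer) != 0)
    return -1;

  struct itimerspec ispec;
  ispec.it_value.tv_sec = 0;
  ispec.it_value.tv_nsec = interval_ns;
  ispec.it_interval = ispec.it_value;   /* same first expiry and period */
  sigprof_ticks = 0;
  if (timer_settime(timer, 0, &ispec, NULL) != 0)
    {
      timer_delete(timer);
      return -1;
    }

  /* Busy loop: a CPU-time clock only advances while this thread runs. */
  time_t deadline = time(NULL) + max_wall_sec;
  while (sigprof_ticks < wanted && time(NULL) < deadline)
    ;

  timer_delete(timer);
  return (int) sigprof_ticks;
}
```

Calling count_sigprof_ticks (1000000L, 5, 5) asks for a 1 millisecond period.
A result of exactly 1 would match the symptom in this report, where the
signal fires once and the timer is never rearmed; a result of 2 or more
means the timer is repeating.  Older glibc versions may need -lrt at link
time.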

It seems either that the patch has uncovered some invalid call between
Emacs and glibc, or between glibc and Linux, or that there is some
intrinsic problem with the patch.

I have very little familiarity with glibc and Linux source code, so I
would be grateful if you could help me investigate the bug scenario.
Naturally, I will help as I can in this process.

Thanks in advance!

-- 
Alan Mackenzie (Nuremberg, Germany).


* Re: 0e334db6bb4b1fd1e2d72c1f3d8f004313cd9f94 (posix-timers: Fix division by zero bug). Problems with glibc.
  2019-02-01 21:47 0e334db6bb4b1fd1e2d72c1f3d8f004313cd9f94 (posix-timers: Fix division by zero bug). Problems with glibc Alan Mackenzie
@ 2019-02-01 22:04 ` Thomas Gleixner
  2019-02-02 10:44   ` Alan Mackenzie
  0 siblings, 1 reply; 8+ messages in thread
From: Thomas Gleixner @ 2019-02-01 22:04 UTC
  To: Alan Mackenzie; +Cc: linux-kernel, 34235, Eli Zaretskii, Alex Branham

Hello Alan,

On Fri, 1 Feb 2019, Alan Mackenzie wrote:
> 0e334db6bb4b1fd1e2d72c1f3d8f004313cd9f94
> posix-timers: Fix division by zero bug
> Committed: 2018-12-17 17:35:45 +0100
> 
> With this patch in place I am seeing problems with glibc's function
> timer_create.  I am an Emacs maintainer, and saw these problems whilst
> investigating Emacs bug #34235 "27.0.50; lisp profiler does not work".

> Emacs's profiler fails in kernel 4.19.13, but works in a version of
> 4.19.13 with the patch reversed, otherwise unchanged.  My current version
> of glibc is 2.27-r6 (I think the "-r6" comes from Gentoo, my distro).

Please upgrade to 4.19.19. The issue should be fixed there with the
backported variant of

   93ad0fc088c5 ("posix-cpu-timers: Unbreak timer rearming")

Commit 21c0d1621b8d4b in 4.19.19

Thanks,

	tglx


* Re: 0e334db6bb4b1fd1e2d72c1f3d8f004313cd9f94 (posix-timers: Fix division by zero bug). Problems with glibc.
  2019-02-01 22:04 ` Thomas Gleixner
@ 2019-02-02 10:44   ` Alan Mackenzie
  2019-02-04 17:25     ` Thomas Gleixner
  0 siblings, 1 reply; 8+ messages in thread
From: Alan Mackenzie @ 2019-02-02 10:44 UTC
  To: Thomas Gleixner
  Cc: linux-kernel, 34235, Eli Zaretskii, Alex Branham, Paul Eggert

Hello, Thomas.

Thanks for such a rapid reply!

On Fri, Feb 01, 2019 at 23:04:48 +0100, Thomas Gleixner wrote:
> Hello Alan,

> On Fri, 1 Feb 2019, Alan Mackenzie wrote:
> > 0e334db6bb4b1fd1e2d72c1f3d8f004313cd9f94
> > posix-timers: Fix division by zero bug
> > Committed: 2018-12-17 17:35:45 +0100

> > With this patch in place I am seeing problems with glibc's function
> > timer_create.  I am an Emacs maintainer, and saw these problems whilst
> > investigating Emacs bug #34235 "27.0.50; lisp profiler does not work".

> > Emacs's profiler fails in kernel 4.19.13, but works in a version of
> > 4.19.13 with the patch reversed, otherwise unchanged.  My current version
> > of glibc is 2.27-r6 (I think the "-r6" comes from Gentoo, my distro).

> Please upgrade to 4.19.19. The issue should be fixed there with the
> backported variant of

>    93ad0fc088c5 ("posix-cpu-timers: Unbreak timer rearming")

> Commit 21c0d1621b8d4b in 4.19.19

I've just built and installed Linux 4.19.19, and it does indeed solve
the Emacs profiler bug, #34235.  :-)

I see that the patch has been installed in 4.20.6, 4.19.19, and 4.14.97.
Are there any plans to install it into 4.9.x, the other live long term
support branch?  The reason I ask is to make an entry into Emacs's
PROBLEMS file, telling users and distributions which kernel versions to
upgrade to.

> Thanks,

> 	tglx

-- 
Alan Mackenzie (Nuremberg, Germany).


* Re: 0e334db6bb4b1fd1e2d72c1f3d8f004313cd9f94 (posix-timers: Fix division by zero bug). Problems with glibc.
  2019-02-02 10:44   ` Alan Mackenzie
@ 2019-02-04 17:25     ` Thomas Gleixner
  2019-02-05 13:54       ` Alan Mackenzie
  0 siblings, 1 reply; 8+ messages in thread
From: Thomas Gleixner @ 2019-02-04 17:25 UTC
  To: Alan Mackenzie
  Cc: LKML, 34235, Eli Zaretskii, Alex Branham, Paul Eggert, Greg KH

On Sat, 2 Feb 2019, Alan Mackenzie wrote:
> Hello, Thomas.
> 
> Thanks for such a rapid reply!
> 
> On Fri, Feb 01, 2019 at 23:04:48 +0100, Thomas Gleixner wrote:
> > Hello Alan,
> 
> > On Fri, 1 Feb 2019, Alan Mackenzie wrote:
> > > 0e334db6bb4b1fd1e2d72c1f3d8f004313cd9f94
> > > posix-timers: Fix division by zero bug
> > > Committed: 2018-12-17 17:35:45 +0100
> 
> > > With this patch in place I am seeing problems with glibc's function
> > > timer_create.  I am an Emacs maintainer, and saw these problems whilst
> > > investigating Emacs bug #34235 "27.0.50; lisp profiler does not work".
> 
> > > Emacs's profiler fails in kernel 4.19.13, but works in a version of
> > > 4.19.13 with the patch reversed, otherwise unchanged.  My current version
> > > of glibc is 2.27-r6 (I think the "-r6" comes from Gentoo, my distro).
> 
> > Please upgrade to 4.19.19. The issue should be fixed there with the
> > backported variant of
> 
> >    93ad0fc088c5 ("posix-cpu-timers: Unbreak timer rearming")
> 
> > Commit 21c0d1621b8d4b in 4.19.19
> 
> I've just built and installed Linux 4.19.19, and it does indeed solve
> the Emacs profiler bug, #34235.  :-)
> 
> I see that the patch has been installed in 4.20.6, 4.19.19, and 4.14.97.
> Are there any plans to install it into 4.9.x, the other live long term
> support branch?  The reason I ask is to make an entry into Emacs's
> PROBLEMS file, telling users and distributions which kernel versions to
> upgrade to.

4.9 doesn't have the offending commit AFAICT.

Thanks,

	tglx


* Re: 0e334db6bb4b1fd1e2d72c1f3d8f004313cd9f94 (posix-timers: Fix division by zero bug). Problems with glibc.
  2019-02-04 17:25     ` Thomas Gleixner
@ 2019-02-05 13:54       ` Alan Mackenzie
  0 siblings, 0 replies; 8+ messages in thread
From: Alan Mackenzie @ 2019-02-05 13:54 UTC
  To: Thomas Gleixner
  Cc: LKML, 34235, Eli Zaretskii, Alex Branham, Paul Eggert, Greg KH

Hello, Thomas.

On Mon, Feb 04, 2019 at 17:25:11 +0000, Thomas Gleixner wrote:
> On Sat, 2 Feb 2019, Alan Mackenzie wrote:

[ .... ]

> > I've just built and installed Linux 4.19.19, and it does indeed solve
> > the Emacs profiler bug, #34235.  :-)

> > I see that the patch has been installed in 4.20.6, 4.19.19, and 4.14.97.
> > Are there any plans to install it into 4.9.x, the other live long term
> > support branch?  The reason I ask is to make an entry into Emacs's
> > PROBLEMS file, telling users and distributions which kernel versions to
> > upgrade to.

> 4.9 doesn't have the offending commit AFAICT.

OK, thanks very much!  I've put these three version numbers into the
message in Emacs's PROBLEMS file.

I think we're finished, now.
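
The three fixed releases named in this thread can be turned into a simple
version check.  This is a sketch only: has_rearm_fix and
release_has_rearm_fix are hypothetical names, and the result says whether a
release carries the "Unbreak timer rearming" fix, not whether it ever had the
offending commit.  Treating everything after 4.20 as fixed is an assumption
based on the "or later" wording in etc/PROBLEMS.

```c
#include <stdio.h>

/* Returns 1 if the release is at or past a fixed version named in the
   thread, 0 if it is in a named series but older than the fix, and -1
   if the thread says nothing usable about that series.  */
int has_rearm_fix(int major, int minor, int patch)
{
  if (major > 4 || (major == 4 && minor > 20))
    return 1;                   /* assumption: "or later" covers these */
  if (major == 4 && minor == 20)
    return patch >= 6;          /* fixed in 4.20.6 */
  if (major == 4 && minor == 19)
    return patch >= 19;         /* fixed in 4.19.19 */
  if (major == 4 && minor == 14)
    return patch >= 97;         /* fixed in 4.14.97 */
  return -1;                    /* e.g. 4.9.x: not covered by the fix list */
}

/* Parse a release string such as "4.19.13-gentoo" and apply the check.
   A missing patch level is treated as 0.  */
int release_has_rearm_fix(const char *release)
{
  int major, minor, patch = 0;
  if (sscanf(release, "%d.%d.%d", &major, &minor, &patch) < 2)
    return -1;
  return has_rearm_fix(major, minor, patch);
}
```

The release string could come from the `release' field filled in by
uname(2).  A return of 0 for 4.19.13 matches the failing kernel in the
original report.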

> Thanks,

> 	tglx

-- 
Alan Mackenzie (Nuremberg, Germany).


* Re: 0e334db6bb4b1fd1e2d72c1f3d8f004313cd9f94 (posix-timers: Fix, division by zero bug). Problems with glibc.
  2019-02-02  9:21 ` Thomas Gleixner
@ 2019-02-03  6:28   ` Paul Eggert
  0 siblings, 0 replies; 8+ messages in thread
From: Paul Eggert @ 2019-02-03  6:28 UTC
  To: Thomas Gleixner
  Cc: Alex Branham, Alan Mackenzie, Eli Zaretskii, linux-kernel, 34235

Thomas Gleixner wrote:
> Can you please verify whether the issue is fixed with 4.19.19?

It depends on what you mean by "verify". I looked at the Linux kernel source 
code and checked that the "posix-cpu-timers: Unbreak timer rearming" patch is in 
4.19.19 (but not 4.19.18) and in 4.20.6 (but not 4.20.5). I did not test Emacs's 
CPU profiler on these kernels, as I don't have them installed. I expect to 
upgrade soon to 4.20.6 (whenever Fedora 29 release does - 4.20.6 was submitted 
for testing a couple of days ago) and plan to give it a try then.


* Re: 0e334db6bb4b1fd1e2d72c1f3d8f004313cd9f94 (posix-timers: Fix, division by zero bug). Problems with glibc.
  2019-02-02  2:07 0e334db6bb4b1fd1e2d72c1f3d8f004313cd9f94 (posix-timers: Fix, " Paul Eggert
@ 2019-02-02  9:21 ` Thomas Gleixner
  2019-02-03  6:28   ` Paul Eggert
  0 siblings, 1 reply; 8+ messages in thread
From: Thomas Gleixner @ 2019-02-02  9:21 UTC
  To: Paul Eggert
  Cc: Alex Branham, Alan Mackenzie, Eli Zaretskii, linux-kernel, 34235

On Fri, 1 Feb 2019, Paul Eggert wrote:

> Thanks for helping to track down this bug. Since the problem occurs only with
> a few Linux kernel versions and affects Emacs only when doing CPU profiling,
> it doesn't seem worth spending time to try to patch Emacs to work around the
> bug. So I installed the attached patch into emacs-26's etc/PROBLEMS file to
> warn users about the problem, and am closing the Emacs bug report.

Can you please verify whether the issue is fixed with 4.19.19? The commit
in question broke posix CPU timers as an unintended side effect and a
follow-up patch unbreaks them again.

Thanks,

	tglx


* Re: 0e334db6bb4b1fd1e2d72c1f3d8f004313cd9f94 (posix-timers: Fix, division by zero bug). Problems with glibc.
@ 2019-02-02  2:07 Paul Eggert
  2019-02-02  9:21 ` Thomas Gleixner
  0 siblings, 1 reply; 8+ messages in thread
From: Paul Eggert @ 2019-02-02  2:07 UTC
  To: Thomas Gleixner
  Cc: Alex Branham, Alan Mackenzie, Eli Zaretskii, linux-kernel, 34235

[-- Attachment #1: Type: text/plain, Size: 381 bytes --]

Thanks for helping to track down this bug. Since the problem occurs only 
with a few Linux kernel versions and affects Emacs only when doing CPU 
profiling, it doesn't seem worth spending time to try to patch Emacs to 
work around the bug. So I installed the attached patch into emacs-26's 
etc/PROBLEMS file to warn users about the problem, and am closing the 
Emacs bug report.


[-- Attachment #2: 0001-etc-PROBLEMS-Mention-profiler-report-bug-Bug-34235.patch --]
[-- Type: text/x-patch, Size: 826 bytes --]

From 1243188bc4f722abf16518bf73924ce5f17750cf Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Fri, 1 Feb 2019 17:58:05 -0800
Subject: [PATCH] * etc/PROBLEMS: Mention profiler-report bug (Bug#34235).

---
 etc/PROBLEMS | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/etc/PROBLEMS b/etc/PROBLEMS
index cab087631c..00583f016d 100644
--- a/etc/PROBLEMS
+++ b/etc/PROBLEMS
@@ -1850,6 +1850,12 @@ term/xterm.el) for more details.
 
 ** GNU/Linux
 
+*** GNU/Linux: profiler-report outputs nothing.
+
+A few versions of the Linux kernel have timer bugs that break CPU
+profiling; see Bug#34235.  To fix the problem, upgrade to kernel
+versions 4.19.19 or 4.20.6, or later.
+
 *** GNU/Linux: Process output is corrupted.
 
 There is a bug in Linux kernel 2.6.10 PTYs that can cause emacs to
-- 
2.20.1



Thread overview: 8+ messages
2019-02-01 21:47 0e334db6bb4b1fd1e2d72c1f3d8f004313cd9f94 (posix-timers: Fix division by zero bug). Problems with glibc Alan Mackenzie
2019-02-01 22:04 ` Thomas Gleixner
2019-02-02 10:44   ` Alan Mackenzie
2019-02-04 17:25     ` Thomas Gleixner
2019-02-05 13:54       ` Alan Mackenzie
2019-02-02  2:07 0e334db6bb4b1fd1e2d72c1f3d8f004313cd9f94 (posix-timers: Fix, " Paul Eggert
2019-02-02  9:21 ` Thomas Gleixner
2019-02-03  6:28   ` Paul Eggert
