All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v5 0/7] Getting rid of get_unused_fd() / enable close-on-exec
@ 2014-01-05 20:36 ` Yann Droneaud
  0 siblings, 0 replies; 38+ messages in thread
From: Yann Droneaud @ 2014-01-05 20:36 UTC (permalink / raw)
  To: Tony Luck, Fenghua Yu, Al Viro, linux-ia64, Jeremy Kerr,
	Arnd Bergmann, Benjamin Herrenschmidt, Paul Mackerras,
	linuxppc-dev, cbe-oss-dev, linux-fsdevel, Eric Paris,
	Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Andrew Morton
  Cc: Yann Droneaud, linux-kernel, Ulrich Drepper

Hi,

Please find the fifth revision of my patchset to remove get_unused_fd()
macro in order to help subsystems to use get_unused_fd_flags() (or
anon_inode_getfd()) with flags either provided by the userspace or
set to O_CLOEXEC by default where appropriate.

Without get_unused_fd() macro, more subsystems are likely to use
get_unused_fd_flags() (or anon_inode_getfd()) and be teached to
provide an API that let userspace choose the opening flags
of the file descriptor.

Not allowing userspace to provide the "open" flags or not using O_CLOEXEC
by default should be considered bad practice from security point of view:
in most case O_CLOEXEC must be used to not leak file descriptor across
exec().

Not allowing userspace to atomically set close-on-exec flag and not using
O_CLOEXEC should be avoided to protect multi-threaded program
from race condition when it tried to set close-on-exec flag using
fcntl(fd, F_SETFD, FD_CLOEXEC) after opening the file descriptor.

Example:

    int fd;
    int ret;

    fd = open(filename, O_RDONLY);
    if (fd < 0) {
        perror("open");
        return -1;
    }

    /*
     * window opened for another thread to call fork(),
     * then the new process can call exec() at any time
     * and the file descriptor would be inherited
     */

    ret = fcntl(fd, F_SETFD, FD_CLOEXEC)
    if (ret < 0) {
        perror("fnctl()");
        close(fd);
        return -1;
    }

vs.:

    int fd;
    fd = open(filaneme, O_RDONLY | O_CLOEXEC);
    if (fd < 0) {
        perror("open");
        return -1;
    }

Using O_CLOEXEC by default when flags are not (eg. cannot be) provided
by userspace is the safest bet as it allows userspace to choose, if, when
and where the file descriptor is going to be inherited across exec():
userspace is free to call fcntl() to remove FD_CLOEXEC flag in the child
process that will call exec().

Unfortunately, O_CLOEXEC cannot be made the default for most existing system
calls as it's not the default behavior for POSIX / Unix. Reader interested
in this issue could have a look at "Ghosts of Unix past, part 2: Conflated
designs" [1] article by Neil Brown.

FAQ:

- Why do one want close-on-exec ?

Setting close-on-exec flag on file descriptor ensure it won't be inherited
silently by child, child of child, etc. when executing another program.

If the file descriptor is not closed, some kernel resources can be locked
until the last process with the opened file descriptor exit.

If the file descriptor is not closed, this can lead to a security issue,
eg. making resources available to a less privileged program allowing
information leak and/or deny of service.

- Why do one need atomic close-on-exec ?

Even if it's possible to set close-on-exec flag through call to fcntl()
as shown previously, it introduces a race condition in multi-threaded
process, where a thread call fork() / exec() while another thread
is between call to open() and fcntl().

Additionally, using close-on-exec free the programmer from tracking manually
which file descriptor is to close in a child before calling exec():
in a program using multiple third-party libraries, it's difficult to know
which file descriptor must be closed.
AFAIK, while there's a atexit(), pthread_atfork(), there's no atexec()
userspace function in libc to allow libraries to register a handler in
order to close its opened file descriptor before exec().

- Why default to close-on-exec ?

Some kernel interfaces don't allow userspace to pass a O_CLOEXEC-like
flag when creating a new file descriptor.

In such cases, if possible (see below), O_CLOEXEC must be made
the default so that userspace doesn't have to call fcntl()
which, as demonstrated previously, is open to race condition in
multi-threaded program.

- How to choose between flag 0 or O_CLOEXEC in call to
  get_unused_fd_flags() (or anon_inode_getfd()) ?

Short: Will it break existing application ? Will it break kernel ABI ?

       If answer is no, use O_CLOEXEC.
       If answer is yes, use 0.

Long: If userspace expect to retrieve a file descriptor with plain
      old Unix(tm) semantics, O_CLOEXEC must not be the default, as it
      could break some applications expecting that the file descriptor
      will be inherited across exec().

      But for some subsystems, such as InfiniBand, KVM, VFIO, it makes
      no sense to have file descriptors inherited across exec() since
      those are tied to resources that will vanish when a another program
      will replace the current one by mean of exec(), so it's safe to use
      O_CLOEXEC in such cases.

      For others, like XFS, the file descriptor is retrieved by one
      program and will be used by a different program, executed as a
      child. In this case, setting O_CLOEXEC would break existing
      application which do not expect to have to call fcntl(fd,
      F_SETFD, 0) to make it available across exec().

      If file descriptor created by a subsystem is not tied to the
      current process resources, it's likely legal to use it in a
      different process context, thus O_CLOEXEC must not be the default.

      If one, as a subsystem maintainer, cannot tell for sure that
      no userspace program ever rely current behavior, eg. file descriptor
      being inherited across exec(), then the default flag *must*
      be kept 0 to not break application.

- This subsystem cannot be turned to use O_CLOEXEC by default:

If O_CLOEXEC cannot be made the default, it would be interesting
to think to extend the API to have a (set of) function(s) taking
a flag parameter so that userspace can atomically request close-on-exec
if it need it (and it should need it).

- Background:

One might want to read "Secure File Descriptor Handling" [2] by
Ulrich Drepper who is responsible of adding O_CLOEXEC flag on open(),
and flags alike on other syscalls.

One might also want to read PEP-446 "Make newly created file descriptors
non-inheritable" [3] by Victor Stinner since it has lot more background
information on file descriptor leaking.

One also like to read "Excuse me son, but your code is leaking !!!" [4]
by Dan Walsh for advice.

[1] http://lwn.net/Articles/412131/
[2] http://udrepper.livejournal.com/20407.html
[3] http://www.python.org/dev/peps/pep-0446/
[4] http://danwalsh.livejournal.com/53603.html

- Statistics:

In linux-next tag 20131224, they're currently:

- 32 calls to fd_install()
       with one call part of anon_inode_getfd()
- 24 calls to get_unused_fd_flags()
       with one call part of anon_inode_getfd()
       with another part of get_unused_fd() macro
- 11 calls to anon_inode_getfd()
-  8 calls to anon_inode_getfile()
       with one call part of anon_inode_getfd()
-  7 calls to get_unused_fd()

Changes from patchset v4 [PATCHSETv4]:

- rewrote cover letter following discussion with perf maintainer.
  Thanks to Peter Zijlstra.

- modified a bit some commit messages.

- events: use get_unused_fd_flags(0) instead of get_unused_fd()
  DROPPED: replaced by following patch

- perf: introduce a flag to enable close-on-exec in perf_event_open()
  NEW: instead of hard coding the flags to 0, this patch
       allows userspace to specify close-on-exec flag.

- fanotify: use get_unused_fd_flags(0) instead of get_unused_fd()
  DROPPED: replaced by following patch

- fanotify: enable close-on-exec on events' fd when requested in
    fanotify_init()
  NEW: instead of hard coding the flags to 0, this patch
       enable close-on-exec if userspace request it.

Changes from patchset v3 [PATCHSETv3]:

- industrialio: use anon_inode_getfd() with O_CLOEXEC flag
  DROPPED: applied upstream

Changes from patchset v2 [PATCHSETv2]:

- android/sw_sync: use get_unused_fd_flags(O_CLOEXEC) instead of get_unused_fd()
  DROPPED: applied upstream

- android/sync: use get_unused_fd_flags(O_CLOEXEC) instead of get_unused_fd()
  DROPPED: applied upstream

- vfio: use get_unused_fd_flags(0) instead of get_unused_fd()
  DROPPED: applied upstream.
  Additionally subsystem maintainer applied another patch on top
  to set the flags to O_CLOEXEC.

- industrialio: use anon_inode_getfd() with O_CLOEXEC flag
  NEW: propose to use O_CLOEXEC as default flag.

Changes from patchset v1 [PATCHSETv1]:

- explicitly added subsystem maintainers as mail recepients.

- infiniband: use get_unused_fd_flags(0) instead of get_unused_fd()
  DROPPED: subsystem maintainer applied another patch
           using get_unused_fd_flags(O_CLOEXEC) as suggested.

- android/sw_sync: use get_unused_fd_flags(0) instead of get_unused_fd()
  MODIFIED: use get_unused_fd_flags(O_CLOEXEC) as suggested

- android/sync: use get_unused_fd_flags(0) instead of get_unused_fd()
  MODIFIED: use get_unused_fd_flags(O_CLOEXEC) as suggested

- xfs: use get_unused_fd_flags(0) instead of get_unused_fd()
  DROPPED: applied asis by subsystem maintainer.

- sctp: use get_unused_fd_flags(0) instead of get_unused_fd()
  DROPPED: applied asis by subsystem maintainer.

Links:

[PATCHSETv4]
  http://lkml.kernel.org/r/cover.1383121137.git.ydroneaud@opteya.com

[PATCHSETv3]
  http://lkml.kernel.org/r/cover.1378460926.git.ydroneaud@opteya.com

[PATCHSETv2]
  http://lkml.kernel.org/r/cover.1376327678.git.ydroneaud@opteya.com

[PATCHSETv1]
  http://lkml.kernel.org/r/cover.1372777600.git.ydroneaud@opteya.com

PS: Happy new (gregorian calendar's) year 2014 and best wishes ;)

Yann Droneaud (7):
  ia64: use get_unused_fd_flags(0) instead of get_unused_fd()
  ppc/cell: use get_unused_fd_flags(0) instead of get_unused_fd()
  binfmt_misc: use get_unused_fd_flags(0) instead of get_unused_fd()
  file: use get_unused_fd_flags(0) instead of get_unused_fd()
  fanotify: enable close-on-exec on events' fd when requested in
    fanotify_init()
  perf: introduce a flag to enable close-on-exec in perf_event_open()
  file: remove macro get_unused_fd()

 arch/ia64/kernel/perfmon.c                |  2 +-
 arch/powerpc/platforms/cell/spufs/inode.c |  4 ++--
 fs/binfmt_misc.c                          |  2 +-
 fs/file.c                                 |  2 +-
 fs/notify/fanotify/fanotify_user.c        |  2 +-
 include/linux/file.h                      |  1 -
 include/uapi/linux/perf_event.h           |  1 +
 kernel/events/core.c                      | 12 +++++++++---
 8 files changed, 16 insertions(+), 10 deletions(-)

-- 
1.8.4.2


^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH v5 0/7] Getting rid of get_unused_fd() / enable close-on-exec
@ 2014-01-05 20:36 ` Yann Droneaud
  0 siblings, 0 replies; 38+ messages in thread
From: Yann Droneaud @ 2014-01-05 20:36 UTC (permalink / raw)
  To: Tony Luck, Fenghua Yu, Al Viro, linux-ia64, Jeremy Kerr,
	Arnd Bergmann, Benjamin Herrenschmidt, Paul Mackerras,
	linuxppc-dev, cbe-oss-dev, linux-fsdevel, Eric Paris,
	Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Andrew Morton
  Cc: Yann Droneaud, Ulrich Drepper, linux-kernel

Hi,

Please find the fifth revision of my patchset to remove get_unused_fd()
macro in order to help subsystems to use get_unused_fd_flags() (or
anon_inode_getfd()) with flags either provided by the userspace or
set to O_CLOEXEC by default where appropriate.

Without get_unused_fd() macro, more subsystems are likely to use
get_unused_fd_flags() (or anon_inode_getfd()) and be teached to
provide an API that let userspace choose the opening flags
of the file descriptor.

Not allowing userspace to provide the "open" flags or not using O_CLOEXEC
by default should be considered bad practice from security point of view:
in most case O_CLOEXEC must be used to not leak file descriptor across
exec().

Not allowing userspace to atomically set close-on-exec flag and not using
O_CLOEXEC should be avoided to protect multi-threaded program
from race condition when it tried to set close-on-exec flag using
fcntl(fd, F_SETFD, FD_CLOEXEC) after opening the file descriptor.

Example:

    int fd;
    int ret;

    fd = open(filename, O_RDONLY);
    if (fd < 0) {
        perror("open");
        return -1;
    }

    /*
     * window opened for another thread to call fork(),
     * then the new process can call exec() at any time
     * and the file descriptor would be inherited
     */

    ret = fcntl(fd, F_SETFD, FD_CLOEXEC)
    if (ret < 0) {
        perror("fnctl()");
        close(fd);
        return -1;
    }

vs.:

    int fd;
    fd = open(filaneme, O_RDONLY | O_CLOEXEC);
    if (fd < 0) {
        perror("open");
        return -1;
    }

Using O_CLOEXEC by default when flags are not (eg. cannot be) provided
by userspace is the safest bet as it allows userspace to choose, if, when
and where the file descriptor is going to be inherited across exec():
userspace is free to call fcntl() to remove FD_CLOEXEC flag in the child
process that will call exec().

Unfortunately, O_CLOEXEC cannot be made the default for most existing system
calls as it's not the default behavior for POSIX / Unix. Reader interested
in this issue could have a look at "Ghosts of Unix past, part 2: Conflated
designs" [1] article by Neil Brown.

FAQ:

- Why do one want close-on-exec ?

Setting close-on-exec flag on file descriptor ensure it won't be inherited
silently by child, child of child, etc. when executing another program.

If the file descriptor is not closed, some kernel resources can be locked
until the last process with the opened file descriptor exit.

If the file descriptor is not closed, this can lead to a security issue,
eg. making resources available to a less privileged program allowing
information leak and/or deny of service.

- Why do one need atomic close-on-exec ?

Even if it's possible to set close-on-exec flag through call to fcntl()
as shown previously, it introduces a race condition in multi-threaded
process, where a thread call fork() / exec() while another thread
is between call to open() and fcntl().

Additionally, using close-on-exec free the programmer from tracking manually
which file descriptor is to close in a child before calling exec():
in a program using multiple third-party libraries, it's difficult to know
which file descriptor must be closed.
AFAIK, while there's a atexit(), pthread_atfork(), there's no atexec()
userspace function in libc to allow libraries to register a handler in
order to close its opened file descriptor before exec().

- Why default to close-on-exec ?

Some kernel interfaces don't allow userspace to pass a O_CLOEXEC-like
flag when creating a new file descriptor.

In such cases, if possible (see below), O_CLOEXEC must be made
the default so that userspace doesn't have to call fcntl()
which, as demonstrated previously, is open to race condition in
multi-threaded program.

- How to choose between flag 0 or O_CLOEXEC in call to
  get_unused_fd_flags() (or anon_inode_getfd()) ?

Short: Will it break existing application ? Will it break kernel ABI ?

       If answer is no, use O_CLOEXEC.
       If answer is yes, use 0.

Long: If userspace expect to retrieve a file descriptor with plain
      old Unix(tm) semantics, O_CLOEXEC must not be the default, as it
      could break some applications expecting that the file descriptor
      will be inherited across exec().

      But for some subsystems, such as InfiniBand, KVM, VFIO, it makes
      no sense to have file descriptors inherited across exec() since
      those are tied to resources that will vanish when a another program
      will replace the current one by mean of exec(), so it's safe to use
      O_CLOEXEC in such cases.

      For others, like XFS, the file descriptor is retrieved by one
      program and will be used by a different program, executed as a
      child. In this case, setting O_CLOEXEC would break existing
      application which do not expect to have to call fcntl(fd,
      F_SETFD, 0) to make it available across exec().

      If file descriptor created by a subsystem is not tied to the
      current process resources, it's likely legal to use it in a
      different process context, thus O_CLOEXEC must not be the default.

      If one, as a subsystem maintainer, cannot tell for sure that
      no userspace program ever rely current behavior, eg. file descriptor
      being inherited across exec(), then the default flag *must*
      be kept 0 to not break application.

- This subsystem cannot be turned to use O_CLOEXEC by default:

If O_CLOEXEC cannot be made the default, it would be interesting
to think to extend the API to have a (set of) function(s) taking
a flag parameter so that userspace can atomically request close-on-exec
if it need it (and it should need it).

- Background:

One might want to read "Secure File Descriptor Handling" [2] by
Ulrich Drepper who is responsible of adding O_CLOEXEC flag on open(),
and flags alike on other syscalls.

One might also want to read PEP-446 "Make newly created file descriptors
non-inheritable" [3] by Victor Stinner since it has lot more background
information on file descriptor leaking.

One also like to read "Excuse me son, but your code is leaking !!!" [4]
by Dan Walsh for advice.

[1] http://lwn.net/Articles/412131/
[2] http://udrepper.livejournal.com/20407.html
[3] http://www.python.org/dev/peps/pep-0446/
[4] http://danwalsh.livejournal.com/53603.html

- Statistics:

In linux-next tag 20131224, they're currently:

- 32 calls to fd_install()
       with one call part of anon_inode_getfd()
- 24 calls to get_unused_fd_flags()
       with one call part of anon_inode_getfd()
       with another part of get_unused_fd() macro
- 11 calls to anon_inode_getfd()
-  8 calls to anon_inode_getfile()
       with one call part of anon_inode_getfd()
-  7 calls to get_unused_fd()

Changes from patchset v4 [PATCHSETv4]:

- rewrote cover letter following discussion with perf maintainer.
  Thanks to Peter Zijlstra.

- modified a bit some commit messages.

- events: use get_unused_fd_flags(0) instead of get_unused_fd()
  DROPPED: replaced by following patch

- perf: introduce a flag to enable close-on-exec in perf_event_open()
  NEW: instead of hard coding the flags to 0, this patch
       allows userspace to specify close-on-exec flag.

- fanotify: use get_unused_fd_flags(0) instead of get_unused_fd()
  DROPPED: replaced by following patch

- fanotify: enable close-on-exec on events' fd when requested in
    fanotify_init()
  NEW: instead of hard coding the flags to 0, this patch
       enable close-on-exec if userspace request it.

Changes from patchset v3 [PATCHSETv3]:

- industrialio: use anon_inode_getfd() with O_CLOEXEC flag
  DROPPED: applied upstream

Changes from patchset v2 [PATCHSETv2]:

- android/sw_sync: use get_unused_fd_flags(O_CLOEXEC) instead of get_unused_fd()
  DROPPED: applied upstream

- android/sync: use get_unused_fd_flags(O_CLOEXEC) instead of get_unused_fd()
  DROPPED: applied upstream

- vfio: use get_unused_fd_flags(0) instead of get_unused_fd()
  DROPPED: applied upstream.
  Additionally subsystem maintainer applied another patch on top
  to set the flags to O_CLOEXEC.

- industrialio: use anon_inode_getfd() with O_CLOEXEC flag
  NEW: propose to use O_CLOEXEC as default flag.

Changes from patchset v1 [PATCHSETv1]:

- explicitly added subsystem maintainers as mail recepients.

- infiniband: use get_unused_fd_flags(0) instead of get_unused_fd()
  DROPPED: subsystem maintainer applied another patch
           using get_unused_fd_flags(O_CLOEXEC) as suggested.

- android/sw_sync: use get_unused_fd_flags(0) instead of get_unused_fd()
  MODIFIED: use get_unused_fd_flags(O_CLOEXEC) as suggested

- android/sync: use get_unused_fd_flags(0) instead of get_unused_fd()
  MODIFIED: use get_unused_fd_flags(O_CLOEXEC) as suggested

- xfs: use get_unused_fd_flags(0) instead of get_unused_fd()
  DROPPED: applied asis by subsystem maintainer.

- sctp: use get_unused_fd_flags(0) instead of get_unused_fd()
  DROPPED: applied asis by subsystem maintainer.

Links:

[PATCHSETv4]
  http://lkml.kernel.org/r/cover.1383121137.git.ydroneaud@opteya.com

[PATCHSETv3]
  http://lkml.kernel.org/r/cover.1378460926.git.ydroneaud@opteya.com

[PATCHSETv2]
  http://lkml.kernel.org/r/cover.1376327678.git.ydroneaud@opteya.com

[PATCHSETv1]
  http://lkml.kernel.org/r/cover.1372777600.git.ydroneaud@opteya.com

PS: Happy new (gregorian calendar's) year 2014 and best wishes ;)

Yann Droneaud (7):
  ia64: use get_unused_fd_flags(0) instead of get_unused_fd()
  ppc/cell: use get_unused_fd_flags(0) instead of get_unused_fd()
  binfmt_misc: use get_unused_fd_flags(0) instead of get_unused_fd()
  file: use get_unused_fd_flags(0) instead of get_unused_fd()
  fanotify: enable close-on-exec on events' fd when requested in
    fanotify_init()
  perf: introduce a flag to enable close-on-exec in perf_event_open()
  file: remove macro get_unused_fd()

 arch/ia64/kernel/perfmon.c                |  2 +-
 arch/powerpc/platforms/cell/spufs/inode.c |  4 ++--
 fs/binfmt_misc.c                          |  2 +-
 fs/file.c                                 |  2 +-
 fs/notify/fanotify/fanotify_user.c        |  2 +-
 include/linux/file.h                      |  1 -
 include/uapi/linux/perf_event.h           |  1 +
 kernel/events/core.c                      | 12 +++++++++---
 8 files changed, 16 insertions(+), 10 deletions(-)

-- 
1.8.4.2

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH v5 0/7] Getting rid of get_unused_fd() / enable close-on-exec
@ 2014-01-05 20:36 ` Yann Droneaud
  0 siblings, 0 replies; 38+ messages in thread
From: Yann Droneaud @ 2014-01-05 20:36 UTC (permalink / raw)
  To: Tony Luck, Fenghua Yu, Al Viro, linux-ia64, Jeremy Kerr,
	Arnd Bergmann, Benjamin Herrenschmidt, Paul Mackerras,
	linuxppc-dev, cbe-oss-dev, linux-fsdevel, Eric Paris,
	Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Andrew Morton
  Cc: Yann Droneaud, linux-kernel, Ulrich Drepper

Hi,

Please find the fifth revision of my patchset to remove get_unused_fd()
macro in order to help subsystems to use get_unused_fd_flags() (or
anon_inode_getfd()) with flags either provided by the userspace or
set to O_CLOEXEC by default where appropriate.

Without get_unused_fd() macro, more subsystems are likely to use
get_unused_fd_flags() (or anon_inode_getfd()) and be teached to
provide an API that let userspace choose the opening flags
of the file descriptor.

Not allowing userspace to provide the "open" flags or not using O_CLOEXEC
by default should be considered bad practice from security point of view:
in most case O_CLOEXEC must be used to not leak file descriptor across
exec().

Not allowing userspace to atomically set close-on-exec flag and not using
O_CLOEXEC should be avoided to protect multi-threaded program
from race condition when it tried to set close-on-exec flag using
fcntl(fd, F_SETFD, FD_CLOEXEC) after opening the file descriptor.

Example:

    int fd;
    int ret;

    fd = open(filename, O_RDONLY);
    if (fd < 0) {
        perror("open");
        return -1;
    }

    /*
     * window opened for another thread to call fork(),
     * then the new process can call exec() at any time
     * and the file descriptor would be inherited
     */

    ret = fcntl(fd, F_SETFD, FD_CLOEXEC)
    if (ret < 0) {
        perror("fnctl()");
        close(fd);
        return -1;
    }

vs.:

    int fd;
    fd = open(filaneme, O_RDONLY | O_CLOEXEC);
    if (fd < 0) {
        perror("open");
        return -1;
    }

Using O_CLOEXEC by default when flags are not (eg. cannot be) provided
by userspace is the safest bet as it allows userspace to choose, if, when
and where the file descriptor is going to be inherited across exec():
userspace is free to call fcntl() to remove FD_CLOEXEC flag in the child
process that will call exec().

Unfortunately, O_CLOEXEC cannot be made the default for most existing system
calls as it's not the default behavior for POSIX / Unix. Reader interested
in this issue could have a look at "Ghosts of Unix past, part 2: Conflated
designs" [1] article by Neil Brown.

FAQ:

- Why do one want close-on-exec ?

Setting close-on-exec flag on file descriptor ensure it won't be inherited
silently by child, child of child, etc. when executing another program.

If the file descriptor is not closed, some kernel resources can be locked
until the last process with the opened file descriptor exit.

If the file descriptor is not closed, this can lead to a security issue,
eg. making resources available to a less privileged program allowing
information leak and/or deny of service.

- Why do one need atomic close-on-exec ?

Even if it's possible to set close-on-exec flag through call to fcntl()
as shown previously, it introduces a race condition in multi-threaded
process, where a thread call fork() / exec() while another thread
is between call to open() and fcntl().

Additionally, using close-on-exec free the programmer from tracking manually
which file descriptor is to close in a child before calling exec():
in a program using multiple third-party libraries, it's difficult to know
which file descriptor must be closed.
AFAIK, while there's a atexit(), pthread_atfork(), there's no atexec()
userspace function in libc to allow libraries to register a handler in
order to close its opened file descriptor before exec().

- Why default to close-on-exec ?

Some kernel interfaces don't allow userspace to pass a O_CLOEXEC-like
flag when creating a new file descriptor.

In such cases, if possible (see below), O_CLOEXEC must be made
the default so that userspace doesn't have to call fcntl()
which, as demonstrated previously, is open to race condition in
multi-threaded program.

- How to choose between flag 0 or O_CLOEXEC in call to
  get_unused_fd_flags() (or anon_inode_getfd()) ?

Short: Will it break existing application ? Will it break kernel ABI ?

       If answer is no, use O_CLOEXEC.
       If answer is yes, use 0.

Long: If userspace expect to retrieve a file descriptor with plain
      old Unix(tm) semantics, O_CLOEXEC must not be the default, as it
      could break some applications expecting that the file descriptor
      will be inherited across exec().

      But for some subsystems, such as InfiniBand, KVM, VFIO, it makes
      no sense to have file descriptors inherited across exec() since
      those are tied to resources that will vanish when a another program
      will replace the current one by mean of exec(), so it's safe to use
      O_CLOEXEC in such cases.

      For others, like XFS, the file descriptor is retrieved by one
      program and will be used by a different program, executed as a
      child. In this case, setting O_CLOEXEC would break existing
      application which do not expect to have to call fcntl(fd,
      F_SETFD, 0) to make it available across exec().

      If file descriptor created by a subsystem is not tied to the
      current process resources, it's likely legal to use it in a
      different process context, thus O_CLOEXEC must not be the default.

      If one, as a subsystem maintainer, cannot tell for sure that
      no userspace program ever rely current behavior, eg. file descriptor
      being inherited across exec(), then the default flag *must*
      be kept 0 to not break application.

- This subsystem cannot be turned to use O_CLOEXEC by default:

If O_CLOEXEC cannot be made the default, it would be interesting
to think to extend the API to have a (set of) function(s) taking
a flag parameter so that userspace can atomically request close-on-exec
if it need it (and it should need it).

- Background:

One might want to read "Secure File Descriptor Handling" [2] by
Ulrich Drepper who is responsible of adding O_CLOEXEC flag on open(),
and flags alike on other syscalls.

One might also want to read PEP-446 "Make newly created file descriptors
non-inheritable" [3] by Victor Stinner since it has lot more background
information on file descriptor leaking.

One also like to read "Excuse me son, but your code is leaking !!!" [4]
by Dan Walsh for advice.

[1] http://lwn.net/Articles/412131/
[2] http://udrepper.livejournal.com/20407.html
[3] http://www.python.org/dev/peps/pep-0446/
[4] http://danwalsh.livejournal.com/53603.html

- Statistics:

In linux-next tag 20131224, they're currently:

- 32 calls to fd_install()
       with one call part of anon_inode_getfd()
- 24 calls to get_unused_fd_flags()
       with one call part of anon_inode_getfd()
       with another part of get_unused_fd() macro
- 11 calls to anon_inode_getfd()
-  8 calls to anon_inode_getfile()
       with one call part of anon_inode_getfd()
-  7 calls to get_unused_fd()

Changes from patchset v4 [PATCHSETv4]:

- rewrote cover letter following discussion with perf maintainer.
  Thanks to Peter Zijlstra.

- modified a bit some commit messages.

- events: use get_unused_fd_flags(0) instead of get_unused_fd()
  DROPPED: replaced by following patch

- perf: introduce a flag to enable close-on-exec in perf_event_open()
  NEW: instead of hard coding the flags to 0, this patch
       allows userspace to specify close-on-exec flag.

- fanotify: use get_unused_fd_flags(0) instead of get_unused_fd()
  DROPPED: replaced by following patch

- fanotify: enable close-on-exec on events' fd when requested in
    fanotify_init()
  NEW: instead of hard coding the flags to 0, this patch
       enable close-on-exec if userspace request it.

Changes from patchset v3 [PATCHSETv3]:

- industrialio: use anon_inode_getfd() with O_CLOEXEC flag
  DROPPED: applied upstream

Changes from patchset v2 [PATCHSETv2]:

- android/sw_sync: use get_unused_fd_flags(O_CLOEXEC) instead of get_unused_fd()
  DROPPED: applied upstream

- android/sync: use get_unused_fd_flags(O_CLOEXEC) instead of get_unused_fd()
  DROPPED: applied upstream

- vfio: use get_unused_fd_flags(0) instead of get_unused_fd()
  DROPPED: applied upstream.
  Additionally subsystem maintainer applied another patch on top
  to set the flags to O_CLOEXEC.

- industrialio: use anon_inode_getfd() with O_CLOEXEC flag
  NEW: propose to use O_CLOEXEC as default flag.

Changes from patchset v1 [PATCHSETv1]:

- explicitly added subsystem maintainers as mail recepients.

- infiniband: use get_unused_fd_flags(0) instead of get_unused_fd()
  DROPPED: subsystem maintainer applied another patch
           using get_unused_fd_flags(O_CLOEXEC) as suggested.

- android/sw_sync: use get_unused_fd_flags(0) instead of get_unused_fd()
  MODIFIED: use get_unused_fd_flags(O_CLOEXEC) as suggested

- android/sync: use get_unused_fd_flags(0) instead of get_unused_fd()
  MODIFIED: use get_unused_fd_flags(O_CLOEXEC) as suggested

- xfs: use get_unused_fd_flags(0) instead of get_unused_fd()
  DROPPED: applied asis by subsystem maintainer.

- sctp: use get_unused_fd_flags(0) instead of get_unused_fd()
  DROPPED: applied asis by subsystem maintainer.

Links:

[PATCHSETv4]
  http://lkml.kernel.org/r/cover.1383121137.git.ydroneaud@opteya.com

[PATCHSETv3]
  http://lkml.kernel.org/r/cover.1378460926.git.ydroneaud@opteya.com

[PATCHSETv2]
  http://lkml.kernel.org/r/cover.1376327678.git.ydroneaud@opteya.com

[PATCHSETv1]
  http://lkml.kernel.org/r/cover.1372777600.git.ydroneaud@opteya.com

PS: Happy new (gregorian calendar's) year 2014 and best wishes ;)

Yann Droneaud (7):
  ia64: use get_unused_fd_flags(0) instead of get_unused_fd()
  ppc/cell: use get_unused_fd_flags(0) instead of get_unused_fd()
  binfmt_misc: use get_unused_fd_flags(0) instead of get_unused_fd()
  file: use get_unused_fd_flags(0) instead of get_unused_fd()
  fanotify: enable close-on-exec on events' fd when requested in
    fanotify_init()
  perf: introduce a flag to enable close-on-exec in perf_event_open()
  file: remove macro get_unused_fd()

 arch/ia64/kernel/perfmon.c                |  2 +-
 arch/powerpc/platforms/cell/spufs/inode.c |  4 ++--
 fs/binfmt_misc.c                          |  2 +-
 fs/file.c                                 |  2 +-
 fs/notify/fanotify/fanotify_user.c        |  2 +-
 include/linux/file.h                      |  1 -
 include/uapi/linux/perf_event.h           |  1 +
 kernel/events/core.c                      | 12 +++++++++---
 8 files changed, 16 insertions(+), 10 deletions(-)

-- 
1.8.4.2


^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCHv5 1/7] ia64: use get_unused_fd_flags(0) instead of get_unused_fd()
  2014-01-05 20:36 ` Yann Droneaud
@ 2014-01-05 20:36   ` Yann Droneaud
  -1 siblings, 0 replies; 38+ messages in thread
From: Yann Droneaud @ 2014-01-05 20:36 UTC (permalink / raw)
  To: Tony Luck, Fenghua Yu
  Cc: Yann Droneaud, linux-ia64, linux-kernel, Al Viro, Andrew Morton

Macro get_unused_fd() is used to allocate a file descriptor with
default flags. Those default flags (0) can be "unsafe":
O_CLOEXEC must be used by default to not leak file descriptor
across exec().

Instead of macro get_unused_fd(), function get_unused_fd_flags()
(or anon_inode_getfd()) should be used with flags given by userspace.
If not possible, flags should be set to O_CLOEXEC to provide userspace
with a default safe behavor.

In a further patch, get_unused_fd() will be removed so that new code
start using get_unused_fd_flags() (or anon_inode_getfd()) with correct
flags.

This patch replaces calls to get_unused_fd() with equivalent call to
get_unused_fd_flags(0) to preserve current behavor for existing code.

The hard coded flag value (0) should be reviewed on a per-subsystem
basis, and, if possible, set to O_CLOEXEC.

Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
Link: http://lkml.kernel.org/r/cover.1388952061.git.ydroneaud@opteya.com
---
 arch/ia64/kernel/perfmon.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/ia64/kernel/perfmon.c b/arch/ia64/kernel/perfmon.c
index cb592773c78b..44b148fbb560 100644
--- a/arch/ia64/kernel/perfmon.c
+++ b/arch/ia64/kernel/perfmon.c
@@ -2662,7 +2662,7 @@ pfm_context_create(pfm_context_t *ctx, void *arg, int count, struct pt_regs *reg
 
 	ret = -ENOMEM;
 
-	fd = get_unused_fd();
+	fd = get_unused_fd_flags(0);
 	if (fd < 0)
 		return fd;
 
-- 
1.8.4.2


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCHv5 1/7] ia64: use get_unused_fd_flags(0) instead of get_unused_fd()
@ 2014-01-05 20:36   ` Yann Droneaud
  0 siblings, 0 replies; 38+ messages in thread
From: Yann Droneaud @ 2014-01-05 20:36 UTC (permalink / raw)
  To: Tony Luck, Fenghua Yu
  Cc: Yann Droneaud, linux-ia64, linux-kernel, Al Viro, Andrew Morton

Macro get_unused_fd() is used to allocate a file descriptor with
default flags. Those default flags (0) can be "unsafe":
O_CLOEXEC must be used by default to not leak file descriptor
across exec().

Instead of macro get_unused_fd(), function get_unused_fd_flags()
(or anon_inode_getfd()) should be used with flags given by userspace.
If not possible, flags should be set to O_CLOEXEC to provide userspace
with a default safe behavor.

In a further patch, get_unused_fd() will be removed so that new code
start using get_unused_fd_flags() (or anon_inode_getfd()) with correct
flags.

This patch replaces calls to get_unused_fd() with equivalent call to
get_unused_fd_flags(0) to preserve current behavor for existing code.

The hard coded flag value (0) should be reviewed on a per-subsystem
basis, and, if possible, set to O_CLOEXEC.

Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
Link: http://lkml.kernel.org/r/cover.1388952061.git.ydroneaud@opteya.com
---
 arch/ia64/kernel/perfmon.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/ia64/kernel/perfmon.c b/arch/ia64/kernel/perfmon.c
index cb592773c78b..44b148fbb560 100644
--- a/arch/ia64/kernel/perfmon.c
+++ b/arch/ia64/kernel/perfmon.c
@@ -2662,7 +2662,7 @@ pfm_context_create(pfm_context_t *ctx, void *arg, int count, struct pt_regs *reg
 
 	ret = -ENOMEM;
 
-	fd = get_unused_fd();
+	fd = get_unused_fd_flags(0);
 	if (fd < 0)
 		return fd;
 
-- 
1.8.4.2


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCHv5 2/7] ppc/cell: use get_unused_fd_flags(0) instead of get_unused_fd()
  2014-01-05 20:36 ` Yann Droneaud
@ 2014-01-05 20:36   ` Yann Droneaud
  -1 siblings, 0 replies; 38+ messages in thread
From: Yann Droneaud @ 2014-01-05 20:36 UTC (permalink / raw)
  To: Jeremy Kerr, Arnd Bergmann, Benjamin Herrenschmidt, Paul Mackerras
  Cc: Yann Droneaud, linuxppc-dev, cbe-oss-dev, linux-kernel, Al Viro,
	Andrew Morton

Macro get_unused_fd() is used to allocate a file descriptor with
default flags. Those default flags (0) can be "unsafe":
O_CLOEXEC must be used by default to not leak file descriptor
across exec().

Instead of macro get_unused_fd(), function get_unused_fd_flags()
(or anon_inode_getfd()) should be used with flags given by userspace.
If not possible, flags should be set to O_CLOEXEC to provide userspace
with a default safe behavor.

In a further patch, get_unused_fd() will be removed so that new code
start using get_unused_fd_flags() (or anon_inode_getfd()) with correct
flags.

This patch replaces calls to get_unused_fd() with equivalent call to
get_unused_fd_flags(0) to preserve current behavor for existing code.

The hard coded flag value (0) should be reviewed on a per-subsystem
basis, and, if possible, set to O_CLOEXEC.

Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
Link: http://lkml.kernel.org/r/cover.1388952061.git.ydroneaud@opteya.com
---
 arch/powerpc/platforms/cell/spufs/inode.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/platforms/cell/spufs/inode.c b/arch/powerpc/platforms/cell/spufs/inode.c
index 87ba7cf99cd7..51effcec30d8 100644
--- a/arch/powerpc/platforms/cell/spufs/inode.c
+++ b/arch/powerpc/platforms/cell/spufs/inode.c
@@ -301,7 +301,7 @@ static int spufs_context_open(struct path *path)
 	int ret;
 	struct file *filp;
 
-	ret = get_unused_fd();
+	ret = get_unused_fd_flags(0);
 	if (ret < 0)
 		return ret;
 
@@ -518,7 +518,7 @@ static int spufs_gang_open(struct path *path)
 	int ret;
 	struct file *filp;
 
-	ret = get_unused_fd();
+	ret = get_unused_fd_flags(0);
 	if (ret < 0)
 		return ret;
 
-- 
1.8.4.2


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCHv5 2/7] ppc/cell: use get_unused_fd_flags(0) instead of get_unused_fd()
@ 2014-01-05 20:36   ` Yann Droneaud
  0 siblings, 0 replies; 38+ messages in thread
From: Yann Droneaud @ 2014-01-05 20:36 UTC (permalink / raw)
  To: Jeremy Kerr, Arnd Bergmann, Benjamin Herrenschmidt, Paul Mackerras
  Cc: Yann Droneaud, cbe-oss-dev, linux-kernel, Al Viro, Andrew Morton,
	linuxppc-dev

Macro get_unused_fd() is used to allocate a file descriptor with
default flags. Those default flags (0) can be "unsafe":
O_CLOEXEC must be used by default to not leak file descriptor
across exec().

Instead of macro get_unused_fd(), function get_unused_fd_flags()
(or anon_inode_getfd()) should be used with flags given by userspace.
If not possible, flags should be set to O_CLOEXEC to provide userspace
with a default safe behavor.

In a further patch, get_unused_fd() will be removed so that new code
start using get_unused_fd_flags() (or anon_inode_getfd()) with correct
flags.

This patch replaces calls to get_unused_fd() with equivalent call to
get_unused_fd_flags(0) to preserve current behavor for existing code.

The hard coded flag value (0) should be reviewed on a per-subsystem
basis, and, if possible, set to O_CLOEXEC.

Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
Link: http://lkml.kernel.org/r/cover.1388952061.git.ydroneaud@opteya.com
---
 arch/powerpc/platforms/cell/spufs/inode.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/platforms/cell/spufs/inode.c b/arch/powerpc/platforms/cell/spufs/inode.c
index 87ba7cf99cd7..51effcec30d8 100644
--- a/arch/powerpc/platforms/cell/spufs/inode.c
+++ b/arch/powerpc/platforms/cell/spufs/inode.c
@@ -301,7 +301,7 @@ static int spufs_context_open(struct path *path)
 	int ret;
 	struct file *filp;
 
-	ret = get_unused_fd();
+	ret = get_unused_fd_flags(0);
 	if (ret < 0)
 		return ret;
 
@@ -518,7 +518,7 @@ static int spufs_gang_open(struct path *path)
 	int ret;
 	struct file *filp;
 
-	ret = get_unused_fd();
+	ret = get_unused_fd_flags(0);
 	if (ret < 0)
 		return ret;
 
-- 
1.8.4.2

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCHv5 3/7] binfmt_misc: use get_unused_fd_flags(0) instead of get_unused_fd()
  2014-01-05 20:36 ` Yann Droneaud
                   ` (3 preceding siblings ...)
  (?)
@ 2014-01-05 20:36 ` Yann Droneaud
  -1 siblings, 0 replies; 38+ messages in thread
From: Yann Droneaud @ 2014-01-05 20:36 UTC (permalink / raw)
  To: Al Viro; +Cc: Yann Droneaud, linux-fsdevel, linux-kernel, Andrew Morton

Macro get_unused_fd() is used to allocate a file descriptor with
default flags. Those default flags (0) can be "unsafe":
O_CLOEXEC must be used by default to not leak file descriptor
across exec().

Instead of macro get_unused_fd(), function get_unused_fd_flags()
(or anon_inode_getfd()) should be used with flags given by userspace.
If not possible, flags should be set to O_CLOEXEC to provide userspace
with a default safe behavor.

In a further patch, get_unused_fd() will be removed so that new code
start using get_unused_fd_flags() (or anon_inode_getfd()) with correct
flags.

This patch replaces calls to get_unused_fd() with equivalent call to
get_unused_fd_flags(0) to preserve current behavor for existing code.

The hard coded flag value (0) should be reviewed on a per-subsystem
basis, and, if possible, set to O_CLOEXEC.

Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
Link: http://lkml.kernel.org/r/cover.1388952061.git.ydroneaud@opteya.com
---
 fs/binfmt_misc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/binfmt_misc.c b/fs/binfmt_misc.c
index 1c740e152f38..052f6dc8e549 100644
--- a/fs/binfmt_misc.c
+++ b/fs/binfmt_misc.c
@@ -138,7 +138,7 @@ static int load_misc_binary(struct linux_binprm *bprm)
 		/* if the binary should be opened on behalf of the
 		 * interpreter than keep it open and assign descriptor
 		 * to it */
- 		fd_binary = get_unused_fd();
+		fd_binary = get_unused_fd_flags(0);
  		if (fd_binary < 0) {
  			retval = fd_binary;
  			goto _ret;
-- 
1.8.4.2


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCHv5 4/7] file: use get_unused_fd_flags(0) instead of get_unused_fd()
  2014-01-05 20:36 ` Yann Droneaud
                   ` (4 preceding siblings ...)
  (?)
@ 2014-01-05 20:36 ` Yann Droneaud
  -1 siblings, 0 replies; 38+ messages in thread
From: Yann Droneaud @ 2014-01-05 20:36 UTC (permalink / raw)
  To: Al Viro; +Cc: Yann Droneaud, linux-fsdevel, linux-kernel, Andrew Morton

Macro get_unused_fd() is used to allocate a file descriptor with
default flags. Those default flags (0) can be "unsafe":
O_CLOEXEC must be made the default to not leak file descriptor
across exec().

In a further patch, get_unused_fd() will be removed so that
new code start using get_unused_fd_flags() (or anon_inode_getfd())
with correct flags.

This patch replaces calls to get_unused_fd() with equivalent call to
get_unused_fd_flags(0).

Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
Link: http://lkml.kernel.org/r/cover.1388952061.git.ydroneaud@opteya.com
---
 fs/file.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/file.c b/fs/file.c
index 4a78f981557a..1420d2890b43 100644
--- a/fs/file.c
+++ b/fs/file.c
@@ -897,7 +897,7 @@ SYSCALL_DEFINE1(dup, unsigned int, fildes)
 	struct file *file = fget_raw(fildes);
 
 	if (file) {
-		ret = get_unused_fd();
+		ret = get_unused_fd_flags(0);
 		if (ret >= 0)
 			fd_install(ret, file);
 		else
-- 
1.8.4.2


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCHv5 5/7] fanotify: enable close-on-exec on events' fd when requested in fanotify_init()
  2014-01-05 20:36 ` Yann Droneaud
                   ` (5 preceding siblings ...)
  (?)
@ 2014-01-05 20:36 ` Yann Droneaud
  2014-01-20 17:15   ` Yann Droneaud
  -1 siblings, 1 reply; 38+ messages in thread
From: Yann Droneaud @ 2014-01-05 20:36 UTC (permalink / raw)
  To: Eric Paris; +Cc: Yann Droneaud, linux-kernel, Al Viro, Andrew Morton, stable

According to commit 80af258867648, file descriptor created as part
of file access notification events inherit flags from the event_f_flags
argument passed to syscall fanotify_init(2).

So while it is legal for userspace to call fanotify_init() with O_CLOEXEC
as part of its second argument, O_CLOEXEC is currently silently ignored.

Indeed event_f_flags are only given to dentry_open(), which only
seems to care about O_ACCMODE and O_PATH in do_dentry_open(), O_DIRECT
in open_check_o_direct() and O_LARGEFILE in generic_file_open().

More, there's no effective check on event_f_flags value that would
catch unknown / unsupported values, unlike the one on f_flags argument
of the syscall (see FAN_ALL_INIT_FLAGS in include/uapi/linux/fanotify.h).

Reading article "Botching up ioctls"[1] by Daniel Vetter might make one
feel uncomfortable when trying to add extension to an API that doesn't
check for unrecognized values.

But it seems logical to set close-on-exec flag on the file descriptor
if userspace is allowed to request it with O_CLOEXEC.

In fact, according to some lookup on http://codesearch.debian.net/ and
various search engine, there's already some userspace code requesting it:

- in systemd's readahead[2]:

    fanotify_fd = fanotify_init(FAN_CLOEXEC|FAN_NONBLOCK, O_RDONLY|O_LARGEFILE|O_CLOEXEC|O_NOATIME);

- in clsync[3]:

    #define FANOTIFY_EVFLAGS (O_LARGEFILE|O_RDONLY|O_CLOEXEC)

    int fanotify_d = fanotify_init(FANOTIFY_FLAGS, FANOTIFY_EVFLAGS);

- in examples [4] from "Filesystem monitoring in the Linux kernel"
  article[5] by Aleksander Morgado:

    if ((fanotify_fd = fanotify_init (FAN_CLOEXEC,
                                      O_RDONLY | O_CLOEXEC | O_LARGEFILE)) < 0)

Lookup also returned some wrong usage of the syscall:

- in Gonk HAL from Mozilla Firefox OS sources[6]:

    mFd = fanotify_init(FAN_CLASS_NOTIF, FAN_CLOEXEC);

Adding support for O_CLOEXEC in fanotify won't magically enable it for
Gonk since FAN_CLOEXEC is defined as 0x1, which is likely equal to
O_WRONLY when used in open flag context. In the other hand, it won't
hurt it either.

So this patch replaces call to macro get_unused_fd() by a call to
function get_unused_fd_flags() with event_f_flags value as argument.
This way O_CLOEXEC flag in the second argument of fanotify_init(2)
syscall is interpreted so that close-on-exec get enabled.

[1] http://blog.ffwll.ch/2013/11/botching-up-ioctls.html
[2] http://cgit.freedesktop.org/systemd/systemd/tree/src/readahead/readahead-collect.c?id=v208#n294
[3] https://github.com/xaionaro/clsync/blob/v0.2.1/sync.c#L1631
    https://github.com/xaionaro/clsync/blob/v0.2.1/configuration.h#L38
[4] http://www.lanedo.com/~aleksander/fanotify/fanotify-example.c
[5] http://www.lanedo.com/2013/filesystem-monitoring-linux-kernel/
[6] http://hg.mozilla.org/mozilla-central/file/325c74addeba/hal/gonk/GonkDiskSpaceWatcher.cpp#l167

Cc: Eric Paris <eparis@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: stable@vger.kernel.org
Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
Link: http://lkml.kernel.org/r/cover.1388952061.git.ydroneaud@opteya.com
---
 fs/notify/fanotify/fanotify_user.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
index 57d7c083cb4b..6d0eaabba02e 100644
--- a/fs/notify/fanotify/fanotify_user.c
+++ b/fs/notify/fanotify/fanotify_user.c
@@ -71,7 +71,7 @@ static int create_fd(struct fsnotify_group *group,
 
 	pr_debug("%s: group=%p event=%p\n", __func__, group, event);
 
-	client_fd = get_unused_fd();
+	client_fd = get_unused_fd_flags(group->fanotify_data.f_flags);
 	if (client_fd < 0)
 		return client_fd;
 
-- 
1.8.4.2


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCHv5 6/7] perf: introduce a flag to enable close-on-exec in perf_event_open()
  2014-01-05 20:36 ` Yann Droneaud
                   ` (6 preceding siblings ...)
  (?)
@ 2014-01-05 20:36 ` Yann Droneaud
  2014-01-06  9:29   ` Peter Zijlstra
  2014-01-12 18:43   ` [tip:perf/core] perf: Introduce a flag to enable close-on-exec in perf_event_open() tip-bot for Yann Droneaud
  -1 siblings, 2 replies; 38+ messages in thread
From: Yann Droneaud @ 2014-01-05 20:36 UTC (permalink / raw)
  To: Peter Zijlstra, Paul Mackerras, Ingo Molnar, Arnaldo Carvalho de Melo
  Cc: Yann Droneaud, linux-kernel, Al Viro, Andrew Morton, Peter Zijlstra

Unlike recent modern userspace API such as
  epoll_create1 (EPOLL_CLOEXEC), eventfd (EFD_CLOEXEC),
  fanotify_init (FAN_CLOEXEC), inotify_init1 (IN_CLOEXEC),
  signalfd (SFD_CLOEXEC), timerfd_create (TFD_CLOEXEC),
  or the venerable general purpose open (O_CLOEXEC),
perf_event_open() syscall lack a flag to atomically set FD_CLOEXEC
(eg. close-on-exec) flag on file descriptor it returns to userspace.

The present patch adds a PERF_FLAG_FD_CLOEXEC flag to allow
perf_event_open() syscall to atomically set close-on-exec.

Having this flag will enable userspace to remove the file descriptor
from the list of file descriptors being inherited across exec,
without the need to call fcntl(fd, F_SETFD, FD_CLOEXEC) and the
associated race condition between the current thread and another
thread calling fork(2) then execve(2).

Links:

 - Secure File Descriptor Handling (Ulrich Drepper, 2008)
   http://udrepper.livejournal.com/20407.html

 - Excuse me son, but your code is leaking !!! (Dan Walsh, March 2012)
   http://danwalsh.livejournal.com/53603.html

 - Notes in DMA buffer sharing: leak and security hole
   http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/Documentation/dma-buf-sharing.txt?id=v3.13-rc3#n428

Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
Link: http://lkml.kernel.org/r/1383168950%2d8933-1-git-send-email-ydroneaud@opteya.com
Link: http://lkml.kernel.org/r/1383170413.9171.10.camel@dworkin
Link: http://lkml.kernel.org/r/cover.1388952061.git.ydroneaud@opteya.com
---
 include/uapi/linux/perf_event.h |  1 +
 kernel/events/core.c            | 12 +++++++++---
 2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index 959d454f76a1..e244ed412745 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -725,6 +725,7 @@ enum perf_callchain_context {
 #define PERF_FLAG_FD_NO_GROUP		(1U << 0)
 #define PERF_FLAG_FD_OUTPUT		(1U << 1)
 #define PERF_FLAG_PID_CGROUP		(1U << 2) /* pid=cgroup id, per-cpu mode only */
+#define PERF_FLAG_FD_CLOEXEC		(1U << 3) /* O_CLOEXEC */
 
 union perf_mem_data_src {
 	__u64 val;
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 4b743b299124..86e6a49abf8e 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -119,7 +119,8 @@ static int cpu_function_call(int cpu, int (*func) (void *info), void *info)
 
 #define PERF_FLAG_ALL (PERF_FLAG_FD_NO_GROUP |\
 		       PERF_FLAG_FD_OUTPUT  |\
-		       PERF_FLAG_PID_CGROUP)
+		       PERF_FLAG_PID_CGROUP |\
+		       PERF_FLAG_FD_CLOEXEC)
 
 /*
  * branch priv levels that need permission checks
@@ -6995,6 +6996,7 @@ SYSCALL_DEFINE5(perf_event_open,
 	int event_fd;
 	int move_group = 0;
 	int err;
+	int f_flags = O_RDWR;
 
 	/* for future expandability... */
 	if (flags & ~PERF_FLAG_ALL)
@@ -7023,7 +7025,10 @@ SYSCALL_DEFINE5(perf_event_open,
 	if ((flags & PERF_FLAG_PID_CGROUP) && (pid == -1 || cpu == -1))
 		return -EINVAL;
 
-	event_fd = get_unused_fd();
+	if (flags & PERF_FLAG_FD_CLOEXEC)
+		f_flags |= O_CLOEXEC;
+
+	event_fd = get_unused_fd_flags(f_flags);
 	if (event_fd < 0)
 		return event_fd;
 
@@ -7145,7 +7150,8 @@ SYSCALL_DEFINE5(perf_event_open,
 			goto err_context;
 	}
 
-	event_file = anon_inode_getfile("[perf_event]", &perf_fops, event, O_RDWR);
+	event_file = anon_inode_getfile("[perf_event]", &perf_fops, event,
+					f_flags);
 	if (IS_ERR(event_file)) {
 		err = PTR_ERR(event_file);
 		goto err_context;
-- 
1.8.4.2


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCHv5 7/7] file: remove macro get_unused_fd()
  2014-01-05 20:36 ` Yann Droneaud
                   ` (7 preceding siblings ...)
  (?)
@ 2014-01-05 20:36 ` Yann Droneaud
  -1 siblings, 0 replies; 38+ messages in thread
From: Yann Droneaud @ 2014-01-05 20:36 UTC (permalink / raw)
  To: Al Viro; +Cc: Yann Droneaud, linux-fsdevel, linux-kernel, Andrew Morton

Macro get_unused_fd() allocates a file descriptor without enabling
close-on-exec: it calls function get_unused_fd_flags() without
O_CLOEXEC flag.

This can be seen as an unsafe default: in most case close-on-exec
should be enabled to not leak file descriptor across exec().

This patch removes get_unused_fd() instead of updating it to use
O_CLOEXEC so that out of tree modules won't be affect by a runtime
behavor change which might introduce other kind of bug. It's better
to catch the change at build time, making it easier to fix.

Removing the macro will also promote use of get_unused_fd_flags()
(or anon_inode_getfd()) with flags provided by userspace. Or, if
flags cannot be given by userspace, with flags set to O_CLOEXEC set
by default.

Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
Link: http://lkml.kernel.org/r/cover.1388952061.git.ydroneaud@opteya.com
---
 include/linux/file.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/include/linux/file.h b/include/linux/file.h
index cbacf4faf447..866600259c07 100644
--- a/include/linux/file.h
+++ b/include/linux/file.h
@@ -63,7 +63,6 @@ extern void set_close_on_exec(unsigned int fd, int flag);
 extern bool get_close_on_exec(unsigned int fd);
 extern void put_filp(struct file *);
 extern int get_unused_fd_flags(unsigned flags);
-#define get_unused_fd() get_unused_fd_flags(0)
 extern void put_unused_fd(unsigned int fd);
 
 extern void fd_install(unsigned int fd, struct file *file);
-- 
1.8.4.2


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [PATCHv5 6/7] perf: introduce a flag to enable close-on-exec in perf_event_open()
  2014-01-05 20:36 ` [PATCHv5 6/7] perf: introduce a flag to enable close-on-exec in perf_event_open() Yann Droneaud
@ 2014-01-06  9:29   ` Peter Zijlstra
  2014-01-06 10:51     ` [PATCH] perf tools: enable close-on-exec flag on perf file descriptor Yann Droneaud
  2014-01-12 18:43   ` [tip:perf/core] perf: Introduce a flag to enable close-on-exec in perf_event_open() tip-bot for Yann Droneaud
  1 sibling, 1 reply; 38+ messages in thread
From: Peter Zijlstra @ 2014-01-06  9:29 UTC (permalink / raw)
  To: Yann Droneaud
  Cc: Paul Mackerras, Ingo Molnar, Arnaldo Carvalho de Melo,
	linux-kernel, Al Viro, Andrew Morton

On Sun, Jan 05, 2014 at 09:36:33PM +0100, Yann Droneaud wrote:
> Unlike recent modern userspace API such as
>   epoll_create1 (EPOLL_CLOEXEC), eventfd (EFD_CLOEXEC),
>   fanotify_init (FAN_CLOEXEC), inotify_init1 (IN_CLOEXEC),
>   signalfd (SFD_CLOEXEC), timerfd_create (TFD_CLOEXEC),
>   or the venerable general purpose open (O_CLOEXEC),
> perf_event_open() syscall lack a flag to atomically set FD_CLOEXEC
> (eg. close-on-exec) flag on file descriptor it returns to userspace.
> 
> The present patch adds a PERF_FLAG_FD_CLOEXEC flag to allow
> perf_event_open() syscall to atomically set close-on-exec.
> 
> Having this flag will enable userspace to remove the file descriptor
> from the list of file descriptors being inherited across exec,
> without the need to call fcntl(fd, F_SETFD, FD_CLOEXEC) and the
> associated race condition between the current thread and another
> thread calling fork(2) then execve(2).
> 
> Links:
> 
>  - Secure File Descriptor Handling (Ulrich Drepper, 2008)
>    http://udrepper.livejournal.com/20407.html
> 
>  - Excuse me son, but your code is leaking !!! (Dan Walsh, March 2012)
>    http://danwalsh.livejournal.com/53603.html
> 
>  - Notes in DMA buffer sharing: leak and security hole
>    http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/Documentation/dma-buf-sharing.txt?id=v3.13-rc3#n428
> 
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Al Viro <viro@zeniv.linux.org.uk>
> Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>

Seems sane enough.

Thanks!

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH] perf tools: enable close-on-exec flag on perf file descriptor
  2014-01-06  9:29   ` Peter Zijlstra
@ 2014-01-06 10:51     ` Yann Droneaud
  2014-01-06 11:24       ` Peter Zijlstra
                         ` (2 more replies)
  0 siblings, 3 replies; 38+ messages in thread
From: Yann Droneaud @ 2014-01-06 10:51 UTC (permalink / raw)
  To: Peter Zijlstra, Paul Mackerras, Ingo Molnar,
	Arnaldo Carvalho de Melo, Jiri Olsa, Namhyung Kim, Andi Kleen,
	David Ahern, Frederic Weisbecker, Mike Galbraith,
	Stephane Eranian, Adrian Hunter, Benjamin Herrenschmidt,
	Michael Ellerman
  Cc: Yann Droneaud, linux-kernel, Peter Zijlstra

In a previous patch [1][2], flag PERF_FLAG_FD_CLOEXEC was
added to perf_event_open(2) syscall to allows userspace
to enable close-on-exec behavor atomically when creating
the file descriptor.

This patch makes perf tools use the new flag.

Beware that perf tools compiled with the new flag won't work
on older kernel which do not support flag PERF_FLAG_FD_CLOEXEC.

[1] http://lkml.kernel.org/r/8c03f54e1598b1727c19706f3af03f98685d9fe6.1388952061.git.ydroneaud@opteya.com
[2] https://patchwork.kernel.org/patch/3434971/

Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
Link: http://lkml.kernel.org/r/cover.1388952061.git.ydroneaud@opteya.com
---
 tools/perf/bench/mem-memcpy.c               | 3 ++-
 tools/perf/bench/mem-memset.c               | 3 ++-
 tools/perf/builtin-sched.c                  | 2 +-
 tools/perf/tests/bp_signal.c                | 2 +-
 tools/perf/tests/bp_signal_overflow.c       | 2 +-
 tools/perf/tests/rdpmc.c                    | 2 +-
 tools/perf/util/evsel.c                     | 4 ++--
 tools/perf/util/record.c                    | 9 ++++++---
 tools/testing/selftests/powerpc/pmu/event.c | 3 ++-
 9 files changed, 18 insertions(+), 12 deletions(-)

diff --git a/tools/perf/bench/mem-memcpy.c b/tools/perf/bench/mem-memcpy.c
index 5ce71d3b72cf..c1be6c17b382 100644
--- a/tools/perf/bench/mem-memcpy.c
+++ b/tools/perf/bench/mem-memcpy.c
@@ -83,7 +83,8 @@ static struct perf_event_attr cycle_attr = {
 
 static void init_cycle(void)
 {
-	cycle_fd = sys_perf_event_open(&cycle_attr, getpid(), -1, -1, 0);
+	cycle_fd = sys_perf_event_open(&cycle_attr, getpid(), -1, -1,
+				       PERF_FLAG_FD_CLOEXEC);
 
 	if (cycle_fd < 0 && errno == ENOSYS)
 		die("No CONFIG_PERF_EVENTS=y kernel support configured?\n");
diff --git a/tools/perf/bench/mem-memset.c b/tools/perf/bench/mem-memset.c
index 9af79d2b18e5..3b92b7da9cf2 100644
--- a/tools/perf/bench/mem-memset.c
+++ b/tools/perf/bench/mem-memset.c
@@ -83,7 +83,8 @@ static struct perf_event_attr cycle_attr = {
 
 static void init_cycle(void)
 {
-	cycle_fd = sys_perf_event_open(&cycle_attr, getpid(), -1, -1, 0);
+	cycle_fd = sys_perf_event_open(&cycle_attr, getpid(), -1, -1,
+				       PERF_FLAG_FD_CLOEXEC);
 
 	if (cycle_fd < 0 && errno == ENOSYS)
 		die("No CONFIG_PERF_EVENTS=y kernel support configured?\n");
diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index 0f3c65518a2c..933a15783a54 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -435,7 +435,7 @@ static int self_open_counters(void)
 	attr.type = PERF_TYPE_SOFTWARE;
 	attr.config = PERF_COUNT_SW_TASK_CLOCK;
 
-	fd = sys_perf_event_open(&attr, 0, -1, -1, 0);
+	fd = sys_perf_event_open(&attr, 0, -1, -1, PERF_FLAG_FD_CLOEXEC);
 
 	if (fd < 0)
 		pr_err("Error: sys_perf_event_open() syscall returned "
diff --git a/tools/perf/tests/bp_signal.c b/tools/perf/tests/bp_signal.c
index aba095489193..de33526ff9d6 100644
--- a/tools/perf/tests/bp_signal.c
+++ b/tools/perf/tests/bp_signal.c
@@ -78,7 +78,7 @@ static int bp_event(void *fn, int setup_signal)
 	pe.exclude_kernel = 1;
 	pe.exclude_hv = 1;
 
-	fd = sys_perf_event_open(&pe, 0, -1, -1, 0);
+	fd = sys_perf_event_open(&pe, 0, -1, -1, PERF_FLAG_FD_CLOEXEC);
 	if (fd < 0) {
 		pr_debug("failed opening event %llx\n", pe.config);
 		return TEST_FAIL;
diff --git a/tools/perf/tests/bp_signal_overflow.c b/tools/perf/tests/bp_signal_overflow.c
index 44ac82179708..e54bf622adb2 100644
--- a/tools/perf/tests/bp_signal_overflow.c
+++ b/tools/perf/tests/bp_signal_overflow.c
@@ -91,7 +91,7 @@ int test__bp_signal_overflow(void)
 	pe.exclude_kernel = 1;
 	pe.exclude_hv = 1;
 
-	fd = sys_perf_event_open(&pe, 0, -1, -1, 0);
+	fd = sys_perf_event_open(&pe, 0, -1, -1, PERF_FLAG_FD_CLOEXEC);
 	if (fd < 0) {
 		pr_debug("failed opening event %llx\n", pe.config);
 		return TEST_FAIL;
diff --git a/tools/perf/tests/rdpmc.c b/tools/perf/tests/rdpmc.c
index 46649c25fa5e..46664eca77af 100644
--- a/tools/perf/tests/rdpmc.c
+++ b/tools/perf/tests/rdpmc.c
@@ -104,7 +104,7 @@ static int __test__rdpmc(void)
 	sa.sa_sigaction = segfault_handler;
 	sigaction(SIGSEGV, &sa, NULL);
 
-	fd = sys_perf_event_open(&attr, 0, -1, -1, 0);
+	fd = sys_perf_event_open(&attr, 0, -1, -1, PERF_FLAG_FD_CLOEXEC);
 	if (fd < 0) {
 		pr_err("Error: sys_perf_event_open() syscall returned "
 		       "with %d (%s)\n", fd, strerror(errno));
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 01ff4cfde1f5..8f5d6bc2ad3e 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -970,7 +970,7 @@ static int __perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus,
 			      struct thread_map *threads)
 {
 	int cpu, thread;
-	unsigned long flags = 0;
+	unsigned long flags = PERF_FLAG_FD_CLOEXEC;
 	int pid = -1, err;
 	enum { NO_CHANGE, SET_TO_MAX, INCREASED_MAX } set_rlimit = NO_CHANGE;
 
@@ -979,7 +979,7 @@ static int __perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus,
 		return -ENOMEM;
 
 	if (evsel->cgrp) {
-		flags = PERF_FLAG_PID_CGROUP;
+		flags |= PERF_FLAG_PID_CGROUP;
 		pid = evsel->cgrp->fd;
 	}
 
diff --git a/tools/perf/util/record.c b/tools/perf/util/record.c
index e5104538c354..1146fa9003ab 100644
--- a/tools/perf/util/record.c
+++ b/tools/perf/util/record.c
@@ -22,14 +22,16 @@ static int perf_do_probe_api(setup_probe_fn_t fn, int cpu, const char *str)
 
 	evsel = perf_evlist__first(evlist);
 
-	fd = sys_perf_event_open(&evsel->attr, -1, cpu, -1, 0);
+	fd = sys_perf_event_open(&evsel->attr, -1, cpu, -1,
+				 PERF_FLAG_FD_CLOEXEC);
 	if (fd < 0)
 		goto out_delete;
 	close(fd);
 
 	fn(evsel);
 
-	fd = sys_perf_event_open(&evsel->attr, -1, cpu, -1, 0);
+	fd = sys_perf_event_open(&evsel->attr, -1, cpu, -1,
+				 PERF_FLAG_FD_CLOEXEC);
 	if (fd < 0) {
 		if (errno == EINVAL)
 			err = -EINVAL;
@@ -204,7 +206,8 @@ bool perf_evlist__can_select_event(struct perf_evlist *evlist, const char *str)
 		cpu = evlist->cpus->map[0];
 	}
 
-	fd = sys_perf_event_open(&evsel->attr, -1, cpu, -1, 0);
+	fd = sys_perf_event_open(&evsel->attr, -1, cpu, -1,
+				 PERF_FLAG_FD_CLOEXEC);
 	if (fd >= 0) {
 		close(fd);
 		ret = true;
diff --git a/tools/testing/selftests/powerpc/pmu/event.c b/tools/testing/selftests/powerpc/pmu/event.c
index 2b2d11df2450..a705e42b6132 100644
--- a/tools/testing/selftests/powerpc/pmu/event.c
+++ b/tools/testing/selftests/powerpc/pmu/event.c
@@ -45,7 +45,8 @@ void event_init_named(struct event *e, u64 config, char *name)
 
 int event_open_with_options(struct event *e, pid_t pid, int cpu, int group_fd)
 {
-	e->fd = perf_event_open(&e->attr, pid, cpu, group_fd, 0);
+	e->fd = perf_event_open(&e->attr, pid, cpu, group_fd,
+				PERF_FLAG_FD_CLOEXEC);
 	if (e->fd == -1) {
 		perror("perf_event_open");
 		return -1;
-- 
1.8.4.2


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [PATCH] perf tools: enable close-on-exec flag on perf file descriptor
  2014-01-06 10:51     ` [PATCH] perf tools: enable close-on-exec flag on perf file descriptor Yann Droneaud
@ 2014-01-06 11:24       ` Peter Zijlstra
  2014-01-06 14:43         ` Arnaldo Carvalho de Melo
  2014-01-06 14:22       ` Jiri Olsa
  2014-01-06 16:27       ` [PATCH] " Andi Kleen
  2 siblings, 1 reply; 38+ messages in thread
From: Peter Zijlstra @ 2014-01-06 11:24 UTC (permalink / raw)
  To: Yann Droneaud
  Cc: Paul Mackerras, Ingo Molnar, Arnaldo Carvalho de Melo, Jiri Olsa,
	Namhyung Kim, Andi Kleen, David Ahern, Frederic Weisbecker,
	Mike Galbraith, Stephane Eranian, Adrian Hunter,
	Benjamin Herrenschmidt, Michael Ellerman, linux-kernel

On Mon, Jan 06, 2014 at 11:51:25AM +0100, Yann Droneaud wrote:
> In a previous patch [1][2], flag PERF_FLAG_FD_CLOEXEC was
> added to perf_event_open(2) syscall to allows userspace
> to enable close-on-exec behavor atomically when creating
> the file descriptor.
> 
> This patch makes perf tools use the new flag.
> 
> Beware that perf tools compiled with the new flag won't work
> on older kernel which do not support flag PERF_FLAG_FD_CLOEXEC.
> 
> [1] http://lkml.kernel.org/r/8c03f54e1598b1727c19706f3af03f98685d9fe6.1388952061.git.ydroneaud@opteya.com
> [2] https://patchwork.kernel.org/patch/3434971/
> 
> Cc: Peter Zijlstra <peterz@infradead.org>
> Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
> Link: http://lkml.kernel.org/r/cover.1388952061.git.ydroneaud@opteya.com

acme, ACK? I was thinking I'd keep these two patches together so the
entire things lands in tip in one go?

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH] perf tools: enable close-on-exec flag on perf file descriptor
  2014-01-06 10:51     ` [PATCH] perf tools: enable close-on-exec flag on perf file descriptor Yann Droneaud
  2014-01-06 11:24       ` Peter Zijlstra
@ 2014-01-06 14:22       ` Jiri Olsa
  2014-01-06 15:31         ` Yann Droneaud
  2014-01-11 18:07         ` [PATCHv1] " Yann Droneaud
  2014-01-06 16:27       ` [PATCH] " Andi Kleen
  2 siblings, 2 replies; 38+ messages in thread
From: Jiri Olsa @ 2014-01-06 14:22 UTC (permalink / raw)
  To: Yann Droneaud
  Cc: Peter Zijlstra, Paul Mackerras, Ingo Molnar,
	Arnaldo Carvalho de Melo, Namhyung Kim, Andi Kleen, David Ahern,
	Frederic Weisbecker, Mike Galbraith, Stephane Eranian,
	Adrian Hunter, Benjamin Herrenschmidt, Michael Ellerman,
	linux-kernel, Peter Zijlstra

On Mon, Jan 06, 2014 at 11:51:25AM +0100, Yann Droneaud wrote:
> In a previous patch [1][2], flag PERF_FLAG_FD_CLOEXEC was
> added to perf_event_open(2) syscall to allows userspace
> to enable close-on-exec behavor atomically when creating
> the file descriptor.
> 
> This patch makes perf tools use the new flag.
> 
> Beware that perf tools compiled with the new flag won't work
> on older kernel which do not support flag PERF_FLAG_FD_CLOEXEC.

I think we should enhance the api probe routines (perf_do_probe_api)
to detect that, than just bypass us from running on older kernels

jirka

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH] perf tools: enable close-on-exec flag on perf file descriptor
  2014-01-06 11:24       ` Peter Zijlstra
@ 2014-01-06 14:43         ` Arnaldo Carvalho de Melo
  2014-01-06 21:01           ` Yann Droneaud
  0 siblings, 1 reply; 38+ messages in thread
From: Arnaldo Carvalho de Melo @ 2014-01-06 14:43 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Yann Droneaud, Paul Mackerras, Ingo Molnar, Jiri Olsa,
	Namhyung Kim, Andi Kleen, David Ahern, Frederic Weisbecker,
	Mike Galbraith, Stephane Eranian, Adrian Hunter,
	Benjamin Herrenschmidt, Michael Ellerman, linux-kernel

Em Mon, Jan 06, 2014 at 12:24:36PM +0100, Peter Zijlstra escreveu:
> On Mon, Jan 06, 2014 at 11:51:25AM +0100, Yann Droneaud wrote:
> > In a previous patch [1][2], flag PERF_FLAG_FD_CLOEXEC was
> > added to perf_event_open(2) syscall to allows userspace
> > to enable close-on-exec behavor atomically when creating
> > the file descriptor.

> > This patch makes perf tools use the new flag.

> > Beware that perf tools compiled with the new flag won't work
> > on older kernel which do not support flag PERF_FLAG_FD_CLOEXEC.

> > [1] http://lkml.kernel.org/r/8c03f54e1598b1727c19706f3af03f98685d9fe6.1388952061.git.ydroneaud@opteya.com
> > [2] https://patchwork.kernel.org/patch/3434971/

> acme, ACK? I was thinking I'd keep these two patches together so the
> entire things lands in tip in one go?

Nope, it should notice the EINVAL, drop the flag that doesn't work on
older kernels, retry, so that new tools continue to work on older
kernels, with yet another fallback.

Please take a look at __perf_evsel__open(), probably it will be best to
add a flag to perf_missing_features, like the one we have for the
perf_event_attr.mmap2 flag, so that we fail just once, etc.

- Arnaldo

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH] perf tools: enable close-on-exec flag on perf file descriptor
  2014-01-06 14:22       ` Jiri Olsa
@ 2014-01-06 15:31         ` Yann Droneaud
  2014-01-11 18:07         ` [PATCHv1] " Yann Droneaud
  1 sibling, 0 replies; 38+ messages in thread
From: Yann Droneaud @ 2014-01-06 15:31 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Peter Zijlstra, Paul Mackerras, Ingo Molnar,
	Arnaldo Carvalho de Melo, Namhyung Kim, Andi Kleen, David Ahern,
	Frederic Weisbecker, Mike Galbraith, Stephane Eranian,
	Adrian Hunter, Benjamin Herrenschmidt, Michael Ellerman,
	linux-kernel, Peter Zijlstra, Yann Droneaud

Hi,

Le lundi 06 janvier 2014 à 15:22 +0100, Jiri Olsa a écrit :
> On Mon, Jan 06, 2014 at 11:51:25AM +0100, Yann Droneaud wrote:
> > In a previous patch [1][2], flag PERF_FLAG_FD_CLOEXEC was
> > added to perf_event_open(2) syscall to allows userspace
> > to enable close-on-exec behavor atomically when creating
> > the file descriptor.
> > 
> > This patch makes perf tools use the new flag.
> > 
> > Beware that perf tools compiled with the new flag won't work
> > on older kernel which do not support flag PERF_FLAG_FD_CLOEXEC.
> 
> I think we should enhance the api probe routines (perf_do_probe_api)
> to detect that, than just bypass us from running on older kernels
> 

perf_probe_api() is only available in tools/perf/util/record.c module
(eg. it's a static function), while I've modified all calls to
perf_event_open().

So if you want to follow the probe path, a new function should be added.
This function must be used to retrieve the value of the
PERF_FLAG_FD_CLOEXEC flag if available.
For example:

static int cloexec = PERF_FLAG_FD_CLOEXEC;

int perf_flag_fd_cloexec(void)
{
    static int probed;

    if (!probed) {
        int fd = perf_event_open(&attr, 0, -1, -1,
PERF_FLAG_FD_CLOEXEC);
        probed = 1;
        if (fd >= 0)
            close(fd);
        else
            cloexec = 0;
     }

     return cloexec;
}

But I don't know how to setup a struct perf_event_attr which will work
in "all" case (and do no harm).

I was able to run this code with attr struct cleared (eg. set to 0), but
I don't know perf internals enough to be confident.

Regards.

-- 
Yann Droneaud
OPTEYA



^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH] perf tools: enable close-on-exec flag on perf file descriptor
  2014-01-06 10:51     ` [PATCH] perf tools: enable close-on-exec flag on perf file descriptor Yann Droneaud
  2014-01-06 11:24       ` Peter Zijlstra
  2014-01-06 14:22       ` Jiri Olsa
@ 2014-01-06 16:27       ` Andi Kleen
  2014-01-06 16:39         ` Peter Zijlstra
  2 siblings, 1 reply; 38+ messages in thread
From: Andi Kleen @ 2014-01-06 16:27 UTC (permalink / raw)
  To: Yann Droneaud
  Cc: Peter Zijlstra, Paul Mackerras, Ingo Molnar,
	Arnaldo Carvalho de Melo, Jiri Olsa, Namhyung Kim, David Ahern,
	Frederic Weisbecker, Mike Galbraith, Stephane Eranian,
	Adrian Hunter, Benjamin Herrenschmidt, Michael Ellerman,
	linux-kernel, Peter Zijlstra

On Mon, Jan 06, 2014 at 11:51:25AM +0100, Yann Droneaud wrote:
> In a previous patch [1][2], flag PERF_FLAG_FD_CLOEXEC was
> added to perf_event_open(2) syscall to allows userspace
> to enable close-on-exec behavor atomically when creating
> the file descriptor.
> 
> This patch makes perf tools use the new flag.

What is that good for? I can see why for a threaded program, but 
"perf" is not threaded.

-Andi

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH] perf tools: enable close-on-exec flag on perf file descriptor
  2014-01-06 16:27       ` [PATCH] " Andi Kleen
@ 2014-01-06 16:39         ` Peter Zijlstra
  2014-01-06 16:52           ` Andi Kleen
  0 siblings, 1 reply; 38+ messages in thread
From: Peter Zijlstra @ 2014-01-06 16:39 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Yann Droneaud, Paul Mackerras, Ingo Molnar,
	Arnaldo Carvalho de Melo, Jiri Olsa, Namhyung Kim, David Ahern,
	Frederic Weisbecker, Mike Galbraith, Stephane Eranian,
	Adrian Hunter, Benjamin Herrenschmidt, Michael Ellerman,
	linux-kernel

On Mon, Jan 06, 2014 at 08:27:54AM -0800, Andi Kleen wrote:
> On Mon, Jan 06, 2014 at 11:51:25AM +0100, Yann Droneaud wrote:
> > In a previous patch [1][2], flag PERF_FLAG_FD_CLOEXEC was
> > added to perf_event_open(2) syscall to allows userspace
> > to enable close-on-exec behavor atomically when creating
> > the file descriptor.
> > 
> > This patch makes perf tools use the new flag.
> 
> What is that good for? I can see why for a threaded program, but 
> "perf" is not threaded.

AFAICT its got nothing to do with threaded or not, but only with exec()
and we do in fact call exec() quite a lot in perf.

It ensures we do not leak open perf FDs into our child processes. Now
I'm not entirely sure how we do the exec these days but I think we were
good about not not leaking them anyway, but more paranoia never really
hurts.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH] perf tools: enable close-on-exec flag on perf file descriptor
  2014-01-06 16:39         ` Peter Zijlstra
@ 2014-01-06 16:52           ` Andi Kleen
  2014-01-06 17:15             ` Yann Droneaud
  0 siblings, 1 reply; 38+ messages in thread
From: Andi Kleen @ 2014-01-06 16:52 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Yann Droneaud, Paul Mackerras, Ingo Molnar,
	Arnaldo Carvalho de Melo, Jiri Olsa, Namhyung Kim, David Ahern,
	Frederic Weisbecker, Mike Galbraith, Stephane Eranian,
	Adrian Hunter, Benjamin Herrenschmidt, Michael Ellerman,
	linux-kernel

> AFAICT its got nothing to do with threaded or not, but only with exec()
> and we do in fact call exec() quite a lot in perf.
> 
> It ensures we do not leak open perf FDs into our child processes. Now
> I'm not entirely sure how we do the exec these days but I think we were
> good about not not leaking them anyway, but more paranoia never really
> hurts.

Then you can just set it with fcntl, which works everywhere, 
instead of doing extra feature tests.

The atomic setup is only needed for threaded programs to avoid
races with other threads doing exec.

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH] perf tools: enable close-on-exec flag on perf file descriptor
  2014-01-06 16:52           ` Andi Kleen
@ 2014-01-06 17:15             ` Yann Droneaud
  0 siblings, 0 replies; 38+ messages in thread
From: Yann Droneaud @ 2014-01-06 17:15 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Peter Zijlstra, Paul Mackerras, Ingo Molnar,
	Arnaldo Carvalho de Melo, Jiri Olsa, Namhyung Kim, David Ahern,
	Frederic Weisbecker, Mike Galbraith, Stephane Eranian,
	Adrian Hunter, Benjamin Herrenschmidt, Michael Ellerman,
	linux-kernel, Yann Droneaud

Hi,

Le lundi 06 janvier 2014 à 08:52 -0800, Andi Kleen a écrit :
> > AFAICT its got nothing to do with threaded or not, but only with exec()
> > and we do in fact call exec() quite a lot in perf.
> > 
> > It ensures we do not leak open perf FDs into our child processes. Now
> > I'm not entirely sure how we do the exec these days but I think we were
> > good about not not leaking them anyway, but more paranoia never really
> > hurts.
> 
> Then you can just set it with fcntl, which works everywhere, 
> instead of doing extra feature tests.
> 
> The atomic setup is only needed for threaded programs to avoid
> races with other threads doing exec.
> 

True.

The purpose of this patch was more about exercising the new flag
proposed for perf_event_open() as a way to demonstrate its usage.
Having it used in the perf code base will likely help to spread it to
other tools based on perf event API. 

Trying to add a close-on-exec enable flag late emphasis it should have
been done earlier and perhaps made the default.

As I wrote in "[PATCH v5 0/7] Getting rid of get_unused_fd() / enable
close-on-exec" [1], setting close-on-exec at open time is easier to
write for the programmer and less error prone than trying to add call to
fcntl() and forget to add it in some code path. (forgetting to add the
proper flag to open() is probably more error prone ... so one might want
to make close-on-exec the *default* for any new API).

[1] http://lkml.kernel.org/r/cover.1388952061.git.ydroneaud@opteya.com

Regards.

-- 
Yann Droneaud
OPTEYA



^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH] perf tools: enable close-on-exec flag on perf file descriptor
  2014-01-06 14:43         ` Arnaldo Carvalho de Melo
@ 2014-01-06 21:01           ` Yann Droneaud
  2014-01-06 21:14             ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 38+ messages in thread
From: Yann Droneaud @ 2014-01-06 21:01 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Peter Zijlstra, Paul Mackerras, Ingo Molnar, Jiri Olsa,
	Namhyung Kim, Andi Kleen, David Ahern, Frederic Weisbecker,
	Mike Galbraith, Stephane Eranian, Adrian Hunter,
	Benjamin Herrenschmidt, Michael Ellerman, linux-kernel,
	Yann Droneaud

Hi,

Le lundi 06 janvier 2014 à 11:43 -0300, Arnaldo Carvalho de Melo a
écrit :
> Em Mon, Jan 06, 2014 at 12:24:36PM +0100, Peter Zijlstra escreveu:

> > acme, ACK? I was thinking I'd keep these two patches together so the
> > entire things lands in tip in one go?
> 
> Nope, it should notice the EINVAL, drop the flag that doesn't work on
> older kernels, retry, so that new tools continue to work on older
> kernels, with yet another fallback.
> 

But it may be difficult to distinguish the 'origin' of EINVAL: is it
from FD_CLOEXEC flag or from an unsupported parameter in the attributes.

> Please take a look at __perf_evsel__open(), probably it will be best to
> add a flag to perf_missing_features, like the one we have for the
> perf_event_attr.mmap2 flag, so that we fail just once, etc.
> 

Unfortunately perf_event_open() is called in multiple places, not only
in __perf_evsel__open(). So a more generic solution should be designed.

Is something like the function proposed in message
<1389022310.13828.9.camel@localhost.localdomain> [1]
ok to be added to its own module:

static int cloexec = PERF_FLAG_FD_CLOEXEC;

int perf_flag_fd_cloexec(void)
{
    static int probed;

    if (!probed) {
        struct perf_event_attr attr = { 0 };
        int fd = perf_event_open(&attr, 0, -1, -1,
PERF_FLAG_FD_CLOEXEC);
        probed = 1;
        if (fd >= 0)
            close(fd);
        else
            cloexec = 0;
     }

     return cloexec;
}

This function should be used to build the flag passed to
perf_event_open().

Regards.

[1] 
http://lkml.kernel.org/r/1389022310.13828.9.camel@localhost.localdomain

-- 
Yann Droneaud
OPTEYA



^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH] perf tools: enable close-on-exec flag on perf file descriptor
  2014-01-06 21:01           ` Yann Droneaud
@ 2014-01-06 21:14             ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 38+ messages in thread
From: Arnaldo Carvalho de Melo @ 2014-01-06 21:14 UTC (permalink / raw)
  To: Yann Droneaud
  Cc: Peter Zijlstra, Paul Mackerras, Ingo Molnar, Jiri Olsa,
	Namhyung Kim, Andi Kleen, David Ahern, Frederic Weisbecker,
	Mike Galbraith, Stephane Eranian, Adrian Hunter,
	Benjamin Herrenschmidt, Michael Ellerman, linux-kernel

Em Mon, Jan 06, 2014 at 10:01:35PM +0100, Yann Droneaud escreveu:
> Hi,
> 
> Le lundi 06 janvier 2014 à 11:43 -0300, Arnaldo Carvalho de Melo a
> écrit :
> > Em Mon, Jan 06, 2014 at 12:24:36PM +0100, Peter Zijlstra escreveu:
> 
> > > acme, ACK? I was thinking I'd keep these two patches together so the
> > > entire things lands in tip in one go?
> > 
> > Nope, it should notice the EINVAL, drop the flag that doesn't work on
> > older kernels, retry, so that new tools continue to work on older
> > kernels, with yet another fallback.
> > 
> 
> But it may be difficult to distinguish the 'origin' of EINVAL: is it
> from FD_CLOEXEC flag or from an unsupported parameter in the attributes.

It definitely is, that is why the first thing to fall back is the most
recently added feature.
 
> > Please take a look at __perf_evsel__open(), probably it will be best to
> > add a flag to perf_missing_features, like the one we have for the
> > perf_event_attr.mmap2 flag, so that we fail just once, etc.
> > 
> 
> Unfortunately perf_event_open() is called in multiple places, not only
> in __perf_evsel__open(). So a more generic solution should be designed.
> 
> Is something like the function proposed in message
> <1389022310.13828.9.camel@localhost.localdomain> [1]
> ok to be added to its own module:

Well, at least make it set a perf_missing_features flag instead of that
'cloexec' global, that way the routines that use perf_evsel__open would
reuse it.

- Arnaldo
 
> static int cloexec = PERF_FLAG_FD_CLOEXEC;
> 
> int perf_flag_fd_cloexec(void)
> {
>     static int probed;
> 
>     if (!probed) {
>         struct perf_event_attr attr = { 0 };
>         int fd = perf_event_open(&attr, 0, -1, -1,
> PERF_FLAG_FD_CLOEXEC);
>         probed = 1;
>         if (fd >= 0)
>             close(fd);
>         else
>             cloexec = 0;
>      }
> 
>      return cloexec;
> }
> 
> This function should be used to build the flag passed to
> perf_event_open().
> 
> Regards.
> 
> [1] 
> http://lkml.kernel.org/r/1389022310.13828.9.camel@localhost.localdomain
> 
> -- 
> Yann Droneaud
> OPTEYA
> 

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCHv1] perf tools: enable close-on-exec flag on perf file descriptor
  2014-01-06 14:22       ` Jiri Olsa
  2014-01-06 15:31         ` Yann Droneaud
@ 2014-01-11 18:07         ` Yann Droneaud
  2014-01-13 10:09           ` [PATCHv2] " Yann Droneaud
  1 sibling, 1 reply; 38+ messages in thread
From: Yann Droneaud @ 2014-01-11 18:07 UTC (permalink / raw)
  To: Peter Zijlstra, Paul Mackerras, Ingo Molnar,
	Arnaldo Carvalho de Melo, Jiri Olsa, Namhyung Kim, Andi Kleen,
	David Ahern, Frederic Weisbecker, Mike Galbraith,
	Stephane Eranian, Adrian Hunter, Benjamin Herrenschmidt,
	Michael Ellerman
  Cc: linux-kernel, Yann Droneaud, Peter Zijlstra

In a previous patch [1], flag PERF_FLAG_FD_CLOEXEC was
added to perf_event_open(2) syscall to allows userspace
to enable close-on-exec behavor atomically when creating
the file descriptor.

This patch makes perf tool use the new flag if supported
by the kernel, so that the event file descriptors got automatically
closed if perf tool exec a sub-command.

Changes from v0 [2]:
- add probing for PERF_FLAG_FD_CLOEXEC flag allowing
  perf to run on older kernel:
  * use "missing feature" pattern in evsel to disable
    PERF_FLAG_FD_CLOEXEC in perf_evsel__open() if not
    supported by kernel;
  * add a dedicated function to probe for
    PERF_FLAG_FD_CLOEXEC support in the current kernel.
    This function is to be used by other functions
    calling sys_perf_event_open() directly.
- remove PERF_FLAG_FD_CLOEXEC from PowerPC selftest
  as it's not related to perf tool: it should be part
  of a separate patch.

[1] http://lkml.kernel.org/r/8c03f54e1598b1727c19706f3af03f98685d9fe6.1388952061.git.ydroneaud@opteya.com
    https://patchwork.kernel.org/patch/3434971/
[2] http://lkml.kernel.org/r/1389005485-12778-1-git-send-email-ydroneaud@opteya.com
    https://patchwork.kernel.org/patch/3437111/

Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
Link: http://lkml.kernel.org/r/cover.1388952061.git.ydroneaud@opteya.com
---
 tools/perf/Makefile.perf              |  1 +
 tools/perf/bench/mem-memcpy.c         |  4 ++-
 tools/perf/bench/mem-memset.c         |  4 ++-
 tools/perf/builtin-sched.c            |  4 ++-
 tools/perf/tests/bp_signal.c          |  4 ++-
 tools/perf/tests/bp_signal_overflow.c |  4 ++-
 tools/perf/tests/rdpmc.c              |  4 ++-
 tools/perf/util/cloexec.c             | 54 +++++++++++++++++++++++++++++++++++
 tools/perf/util/cloexec.h             |  6 ++++
 tools/perf/util/evsel.c               | 12 ++++++--
 tools/perf/util/record.c              | 10 +++++--
 11 files changed, 95 insertions(+), 12 deletions(-)
 create mode 100644 tools/perf/util/cloexec.c
 create mode 100644 tools/perf/util/cloexec.h

diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index 97a2145e4cc6..0428b11402b8 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -372,6 +372,7 @@ LIB_OBJS += $(OUTPUT)util/stat.o
 LIB_OBJS += $(OUTPUT)util/record.o
 LIB_OBJS += $(OUTPUT)util/srcline.o
 LIB_OBJS += $(OUTPUT)util/data.o
+LIB_OBJS += $(OUTPUT)util/cloexec.o
 
 LIB_OBJS += $(OUTPUT)ui/setup.o
 LIB_OBJS += $(OUTPUT)ui/helpline.o
diff --git a/tools/perf/bench/mem-memcpy.c b/tools/perf/bench/mem-memcpy.c
index 5ce71d3b72cf..bf5a21b848a9 100644
--- a/tools/perf/bench/mem-memcpy.c
+++ b/tools/perf/bench/mem-memcpy.c
@@ -10,6 +10,7 @@
 #include "../util/util.h"
 #include "../util/parse-options.h"
 #include "../util/header.h"
+#include "../util/cloexec.h"
 #include "bench.h"
 #include "mem-memcpy-arch.h"
 
@@ -83,7 +84,8 @@ static struct perf_event_attr cycle_attr = {
 
 static void init_cycle(void)
 {
-	cycle_fd = sys_perf_event_open(&cycle_attr, getpid(), -1, -1, 0);
+	cycle_fd = sys_perf_event_open(&cycle_attr, getpid(), -1, -1,
+				       perf_event_open_cloexec_flag());
 
 	if (cycle_fd < 0 && errno == ENOSYS)
 		die("No CONFIG_PERF_EVENTS=y kernel support configured?\n");
diff --git a/tools/perf/bench/mem-memset.c b/tools/perf/bench/mem-memset.c
index 9af79d2b18e5..260747ea1e0e 100644
--- a/tools/perf/bench/mem-memset.c
+++ b/tools/perf/bench/mem-memset.c
@@ -10,6 +10,7 @@
 #include "../util/util.h"
 #include "../util/parse-options.h"
 #include "../util/header.h"
+#include "../util/cloexec.h"
 #include "bench.h"
 #include "mem-memset-arch.h"
 
@@ -83,7 +84,8 @@ static struct perf_event_attr cycle_attr = {
 
 static void init_cycle(void)
 {
-	cycle_fd = sys_perf_event_open(&cycle_attr, getpid(), -1, -1, 0);
+	cycle_fd = sys_perf_event_open(&cycle_attr, getpid(), -1, -1,
+				       perf_event_open_cloexec_flag());
 
 	if (cycle_fd < 0 && errno == ENOSYS)
 		die("No CONFIG_PERF_EVENTS=y kernel support configured?\n");
diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index 0f3c65518a2c..bce5d4efea9c 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -10,6 +10,7 @@
 #include "util/header.h"
 #include "util/session.h"
 #include "util/tool.h"
+#include "util/cloexec.h"
 
 #include "util/parse-options.h"
 #include "util/trace-event.h"
@@ -435,7 +436,8 @@ static int self_open_counters(void)
 	attr.type = PERF_TYPE_SOFTWARE;
 	attr.config = PERF_COUNT_SW_TASK_CLOCK;
 
-	fd = sys_perf_event_open(&attr, 0, -1, -1, 0);
+	fd = sys_perf_event_open(&attr, 0, -1, -1,
+				 perf_event_open_cloexec_flag());
 
 	if (fd < 0)
 		pr_err("Error: sys_perf_event_open() syscall returned "
diff --git a/tools/perf/tests/bp_signal.c b/tools/perf/tests/bp_signal.c
index aba095489193..fdc0d3e185f9 100644
--- a/tools/perf/tests/bp_signal.c
+++ b/tools/perf/tests/bp_signal.c
@@ -25,6 +25,7 @@
 #include "tests.h"
 #include "debug.h"
 #include "perf.h"
+#include "../util/cloexec.h"
 
 static int fd1;
 static int fd2;
@@ -78,7 +79,8 @@ static int bp_event(void *fn, int setup_signal)
 	pe.exclude_kernel = 1;
 	pe.exclude_hv = 1;
 
-	fd = sys_perf_event_open(&pe, 0, -1, -1, 0);
+	fd = sys_perf_event_open(&pe, 0, -1, -1,
+				 perf_event_open_cloexec_flag());
 	if (fd < 0) {
 		pr_debug("failed opening event %llx\n", pe.config);
 		return TEST_FAIL;
diff --git a/tools/perf/tests/bp_signal_overflow.c b/tools/perf/tests/bp_signal_overflow.c
index 44ac82179708..b0b17415f18c 100644
--- a/tools/perf/tests/bp_signal_overflow.c
+++ b/tools/perf/tests/bp_signal_overflow.c
@@ -24,6 +24,7 @@
 #include "tests.h"
 #include "debug.h"
 #include "perf.h"
+#include "../util/cloexec.h"
 
 static int overflows;
 
@@ -91,7 +92,8 @@ int test__bp_signal_overflow(void)
 	pe.exclude_kernel = 1;
 	pe.exclude_hv = 1;
 
-	fd = sys_perf_event_open(&pe, 0, -1, -1, 0);
+	fd = sys_perf_event_open(&pe, 0, -1, -1,
+				 perf_event_open_cloexec_flag());
 	if (fd < 0) {
 		pr_debug("failed opening event %llx\n", pe.config);
 		return TEST_FAIL;
diff --git a/tools/perf/tests/rdpmc.c b/tools/perf/tests/rdpmc.c
index 46649c25fa5e..c1e55ff18774 100644
--- a/tools/perf/tests/rdpmc.c
+++ b/tools/perf/tests/rdpmc.c
@@ -6,6 +6,7 @@
 #include "perf.h"
 #include "debug.h"
 #include "tests.h"
+#include "../util/cloexec.h"
 
 #if defined(__x86_64__) || defined(__i386__)
 
@@ -104,7 +105,8 @@ static int __test__rdpmc(void)
 	sa.sa_sigaction = segfault_handler;
 	sigaction(SIGSEGV, &sa, NULL);
 
-	fd = sys_perf_event_open(&attr, 0, -1, -1, 0);
+	fd = sys_perf_event_open(&attr, 0, -1, -1,
+				 perf_event_open_cloexec_flag());
 	if (fd < 0) {
 		pr_err("Error: sys_perf_event_open() syscall returned "
 		       "with %d (%s)\n", fd, strerror(errno));
diff --git a/tools/perf/util/cloexec.c b/tools/perf/util/cloexec.c
new file mode 100644
index 000000000000..3889a7b9e2d6
--- /dev/null
+++ b/tools/perf/util/cloexec.c
@@ -0,0 +1,54 @@
+#include "util.h"
+#include "../perf.h"
+#include "cloexec.h"
+
+static unsigned long flag = PERF_FLAG_FD_CLOEXEC;
+
+static int perf_flag_probe(void)
+{
+	struct perf_event_attr attr;
+	int fd;
+	int err;
+
+	/* check cloexec flag */
+	memset(&attr, 0, sizeof(attr));
+	fd = sys_perf_event_open(&attr, 0, -1, -1,
+				 PERF_FLAG_FD_CLOEXEC);
+	if (fd >= 0) {
+		close(fd);
+		return 1;
+	}
+
+	if (errno != EINVAL) {
+		err = errno;
+		pr_warning("sys_perf_event_open() syscall returned "
+			   "%d (%d: %s)\n", fd, err, strerror(err));
+	}
+
+	/* not supported, confirm error related to PERF_FLAG_FD_CLOEXEC */
+	memset(&attr, 0, sizeof(attr));
+	fd = sys_perf_event_open(&attr, 0, -1, -1, 0);
+	if (fd >= 0) {
+		close(fd);
+		return 0;
+	}
+
+	err = errno;
+	die("sys_perf_event_open() syscall returned "
+	    "%d (%d: %s)\n", fd, err, strerror(err));
+
+	return -1;
+}
+
+unsigned long perf_event_open_cloexec_flag(void)
+{
+	static int probed;
+
+	if (!probed) {
+		if (perf_flag_probe() <= 0) {
+			flag = 0;
+		}
+	}
+
+	return flag;
+}
diff --git a/tools/perf/util/cloexec.h b/tools/perf/util/cloexec.h
new file mode 100644
index 000000000000..94a5a7d829d5
--- /dev/null
+++ b/tools/perf/util/cloexec.h
@@ -0,0 +1,6 @@
+#ifndef __PERF_CLOEXEC_H
+#define __PERF_CLOEXEC_H
+
+unsigned long perf_event_open_cloexec_flag(void);
+
+#endif /* __PERF_CLOEXEC_H */
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 01ff4cfde1f5..19953fc7f90e 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -29,6 +29,7 @@ static struct {
 	bool sample_id_all;
 	bool exclude_guest;
 	bool mmap2;
+	bool cloexec;
 } perf_missing_features;
 
 #define FD(e, x, y) (*(int *)xyarray__entry(e->fd, x, y))
@@ -970,7 +971,7 @@ static int __perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus,
 			      struct thread_map *threads)
 {
 	int cpu, thread;
-	unsigned long flags = 0;
+	unsigned long flags = PERF_FLAG_FD_CLOEXEC;
 	int pid = -1, err;
 	enum { NO_CHANGE, SET_TO_MAX, INCREASED_MAX } set_rlimit = NO_CHANGE;
 
@@ -979,11 +980,13 @@ static int __perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus,
 		return -ENOMEM;
 
 	if (evsel->cgrp) {
-		flags = PERF_FLAG_PID_CGROUP;
+		flags |= PERF_FLAG_PID_CGROUP;
 		pid = evsel->cgrp->fd;
 	}
 
 fallback_missing_features:
+	if (perf_missing_features.cloexec)
+		flags &= ~(unsigned long)PERF_FLAG_FD_CLOEXEC;
 	if (perf_missing_features.mmap2)
 		evsel->attr.mmap2 = 0;
 	if (perf_missing_features.exclude_guest)
@@ -1052,7 +1055,10 @@ try_fallback:
 	if (err != -EINVAL || cpu > 0 || thread > 0)
 		goto out_close;
 
-	if (!perf_missing_features.mmap2 && evsel->attr.mmap2) {
+	if (!perf_missing_features.cloexec && (flags & PERF_FLAG_FD_CLOEXEC)) {
+		perf_missing_features.cloexec = true;
+		goto fallback_missing_features;
+	} else if (!perf_missing_features.mmap2 && evsel->attr.mmap2) {
 		perf_missing_features.mmap2 = true;
 		goto fallback_missing_features;
 	} else if (!perf_missing_features.exclude_guest &&
diff --git a/tools/perf/util/record.c b/tools/perf/util/record.c
index e5104538c354..d51db8341437 100644
--- a/tools/perf/util/record.c
+++ b/tools/perf/util/record.c
@@ -4,6 +4,7 @@
 #include "parse-events.h"
 #include "fs.h"
 #include "util.h"
+#include "cloexec.h"
 
 typedef void (*setup_probe_fn_t)(struct perf_evsel *evsel);
 
@@ -22,14 +23,16 @@ static int perf_do_probe_api(setup_probe_fn_t fn, int cpu, const char *str)
 
 	evsel = perf_evlist__first(evlist);
 
-	fd = sys_perf_event_open(&evsel->attr, -1, cpu, -1, 0);
+	fd = sys_perf_event_open(&evsel->attr, -1, cpu, -1,
+				 perf_event_open_cloexec_flag());
 	if (fd < 0)
 		goto out_delete;
 	close(fd);
 
 	fn(evsel);
 
-	fd = sys_perf_event_open(&evsel->attr, -1, cpu, -1, 0);
+	fd = sys_perf_event_open(&evsel->attr, -1, cpu, -1,
+				 perf_event_open_cloexec_flag());
 	if (fd < 0) {
 		if (errno == EINVAL)
 			err = -EINVAL;
@@ -204,7 +207,8 @@ bool perf_evlist__can_select_event(struct perf_evlist *evlist, const char *str)
 		cpu = evlist->cpus->map[0];
 	}
 
-	fd = sys_perf_event_open(&evsel->attr, -1, cpu, -1, 0);
+	fd = sys_perf_event_open(&evsel->attr, -1, cpu, -1,
+				 perf_event_open_cloexec_flag());
 	if (fd >= 0) {
 		close(fd);
 		ret = true;
-- 
1.8.4.2


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [tip:perf/core] perf: Introduce a flag to enable close-on-exec in perf_event_open()
  2014-01-05 20:36 ` [PATCHv5 6/7] perf: introduce a flag to enable close-on-exec in perf_event_open() Yann Droneaud
  2014-01-06  9:29   ` Peter Zijlstra
@ 2014-01-12 18:43   ` tip-bot for Yann Droneaud
  1 sibling, 0 replies; 38+ messages in thread
From: tip-bot for Yann Droneaud @ 2014-01-12 18:43 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, paulus, hpa, mingo, torvalds, peterz, acme,
	ydroneaud, viro, akpm, tglx

Commit-ID:  a21b0b354d4ac39be691f51c53562e2c24443d9e
Gitweb:     http://git.kernel.org/tip/a21b0b354d4ac39be691f51c53562e2c24443d9e
Author:     Yann Droneaud <ydroneaud@opteya.com>
AuthorDate: Sun, 5 Jan 2014 21:36:33 +0100
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Sun, 12 Jan 2014 10:16:59 +0100

perf: Introduce a flag to enable close-on-exec in perf_event_open()

Unlike recent modern userspace API such as:

  epoll_create1 (EPOLL_CLOEXEC), eventfd (EFD_CLOEXEC),
  fanotify_init (FAN_CLOEXEC), inotify_init1 (IN_CLOEXEC),
  signalfd (SFD_CLOEXEC), timerfd_create (TFD_CLOEXEC),
  or the venerable general purpose open (O_CLOEXEC),

perf_event_open() syscall lack a flag to atomically set FD_CLOEXEC
(eg. close-on-exec) flag on file descriptor it returns to userspace.

The present patch adds a PERF_FLAG_FD_CLOEXEC flag to allow
perf_event_open() syscall to atomically set close-on-exec.

Having this flag will enable userspace to remove the file descriptor
from the list of file descriptors being inherited across exec,
without the need to call fcntl(fd, F_SETFD, FD_CLOEXEC) and the
associated race condition between the current thread and another
thread calling fork(2) then execve(2).

Links:

 - Secure File Descriptor Handling (Ulrich Drepper, 2008)
   http://udrepper.livejournal.com/20407.html

 - Excuse me son, but your code is leaking !!! (Dan Walsh, March 2012)
   http://danwalsh.livejournal.com/53603.html

 - Notes in DMA buffer sharing: leak and security hole
   http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/Documentation/dma-buf-sharing.txt?id=v3.13-rc3#n428

Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/8c03f54e1598b1727c19706f3af03f98685d9fe6.1388952061.git.ydroneaud@opteya.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 include/uapi/linux/perf_event.h |  1 +
 kernel/events/core.c            | 12 +++++++++---
 2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index e1802d6..ca018b4 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -724,6 +724,7 @@ enum perf_callchain_context {
 #define PERF_FLAG_FD_NO_GROUP		(1U << 0)
 #define PERF_FLAG_FD_OUTPUT		(1U << 1)
 #define PERF_FLAG_PID_CGROUP		(1U << 2) /* pid=cgroup id, per-cpu mode only */
+#define PERF_FLAG_FD_CLOEXEC		(1U << 3) /* O_CLOEXEC */
 
 union perf_mem_data_src {
 	__u64 val;
diff --git a/kernel/events/core.c b/kernel/events/core.c
index c3b6c27..5c87264 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -119,7 +119,8 @@ static int cpu_function_call(int cpu, int (*func) (void *info), void *info)
 
 #define PERF_FLAG_ALL (PERF_FLAG_FD_NO_GROUP |\
 		       PERF_FLAG_FD_OUTPUT  |\
-		       PERF_FLAG_PID_CGROUP)
+		       PERF_FLAG_PID_CGROUP |\
+		       PERF_FLAG_FD_CLOEXEC)
 
 /*
  * branch priv levels that need permission checks
@@ -6982,6 +6983,7 @@ SYSCALL_DEFINE5(perf_event_open,
 	int event_fd;
 	int move_group = 0;
 	int err;
+	int f_flags = O_RDWR;
 
 	/* for future expandability... */
 	if (flags & ~PERF_FLAG_ALL)
@@ -7010,7 +7012,10 @@ SYSCALL_DEFINE5(perf_event_open,
 	if ((flags & PERF_FLAG_PID_CGROUP) && (pid == -1 || cpu == -1))
 		return -EINVAL;
 
-	event_fd = get_unused_fd();
+	if (flags & PERF_FLAG_FD_CLOEXEC)
+		f_flags |= O_CLOEXEC;
+
+	event_fd = get_unused_fd_flags(f_flags);
 	if (event_fd < 0)
 		return event_fd;
 
@@ -7132,7 +7137,8 @@ SYSCALL_DEFINE5(perf_event_open,
 			goto err_context;
 	}
 
-	event_file = anon_inode_getfile("[perf_event]", &perf_fops, event, O_RDWR);
+	event_file = anon_inode_getfile("[perf_event]", &perf_fops, event,
+					f_flags);
 	if (IS_ERR(event_file)) {
 		err = PTR_ERR(event_file);
 		goto err_context;

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCHv2] perf tools: enable close-on-exec flag on perf file descriptor
  2014-01-11 18:07         ` [PATCHv1] " Yann Droneaud
@ 2014-01-13 10:09           ` Yann Droneaud
  2014-01-15 18:50             ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 38+ messages in thread
From: Yann Droneaud @ 2014-01-13 10:09 UTC (permalink / raw)
  To: Peter Zijlstra, Paul Mackerras, Ingo Molnar,
	Arnaldo Carvalho de Melo, Jiri Olsa, Namhyung Kim, Andi Kleen,
	David Ahern, Frederic Weisbecker, Mike Galbraith,
	Stephane Eranian, Adrian Hunter, Benjamin Herrenschmidt,
	Michael Ellerman
  Cc: linux-kernel, Yann Droneaud, Peter Zijlstra

In a previous patch [1], flag PERF_FLAG_FD_CLOEXEC was
added to perf_event_open(2) syscall to allows userspace
to enable close-on-exec behavor atomically when creating
the file descriptor.

This patch makes perf tools use the new flag if supported
by the kernel, so that the event file descriptors got
automatically closed if perf tool exec a sub-command.

Changes from v1 [2]:
- don't probe PERF_FLAG_FD_CLOEXEC for each call to
  perf_event_open_cloexec_flag(): don't forget to set
  'probed' variable once flag was probed.
- call perf_event_open_cloexec_flag() only once in
  util/record.c:perf_do_probe_api().
- fixed minor coding style issue (unneeded braces)
  in util/cloexec.c

Changes from v0 [3]:
- add probing for PERF_FLAG_FD_CLOEXEC flag allowing
  perf to run on older kernel:
  * use "missing feature" pattern in evsel to disable
    PERF_FLAG_FD_CLOEXEC in perf_evsel__open() if not
    supported by kernel;
  * add a dedicated function to probe for
    PERF_FLAG_FD_CLOEXEC support in the current kernel.
    This function is to be used by other functions
    calling sys_perf_event_open() directly.
- remove PERF_FLAG_FD_CLOEXEC from PowerPC selftest
  as it's not related to perf tool: it should be part
  of a separate patch.

[1] http://lkml.kernel.org/r/8c03f54e1598b1727c19706f3af03f98685d9fe6.1388952061.git.ydroneaud@opteya.com
    https://patchwork.kernel.org/patch/3434971/
[2] http://lkml.kernel.org/r/1389463628-24869-1-git-send-email-ydroneaud@opteya.com
    https://patchwork.kernel.org/patch/3469571/
[3] http://lkml.kernel.org/r/1389005485-12778-1-git-send-email-ydroneaud@opteya.com
    https://patchwork.kernel.org/patch/3437111/

Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
Link: http://lkml.kernel.org/r/cover.1388952061.git.ydroneaud@opteya.com
---
 tools/perf/Makefile.perf              |  1 +
 tools/perf/bench/mem-memcpy.c         |  4 ++-
 tools/perf/bench/mem-memset.c         |  4 ++-
 tools/perf/builtin-sched.c            |  4 ++-
 tools/perf/tests/bp_signal.c          |  4 ++-
 tools/perf/tests/bp_signal_overflow.c |  4 ++-
 tools/perf/tests/rdpmc.c              |  4 ++-
 tools/perf/util/cloexec.c             | 54 +++++++++++++++++++++++++++++++++++
 tools/perf/util/cloexec.h             |  6 ++++
 tools/perf/util/evsel.c               | 12 ++++++--
 tools/perf/util/record.c              |  9 ++++--
 11 files changed, 94 insertions(+), 12 deletions(-)
 create mode 100644 tools/perf/util/cloexec.c
 create mode 100644 tools/perf/util/cloexec.h

diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index 3638b0bd20dc..bcce558b5407 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -372,6 +372,7 @@ LIB_OBJS += $(OUTPUT)util/stat.o
 LIB_OBJS += $(OUTPUT)util/record.o
 LIB_OBJS += $(OUTPUT)util/srcline.o
 LIB_OBJS += $(OUTPUT)util/data.o
+LIB_OBJS += $(OUTPUT)util/cloexec.o
 
 LIB_OBJS += $(OUTPUT)ui/setup.o
 LIB_OBJS += $(OUTPUT)ui/helpline.o
diff --git a/tools/perf/bench/mem-memcpy.c b/tools/perf/bench/mem-memcpy.c
index 5ce71d3b72cf..bf5a21b848a9 100644
--- a/tools/perf/bench/mem-memcpy.c
+++ b/tools/perf/bench/mem-memcpy.c
@@ -10,6 +10,7 @@
 #include "../util/util.h"
 #include "../util/parse-options.h"
 #include "../util/header.h"
+#include "../util/cloexec.h"
 #include "bench.h"
 #include "mem-memcpy-arch.h"
 
@@ -83,7 +84,8 @@ static struct perf_event_attr cycle_attr = {
 
 static void init_cycle(void)
 {
-	cycle_fd = sys_perf_event_open(&cycle_attr, getpid(), -1, -1, 0);
+	cycle_fd = sys_perf_event_open(&cycle_attr, getpid(), -1, -1,
+				       perf_event_open_cloexec_flag());
 
 	if (cycle_fd < 0 && errno == ENOSYS)
 		die("No CONFIG_PERF_EVENTS=y kernel support configured?\n");
diff --git a/tools/perf/bench/mem-memset.c b/tools/perf/bench/mem-memset.c
index 9af79d2b18e5..260747ea1e0e 100644
--- a/tools/perf/bench/mem-memset.c
+++ b/tools/perf/bench/mem-memset.c
@@ -10,6 +10,7 @@
 #include "../util/util.h"
 #include "../util/parse-options.h"
 #include "../util/header.h"
+#include "../util/cloexec.h"
 #include "bench.h"
 #include "mem-memset-arch.h"
 
@@ -83,7 +84,8 @@ static struct perf_event_attr cycle_attr = {
 
 static void init_cycle(void)
 {
-	cycle_fd = sys_perf_event_open(&cycle_attr, getpid(), -1, -1, 0);
+	cycle_fd = sys_perf_event_open(&cycle_attr, getpid(), -1, -1,
+				       perf_event_open_cloexec_flag());
 
 	if (cycle_fd < 0 && errno == ENOSYS)
 		die("No CONFIG_PERF_EVENTS=y kernel support configured?\n");
diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index 6a76a07b6789..54017bdec88c 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -10,6 +10,7 @@
 #include "util/header.h"
 #include "util/session.h"
 #include "util/tool.h"
+#include "util/cloexec.h"
 
 #include "util/parse-options.h"
 #include "util/trace-event.h"
@@ -435,7 +436,8 @@ static int self_open_counters(void)
 	attr.type = PERF_TYPE_SOFTWARE;
 	attr.config = PERF_COUNT_SW_TASK_CLOCK;
 
-	fd = sys_perf_event_open(&attr, 0, -1, -1, 0);
+	fd = sys_perf_event_open(&attr, 0, -1, -1,
+				 perf_event_open_cloexec_flag());
 
 	if (fd < 0)
 		pr_err("Error: sys_perf_event_open() syscall returned "
diff --git a/tools/perf/tests/bp_signal.c b/tools/perf/tests/bp_signal.c
index aba095489193..fdc0d3e185f9 100644
--- a/tools/perf/tests/bp_signal.c
+++ b/tools/perf/tests/bp_signal.c
@@ -25,6 +25,7 @@
 #include "tests.h"
 #include "debug.h"
 #include "perf.h"
+#include "../util/cloexec.h"
 
 static int fd1;
 static int fd2;
@@ -78,7 +79,8 @@ static int bp_event(void *fn, int setup_signal)
 	pe.exclude_kernel = 1;
 	pe.exclude_hv = 1;
 
-	fd = sys_perf_event_open(&pe, 0, -1, -1, 0);
+	fd = sys_perf_event_open(&pe, 0, -1, -1,
+				 perf_event_open_cloexec_flag());
 	if (fd < 0) {
 		pr_debug("failed opening event %llx\n", pe.config);
 		return TEST_FAIL;
diff --git a/tools/perf/tests/bp_signal_overflow.c b/tools/perf/tests/bp_signal_overflow.c
index 44ac82179708..b0b17415f18c 100644
--- a/tools/perf/tests/bp_signal_overflow.c
+++ b/tools/perf/tests/bp_signal_overflow.c
@@ -24,6 +24,7 @@
 #include "tests.h"
 #include "debug.h"
 #include "perf.h"
+#include "../util/cloexec.h"
 
 static int overflows;
 
@@ -91,7 +92,8 @@ int test__bp_signal_overflow(void)
 	pe.exclude_kernel = 1;
 	pe.exclude_hv = 1;
 
-	fd = sys_perf_event_open(&pe, 0, -1, -1, 0);
+	fd = sys_perf_event_open(&pe, 0, -1, -1,
+				 perf_event_open_cloexec_flag());
 	if (fd < 0) {
 		pr_debug("failed opening event %llx\n", pe.config);
 		return TEST_FAIL;
diff --git a/tools/perf/tests/rdpmc.c b/tools/perf/tests/rdpmc.c
index 46649c25fa5e..c1e55ff18774 100644
--- a/tools/perf/tests/rdpmc.c
+++ b/tools/perf/tests/rdpmc.c
@@ -6,6 +6,7 @@
 #include "perf.h"
 #include "debug.h"
 #include "tests.h"
+#include "../util/cloexec.h"
 
 #if defined(__x86_64__) || defined(__i386__)
 
@@ -104,7 +105,8 @@ static int __test__rdpmc(void)
 	sa.sa_sigaction = segfault_handler;
 	sigaction(SIGSEGV, &sa, NULL);
 
-	fd = sys_perf_event_open(&attr, 0, -1, -1, 0);
+	fd = sys_perf_event_open(&attr, 0, -1, -1,
+				 perf_event_open_cloexec_flag());
 	if (fd < 0) {
 		pr_err("Error: sys_perf_event_open() syscall returned "
 		       "with %d (%s)\n", fd, strerror(errno));
diff --git a/tools/perf/util/cloexec.c b/tools/perf/util/cloexec.c
new file mode 100644
index 000000000000..06f6ee087fa1
--- /dev/null
+++ b/tools/perf/util/cloexec.c
@@ -0,0 +1,54 @@
+#include "util.h"
+#include "../perf.h"
+#include "cloexec.h"
+
+static unsigned long flag = PERF_FLAG_FD_CLOEXEC;
+
+static int perf_flag_probe(void)
+{
+	struct perf_event_attr attr;
+	int fd;
+	int err;
+
+	/* check cloexec flag */
+	memset(&attr, 0, sizeof(attr));
+	fd = sys_perf_event_open(&attr, 0, -1, -1,
+				 PERF_FLAG_FD_CLOEXEC);
+	if (fd >= 0) {
+		close(fd);
+		return 1;
+	}
+
+	if (errno != EINVAL) {
+		err = errno;
+		pr_warning("sys_perf_event_open() syscall returned "
+			   "%d (%d: %s)\n", fd, err, strerror(err));
+	}
+
+	/* not supported, confirm error related to PERF_FLAG_FD_CLOEXEC */
+	memset(&attr, 0, sizeof(attr));
+	fd = sys_perf_event_open(&attr, 0, -1, -1, 0);
+	if (fd >= 0) {
+		close(fd);
+		return 0;
+	}
+
+	err = errno;
+	die("sys_perf_event_open() syscall returned "
+	    "%d (%d: %s)\n", fd, err, strerror(err));
+
+	return -1;
+}
+
+unsigned long perf_event_open_cloexec_flag(void)
+{
+	static bool probed;
+
+	if (!probed) {
+		if (perf_flag_probe() <= 0)
+			flag = 0;
+		probed = true;
+	}
+
+	return flag;
+}
diff --git a/tools/perf/util/cloexec.h b/tools/perf/util/cloexec.h
new file mode 100644
index 000000000000..94a5a7d829d5
--- /dev/null
+++ b/tools/perf/util/cloexec.h
@@ -0,0 +1,6 @@
+#ifndef __PERF_CLOEXEC_H
+#define __PERF_CLOEXEC_H
+
+unsigned long perf_event_open_cloexec_flag(void);
+
+#endif /* __PERF_CLOEXEC_H */
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index ade8d9c1c431..ff845c7e7a4f 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -29,6 +29,7 @@ static struct {
 	bool sample_id_all;
 	bool exclude_guest;
 	bool mmap2;
+	bool cloexec;
 } perf_missing_features;
 
 #define FD(e, x, y) (*(int *)xyarray__entry(e->fd, x, y))
@@ -968,7 +969,7 @@ static int __perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus,
 			      struct thread_map *threads)
 {
 	int cpu, thread;
-	unsigned long flags = 0;
+	unsigned long flags = PERF_FLAG_FD_CLOEXEC;
 	int pid = -1, err;
 	enum { NO_CHANGE, SET_TO_MAX, INCREASED_MAX } set_rlimit = NO_CHANGE;
 
@@ -977,11 +978,13 @@ static int __perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus,
 		return -ENOMEM;
 
 	if (evsel->cgrp) {
-		flags = PERF_FLAG_PID_CGROUP;
+		flags |= PERF_FLAG_PID_CGROUP;
 		pid = evsel->cgrp->fd;
 	}
 
 fallback_missing_features:
+	if (perf_missing_features.cloexec)
+		flags &= ~(unsigned long)PERF_FLAG_FD_CLOEXEC;
 	if (perf_missing_features.mmap2)
 		evsel->attr.mmap2 = 0;
 	if (perf_missing_features.exclude_guest)
@@ -1050,7 +1053,10 @@ try_fallback:
 	if (err != -EINVAL || cpu > 0 || thread > 0)
 		goto out_close;
 
-	if (!perf_missing_features.mmap2 && evsel->attr.mmap2) {
+	if (!perf_missing_features.cloexec && (flags & PERF_FLAG_FD_CLOEXEC)) {
+		perf_missing_features.cloexec = true;
+		goto fallback_missing_features;
+	} else if (!perf_missing_features.mmap2 && evsel->attr.mmap2) {
 		perf_missing_features.mmap2 = true;
 		goto fallback_missing_features;
 	} else if (!perf_missing_features.exclude_guest &&
diff --git a/tools/perf/util/record.c b/tools/perf/util/record.c
index 104a47563d39..6483ef5df31b 100644
--- a/tools/perf/util/record.c
+++ b/tools/perf/util/record.c
@@ -4,6 +4,7 @@
 #include "parse-events.h"
 #include "fs.h"
 #include "util.h"
+#include "cloexec.h"
 
 typedef void (*setup_probe_fn_t)(struct perf_evsel *evsel);
 
@@ -11,6 +12,7 @@ static int perf_do_probe_api(setup_probe_fn_t fn, int cpu, const char *str)
 {
 	struct perf_evlist *evlist;
 	struct perf_evsel *evsel;
+	unsigned long flags = perf_event_open_cloexec_flag();
 	int err = -EAGAIN, fd;
 
 	evlist = perf_evlist__new();
@@ -22,14 +24,14 @@ static int perf_do_probe_api(setup_probe_fn_t fn, int cpu, const char *str)
 
 	evsel = perf_evlist__first(evlist);
 
-	fd = sys_perf_event_open(&evsel->attr, -1, cpu, -1, 0);
+	fd = sys_perf_event_open(&evsel->attr, -1, cpu, -1, flags);
 	if (fd < 0)
 		goto out_delete;
 	close(fd);
 
 	fn(evsel);
 
-	fd = sys_perf_event_open(&evsel->attr, -1, cpu, -1, 0);
+	fd = sys_perf_event_open(&evsel->attr, -1, cpu, -1, flags);
 	if (fd < 0) {
 		if (errno == EINVAL)
 			err = -EINVAL;
@@ -203,7 +205,8 @@ bool perf_evlist__can_select_event(struct perf_evlist *evlist, const char *str)
 		cpu = evlist->cpus->map[0];
 	}
 
-	fd = sys_perf_event_open(&evsel->attr, -1, cpu, -1, 0);
+	fd = sys_perf_event_open(&evsel->attr, -1, cpu, -1,
+				 perf_event_open_cloexec_flag());
 	if (fd >= 0) {
 		close(fd);
 		ret = true;
-- 
1.8.4.2


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [PATCHv2] perf tools: enable close-on-exec flag on perf file descriptor
  2014-01-13 10:09           ` [PATCHv2] " Yann Droneaud
@ 2014-01-15 18:50             ` Arnaldo Carvalho de Melo
  2014-01-26 21:20               ` [PATCHv3] " Yann Droneaud
  0 siblings, 1 reply; 38+ messages in thread
From: Arnaldo Carvalho de Melo @ 2014-01-15 18:50 UTC (permalink / raw)
  To: Yann Droneaud
  Cc: Peter Zijlstra, Paul Mackerras, Ingo Molnar, Jiri Olsa,
	Namhyung Kim, Andi Kleen, David Ahern, Frederic Weisbecker,
	Mike Galbraith, Stephane Eranian, Adrian Hunter,
	Benjamin Herrenschmidt, Michael Ellerman, linux-kernel,
	Peter Zijlstra

Em Mon, Jan 13, 2014 at 11:09:30AM +0100, Yann Droneaud escreveu:
> +++ b/tools/perf/util/cloexec.c
> @@ -0,0 +1,54 @@
> +#include "util.h"
> +#include "../perf.h"
> +#include "cloexec.h"
> +
> +static unsigned long flag = PERF_FLAG_FD_CLOEXEC;
> +
> +static int perf_flag_probe(void)
> +{
> +	struct perf_event_attr attr;
> +	int fd;
> +	int err;
> +
> +	/* check cloexec flag */
> +	memset(&attr, 0, sizeof(attr));

If you do this you'll use attr.type = PERF_TYPE_HARDWARE and
attr.config then means PERF_COUNT_HW_CPU_CYCLES, which will fail
where we don't have a PMU, see perf_evsel__fallback(), perhaps it is
better for you to go with the fallback that is most likely supported in
all arches:

                evsel->attr.type   = PERF_TYPE_SOFTWARE;       // 1
                evsel->attr.config = PERF_COUNT_SW_CPU_CLOCK;  // 0

I.e. just do a:

	struct perf_event_attr attr = { .type = 1, }; and ditch the
memset (or do a attr.type = PERF_COUNT_SW_CPU_CLOCK;, up to you).

Thanks for doing the __perf_evsel__open fallbacking!

- Arnaldo

> +	fd = sys_perf_event_open(&attr, 0, -1, -1,
> +				 PERF_FLAG_FD_CLOEXEC);
> +	if (fd >= 0) {
> +		close(fd);
> +		return 1;
> +	}
> +
> +	if (errno != EINVAL) {
> +		err = errno;
> +		pr_warning("sys_perf_event_open() syscall returned "
> +			   "%d (%d: %s)\n", fd, err, strerror(err));
> +	}
> +
> +	/* not supported, confirm error related to PERF_FLAG_FD_CLOEXEC */
> +	memset(&attr, 0, sizeof(attr));
> +	fd = sys_perf_event_open(&attr, 0, -1, -1, 0);
> +	if (fd >= 0) {
> +		close(fd);
> +		return 0;
> +	}
> +
> +	err = errno;
> +	die("sys_perf_event_open() syscall returned "
> +	    "%d (%d: %s)\n", fd, err, strerror(err));
> +
> +	return -1;
> +}
> +
> +unsigned long perf_event_open_cloexec_flag(void)
> +{
> +	static bool probed;
> +
> +	if (!probed) {
> +		if (perf_flag_probe() <= 0)
> +			flag = 0;
> +		probed = true;
> +	}
> +
> +	return flag;
> +}
> diff --git a/tools/perf/util/cloexec.h b/tools/perf/util/cloexec.h
> new file mode 100644
> index 000000000000..94a5a7d829d5
> --- /dev/null
> +++ b/tools/perf/util/cloexec.h
> @@ -0,0 +1,6 @@
> +#ifndef __PERF_CLOEXEC_H
> +#define __PERF_CLOEXEC_H
> +
> +unsigned long perf_event_open_cloexec_flag(void);
> +
> +#endif /* __PERF_CLOEXEC_H */
> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> index ade8d9c1c431..ff845c7e7a4f 100644
> --- a/tools/perf/util/evsel.c
> +++ b/tools/perf/util/evsel.c
> @@ -29,6 +29,7 @@ static struct {
>  	bool sample_id_all;
>  	bool exclude_guest;
>  	bool mmap2;
> +	bool cloexec;
>  } perf_missing_features;
>  
>  #define FD(e, x, y) (*(int *)xyarray__entry(e->fd, x, y))
> @@ -968,7 +969,7 @@ static int __perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus,
>  			      struct thread_map *threads)
>  {
>  	int cpu, thread;
> -	unsigned long flags = 0;
> +	unsigned long flags = PERF_FLAG_FD_CLOEXEC;
>  	int pid = -1, err;
>  	enum { NO_CHANGE, SET_TO_MAX, INCREASED_MAX } set_rlimit = NO_CHANGE;
>  
> @@ -977,11 +978,13 @@ static int __perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus,
>  		return -ENOMEM;
>  
>  	if (evsel->cgrp) {
> -		flags = PERF_FLAG_PID_CGROUP;
> +		flags |= PERF_FLAG_PID_CGROUP;
>  		pid = evsel->cgrp->fd;
>  	}
>  
>  fallback_missing_features:
> +	if (perf_missing_features.cloexec)
> +		flags &= ~(unsigned long)PERF_FLAG_FD_CLOEXEC;
>  	if (perf_missing_features.mmap2)
>  		evsel->attr.mmap2 = 0;
>  	if (perf_missing_features.exclude_guest)
> @@ -1050,7 +1053,10 @@ try_fallback:
>  	if (err != -EINVAL || cpu > 0 || thread > 0)
>  		goto out_close;
>  
> -	if (!perf_missing_features.mmap2 && evsel->attr.mmap2) {
> +	if (!perf_missing_features.cloexec && (flags & PERF_FLAG_FD_CLOEXEC)) {
> +		perf_missing_features.cloexec = true;
> +		goto fallback_missing_features;
> +	} else if (!perf_missing_features.mmap2 && evsel->attr.mmap2) {
>  		perf_missing_features.mmap2 = true;
>  		goto fallback_missing_features;
>  	} else if (!perf_missing_features.exclude_guest &&
> diff --git a/tools/perf/util/record.c b/tools/perf/util/record.c
> index 104a47563d39..6483ef5df31b 100644
> --- a/tools/perf/util/record.c
> +++ b/tools/perf/util/record.c
> @@ -4,6 +4,7 @@
>  #include "parse-events.h"
>  #include "fs.h"
>  #include "util.h"
> +#include "cloexec.h"
>  
>  typedef void (*setup_probe_fn_t)(struct perf_evsel *evsel);
>  
> @@ -11,6 +12,7 @@ static int perf_do_probe_api(setup_probe_fn_t fn, int cpu, const char *str)
>  {
>  	struct perf_evlist *evlist;
>  	struct perf_evsel *evsel;
> +	unsigned long flags = perf_event_open_cloexec_flag();
>  	int err = -EAGAIN, fd;
>  
>  	evlist = perf_evlist__new();
> @@ -22,14 +24,14 @@ static int perf_do_probe_api(setup_probe_fn_t fn, int cpu, const char *str)
>  
>  	evsel = perf_evlist__first(evlist);
>  
> -	fd = sys_perf_event_open(&evsel->attr, -1, cpu, -1, 0);
> +	fd = sys_perf_event_open(&evsel->attr, -1, cpu, -1, flags);
>  	if (fd < 0)
>  		goto out_delete;
>  	close(fd);
>  
>  	fn(evsel);
>  
> -	fd = sys_perf_event_open(&evsel->attr, -1, cpu, -1, 0);
> +	fd = sys_perf_event_open(&evsel->attr, -1, cpu, -1, flags);
>  	if (fd < 0) {
>  		if (errno == EINVAL)
>  			err = -EINVAL;
> @@ -203,7 +205,8 @@ bool perf_evlist__can_select_event(struct perf_evlist *evlist, const char *str)
>  		cpu = evlist->cpus->map[0];
>  	}
>  
> -	fd = sys_perf_event_open(&evsel->attr, -1, cpu, -1, 0);
> +	fd = sys_perf_event_open(&evsel->attr, -1, cpu, -1,
> +				 perf_event_open_cloexec_flag());
>  	if (fd >= 0) {
>  		close(fd);
>  		ret = true;
> -- 
> 1.8.4.2

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCHv5 5/7] fanotify: enable close-on-exec on events' fd when requested in fanotify_init()
  2014-01-05 20:36 ` [PATCHv5 5/7] fanotify: enable close-on-exec on events' fd when requested in fanotify_init() Yann Droneaud
@ 2014-01-20 17:15   ` Yann Droneaud
  0 siblings, 0 replies; 38+ messages in thread
From: Yann Droneaud @ 2014-01-20 17:15 UTC (permalink / raw)
  To: Lino Sanfilippo; +Cc: linux-kernel, Eric Paris, Al Viro, Andrew Morton

Hi Lino,

I've recently sent this patch which may fix a bug in the way fanotify
creates its file descriptors.

I failed to noticed you've been working recently on fanotify subsystem
and miss the opportunity to add you in the Cc:s. 

Please have a look to this patch and help to get it merged.

Le dimanche 05 janvier 2014 à 21:36 +0100, Yann Droneaud a écrit :
> According to commit 80af258867648, file descriptor created as part
> of file access notification events inherit flags from the event_f_flags
> argument passed to syscall fanotify_init(2).
> 
> So while it is legal for userspace to call fanotify_init() with O_CLOEXEC
> as part of its second argument, O_CLOEXEC is currently silently ignored.
> 
> Indeed event_f_flags are only given to dentry_open(), which only
> seems to care about O_ACCMODE and O_PATH in do_dentry_open(), O_DIRECT
> in open_check_o_direct() and O_LARGEFILE in generic_file_open().
> 
> More, there's no effective check on event_f_flags value that would
> catch unknown / unsupported values, unlike the one on f_flags argument
> of the syscall (see FAN_ALL_INIT_FLAGS in include/uapi/linux/fanotify.h).

...

> But it seems logical to set close-on-exec flag on the file descriptor
> if userspace is allowed to request it with O_CLOEXEC.
> 

...

> So this patch replaces call to macro get_unused_fd() by a call to
> function get_unused_fd_flags() with event_f_flags value as argument.
> This way O_CLOEXEC flag in the second argument of fanotify_init(2)
> syscall is interpreted so that close-on-exec get enabled.

...

> Cc: Eric Paris <eparis@redhat.com>
> Cc: Al Viro <viro@zeniv.linux.org.uk>
> Cc: stable@vger.kernel.org
> Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
> Link:
> http://lkml.kernel.org/r/cover.1388952061.git.ydroneaud@opteya.com
> 
> ---
>  fs/notify/fanotify/fanotify_user.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
> index 57d7c083cb4b..6d0eaabba02e 100644
> --- a/fs/notify/fanotify/fanotify_user.c
> +++ b/fs/notify/fanotify/fanotify_user.c
> @@ -71,7 +71,7 @@ static int create_fd(struct fsnotify_group *group,
>  
>  	pr_debug("%s: group=%p event=%p\n", __func__, group, event);
>  
> -	client_fd = get_unused_fd();
> +	client_fd = get_unused_fd_flags(group->fanotify_data.f_flags);
>  	if (client_fd < 0)
>  		return client_fd;
>  

Regards.

-- 
Yann Droneaud
OPTEYA




^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCHv3] perf tools: enable close-on-exec flag on perf file descriptor
  2014-01-15 18:50             ` Arnaldo Carvalho de Melo
@ 2014-01-26 21:20               ` Yann Droneaud
  2014-03-11  8:39                 ` [PATCHv4] " Yann Droneaud
  0 siblings, 1 reply; 38+ messages in thread
From: Yann Droneaud @ 2014-01-26 21:20 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Yann Droneaud, Peter Zijlstra, Paul Mackerras, Ingo Molnar,
	Jiri Olsa, Namhyung Kim, Andi Kleen, David Ahern,
	Frederic Weisbecker, Mike Galbraith, Stephane Eranian,
	Adrian Hunter, Benjamin Herrenschmidt, Michael Ellerman,
	Peter Zijlstra, linux-kernel, Ingo Molnar

In commit a21b0b354d4a, flag PERF_FLAG_FD_CLOEXEC was
added to perf_event_open(2) syscall to allows userspace
to atomically enable close-on-exec behavor when creating
the file descriptor.

This patch makes perf tools use the new flag if supported
by the kernel, so that the event file descriptors got
automatically closed if perf tool exec a sub-command.

Changes from v2 [1]:
- use safest perf attr configuration in probe function,
  as used in perf_evsel__fallback().

Changes from v1 [2]:
- don't probe PERF_FLAG_FD_CLOEXEC for each call to
  perf_event_open_cloexec_flag(): don't forget to set
  'probed' variable once flag was probed.
- call perf_event_open_cloexec_flag() only once in
  util/record.c:perf_do_probe_api().
- fixed minor coding style issue (unneeded braces)
  in util/cloexec.c

Changes from v0 [3]:
- add probing for PERF_FLAG_FD_CLOEXEC flag allowing
  perf to run on older kernel:
  * use "missing feature" pattern in evsel to disable
    PERF_FLAG_FD_CLOEXEC in perf_evsel__open() if not
    supported by kernel;
  * add a dedicated function to probe for
    PERF_FLAG_FD_CLOEXEC support in the current kernel.
    This function is to be used by other functions
    calling sys_perf_event_open() directly.
- remove PERF_FLAG_FD_CLOEXEC from PowerPC selftest
  as it's not related to perf tool: it should be part
  of a separate patch.

[1] http://lkml.kernel.org/r/1389607770-4485-1-git-send-email-ydroneaud@opteya.com
    https://patchwork.kernel.org/patch/3474141/

[2] http://lkml.kernel.org/r/1389463628-24869-1-git-send-email-ydroneaud@opteya.com
    https://patchwork.kernel.org/patch/3469571/

[3] http://lkml.kernel.org/r/1389005485-12778-1-git-send-email-ydroneaud@opteya.com
    https://patchwork.kernel.org/patch/3437111/

Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
---
 tools/perf/Makefile.perf              |  1 +
 tools/perf/bench/mem-memcpy.c         |  4 ++-
 tools/perf/bench/mem-memset.c         |  4 ++-
 tools/perf/builtin-sched.c            |  4 ++-
 tools/perf/tests/bp_signal.c          |  4 ++-
 tools/perf/tests/bp_signal_overflow.c |  4 ++-
 tools/perf/tests/rdpmc.c              |  4 ++-
 tools/perf/util/cloexec.c             | 56 +++++++++++++++++++++++++++++++++++
 tools/perf/util/cloexec.h             |  6 ++++
 tools/perf/util/evsel.c               | 12 ++++++--
 tools/perf/util/record.c              |  9 ++++--
 11 files changed, 96 insertions(+), 12 deletions(-)
 create mode 100644 tools/perf/util/cloexec.c
 create mode 100644 tools/perf/util/cloexec.h

diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index 7257e7e9e38a..6aeefd97c9db 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -373,6 +373,7 @@ LIB_OBJS += $(OUTPUT)util/stat.o
 LIB_OBJS += $(OUTPUT)util/record.o
 LIB_OBJS += $(OUTPUT)util/srcline.o
 LIB_OBJS += $(OUTPUT)util/data.o
+LIB_OBJS += $(OUTPUT)util/cloexec.o
 
 LIB_OBJS += $(OUTPUT)ui/setup.o
 LIB_OBJS += $(OUTPUT)ui/helpline.o
diff --git a/tools/perf/bench/mem-memcpy.c b/tools/perf/bench/mem-memcpy.c
index 5ce71d3b72cf..bf5a21b848a9 100644
--- a/tools/perf/bench/mem-memcpy.c
+++ b/tools/perf/bench/mem-memcpy.c
@@ -10,6 +10,7 @@
 #include "../util/util.h"
 #include "../util/parse-options.h"
 #include "../util/header.h"
+#include "../util/cloexec.h"
 #include "bench.h"
 #include "mem-memcpy-arch.h"
 
@@ -83,7 +84,8 @@ static struct perf_event_attr cycle_attr = {
 
 static void init_cycle(void)
 {
-	cycle_fd = sys_perf_event_open(&cycle_attr, getpid(), -1, -1, 0);
+	cycle_fd = sys_perf_event_open(&cycle_attr, getpid(), -1, -1,
+				       perf_event_open_cloexec_flag());
 
 	if (cycle_fd < 0 && errno == ENOSYS)
 		die("No CONFIG_PERF_EVENTS=y kernel support configured?\n");
diff --git a/tools/perf/bench/mem-memset.c b/tools/perf/bench/mem-memset.c
index 9af79d2b18e5..260747ea1e0e 100644
--- a/tools/perf/bench/mem-memset.c
+++ b/tools/perf/bench/mem-memset.c
@@ -10,6 +10,7 @@
 #include "../util/util.h"
 #include "../util/parse-options.h"
 #include "../util/header.h"
+#include "../util/cloexec.h"
 #include "bench.h"
 #include "mem-memset-arch.h"
 
@@ -83,7 +84,8 @@ static struct perf_event_attr cycle_attr = {
 
 static void init_cycle(void)
 {
-	cycle_fd = sys_perf_event_open(&cycle_attr, getpid(), -1, -1, 0);
+	cycle_fd = sys_perf_event_open(&cycle_attr, getpid(), -1, -1,
+				       perf_event_open_cloexec_flag());
 
 	if (cycle_fd < 0 && errno == ENOSYS)
 		die("No CONFIG_PERF_EVENTS=y kernel support configured?\n");
diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index 6a76a07b6789..54017bdec88c 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -10,6 +10,7 @@
 #include "util/header.h"
 #include "util/session.h"
 #include "util/tool.h"
+#include "util/cloexec.h"
 
 #include "util/parse-options.h"
 #include "util/trace-event.h"
@@ -435,7 +436,8 @@ static int self_open_counters(void)
 	attr.type = PERF_TYPE_SOFTWARE;
 	attr.config = PERF_COUNT_SW_TASK_CLOCK;
 
-	fd = sys_perf_event_open(&attr, 0, -1, -1, 0);
+	fd = sys_perf_event_open(&attr, 0, -1, -1,
+				 perf_event_open_cloexec_flag());
 
 	if (fd < 0)
 		pr_err("Error: sys_perf_event_open() syscall returned "
diff --git a/tools/perf/tests/bp_signal.c b/tools/perf/tests/bp_signal.c
index aba095489193..fdc0d3e185f9 100644
--- a/tools/perf/tests/bp_signal.c
+++ b/tools/perf/tests/bp_signal.c
@@ -25,6 +25,7 @@
 #include "tests.h"
 #include "debug.h"
 #include "perf.h"
+#include "../util/cloexec.h"
 
 static int fd1;
 static int fd2;
@@ -78,7 +79,8 @@ static int bp_event(void *fn, int setup_signal)
 	pe.exclude_kernel = 1;
 	pe.exclude_hv = 1;
 
-	fd = sys_perf_event_open(&pe, 0, -1, -1, 0);
+	fd = sys_perf_event_open(&pe, 0, -1, -1,
+				 perf_event_open_cloexec_flag());
 	if (fd < 0) {
 		pr_debug("failed opening event %llx\n", pe.config);
 		return TEST_FAIL;
diff --git a/tools/perf/tests/bp_signal_overflow.c b/tools/perf/tests/bp_signal_overflow.c
index 44ac82179708..b0b17415f18c 100644
--- a/tools/perf/tests/bp_signal_overflow.c
+++ b/tools/perf/tests/bp_signal_overflow.c
@@ -24,6 +24,7 @@
 #include "tests.h"
 #include "debug.h"
 #include "perf.h"
+#include "../util/cloexec.h"
 
 static int overflows;
 
@@ -91,7 +92,8 @@ int test__bp_signal_overflow(void)
 	pe.exclude_kernel = 1;
 	pe.exclude_hv = 1;
 
-	fd = sys_perf_event_open(&pe, 0, -1, -1, 0);
+	fd = sys_perf_event_open(&pe, 0, -1, -1,
+				 perf_event_open_cloexec_flag());
 	if (fd < 0) {
 		pr_debug("failed opening event %llx\n", pe.config);
 		return TEST_FAIL;
diff --git a/tools/perf/tests/rdpmc.c b/tools/perf/tests/rdpmc.c
index 46649c25fa5e..c1e55ff18774 100644
--- a/tools/perf/tests/rdpmc.c
+++ b/tools/perf/tests/rdpmc.c
@@ -6,6 +6,7 @@
 #include "perf.h"
 #include "debug.h"
 #include "tests.h"
+#include "../util/cloexec.h"
 
 #if defined(__x86_64__) || defined(__i386__)
 
@@ -104,7 +105,8 @@ static int __test__rdpmc(void)
 	sa.sa_sigaction = segfault_handler;
 	sigaction(SIGSEGV, &sa, NULL);
 
-	fd = sys_perf_event_open(&attr, 0, -1, -1, 0);
+	fd = sys_perf_event_open(&attr, 0, -1, -1,
+				 perf_event_open_cloexec_flag());
 	if (fd < 0) {
 		pr_err("Error: sys_perf_event_open() syscall returned "
 		       "with %d (%s)\n", fd, strerror(errno));
diff --git a/tools/perf/util/cloexec.c b/tools/perf/util/cloexec.c
new file mode 100644
index 000000000000..771ed36fcc61
--- /dev/null
+++ b/tools/perf/util/cloexec.c
@@ -0,0 +1,56 @@
+#include "util.h"
+#include "../perf.h"
+#include "cloexec.h"
+
+static unsigned long flag = PERF_FLAG_FD_CLOEXEC;
+
+static int perf_flag_probe(void)
+{
+	/* use 'safest' configuration as used in perf_evsel__fallback() */
+	struct perf_event_attr attr = {
+		.type = PERF_COUNT_SW_CPU_CLOCK,
+		.config = PERF_COUNT_SW_CPU_CLOCK,
+	};
+	int fd;
+	int err;
+
+	/* check cloexec flag */
+	fd = sys_perf_event_open(&attr, 0, -1, -1,
+				 PERF_FLAG_FD_CLOEXEC);
+	if (fd >= 0) {
+		close(fd);
+		return 1;
+	}
+
+	if (errno != EINVAL) {
+		err = errno;
+		pr_warning("sys_perf_event_open() syscall returned "
+			   "%d (%d: %s)\n", fd, err, strerror(err));
+	}
+
+	/* not supported, confirm error related to PERF_FLAG_FD_CLOEXEC */
+	fd = sys_perf_event_open(&attr, 0, -1, -1, 0);
+	if (fd >= 0) {
+		close(fd);
+		return 0;
+	}
+
+	err = errno;
+	die("sys_perf_event_open() syscall returned "
+	    "%d (%d: %s)\n", fd, err, strerror(err));
+
+	return -1;
+}
+
+unsigned long perf_event_open_cloexec_flag(void)
+{
+	static bool probed;
+
+	if (!probed) {
+		if (perf_flag_probe() <= 0)
+			flag = 0;
+		probed = true;
+	}
+
+	return flag;
+}
diff --git a/tools/perf/util/cloexec.h b/tools/perf/util/cloexec.h
new file mode 100644
index 000000000000..94a5a7d829d5
--- /dev/null
+++ b/tools/perf/util/cloexec.h
@@ -0,0 +1,6 @@
+#ifndef __PERF_CLOEXEC_H
+#define __PERF_CLOEXEC_H
+
+unsigned long perf_event_open_cloexec_flag(void);
+
+#endif /* __PERF_CLOEXEC_H */
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 22e18a26b7e6..0a44fba07b29 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -29,6 +29,7 @@ static struct {
 	bool sample_id_all;
 	bool exclude_guest;
 	bool mmap2;
+	bool cloexec;
 } perf_missing_features;
 
 #define FD(e, x, y) (*(int *)xyarray__entry(e->fd, x, y))
@@ -969,7 +970,7 @@ static int __perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus,
 			      struct thread_map *threads)
 {
 	int cpu, thread;
-	unsigned long flags = 0;
+	unsigned long flags = PERF_FLAG_FD_CLOEXEC;
 	int pid = -1, err;
 	enum { NO_CHANGE, SET_TO_MAX, INCREASED_MAX } set_rlimit = NO_CHANGE;
 
@@ -978,11 +979,13 @@ static int __perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus,
 		return -ENOMEM;
 
 	if (evsel->cgrp) {
-		flags = PERF_FLAG_PID_CGROUP;
+		flags |= PERF_FLAG_PID_CGROUP;
 		pid = evsel->cgrp->fd;
 	}
 
 fallback_missing_features:
+	if (perf_missing_features.cloexec)
+		flags &= ~(unsigned long)PERF_FLAG_FD_CLOEXEC;
 	if (perf_missing_features.mmap2)
 		evsel->attr.mmap2 = 0;
 	if (perf_missing_features.exclude_guest)
@@ -1051,7 +1054,10 @@ try_fallback:
 	if (err != -EINVAL || cpu > 0 || thread > 0)
 		goto out_close;
 
-	if (!perf_missing_features.mmap2 && evsel->attr.mmap2) {
+	if (!perf_missing_features.cloexec && (flags & PERF_FLAG_FD_CLOEXEC)) {
+		perf_missing_features.cloexec = true;
+		goto fallback_missing_features;
+	} else if (!perf_missing_features.mmap2 && evsel->attr.mmap2) {
 		perf_missing_features.mmap2 = true;
 		goto fallback_missing_features;
 	} else if (!perf_missing_features.exclude_guest &&
diff --git a/tools/perf/util/record.c b/tools/perf/util/record.c
index 373762501dad..fa543aab6963 100644
--- a/tools/perf/util/record.c
+++ b/tools/perf/util/record.c
@@ -4,6 +4,7 @@
 #include "parse-events.h"
 #include "fs.h"
 #include "util.h"
+#include "cloexec.h"
 
 typedef void (*setup_probe_fn_t)(struct perf_evsel *evsel);
 
@@ -11,6 +12,7 @@ static int perf_do_probe_api(setup_probe_fn_t fn, int cpu, const char *str)
 {
 	struct perf_evlist *evlist;
 	struct perf_evsel *evsel;
+	unsigned long flags = perf_event_open_cloexec_flag();
 	int err = -EAGAIN, fd;
 
 	evlist = perf_evlist__new();
@@ -22,14 +24,14 @@ static int perf_do_probe_api(setup_probe_fn_t fn, int cpu, const char *str)
 
 	evsel = perf_evlist__first(evlist);
 
-	fd = sys_perf_event_open(&evsel->attr, -1, cpu, -1, 0);
+	fd = sys_perf_event_open(&evsel->attr, -1, cpu, -1, flags);
 	if (fd < 0)
 		goto out_delete;
 	close(fd);
 
 	fn(evsel);
 
-	fd = sys_perf_event_open(&evsel->attr, -1, cpu, -1, 0);
+	fd = sys_perf_event_open(&evsel->attr, -1, cpu, -1, flags);
 	if (fd < 0) {
 		if (errno == EINVAL)
 			err = -EINVAL;
@@ -203,7 +205,8 @@ bool perf_evlist__can_select_event(struct perf_evlist *evlist, const char *str)
 		cpu = evlist->cpus->map[0];
 	}
 
-	fd = sys_perf_event_open(&evsel->attr, -1, cpu, -1, 0);
+	fd = sys_perf_event_open(&evsel->attr, -1, cpu, -1,
+				 perf_event_open_cloexec_flag());
 	if (fd >= 0) {
 		close(fd);
 		ret = true;
-- 
1.8.5.3


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCHv4] perf tools: enable close-on-exec flag on perf file descriptor
  2014-01-26 21:20               ` [PATCHv3] " Yann Droneaud
@ 2014-03-11  8:39                 ` Yann Droneaud
  2014-06-02 10:56                   ` [PATCHv5] " Yann Droneaud
  0 siblings, 1 reply; 38+ messages in thread
From: Yann Droneaud @ 2014-03-11  8:39 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Yann Droneaud, Peter Zijlstra, Paul Mackerras, Ingo Molnar,
	Jiri Olsa, Namhyung Kim, Andi Kleen, David Ahern,
	Frederic Weisbecker, Mike Galbraith, Stephane Eranian,
	Adrian Hunter, Benjamin Herrenschmidt, Michael Ellerman,
	Peter Zijlstra, linux-kernel, Ingo Molnar

In commit a21b0b354d4a, flag PERF_FLAG_FD_CLOEXEC was
added to perf_event_open(2) syscall to allows userspace
to atomically enable close-on-exec behavor when creating
the file descriptor.

This patch makes perf tools use the new flag if supported
by the kernel, so that the event file descriptors got
automatically closed if perf tool exec a sub-command.

Changes from v3 [1]:
- rebase on next-20140307 and resend asis.

Changes from v2 [2]:
- use safest perf attr configuration in probe function,
  as used in perf_evsel__fallback().

Changes from v1 [3]:
- don't probe PERF_FLAG_FD_CLOEXEC for each call to
  perf_event_open_cloexec_flag(): don't forget to set
  'probed' variable once flag was probed.
- call perf_event_open_cloexec_flag() only once in
  util/record.c:perf_do_probe_api().
- fixed minor coding style issue (unneeded braces)
  in util/cloexec.c

Changes from v0 [4]:
- add probing for PERF_FLAG_FD_CLOEXEC flag allowing
  perf to run on older kernel:
  * use "missing feature" pattern in evsel to disable
    PERF_FLAG_FD_CLOEXEC in perf_evsel__open() if not
    supported by kernel;
  * add a dedicated function to probe for
    PERF_FLAG_FD_CLOEXEC support in the current kernel.
    This function is to be used by other functions
    calling sys_perf_event_open() directly.
- remove PERF_FLAG_FD_CLOEXEC from PowerPC selftest
  as it's not related to perf tool: it should be part
  of a separate patch.

[1] http://lkml.kernel.org/r/1390771203-28415-1-git-send-email-ydroneaud@opteya.com
    https://patchwork.kernel.org/patch/3540091/

[2] http://lkml.kernel.org/r/1389607770-4485-1-git-send-email-ydroneaud@opteya.com
    https://patchwork.kernel.org/patch/3474141/

[3] http://lkml.kernel.org/r/1389463628-24869-1-git-send-email-ydroneaud@opteya.com
    https://patchwork.kernel.org/patch/3469571/

[4] http://lkml.kernel.org/r/1389005485-12778-1-git-send-email-ydroneaud@opteya.com
    https://patchwork.kernel.org/patch/3437111/

Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
---
 tools/perf/Makefile.perf              |  1 +
 tools/perf/bench/mem-memcpy.c         |  4 ++-
 tools/perf/bench/mem-memset.c         |  4 ++-
 tools/perf/builtin-sched.c            |  4 ++-
 tools/perf/tests/bp_signal.c          |  4 ++-
 tools/perf/tests/bp_signal_overflow.c |  4 ++-
 tools/perf/tests/rdpmc.c              |  4 ++-
 tools/perf/util/cloexec.c             | 56 +++++++++++++++++++++++++++++++++++
 tools/perf/util/cloexec.h             |  6 ++++
 tools/perf/util/evsel.c               | 12 ++++++--
 tools/perf/util/record.c              |  9 ++++--
 11 files changed, 96 insertions(+), 12 deletions(-)
 create mode 100644 tools/perf/util/cloexec.c
 create mode 100644 tools/perf/util/cloexec.h

diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index 1f7ec48ac959..7abdeaa9dbd5 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -374,6 +374,7 @@ LIB_OBJS += $(OUTPUT)util/stat.o
 LIB_OBJS += $(OUTPUT)util/record.o
 LIB_OBJS += $(OUTPUT)util/srcline.o
 LIB_OBJS += $(OUTPUT)util/data.o
+LIB_OBJS += $(OUTPUT)util/cloexec.o
 
 LIB_OBJS += $(OUTPUT)ui/setup.o
 LIB_OBJS += $(OUTPUT)ui/helpline.o
diff --git a/tools/perf/bench/mem-memcpy.c b/tools/perf/bench/mem-memcpy.c
index 5ce71d3b72cf..bf5a21b848a9 100644
--- a/tools/perf/bench/mem-memcpy.c
+++ b/tools/perf/bench/mem-memcpy.c
@@ -10,6 +10,7 @@
 #include "../util/util.h"
 #include "../util/parse-options.h"
 #include "../util/header.h"
+#include "../util/cloexec.h"
 #include "bench.h"
 #include "mem-memcpy-arch.h"
 
@@ -83,7 +84,8 @@ static struct perf_event_attr cycle_attr = {
 
 static void init_cycle(void)
 {
-	cycle_fd = sys_perf_event_open(&cycle_attr, getpid(), -1, -1, 0);
+	cycle_fd = sys_perf_event_open(&cycle_attr, getpid(), -1, -1,
+				       perf_event_open_cloexec_flag());
 
 	if (cycle_fd < 0 && errno == ENOSYS)
 		die("No CONFIG_PERF_EVENTS=y kernel support configured?\n");
diff --git a/tools/perf/bench/mem-memset.c b/tools/perf/bench/mem-memset.c
index 9af79d2b18e5..260747ea1e0e 100644
--- a/tools/perf/bench/mem-memset.c
+++ b/tools/perf/bench/mem-memset.c
@@ -10,6 +10,7 @@
 #include "../util/util.h"
 #include "../util/parse-options.h"
 #include "../util/header.h"
+#include "../util/cloexec.h"
 #include "bench.h"
 #include "mem-memset-arch.h"
 
@@ -83,7 +84,8 @@ static struct perf_event_attr cycle_attr = {
 
 static void init_cycle(void)
 {
-	cycle_fd = sys_perf_event_open(&cycle_attr, getpid(), -1, -1, 0);
+	cycle_fd = sys_perf_event_open(&cycle_attr, getpid(), -1, -1,
+				       perf_event_open_cloexec_flag());
 
 	if (cycle_fd < 0 && errno == ENOSYS)
 		die("No CONFIG_PERF_EVENTS=y kernel support configured?\n");
diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index 6a76a07b6789..54017bdec88c 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -10,6 +10,7 @@
 #include "util/header.h"
 #include "util/session.h"
 #include "util/tool.h"
+#include "util/cloexec.h"
 
 #include "util/parse-options.h"
 #include "util/trace-event.h"
@@ -435,7 +436,8 @@ static int self_open_counters(void)
 	attr.type = PERF_TYPE_SOFTWARE;
 	attr.config = PERF_COUNT_SW_TASK_CLOCK;
 
-	fd = sys_perf_event_open(&attr, 0, -1, -1, 0);
+	fd = sys_perf_event_open(&attr, 0, -1, -1,
+				 perf_event_open_cloexec_flag());
 
 	if (fd < 0)
 		pr_err("Error: sys_perf_event_open() syscall returned "
diff --git a/tools/perf/tests/bp_signal.c b/tools/perf/tests/bp_signal.c
index aba095489193..fdc0d3e185f9 100644
--- a/tools/perf/tests/bp_signal.c
+++ b/tools/perf/tests/bp_signal.c
@@ -25,6 +25,7 @@
 #include "tests.h"
 #include "debug.h"
 #include "perf.h"
+#include "../util/cloexec.h"
 
 static int fd1;
 static int fd2;
@@ -78,7 +79,8 @@ static int bp_event(void *fn, int setup_signal)
 	pe.exclude_kernel = 1;
 	pe.exclude_hv = 1;
 
-	fd = sys_perf_event_open(&pe, 0, -1, -1, 0);
+	fd = sys_perf_event_open(&pe, 0, -1, -1,
+				 perf_event_open_cloexec_flag());
 	if (fd < 0) {
 		pr_debug("failed opening event %llx\n", pe.config);
 		return TEST_FAIL;
diff --git a/tools/perf/tests/bp_signal_overflow.c b/tools/perf/tests/bp_signal_overflow.c
index 44ac82179708..b0b17415f18c 100644
--- a/tools/perf/tests/bp_signal_overflow.c
+++ b/tools/perf/tests/bp_signal_overflow.c
@@ -24,6 +24,7 @@
 #include "tests.h"
 #include "debug.h"
 #include "perf.h"
+#include "../util/cloexec.h"
 
 static int overflows;
 
@@ -91,7 +92,8 @@ int test__bp_signal_overflow(void)
 	pe.exclude_kernel = 1;
 	pe.exclude_hv = 1;
 
-	fd = sys_perf_event_open(&pe, 0, -1, -1, 0);
+	fd = sys_perf_event_open(&pe, 0, -1, -1,
+				 perf_event_open_cloexec_flag());
 	if (fd < 0) {
 		pr_debug("failed opening event %llx\n", pe.config);
 		return TEST_FAIL;
diff --git a/tools/perf/tests/rdpmc.c b/tools/perf/tests/rdpmc.c
index 46649c25fa5e..c1e55ff18774 100644
--- a/tools/perf/tests/rdpmc.c
+++ b/tools/perf/tests/rdpmc.c
@@ -6,6 +6,7 @@
 #include "perf.h"
 #include "debug.h"
 #include "tests.h"
+#include "../util/cloexec.h"
 
 #if defined(__x86_64__) || defined(__i386__)
 
@@ -104,7 +105,8 @@ static int __test__rdpmc(void)
 	sa.sa_sigaction = segfault_handler;
 	sigaction(SIGSEGV, &sa, NULL);
 
-	fd = sys_perf_event_open(&attr, 0, -1, -1, 0);
+	fd = sys_perf_event_open(&attr, 0, -1, -1,
+				 perf_event_open_cloexec_flag());
 	if (fd < 0) {
 		pr_err("Error: sys_perf_event_open() syscall returned "
 		       "with %d (%s)\n", fd, strerror(errno));
diff --git a/tools/perf/util/cloexec.c b/tools/perf/util/cloexec.c
new file mode 100644
index 000000000000..771ed36fcc61
--- /dev/null
+++ b/tools/perf/util/cloexec.c
@@ -0,0 +1,56 @@
+#include "util.h"
+#include "../perf.h"
+#include "cloexec.h"
+
+static unsigned long flag = PERF_FLAG_FD_CLOEXEC;
+
+static int perf_flag_probe(void)
+{
+	/* use 'safest' configuration as used in perf_evsel__fallback() */
+	struct perf_event_attr attr = {
+		.type = PERF_COUNT_SW_CPU_CLOCK,
+		.config = PERF_COUNT_SW_CPU_CLOCK,
+	};
+	int fd;
+	int err;
+
+	/* check cloexec flag */
+	fd = sys_perf_event_open(&attr, 0, -1, -1,
+				 PERF_FLAG_FD_CLOEXEC);
+	if (fd >= 0) {
+		close(fd);
+		return 1;
+	}
+
+	if (errno != EINVAL) {
+		err = errno;
+		pr_warning("sys_perf_event_open() syscall returned "
+			   "%d (%d: %s)\n", fd, err, strerror(err));
+	}
+
+	/* not supported, confirm error related to PERF_FLAG_FD_CLOEXEC */
+	fd = sys_perf_event_open(&attr, 0, -1, -1, 0);
+	if (fd >= 0) {
+		close(fd);
+		return 0;
+	}
+
+	err = errno;
+	die("sys_perf_event_open() syscall returned "
+	    "%d (%d: %s)\n", fd, err, strerror(err));
+
+	return -1;
+}
+
+unsigned long perf_event_open_cloexec_flag(void)
+{
+	static bool probed;
+
+	if (!probed) {
+		if (perf_flag_probe() <= 0)
+			flag = 0;
+		probed = true;
+	}
+
+	return flag;
+}
diff --git a/tools/perf/util/cloexec.h b/tools/perf/util/cloexec.h
new file mode 100644
index 000000000000..94a5a7d829d5
--- /dev/null
+++ b/tools/perf/util/cloexec.h
@@ -0,0 +1,6 @@
+#ifndef __PERF_CLOEXEC_H
+#define __PERF_CLOEXEC_H
+
+unsigned long perf_event_open_cloexec_flag(void);
+
+#endif /* __PERF_CLOEXEC_H */
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index adc94dd1794d..52772db306d8 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -29,6 +29,7 @@ static struct {
 	bool sample_id_all;
 	bool exclude_guest;
 	bool mmap2;
+	bool cloexec;
 } perf_missing_features;
 
 #define FD(e, x, y) (*(int *)xyarray__entry(e->fd, x, y))
@@ -969,7 +970,7 @@ static int __perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus,
 			      struct thread_map *threads)
 {
 	int cpu, thread;
-	unsigned long flags = 0;
+	unsigned long flags = PERF_FLAG_FD_CLOEXEC;
 	int pid = -1, err;
 	enum { NO_CHANGE, SET_TO_MAX, INCREASED_MAX } set_rlimit = NO_CHANGE;
 
@@ -978,11 +979,13 @@ static int __perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus,
 		return -ENOMEM;
 
 	if (evsel->cgrp) {
-		flags = PERF_FLAG_PID_CGROUP;
+		flags |= PERF_FLAG_PID_CGROUP;
 		pid = evsel->cgrp->fd;
 	}
 
 fallback_missing_features:
+	if (perf_missing_features.cloexec)
+		flags &= ~(unsigned long)PERF_FLAG_FD_CLOEXEC;
 	if (perf_missing_features.mmap2)
 		evsel->attr.mmap2 = 0;
 	if (perf_missing_features.exclude_guest)
@@ -1051,7 +1054,10 @@ try_fallback:
 	if (err != -EINVAL || cpu > 0 || thread > 0)
 		goto out_close;
 
-	if (!perf_missing_features.mmap2 && evsel->attr.mmap2) {
+	if (!perf_missing_features.cloexec && (flags & PERF_FLAG_FD_CLOEXEC)) {
+		perf_missing_features.cloexec = true;
+		goto fallback_missing_features;
+	} else if (!perf_missing_features.mmap2 && evsel->attr.mmap2) {
 		perf_missing_features.mmap2 = true;
 		goto fallback_missing_features;
 	} else if (!perf_missing_features.exclude_guest &&
diff --git a/tools/perf/util/record.c b/tools/perf/util/record.c
index 049e0a09ccd3..9a8d622809f6 100644
--- a/tools/perf/util/record.c
+++ b/tools/perf/util/record.c
@@ -4,6 +4,7 @@
 #include "parse-events.h"
 #include <api/fs/fs.h>
 #include "util.h"
+#include "cloexec.h"
 
 typedef void (*setup_probe_fn_t)(struct perf_evsel *evsel);
 
@@ -11,6 +12,7 @@ static int perf_do_probe_api(setup_probe_fn_t fn, int cpu, const char *str)
 {
 	struct perf_evlist *evlist;
 	struct perf_evsel *evsel;
+	unsigned long flags = perf_event_open_cloexec_flag();
 	int err = -EAGAIN, fd;
 
 	evlist = perf_evlist__new();
@@ -22,14 +24,14 @@ static int perf_do_probe_api(setup_probe_fn_t fn, int cpu, const char *str)
 
 	evsel = perf_evlist__first(evlist);
 
-	fd = sys_perf_event_open(&evsel->attr, -1, cpu, -1, 0);
+	fd = sys_perf_event_open(&evsel->attr, -1, cpu, -1, flags);
 	if (fd < 0)
 		goto out_delete;
 	close(fd);
 
 	fn(evsel);
 
-	fd = sys_perf_event_open(&evsel->attr, -1, cpu, -1, 0);
+	fd = sys_perf_event_open(&evsel->attr, -1, cpu, -1, flags);
 	if (fd < 0) {
 		if (errno == EINVAL)
 			err = -EINVAL;
@@ -203,7 +205,8 @@ bool perf_evlist__can_select_event(struct perf_evlist *evlist, const char *str)
 		cpu = evlist->cpus->map[0];
 	}
 
-	fd = sys_perf_event_open(&evsel->attr, -1, cpu, -1, 0);
+	fd = sys_perf_event_open(&evsel->attr, -1, cpu, -1,
+				 perf_event_open_cloexec_flag());
 	if (fd >= 0) {
 		close(fd);
 		ret = true;
-- 
1.8.5.3


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCHv5] perf tools: enable close-on-exec flag on perf file descriptor
  2014-03-11  8:39                 ` [PATCHv4] " Yann Droneaud
@ 2014-06-02 10:56                   ` Yann Droneaud
  2014-06-02 19:23                     ` Jiri Olsa
  0 siblings, 1 reply; 38+ messages in thread
From: Yann Droneaud @ 2014-06-02 10:56 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Yann Droneaud, Peter Zijlstra, Paul Mackerras, Ingo Molnar,
	Jiri Olsa, Namhyung Kim, Andi Kleen, David Ahern,
	Frederic Weisbecker, Mike Galbraith, Stephane Eranian,
	Adrian Hunter, Benjamin Herrenschmidt, Michael Ellerman,
	Peter Zijlstra, linux-kernel, Ingo Molnar

In commit a21b0b354d4a ('perf: Introduce a flag to enable
close-on-exec in perf_event_open()'), flag PERF_FLAG_FD_CLOEXEC
was added to perf_event_open(2) syscall to allows userspace
to atomically enable close-on-exec behavor when creating
the file descriptor.

This patch makes perf tools use the new flag if supported
by the kernel, so that the event file descriptors got
automatically closed if perf tool exec a sub-command.

Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
---

Hi,

Quite the same patch from v4. I'm interested in some
feedback so that I could improve the patch if needed.

Regards.

Changes from v4 [1]:
- rebase on next-20140530 and update commit message.

Changes from v3 [2]:
- rebase on next-20140307 and resend asis.

Changes from v2 [3]:
- use safest perf attr configuration in probe function,
  as used in perf_evsel__fallback().

Changes from v1 [4]:
- don't probe PERF_FLAG_FD_CLOEXEC for each call to
  perf_event_open_cloexec_flag(): don't forget to set
  'probed' variable once flag was probed.
- call perf_event_open_cloexec_flag() only once in
  util/record.c:perf_do_probe_api().
- fixed minor coding style issue (unneeded braces)
  in util/cloexec.c

Changes from v0 [5]:
- add probing for PERF_FLAG_FD_CLOEXEC flag allowing
  perf to run on older kernel:
  * use "missing feature" pattern in evsel to disable
    PERF_FLAG_FD_CLOEXEC in perf_evsel__open() if not
    supported by kernel;
  * add a dedicated function to probe for
    PERF_FLAG_FD_CLOEXEC support in the current kernel.
    This function is to be used by other functions
    calling sys_perf_event_open() directly.
- remove PERF_FLAG_FD_CLOEXEC from PowerPC selftest
  as it's not related to perf tool: it should be part
  of a separate patch.

[1] http://lkml.kernel.org/r/1394527147-7010-1-git-send-email-ydroneaud@opteya.com
    https://patchwork.kernel.org/patch/3809711/

[2] http://lkml.kernel.org/r/1390771203-28415-1-git-send-email-ydroneaud@opteya.com
    https://patchwork.kernel.org/patch/3540091/

[3] http://lkml.kernel.org/r/1389607770-4485-1-git-send-email-ydroneaud@opteya.com
    https://patchwork.kernel.org/patch/3474141/

[4] http://lkml.kernel.org/r/1389463628-24869-1-git-send-email-ydroneaud@opteya.com
    https://patchwork.kernel.org/patch/3469571/

[5] http://lkml.kernel.org/r/1389005485-12778-1-git-send-email-ydroneaud@opteya.com
    https://patchwork.kernel.org/patch/3437111/

 tools/perf/Makefile.perf              |  1 +
 tools/perf/bench/mem-memcpy.c         |  4 ++-
 tools/perf/bench/mem-memset.c         |  4 ++-
 tools/perf/builtin-sched.c            |  4 ++-
 tools/perf/tests/bp_signal.c          |  4 ++-
 tools/perf/tests/bp_signal_overflow.c |  4 ++-
 tools/perf/tests/rdpmc.c              |  4 ++-
 tools/perf/util/cloexec.c             | 56 +++++++++++++++++++++++++++++++++++
 tools/perf/util/cloexec.h             |  6 ++++
 tools/perf/util/evsel.c               | 12 ++++++--
 tools/perf/util/record.c              |  9 ++++--
 11 files changed, 96 insertions(+), 12 deletions(-)
 create mode 100644 tools/perf/util/cloexec.c
 create mode 100644 tools/perf/util/cloexec.h

diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index 02f0a4dd1a80..000d34087f77 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -373,6 +373,7 @@ LIB_OBJS += $(OUTPUT)util/stat.o
 LIB_OBJS += $(OUTPUT)util/record.o
 LIB_OBJS += $(OUTPUT)util/srcline.o
 LIB_OBJS += $(OUTPUT)util/data.o
+LIB_OBJS += $(OUTPUT)util/cloexec.o
 
 LIB_OBJS += $(OUTPUT)ui/setup.o
 LIB_OBJS += $(OUTPUT)ui/helpline.o
diff --git a/tools/perf/bench/mem-memcpy.c b/tools/perf/bench/mem-memcpy.c
index 5ce71d3b72cf..bf5a21b848a9 100644
--- a/tools/perf/bench/mem-memcpy.c
+++ b/tools/perf/bench/mem-memcpy.c
@@ -10,6 +10,7 @@
 #include "../util/util.h"
 #include "../util/parse-options.h"
 #include "../util/header.h"
+#include "../util/cloexec.h"
 #include "bench.h"
 #include "mem-memcpy-arch.h"
 
@@ -83,7 +84,8 @@ static struct perf_event_attr cycle_attr = {
 
 static void init_cycle(void)
 {
-	cycle_fd = sys_perf_event_open(&cycle_attr, getpid(), -1, -1, 0);
+	cycle_fd = sys_perf_event_open(&cycle_attr, getpid(), -1, -1,
+				       perf_event_open_cloexec_flag());
 
 	if (cycle_fd < 0 && errno == ENOSYS)
 		die("No CONFIG_PERF_EVENTS=y kernel support configured?\n");
diff --git a/tools/perf/bench/mem-memset.c b/tools/perf/bench/mem-memset.c
index 9af79d2b18e5..260747ea1e0e 100644
--- a/tools/perf/bench/mem-memset.c
+++ b/tools/perf/bench/mem-memset.c
@@ -10,6 +10,7 @@
 #include "../util/util.h"
 #include "../util/parse-options.h"
 #include "../util/header.h"
+#include "../util/cloexec.h"
 #include "bench.h"
 #include "mem-memset-arch.h"
 
@@ -83,7 +84,8 @@ static struct perf_event_attr cycle_attr = {
 
 static void init_cycle(void)
 {
-	cycle_fd = sys_perf_event_open(&cycle_attr, getpid(), -1, -1, 0);
+	cycle_fd = sys_perf_event_open(&cycle_attr, getpid(), -1, -1,
+				       perf_event_open_cloexec_flag());
 
 	if (cycle_fd < 0 && errno == ENOSYS)
 		die("No CONFIG_PERF_EVENTS=y kernel support configured?\n");
diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index d7176830b9b2..bb4afc9fae54 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -10,6 +10,7 @@
 #include "util/header.h"
 #include "util/session.h"
 #include "util/tool.h"
+#include "util/cloexec.h"
 
 #include "util/parse-options.h"
 #include "util/trace-event.h"
@@ -434,7 +435,8 @@ static int self_open_counters(void)
 	attr.type = PERF_TYPE_SOFTWARE;
 	attr.config = PERF_COUNT_SW_TASK_CLOCK;
 
-	fd = sys_perf_event_open(&attr, 0, -1, -1, 0);
+	fd = sys_perf_event_open(&attr, 0, -1, -1,
+				 perf_event_open_cloexec_flag());
 
 	if (fd < 0)
 		pr_err("Error: sys_perf_event_open() syscall returned "
diff --git a/tools/perf/tests/bp_signal.c b/tools/perf/tests/bp_signal.c
index aba095489193..fdc0d3e185f9 100644
--- a/tools/perf/tests/bp_signal.c
+++ b/tools/perf/tests/bp_signal.c
@@ -25,6 +25,7 @@
 #include "tests.h"
 #include "debug.h"
 #include "perf.h"
+#include "../util/cloexec.h"
 
 static int fd1;
 static int fd2;
@@ -78,7 +79,8 @@ static int bp_event(void *fn, int setup_signal)
 	pe.exclude_kernel = 1;
 	pe.exclude_hv = 1;
 
-	fd = sys_perf_event_open(&pe, 0, -1, -1, 0);
+	fd = sys_perf_event_open(&pe, 0, -1, -1,
+				 perf_event_open_cloexec_flag());
 	if (fd < 0) {
 		pr_debug("failed opening event %llx\n", pe.config);
 		return TEST_FAIL;
diff --git a/tools/perf/tests/bp_signal_overflow.c b/tools/perf/tests/bp_signal_overflow.c
index 44ac82179708..b0b17415f18c 100644
--- a/tools/perf/tests/bp_signal_overflow.c
+++ b/tools/perf/tests/bp_signal_overflow.c
@@ -24,6 +24,7 @@
 #include "tests.h"
 #include "debug.h"
 #include "perf.h"
+#include "../util/cloexec.h"
 
 static int overflows;
 
@@ -91,7 +92,8 @@ int test__bp_signal_overflow(void)
 	pe.exclude_kernel = 1;
 	pe.exclude_hv = 1;
 
-	fd = sys_perf_event_open(&pe, 0, -1, -1, 0);
+	fd = sys_perf_event_open(&pe, 0, -1, -1,
+				 perf_event_open_cloexec_flag());
 	if (fd < 0) {
 		pr_debug("failed opening event %llx\n", pe.config);
 		return TEST_FAIL;
diff --git a/tools/perf/tests/rdpmc.c b/tools/perf/tests/rdpmc.c
index e59143fd9e71..8432bfcffed6 100644
--- a/tools/perf/tests/rdpmc.c
+++ b/tools/perf/tests/rdpmc.c
@@ -6,6 +6,7 @@
 #include "perf.h"
 #include "debug.h"
 #include "tests.h"
+#include "../util/cloexec.h"
 
 #if defined(__x86_64__) || defined(__i386__)
 
@@ -104,7 +105,8 @@ static int __test__rdpmc(void)
 	sa.sa_sigaction = segfault_handler;
 	sigaction(SIGSEGV, &sa, NULL);
 
-	fd = sys_perf_event_open(&attr, 0, -1, -1, 0);
+	fd = sys_perf_event_open(&attr, 0, -1, -1,
+				 perf_event_open_cloexec_flag());
 	if (fd < 0) {
 		pr_err("Error: sys_perf_event_open() syscall returned "
 		       "with %d (%s)\n", fd, strerror(errno));
diff --git a/tools/perf/util/cloexec.c b/tools/perf/util/cloexec.c
new file mode 100644
index 000000000000..771ed36fcc61
--- /dev/null
+++ b/tools/perf/util/cloexec.c
@@ -0,0 +1,56 @@
+#include "util.h"
+#include "../perf.h"
+#include "cloexec.h"
+
+static unsigned long flag = PERF_FLAG_FD_CLOEXEC;
+
+static int perf_flag_probe(void)
+{
+	/* use 'safest' configuration as used in perf_evsel__fallback() */
+	struct perf_event_attr attr = {
+		.type = PERF_COUNT_SW_CPU_CLOCK,
+		.config = PERF_COUNT_SW_CPU_CLOCK,
+	};
+	int fd;
+	int err;
+
+	/* check cloexec flag */
+	fd = sys_perf_event_open(&attr, 0, -1, -1,
+				 PERF_FLAG_FD_CLOEXEC);
+	if (fd >= 0) {
+		close(fd);
+		return 1;
+	}
+
+	if (errno != EINVAL) {
+		err = errno;
+		pr_warning("sys_perf_event_open() syscall returned "
+			   "%d (%d: %s)\n", fd, err, strerror(err));
+	}
+
+	/* not supported, confirm error related to PERF_FLAG_FD_CLOEXEC */
+	fd = sys_perf_event_open(&attr, 0, -1, -1, 0);
+	if (fd >= 0) {
+		close(fd);
+		return 0;
+	}
+
+	err = errno;
+	die("sys_perf_event_open() syscall returned "
+	    "%d (%d: %s)\n", fd, err, strerror(err));
+
+	return -1;
+}
+
+unsigned long perf_event_open_cloexec_flag(void)
+{
+	static bool probed;
+
+	if (!probed) {
+		if (perf_flag_probe() <= 0)
+			flag = 0;
+		probed = true;
+	}
+
+	return flag;
+}
diff --git a/tools/perf/util/cloexec.h b/tools/perf/util/cloexec.h
new file mode 100644
index 000000000000..94a5a7d829d5
--- /dev/null
+++ b/tools/perf/util/cloexec.h
@@ -0,0 +1,6 @@
+#ifndef __PERF_CLOEXEC_H
+#define __PERF_CLOEXEC_H
+
+unsigned long perf_event_open_cloexec_flag(void);
+
+#endif /* __PERF_CLOEXEC_H */
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 5c28d82b76c4..d32019461993 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -29,6 +29,7 @@ static struct {
 	bool sample_id_all;
 	bool exclude_guest;
 	bool mmap2;
+	bool cloexec;
 } perf_missing_features;
 
 #define FD(e, x, y) (*(int *)xyarray__entry(e->fd, x, y))
@@ -988,7 +989,7 @@ static int __perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus,
 			      struct thread_map *threads)
 {
 	int cpu, thread;
-	unsigned long flags = 0;
+	unsigned long flags = PERF_FLAG_FD_CLOEXEC;
 	int pid = -1, err;
 	enum { NO_CHANGE, SET_TO_MAX, INCREASED_MAX } set_rlimit = NO_CHANGE;
 
@@ -997,11 +998,13 @@ static int __perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus,
 		return -ENOMEM;
 
 	if (evsel->cgrp) {
-		flags = PERF_FLAG_PID_CGROUP;
+		flags |= PERF_FLAG_PID_CGROUP;
 		pid = evsel->cgrp->fd;
 	}
 
 fallback_missing_features:
+	if (perf_missing_features.cloexec)
+		flags &= ~(unsigned long)PERF_FLAG_FD_CLOEXEC;
 	if (perf_missing_features.mmap2)
 		evsel->attr.mmap2 = 0;
 	if (perf_missing_features.exclude_guest)
@@ -1070,7 +1073,10 @@ try_fallback:
 	if (err != -EINVAL || cpu > 0 || thread > 0)
 		goto out_close;
 
-	if (!perf_missing_features.mmap2 && evsel->attr.mmap2) {
+	if (!perf_missing_features.cloexec && (flags & PERF_FLAG_FD_CLOEXEC)) {
+		perf_missing_features.cloexec = true;
+		goto fallback_missing_features;
+	} else if (!perf_missing_features.mmap2 && evsel->attr.mmap2) {
 		perf_missing_features.mmap2 = true;
 		goto fallback_missing_features;
 	} else if (!perf_missing_features.exclude_guest &&
diff --git a/tools/perf/util/record.c b/tools/perf/util/record.c
index 049e0a09ccd3..9a8d622809f6 100644
--- a/tools/perf/util/record.c
+++ b/tools/perf/util/record.c
@@ -4,6 +4,7 @@
 #include "parse-events.h"
 #include <api/fs/fs.h>
 #include "util.h"
+#include "cloexec.h"
 
 typedef void (*setup_probe_fn_t)(struct perf_evsel *evsel);
 
@@ -11,6 +12,7 @@ static int perf_do_probe_api(setup_probe_fn_t fn, int cpu, const char *str)
 {
 	struct perf_evlist *evlist;
 	struct perf_evsel *evsel;
+	unsigned long flags = perf_event_open_cloexec_flag();
 	int err = -EAGAIN, fd;
 
 	evlist = perf_evlist__new();
@@ -22,14 +24,14 @@ static int perf_do_probe_api(setup_probe_fn_t fn, int cpu, const char *str)
 
 	evsel = perf_evlist__first(evlist);
 
-	fd = sys_perf_event_open(&evsel->attr, -1, cpu, -1, 0);
+	fd = sys_perf_event_open(&evsel->attr, -1, cpu, -1, flags);
 	if (fd < 0)
 		goto out_delete;
 	close(fd);
 
 	fn(evsel);
 
-	fd = sys_perf_event_open(&evsel->attr, -1, cpu, -1, 0);
+	fd = sys_perf_event_open(&evsel->attr, -1, cpu, -1, flags);
 	if (fd < 0) {
 		if (errno == EINVAL)
 			err = -EINVAL;
@@ -203,7 +205,8 @@ bool perf_evlist__can_select_event(struct perf_evlist *evlist, const char *str)
 		cpu = evlist->cpus->map[0];
 	}
 
-	fd = sys_perf_event_open(&evsel->attr, -1, cpu, -1, 0);
+	fd = sys_perf_event_open(&evsel->attr, -1, cpu, -1,
+				 perf_event_open_cloexec_flag());
 	if (fd >= 0) {
 		close(fd);
 		ret = true;
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [PATCHv5] perf tools: enable close-on-exec flag on perf file descriptor
  2014-06-02 10:56                   ` [PATCHv5] " Yann Droneaud
@ 2014-06-02 19:23                     ` Jiri Olsa
  2014-06-03  8:57                       ` Yann Droneaud
  2014-06-30 20:28                       ` [PATCHv6] " Yann Droneaud
  0 siblings, 2 replies; 38+ messages in thread
From: Jiri Olsa @ 2014-06-02 19:23 UTC (permalink / raw)
  To: Yann Droneaud
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Paul Mackerras,
	Ingo Molnar, Namhyung Kim, Andi Kleen, David Ahern,
	Frederic Weisbecker, Mike Galbraith, Stephane Eranian,
	Adrian Hunter, Benjamin Herrenschmidt, Michael Ellerman,
	Peter Zijlstra, linux-kernel, Ingo Molnar

On Mon, Jun 02, 2014 at 12:56:34PM +0200, Yann Droneaud wrote:

SNIP

> 
> Hi,
> 
> Quite the same patch from v4. I'm interested in some
> feedback so that I could improve the patch if needed.
> 
> Regards.
> 
> Changes from v4 [1]:
> - rebase on next-20140530 and update commit message.

hi,
I wasn't following on this one before.. so please shut me up
if my comments were already discussed ;-)

SNIP

> index d7176830b9b2..bb4afc9fae54 100644
> --- a/tools/perf/builtin-sched.c
> +++ b/tools/perf/builtin-sched.c
> @@ -10,6 +10,7 @@
>  #include "util/header.h"
>  #include "util/session.h"
>  #include "util/tool.h"
> +#include "util/cloexec.h"
>  
>  #include "util/parse-options.h"
>  #include "util/trace-event.h"
> @@ -434,7 +435,8 @@ static int self_open_counters(void)
>  	attr.type = PERF_TYPE_SOFTWARE;
>  	attr.config = PERF_COUNT_SW_TASK_CLOCK;
>  
> -	fd = sys_perf_event_open(&attr, 0, -1, -1, 0);
> +	fd = sys_perf_event_open(&attr, 0, -1, -1,
> +				 perf_event_open_cloexec_flag());
>  
>  	if (fd < 0)
>  		pr_err("Error: sys_perf_event_open() syscall returned "
> diff --git a/tools/perf/tests/bp_signal.c b/tools/perf/tests/bp_signal.c
> index aba095489193..fdc0d3e185f9 100644
> --- a/tools/perf/tests/bp_signal.c
> +++ b/tools/perf/tests/bp_signal.c
> @@ -25,6 +25,7 @@
>  #include "tests.h"
>  #include "debug.h"
>  #include "perf.h"
> +#include "../util/cloexec.h"

#include "cloexec.h" should be enough

>  
>  static int fd1;
>  static int fd2;
> @@ -78,7 +79,8 @@ static int bp_event(void *fn, int setup_signal)
>  	pe.exclude_kernel = 1;
>  	pe.exclude_hv = 1;
>  
> -	fd = sys_perf_event_open(&pe, 0, -1, -1, 0);
> +	fd = sys_perf_event_open(&pe, 0, -1, -1,
> +				 perf_event_open_cloexec_flag());
>  	if (fd < 0) {
>  		pr_debug("failed opening event %llx\n", pe.config);
>  		return TEST_FAIL;
> diff --git a/tools/perf/tests/bp_signal_overflow.c b/tools/perf/tests/bp_signal_overflow.c
> index 44ac82179708..b0b17415f18c 100644
> --- a/tools/perf/tests/bp_signal_overflow.c
> +++ b/tools/perf/tests/bp_signal_overflow.c
> @@ -24,6 +24,7 @@
>  #include "tests.h"
>  #include "debug.h"
>  #include "perf.h"
> +#include "../util/cloexec.h"

ditto

>  
>  static int overflows;
>  
> @@ -91,7 +92,8 @@ int test__bp_signal_overflow(void)
>  	pe.exclude_kernel = 1;
>  	pe.exclude_hv = 1;
>  
> -	fd = sys_perf_event_open(&pe, 0, -1, -1, 0);
> +	fd = sys_perf_event_open(&pe, 0, -1, -1,
> +				 perf_event_open_cloexec_flag());
>  	if (fd < 0) {
>  		pr_debug("failed opening event %llx\n", pe.config);
>  		return TEST_FAIL;
> diff --git a/tools/perf/tests/rdpmc.c b/tools/perf/tests/rdpmc.c
> index e59143fd9e71..8432bfcffed6 100644
> --- a/tools/perf/tests/rdpmc.c
> +++ b/tools/perf/tests/rdpmc.c
> @@ -6,6 +6,7 @@
>  #include "perf.h"
>  #include "debug.h"
>  #include "tests.h"
> +#include "../util/cloexec.h"

ditto

SNIP

> +
> +static int perf_flag_probe(void)
> +{
> +	/* use 'safest' configuration as used in perf_evsel__fallback() */
> +	struct perf_event_attr attr = {
> +		.type = PERF_COUNT_SW_CPU_CLOCK,
> +		.config = PERF_COUNT_SW_CPU_CLOCK,
> +	};
> +	int fd;
> +	int err;
> +
> +	/* check cloexec flag */
> +	fd = sys_perf_event_open(&attr, 0, -1, -1,
> +				 PERF_FLAG_FD_CLOEXEC);
> +	if (fd >= 0) {
> +		close(fd);
> +		return 1;
> +	}
> +
> +	if (errno != EINVAL) {
> +		err = errno;
> +		pr_warning("sys_perf_event_open() syscall returned "
> +			   "%d (%d: %s)\n", fd, err, strerror(err));
> +	}
> +
> +	/* not supported, confirm error related to PERF_FLAG_FD_CLOEXEC */
> +	fd = sys_perf_event_open(&attr, 0, -1, -1, 0);
> +	if (fd >= 0) {
> +		close(fd);
> +		return 0;
> +	}
> +
> +	err = errno;
> +	die("sys_perf_event_open() syscall returned "
> +	    "%d (%d: %s)\n", fd, err, strerror(err));

I think we have a strategy of not using die in new code..
please just use WARN_ONCE.. and let the code fail in the
standard way

SNIP

>  
>  #define FD(e, x, y) (*(int *)xyarray__entry(e->fd, x, y))
> @@ -988,7 +989,7 @@ static int __perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus,
>  			      struct thread_map *threads)
>  {
>  	int cpu, thread;
> -	unsigned long flags = 0;
> +	unsigned long flags = PERF_FLAG_FD_CLOEXEC;
>  	int pid = -1, err;
>  	enum { NO_CHANGE, SET_TO_MAX, INCREASED_MAX } set_rlimit = NO_CHANGE;
>  
> @@ -997,11 +998,13 @@ static int __perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus,
>  		return -ENOMEM;
>  
>  	if (evsel->cgrp) {
> -		flags = PERF_FLAG_PID_CGROUP;
> +		flags |= PERF_FLAG_PID_CGROUP;
>  		pid = evsel->cgrp->fd;
>  	}
>  
>  fallback_missing_features:
> +	if (perf_missing_features.cloexec)
> +		flags &= ~(unsigned long)PERF_FLAG_FD_CLOEXEC;
>  	if (perf_missing_features.mmap2)
>  		evsel->attr.mmap2 = 0;
>  	if (perf_missing_features.exclude_guest)
> @@ -1070,7 +1073,10 @@ try_fallback:
>  	if (err != -EINVAL || cpu > 0 || thread > 0)
>  		goto out_close;
>  
> -	if (!perf_missing_features.mmap2 && evsel->attr.mmap2) {
> +	if (!perf_missing_features.cloexec && (flags & PERF_FLAG_FD_CLOEXEC)) {
> +		perf_missing_features.cloexec = true;
> +		goto fallback_missing_features;

I think it does not fit in here, because we check latest perf_event_attr
bits and removing one after another, to see if miracle happens.
(also looks like we miss exclude_callchain_(kernel|user) bits)

Wouldn't it be better just to use perf_event_open_cloexec_flag() function
as you did in the rest of the code? I think we dont need 2 detection codes
for this.

> +	} else if (!perf_missing_features.mmap2 && evsel->attr.mmap2) {
>  		perf_missing_features.mmap2 = true;
>  		goto fallback_missing_features;
>  	} else if (!perf_missing_features.exclude_guest &&
> diff --git a/tools/perf/util/record.c b/tools/perf/util/record.c
> index 049e0a09ccd3..9a8d622809f6 100644
> --- a/tools/perf/util/record.c
> +++ b/tools/perf/util/record.c
> @@ -4,6 +4,7 @@
>  #include "parse-events.h"
>  #include <api/fs/fs.h>
>  #include "util.h"
> +#include "cloexec.h"
>  
>  typedef void (*setup_probe_fn_t)(struct perf_evsel *evsel);
>  
> @@ -11,6 +12,7 @@ static int perf_do_probe_api(setup_probe_fn_t fn, int cpu, const char *str)
>  {
>  	struct perf_evlist *evlist;
>  	struct perf_evsel *evsel;
> +	unsigned long flags = perf_event_open_cloexec_flag();
>  	int err = -EAGAIN, fd;
>  
>  	evlist = perf_evlist__new();
> @@ -22,14 +24,14 @@ static int perf_do_probe_api(setup_probe_fn_t fn, int cpu, const char *str)
>  
>  	evsel = perf_evlist__first(evlist);
>  
> -	fd = sys_perf_event_open(&evsel->attr, -1, cpu, -1, 0);
> +	fd = sys_perf_event_open(&evsel->attr, -1, cpu, -1, flags);
>  	if (fd < 0)
>  		goto out_delete;
>  	close(fd);
>  
>  	fn(evsel);
>  
> -	fd = sys_perf_event_open(&evsel->attr, -1, cpu, -1, 0);
> +	fd = sys_perf_event_open(&evsel->attr, -1, cpu, -1, flags);
>  	if (fd < 0) {
>  		if (errno == EINVAL)
>  			err = -EINVAL;
> @@ -203,7 +205,8 @@ bool perf_evlist__can_select_event(struct perf_evlist *evlist, const char *str)
>  		cpu = evlist->cpus->map[0];
>  	}

hum... no one calls this exported function ^^^ :-) I wonder why it's there.. cool

thanks,
jirka

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCHv5] perf tools: enable close-on-exec flag on perf file descriptor
  2014-06-02 19:23                     ` Jiri Olsa
@ 2014-06-03  8:57                       ` Yann Droneaud
  2014-06-03  9:23                         ` Adrian Hunter
  2014-06-03 11:51                         ` Jiri Olsa
  2014-06-30 20:28                       ` [PATCHv6] " Yann Droneaud
  1 sibling, 2 replies; 38+ messages in thread
From: Yann Droneaud @ 2014-06-03  8:57 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Paul Mackerras,
	Ingo Molnar, Namhyung Kim, Andi Kleen, David Ahern,
	Frederic Weisbecker, Mike Galbraith, Stephane Eranian,
	Adrian Hunter, Benjamin Herrenschmidt, Michael Ellerman,
	Peter Zijlstra, linux-kernel, Ingo Molnar, AdrianHunter,
	ydroneaud

Le lundi 02 juin 2014 à 21:23 +0200, Jiri Olsa a écrit :
> On Mon, Jun 02, 2014 at 12:56:34PM +0200, Yann Droneaud wrote:
> 
> SNIP
> 
> > 
> > Hi,
> > 
> > Quite the same patch from v4. I'm interested in some
> > feedback so that I could improve the patch if needed.
> > 
> > Regards.
> > 
> > Changes from v4 [1]:
> > - rebase on next-20140530 and update commit message.
> 
> hi,
> I wasn't following on this one before.. so please shut me up
> if my comments were already discussed ;-)
> 

You had a chance to review it before :)
See http://lkml.kernel.org/r/20140106142220.GB1183@krava.brq.redhat.com

> SNIP
> 
> > index d7176830b9b2..bb4afc9fae54 100644
> > --- a/tools/perf/builtin-sched.c
> > +++ b/tools/perf/builtin-sched.c
> > @@ -10,6 +10,7 @@
> >  #include "util/header.h"
> >  #include "util/session.h"
> >  #include "util/tool.h"
> > +#include "util/cloexec.h"
> >  
> >  #include "util/parse-options.h"
> >  #include "util/trace-event.h"
> > @@ -434,7 +435,8 @@ static int self_open_counters(void)
> >  	attr.type = PERF_TYPE_SOFTWARE;
> >  	attr.config = PERF_COUNT_SW_TASK_CLOCK;
> >  
> > -	fd = sys_perf_event_open(&attr, 0, -1, -1, 0);
> > +	fd = sys_perf_event_open(&attr, 0, -1, -1,
> > +				 perf_event_open_cloexec_flag());
> >  
> >  	if (fd < 0)
> >  		pr_err("Error: sys_perf_event_open() syscall returned "
> > diff --git a/tools/perf/tests/bp_signal.c b/tools/perf/tests/bp_signal.c
> > index aba095489193..fdc0d3e185f9 100644
> > --- a/tools/perf/tests/bp_signal.c
> > +++ b/tools/perf/tests/bp_signal.c
> > @@ -25,6 +25,7 @@
> >  #include "tests.h"
> >  #include "debug.h"
> >  #include "perf.h"
> > +#include "../util/cloexec.h"
> 
> #include "cloexec.h" should be enough
> 

OK

> >  
> >  static int fd1;
> >  static int fd2;
> > @@ -78,7 +79,8 @@ static int bp_event(void *fn, int setup_signal)
> >  	pe.exclude_kernel = 1;
> >  	pe.exclude_hv = 1;
> >  
> > -	fd = sys_perf_event_open(&pe, 0, -1, -1, 0);
> > +	fd = sys_perf_event_open(&pe, 0, -1, -1,
> > +				 perf_event_open_cloexec_flag());
> >  	if (fd < 0) {
> >  		pr_debug("failed opening event %llx\n", pe.config);
> >  		return TEST_FAIL;
> > diff --git a/tools/perf/tests/bp_signal_overflow.c b/tools/perf/tests/bp_signal_overflow.c
> > index 44ac82179708..b0b17415f18c 100644
> > --- a/tools/perf/tests/bp_signal_overflow.c
> > +++ b/tools/perf/tests/bp_signal_overflow.c
> > @@ -24,6 +24,7 @@
> >  #include "tests.h"
> >  #include "debug.h"
> >  #include "perf.h"
> > +#include "../util/cloexec.h"
> 
> ditto
> 

OK

> >  
> >  static int overflows;
> >  
> > @@ -91,7 +92,8 @@ int test__bp_signal_overflow(void)
> >  	pe.exclude_kernel = 1;
> >  	pe.exclude_hv = 1;
> >  
> > -	fd = sys_perf_event_open(&pe, 0, -1, -1, 0);
> > +	fd = sys_perf_event_open(&pe, 0, -1, -1,
> > +				 perf_event_open_cloexec_flag());
> >  	if (fd < 0) {
> >  		pr_debug("failed opening event %llx\n", pe.config);
> >  		return TEST_FAIL;
> > diff --git a/tools/perf/tests/rdpmc.c b/tools/perf/tests/rdpmc.c
> > index e59143fd9e71..8432bfcffed6 100644
> > --- a/tools/perf/tests/rdpmc.c
> > +++ b/tools/perf/tests/rdpmc.c
> > @@ -6,6 +6,7 @@
> >  #include "perf.h"
> >  #include "debug.h"
> >  #include "tests.h"
> > +#include "../util/cloexec.h"
> 
> ditto
> 

OK

> SNIP
> 
> > +
> > +static int perf_flag_probe(void)
> > +{
> > +	/* use 'safest' configuration as used in perf_evsel__fallback() */
> > +	struct perf_event_attr attr = {
> > +		.type = PERF_COUNT_SW_CPU_CLOCK,
> > +		.config = PERF_COUNT_SW_CPU_CLOCK,
> > +	};
> > +	int fd;
> > +	int err;
> > +
> > +	/* check cloexec flag */
> > +	fd = sys_perf_event_open(&attr, 0, -1, -1,
> > +				 PERF_FLAG_FD_CLOEXEC);
> > +	if (fd >= 0) {
> > +		close(fd);
> > +		return 1;
> > +	}
> > +
> > +	if (errno != EINVAL) {
> > +		err = errno;
> > +		pr_warning("sys_perf_event_open() syscall returned "
> > +			   "%d (%d: %s)\n", fd, err, strerror(err));
> > +	}
> > +
> > +	/* not supported, confirm error related to PERF_FLAG_FD_CLOEXEC */
> > +	fd = sys_perf_event_open(&attr, 0, -1, -1, 0);
> > +	if (fd >= 0) {
> > +		close(fd);
> > +		return 0;
> > +	}
> > +
> > +	err = errno;
> > +	die("sys_perf_event_open() syscall returned "
> > +	    "%d (%d: %s)\n", fd, err, strerror(err));
> 
> I think we have a strategy of not using die in new code..
> please just use WARN_ONCE.. and let the code fail in the
> standard way
> 

OK


> SNIP
> 
> >  
> >  #define FD(e, x, y) (*(int *)xyarray__entry(e->fd, x, y))
> > @@ -988,7 +989,7 @@ static int __perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus,
> >  			      struct thread_map *threads)
> >  {
> >  	int cpu, thread;
> > -	unsigned long flags = 0;
> > +	unsigned long flags = PERF_FLAG_FD_CLOEXEC;
> >  	int pid = -1, err;
> >  	enum { NO_CHANGE, SET_TO_MAX, INCREASED_MAX } set_rlimit = NO_CHANGE;
> >  
> > @@ -997,11 +998,13 @@ static int __perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus,
> >  		return -ENOMEM;
> >  
> >  	if (evsel->cgrp) {
> > -		flags = PERF_FLAG_PID_CGROUP;
> > +		flags |= PERF_FLAG_PID_CGROUP;
> >  		pid = evsel->cgrp->fd;
> >  	}
> >  
> >  fallback_missing_features:
> > +	if (perf_missing_features.cloexec)
> > +		flags &= ~(unsigned long)PERF_FLAG_FD_CLOEXEC;
> >  	if (perf_missing_features.mmap2)
> >  		evsel->attr.mmap2 = 0;
> >  	if (perf_missing_features.exclude_guest)
> > @@ -1070,7 +1073,10 @@ try_fallback:
> >  	if (err != -EINVAL || cpu > 0 || thread > 0)
> >  		goto out_close;
> >  
> > -	if (!perf_missing_features.mmap2 && evsel->attr.mmap2) {
> > +	if (!perf_missing_features.cloexec && (flags & PERF_FLAG_FD_CLOEXEC)) {
> > +		perf_missing_features.cloexec = true;
> > +		goto fallback_missing_features;
> 
> I think it does not fit in here, because we check latest perf_event_attr
> bits and removing one after another, to see if miracle happens.
> (also looks like we miss exclude_callchain_(kernel|user) bits)
> 
> Wouldn't it be better just to use perf_event_open_cloexec_flag() function
> as you did in the rest of the code? I think we dont need 2 detection codes
> for this.
> 

I was asked by ACME to use here the missing feature detection pattern
and to treat the flag just as a new feature like perf_event_attr new
attributes.

See http://lkml.kernel.org/r/20140106144347.GA13500@ghostprotocols.net
    http://lkml.kernel.org/r/20140106211444.GD2810@ghostprotocols.net

> > +	} else if (!perf_missing_features.mmap2 && evsel->attr.mmap2) {
> >  		perf_missing_features.mmap2 = true;
> >  		goto fallback_missing_features;
> >  	} else if (!perf_missing_features.exclude_guest &&
> > diff --git a/tools/perf/util/record.c b/tools/perf/util/record.c
> > index 049e0a09ccd3..9a8d622809f6 100644
> > --- a/tools/perf/util/record.c
> > +++ b/tools/perf/util/record.c
> > @@ -4,6 +4,7 @@
> >  #include "parse-events.h"
> >  #include <api/fs/fs.h>
> >  #include "util.h"
> > +#include "cloexec.h"
> >  
> >  typedef void (*setup_probe_fn_t)(struct perf_evsel *evsel);
> >  
> > @@ -11,6 +12,7 @@ static int perf_do_probe_api(setup_probe_fn_t fn, int cpu, const char *str)
> >  {
> >  	struct perf_evlist *evlist;
> >  	struct perf_evsel *evsel;
> > +	unsigned long flags = perf_event_open_cloexec_flag();
> >  	int err = -EAGAIN, fd;
> >  
> >  	evlist = perf_evlist__new();
> > @@ -22,14 +24,14 @@ static int perf_do_probe_api(setup_probe_fn_t fn, int cpu, const char *str)
> >  
> >  	evsel = perf_evlist__first(evlist);
> >  
> > -	fd = sys_perf_event_open(&evsel->attr, -1, cpu, -1, 0);
> > +	fd = sys_perf_event_open(&evsel->attr, -1, cpu, -1, flags);
> >  	if (fd < 0)
> >  		goto out_delete;
> >  	close(fd);
> >  
> >  	fn(evsel);
> >  
> > -	fd = sys_perf_event_open(&evsel->attr, -1, cpu, -1, 0);
> > +	fd = sys_perf_event_open(&evsel->attr, -1, cpu, -1, flags);
> >  	if (fd < 0) {
> >  		if (errno == EINVAL)
> >  			err = -EINVAL;
> > @@ -203,7 +205,8 @@ bool perf_evlist__can_select_event(struct perf_evlist *evlist, const char *str)
> >  		cpu = evlist->cpus->map[0];
> >  	}
> 
> hum... no one calls this exported function ^^^ :-) I wonder why it's there.. cool
> 

Nice catch. This function was introduced by commit 
c09ec622629eeb4b7877646a42852e7156363425 ('perf evlist: Add 
can_select_event() method') by Adrian Hunter <adrian.hunter@intel.com>
and it seems it was never used by perf tools.
Is it available to some outside tree perf tools ?


Thanks for the review.

Regards

-- 
Yann Droneaud
OPTEYA



^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCHv5] perf tools: enable close-on-exec flag on perf file descriptor
  2014-06-03  8:57                       ` Yann Droneaud
@ 2014-06-03  9:23                         ` Adrian Hunter
  2014-06-03 11:51                         ` Jiri Olsa
  1 sibling, 0 replies; 38+ messages in thread
From: Adrian Hunter @ 2014-06-03  9:23 UTC (permalink / raw)
  To: Yann Droneaud, Jiri Olsa
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Paul Mackerras,
	Ingo Molnar, Namhyung Kim, Andi Kleen, David Ahern,
	Frederic Weisbecker, Mike Galbraith, Stephane Eranian,
	Benjamin Herrenschmidt, Michael Ellerman, Peter Zijlstra,
	linux-kernel, Ingo Molnar

On 06/03/2014 11:57 AM, Yann Droneaud wrote:
> Nice catch. This function was introduced by commit 
> c09ec622629eeb4b7877646a42852e7156363425 ('perf evlist: Add 
> can_select_event() method') by Adrian Hunter <adrian.hunter@intel.com>
> and it seems it was never used by perf tools.

It is part of preparation for Intel PT.


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCHv5] perf tools: enable close-on-exec flag on perf file descriptor
  2014-06-03  8:57                       ` Yann Droneaud
  2014-06-03  9:23                         ` Adrian Hunter
@ 2014-06-03 11:51                         ` Jiri Olsa
  1 sibling, 0 replies; 38+ messages in thread
From: Jiri Olsa @ 2014-06-03 11:51 UTC (permalink / raw)
  To: Yann Droneaud
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Paul Mackerras,
	Ingo Molnar, Namhyung Kim, Andi Kleen, David Ahern,
	Frederic Weisbecker, Mike Galbraith, Stephane Eranian,
	Adrian Hunter, Benjamin Herrenschmidt, Michael Ellerman,
	Peter Zijlstra, linux-kernel, Ingo Molnar

On Tue, Jun 03, 2014 at 10:57:15AM +0200, Yann Droneaud wrote:
> Le lundi 02 juin 2014 à 21:23 +0200, Jiri Olsa a écrit :
> > On Mon, Jun 02, 2014 at 12:56:34PM +0200, Yann Droneaud wrote:
> > 
> > SNIP
> > 
> > > 
> > > Hi,
> > > 
> > > Quite the same patch from v4. I'm interested in some
> > > feedback so that I could improve the patch if needed.
> > > 
> > > Regards.
> > > 
> > > Changes from v4 [1]:
> > > - rebase on next-20140530 and update commit message.
> > 
> > hi,
> > I wasn't following on this one before.. so please shut me up
> > if my comments were already discussed ;-)
> > 
> 
> You had a chance to review it before :)
> See http://lkml.kernel.org/r/20140106142220.GB1183@krava.brq.redhat.com

yep, but then I drift away ;-)

SNIP

> > >  
> > >  #define FD(e, x, y) (*(int *)xyarray__entry(e->fd, x, y))
> > > @@ -988,7 +989,7 @@ static int __perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus,
> > >  			      struct thread_map *threads)
> > >  {
> > >  	int cpu, thread;
> > > -	unsigned long flags = 0;
> > > +	unsigned long flags = PERF_FLAG_FD_CLOEXEC;
> > >  	int pid = -1, err;
> > >  	enum { NO_CHANGE, SET_TO_MAX, INCREASED_MAX } set_rlimit = NO_CHANGE;
> > >  
> > > @@ -997,11 +998,13 @@ static int __perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus,
> > >  		return -ENOMEM;
> > >  
> > >  	if (evsel->cgrp) {
> > > -		flags = PERF_FLAG_PID_CGROUP;
> > > +		flags |= PERF_FLAG_PID_CGROUP;
> > >  		pid = evsel->cgrp->fd;
> > >  	}
> > >  
> > >  fallback_missing_features:
> > > +	if (perf_missing_features.cloexec)
> > > +		flags &= ~(unsigned long)PERF_FLAG_FD_CLOEXEC;
> > >  	if (perf_missing_features.mmap2)
> > >  		evsel->attr.mmap2 = 0;
> > >  	if (perf_missing_features.exclude_guest)
> > > @@ -1070,7 +1073,10 @@ try_fallback:
> > >  	if (err != -EINVAL || cpu > 0 || thread > 0)
> > >  		goto out_close;
> > >  
> > > -	if (!perf_missing_features.mmap2 && evsel->attr.mmap2) {
> > > +	if (!perf_missing_features.cloexec && (flags & PERF_FLAG_FD_CLOEXEC)) {
> > > +		perf_missing_features.cloexec = true;
> > > +		goto fallback_missing_features;
> > 
> > I think it does not fit in here, because we check latest perf_event_attr
> > bits and removing one after another, to see if miracle happens.
> > (also looks like we miss exclude_callchain_(kernel|user) bits)
> > 
> > Wouldn't it be better just to use perf_event_open_cloexec_flag() function
> > as you did in the rest of the code? I think we dont need 2 detection codes
> > for this.
> > 
> 
> I was asked by ACME to use here the missing feature detection pattern
> and to treat the flag just as a new feature like perf_event_attr new
> attributes.
> 
> See http://lkml.kernel.org/r/20140106144347.GA13500@ghostprotocols.net
>     http://lkml.kernel.org/r/20140106211444.GD2810@ghostprotocols.net

ook, I probably missed that.. anyway I still feel like the perf_event_open_cloexec_flag
function should be enough.. I'll leave this one to Arnadlo

thanks,
jirka

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCHv6] perf tools: enable close-on-exec flag on perf file descriptor
  2014-06-02 19:23                     ` Jiri Olsa
  2014-06-03  8:57                       ` Yann Droneaud
@ 2014-06-30 20:28                       ` Yann Droneaud
  2014-07-12 23:28                         ` Jiri Olsa
  1 sibling, 1 reply; 38+ messages in thread
From: Yann Droneaud @ 2014-06-30 20:28 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Yann Droneaud, Peter Zijlstra, Paul Mackerras, Ingo Molnar,
	Jiri Olsa, Namhyung Kim, Andi Kleen, David Ahern,
	Frederic Weisbecker, Mike Galbraith, Stephane Eranian,
	Adrian Hunter, Benjamin Herrenschmidt, Michael Ellerman,
	Peter Zijlstra, linux-kernel, Ingo Molnar

In commit a21b0b354d4a ('perf: Introduce a flag to enable
close-on-exec in perf_event_open()'), flag PERF_FLAG_FD_CLOEXEC
was added to perf_event_open(2) syscall to allows userspace
to atomically enable close-on-exec behavor when creating
the file descriptor.

This patch makes perf tools use the new flag if supported
by the kernel, so that the event file descriptors got
automatically closed if perf tool exec a sub-command.

Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
---

Hi,

I've addressed the issues reported by Jiri Olsa,
please have a look at the update patch.

Regards.

Changes from v5 [1]
- replace "../util/cloexec.h" by "cloexec.h"
- replace die() by WARN_ONCE()
- replace pr_warning() by WARN_ONCE()
- rework warning messages

Changes from v4 [2]:
- rebase on next-20140530 and update commit message.

Changes from v3 [3]:
- rebase on next-20140307 and resend asis.

Changes from v2 [4]:
- use safest perf attr configuration in probe function,
  as used in perf_evsel__fallback().

Changes from v1 [5]:
- don't probe PERF_FLAG_FD_CLOEXEC for each call to
  perf_event_open_cloexec_flag(): don't forget to set
  'probed' variable once flag was probed.
- call perf_event_open_cloexec_flag() only once in
  util/record.c:perf_do_probe_api().
- fixed minor coding style issue (unneeded braces)
  in util/cloexec.c

Changes from v0 [6]:
- add probing for PERF_FLAG_FD_CLOEXEC flag allowing
  perf to run on older kernel:
  * use "missing feature" pattern in evsel to disable
    PERF_FLAG_FD_CLOEXEC in perf_evsel__open() if not
    supported by kernel;
  * add a dedicated function to probe for
    PERF_FLAG_FD_CLOEXEC support in the current kernel.
    This function is to be used by other functions
    calling sys_perf_event_open() directly.
- remove PERF_FLAG_FD_CLOEXEC from PowerPC selftest
  as it's not related to perf tool: it should be part
  of a separate patch.

[1] http://lkml.kernel.org/r/1401706594-4995-1-git-send-email-ydroneaud@opteya.com

[2] http://lkml.kernel.org/r/1394527147-7010-1-git-send-email-ydroneaud@opteya.com
    https://patchwork.kernel.org/patch/3809711/

[3] http://lkml.kernel.org/r/1390771203-28415-1-git-send-email-ydroneaud@opteya.com
    https://patchwork.kernel.org/patch/3540091/

[4] http://lkml.kernel.org/r/1389607770-4485-1-git-send-email-ydroneaud@opteya.com
    https://patchwork.kernel.org/patch/3474141/

[5] http://lkml.kernel.org/r/1389463628-24869-1-git-send-email-ydroneaud@opteya.com
    https://patchwork.kernel.org/patch/3469571/

[6] http://lkml.kernel.org/r/1389005485-12778-1-git-send-email-ydroneaud@opteya.com
    https://patchwork.kernel.org/patch/3437111/

 tools/perf/Makefile.perf              |  1 +
 tools/perf/bench/mem-memcpy.c         |  4 ++-
 tools/perf/bench/mem-memset.c         |  4 ++-
 tools/perf/builtin-sched.c            |  4 ++-
 tools/perf/tests/bp_signal.c          |  4 ++-
 tools/perf/tests/bp_signal_overflow.c |  4 ++-
 tools/perf/tests/rdpmc.c              |  4 ++-
 tools/perf/util/cloexec.c             | 57 +++++++++++++++++++++++++++++++++++
 tools/perf/util/cloexec.h             |  6 ++++
 tools/perf/util/evsel.c               | 12 ++++++--
 tools/perf/util/record.c              |  9 ++++--
 11 files changed, 97 insertions(+), 12 deletions(-)
 create mode 100644 tools/perf/util/cloexec.c
 create mode 100644 tools/perf/util/cloexec.h

diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index 9670a16fa577..66cde74ee851 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -373,6 +373,7 @@ LIB_OBJS += $(OUTPUT)util/stat.o
 LIB_OBJS += $(OUTPUT)util/record.o
 LIB_OBJS += $(OUTPUT)util/srcline.o
 LIB_OBJS += $(OUTPUT)util/data.o
+LIB_OBJS += $(OUTPUT)util/cloexec.o
 
 LIB_OBJS += $(OUTPUT)ui/setup.o
 LIB_OBJS += $(OUTPUT)ui/helpline.o
diff --git a/tools/perf/bench/mem-memcpy.c b/tools/perf/bench/mem-memcpy.c
index 5ce71d3b72cf..bf5a21b848a9 100644
--- a/tools/perf/bench/mem-memcpy.c
+++ b/tools/perf/bench/mem-memcpy.c
@@ -10,6 +10,7 @@
 #include "../util/util.h"
 #include "../util/parse-options.h"
 #include "../util/header.h"
+#include "../util/cloexec.h"
 #include "bench.h"
 #include "mem-memcpy-arch.h"
 
@@ -83,7 +84,8 @@ static struct perf_event_attr cycle_attr = {
 
 static void init_cycle(void)
 {
-	cycle_fd = sys_perf_event_open(&cycle_attr, getpid(), -1, -1, 0);
+	cycle_fd = sys_perf_event_open(&cycle_attr, getpid(), -1, -1,
+				       perf_event_open_cloexec_flag());
 
 	if (cycle_fd < 0 && errno == ENOSYS)
 		die("No CONFIG_PERF_EVENTS=y kernel support configured?\n");
diff --git a/tools/perf/bench/mem-memset.c b/tools/perf/bench/mem-memset.c
index 9af79d2b18e5..260747ea1e0e 100644
--- a/tools/perf/bench/mem-memset.c
+++ b/tools/perf/bench/mem-memset.c
@@ -10,6 +10,7 @@
 #include "../util/util.h"
 #include "../util/parse-options.h"
 #include "../util/header.h"
+#include "../util/cloexec.h"
 #include "bench.h"
 #include "mem-memset-arch.h"
 
@@ -83,7 +84,8 @@ static struct perf_event_attr cycle_attr = {
 
 static void init_cycle(void)
 {
-	cycle_fd = sys_perf_event_open(&cycle_attr, getpid(), -1, -1, 0);
+	cycle_fd = sys_perf_event_open(&cycle_attr, getpid(), -1, -1,
+				       perf_event_open_cloexec_flag());
 
 	if (cycle_fd < 0 && errno == ENOSYS)
 		die("No CONFIG_PERF_EVENTS=y kernel support configured?\n");
diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index c38d06c04775..9c686b342013 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -10,6 +10,7 @@
 #include "util/header.h"
 #include "util/session.h"
 #include "util/tool.h"
+#include "util/cloexec.h"
 
 #include "util/parse-options.h"
 #include "util/trace-event.h"
@@ -434,7 +435,8 @@ static int self_open_counters(void)
 	attr.type = PERF_TYPE_SOFTWARE;
 	attr.config = PERF_COUNT_SW_TASK_CLOCK;
 
-	fd = sys_perf_event_open(&attr, 0, -1, -1, 0);
+	fd = sys_perf_event_open(&attr, 0, -1, -1,
+				 perf_event_open_cloexec_flag());
 
 	if (fd < 0)
 		pr_err("Error: sys_perf_event_open() syscall returned "
diff --git a/tools/perf/tests/bp_signal.c b/tools/perf/tests/bp_signal.c
index aba095489193..a02b035fd5aa 100644
--- a/tools/perf/tests/bp_signal.c
+++ b/tools/perf/tests/bp_signal.c
@@ -25,6 +25,7 @@
 #include "tests.h"
 #include "debug.h"
 #include "perf.h"
+#include "cloexec.h"
 
 static int fd1;
 static int fd2;
@@ -78,7 +79,8 @@ static int bp_event(void *fn, int setup_signal)
 	pe.exclude_kernel = 1;
 	pe.exclude_hv = 1;
 
-	fd = sys_perf_event_open(&pe, 0, -1, -1, 0);
+	fd = sys_perf_event_open(&pe, 0, -1, -1,
+				 perf_event_open_cloexec_flag());
 	if (fd < 0) {
 		pr_debug("failed opening event %llx\n", pe.config);
 		return TEST_FAIL;
diff --git a/tools/perf/tests/bp_signal_overflow.c b/tools/perf/tests/bp_signal_overflow.c
index 44ac82179708..e76537724491 100644
--- a/tools/perf/tests/bp_signal_overflow.c
+++ b/tools/perf/tests/bp_signal_overflow.c
@@ -24,6 +24,7 @@
 #include "tests.h"
 #include "debug.h"
 #include "perf.h"
+#include "cloexec.h"
 
 static int overflows;
 
@@ -91,7 +92,8 @@ int test__bp_signal_overflow(void)
 	pe.exclude_kernel = 1;
 	pe.exclude_hv = 1;
 
-	fd = sys_perf_event_open(&pe, 0, -1, -1, 0);
+	fd = sys_perf_event_open(&pe, 0, -1, -1,
+				 perf_event_open_cloexec_flag());
 	if (fd < 0) {
 		pr_debug("failed opening event %llx\n", pe.config);
 		return TEST_FAIL;
diff --git a/tools/perf/tests/rdpmc.c b/tools/perf/tests/rdpmc.c
index e59143fd9e71..c04d1f268576 100644
--- a/tools/perf/tests/rdpmc.c
+++ b/tools/perf/tests/rdpmc.c
@@ -6,6 +6,7 @@
 #include "perf.h"
 #include "debug.h"
 #include "tests.h"
+#include "cloexec.h"
 
 #if defined(__x86_64__) || defined(__i386__)
 
@@ -104,7 +105,8 @@ static int __test__rdpmc(void)
 	sa.sa_sigaction = segfault_handler;
 	sigaction(SIGSEGV, &sa, NULL);
 
-	fd = sys_perf_event_open(&attr, 0, -1, -1, 0);
+	fd = sys_perf_event_open(&attr, 0, -1, -1,
+				 perf_event_open_cloexec_flag());
 	if (fd < 0) {
 		pr_err("Error: sys_perf_event_open() syscall returned "
 		       "with %d (%s)\n", fd, strerror(errno));
diff --git a/tools/perf/util/cloexec.c b/tools/perf/util/cloexec.c
new file mode 100644
index 000000000000..c5d05ec17220
--- /dev/null
+++ b/tools/perf/util/cloexec.c
@@ -0,0 +1,57 @@
+#include "util.h"
+#include "../perf.h"
+#include "cloexec.h"
+#include "asm/bug.h"
+
+static unsigned long flag = PERF_FLAG_FD_CLOEXEC;
+
+static int perf_flag_probe(void)
+{
+	/* use 'safest' configuration as used in perf_evsel__fallback() */
+	struct perf_event_attr attr = {
+		.type = PERF_COUNT_SW_CPU_CLOCK,
+		.config = PERF_COUNT_SW_CPU_CLOCK,
+	};
+	int fd;
+	int err;
+
+	/* check cloexec flag */
+	fd = sys_perf_event_open(&attr, 0, -1, -1,
+				 PERF_FLAG_FD_CLOEXEC);
+	err = errno;
+
+	if (fd >= 0) {
+		close(fd);
+		return 1;
+	}
+
+	WARN_ONCE(err != EINVAL,
+		  "perf_event_open(..., PERF_FLAG_FD_CLOEXEC) failed with unexpected error %d (%s)\n",
+		  err, strerror(err));
+
+	/* not supported, confirm error related to PERF_FLAG_FD_CLOEXEC */
+	fd = sys_perf_event_open(&attr, 0, -1, -1, 0);
+	err = errno;
+
+	if (WARN_ONCE(fd < 0,
+		      "perf_event_open(..., 0) failed unexpectedly with error %d (%s)\n",
+		      err, strerror(err)))
+		return -1;
+
+	close(fd);
+
+	return 0;
+}
+
+unsigned long perf_event_open_cloexec_flag(void)
+{
+	static bool probed;
+
+	if (!probed) {
+		if (perf_flag_probe() <= 0)
+			flag = 0;
+		probed = true;
+	}
+
+	return flag;
+}
diff --git a/tools/perf/util/cloexec.h b/tools/perf/util/cloexec.h
new file mode 100644
index 000000000000..94a5a7d829d5
--- /dev/null
+++ b/tools/perf/util/cloexec.h
@@ -0,0 +1,6 @@
+#ifndef __PERF_CLOEXEC_H
+#define __PERF_CLOEXEC_H
+
+unsigned long perf_event_open_cloexec_flag(void);
+
+#endif /* __PERF_CLOEXEC_H */
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 8606175fe1e8..8623840a6e4a 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -29,6 +29,7 @@ static struct {
 	bool sample_id_all;
 	bool exclude_guest;
 	bool mmap2;
+	bool cloexec;
 } perf_missing_features;
 
 #define FD(e, x, y) (*(int *)xyarray__entry(e->fd, x, y))
@@ -989,7 +990,7 @@ static int __perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus,
 			      struct thread_map *threads)
 {
 	int cpu, thread;
-	unsigned long flags = 0;
+	unsigned long flags = PERF_FLAG_FD_CLOEXEC;
 	int pid = -1, err;
 	enum { NO_CHANGE, SET_TO_MAX, INCREASED_MAX } set_rlimit = NO_CHANGE;
 
@@ -998,11 +999,13 @@ static int __perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus,
 		return -ENOMEM;
 
 	if (evsel->cgrp) {
-		flags = PERF_FLAG_PID_CGROUP;
+		flags |= PERF_FLAG_PID_CGROUP;
 		pid = evsel->cgrp->fd;
 	}
 
 fallback_missing_features:
+	if (perf_missing_features.cloexec)
+		flags &= ~(unsigned long)PERF_FLAG_FD_CLOEXEC;
 	if (perf_missing_features.mmap2)
 		evsel->attr.mmap2 = 0;
 	if (perf_missing_features.exclude_guest)
@@ -1071,7 +1074,10 @@ try_fallback:
 	if (err != -EINVAL || cpu > 0 || thread > 0)
 		goto out_close;
 
-	if (!perf_missing_features.mmap2 && evsel->attr.mmap2) {
+	if (!perf_missing_features.cloexec && (flags & PERF_FLAG_FD_CLOEXEC)) {
+		perf_missing_features.cloexec = true;
+		goto fallback_missing_features;
+	} else if (!perf_missing_features.mmap2 && evsel->attr.mmap2) {
 		perf_missing_features.mmap2 = true;
 		goto fallback_missing_features;
 	} else if (!perf_missing_features.exclude_guest &&
diff --git a/tools/perf/util/record.c b/tools/perf/util/record.c
index 049e0a09ccd3..9a8d622809f6 100644
--- a/tools/perf/util/record.c
+++ b/tools/perf/util/record.c
@@ -4,6 +4,7 @@
 #include "parse-events.h"
 #include <api/fs/fs.h>
 #include "util.h"
+#include "cloexec.h"
 
 typedef void (*setup_probe_fn_t)(struct perf_evsel *evsel);
 
@@ -11,6 +12,7 @@ static int perf_do_probe_api(setup_probe_fn_t fn, int cpu, const char *str)
 {
 	struct perf_evlist *evlist;
 	struct perf_evsel *evsel;
+	unsigned long flags = perf_event_open_cloexec_flag();
 	int err = -EAGAIN, fd;
 
 	evlist = perf_evlist__new();
@@ -22,14 +24,14 @@ static int perf_do_probe_api(setup_probe_fn_t fn, int cpu, const char *str)
 
 	evsel = perf_evlist__first(evlist);
 
-	fd = sys_perf_event_open(&evsel->attr, -1, cpu, -1, 0);
+	fd = sys_perf_event_open(&evsel->attr, -1, cpu, -1, flags);
 	if (fd < 0)
 		goto out_delete;
 	close(fd);
 
 	fn(evsel);
 
-	fd = sys_perf_event_open(&evsel->attr, -1, cpu, -1, 0);
+	fd = sys_perf_event_open(&evsel->attr, -1, cpu, -1, flags);
 	if (fd < 0) {
 		if (errno == EINVAL)
 			err = -EINVAL;
@@ -203,7 +205,8 @@ bool perf_evlist__can_select_event(struct perf_evlist *evlist, const char *str)
 		cpu = evlist->cpus->map[0];
 	}
 
-	fd = sys_perf_event_open(&evsel->attr, -1, cpu, -1, 0);
+	fd = sys_perf_event_open(&evsel->attr, -1, cpu, -1,
+				 perf_event_open_cloexec_flag());
 	if (fd >= 0) {
 		close(fd);
 		ret = true;
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [PATCHv6] perf tools: enable close-on-exec flag on perf file descriptor
  2014-06-30 20:28                       ` [PATCHv6] " Yann Droneaud
@ 2014-07-12 23:28                         ` Jiri Olsa
  0 siblings, 0 replies; 38+ messages in thread
From: Jiri Olsa @ 2014-07-12 23:28 UTC (permalink / raw)
  To: Yann Droneaud
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Paul Mackerras,
	Ingo Molnar, Namhyung Kim, Andi Kleen, David Ahern,
	Frederic Weisbecker, Mike Galbraith, Stephane Eranian,
	Adrian Hunter, Benjamin Herrenschmidt, Michael Ellerman,
	Peter Zijlstra, linux-kernel, Ingo Molnar

On Mon, Jun 30, 2014 at 10:28:47PM +0200, Yann Droneaud wrote:
> In commit a21b0b354d4a ('perf: Introduce a flag to enable
> close-on-exec in perf_event_open()'), flag PERF_FLAG_FD_CLOEXEC
> was added to perf_event_open(2) syscall to allows userspace
> to atomically enable close-on-exec behavor when creating
> the file descriptor.
> 
> This patch makes perf tools use the new flag if supported
> by the kernel, so that the event file descriptors got
> automatically closed if perf tool exec a sub-command.
> 
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
> Cc: Jiri Olsa <jolsa@redhat.com>
> Cc: Andi Kleen <ak@linux.intel.com>
> Cc: Ingo Molnar <mingo@kernel.org>
> Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
> ---
> 
> Hi,
> 
> I've addressed the issues reported by Jiri Olsa,
> please have a look at the update patch.

hi,
I'll queue this one

thanks,
jirka

^ permalink raw reply	[flat|nested] 38+ messages in thread

end of thread, other threads:[~2014-07-12 23:30 UTC | newest]

Thread overview: 38+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-01-05 20:36 [PATCH v5 0/7] Getting rid of get_unused_fd() / enable close-on-exec Yann Droneaud
2014-01-05 20:36 ` Yann Droneaud
2014-01-05 20:36 ` Yann Droneaud
2014-01-05 20:36 ` [PATCHv5 1/7] ia64: use get_unused_fd_flags(0) instead of get_unused_fd() Yann Droneaud
2014-01-05 20:36   ` Yann Droneaud
2014-01-05 20:36 ` [PATCHv5 2/7] ppc/cell: " Yann Droneaud
2014-01-05 20:36   ` Yann Droneaud
2014-01-05 20:36 ` [PATCHv5 3/7] binfmt_misc: " Yann Droneaud
2014-01-05 20:36 ` [PATCHv5 4/7] file: " Yann Droneaud
2014-01-05 20:36 ` [PATCHv5 5/7] fanotify: enable close-on-exec on events' fd when requested in fanotify_init() Yann Droneaud
2014-01-20 17:15   ` Yann Droneaud
2014-01-05 20:36 ` [PATCHv5 6/7] perf: introduce a flag to enable close-on-exec in perf_event_open() Yann Droneaud
2014-01-06  9:29   ` Peter Zijlstra
2014-01-06 10:51     ` [PATCH] perf tools: enable close-on-exec flag on perf file descriptor Yann Droneaud
2014-01-06 11:24       ` Peter Zijlstra
2014-01-06 14:43         ` Arnaldo Carvalho de Melo
2014-01-06 21:01           ` Yann Droneaud
2014-01-06 21:14             ` Arnaldo Carvalho de Melo
2014-01-06 14:22       ` Jiri Olsa
2014-01-06 15:31         ` Yann Droneaud
2014-01-11 18:07         ` [PATCHv1] " Yann Droneaud
2014-01-13 10:09           ` [PATCHv2] " Yann Droneaud
2014-01-15 18:50             ` Arnaldo Carvalho de Melo
2014-01-26 21:20               ` [PATCHv3] " Yann Droneaud
2014-03-11  8:39                 ` [PATCHv4] " Yann Droneaud
2014-06-02 10:56                   ` [PATCHv5] " Yann Droneaud
2014-06-02 19:23                     ` Jiri Olsa
2014-06-03  8:57                       ` Yann Droneaud
2014-06-03  9:23                         ` Adrian Hunter
2014-06-03 11:51                         ` Jiri Olsa
2014-06-30 20:28                       ` [PATCHv6] " Yann Droneaud
2014-07-12 23:28                         ` Jiri Olsa
2014-01-06 16:27       ` [PATCH] " Andi Kleen
2014-01-06 16:39         ` Peter Zijlstra
2014-01-06 16:52           ` Andi Kleen
2014-01-06 17:15             ` Yann Droneaud
2014-01-12 18:43   ` [tip:perf/core] perf: Introduce a flag to enable close-on-exec in perf_event_open() tip-bot for Yann Droneaud
2014-01-05 20:36 ` [PATCHv5 7/7] file: remove macro get_unused_fd() Yann Droneaud

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.