linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/2] restrict /dev/mem to idle io memory ranges
@ 2015-11-24  0:05 Dan Williams
  2015-11-24  0:05 ` [PATCH v2 1/2] arch: consolidate CONFIG_STRICT_DEVM in lib/Kconfig.debug Dan Williams
  2015-11-24  0:06 ` [PATCH v2 2/2] restrict /dev/mem to idle io memory ranges Dan Williams
  0 siblings, 2 replies; 10+ messages in thread
From: Dan Williams @ 2015-11-24  0:05 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-arch, Russell King, Kees Cook, Arnd Bergmann,
	Greg Kroah-Hartman, H. Peter Anvin, Heiko Carstens, Will Deacon,
	David S. Miller, Ingo Molnar, Benjamin Herrenschmidt,
	Martin Schwidefsky, Andrew Morton, Catalin Marinas,
	Thomas Gleixner, linux-arm-kernel

Changes since v1 [1]:

1/ Introduce ARCH_HAS_DEVMEM_IS_ALLOWED to flag archs where
   CONFIG_STRICT_DEVMEM will compile (Ingo)

2/ Drop "default y" for s390 (Heiko)

3/ Fix iomem_is_exclusive() return value in the
   CONFIG_IO_STRICT_DEVMEM=y case.

[1]: https://lkml.org/lkml/2015/11/21/183

---

Dan Williams (2):
      arch: consolidate CONFIG_STRICT_DEVM in lib/Kconfig.debug
      restrict /dev/mem to idle io memory ranges


 arch/arm/Kconfig             |    1 +
 arch/arm/Kconfig.debug       |   14 --------------
 arch/arm64/Kconfig           |    1 +
 arch/arm64/Kconfig.debug     |   14 --------------
 arch/frv/Kconfig             |    1 +
 arch/m32r/Kconfig            |    1 +
 arch/powerpc/Kconfig         |    1 +
 arch/powerpc/Kconfig.debug   |   12 ------------
 arch/s390/Kconfig            |    1 +
 arch/s390/Kconfig.debug      |   12 ------------
 arch/tile/Kconfig            |    4 +---
 arch/unicore32/Kconfig       |    1 +
 arch/unicore32/Kconfig.debug |   14 --------------
 arch/x86/Kconfig             |    1 +
 arch/x86/Kconfig.debug       |   17 -----------------
 kernel/resource.c            |   11 +++++++++--
 lib/Kconfig.debug            |   39 +++++++++++++++++++++++++++++++++++++++
 17 files changed, 57 insertions(+), 88 deletions(-)

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH v2 1/2] arch: consolidate CONFIG_STRICT_DEVM in lib/Kconfig.debug
  2015-11-24  0:05 [PATCH v2 0/2] restrict /dev/mem to idle io memory ranges Dan Williams
@ 2015-11-24  0:05 ` Dan Williams
  2015-11-24  0:06 ` [PATCH v2 2/2] restrict /dev/mem to idle io memory ranges Dan Williams
  1 sibling, 0 replies; 10+ messages in thread
From: Dan Williams @ 2015-11-24  0:05 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-arch, Russell King, Kees Cook, Arnd Bergmann,
	Benjamin Herrenschmidt, H. Peter Anvin, Heiko Carstens,
	Andrew Morton, Will Deacon, Ingo Molnar, Greg Kroah-Hartman,
	Martin Schwidefsky, Thomas Gleixner, Catalin Marinas,
	David S. Miller, linux-arm-kernel

Let all the archs that implement devmem_is_allowed() opt-in to a common
definition of CONFIG_STRICT_DEVM in lib/Kconfig.debug.

Cc: Kees Cook <keescook@chromium.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "David S. Miller" <davem@davemloft.net>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
[heiko: drop 'default y' for s390]
Acked-by: Ingo Molnar <mingo@redhat.com>
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 arch/arm/Kconfig             |    1 +
 arch/arm/Kconfig.debug       |   14 --------------
 arch/arm64/Kconfig           |    1 +
 arch/arm64/Kconfig.debug     |   14 --------------
 arch/frv/Kconfig             |    1 +
 arch/m32r/Kconfig            |    1 +
 arch/powerpc/Kconfig         |    1 +
 arch/powerpc/Kconfig.debug   |   12 ------------
 arch/s390/Kconfig            |    1 +
 arch/s390/Kconfig.debug      |   12 ------------
 arch/tile/Kconfig            |    4 +---
 arch/unicore32/Kconfig       |    1 +
 arch/unicore32/Kconfig.debug |   14 --------------
 arch/x86/Kconfig             |    1 +
 arch/x86/Kconfig.debug       |   17 -----------------
 lib/Kconfig.debug            |   22 ++++++++++++++++++++++
 16 files changed, 31 insertions(+), 86 deletions(-)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 0365cbbc9179..c7d92948bb49 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -2,6 +2,7 @@ config ARM
 	bool
 	default y
 	select ARCH_HAS_ATOMIC64_DEC_IF_POSITIVE
+	select ARCH_HAS_DEVMEM_IS_ALLOWED
 	select ARCH_HAS_ELF_RANDOMIZE
 	select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST
 	select ARCH_HAVE_CUSTOM_GPIO_H
diff --git a/arch/arm/Kconfig.debug b/arch/arm/Kconfig.debug
index 259c0ca9c99a..e356357d86bb 100644
--- a/arch/arm/Kconfig.debug
+++ b/arch/arm/Kconfig.debug
@@ -15,20 +15,6 @@ config ARM_PTDUMP
 	  kernel.
 	  If in doubt, say "N"
 
-config STRICT_DEVMEM
-	bool "Filter access to /dev/mem"
-	depends on MMU
-	---help---
-	  If this option is disabled, you allow userspace (root) access to all
-	  of memory, including kernel and userspace memory. Accidental
-	  access to this is obviously disastrous, but specific access can
-	  be used by people debugging the kernel.
-
-	  If this option is switched on, the /dev/mem file only allows
-	  userspace access to memory mapped peripherals.
-
-          If in doubt, say Y.
-
 # RMK wants arm kernels compiled with frame pointers or stack unwinding.
 # If you know what you are doing and are willing to live without stack
 # traces, you can get a slightly smaller kernel by setting this option to
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 9ac16a482ff1..4e7bf1d73038 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -3,6 +3,7 @@ config ARM64
 	select ACPI_CCA_REQUIRED if ACPI
 	select ACPI_GENERIC_GSI if ACPI
 	select ACPI_REDUCED_HARDWARE_ONLY if ACPI
+	select ARCH_HAS_DEVMEM_IS_ALLOWED
 	select ARCH_HAS_ATOMIC64_DEC_IF_POSITIVE
 	select ARCH_HAS_ELF_RANDOMIZE
 	select ARCH_HAS_GCOV_PROFILE_ALL
diff --git a/arch/arm64/Kconfig.debug b/arch/arm64/Kconfig.debug
index 04fb73b973f1..e13c4bf84d9e 100644
--- a/arch/arm64/Kconfig.debug
+++ b/arch/arm64/Kconfig.debug
@@ -14,20 +14,6 @@ config ARM64_PTDUMP
 	  kernel.
 	  If in doubt, say "N"
 
-config STRICT_DEVMEM
-	bool "Filter access to /dev/mem"
-	depends on MMU
-	help
-	  If this option is disabled, you allow userspace (root) access to all
-	  of memory, including kernel and userspace memory. Accidental
-	  access to this is obviously disastrous, but specific access can
-	  be used by people debugging the kernel.
-
-	  If this option is switched on, the /dev/mem file only allows
-	  userspace access to memory mapped peripherals.
-
-	  If in doubt, say Y.
-
 config PID_IN_CONTEXTIDR
 	bool "Write the current PID to the CONTEXTIDR register"
 	help
diff --git a/arch/frv/Kconfig b/arch/frv/Kconfig
index 34aa19352dc1..03bfd6bf03e7 100644
--- a/arch/frv/Kconfig
+++ b/arch/frv/Kconfig
@@ -10,6 +10,7 @@ config FRV
 	select HAVE_DEBUG_BUGVERBOSE
 	select ARCH_HAVE_NMI_SAFE_CMPXCHG
 	select GENERIC_CPU_DEVICES
+	select ARCH_HAS_DEVMEM_IS_ALLOWED
 	select ARCH_WANT_IPC_PARSE_VERSION
 	select OLD_SIGSUSPEND3
 	select OLD_SIGACTION
diff --git a/arch/m32r/Kconfig b/arch/m32r/Kconfig
index 9e44bbd8051e..836ac5a963c8 100644
--- a/arch/m32r/Kconfig
+++ b/arch/m32r/Kconfig
@@ -13,6 +13,7 @@ config M32R
 	select GENERIC_IRQ_PROBE
 	select GENERIC_IRQ_SHOW
 	select GENERIC_ATOMIC64
+	select ARCH_HAS_DEVMEM_IS_ALLOWED
 	select ARCH_USES_GETTIMEOFFSET
 	select MODULES_USE_ELF_RELA
 	select HAVE_DEBUG_STACKOVERFLOW
diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index db49e0d796b1..85eabc49de61 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -159,6 +159,7 @@ config PPC
 	select EDAC_SUPPORT
 	select EDAC_ATOMIC_SCRUB
 	select ARCH_HAS_DMA_SET_COHERENT_MASK
+	select ARCH_HAS_DEVMEM_IS_ALLOWED
 	select HAVE_ARCH_SECCOMP_FILTER
 
 config GENERIC_CSUM
diff --git a/arch/powerpc/Kconfig.debug b/arch/powerpc/Kconfig.debug
index 3a510f4a6b68..a0e44a9c456f 100644
--- a/arch/powerpc/Kconfig.debug
+++ b/arch/powerpc/Kconfig.debug
@@ -335,18 +335,6 @@ config PPC_EARLY_DEBUG_CPM_ADDR
 	  platform probing is done, all platforms selected must
 	  share the same address.
 
-config STRICT_DEVMEM
-	def_bool y
-	prompt "Filter access to /dev/mem"
-	help
-	  This option restricts access to /dev/mem.  If this option is
-	  disabled, you allow userspace access to all memory, including
-	  kernel and userspace memory. Accidental memory access is likely
-	  to be disastrous.
-	  Memory access is required for experts who want to debug the kernel.
-
-	  If you are unsure, say Y.
-
 config FAIL_IOMMU
 	bool "Fault-injection capability for IOMMU"
 	depends on FAULT_INJECTION
diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index 3a55f493c7da..779becb895be 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -66,6 +66,7 @@ config S390
 	def_bool y
 	select ARCH_HAS_ATOMIC64_DEC_IF_POSITIVE
 	select ARCH_HAS_DEBUG_STRICT_USER_COPY_CHECKS
+	select ARCH_HAS_DEVMEM_IS_ALLOWED
 	select ARCH_HAS_ELF_RANDOMIZE
 	select ARCH_HAS_GCOV_PROFILE_ALL
 	select ARCH_HAS_SG_CHAIN
diff --git a/arch/s390/Kconfig.debug b/arch/s390/Kconfig.debug
index c56878e1245f..26c5d5beb4be 100644
--- a/arch/s390/Kconfig.debug
+++ b/arch/s390/Kconfig.debug
@@ -5,18 +5,6 @@ config TRACE_IRQFLAGS_SUPPORT
 
 source "lib/Kconfig.debug"
 
-config STRICT_DEVMEM
-	def_bool y
-	prompt "Filter access to /dev/mem"
-	---help---
-	  This option restricts access to /dev/mem.  If this option is
-	  disabled, you allow userspace access to all memory, including
-	  kernel and userspace memory. Accidental memory access is likely
-	  to be disastrous.
-	  Memory access is required for experts who want to debug the kernel.
-
-	  If you are unsure, say Y.
-
 config S390_PTDUMP
 	bool "Export kernel pagetable layout to userspace via debugfs"
 	depends on DEBUG_KERNEL
diff --git a/arch/tile/Kconfig b/arch/tile/Kconfig
index 106c21bd7f44..cf3116887509 100644
--- a/arch/tile/Kconfig
+++ b/arch/tile/Kconfig
@@ -19,6 +19,7 @@ config TILE
 	select VIRT_TO_BUS
 	select SYS_HYPERVISOR
 	select ARCH_HAS_DEBUG_STRICT_USER_COPY_CHECKS
+	select ARCH_HAS_DEVMEM_IS_ALLOWED
 	select ARCH_HAVE_NMI_SAFE_CMPXCHG
 	select GENERIC_CLOCKEVENTS
 	select MODULES_USE_ELF_RELA
@@ -116,9 +117,6 @@ config ARCH_DISCONTIGMEM_DEFAULT
 config TRACE_IRQFLAGS_SUPPORT
 	def_bool y
 
-config STRICT_DEVMEM
-	def_bool y
-
 # SMP is required for Tilera Linux.
 config SMP
 	def_bool y
diff --git a/arch/unicore32/Kconfig b/arch/unicore32/Kconfig
index c9faddc61100..5dc4c0a43ccd 100644
--- a/arch/unicore32/Kconfig
+++ b/arch/unicore32/Kconfig
@@ -1,5 +1,6 @@
 config UNICORE32
 	def_bool y
+	select ARCH_HAS_DEVMEM_IS_ALLOWED
 	select ARCH_MIGHT_HAVE_PC_PARPORT
 	select ARCH_MIGHT_HAVE_PC_SERIO
 	select HAVE_MEMBLOCK
diff --git a/arch/unicore32/Kconfig.debug b/arch/unicore32/Kconfig.debug
index 1a3626239843..f075bbe1d46f 100644
--- a/arch/unicore32/Kconfig.debug
+++ b/arch/unicore32/Kconfig.debug
@@ -2,20 +2,6 @@ menu "Kernel hacking"
 
 source "lib/Kconfig.debug"
 
-config STRICT_DEVMEM
-	bool "Filter access to /dev/mem"
-	depends on MMU
-	---help---
-	  If this option is disabled, you allow userspace (root) access to all
-	  of memory, including kernel and userspace memory. Accidental
-	  access to this is obviously disastrous, but specific access can
-	  be used by people debugging the kernel.
-
-	  If this option is switched on, the /dev/mem file only allows
-	  userspace access to memory mapped peripherals.
-
-          If in doubt, say Y.
-
 config EARLY_PRINTK
 	def_bool DEBUG_OCD
 	help
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index db3622f22b61..75fba1fc205d 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -24,6 +24,7 @@ config X86
 	select ARCH_DISCARD_MEMBLOCK
 	select ARCH_HAS_ATOMIC64_DEC_IF_POSITIVE
 	select ARCH_HAS_DEBUG_STRICT_USER_COPY_CHECKS
+	select ARCH_HAS_DEVMEM_IS_ALLOWED
 	select ARCH_HAS_ELF_RANDOMIZE
 	select ARCH_HAS_FAST_MULTIPLIER
 	select ARCH_HAS_GCOV_PROFILE_ALL
diff --git a/arch/x86/Kconfig.debug b/arch/x86/Kconfig.debug
index 137dfa96aa14..1116452fcfc2 100644
--- a/arch/x86/Kconfig.debug
+++ b/arch/x86/Kconfig.debug
@@ -5,23 +5,6 @@ config TRACE_IRQFLAGS_SUPPORT
 
 source "lib/Kconfig.debug"
 
-config STRICT_DEVMEM
-	bool "Filter access to /dev/mem"
-	---help---
-	  If this option is disabled, you allow userspace (root) access to all
-	  of memory, including kernel and userspace memory. Accidental
-	  access to this is obviously disastrous, but specific access can
-	  be used by people debugging the kernel. Note that with PAT support
-	  enabled, even in this case there are restrictions on /dev/mem
-	  use due to the cache aliasing requirements.
-
-	  If this option is switched on, the /dev/mem file only allows
-	  userspace access to PCI space and the BIOS code and data regions.
-	  This is sufficient for dosemu and X and all common users of
-	  /dev/mem.
-
-	  If in doubt, say Y.
-
 config X86_VERBOSE_BOOTUP
 	bool "Enable verbose x86 bootup info messages"
 	default y
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 8c15b29d5adc..289dfcbc14eb 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1853,3 +1853,25 @@ source "samples/Kconfig"
 
 source "lib/Kconfig.kgdb"
 
+config ARCH_HAS_DEVMEM_IS_ALLOWED
+	bool
+
+config STRICT_DEVMEM
+	bool "Filter access to /dev/mem"
+	depends on MMU
+	depends on ARCH_HAS_DEVMEM_IS_ALLOWED
+	default y if TILE || PPC
+	---help---
+	  If this option is disabled, you allow userspace (root) access to all
+	  of memory, including kernel and userspace memory. Accidental
+	  access to this is obviously disastrous, but specific access can
+	  be used by people debugging the kernel. Note that with PAT support
+	  enabled, even in this case there are restrictions on /dev/mem
+	  use due to the cache aliasing requirements.
+
+	  If this option is switched on, the /dev/mem file only allows
+	  userspace access to PCI space and the BIOS code and data regions.
+	  This is sufficient for dosemu and X and all common users of
+	  /dev/mem.
+
+	  If in doubt, say Y.


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v2 2/2] restrict /dev/mem to idle io memory ranges
  2015-11-24  0:05 [PATCH v2 0/2] restrict /dev/mem to idle io memory ranges Dan Williams
  2015-11-24  0:05 ` [PATCH v2 1/2] arch: consolidate CONFIG_STRICT_DEVM in lib/Kconfig.debug Dan Williams
@ 2015-11-24  0:06 ` Dan Williams
  2015-11-24 22:25   ` Andrew Morton
  1 sibling, 1 reply; 10+ messages in thread
From: Dan Williams @ 2015-11-24  0:06 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-arch, Russell King, Kees Cook, Arnd Bergmann,
	Greg Kroah-Hartman, Ingo Molnar, Andrew Morton, linux-arm-kernel

This effectively promotes IORESOURCE_BUSY to IORESOURCE_EXCLUSIVE
semantics by default.  If userspace really believes it is safe to access
the memory region it can also perform the extra step of disabling an
active driver.  This protects device address ranges with read side
effects and otherwise directs userspace to use the driver.

Persistent memory presents a large "mistake surface" to /dev/mem as now
accidental writes can corrupt a filesystem.

Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 kernel/resource.c |   11 +++++++++--
 lib/Kconfig.debug |   23 ++++++++++++++++++++---
 2 files changed, 29 insertions(+), 5 deletions(-)

diff --git a/kernel/resource.c b/kernel/resource.c
index f150dbbe6f62..09c0597840b0 100644
--- a/kernel/resource.c
+++ b/kernel/resource.c
@@ -1498,8 +1498,15 @@ int iomem_is_exclusive(u64 addr)
 			break;
 		if (p->end < addr)
 			continue;
-		if (p->flags & IORESOURCE_BUSY &&
-		     p->flags & IORESOURCE_EXCLUSIVE) {
+		/*
+		 * A resource is exclusive if IORESOURCE_EXCLUSIVE is set
+		 * or CONFIG_IO_STRICT_DEVMEM is enabled and the
+		 * resource is busy.
+		 */
+		if ((p->flags & IORESOURCE_BUSY) == 0)
+			continue;
+		if (IS_ENABLED(CONFIG_IO_STRICT_DEVMEM)
+				|| p->flags & IORESOURCE_EXCLUSIVE) {
 			err = 1;
 			break;
 		}
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 289dfcbc14eb..073496dea848 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1869,9 +1869,26 @@ config STRICT_DEVMEM
 	  enabled, even in this case there are restrictions on /dev/mem
 	  use due to the cache aliasing requirements.
 
+	  If this option is switched on, and IO_STRICT_DEVMEM=n, the /dev/mem
+	  file only allows userspace access to PCI space and the BIOS code and
+	  data regions.  This is sufficient for dosemu and X and all common
+	  users of /dev/mem.
+
+	  If in doubt, say Y.
+
+config IO_STRICT_DEVMEM
+	bool "Filter I/O access to /dev/mem"
+	depends on STRICT_DEVMEM
+	default STRICT_DEVMEM
+	---help---
+	  If this option is disabled, you allow userspace (root) access to all
+	  io-memory regardless of whether a driver is actively using that
+	  range.  Accidental access to this is obviously disastrous, but
+	  specific access can be used by people debugging kernel drivers.
+
 	  If this option is switched on, the /dev/mem file only allows
-	  userspace access to PCI space and the BIOS code and data regions.
-	  This is sufficient for dosemu and X and all common users of
-	  /dev/mem.
+	  userspace access to *idle* io-memory ranges (see /proc/iomem) This
+	  may break traditional users of /dev/mem (dosemu, legacy X, etc...)
+	  if the driver using a given range cannot be disabled.
 
 	  If in doubt, say Y.


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 2/2] restrict /dev/mem to idle io memory ranges
  2015-11-24  0:06 ` [PATCH v2 2/2] restrict /dev/mem to idle io memory ranges Dan Williams
@ 2015-11-24 22:25   ` Andrew Morton
  2015-11-25  0:34     ` Dan Williams
  0 siblings, 1 reply; 10+ messages in thread
From: Andrew Morton @ 2015-11-24 22:25 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-kernel, linux-arch, Russell King, Kees Cook, Arnd Bergmann,
	Greg Kroah-Hartman, Ingo Molnar, linux-arm-kernel

On Mon, 23 Nov 2015 16:06:04 -0800 Dan Williams <dan.j.williams@intel.com> wrote:

> This effectively promotes IORESOURCE_BUSY to IORESOURCE_EXCLUSIVE
> semantics by default.  If userspace really believes it is safe to access
> the memory region it can also perform the extra step of disabling an
> active driver.  This protects device address ranges with read side
> effects and otherwise directs userspace to use the driver.

I don't think I'm sufficiently understanding what this is all needed
for, sorry.  A better changelog would help: what's wrong with the
current code, how you propose it be changed, how the kernel's
externally-visible behaviour is altered, etc.

Please pay particular attention to the back-compatibility issues which
will be encountered when people enable these options.

Perhaps when all that material is described, I'll understand why the
heck we're doing this with a build-time switch rather than a runtime
one...

> Persistent memory presents a large "mistake surface" to /dev/mem as now
> accidental writes can corrupt a filesystem.

Is that the motivation?  root can come in and accidentally alter
persistent memory contents?  If so, 

- why do we care?  There are all sorts of ways in which root can muck
  up the persistent memory, starting with dd(1).  What's special about
  /dev/mem?

- why is the patch mucking with access to PCI and BIOS space?  Is the
  persistent memory even mappable in those regions?  Or is the concern
  that userspace can access control registers associated with the
  persistent memory?  What is the problem scenario?


IOW, a very good description of the problem-being-solved would help out
a lot here...


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 2/2] restrict /dev/mem to idle io memory ranges
  2015-11-24 22:25   ` Andrew Morton
@ 2015-11-25  0:34     ` Dan Williams
  2015-11-25  0:47       ` Andrew Morton
  2015-11-26 11:08       ` Ingo Molnar
  0 siblings, 2 replies; 10+ messages in thread
From: Dan Williams @ 2015-11-25  0:34 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, linux-arch, Russell King, Kees Cook, Arnd Bergmann,
	Greg Kroah-Hartman, Ingo Molnar, linux-arm-kernel

On Tue, Nov 24, 2015 at 2:25 PM, Andrew Morton
<akpm@linux-foundation.org> wrote:
> On Mon, 23 Nov 2015 16:06:04 -0800 Dan Williams <dan.j.williams@intel.com> wrote:
>
>> This effectively promotes IORESOURCE_BUSY to IORESOURCE_EXCLUSIVE
>> semantics by default.  If userspace really believes it is safe to access
>> the memory region it can also perform the extra step of disabling an
>> active driver.  This protects device address ranges with read side
>> effects and otherwise directs userspace to use the driver.
>
> I don't think I'm sufficiently understanding what this is all needed
> for, sorry.  A better changelog would help: what's wrong with the
> current code, how you propose it be changed, how the kernel's
> externally-visible behaviour is altered, etc.
>

I should have duplicated the Kconfig description for IO_STRICT_DEVMEM
in the changelog, but the justification is simply that if the kernel
has a driver busily using a memory range, userspace needs to assert it
knows it is safe to access that range by disabling the driver.  This
makes the kernel safer by default.

> Please pay particular attention to the back-compatibility issues which
> will be encountered when people enable these options.

It certainly diminishes debug capabilities, mmap of sysfs pci
resources will also fail while a driver is active.  The only general
purpose application I know that uses /dev/mem is dosemu.  It should
continue to work fine as x86 "devmem_is_allowed()" permits access from
0-to-1MB by default.  The other stated user of /dev/mem legacy X
drivers.  With the prevalence of kernel modesetting in graphics
drivers I don't know how much of a concern this is anymore.

> Perhaps when all that material is described, I'll understand why the
> heck we're doing this with a build-time switch rather than a runtime
> one...

We have the "iomem=" kernel parameter.  I think it makes sense to have
that setting be configurable at runtime to augment this build time
decision.

>> Persistent memory presents a large "mistake surface" to /dev/mem as now
>> accidental writes can corrupt a filesystem.
>
> Is that the motivation?  root can come in and accidentally alter
> persistent memory contents?  If so,
>
> - why do we care?  There are all sorts of ways in which root can muck
>   up the persistent memory, starting with dd(1).  What's special about
>   /dev/mem?

dd through /dev/pmem and the driver will do all the proper flushing
and syncing to make the writes durable on media.  /dev/mem knows none
of those semantics.  /dev/pmem as a block device responds to O_EXCL
and prevents other attempts to open the device.

> - why is the patch mucking with access to PCI and BIOS space?  Is the
>   persistent memory even mappable in those regions?  Or is the concern
>   that userspace can access control registers associated with the
>   persistent memory?  What is the problem scenario?

It seems to me that letting /dev/mem do arbitrary access to any region
of memory is a dangerous capability for a production environment.
Drivers assume that request_mem_region() tells other parts of the
kernel to not touch their memory.  Having the option to extend that
protection to /dev/mem by default seemed a reasonable idea.

Of course, all of this assumes that you think it is worthwhile to have
some protections and safety measures even for root.

> IOW, a very good description of the problem-being-solved would help out
> a lot here...

I'll fold the eventual result of this discussion into the changelog if
I can convince you it's worth moving forward.

I also have the option of just tagging the pmem regions as
IORESOURCE_EXCLUSIVE, but I decided against that because I think our
current definition of STRICT_DEVMEM leaves a big hole if the goal is
"/dev/mem access is safe by default".

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 2/2] restrict /dev/mem to idle io memory ranges
  2015-11-25  0:34     ` Dan Williams
@ 2015-11-25  0:47       ` Andrew Morton
  2015-11-25  0:50         ` Kees Cook
  2015-11-25  1:28         ` Dan Williams
  2015-11-26 11:08       ` Ingo Molnar
  1 sibling, 2 replies; 10+ messages in thread
From: Andrew Morton @ 2015-11-25  0:47 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-kernel, linux-arch, Russell King, Kees Cook, Arnd Bergmann,
	Greg Kroah-Hartman, Ingo Molnar, linux-arm-kernel

On Tue, 24 Nov 2015 16:34:19 -0800 Dan Williams <dan.j.williams@intel.com> wrote:

> > IOW, a very good description of the problem-being-solved would help out
> > a lot here...
> 
> I'll fold the eventual result of this discussion into the changelog if
> I can convince you it's worth moving forward.

I'm easily convinced ;) Please let's get all the info into the right
place, make sure it answers the thus-far-asked questions (at least) and
we'll take it from there.

And please do have a think about switching as much as possible over to
runtime-configurability.  Because "please echo foo > /proc" is a heck
of a lot nicer than "please reboot with iomem=" which is a heck of a lot
nicer than "please ask vendor for a new kernel".


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 2/2] restrict /dev/mem to idle io memory ranges
  2015-11-25  0:47       ` Andrew Morton
@ 2015-11-25  0:50         ` Kees Cook
  2015-11-25  1:28         ` Dan Williams
  1 sibling, 0 replies; 10+ messages in thread
From: Kees Cook @ 2015-11-25  0:50 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Dan Williams, linux-kernel, linux-arch, Russell King,
	Arnd Bergmann, Greg Kroah-Hartman, Ingo Molnar, linux-arm-kernel

On Tue, Nov 24, 2015 at 4:47 PM, Andrew Morton
<akpm@linux-foundation.org> wrote:
> On Tue, 24 Nov 2015 16:34:19 -0800 Dan Williams <dan.j.williams@intel.com> wrote:
>
>> > IOW, a very good description of the problem-being-solved would help out
>> > a lot here...
>>
>> I'll fold the eventual result of this discussion into the changelog if
>> I can convince you it's worth moving forward.
>
> I'm easily convinced ;) Please let's get all the info into the right
> place, make sure it answers the thus-far-asked questions (at least) and
> we'll take it from there.
>
> And please do have a think about switching as much as possible over to
> runtime-configurability.  Because "please echo foo > /proc" is a heck
> of a lot nicer than "please reboot with iomem=" which is a heck of a lot
> nicer than "please ask vendor for a new kernel".

I think run-time config for this should be an as-needed case. Nothing
should be fiddling with this memory from userspace anyway -- a driver
covering it should be unloaded first.

And, with my dosemu maintainer hat on, If you're using dosemu in a
mode where this will cause a problem, you are already running a custom
kernel. :)

And that said, if someone can actually produce a case where we need
this runtime configurable, I'm all for it. I just don't think we need
to design it in right now.

-Kees

-- 
Kees Cook
Chrome OS & Brillo Security

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 2/2] restrict /dev/mem to idle io memory ranges
  2015-11-25  0:47       ` Andrew Morton
  2015-11-25  0:50         ` Kees Cook
@ 2015-11-25  1:28         ` Dan Williams
  2015-11-25 18:54           ` Dan Williams
  1 sibling, 1 reply; 10+ messages in thread
From: Dan Williams @ 2015-11-25  1:28 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, linux-arch, Russell King, Kees Cook, Arnd Bergmann,
	Greg Kroah-Hartman, Ingo Molnar, linux-arm-kernel

On Tue, Nov 24, 2015 at 4:47 PM, Andrew Morton
<akpm@linux-foundation.org> wrote:
> On Tue, 24 Nov 2015 16:34:19 -0800 Dan Williams <dan.j.williams@intel.com> wrote:
>
>> > IOW, a very good description of the problem-being-solved would help out
>> > a lot here...
>>
>> I'll fold the eventual result of this discussion into the changelog if
>> I can convince you it's worth moving forward.
>
> I'm easily convinced ;) Please let's get all the info into the right
> place, make sure it answers the thus-far-asked questions (at least) and
> we'll take it from there.
>
> And please do have a think about switching as much as possible over to
> runtime-configurability.  Because "please echo foo > /proc" is a heck
> of a lot nicer than "please reboot with iomem=" which is a heck of a lot
> nicer than "please ask vendor for a new kernel".

Actually, we already have runtime configuration.  For example, if you
want to muck with pmem via /dev/mem, just do this first:

    echo namespace0.0 > /sys/bus/nd/drivers/nd_pmem/unbind

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 2/2] restrict /dev/mem to idle io memory ranges
  2015-11-25  1:28         ` Dan Williams
@ 2015-11-25 18:54           ` Dan Williams
  0 siblings, 0 replies; 10+ messages in thread
From: Dan Williams @ 2015-11-25 18:54 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, linux-arch, Russell King, Kees Cook, Arnd Bergmann,
	Greg Kroah-Hartman, Ingo Molnar, linux-arm-kernel

On Tue, Nov 24, 2015 at 5:28 PM, Dan Williams <dan.j.williams@intel.com> wrote:
> On Tue, Nov 24, 2015 at 4:47 PM, Andrew Morton
> <akpm@linux-foundation.org> wrote:
>> On Tue, 24 Nov 2015 16:34:19 -0800 Dan Williams <dan.j.williams@intel.com> wrote:
>>
>>> > IOW, a very good description of the problem-being-solved would help out
>>> > a lot here...
>>>
>>> I'll fold the eventual result of this discussion into the changelog if
>>> I can convince you it's worth moving forward.
>>
>> I'm easily convinced ;) Please let's get all the info into the right
>> place, make sure it answers the thus-far-asked questions (at least) and
>> we'll take it from there.
>>
>> And please do have a think about switching as much as possible over to
>> runtime-configurability.  Because "please echo foo > /proc" is a heck
>> of a lot nicer than "please reboot with iomem=" which is a heck of a lot
>> nicer than "please ask vendor for a new kernel".
>
> Actually, we already have runtime configuration.  For example, if you
> want to muck with pmem via /dev/mem, just do this first:
>
>     echo namespace0.0 > /sys/bus/nd/drivers/nd_pmem/unbind

Here's the summary of the thread that I will add to the changelog:

---

In general if a device driver is busily using a memory region it
already informs other parts of the kernel to not touch it via
request_mem_region().  /dev/mem should honor the same safety
restriction by default.  Debugging a device driver from userspace
becomes more
difficult with this enabled.  Any application using /dev/mem or mmap
of sysfs pci resources will now need to perform the extra step of
either:

1/ Disabling the driver, for example:

  echo <device id> > /dev/bus/<parent bus>/drivers/<driver name>/unbind

2/ Rebooting with "iomem=relaxed" on the command line

3/ Recompiling with CONFIG_IO_STRICT_DEVMEM=n

Traditional users of /dev/mem like dosemu are unaffected because the
first 1MB of memory is not subject to the IO_STRICT_DEVMEM
restriction.
Legacy X configurations use /dev/mem to talk to graphics hardware, but
that functionality has since moved to kernel graphics drivers.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 2/2] restrict /dev/mem to idle io memory ranges
  2015-11-25  0:34     ` Dan Williams
  2015-11-25  0:47       ` Andrew Morton
@ 2015-11-26 11:08       ` Ingo Molnar
  1 sibling, 0 replies; 10+ messages in thread
From: Ingo Molnar @ 2015-11-26 11:08 UTC (permalink / raw)
  To: Dan Williams
  Cc: Andrew Morton, linux-kernel, linux-arch, Russell King, Kees Cook,
	Arnd Bergmann, Greg Kroah-Hartman, Ingo Molnar, linux-arm-kernel


* Dan Williams <dan.j.williams@intel.com> wrote:

> > - why is the patch mucking with access to PCI and BIOS space?  Is the
> >   persistent memory even mappable in those regions?  Or is the concern
> >   that userspace can access control registers associated with the
> >   persistent memory?  What is the problem scenario?
> 
> It seems to me that letting /dev/mem do arbitrary access to any region
> of memory is a dangerous capability for a production environment.

So basically the original motivation years ago was to disable /dev/mem altogether: 
it used to be a wide open roothole if anything with write access to it (such as 
old Xorg) is exploited, plus it's a favorite and convenient tool for stealth 
access to system areas of memory in cases where the attacker already has root. 

(this is relevant as even being root might not give easy access to system mmio 
areas if things like being able to load modules is restricted even for room, and 
if the system has readonly storage and a few other things configured.)

But we couldn't disable /dev/mem completely due to Xorg and dosemu legacies - so 
we came up with this restriction feature that limits its scope.

Any additional steps that limit the scope of access under the STRICT_DEVMEM option 
(which is really a misnomer: it should be RESTRICT_DEVMEM instead) are welcome 
from a generic Linux distro POV.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2015-11-26 11:09 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-11-24  0:05 [PATCH v2 0/2] restrict /dev/mem to idle io memory ranges Dan Williams
2015-11-24  0:05 ` [PATCH v2 1/2] arch: consolidate CONFIG_STRICT_DEVM in lib/Kconfig.debug Dan Williams
2015-11-24  0:06 ` [PATCH v2 2/2] restrict /dev/mem to idle io memory ranges Dan Williams
2015-11-24 22:25   ` Andrew Morton
2015-11-25  0:34     ` Dan Williams
2015-11-25  0:47       ` Andrew Morton
2015-11-25  0:50         ` Kees Cook
2015-11-25  1:28         ` Dan Williams
2015-11-25 18:54           ` Dan Williams
2015-11-26 11:08       ` Ingo Molnar

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).