* Re: [PATCH v1 1/3] /dev/mem: disallow access to explicitly excluded system RAM regions @ 2021-08-12 1:42 kernel test robot 0 siblings, 0 replies; 5+ messages in thread From: kernel test robot @ 2021-08-12 1:42 UTC (permalink / raw) To: kbuild [-- Attachment #1: Type: text/plain, Size: 3944 bytes --] CC: kbuild-all(a)lists.01.org In-Reply-To: <20210811203612.138506-2-david@redhat.com> References: <20210811203612.138506-2-david@redhat.com> TO: David Hildenbrand <david@redhat.com> TO: linux-kernel(a)vger.kernel.org CC: David Hildenbrand <david@redhat.com> CC: Arnd Bergmann <arnd@arndb.de> CC: "Greg Kroah-Hartman" <gregkh@linuxfoundation.org> CC: "Michael S. Tsirkin" <mst@redhat.com> CC: Jason Wang <jasowang@redhat.com> CC: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com> CC: Andrew Morton <akpm@linux-foundation.org> CC: Linux Memory Management List <linux-mm@kvack.org> CC: Dan Williams <dan.j.williams@intel.com> Hi David, I love your patch! Perhaps something to improve: [auto build test WARNING on linux/master] [also build test WARNING on char-misc/char-misc-testing soc/for-next linus/master v5.14-rc5 next-20210811] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch] url: https://github.com/0day-ci/linux/commits/David-Hildenbrand/virtio-mem-disallow-mapping-virtio-mem-memory-via-dev-mem/20210812-043834 base: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 349a2d52ffe59b7a0c5876fa7ee9f3eaf188b830 :::::: branch date: 5 hours ago :::::: commit date: 5 hours ago config: i386-randconfig-m021-20210810 (attached as .config) compiler: gcc-9 (Debian 9.3.0-22) 9.3.0 If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> smatch warnings: drivers/char/mem.c:70 page_is_allowed() warn: should 'pfn << 12' be a 64 bit type? drivers/char/mem.c:87 range_is_allowed() warn: should 'pfn << 12' be a 64 bit type? vim +70 drivers/char/mem.c ^1da177e4c3f41 Linus Torvalds 2005-04-16 63 a4866aa812518e Kees Cook 2017-04-05 64 static inline int page_is_allowed(unsigned long pfn) a4866aa812518e Kees Cook 2017-04-05 65 { 19e42009961255 David Hildenbrand 2021-08-11 66 #ifdef CONFIG_STRICT_DEVMEM 19e42009961255 David Hildenbrand 2021-08-11 67 if (!devmem_is_allowed(pfn)) 19e42009961255 David Hildenbrand 2021-08-11 68 return 0; 19e42009961255 David Hildenbrand 2021-08-11 69 #endif /* CONFIG_STRICT_DEVMEM */ 19e42009961255 David Hildenbrand 2021-08-11 @70 return !iomem_range_contains_excluded(PFN_PHYS(pfn), PAGE_SIZE); a4866aa812518e Kees Cook 2017-04-05 71 } 19e42009961255 David Hildenbrand 2021-08-11 72 e2beb3eae62721 Venki Pallipadi 2008-03-06 73 static inline int range_is_allowed(unsigned long pfn, unsigned long size) ae531c26c5c2a2 Arjan van de Ven 2008-04-24 74 { 19e42009961255 David Hildenbrand 2021-08-11 75 #ifdef CONFIG_STRICT_DEVMEM e2beb3eae62721 Venki Pallipadi 2008-03-06 76 u64 from = ((u64)pfn) << PAGE_SHIFT; e2beb3eae62721 Venki Pallipadi 2008-03-06 77 u64 to = from + size; e2beb3eae62721 Venki Pallipadi 2008-03-06 78 u64 cursor = from; e2beb3eae62721 Venki Pallipadi 2008-03-06 79 e2beb3eae62721 Venki Pallipadi 2008-03-06 80 while (cursor < to) { 39380b80d72723 Jiri Kosina 2016-07-08 81 if (!devmem_is_allowed(pfn)) ae531c26c5c2a2 Arjan van de Ven 2008-04-24 82 return 0; e2beb3eae62721 Venki Pallipadi 2008-03-06 83 cursor += PAGE_SIZE; e2beb3eae62721 Venki Pallipadi 2008-03-06 84 pfn++; ae531c26c5c2a2 Arjan van de Ven 2008-04-24 85 } 19e42009961255 David Hildenbrand 2021-08-11 86 #endif /* CONFIG_STRICT_DEVMEM */ 19e42009961255 David Hildenbrand 2021-08-11 @87 return !iomem_range_contains_excluded(PFN_PHYS(pfn), size); ae531c26c5c2a2 Arjan van de Ven 2008-04-24 88 } ae531c26c5c2a2 Arjan van de Ven 2008-04-24 89 --- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org [-- Attachment #2: config.gz --] [-- Type: application/gzip, Size: 35188 bytes --] ^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCH v1 0/3] virtio-mem: disallow mapping virtio-mem memory via /dev/mem @ 2021-08-11 20:36 David Hildenbrand 2021-08-11 20:36 ` David Hildenbrand 0 siblings, 1 reply; 5+ messages in thread From: David Hildenbrand @ 2021-08-11 20:36 UTC (permalink / raw) To: linux-kernel Cc: David Hildenbrand, Arnd Bergmann, Greg Kroah-Hartman, Michael S. Tsirkin, Jason Wang, Rafael J. Wysocki, Andrew Morton, Dan Williams, Hanjun Guo, Andy Shevchenko, virtualization, linux-mm Let's add the basic infrastructure to exclude some physical memory regions completely from /dev/mem access, on any architecture and under any system configuration (independent of CONFIG_STRICT_DEVMEM and independent of "iomem="). Use it for virtio-mem, to disallow mapping any virtio-mem memory via /dev/mem to user space after the virtio-mem driver was loaded: there is no sane use case to access the device-managed memory region via /dev/mem once the driver is actively (un)plugging memory within that region and we want to make sure that nobody will accidentially access unplugged memory in a sane environment. Details can be found in patch #1. Cc: Arnd Bergmann <arnd@arndb.de> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Hanjun Guo <guohanjun@huawei.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: virtualization@lists.linux-foundation.org Cc: linux-mm@kvack.org David Hildenbrand (3): /dev/mem: disallow access to explicitly excluded system RAM regions virtio-mem: disallow mapping virtio-mem memory via /dev/mem kernel/resource: cleanup and optimize iomem_is_exclusive() drivers/char/mem.c | 22 ++++++------- drivers/virtio/virtio_mem.c | 4 ++- include/linux/ioport.h | 1 + kernel/resource.c | 61 ++++++++++++++++++++++++++++++++----- lib/Kconfig.debug | 4 ++- 5 files changed, 69 insertions(+), 23 deletions(-) -- 2.31.1 ^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCH v1 1/3] /dev/mem: disallow access to explicitly excluded system RAM regions 2021-08-11 20:36 [PATCH v1 0/3] virtio-mem: disallow mapping virtio-mem memory via /dev/mem David Hildenbrand @ 2021-08-11 20:36 ` David Hildenbrand 0 siblings, 0 replies; 5+ messages in thread From: David Hildenbrand @ 2021-08-11 20:36 UTC (permalink / raw) To: linux-kernel Cc: David Hildenbrand, Arnd Bergmann, Greg Kroah-Hartman, Michael S. Tsirkin, Jason Wang, Rafael J. Wysocki, Andrew Morton, Dan Williams, Hanjun Guo, Andy Shevchenko, virtualization, linux-mm virtio-mem dynamically exposes memory inside a device memory region as system RAM to Linux, coordinating with the hypervisor which parts are actually "plugged" and consequently usable/accessible. On the one hand, the virtio-mem driver adds/removes whole memory blocks, creating/removing busy IORESOURCE_SYSTEM_RAM resources, on the other hand, it logically (un)plugs memory inside added memory blocks, dynamically either exposing them to the buddy or hiding them from the buddy and marking them PG_offline. virtio-mem wants to make sure that in a sane environment, nobody "accidentially" accesses unplugged memory inside the device managed region. After /proc/kcore has been sanitized and /dev/kmem has been removed, /dev/mem is the remaining interface that still allows uncontrolled access to the device-managed region of virtio-mem devices from user space. There is no known sane use case for mapping virtio-mem device memory via /dev/mem while virtio-mem driver concurrently (un)plugs memory inside that region. So once the driver was loaded and detected the device along the device-managed region, we just want to disallow any access via /dev/mem to it. Let's add the basic infrastructure to exclude some physical memory regions completely from /dev/mem access, on any architecture and under any system configuration (independent of CONFIG_STRICT_DEVMEM and independent of "iomem="). Any range marked with "IORESOURCE_SYSTEM_RAM | IORESOURCE_EXCLUSIVE" will be excluded, even if not busy. For now, there are no applicable ranges and we'll modify virtio-mem next to properly set IORESOURCE_EXCLUSIVE on the parent resource. As next_resource() will iterate over children although we might want to skip a certain range completely, let's add and use next_range_skip_children() to optimize that case, avoding having to traverse subtrees that are not of interest. Signed-off-by: David Hildenbrand <david@redhat.com> --- drivers/char/mem.c | 22 +++++++++------------- include/linux/ioport.h | 1 + kernel/resource.c | 42 ++++++++++++++++++++++++++++++++++++++++++ lib/Kconfig.debug | 4 +++- 4 files changed, 55 insertions(+), 14 deletions(-) diff --git a/drivers/char/mem.c b/drivers/char/mem.c index 1c596b5cdb27..bb6d95daab45 100644 --- a/drivers/char/mem.c +++ b/drivers/char/mem.c @@ -60,13 +60,18 @@ static inline int valid_mmap_phys_addr_range(unsigned long pfn, size_t size) } #endif -#ifdef CONFIG_STRICT_DEVMEM static inline int page_is_allowed(unsigned long pfn) { - return devmem_is_allowed(pfn); +#ifdef CONFIG_STRICT_DEVMEM + if (!devmem_is_allowed(pfn)) + return 0; +#endif /* CONFIG_STRICT_DEVMEM */ + return !iomem_range_contains_excluded(PFN_PHYS(pfn), PAGE_SIZE); } + static inline int range_is_allowed(unsigned long pfn, unsigned long size) { +#ifdef CONFIG_STRICT_DEVMEM u64 from = ((u64)pfn) << PAGE_SHIFT; u64 to = from + size; u64 cursor = from; @@ -77,18 +82,9 @@ static inline int range_is_allowed(unsigned long pfn, unsigned long size) cursor += PAGE_SIZE; pfn++; } - return 1; -} -#else -static inline int page_is_allowed(unsigned long pfn) -{ - return 1; -} -static inline int range_is_allowed(unsigned long pfn, unsigned long size) -{ - return 1; +#endif /* CONFIG_STRICT_DEVMEM */ + return !iomem_range_contains_excluded(PFN_PHYS(pfn), size); } -#endif #ifndef unxlate_dev_mem_ptr #define unxlate_dev_mem_ptr unxlate_dev_mem_ptr diff --git a/include/linux/ioport.h b/include/linux/ioport.h index 8359c50f9988..50523c28a5f1 100644 --- a/include/linux/ioport.h +++ b/include/linux/ioport.h @@ -308,6 +308,7 @@ extern struct resource * __devm_request_region(struct device *dev, extern void __devm_release_region(struct device *dev, struct resource *parent, resource_size_t start, resource_size_t n); extern int iomem_map_sanity_check(resource_size_t addr, unsigned long size); +extern bool iomem_range_contains_excluded(u64 addr, u64 size); extern bool iomem_is_exclusive(u64 addr); extern int diff --git a/kernel/resource.c b/kernel/resource.c index ca9f5198a01f..2938cf520ca3 100644 --- a/kernel/resource.c +++ b/kernel/resource.c @@ -73,6 +73,13 @@ static struct resource *next_resource(struct resource *p) return p->sibling; } +static struct resource *next_resource_skip_children(struct resource *p) +{ + while (!p->sibling && p->parent) + p = p->parent; + return p->sibling; +} + static void *r_next(struct seq_file *m, void *v, loff_t *pos) { struct resource *p = v; @@ -1700,6 +1707,41 @@ int iomem_map_sanity_check(resource_size_t addr, unsigned long size) return err; } +/* + * Check if a physical memory range is completely excluded from getting + * mapped/accessed via /dev/mem. + */ +bool iomem_range_contains_excluded(u64 addr, u64 size) +{ + const unsigned int flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_EXCLUSIVE; + bool excluded = false; + struct resource *p; + + read_lock(&resource_lock); + for (p = iomem_resource.child; p ;) { + if (p->start >= addr + size) + break; + if (p->end < addr) { + /* No need to consider children */ + p = next_resource_skip_children(p); + continue; + } + /* + * A system RAM resource is excluded if IORESOURCE_EXCLUSIVE + * is set, even if not busy and even if we don't have strict + * checks enabled -- no ifs or buts. + */ + if ((p->flags & flags) == flags) { + excluded = true; + break; + } + p = next_resource(p); + } + read_unlock(&resource_lock); + + return excluded; +} + #ifdef CONFIG_STRICT_DEVMEM static int strict_iomem_checks = 1; #else diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug index 5ddd575159fb..d0ce6e23a6db 100644 --- a/lib/Kconfig.debug +++ b/lib/Kconfig.debug @@ -1780,7 +1780,9 @@ config STRICT_DEVMEM access to this is obviously disastrous, but specific access can be used by people debugging the kernel. Note that with PAT support enabled, even in this case there are restrictions on /dev/mem - use due to the cache aliasing requirements. + use due to the cache aliasing requirements. Further, some drivers + will still restrict access to some physical memory regions either + already used or to be used in the future as system RAM. If this option is switched on, and IO_STRICT_DEVMEM=n, the /dev/mem file only allows userspace access to PCI space and the BIOS code and -- 2.31.1 ^ permalink raw reply related [flat|nested] 5+ messages in thread
* [PATCH v1 1/3] /dev/mem: disallow access to explicitly excluded system RAM regions @ 2021-08-11 20:36 ` David Hildenbrand 0 siblings, 0 replies; 5+ messages in thread From: David Hildenbrand @ 2021-08-11 20:36 UTC (permalink / raw) To: linux-kernel Cc: Arnd Bergmann, Michael S. Tsirkin, Greg Kroah-Hartman, Rafael J. Wysocki, virtualization, linux-mm, Hanjun Guo, Andrew Morton, Andy Shevchenko, Dan Williams virtio-mem dynamically exposes memory inside a device memory region as system RAM to Linux, coordinating with the hypervisor which parts are actually "plugged" and consequently usable/accessible. On the one hand, the virtio-mem driver adds/removes whole memory blocks, creating/removing busy IORESOURCE_SYSTEM_RAM resources, on the other hand, it logically (un)plugs memory inside added memory blocks, dynamically either exposing them to the buddy or hiding them from the buddy and marking them PG_offline. virtio-mem wants to make sure that in a sane environment, nobody "accidentially" accesses unplugged memory inside the device managed region. After /proc/kcore has been sanitized and /dev/kmem has been removed, /dev/mem is the remaining interface that still allows uncontrolled access to the device-managed region of virtio-mem devices from user space. There is no known sane use case for mapping virtio-mem device memory via /dev/mem while virtio-mem driver concurrently (un)plugs memory inside that region. So once the driver was loaded and detected the device along the device-managed region, we just want to disallow any access via /dev/mem to it. Let's add the basic infrastructure to exclude some physical memory regions completely from /dev/mem access, on any architecture and under any system configuration (independent of CONFIG_STRICT_DEVMEM and independent of "iomem="). Any range marked with "IORESOURCE_SYSTEM_RAM | IORESOURCE_EXCLUSIVE" will be excluded, even if not busy. For now, there are no applicable ranges and we'll modify virtio-mem next to properly set IORESOURCE_EXCLUSIVE on the parent resource. As next_resource() will iterate over children although we might want to skip a certain range completely, let's add and use next_range_skip_children() to optimize that case, avoding having to traverse subtrees that are not of interest. Signed-off-by: David Hildenbrand <david@redhat.com> --- drivers/char/mem.c | 22 +++++++++------------- include/linux/ioport.h | 1 + kernel/resource.c | 42 ++++++++++++++++++++++++++++++++++++++++++ lib/Kconfig.debug | 4 +++- 4 files changed, 55 insertions(+), 14 deletions(-) diff --git a/drivers/char/mem.c b/drivers/char/mem.c index 1c596b5cdb27..bb6d95daab45 100644 --- a/drivers/char/mem.c +++ b/drivers/char/mem.c @@ -60,13 +60,18 @@ static inline int valid_mmap_phys_addr_range(unsigned long pfn, size_t size) } #endif -#ifdef CONFIG_STRICT_DEVMEM static inline int page_is_allowed(unsigned long pfn) { - return devmem_is_allowed(pfn); +#ifdef CONFIG_STRICT_DEVMEM + if (!devmem_is_allowed(pfn)) + return 0; +#endif /* CONFIG_STRICT_DEVMEM */ + return !iomem_range_contains_excluded(PFN_PHYS(pfn), PAGE_SIZE); } + static inline int range_is_allowed(unsigned long pfn, unsigned long size) { +#ifdef CONFIG_STRICT_DEVMEM u64 from = ((u64)pfn) << PAGE_SHIFT; u64 to = from + size; u64 cursor = from; @@ -77,18 +82,9 @@ static inline int range_is_allowed(unsigned long pfn, unsigned long size) cursor += PAGE_SIZE; pfn++; } - return 1; -} -#else -static inline int page_is_allowed(unsigned long pfn) -{ - return 1; -} -static inline int range_is_allowed(unsigned long pfn, unsigned long size) -{ - return 1; +#endif /* CONFIG_STRICT_DEVMEM */ + return !iomem_range_contains_excluded(PFN_PHYS(pfn), size); } -#endif #ifndef unxlate_dev_mem_ptr #define unxlate_dev_mem_ptr unxlate_dev_mem_ptr diff --git a/include/linux/ioport.h b/include/linux/ioport.h index 8359c50f9988..50523c28a5f1 100644 --- a/include/linux/ioport.h +++ b/include/linux/ioport.h @@ -308,6 +308,7 @@ extern struct resource * __devm_request_region(struct device *dev, extern void __devm_release_region(struct device *dev, struct resource *parent, resource_size_t start, resource_size_t n); extern int iomem_map_sanity_check(resource_size_t addr, unsigned long size); +extern bool iomem_range_contains_excluded(u64 addr, u64 size); extern bool iomem_is_exclusive(u64 addr); extern int diff --git a/kernel/resource.c b/kernel/resource.c index ca9f5198a01f..2938cf520ca3 100644 --- a/kernel/resource.c +++ b/kernel/resource.c @@ -73,6 +73,13 @@ static struct resource *next_resource(struct resource *p) return p->sibling; } +static struct resource *next_resource_skip_children(struct resource *p) +{ + while (!p->sibling && p->parent) + p = p->parent; + return p->sibling; +} + static void *r_next(struct seq_file *m, void *v, loff_t *pos) { struct resource *p = v; @@ -1700,6 +1707,41 @@ int iomem_map_sanity_check(resource_size_t addr, unsigned long size) return err; } +/* + * Check if a physical memory range is completely excluded from getting + * mapped/accessed via /dev/mem. + */ +bool iomem_range_contains_excluded(u64 addr, u64 size) +{ + const unsigned int flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_EXCLUSIVE; + bool excluded = false; + struct resource *p; + + read_lock(&resource_lock); + for (p = iomem_resource.child; p ;) { + if (p->start >= addr + size) + break; + if (p->end < addr) { + /* No need to consider children */ + p = next_resource_skip_children(p); + continue; + } + /* + * A system RAM resource is excluded if IORESOURCE_EXCLUSIVE + * is set, even if not busy and even if we don't have strict + * checks enabled -- no ifs or buts. + */ + if ((p->flags & flags) == flags) { + excluded = true; + break; + } + p = next_resource(p); + } + read_unlock(&resource_lock); + + return excluded; +} + #ifdef CONFIG_STRICT_DEVMEM static int strict_iomem_checks = 1; #else diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug index 5ddd575159fb..d0ce6e23a6db 100644 --- a/lib/Kconfig.debug +++ b/lib/Kconfig.debug @@ -1780,7 +1780,9 @@ config STRICT_DEVMEM access to this is obviously disastrous, but specific access can be used by people debugging the kernel. Note that with PAT support enabled, even in this case there are restrictions on /dev/mem - use due to the cache aliasing requirements. + use due to the cache aliasing requirements. Further, some drivers + will still restrict access to some physical memory regions either + already used or to be used in the future as system RAM. If this option is switched on, and IO_STRICT_DEVMEM=n, the /dev/mem file only allows userspace access to PCI space and the BIOS code and -- 2.31.1 _______________________________________________ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization ^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH v1 1/3] /dev/mem: disallow access to explicitly excluded system RAM regions 2021-08-11 20:36 ` David Hildenbrand @ 2021-08-11 20:50 ` Andy Shevchenko -1 siblings, 0 replies; 5+ messages in thread From: Andy Shevchenko @ 2021-08-11 20:50 UTC (permalink / raw) To: David Hildenbrand Cc: linux-kernel, Arnd Bergmann, Greg Kroah-Hartman, Michael S. Tsirkin, Jason Wang, Rafael J. Wysocki, Andrew Morton, Dan Williams, Hanjun Guo, Andy Shevchenko, virtualization, linux-mm [-- Attachment #1: Type: text/plain, Size: 7388 bytes --] On Wednesday, August 11, 2021, David Hildenbrand <david@redhat.com> wrote: > virtio-mem dynamically exposes memory inside a device memory region as > system RAM to Linux, coordinating with the hypervisor which parts are > actually "plugged" and consequently usable/accessible. On the one hand, the > virtio-mem driver adds/removes whole memory blocks, creating/removing busy > IORESOURCE_SYSTEM_RAM resources, on the other hand, it logically (un)plugs > memory inside added memory blocks, dynamically either exposing them to > the buddy or hiding them from the buddy and marking them PG_offline. > > virtio-mem wants to make sure that in a sane environment, nobody > "accidentially" accesses unplugged memory inside the device managed > region. After /proc/kcore has been sanitized and /dev/kmem has been > removed, /dev/mem is the remaining interface that still allows uncontrolled > access to the device-managed region of virtio-mem devices from user > space. > > There is no known sane use case for mapping virtio-mem device memory > via /dev/mem while virtio-mem driver concurrently (un)plugs memory inside > that region. So once the driver was loaded and detected the device > along the device-managed region, we just want to disallow any access via > /dev/mem to it. > > Let's add the basic infrastructure to exclude some physical memory > regions completely from /dev/mem access, on any architecture and under > any system configuration (independent of CONFIG_STRICT_DEVMEM and > independent of "iomem="). > > Any range marked with "IORESOURCE_SYSTEM_RAM | IORESOURCE_EXCLUSIVE" > will be excluded, even if not busy. For now, there are no applicable > ranges and we'll modify virtio-mem next to properly set > IORESOURCE_EXCLUSIVE on the parent resource. > > As next_resource() will iterate over children although we might want to > skip a certain range completely, let's add and use > next_range_skip_children() to optimize that case, avoding having to > traverse subtrees that are not of interest. > > Signed-off-by: David Hildenbrand <david@redhat.com> > --- > drivers/char/mem.c | 22 +++++++++------------- > include/linux/ioport.h | 1 + > kernel/resource.c | 42 ++++++++++++++++++++++++++++++++++++++++++ > lib/Kconfig.debug | 4 +++- > 4 files changed, 55 insertions(+), 14 deletions(-) > > diff --git a/drivers/char/mem.c b/drivers/char/mem.c > index 1c596b5cdb27..bb6d95daab45 100644 > --- a/drivers/char/mem.c > +++ b/drivers/char/mem.c > @@ -60,13 +60,18 @@ static inline int valid_mmap_phys_addr_range(unsigned > long pfn, size_t size) > } > #endif > > -#ifdef CONFIG_STRICT_DEVMEM > static inline int page_is_allowed(unsigned long pfn) > { > - return devmem_is_allowed(pfn); > +#ifdef CONFIG_STRICT_DEVMEM > + if (!devmem_is_allowed(pfn)) > + return 0; > +#endif /* CONFIG_STRICT_DEVMEM */ > + return !iomem_range_contains_excluded(PFN_PHYS(pfn), PAGE_SIZE); > } > + > static inline int range_is_allowed(unsigned long pfn, unsigned long size) > { > +#ifdef CONFIG_STRICT_DEVMEM > u64 from = ((u64)pfn) << PAGE_SHIFT; > u64 to = from + size; > u64 cursor = from; > @@ -77,18 +82,9 @@ static inline int range_is_allowed(unsigned long pfn, > unsigned long size) > cursor += PAGE_SIZE; > pfn++; > } > - return 1; > -} > -#else > -static inline int page_is_allowed(unsigned long pfn) > -{ > - return 1; > -} > -static inline int range_is_allowed(unsigned long pfn, unsigned long size) > -{ > - return 1; > +#endif /* CONFIG_STRICT_DEVMEM */ > + return !iomem_range_contains_excluded(PFN_PHYS(pfn), size); > } > -#endif > > #ifndef unxlate_dev_mem_ptr > #define unxlate_dev_mem_ptr unxlate_dev_mem_ptr > diff --git a/include/linux/ioport.h b/include/linux/ioport.h > index 8359c50f9988..50523c28a5f1 100644 > --- a/include/linux/ioport.h > +++ b/include/linux/ioport.h > @@ -308,6 +308,7 @@ extern struct resource * __devm_request_region(struct > device *dev, > extern void __devm_release_region(struct device *dev, struct resource > *parent, > resource_size_t start, resource_size_t > n); > extern int iomem_map_sanity_check(resource_size_t addr, unsigned long > size); > +extern bool iomem_range_contains_excluded(u64 addr, u64 size); > extern bool iomem_is_exclusive(u64 addr); > > extern int > diff --git a/kernel/resource.c b/kernel/resource.c > index ca9f5198a01f..2938cf520ca3 100644 > --- a/kernel/resource.c > +++ b/kernel/resource.c > @@ -73,6 +73,13 @@ static struct resource *next_resource(struct resource > *p) > return p->sibling; > } > > +static struct resource *next_resource_skip_children(struct resource *p) > +{ > + while (!p->sibling && p->parent) > + p = p->parent; > + return p->sibling; > +} > + > static void *r_next(struct seq_file *m, void *v, loff_t *pos) > { > struct resource *p = v; > @@ -1700,6 +1707,41 @@ int iomem_map_sanity_check(resource_size_t addr, > unsigned long size) > return err; > } > > +/* > + * Check if a physical memory range is completely excluded from getting > + * mapped/accessed via /dev/mem. > + */ > +bool iomem_range_contains_excluded(u64 addr, u64 size) > +{ > + const unsigned int flags = IORESOURCE_SYSTEM_RAM | > IORESOURCE_EXCLUSIVE; > + bool excluded = false; > + struct resource *p; > + > + read_lock(&resource_lock); > + for (p = iomem_resource.child; p ;) { Same comment as per patch 3. > + if (p->start >= addr + size) > + break; > + if (p->end < addr) { > + /* No need to consider children */ > + p = next_resource_skip_children(p); > + continue; > + } > + /* > + * A system RAM resource is excluded if > IORESOURCE_EXCLUSIVE > + * is set, even if not busy and even if we don't have > strict > + * checks enabled -- no ifs or buts. > + */ > + if ((p->flags & flags) == flags) { > + excluded = true; > + break; > + } > + p = next_resource(p); > + } > + read_unlock(&resource_lock); > + > + return excluded; > +} > + > #ifdef CONFIG_STRICT_DEVMEM > static int strict_iomem_checks = 1; > #else > diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug > index 5ddd575159fb..d0ce6e23a6db 100644 > --- a/lib/Kconfig.debug > +++ b/lib/Kconfig.debug > @@ -1780,7 +1780,9 @@ config STRICT_DEVMEM > access to this is obviously disastrous, but specific access can > be used by people debugging the kernel. Note that with PAT > support > enabled, even in this case there are restrictions on /dev/mem > - use due to the cache aliasing requirements. > + use due to the cache aliasing requirements. Further, some drivers > + will still restrict access to some physical memory regions either > + already used or to be used in the future as system RAM. > > If this option is switched on, and IO_STRICT_DEVMEM=n, the > /dev/mem > file only allows userspace access to PCI space and the BIOS code > and > -- > 2.31.1 > > -- With Best Regards, Andy Shevchenko [-- Attachment #2: Type: text/html, Size: 8860 bytes --] ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH v1 1/3] /dev/mem: disallow access to explicitly excluded system RAM regions @ 2021-08-11 20:50 ` Andy Shevchenko 0 siblings, 0 replies; 5+ messages in thread From: Andy Shevchenko @ 2021-08-11 20:50 UTC (permalink / raw) To: David Hildenbrand Cc: Arnd Bergmann, Michael S. Tsirkin, Greg Kroah-Hartman, Rafael J. Wysocki, linux-kernel, virtualization, linux-mm, Hanjun Guo, Andrew Morton, Andy Shevchenko, Dan Williams [-- Attachment #1.1: Type: text/plain, Size: 7388 bytes --] On Wednesday, August 11, 2021, David Hildenbrand <david@redhat.com> wrote: > virtio-mem dynamically exposes memory inside a device memory region as > system RAM to Linux, coordinating with the hypervisor which parts are > actually "plugged" and consequently usable/accessible. On the one hand, the > virtio-mem driver adds/removes whole memory blocks, creating/removing busy > IORESOURCE_SYSTEM_RAM resources, on the other hand, it logically (un)plugs > memory inside added memory blocks, dynamically either exposing them to > the buddy or hiding them from the buddy and marking them PG_offline. > > virtio-mem wants to make sure that in a sane environment, nobody > "accidentially" accesses unplugged memory inside the device managed > region. After /proc/kcore has been sanitized and /dev/kmem has been > removed, /dev/mem is the remaining interface that still allows uncontrolled > access to the device-managed region of virtio-mem devices from user > space. > > There is no known sane use case for mapping virtio-mem device memory > via /dev/mem while virtio-mem driver concurrently (un)plugs memory inside > that region. So once the driver was loaded and detected the device > along the device-managed region, we just want to disallow any access via > /dev/mem to it. > > Let's add the basic infrastructure to exclude some physical memory > regions completely from /dev/mem access, on any architecture and under > any system configuration (independent of CONFIG_STRICT_DEVMEM and > independent of "iomem="). > > Any range marked with "IORESOURCE_SYSTEM_RAM | IORESOURCE_EXCLUSIVE" > will be excluded, even if not busy. For now, there are no applicable > ranges and we'll modify virtio-mem next to properly set > IORESOURCE_EXCLUSIVE on the parent resource. > > As next_resource() will iterate over children although we might want to > skip a certain range completely, let's add and use > next_range_skip_children() to optimize that case, avoding having to > traverse subtrees that are not of interest. > > Signed-off-by: David Hildenbrand <david@redhat.com> > --- > drivers/char/mem.c | 22 +++++++++------------- > include/linux/ioport.h | 1 + > kernel/resource.c | 42 ++++++++++++++++++++++++++++++++++++++++++ > lib/Kconfig.debug | 4 +++- > 4 files changed, 55 insertions(+), 14 deletions(-) > > diff --git a/drivers/char/mem.c b/drivers/char/mem.c > index 1c596b5cdb27..bb6d95daab45 100644 > --- a/drivers/char/mem.c > +++ b/drivers/char/mem.c > @@ -60,13 +60,18 @@ static inline int valid_mmap_phys_addr_range(unsigned > long pfn, size_t size) > } > #endif > > -#ifdef CONFIG_STRICT_DEVMEM > static inline int page_is_allowed(unsigned long pfn) > { > - return devmem_is_allowed(pfn); > +#ifdef CONFIG_STRICT_DEVMEM > + if (!devmem_is_allowed(pfn)) > + return 0; > +#endif /* CONFIG_STRICT_DEVMEM */ > + return !iomem_range_contains_excluded(PFN_PHYS(pfn), PAGE_SIZE); > } > + > static inline int range_is_allowed(unsigned long pfn, unsigned long size) > { > +#ifdef CONFIG_STRICT_DEVMEM > u64 from = ((u64)pfn) << PAGE_SHIFT; > u64 to = from + size; > u64 cursor = from; > @@ -77,18 +82,9 @@ static inline int range_is_allowed(unsigned long pfn, > unsigned long size) > cursor += PAGE_SIZE; > pfn++; > } > - return 1; > -} > -#else > -static inline int page_is_allowed(unsigned long pfn) > -{ > - return 1; > -} > -static inline int range_is_allowed(unsigned long pfn, unsigned long size) > -{ > - return 1; > +#endif /* CONFIG_STRICT_DEVMEM */ > + return !iomem_range_contains_excluded(PFN_PHYS(pfn), size); > } > -#endif > > #ifndef unxlate_dev_mem_ptr > #define unxlate_dev_mem_ptr unxlate_dev_mem_ptr > diff --git a/include/linux/ioport.h b/include/linux/ioport.h > index 8359c50f9988..50523c28a5f1 100644 > --- a/include/linux/ioport.h > +++ b/include/linux/ioport.h > @@ -308,6 +308,7 @@ extern struct resource * __devm_request_region(struct > device *dev, > extern void __devm_release_region(struct device *dev, struct resource > *parent, > resource_size_t start, resource_size_t > n); > extern int iomem_map_sanity_check(resource_size_t addr, unsigned long > size); > +extern bool iomem_range_contains_excluded(u64 addr, u64 size); > extern bool iomem_is_exclusive(u64 addr); > > extern int > diff --git a/kernel/resource.c b/kernel/resource.c > index ca9f5198a01f..2938cf520ca3 100644 > --- a/kernel/resource.c > +++ b/kernel/resource.c > @@ -73,6 +73,13 @@ static struct resource *next_resource(struct resource > *p) > return p->sibling; > } > > +static struct resource *next_resource_skip_children(struct resource *p) > +{ > + while (!p->sibling && p->parent) > + p = p->parent; > + return p->sibling; > +} > + > static void *r_next(struct seq_file *m, void *v, loff_t *pos) > { > struct resource *p = v; > @@ -1700,6 +1707,41 @@ int iomem_map_sanity_check(resource_size_t addr, > unsigned long size) > return err; > } > > +/* > + * Check if a physical memory range is completely excluded from getting > + * mapped/accessed via /dev/mem. > + */ > +bool iomem_range_contains_excluded(u64 addr, u64 size) > +{ > + const unsigned int flags = IORESOURCE_SYSTEM_RAM | > IORESOURCE_EXCLUSIVE; > + bool excluded = false; > + struct resource *p; > + > + read_lock(&resource_lock); > + for (p = iomem_resource.child; p ;) { Same comment as per patch 3. > + if (p->start >= addr + size) > + break; > + if (p->end < addr) { > + /* No need to consider children */ > + p = next_resource_skip_children(p); > + continue; > + } > + /* > + * A system RAM resource is excluded if > IORESOURCE_EXCLUSIVE > + * is set, even if not busy and even if we don't have > strict > + * checks enabled -- no ifs or buts. > + */ > + if ((p->flags & flags) == flags) { > + excluded = true; > + break; > + } > + p = next_resource(p); > + } > + read_unlock(&resource_lock); > + > + return excluded; > +} > + > #ifdef CONFIG_STRICT_DEVMEM > static int strict_iomem_checks = 1; > #else > diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug > index 5ddd575159fb..d0ce6e23a6db 100644 > --- a/lib/Kconfig.debug > +++ b/lib/Kconfig.debug > @@ -1780,7 +1780,9 @@ config STRICT_DEVMEM > access to this is obviously disastrous, but specific access can > be used by people debugging the kernel. Note that with PAT > support > enabled, even in this case there are restrictions on /dev/mem > - use due to the cache aliasing requirements. > + use due to the cache aliasing requirements. Further, some drivers > + will still restrict access to some physical memory regions either > + already used or to be used in the future as system RAM. > > If this option is switched on, and IO_STRICT_DEVMEM=n, the > /dev/mem > file only allows userspace access to PCI space and the BIOS code > and > -- > 2.31.1 > > -- With Best Regards, Andy Shevchenko [-- Attachment #1.2: Type: text/html, Size: 8860 bytes --] [-- Attachment #2: Type: text/plain, Size: 183 bytes --] _______________________________________________ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization ^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2021-08-12 1:42 UTC | newest] Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2021-08-12 1:42 [PATCH v1 1/3] /dev/mem: disallow access to explicitly excluded system RAM regions kernel test robot -- strict thread matches above, loose matches on Subject: below -- 2021-08-11 20:36 [PATCH v1 0/3] virtio-mem: disallow mapping virtio-mem memory via /dev/mem David Hildenbrand 2021-08-11 20:36 ` [PATCH v1 1/3] /dev/mem: disallow access to explicitly excluded system RAM regions David Hildenbrand 2021-08-11 20:36 ` David Hildenbrand 2021-08-11 20:50 ` Andy Shevchenko 2021-08-11 20:50 ` Andy Shevchenko
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.