linux-kernel.vger.kernel.org archive mirror
* [PATCH] scatterlist: don't overflow length field
@ 2017-02-01 21:29 David Dillow
  2017-02-03 19:57 ` Linus Torvalds
  0 siblings, 1 reply; 3+ messages in thread
From: David Dillow @ 2017-02-01 21:29 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Andrew Morton, linux-kernel, David Dillow

When called with a region of contiguous pages totaling > 4 GB of memory,
sg_alloc_table_from_pages() will overflow the length field, leading to a
corrupt scatter list. Fix this by tracking the number of pages we've
merged and starting a new chunk when we would overflow.

Tested by building various page lists with contiguous 8 GB regions and
observing that they are correctly split without overflowing the length field.
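The arithmetic behind the split can be sketched in userspace (a minimal model of the chunk-counting pass, assuming 4 KiB pages; `count_chunks()` is a hypothetical stand-in, not the kernel function):

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define PAGE_SHIFT 12	/* assumed: 4 KiB pages */

/*
 * Merge adjacent PFNs into chunks, but start a new chunk before
 * (pages << PAGE_SHIFT) could overflow an unsigned int length field.
 */
static unsigned int count_chunks(const unsigned long *pfn, size_t n_pages)
{
	unsigned int chunks = 1;
	unsigned int chunk_pages = 1;
	size_t i;

	for (i = 1; i < n_pages; ++i) {
		if (pfn[i] != pfn[i - 1] + 1 ||
		    chunk_pages >= UINT_MAX >> PAGE_SHIFT) {
			++chunks;
			chunk_pages = 0;
		}
		++chunk_pages;
	}
	return chunks;
}
```

A discontiguous PFN still splits a chunk as before; the new condition additionally splits a contiguous run once it approaches 4 GB, so an 8 GB region becomes three sg entries instead of one overflowed entry.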

Signed-off-by: David Dillow <dillow@google.com>
---
 lib/scatterlist.c | 22 ++++++++++++++++++----
 1 file changed, 18 insertions(+), 4 deletions(-)

diff --git a/lib/scatterlist.c b/lib/scatterlist.c
index 004fc70fc56a..539dd344f1c5 100644
--- a/lib/scatterlist.c
+++ b/lib/scatterlist.c
@@ -394,17 +394,26 @@ int sg_alloc_table_from_pages(struct sg_table *sgt,
 	unsigned long offset, unsigned long size,
 	gfp_t gfp_mask)
 {
+	unsigned int chunk_pages;
 	unsigned int chunks;
 	unsigned int i;
 	unsigned int cur_page;
 	int ret;
 	struct scatterlist *s;
 
+	BUILD_BUG_ON(!typecheck(typeof(s->length), unsigned int));
+
 	/* compute number of contiguous chunks */
 	chunks = 1;
-	for (i = 1; i < n_pages; ++i)
-		if (page_to_pfn(pages[i]) != page_to_pfn(pages[i - 1]) + 1)
+	chunk_pages = 1;
+	for (i = 1; i < n_pages; ++i) {
+		if (page_to_pfn(pages[i]) != page_to_pfn(pages[i - 1]) + 1 ||
+		    chunk_pages >= UINT_MAX >> PAGE_SHIFT) {
 			++chunks;
+			chunk_pages = 0;
+		}
+		++chunk_pages;
+	}
 
 	ret = sg_alloc_table(sgt, chunks, gfp_mask);
 	if (unlikely(ret))
@@ -417,10 +426,15 @@ int sg_alloc_table_from_pages(struct sg_table *sgt,
 		unsigned int j;
 
 		/* look for the end of the current chunk */
-		for (j = cur_page + 1; j < n_pages; ++j)
+		chunk_pages = 1;
+		for (j = cur_page + 1; j < n_pages; ++j) {
 			if (page_to_pfn(pages[j]) !=
-			    page_to_pfn(pages[j - 1]) + 1)
+			    page_to_pfn(pages[j - 1]) + 1 ||
+			    chunk_pages >= UINT_MAX >> PAGE_SHIFT) {
 				break;
+			}
+			++chunk_pages;
+		}
 
 		chunk_size = ((j - cur_page) << PAGE_SHIFT) - offset;
 		sg_set_page(s, pages[cur_page], min(size, chunk_size), offset);
-- 
2.11.0.483.g087da7b7c-goog


* Re: [PATCH] scatterlist: don't overflow length field
  2017-02-01 21:29 [PATCH] scatterlist: don't overflow length field David Dillow
@ 2017-02-03 19:57 ` Linus Torvalds
  2017-02-06 18:31   ` David Dillow
  0 siblings, 1 reply; 3+ messages in thread
From: Linus Torvalds @ 2017-02-03 19:57 UTC (permalink / raw)
  To: David Dillow; +Cc: Andrew Morton, Linux Kernel Mailing List

On Wed, Feb 1, 2017 at 1:29 PM, David Dillow <dillow@google.com> wrote:
> When called with a region of contiguous pages totaling > 4 GB of memory,
> sg_alloc_table_from_pages() will overflow the length field, leading to a
> corrupt scatter list. Fix this by tracking the number of pages we've
> merged and starting a new chunk when we would overflow.

So what allows these things to be built in the first place?

We limit IO sizes to fit in a signed int (so just below 2GB) not only
because it's often an effective denial of service, but also because
we've had issues with various drivers (and filesystems) getting
int/long wrong.

So nothing should be building those kinds of scatterlists, and if
something is able to, it might result in other problems downstream.

Put another way: why not just say "this can't happen", and perhaps
have sg_alloc_table_from_pages() enforce the same check?

MAX_RW_COUNT is what we limit reads and writes to:

    #define MAX_RW_COUNT (INT_MAX & PAGE_MASK)

Hmm?

                   Linus
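The limit Linus cites is easy to verify in userspace (a sketch; the macros below are stand-ins for the kernel's, and a 4 KiB PAGE_SIZE is assumed):

```c
#include <assert.h>
#include <limits.h>

/* Userspace stand-ins for the kernel's page constants; 4 KiB pages assumed. */
#define PAGE_SHIFT	12
#define PAGE_SIZE	(1UL << PAGE_SHIFT)
#define PAGE_MASK	(~(PAGE_SIZE - 1))

/* Reads and writes are capped at INT_MAX rounded down to a page boundary. */
#define MAX_RW_COUNT	((unsigned long)(INT_MAX & PAGE_MASK))
```

With 4 KiB pages this works out to 0x7ffff000 bytes (2,147,479,552), i.e. INT_MAX rounded down to the nearest page, just under 2 GB.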


* Re: [PATCH] scatterlist: don't overflow length field
  2017-02-03 19:57 ` Linus Torvalds
@ 2017-02-06 18:31   ` David Dillow
  0 siblings, 0 replies; 3+ messages in thread
From: David Dillow @ 2017-02-06 18:31 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andrew Morton, Linux Kernel Mailing List, axboe, hch, ming.lei

+Jens, Christoph, and Ming based on off-list suggestion

On Fri, Feb 3, 2017 at 11:57 AM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> On Wed, Feb 1, 2017 at 1:29 PM, David Dillow <dillow@google.com> wrote:
> > When called with a region of contiguous pages totaling > 4 GB of memory,
> > sg_alloc_table_from_pages() will overflow the length field, leading to a
> > corrupt scatter list. Fix this by tracking the number of pages we've
> > merged and starting a new chunk when we would overflow.
>
> So what allows these things to be built in the first place?
>
> We limit IO sizes to fit in a signed int (so just below 2GB) not only
> because it's often an effective denial of service, but also because
> we've had issues with various drivers (and filesystems) getting
> int/long wrong.
>
> So nothing should be building those kinds of scatterlists, and if
> something is able to, it might result in other problems downstream.

This isn't from normal read/write IO -- some applications want to access
large amounts of userspace memory directly from hardware, and it is
cleaner for them to manage one mapping than multiple 1 GB or 2 GB
mappings -- assuming the hardware can even support multiple mappings. If
they have room in their container to allocate and pin the memory, we'd
like to allow it.

There's definitely potential for problems downstream, even without going
through the filesystem and block layers -- we noticed this potential
issue while tracking down a bug in the IOMMU code when an entry in the
list was over 1 GB. We still see a benefit from building the large
entries, though -- they allow superpages in the IOMMU mapping, which
helps the IOTLB cache.

We currently use sg_alloc_table_from_pages() to build the scatterlist
for dma_map_sg(), but we could do it ourselves if you'd rather add a
length limit to the more general code.
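The superpage benefit mentioned above is easy to quantify with a back-of-the-envelope sketch (page and superpage sizes assumed: 4 KiB base pages, 1 GiB IOMMU superpages; `entries_needed()` is a hypothetical helper for illustration):

```c
#include <assert.h>

#define SZ_4K	(4096UL)
#define SZ_1G	(1024UL * 1024 * 1024)

/* Number of IOTLB translations needed to cover a mapping of the given size. */
static unsigned long entries_needed(unsigned long bytes, unsigned long pg)
{
	return (bytes + pg - 1) / pg;	/* round up to whole pages */
}
```

Covering an 8 GiB pinned region takes over two million 4 KiB translations but only eight 1 GiB superpage translations, which is why keeping the sg entries large (and merely capped below 4 GB) is preferable to pre-splitting at 1 GB or 2 GB.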

