linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [patch] sg: simplify page_count manipulations
@ 2006-01-18 15:52 Nick Piggin
  2006-01-19  3:59 ` Andrew Morton
  0 siblings, 1 reply; 8+ messages in thread
From: Nick Piggin @ 2006-01-18 15:52 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Andrew Morton, Linux Kernel Mailing List

Hi Linus,

What do you think about the following couple of patches? Hugh's had a
look and thinks they're OK.

Nick

--
Allocate a compound page for the user mapping instead of tweaking
the page refcounts.

Signed-off-by: Nick Piggin <npiggin@suse.de>

Index: linux-2.6/drivers/scsi/sg.c
===================================================================
--- linux-2.6.orig/drivers/scsi/sg.c
+++ linux-2.6/drivers/scsi/sg.c
@@ -1140,32 +1140,6 @@ sg_fasync(int fd, struct file *filp, int
 	return (retval < 0) ? retval : 0;
 }
 
-/* When startFinish==1 increments page counts for pages other than the 
-   first of scatter gather elements obtained from alloc_pages().
-   When startFinish==0 decrements ... */
-static void
-sg_rb_correct4mmap(Sg_scatter_hold * rsv_schp, int startFinish)
-{
-	struct scatterlist *sg = rsv_schp->buffer;
-	struct page *page;
-	int k, m;
-
-	SCSI_LOG_TIMEOUT(3, printk("sg_rb_correct4mmap: startFinish=%d, scatg=%d\n", 
-				   startFinish, rsv_schp->k_use_sg));
-	/* N.B. correction _not_ applied to base page of each allocation */
-	for (k = 0; k < rsv_schp->k_use_sg; ++k, ++sg) {
-		for (m = PAGE_SIZE; m < sg->length; m += PAGE_SIZE) {
-			page = sg->page;
-			if (startFinish)
-				get_page(page);
-			else {
-				if (page_count(page) > 0)
-					__put_page(page);
-			}
-		}
-	}
-}
-
 static struct page *
 sg_vma_nopage(struct vm_area_struct *vma, unsigned long addr, int *type)
 {
@@ -1237,10 +1211,7 @@ sg_mmap(struct file *filp, struct vm_are
 		sa += len;
 	}
 
-	if (0 == sfp->mmap_called) {
-		sg_rb_correct4mmap(rsv_schp, 1);	/* do only once per fd lifetime */
-		sfp->mmap_called = 1;
-	}
+	sfp->mmap_called = 1;
 	vma->vm_flags |= VM_RESERVED;
 	vma->vm_private_data = sfp;
 	vma->vm_ops = &sg_mmap_vm_ops;
@@ -2395,8 +2366,6 @@ __sg_remove_sfp(Sg_device * sdp, Sg_fd *
 		SCSI_LOG_TIMEOUT(6, 
 			printk("__sg_remove_sfp:    bufflen=%d, k_use_sg=%d\n",
 			(int) sfp->reserve.bufflen, (int) sfp->reserve.k_use_sg));
-		if (sfp->mmap_called)
-			sg_rb_correct4mmap(&sfp->reserve, 0);	/* undo correction */
 		sg_remove_scat(&sfp->reserve);
 	}
 	sfp->parentdp = NULL;
@@ -2478,9 +2447,9 @@ sg_page_malloc(int rqSz, int lowDma, int
 		return resp;
 
 	if (lowDma)
-		page_mask = GFP_ATOMIC | GFP_DMA | __GFP_NOWARN;
+		page_mask = GFP_ATOMIC | GFP_DMA | __GFP_COMP | __GFP_NOWARN;
 	else
-		page_mask = GFP_ATOMIC | __GFP_NOWARN;
+		page_mask = GFP_ATOMIC | __GFP_COMP | __GFP_NOWARN;
 
 	for (order = 0, a_size = PAGE_SIZE; a_size < rqSz;
 	     order++, a_size <<= 1) ;

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [patch] sg: simplify page_count manipulations
  2006-01-18 15:52 [patch] sg: simplify page_count manipulations Nick Piggin
@ 2006-01-19  3:59 ` Andrew Morton
  2006-01-19 14:45   ` Nick Piggin
  0 siblings, 1 reply; 8+ messages in thread
From: Andrew Morton @ 2006-01-19  3:59 UTC (permalink / raw)
  To: Nick Piggin; +Cc: torvalds, linux-kernel, linux-scsi

Nick Piggin <npiggin@suse.de> wrote:
>
> Hi Linus,
> 
> What do you think about the following couple of patches? Hugh's had a
> look and thinks they're OK.
> 

Gad.  You're brave.

> Allocate a compound page for the user mapping instead of tweaking
> the page refcounts.
> 
> Signed-off-by: Nick Piggin <npiggin@suse.de>
> 
> Index: linux-2.6/drivers/scsi/sg.c
> ===================================================================
> --- linux-2.6.orig/drivers/scsi/sg.c
> +++ linux-2.6/drivers/scsi/sg.c
> @@ -1140,32 +1140,6 @@ sg_fasync(int fd, struct file *filp, int
>  	return (retval < 0) ? retval : 0;
>  }
>  
> -/* When startFinish==1 increments page counts for pages other than the 
> -   first of scatter gather elements obtained from alloc_pages().
> -   When startFinish==0 decrements ... */
> -static void
> -sg_rb_correct4mmap(Sg_scatter_hold * rsv_schp, int startFinish)
> -{
> -	struct scatterlist *sg = rsv_schp->buffer;
> -	struct page *page;
> -	int k, m;
> -
> -	SCSI_LOG_TIMEOUT(3, printk("sg_rb_correct4mmap: startFinish=%d, scatg=%d\n", 
> -				   startFinish, rsv_schp->k_use_sg));
> -	/* N.B. correction _not_ applied to base page of each allocation */
> -	for (k = 0; k < rsv_schp->k_use_sg; ++k, ++sg) {
> -		for (m = PAGE_SIZE; m < sg->length; m += PAGE_SIZE) {
> -			page = sg->page;
> -			if (startFinish)
> -				get_page(page);
> -			else {
> -				if (page_count(page) > 0)
> -					__put_page(page);
> -			}
> -		}
> -	}
> -}

What on earth is the above trying to do?  The inner loop is a rather
complex way of doing atomic_add(&page->count, sg->length/PAGE_SIZE).  One
suspects there's a missing "[m]" in there.

Yes, using a compound page for the refcounting sounds sane, but I think
this code is fragile and has monsters in it.


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [patch] sg: simplify page_count manipulations
  2006-01-19  3:59 ` Andrew Morton
@ 2006-01-19 14:45   ` Nick Piggin
  2006-01-19 22:05     ` Andrew Morton
  0 siblings, 1 reply; 8+ messages in thread
From: Nick Piggin @ 2006-01-19 14:45 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Nick Piggin, torvalds, linux-kernel, linux-scsi

On Wed, Jan 18, 2006 at 07:59:37PM -0800, Andrew Morton wrote:
> Nick Piggin <npiggin@suse.de> wrote:
> > -	/* N.B. correction _not_ applied to base page of each allocation */
> > -	for (k = 0; k < rsv_schp->k_use_sg; ++k, ++sg) {
> > -		for (m = PAGE_SIZE; m < sg->length; m += PAGE_SIZE) {
> > -			page = sg->page;
> > -			if (startFinish)
> > -				get_page(page);
> > -			else {
> > -				if (page_count(page) > 0)
> > -					__put_page(page);
> > -			}
> > -		}
> > -	}
> > -}
> 
> What on earth is the above trying to do?  The inner loop is a rather
> complex way of doing atomic_add(&page->count, sg->length/PAGE_SIZE).  One
> suspects there's a missing "[m]" in there.
> 

It does this on the first mmap of the device, in the hope that subsequent
nopage, unmaps would not free the constituent pages in the scatterlist.

> Yes, using a compound page for the refcounting sounds sane, but I think
> this code is fragile and has monsters in it.



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [patch] sg: simplify page_count manipulations
  2006-01-19 14:45   ` Nick Piggin
@ 2006-01-19 22:05     ` Andrew Morton
  2006-01-20 10:18       ` Nick Piggin
  0 siblings, 1 reply; 8+ messages in thread
From: Andrew Morton @ 2006-01-19 22:05 UTC (permalink / raw)
  To: Nick Piggin; +Cc: npiggin, torvalds, linux-kernel, linux-scsi

Nick Piggin <npiggin@suse.de> wrote:
>
> On Wed, Jan 18, 2006 at 07:59:37PM -0800, Andrew Morton wrote:
> > Nick Piggin <npiggin@suse.de> wrote:
> > > -	/* N.B. correction _not_ applied to base page of each allocation */
> > > -	for (k = 0; k < rsv_schp->k_use_sg; ++k, ++sg) {
> > > -		for (m = PAGE_SIZE; m < sg->length; m += PAGE_SIZE) {
> > > -			page = sg->page;
> > > -			if (startFinish)
> > > -				get_page(page);
> > > -			else {
> > > -				if (page_count(page) > 0)
> > > -					__put_page(page);
> > > -			}
> > > -		}
> > > -	}
> > > -}
> > 
> > What on earth is the above trying to do?  The inner loop is a rather
> > complex way of doing atomic_add(&page->count, sg->length/PAGE_SIZE).  One
> > suspects there's a missing "[m]" in there.
> > 
> 
> It does this on the first mmap of the device, in the hope that subsequent
> nopage, unmaps would not free the constituent pages in the scatterlist.
> 

But it's doing it wrongly, isn't it?  Or am I completely nuts?


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [patch] sg: simplify page_count manipulations
  2006-01-19 22:05     ` Andrew Morton
@ 2006-01-20 10:18       ` Nick Piggin
  2006-01-20 10:47         ` Andrew Morton
  0 siblings, 1 reply; 8+ messages in thread
From: Nick Piggin @ 2006-01-20 10:18 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Nick Piggin, torvalds, linux-kernel, linux-scsi, dougg

On Thu, Jan 19, 2006 at 02:05:25PM -0800, Andrew Morton wrote:
> Nick Piggin <npiggin@suse.de> wrote:
> >
> > On Wed, Jan 18, 2006 at 07:59:37PM -0800, Andrew Morton wrote:
> > > Nick Piggin <npiggin@suse.de> wrote:
> > > > -	/* N.B. correction _not_ applied to base page of each allocation */
> > > > -	for (k = 0; k < rsv_schp->k_use_sg; ++k, ++sg) {
> > > > -		for (m = PAGE_SIZE; m < sg->length; m += PAGE_SIZE) {
> > > > -			page = sg->page;
> > > > -			if (startFinish)
> > > > -				get_page(page);
> > > > -			else {
> > > > -				if (page_count(page) > 0)
> > > > -					__put_page(page);
> > > > -			}
> > > > -		}
> > > > -	}
> > > > -}
> > > 
> > > What on earth is the above trying to do?  The inner loop is a rather
> > > complex way of doing atomic_add(&page->count, sg->length/PAGE_SIZE).  One
> > > suspects there's a missing "[m]" in there.
> > > 
> > 
> > It does this on the first mmap of the device, in the hope that subsequent
> > nopage, unmaps would not free the constituent pages in the scatterlist.
> > 
> 
> But it's doing it wrongly, isn't it?  Or am I completely nuts?

No I think you're right. I'm not sure why this doesn't oops but I
thought it was the (main) reason others wanted to get rid of this
convoluted code earlier on. I see nobody else is planning to do anything
about it though, so I figure I must have missed the reason why it isn't
a problem.

But either way I don't think the code actually _does_ anything, even if
its bugginess doesn't actually lead to a bug.

Nick

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [patch] sg: simplify page_count manipulations
  2006-01-20 10:18       ` Nick Piggin
@ 2006-01-20 10:47         ` Andrew Morton
  2006-01-20 16:32           ` Hugh Dickins
  0 siblings, 1 reply; 8+ messages in thread
From: Andrew Morton @ 2006-01-20 10:47 UTC (permalink / raw)
  To: Nick Piggin; +Cc: npiggin, torvalds, linux-kernel, linux-scsi, dougg

Nick Piggin <npiggin@suse.de> wrote:
>
> On Thu, Jan 19, 2006 at 02:05:25PM -0800, Andrew Morton wrote:
> > Nick Piggin <npiggin@suse.de> wrote:
> > >
> > > On Wed, Jan 18, 2006 at 07:59:37PM -0800, Andrew Morton wrote:
> > > > Nick Piggin <npiggin@suse.de> wrote:
> > > > > -	/* N.B. correction _not_ applied to base page of each allocation */
> > > > > -	for (k = 0; k < rsv_schp->k_use_sg; ++k, ++sg) {
> > > > > -		for (m = PAGE_SIZE; m < sg->length; m += PAGE_SIZE) {
> > > > > -			page = sg->page;
> > > > > -			if (startFinish)
> > > > > -				get_page(page);
> > > > > -			else {
> > > > > -				if (page_count(page) > 0)
> > > > > -					__put_page(page);
> > > > > -			}
> > > > > -		}
> > > > > -	}
> > > > > -}
> > > > 
> > > > What on earth is the above trying to do?  The inner loop is a rather
> > > > complex way of doing atomic_add(&page->count, sg->length/PAGE_SIZE).  One
> > > > suspects there's a missing "[m]" in there.
> > > > 
> > > 
> > > It does this on the first mmap of the device, in the hope that subsequent
> > > nopage, unmaps would not free the constituent pages in the scatterlist.
> > > 
> > 
> > But it's doing it wrongly, isn't it?  Or am I completely nuts?
> 
> No I think you're right. I'm not sure why this doesn't oops but I
> thought it was the (main) reason others wanted to get rid of this
> convoluted code earlier on. I see nobody else is planning to do anything
> about it though, so I figure I must have missed the reason why it isn't
> a problem.
> 
> But either way I don't think the code actually _does_ anything, even if
> its bugginess doesn't actually lead to a bug.
> 

I suspect nobody tried to munmap pages beyond the first one.

Yes, let's use a compound page in there and I expect Doug will be able to
test it for us sometime.


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [patch] sg: simplify page_count manipulations
  2006-01-20 10:47         ` Andrew Morton
@ 2006-01-20 16:32           ` Hugh Dickins
  2006-01-21 10:15             ` Nick Piggin
  0 siblings, 1 reply; 8+ messages in thread
From: Hugh Dickins @ 2006-01-20 16:32 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Nick Piggin, torvalds, linux-kernel, linux-scsi, dougg

On Fri, 20 Jan 2006, Andrew Morton wrote:
> Nick Piggin <npiggin@suse.de> wrote:
> > On Thu, Jan 19, 2006 at 02:05:25PM -0800, Andrew Morton wrote:
> > > Nick Piggin <npiggin@suse.de> wrote:
> > > >
> > > > On Wed, Jan 18, 2006 at 07:59:37PM -0800, Andrew Morton wrote:
> > > > > Nick Piggin <npiggin@suse.de> wrote:
> > > > > > -	/* N.B. correction _not_ applied to base page of each allocation */
> > > > > > -	for (k = 0; k < rsv_schp->k_use_sg; ++k, ++sg) {
> > > > > > -		for (m = PAGE_SIZE; m < sg->length; m += PAGE_SIZE) {
> > > > > > -			page = sg->page;
> > > > > > -			if (startFinish)
> > > > > > -				get_page(page);
> > > > > > -			else {
> > > > > > -				if (page_count(page) > 0)
> > > > > > -					__put_page(page);
> > > > > > -			}
> > > > > > -		}
> > > > > > -	}
> > > > > > -}
> > > > > 
> > > > > What on earth is the above trying to do?  The inner loop is a rather
> > > > > complex way of doing atomic_add(&page->count, sg->length/PAGE_SIZE).  One
> > > > > suspects there's a missing "[m]" in there.
> > > > > 
> > > > 
> > > > It does this on the first mmap of the device, in the hope that subsequent
> > > > nopage, unmaps would not free the constituent pages in the scatterlist.
> > > > 
> > > 
> > > But it's doing it wrongly, isn't it?  Or am I completely nuts?
> > 
> > No I think you're right. I'm not sure why this doesn't oops but I
> > thought it was the (main) reason others wanted to get rid of this
> > convoluted code earlier on. I see nobody else is planning to do anything
> > about it though, so I figure I must have missed the reason why it isn't
> > a problem.
> > 
> > But either way I don't think the code actually _does_ anything, even if
> > its bugginess doesn't actually lead to a bug.
> > 
> 
> I suspect nobody tried to munmap pages beyond the first one.
> 
> Yes, let's use a compound page in there and I expect Doug will be able to
> test it for us sometime.

That function did move page along in 2.6.15, but has got screwed up since:
good reason, I think, to speed Nick's patch through to clean it all away.

Hugh

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [patch] sg: simplify page_count manipulations
  2006-01-20 16:32           ` Hugh Dickins
@ 2006-01-21 10:15             ` Nick Piggin
  0 siblings, 0 replies; 8+ messages in thread
From: Nick Piggin @ 2006-01-21 10:15 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Andrew Morton, Nick Piggin, torvalds, linux-kernel, linux-scsi, dougg

On Fri, Jan 20, 2006 at 04:32:00PM +0000, Hugh Dickins wrote:
> On Fri, 20 Jan 2006, Andrew Morton wrote:
> > 
> > I suspect nobody tried to munmap pages beyond the first one.
> > 
> > Yes, let's use a compound page in there and I expect Doug will be able to
> > test it for us sometime.
> 
> That function did move page along in 2.6.15, but has got screwed up since:
> good reason, I think, to speed Nick's patch through to clean it all away.
> 

Sorry, I'm wrong then (about the conversation around the initial patch).

Definitely restore 2.6.15 behaviour for 2.6.16 if you are hesitant to
change it. Doug's call I guess.


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2006-01-21 10:15 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2006-01-18 15:52 [patch] sg: simplify page_count manipulations Nick Piggin
2006-01-19  3:59 ` Andrew Morton
2006-01-19 14:45   ` Nick Piggin
2006-01-19 22:05     ` Andrew Morton
2006-01-20 10:18       ` Nick Piggin
2006-01-20 10:47         ` Andrew Morton
2006-01-20 16:32           ` Hugh Dickins
2006-01-21 10:15             ` Nick Piggin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).