All of lore.kernel.org
 help / color / mirror / Atom feed
* Large amount of scsi-sgpool objects
@ 2009-03-03  1:28 Jan Engelhardt
  2009-03-03  9:31 ` Boaz Harrosh
  0 siblings, 1 reply; 66+ messages in thread
From: Jan Engelhardt @ 2009-03-03  1:28 UTC (permalink / raw)
  To: linux-scsi; +Cc: Linux Kernel Mailing List

Hi,


I am noticing that there are a lot of objects active after a few tens 
minutes of running xfs_fsr.

$ slabtop
  OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME                   
818616 818616 100%    0.16K  34109       24    136436K sgpool-8
253692 253692 100%    0.62K  42282        6    169128K sgpool-32
 52017  52016  99%    2.50K  17339        3    138712K sgpool-128
 26220  26219  99%    0.31K   2185       12      8740K sgpool-16
  8927   8574  96%    0.03K     79      113       316K size-32

$ uname -a
Linux yaguchi 2.6.29-rc6-rt #1 SMP PREEMPT RT 2009-02-19 23:12:33 
+0100 i686 athlon i386 GNU/Linux

What could be the problem that there are so many objects around and not 
freed? This makes the system pretty much unusable after a while as it 
runs towards a low-memory condition.

$ free
             total       used       free     shared    buffers     cached
Mem:        766592     758688       7904          0          0     137184
-/+ buffers/cache:     621504     145088
Swap:       795136         32     795104


$ cat /proc/meminfo 
MemTotal:         766592 kB
MemFree:           79388 kB
Buffers:               0 kB
Cached:            62120 kB
SwapCached:         1428 kB
Active:            74472 kB
Inactive:         113524 kB
Active(anon):      58228 kB
Inactive(anon):    69124 kB
Active(file):      16244 kB
Inactive(file):    44400 kB
Unevictable:           0 kB
Mlocked:               0 kB
HighTotal:             0 kB
HighFree:              0 kB
LowTotal:         766592 kB
LowFree:           79388 kB
SwapTotal:        795136 kB
SwapFree:         792404 kB
Dirty:                 0 kB
Writeback:             0 kB
AnonPages:        124440 kB
Mapped:            13672 kB
Slab:             478576 kB
SReclaimable:       5316 kB
SUnreclaim:       473260 kB
PageTables:         1220 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:     1178432 kB
Committed_AS:   4294655912 kB
VmallocTotal:     245816 kB
VmallocUsed:       27404 kB
VmallocChunk:     197108 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       4096 kB
DirectMap4k:      696256 kB
DirectMap4M:       90112 kB

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Large amount of scsi-sgpool objects
  2009-03-03  1:28 Large amount of scsi-sgpool objects Jan Engelhardt
@ 2009-03-03  9:31 ` Boaz Harrosh
  2009-03-03 15:21   ` James Bottomley
  0 siblings, 1 reply; 66+ messages in thread
From: Boaz Harrosh @ 2009-03-03  9:31 UTC (permalink / raw)
  To: Jan Engelhardt; +Cc: linux-scsi, Linux Kernel Mailing List

Jan Engelhardt wrote:
> Hi,
> 
> 
> I am noticing that there are a lot of objects active after a few tens 
> minutes of running xfs_fsr.
> 
> $ slabtop
>   OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME                   
> 818616 818616 100%    0.16K  34109       24    136436K sgpool-8
> 253692 253692 100%    0.62K  42282        6    169128K sgpool-32
>  52017  52016  99%    2.50K  17339        3    138712K sgpool-128
>  26220  26219  99%    0.31K   2185       12      8740K sgpool-16
>   8927   8574  96%    0.03K     79      113       316K size-32
> 

Looks like a leak, by failing to call scsi_release_buffers()
somehow. (Which was changed recently)

> $ uname -a
> Linux yaguchi 2.6.29-rc6-rt #1 SMP PREEMPT RT 2009-02-19 23:12:33 
> +0100 i686 athlon i386 GNU/Linux
> 

Did you have a chance to try exact same thing with vanilla 2.6.29-rc6?

try a bisect -- drivers/scsi/scsi_lib.c there should not be more then
a couple of patches

> What could be the problem that there are so many objects around and not 
> freed? 

Only one reason, failing to call scsi_release_buffers()

There should not be more then .can_queue sg-pool objects in flight
per disk. Which is rarely more then 255

This makes the system pretty much unusable after a while as it 
> runs towards a low-memory condition.
> 
> $ free
>              total       used       free     shared    buffers     cached
> Mem:        766592     758688       7904          0          0     137184
> -/+ buffers/cache:     621504     145088
> Swap:       795136         32     795104
> 
> 
> $ cat /proc/meminfo 
> MemTotal:         766592 kB
> MemFree:           79388 kB
> Buffers:               0 kB
> Cached:            62120 kB
> SwapCached:         1428 kB
> Active:            74472 kB
> Inactive:         113524 kB
> Active(anon):      58228 kB
> Inactive(anon):    69124 kB
> Active(file):      16244 kB
> Inactive(file):    44400 kB
> Unevictable:           0 kB
> Mlocked:               0 kB
> HighTotal:             0 kB
> HighFree:              0 kB
> LowTotal:         766592 kB
> LowFree:           79388 kB
> SwapTotal:        795136 kB
> SwapFree:         792404 kB
> Dirty:                 0 kB
> Writeback:             0 kB
> AnonPages:        124440 kB
> Mapped:            13672 kB
> Slab:             478576 kB
> SReclaimable:       5316 kB
> SUnreclaim:       473260 kB
> PageTables:         1220 kB
> NFS_Unstable:          0 kB
> Bounce:                0 kB
> WritebackTmp:          0 kB
> CommitLimit:     1178432 kB
> Committed_AS:   4294655912 kB
> VmallocTotal:     245816 kB
> VmallocUsed:       27404 kB
> VmallocChunk:     197108 kB
> HugePages_Total:       0
> HugePages_Free:        0
> HugePages_Rsvd:        0
> HugePages_Surp:        0
> Hugepagesize:       4096 kB
> DirectMap4k:      696256 kB
> DirectMap4M:       90112 kB
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
> 

Please try some more tests, could it be a race that gets exposed in -rt?

Boaz

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Large amount of scsi-sgpool objects
  2009-03-03  9:31 ` Boaz Harrosh
@ 2009-03-03 15:21   ` James Bottomley
  2009-03-03 16:08     ` Jan Engelhardt
  0 siblings, 1 reply; 66+ messages in thread
From: James Bottomley @ 2009-03-03 15:21 UTC (permalink / raw)
  To: Boaz Harrosh; +Cc: Jan Engelhardt, linux-scsi, Linux Kernel Mailing List

On Tue, 2009-03-03 at 11:31 +0200, Boaz Harrosh wrote:
> Jan Engelhardt wrote:
> > Hi,
> > 
> > 
> > I am noticing that there are a lot of objects active after a few tens 
> > minutes of running xfs_fsr.
> > 
> > $ slabtop
> >   OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME                   
> > 818616 818616 100%    0.16K  34109       24    136436K sgpool-8
> > 253692 253692 100%    0.62K  42282        6    169128K sgpool-32
> >  52017  52016  99%    2.50K  17339        3    138712K sgpool-128
> >  26220  26219  99%    0.31K   2185       12      8740K sgpool-16
> >   8927   8574  96%    0.03K     79      113       316K size-32
> > 
> 
> Looks like a leak, by failing to call scsi_release_buffers()
> somehow. (Which was changed recently)

Firstly, I have to say I don't see this in the mainline tree, so could
you try that with your setup just to verify (git head at 2.6.29-rc6).

If this holds true, there must be a bad patch in the -rt tree.  You
should be able to diff scsi_lib.c to see if there's something missing.

Finally, there are one or two drivers (SCSI target) that do their own
buffer management, so what drivers are you using?

James



^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Large amount of scsi-sgpool objects
  2009-03-03 15:21   ` James Bottomley
@ 2009-03-03 16:08     ` Jan Engelhardt
  2009-03-03 16:24       ` Boaz Harrosh
  2009-03-03 16:25       ` James Bottomley
  0 siblings, 2 replies; 66+ messages in thread
From: Jan Engelhardt @ 2009-03-03 16:08 UTC (permalink / raw)
  To: James Bottomley
  Cc: Boaz Harrosh, linux-scsi, Linux Kernel Mailing List, linux-rt-users


On Tuesday 2009-03-03 16:21, James Bottomley wrote:
>> > $ slabtop
>> >   OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME                   
>> > 818616 818616 100%    0.16K  34109       24    136436K sgpool-8
>> > 253692 253692 100%    0.62K  42282        6    169128K sgpool-32
>> >  52017  52016  99%    2.50K  17339        3    138712K sgpool-128
>> >  26220  26219  99%    0.31K   2185       12      8740K sgpool-16
>> >   8927   8574  96%    0.03K     79      113       316K size-32
>> 
>> Looks like a leak, by failing to call scsi_release_buffers()
>> somehow. (Which was changed recently)
>
>Firstly, I have to say I don't see this in the mainline tree, so could
>you try that with your setup just to verify (git head at 2.6.29-rc6).

Yes, looking at the rt patch (in broken-out it's in origin.diff),
it seems a bit obvious - the scsi_release_buffers is not called anymore:


diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 940dc32..d4c6ac3 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -703,71 +703,6 @@ void scsi_run_host_queues(struct Scsi_Host *shost)
 
 static void __scsi_release_buffers(struct scsi_cmnd *, int);
 
-/*
- * Function:    scsi_end_request()
- *
- * Purpose:     Post-processing of completed commands (usually invoked at end
- *		of upper level post-processing and scsi_io_completion).
- *
- * Arguments:   cmd	 - command that is complete.
- *              error    - 0 if I/O indicates success, < 0 for I/O error.
- *              bytes    - number of bytes of completed I/O
- *		requeue  - indicates whether we should requeue leftovers.
- *
- * Lock status: Assumed that lock is not held upon entry.
- *
- * Returns:     cmd if requeue required, NULL otherwise.
- *
- * Notes:       This is called for block device requests in order to
- *              mark some number of sectors as complete.
- * 
- *		We are guaranteeing that the request queue will be goosed
- *		at some point during this call.
- * Notes:	If cmd was requeued, upon return it will be a stale pointer.
- */
-static struct scsi_cmnd *scsi_end_request(struct scsi_cmnd *cmd, int error,
-					  int bytes, int requeue)
-{
-	struct request_queue *q = cmd->device->request_queue;
-	struct request *req = cmd->request;
-
-	/*
-	 * If there are blocks left over at the end, set up the command
-	 * to queue the remainder of them.
-	 */
-	if (blk_end_request(req, error, bytes)) {
-		int leftover = (req->hard_nr_sectors << 9);
-
-		if (blk_pc_request(req))
-			leftover = req->data_len;
-
-		/* kill remainder if no retrys */
-		if (error && scsi_noretry_cmd(cmd))
-			blk_end_request(req, error, leftover);
-		else {
-			if (requeue) {
-				/*
-				 * Bleah.  Leftovers again.  Stick the
-				 * leftovers in the front of the
-				 * queue, and goose the queue again.
-				 */
-				scsi_release_buffers(cmd);
-				scsi_requeue_command(q, cmd);
-				cmd = NULL;
-			}
-			return cmd;
-		}
-	}
-
-	/*
-	 * This will goose the queue request function at the end, so we don't
-	 * need to worry about launching another command.
-	 */
-	__scsi_release_buffers(cmd, 0);
-	scsi_next_command(cmd);
-	return NULL;
-}
-
 static inline unsigned int scsi_sgtable_index(unsigned short nents)
 {
 	unsigned int index;
@@ -929,7 +864,6 @@ static void scsi_end_bidi_request(struct scsi_cmnd *cmd)
 void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes)
 {
 	int result = cmd->result;
-	int this_count;
 	struct request_queue *q = cmd->device->request_queue;
 	struct request *req = cmd->request;
 	int error = 0;
@@ -980,18 +914,30 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes)
 	SCSI_LOG_HLCOMPLETE(1, printk("%ld sectors total, "
 				      "%d bytes done.\n",
 				      req->nr_sectors, good_bytes));
-
-	/* A number of bytes were successfully read.  If there
-	 * are leftovers and there is some kind of error
-	 * (result != 0), retry the rest.
-	 */
-	if (scsi_end_request(cmd, error, good_bytes, result == 0) == NULL)
+	if (blk_end_request(req, error, good_bytes) == 0) {
+		/* This request is completely finished; start the next one */
+		scsi_next_command(cmd);
 		return;
-	this_count = blk_rq_bytes(req);
+	}
 
 	error = -EIO;
 
-	if (host_byte(result) == DID_RESET) {
+	/* The request isn't finished yet.  Figure out what to do next. */
+	if (result == 0) {
+		/* No error, so carry out the remainder of the request.
+		 * Failure to make forward progress counts against the
+		 * the number of retries.
+		 */
+		if (good_bytes > 0 || --req->retries >= 0)
+			action = ACTION_REPREP;
+		else {
+			action = ACTION_FAIL;
+			description = "Retries exhausted";
+		}
+	} else if (error && scsi_noretry_cmd(cmd)) {
+		/* Retrys are disallowed, so kill the remainder. */
+		action = ACTION_FAIL;
+	} else if (host_byte(result) == DID_RESET) {
 		/* Third party bus reset or reset for error recovery
 		 * reasons.  Just retry the command and see what
 		 * happens.



Would you happen to know where I have to reinsert them in the
rt modification shown here?

>If this holds true, there must be a bad patch in the -rt tree.  You
>should be able to diff scsi_lib.c to see if there's something missing.
>
>Finally, there are one or two drivers (SCSI target) that do their own
>buffer management, so what drivers are you using?


ata1.00: ATA-5: WDC WD400EB-00CPF0, 06.04G06, max UDMA/100
ata1.00: 78165360 sectors, multi 16: LBA
ata1.01: ATAPI: HL-DT-ST DVDRAM GSA-4160B, A301, max UDMA/66
ata1.00: configured for UDMA/100
ata1.01: configured for UDMA/66
scsi 0:0:0:0: Direct-Access     ATA      WDC WD400EB-00CP 06.0 PQ: 0 ANSI: 5
scsi 0:0:1:0: CD-ROM            HL-DT-ST DVDRAM GSA-4160B A301 PQ: 0 ANSI: 5
ata2.00: ATA-7: DIAMOND  250G 2B5400, RAMB1TU0, max UDMA/133
ata2.00: 490234752 sectors, multi 16: LBA48
ata2.00: configured for UDMA/133
scsi 1:0:0:0: Direct-Access     ATA      DIAMOND  250G 2B RAMB PQ: 0 ANSI: 5




Jan

^ permalink raw reply related	[flat|nested] 66+ messages in thread

* Re: Large amount of scsi-sgpool objects
  2009-03-03 16:08     ` Jan Engelhardt
@ 2009-03-03 16:24       ` Boaz Harrosh
  2009-03-03 17:59           ` Alan Stern
  2009-03-03 16:25       ` James Bottomley
  1 sibling, 1 reply; 66+ messages in thread
From: Boaz Harrosh @ 2009-03-03 16:24 UTC (permalink / raw)
  To: Jan Engelhardt
  Cc: James Bottomley, linux-scsi, Linux Kernel Mailing List,
	linux-rt-users, Alan Stern

Jan Engelhardt wrote:
> On Tuesday 2009-03-03 16:21, James Bottomley wrote:
>>>> $ slabtop
>>>>   OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME                   
>>>> 818616 818616 100%    0.16K  34109       24    136436K sgpool-8
>>>> 253692 253692 100%    0.62K  42282        6    169128K sgpool-32
>>>>  52017  52016  99%    2.50K  17339        3    138712K sgpool-128
>>>>  26220  26219  99%    0.31K   2185       12      8740K sgpool-16
>>>>   8927   8574  96%    0.03K     79      113       316K size-32
>>> Looks like a leak, by failing to call scsi_release_buffers()
>>> somehow. (Which was changed recently)
>> Firstly, I have to say I don't see this in the mainline tree, so could
>> you try that with your setup just to verify (git head at 2.6.29-rc6).
> 
> Yes, looking at the rt patch (in broken-out it's in origin.diff),
> it seems a bit obvious - the scsi_release_buffers is not called anymore:
> 
> 
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index 940dc32..d4c6ac3 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -703,71 +703,6 @@ void scsi_run_host_queues(struct Scsi_Host *shost)
>  
>  static void __scsi_release_buffers(struct scsi_cmnd *, int);
>  
> -/*
> - * Function:    scsi_end_request()
> - *
> - * Purpose:     Post-processing of completed commands (usually invoked at end
> - *		of upper level post-processing and scsi_io_completion).
> - *
> - * Arguments:   cmd	 - command that is complete.
> - *              error    - 0 if I/O indicates success, < 0 for I/O error.
> - *              bytes    - number of bytes of completed I/O
> - *		requeue  - indicates whether we should requeue leftovers.
> - *
> - * Lock status: Assumed that lock is not held upon entry.
> - *
> - * Returns:     cmd if requeue required, NULL otherwise.
> - *
> - * Notes:       This is called for block device requests in order to
> - *              mark some number of sectors as complete.
> - * 
> - *		We are guaranteeing that the request queue will be goosed
> - *		at some point during this call.
> - * Notes:	If cmd was requeued, upon return it will be a stale pointer.
> - */
> -static struct scsi_cmnd *scsi_end_request(struct scsi_cmnd *cmd, int error,
> -					  int bytes, int requeue)
> -{
> -	struct request_queue *q = cmd->device->request_queue;
> -	struct request *req = cmd->request;
> -
> -	/*
> -	 * If there are blocks left over at the end, set up the command
> -	 * to queue the remainder of them.
> -	 */
> -	if (blk_end_request(req, error, bytes)) {
> -		int leftover = (req->hard_nr_sectors << 9);
> -
> -		if (blk_pc_request(req))
> -			leftover = req->data_len;
> -
> -		/* kill remainder if no retrys */
> -		if (error && scsi_noretry_cmd(cmd))
> -			blk_end_request(req, error, leftover);
> -		else {
> -			if (requeue) {
> -				/*
> -				 * Bleah.  Leftovers again.  Stick the
> -				 * leftovers in the front of the
> -				 * queue, and goose the queue again.
> -				 */
> -				scsi_release_buffers(cmd);
> -				scsi_requeue_command(q, cmd);
> -				cmd = NULL;
> -			}
> -			return cmd;
> -		}
> -	}
> -
> -	/*
> -	 * This will goose the queue request function at the end, so we don't
> -	 * need to worry about launching another command.
> -	 */
> -	__scsi_release_buffers(cmd, 0);
> -	scsi_next_command(cmd);
> -	return NULL;
> -}
> -
>  static inline unsigned int scsi_sgtable_index(unsigned short nents)
>  {
>  	unsigned int index;
> @@ -929,7 +864,6 @@ static void scsi_end_bidi_request(struct scsi_cmnd *cmd)
>  void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes)
>  {
>  	int result = cmd->result;
> -	int this_count;
>  	struct request_queue *q = cmd->device->request_queue;
>  	struct request *req = cmd->request;
>  	int error = 0;
> @@ -980,18 +914,30 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes)
>  	SCSI_LOG_HLCOMPLETE(1, printk("%ld sectors total, "
>  				      "%d bytes done.\n",
>  				      req->nr_sectors, good_bytes));
> -
> -	/* A number of bytes were successfully read.  If there
> -	 * are leftovers and there is some kind of error
> -	 * (result != 0), retry the rest.
> -	 */
> -	if (scsi_end_request(cmd, error, good_bytes, result == 0) == NULL)
> +	if (blk_end_request(req, error, good_bytes) == 0) {
> +		/* This request is completely finished; start the next one */
> +		scsi_next_command(cmd);
>  		return;
> -	this_count = blk_rq_bytes(req);
> +	}
>  
>  	error = -EIO;
>  
> -	if (host_byte(result) == DID_RESET) {
> +	/* The request isn't finished yet.  Figure out what to do next. */
> +	if (result == 0) {
> +		/* No error, so carry out the remainder of the request.
> +		 * Failure to make forward progress counts against the
> +		 * the number of retries.
> +		 */
> +		if (good_bytes > 0 || --req->retries >= 0)
> +			action = ACTION_REPREP;
> +		else {
> +			action = ACTION_FAIL;
> +			description = "Retries exhausted";
> +		}
> +	} else if (error && scsi_noretry_cmd(cmd)) {
> +		/* Retrys are disallowed, so kill the remainder. */
> +		action = ACTION_FAIL;
> +	} else if (host_byte(result) == DID_RESET) {
>  		/* Third party bus reset or reset for error recovery
>  		 * reasons.  Just retry the command and see what
>  		 * happens.
> 
> 
> 
> Would you happen to know where I have to reinsert them in the
> rt modification shown here?
> 

You lost me. Why does rt needs to patch scsi_io_completion at all?
You should remove any rt patches that modify scsi_lib.c and revert to
vanilla 2.6.29-rc6 (scsi wise that is).

The above diff looks like something that was sent in the past to the mailing
list, but only half of it. It was sent by Alan Stern. It might patch but
it is not applicable any more because of changes made since. 

>> If this holds true, there must be a bad patch in the -rt tree.  You
>> should be able to diff scsi_lib.c to see if there's something missing.
>>
>> Finally, there are one or two drivers (SCSI target) that do their own
>> buffer management, so what drivers are you using?
> 
> 
> ata1.00: ATA-5: WDC WD400EB-00CPF0, 06.04G06, max UDMA/100
> ata1.00: 78165360 sectors, multi 16: LBA
> ata1.01: ATAPI: HL-DT-ST DVDRAM GSA-4160B, A301, max UDMA/66
> ata1.00: configured for UDMA/100
> ata1.01: configured for UDMA/66
> scsi 0:0:0:0: Direct-Access     ATA      WDC WD400EB-00CP 06.0 PQ: 0 ANSI: 5
> scsi 0:0:1:0: CD-ROM            HL-DT-ST DVDRAM GSA-4160B A301 PQ: 0 ANSI: 5
> ata2.00: ATA-7: DIAMOND  250G 2B5400, RAMB1TU0, max UDMA/133
> ata2.00: 490234752 sectors, multi 16: LBA48
> ata2.00: configured for UDMA/133
> scsi 1:0:0:0: Direct-Access     ATA      DIAMOND  250G 2B RAMB PQ: 0 ANSI: 5
> 
> 
> 
> 
> Jan


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Large amount of scsi-sgpool objects
  2009-03-03 16:08     ` Jan Engelhardt
  2009-03-03 16:24       ` Boaz Harrosh
@ 2009-03-03 16:25       ` James Bottomley
  2009-03-03 17:19         ` Thomas Gleixner
  2009-03-03 19:22         ` Large amount of scsi-sgpool objects Ingo Molnar
  1 sibling, 2 replies; 66+ messages in thread
From: James Bottomley @ 2009-03-03 16:25 UTC (permalink / raw)
  To: Jan Engelhardt
  Cc: Boaz Harrosh, linux-scsi, Linux Kernel Mailing List, linux-rt-users

On Tue, 2009-03-03 at 17:08 +0100, Jan Engelhardt wrote:
> On Tuesday 2009-03-03 16:21, James Bottomley wrote:
> >> > $ slabtop
> >> >   OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME                   
> >> > 818616 818616 100%    0.16K  34109       24    136436K sgpool-8
> >> > 253692 253692 100%    0.62K  42282        6    169128K sgpool-32
> >> >  52017  52016  99%    2.50K  17339        3    138712K sgpool-128
> >> >  26220  26219  99%    0.31K   2185       12      8740K sgpool-16
> >> >   8927   8574  96%    0.03K     79      113       316K size-32
> >> 
> >> Looks like a leak, by failing to call scsi_release_buffers()
> >> somehow. (Which was changed recently)
> >
> >Firstly, I have to say I don't see this in the mainline tree, so could
> >you try that with your setup just to verify (git head at 2.6.29-rc6).
> 
> Yes, looking at the rt patch (in broken-out it's in origin.diff),
> it seems a bit obvious - the scsi_release_buffers is not called anymore:

OK, this is a bad patch, so just revert it.  It was posted to linux-scsi
initially in this form before the author posted a new one with the
missing release buffers added.  It looks like the first incarnation got
pulled into the -rt tree for some reasons.

So the real question is why does the -rt tree even have patches not in
the vanilla SCSI tree?  This type of cockup clearly demonstrates why
it's a bad idea.

James

> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index 940dc32..d4c6ac3 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -703,71 +703,6 @@ void scsi_run_host_queues(struct Scsi_Host *shost)
>  
>  static void __scsi_release_buffers(struct scsi_cmnd *, int);
>  
> -/*
> - * Function:    scsi_end_request()
> - *
> - * Purpose:     Post-processing of completed commands (usually invoked at end
> - *		of upper level post-processing and scsi_io_completion).
> - *
> - * Arguments:   cmd	 - command that is complete.
> - *              error    - 0 if I/O indicates success, < 0 for I/O error.
> - *              bytes    - number of bytes of completed I/O
> - *		requeue  - indicates whether we should requeue leftovers.
> - *
> - * Lock status: Assumed that lock is not held upon entry.
> - *
> - * Returns:     cmd if requeue required, NULL otherwise.
> - *
> - * Notes:       This is called for block device requests in order to
> - *              mark some number of sectors as complete.
> - * 
> - *		We are guaranteeing that the request queue will be goosed
> - *		at some point during this call.
> - * Notes:	If cmd was requeued, upon return it will be a stale pointer.
> - */
> -static struct scsi_cmnd *scsi_end_request(struct scsi_cmnd *cmd, int error,
> -					  int bytes, int requeue)
> -{
> -	struct request_queue *q = cmd->device->request_queue;
> -	struct request *req = cmd->request;
> -
> -	/*
> -	 * If there are blocks left over at the end, set up the command
> -	 * to queue the remainder of them.
> -	 */
> -	if (blk_end_request(req, error, bytes)) {
> -		int leftover = (req->hard_nr_sectors << 9);
> -
> -		if (blk_pc_request(req))
> -			leftover = req->data_len;
> -
> -		/* kill remainder if no retrys */
> -		if (error && scsi_noretry_cmd(cmd))
> -			blk_end_request(req, error, leftover);
> -		else {
> -			if (requeue) {
> -				/*
> -				 * Bleah.  Leftovers again.  Stick the
> -				 * leftovers in the front of the
> -				 * queue, and goose the queue again.
> -				 */
> -				scsi_release_buffers(cmd);
> -				scsi_requeue_command(q, cmd);
> -				cmd = NULL;
> -			}
> -			return cmd;
> -		}
> -	}
> -
> -	/*
> -	 * This will goose the queue request function at the end, so we don't
> -	 * need to worry about launching another command.
> -	 */
> -	__scsi_release_buffers(cmd, 0);
> -	scsi_next_command(cmd);
> -	return NULL;
> -}
> -
>  static inline unsigned int scsi_sgtable_index(unsigned short nents)
>  {
>  	unsigned int index;
> @@ -929,7 +864,6 @@ static void scsi_end_bidi_request(struct scsi_cmnd *cmd)
>  void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes)
>  {
>  	int result = cmd->result;
> -	int this_count;
>  	struct request_queue *q = cmd->device->request_queue;
>  	struct request *req = cmd->request;
>  	int error = 0;
> @@ -980,18 +914,30 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes)
>  	SCSI_LOG_HLCOMPLETE(1, printk("%ld sectors total, "
>  				      "%d bytes done.\n",
>  				      req->nr_sectors, good_bytes));
> -
> -	/* A number of bytes were successfully read.  If there
> -	 * are leftovers and there is some kind of error
> -	 * (result != 0), retry the rest.
> -	 */
> -	if (scsi_end_request(cmd, error, good_bytes, result == 0) == NULL)
> +	if (blk_end_request(req, error, good_bytes) == 0) {
> +		/* This request is completely finished; start the next one */
> +		scsi_next_command(cmd);
>  		return;
> -	this_count = blk_rq_bytes(req);
> +	}
>  
>  	error = -EIO;
>  
> -	if (host_byte(result) == DID_RESET) {
> +	/* The request isn't finished yet.  Figure out what to do next. */
> +	if (result == 0) {
> +		/* No error, so carry out the remainder of the request.
> +		 * Failure to make forward progress counts against the
> +		 * the number of retries.
> +		 */
> +		if (good_bytes > 0 || --req->retries >= 0)
> +			action = ACTION_REPREP;
> +		else {
> +			action = ACTION_FAIL;
> +			description = "Retries exhausted";
> +		}
> +	} else if (error && scsi_noretry_cmd(cmd)) {
> +		/* Retrys are disallowed, so kill the remainder. */
> +		action = ACTION_FAIL;
> +	} else if (host_byte(result) == DID_RESET) {
>  		/* Third party bus reset or reset for error recovery
>  		 * reasons.  Just retry the command and see what
>  		 * happens.
> 
> 
> 
> Would you happen to know where I have to reinsert them in the
> rt modification shown here?
> 
> >If this holds true, there must be a bad patch in the -rt tree.  You
> >should be able to diff scsi_lib.c to see if there's something missing.
> >
> >Finally, there are one or two drivers (SCSI target) that do their own
> >buffer management, so what drivers are you using?
> 
> 
> ata1.00: ATA-5: WDC WD400EB-00CPF0, 06.04G06, max UDMA/100
> ata1.00: 78165360 sectors, multi 16: LBA
> ata1.01: ATAPI: HL-DT-ST DVDRAM GSA-4160B, A301, max UDMA/66
> ata1.00: configured for UDMA/100
> ata1.01: configured for UDMA/66
> scsi 0:0:0:0: Direct-Access     ATA      WDC WD400EB-00CP 06.0 PQ: 0 ANSI: 5
> scsi 0:0:1:0: CD-ROM            HL-DT-ST DVDRAM GSA-4160B A301 PQ: 0 ANSI: 5
> ata2.00: ATA-7: DIAMOND  250G 2B5400, RAMB1TU0, max UDMA/133
> ata2.00: 490234752 sectors, multi 16: LBA48
> ata2.00: configured for UDMA/133
> scsi 1:0:0:0: Direct-Access     ATA      DIAMOND  250G 2B RAMB PQ: 0 ANSI: 5
> 
> 
> 
> 
> Jan


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Large amount of scsi-sgpool objects
  2009-03-03 16:25       ` James Bottomley
@ 2009-03-03 17:19         ` Thomas Gleixner
  2009-03-03 22:07             ` Thomas Gleixner
  2009-03-03 19:22         ` Large amount of scsi-sgpool objects Ingo Molnar
  1 sibling, 1 reply; 66+ messages in thread
From: Thomas Gleixner @ 2009-03-03 17:19 UTC (permalink / raw)
  To: James Bottomley
  Cc: Jan Engelhardt, Boaz Harrosh, linux-scsi,
	Linux Kernel Mailing List, linux-rt-users

On Tue, 3 Mar 2009, James Bottomley wrote:

> On Tue, 2009-03-03 at 17:08 +0100, Jan Engelhardt wrote:
> > On Tuesday 2009-03-03 16:21, James Bottomley wrote:
> > >> > $ slabtop
> > >> >   OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME                   
> > >> > 818616 818616 100%    0.16K  34109       24    136436K sgpool-8
> > >> > 253692 253692 100%    0.62K  42282        6    169128K sgpool-32
> > >> >  52017  52016  99%    2.50K  17339        3    138712K sgpool-128
> > >> >  26220  26219  99%    0.31K   2185       12      8740K sgpool-16
> > >> >   8927   8574  96%    0.03K     79      113       316K size-32
> > >> 
> > >> Looks like a leak, by failing to call scsi_release_buffers()
> > >> somehow. (Which was changed recently)
> > >
> > >Firstly, I have to say I don't see this in the mainline tree, so could
> > >you try that with your setup just to verify (git head at 2.6.29-rc6).
> > 
> > Yes, looking at the rt patch (in broken-out it's in origin.diff),
> > it seems a bit obvious - the scsi_release_buffers is not called anymore:
> 
> OK, this is a bad patch, so just revert it.  It was posted to linux-scsi
> initially in this form before the author posted a new one with the
> missing release buffers added.  It looks like the first incarnation got
> pulled into the -rt tree for some reasons.
> 
> So the real question is why does the -rt tree even have patches not in
> the vanilla SCSI tree?  This type of cockup clearly demonstrates why
> it's a bad idea.

My bad. I was playing with that to get rid of the aic7xxx wreckage on
one of my test boxen and forgot to remove it.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Large amount of scsi-sgpool objects
  2009-03-03 16:24       ` Boaz Harrosh
@ 2009-03-03 17:59           ` Alan Stern
  0 siblings, 0 replies; 66+ messages in thread
From: Alan Stern @ 2009-03-03 17:59 UTC (permalink / raw)
  To: Boaz Harrosh
  Cc: Jan Engelhardt, James Bottomley, linux-scsi,
	Linux Kernel Mailing List, linux-rt-users

On Tue, 3 Mar 2009, Boaz Harrosh wrote:

> Jan Engelhardt wrote:
> > On Tuesday 2009-03-03 16:21, James Bottomley wrote:
> >>>> $ slabtop
> >>>>   OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME                   
> >>>> 818616 818616 100%    0.16K  34109       24    136436K sgpool-8
> >>>> 253692 253692 100%    0.62K  42282        6    169128K sgpool-32
> >>>>  52017  52016  99%    2.50K  17339        3    138712K sgpool-128
> >>>>  26220  26219  99%    0.31K   2185       12      8740K sgpool-16
> >>>>   8927   8574  96%    0.03K     79      113       316K size-32
> >>> Looks like a leak, by failing to call scsi_release_buffers()
> >>> somehow. (Which was changed recently)
> >> Firstly, I have to say I don't see this in the mainline tree, so could
> >> you try that with your setup just to verify (git head at 2.6.29-rc6).
> > 
> > Yes, looking at the rt patch (in broken-out it's in origin.diff),
> > it seems a bit obvious - the scsi_release_buffers is not called anymore:
> > 
> > 
> > diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> > index 940dc32..d4c6ac3 100644
> > --- a/drivers/scsi/scsi_lib.c
> > +++ b/drivers/scsi/scsi_lib.c
> > @@ -703,71 +703,6 @@ void scsi_run_host_queues(struct Scsi_Host *shost)
> >  
> >  static void __scsi_release_buffers(struct scsi_cmnd *, int);
> >  
> > -/*
> > - * Function:    scsi_end_request()
> > - *
> > - * Purpose:     Post-processing of completed commands (usually invoked at end
> > - *		of upper level post-processing and scsi_io_completion).
> > - *
> > - * Arguments:   cmd	 - command that is complete.
> > - *              error    - 0 if I/O indicates success, < 0 for I/O error.
> > - *              bytes    - number of bytes of completed I/O
> > - *		requeue  - indicates whether we should requeue leftovers.
> > - *
> > - * Lock status: Assumed that lock is not held upon entry.
> > - *
> > - * Returns:     cmd if requeue required, NULL otherwise.
> > - *
> > - * Notes:       This is called for block device requests in order to
> > - *              mark some number of sectors as complete.
> > - * 
> > - *		We are guaranteeing that the request queue will be goosed
> > - *		at some point during this call.
> > - * Notes:	If cmd was requeued, upon return it will be a stale pointer.
> > - */
> > -static struct scsi_cmnd *scsi_end_request(struct scsi_cmnd *cmd, int error,
> > -					  int bytes, int requeue)
> > -{
> > -	struct request_queue *q = cmd->device->request_queue;
> > -	struct request *req = cmd->request;
> > -
> > -	/*
> > -	 * If there are blocks left over at the end, set up the command
> > -	 * to queue the remainder of them.
> > -	 */
> > -	if (blk_end_request(req, error, bytes)) {
> > -		int leftover = (req->hard_nr_sectors << 9);
> > -
> > -		if (blk_pc_request(req))
> > -			leftover = req->data_len;
> > -
> > -		/* kill remainder if no retrys */
> > -		if (error && scsi_noretry_cmd(cmd))
> > -			blk_end_request(req, error, leftover);
> > -		else {
> > -			if (requeue) {
> > -				/*
> > -				 * Bleah.  Leftovers again.  Stick the
> > -				 * leftovers in the front of the
> > -				 * queue, and goose the queue again.
> > -				 */
> > -				scsi_release_buffers(cmd);
> > -				scsi_requeue_command(q, cmd);
> > -				cmd = NULL;
> > -			}
> > -			return cmd;
> > -		}
> > -	}
> > -
> > -	/*
> > -	 * This will goose the queue request function at the end, so we don't
> > -	 * need to worry about launching another command.
> > -	 */
> > -	__scsi_release_buffers(cmd, 0);
> > -	scsi_next_command(cmd);
> > -	return NULL;
> > -}
> > -
> >  static inline unsigned int scsi_sgtable_index(unsigned short nents)
> >  {
> >  	unsigned int index;
> > @@ -929,7 +864,6 @@ static void scsi_end_bidi_request(struct scsi_cmnd *cmd)
> >  void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes)
> >  {
> >  	int result = cmd->result;
> > -	int this_count;
> >  	struct request_queue *q = cmd->device->request_queue;
> >  	struct request *req = cmd->request;
> >  	int error = 0;
> > @@ -980,18 +914,30 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes)
> >  	SCSI_LOG_HLCOMPLETE(1, printk("%ld sectors total, "
> >  				      "%d bytes done.\n",
> >  				      req->nr_sectors, good_bytes));
> > -
> > -	/* A number of bytes were successfully read.  If there
> > -	 * are leftovers and there is some kind of error
> > -	 * (result != 0), retry the rest.
> > -	 */
> > -	if (scsi_end_request(cmd, error, good_bytes, result == 0) == NULL)
> > +	if (blk_end_request(req, error, good_bytes) == 0) {
> > +		/* This request is completely finished; start the next one */
> > +		scsi_next_command(cmd);
> >  		return;
> > -	this_count = blk_rq_bytes(req);
> > +	}

> You lost me. Why does rt needs to patch scsi_io_completion at all?
> You should remove any rt patches that modify scsi_lib.c and revert to
> vanilla 2.6.29-rc6 (scsi wise that is).
> 
> The above diff looks like something that was sent in the past to the mailing
> list, but only half of it. It was sent by Alan Stern. It might patch but
> it is not applicable any more because of changes made since. 

That's right; it is an old version of a patch which no longer applies 
to the current kernel (the __scsi_release_buffers() call was added 
after that patch was written).  An updated version of the patch has 
been submitted here:

	http://marc.info/?l=linux-scsi&m=123507641620649&w=2

Alan Stern


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Large amount of scsi-sgpool objects
@ 2009-03-03 17:59           ` Alan Stern
  0 siblings, 0 replies; 66+ messages in thread
From: Alan Stern @ 2009-03-03 17:59 UTC (permalink / raw)
  To: Boaz Harrosh
  Cc: Jan Engelhardt, James Bottomley, linux-scsi,
	Linux Kernel Mailing List, linux-rt-users

On Tue, 3 Mar 2009, Boaz Harrosh wrote:

> Jan Engelhardt wrote:
> > On Tuesday 2009-03-03 16:21, James Bottomley wrote:
> >>>> $ slabtop
> >>>>   OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME                   
> >>>> 818616 818616 100%    0.16K  34109       24    136436K sgpool-8
> >>>> 253692 253692 100%    0.62K  42282        6    169128K sgpool-32
> >>>>  52017  52016  99%    2.50K  17339        3    138712K sgpool-128
> >>>>  26220  26219  99%    0.31K   2185       12      8740K sgpool-16
> >>>>   8927   8574  96%    0.03K     79      113       316K size-32
> >>> Looks like a leak, by failing to call scsi_release_buffers()
> >>> somehow. (Which was changed recently)
> >> Firstly, I have to say I don't see this in the mainline tree, so could
> >> you try that with your setup just to verify (git head at 2.6.29-rc6).
> > 
> > Yes, looking at the rt patch (in broken-out it's in origin.diff),
> > it seems a bit obvious - the scsi_release_buffers is not called anymore:
> > 
> > 
> > diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> > index 940dc32..d4c6ac3 100644
> > --- a/drivers/scsi/scsi_lib.c
> > +++ b/drivers/scsi/scsi_lib.c
> > @@ -703,71 +703,6 @@ void scsi_run_host_queues(struct Scsi_Host *shost)
> >  
> >  static void __scsi_release_buffers(struct scsi_cmnd *, int);
> >  
> > -/*
> > - * Function:    scsi_end_request()
> > - *
> > - * Purpose:     Post-processing of completed commands (usually invoked at end
> > - *		of upper level post-processing and scsi_io_completion).
> > - *
> > - * Arguments:   cmd	 - command that is complete.
> > - *              error    - 0 if I/O indicates success, < 0 for I/O error.
> > - *              bytes    - number of bytes of completed I/O
> > - *		requeue  - indicates whether we should requeue leftovers.
> > - *
> > - * Lock status: Assumed that lock is not held upon entry.
> > - *
> > - * Returns:     cmd if requeue required, NULL otherwise.
> > - *
> > - * Notes:       This is called for block device requests in order to
> > - *              mark some number of sectors as complete.
> > - * 
> > - *		We are guaranteeing that the request queue will be goosed
> > - *		at some point during this call.
> > - * Notes:	If cmd was requeued, upon return it will be a stale pointer.
> > - */
> > -static struct scsi_cmnd *scsi_end_request(struct scsi_cmnd *cmd, int error,
> > -					  int bytes, int requeue)
> > -{
> > -	struct request_queue *q = cmd->device->request_queue;
> > -	struct request *req = cmd->request;
> > -
> > -	/*
> > -	 * If there are blocks left over at the end, set up the command
> > -	 * to queue the remainder of them.
> > -	 */
> > -	if (blk_end_request(req, error, bytes)) {
> > -		int leftover = (req->hard_nr_sectors << 9);
> > -
> > -		if (blk_pc_request(req))
> > -			leftover = req->data_len;
> > -
> > -		/* kill remainder if no retrys */
> > -		if (error && scsi_noretry_cmd(cmd))
> > -			blk_end_request(req, error, leftover);
> > -		else {
> > -			if (requeue) {
> > -				/*
> > -				 * Bleah.  Leftovers again.  Stick the
> > -				 * leftovers in the front of the
> > -				 * queue, and goose the queue again.
> > -				 */
> > -				scsi_release_buffers(cmd);
> > -				scsi_requeue_command(q, cmd);
> > -				cmd = NULL;
> > -			}
> > -			return cmd;
> > -		}
> > -	}
> > -
> > -	/*
> > -	 * This will goose the queue request function at the end, so we don't
> > -	 * need to worry about launching another command.
> > -	 */
> > -	__scsi_release_buffers(cmd, 0);
> > -	scsi_next_command(cmd);
> > -	return NULL;
> > -}
> > -
> >  static inline unsigned int scsi_sgtable_index(unsigned short nents)
> >  {
> >  	unsigned int index;
> > @@ -929,7 +864,6 @@ static void scsi_end_bidi_request(struct scsi_cmnd *cmd)
> >  void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes)
> >  {
> >  	int result = cmd->result;
> > -	int this_count;
> >  	struct request_queue *q = cmd->device->request_queue;
> >  	struct request *req = cmd->request;
> >  	int error = 0;
> > @@ -980,18 +914,30 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes)
> >  	SCSI_LOG_HLCOMPLETE(1, printk("%ld sectors total, "
> >  				      "%d bytes done.\n",
> >  				      req->nr_sectors, good_bytes));
> > -
> > -	/* A number of bytes were successfully read.  If there
> > -	 * are leftovers and there is some kind of error
> > -	 * (result != 0), retry the rest.
> > -	 */
> > -	if (scsi_end_request(cmd, error, good_bytes, result == 0) == NULL)
> > +	if (blk_end_request(req, error, good_bytes) == 0) {
> > +		/* This request is completely finished; start the next one */
> > +		scsi_next_command(cmd);
> >  		return;
> > -	this_count = blk_rq_bytes(req);
> > +	}

> You lost me. Why does rt needs to patch scsi_io_completion at all?
> You should remove any rt patches that modify scsi_lib.c and revert to
> vanilla 2.6.29-rc6 (scsi wise that is).
> 
> The above diff looks like something that was sent in the past to the mailing
> list, but only half of it. It was sent by Alan Stern. It might patch but
> it is not applicable any more because of changes made since. 

That's right; it is an old version of a patch which no longer applies 
to the current kernel (the __scsi_release_buffers() call was added 
after that patch was written).  An updated version of the patch has 
been submitted here:

	http://marc.info/?l=linux-scsi&m=123507641620649&w=2

Alan Stern


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Large amount of scsi-sgpool objects
  2009-03-03 16:25       ` James Bottomley
  2009-03-03 17:19         ` Thomas Gleixner
@ 2009-03-03 19:22         ` Ingo Molnar
  2009-03-03 21:25           ` James Bottomley
  1 sibling, 1 reply; 66+ messages in thread
From: Ingo Molnar @ 2009-03-03 19:22 UTC (permalink / raw)
  To: James Bottomley
  Cc: Jan Engelhardt, Boaz Harrosh, linux-scsi,
	Linux Kernel Mailing List, linux-rt-users


* James Bottomley <James.Bottomley@HansenPartnership.com> wrote:

> On Tue, 2009-03-03 at 17:08 +0100, Jan Engelhardt wrote:
> > On Tuesday 2009-03-03 16:21, James Bottomley wrote:
> > >> > $ slabtop
> > >> >   OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME                   
> > >> > 818616 818616 100%    0.16K  34109       24    136436K sgpool-8
> > >> > 253692 253692 100%    0.62K  42282        6    169128K sgpool-32
> > >> >  52017  52016  99%    2.50K  17339        3    138712K sgpool-128
> > >> >  26220  26219  99%    0.31K   2185       12      8740K sgpool-16
> > >> >   8927   8574  96%    0.03K     79      113       316K size-32
> > >> 
> > >> Looks like a leak, by failing to call scsi_release_buffers()
> > >> somehow. (Which was changed recently)
> > >
> > >Firstly, I have to say I don't see this in the mainline tree, so could
> > >you try that with your setup just to verify (git head at 2.6.29-rc6).
> > 
> > Yes, looking at the rt patch (in broken-out it's in origin.diff),
> > it seems a bit obvious - the scsi_release_buffers is not called anymore:
> 
> OK, this is a bad patch, so just revert it.  It was posted to 
> linux-scsi initially in this form before the author posted a 
> new one with the missing release buffers added.  It looks like 
> the first incarnation got pulled into the -rt tree for some 
> reasons.

Uhm. I applied a test-patch from Alan Stern, to possibly fix an 
SCSI lockup with aic7xxx that _I_ reported to you and then to 
the scsi-list.

You were Cc:-ed to that test patch and to my bugreport as well, 
all the way. Do you claim that you dont remember it?

The saga is still documented in tip:out-of-tree (which is a 
special branch with out-of-tree hotfixes):

 7e4cbd1: fix "scsi: aic7xxx hang since v2.6.28-rc1"
 e027abc: scsi: temporarily undo scsi reverts
 813104e: Revert "[SCSI] simplify scsi_io_completion()"
 84db545: Revert "[SCSI] Fix uninitialized variable error in scsi_io_completion"
 0eb6038: Revert "[SCSI] Fix error handling for DIF/DIX"
 3cd94dd: Revert "[SCSI] scsi_lib: don't decrement busy counters when inserting commands"
 c27aed5: Revert "[SCSI] scsi_lib: fix DID_RESET status problems"

I wasnt Cc:-ed on the updated patch AFAICS, so i didnt pick it 
up.

> So the real question is why does the -rt tree even have 
> patches not in the vanilla SCSI tree?  This type of cockup 
> clearly demonstrates why it's a bad idea.

Believe me, i have better things to do than to track down your 
regressions. I applied a fix/test patch sent to me by SCSI 
folks.

	Ingo

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Large amount of scsi-sgpool objects
  2009-03-03 17:59           ` Alan Stern
  (?)
@ 2009-03-03 20:56           ` Ingo Molnar
  2009-03-03 21:06               ` Alan Stern
  -1 siblings, 1 reply; 66+ messages in thread
From: Ingo Molnar @ 2009-03-03 20:56 UTC (permalink / raw)
  To: Alan Stern
  Cc: Boaz Harrosh, Jan Engelhardt, James Bottomley, linux-scsi,
	Linux Kernel Mailing List, linux-rt-users


* Alan Stern <stern@rowland.harvard.edu> wrote:

> On Tue, 3 Mar 2009, Boaz Harrosh wrote:
> 
> > Jan Engelhardt wrote:
> > > On Tuesday 2009-03-03 16:21, James Bottomley wrote:
> > >>>> $ slabtop
> > >>>>   OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME                   
> > >>>> 818616 818616 100%    0.16K  34109       24    136436K sgpool-8
> > >>>> 253692 253692 100%    0.62K  42282        6    169128K sgpool-32
> > >>>>  52017  52016  99%    2.50K  17339        3    138712K sgpool-128
> > >>>>  26220  26219  99%    0.31K   2185       12      8740K sgpool-16
> > >>>>   8927   8574  96%    0.03K     79      113       316K size-32
> > >>> Looks like a leak, by failing to call scsi_release_buffers()
> > >>> somehow. (Which was changed recently)
> > >> Firstly, I have to say I don't see this in the mainline tree, so could
> > >> you try that with your setup just to verify (git head at 2.6.29-rc6).
> > > 
> > > Yes, looking at the rt patch (in broken-out it's in origin.diff),
> > > it seems a bit obvious - the scsi_release_buffers is not called anymore:
> > > 
> > > 
> > > diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> > > index 940dc32..d4c6ac3 100644
> > > --- a/drivers/scsi/scsi_lib.c
> > > +++ b/drivers/scsi/scsi_lib.c
> > > @@ -703,71 +703,6 @@ void scsi_run_host_queues(struct Scsi_Host *shost)
> > >  
> > >  static void __scsi_release_buffers(struct scsi_cmnd *, int);
> > >  
> > > -/*
> > > - * Function:    scsi_end_request()
> > > - *
> > > - * Purpose:     Post-processing of completed commands (usually invoked at end
> > > - *		of upper level post-processing and scsi_io_completion).
> > > - *
> > > - * Arguments:   cmd	 - command that is complete.
> > > - *              error    - 0 if I/O indicates success, < 0 for I/O error.
> > > - *              bytes    - number of bytes of completed I/O
> > > - *		requeue  - indicates whether we should requeue leftovers.
> > > - *
> > > - * Lock status: Assumed that lock is not held upon entry.
> > > - *
> > > - * Returns:     cmd if requeue required, NULL otherwise.
> > > - *
> > > - * Notes:       This is called for block device requests in order to
> > > - *              mark some number of sectors as complete.
> > > - * 
> > > - *		We are guaranteeing that the request queue will be goosed
> > > - *		at some point during this call.
> > > - * Notes:	If cmd was requeued, upon return it will be a stale pointer.
> > > - */
> > > -static struct scsi_cmnd *scsi_end_request(struct scsi_cmnd *cmd, int error,
> > > -					  int bytes, int requeue)
> > > -{
> > > -	struct request_queue *q = cmd->device->request_queue;
> > > -	struct request *req = cmd->request;
> > > -
> > > -	/*
> > > -	 * If there are blocks left over at the end, set up the command
> > > -	 * to queue the remainder of them.
> > > -	 */
> > > -	if (blk_end_request(req, error, bytes)) {
> > > -		int leftover = (req->hard_nr_sectors << 9);
> > > -
> > > -		if (blk_pc_request(req))
> > > -			leftover = req->data_len;
> > > -
> > > -		/* kill remainder if no retrys */
> > > -		if (error && scsi_noretry_cmd(cmd))
> > > -			blk_end_request(req, error, leftover);
> > > -		else {
> > > -			if (requeue) {
> > > -				/*
> > > -				 * Bleah.  Leftovers again.  Stick the
> > > -				 * leftovers in the front of the
> > > -				 * queue, and goose the queue again.
> > > -				 */
> > > -				scsi_release_buffers(cmd);
> > > -				scsi_requeue_command(q, cmd);
> > > -				cmd = NULL;
> > > -			}
> > > -			return cmd;
> > > -		}
> > > -	}
> > > -
> > > -	/*
> > > -	 * This will goose the queue request function at the end, so we don't
> > > -	 * need to worry about launching another command.
> > > -	 */
> > > -	__scsi_release_buffers(cmd, 0);
> > > -	scsi_next_command(cmd);
> > > -	return NULL;
> > > -}
> > > -
> > >  static inline unsigned int scsi_sgtable_index(unsigned short nents)
> > >  {
> > >  	unsigned int index;
> > > @@ -929,7 +864,6 @@ static void scsi_end_bidi_request(struct scsi_cmnd *cmd)
> > >  void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes)
> > >  {
> > >  	int result = cmd->result;
> > > -	int this_count;
> > >  	struct request_queue *q = cmd->device->request_queue;
> > >  	struct request *req = cmd->request;
> > >  	int error = 0;
> > > @@ -980,18 +914,30 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes)
> > >  	SCSI_LOG_HLCOMPLETE(1, printk("%ld sectors total, "
> > >  				      "%d bytes done.\n",
> > >  				      req->nr_sectors, good_bytes));
> > > -
> > > -	/* A number of bytes were successfully read.  If there
> > > -	 * are leftovers and there is some kind of error
> > > -	 * (result != 0), retry the rest.
> > > -	 */
> > > -	if (scsi_end_request(cmd, error, good_bytes, result == 0) == NULL)
> > > +	if (blk_end_request(req, error, good_bytes) == 0) {
> > > +		/* This request is completely finished; start the next one */
> > > +		scsi_next_command(cmd);
> > >  		return;
> > > -	this_count = blk_rq_bytes(req);
> > > +	}
> 
> > You lost me. Why does rt needs to patch scsi_io_completion at all?
> > You should remove any rt patches that modify scsi_lib.c and revert to
> > vanilla 2.6.29-rc6 (scsi wise that is).
> > 
> > The above diff looks like something that was sent in the past to the mailing
> > list, but only half of it. It was sent by Alan Stern. It might patch but
> > it is not applicable any more because of changes made since. 
> 
> That's right; it is an old version of a patch which no longer applies 
> to the current kernel (the __scsi_release_buffers() call was added 
> after that patch was written).  An updated version of the patch has 
> been submitted here:
> 
> 	http://marc.info/?l=linux-scsi&m=123507641620649&w=2

ah, i missed that patch because you used the exact same subject 
line. (The standard lkml convention is to use "v2, v3" 
postfixes, to make sure people notice that it's an update. It 
helps avoid such mistakes.)

I've picked up the v2 delta below into tip:out-of-tree - thanks 
Alan! This will also get into the next -rt release.

Both patches will go away from the tip:out-of-tree hot-fixes 
branch once the fix is upstream via the SCSI tree.

Any ETA for that? The lockup bug it fixes is very serious.

	Ingo

-------------------->
>From 6b1f22c4418f4684806b4ee499f2c623f5ed998b Mon Sep 17 00:00:00 2001
From: Ingo Molnar <mingo@elte.hu>
Date: Tue, 3 Mar 2009 21:52:16 +0100
Subject: [PATCH] fix "scsi: aic7xxx hang since v2.6.28-rc1", v2

updated "SCSI: remove scsi_end_request()" patch.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 drivers/scsi/scsi_lib.c |    9 +++++----
 1 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 8388b4e..abd2652 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -916,12 +916,11 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes)
 				      req->nr_sectors, good_bytes));
 	if (blk_end_request(req, error, good_bytes) == 0) {
 		/* This request is completely finished; start the next one */
+		 __scsi_release_buffers(cmd, 0);
 		scsi_next_command(cmd);
 		return;
 	}
 
-	error = -EIO;
-
 	/* The request isn't finished yet.  Figure out what to do next. */
 	if (result == 0) {
 		/* No error, so carry out the remainder of the request.
@@ -931,8 +930,9 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes)
 		if (good_bytes > 0 || --req->retries >= 0)
 			action = ACTION_REPREP;
 		else {
-			action = ACTION_FAIL;
 			description = "Retries exhausted";
+			action = ACTION_FAIL;
+			error = -EIO;
 		}
 	} else if (error && scsi_noretry_cmd(cmd)) {
 		/* Retrys are disallowed, so kill the remainder. */
@@ -944,6 +944,7 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes)
 		 */
 		action = ACTION_RETRY;
 	} else if (sense_valid && !sense_deferred) {
+		error = -EIO;
 		switch (sshdr.sense_key) {
 		case UNIT_ATTENTION:
 			if (cmd->device->removable) {
@@ -1043,7 +1044,7 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes)
 			if (driver_byte(result) & DRIVER_SENSE)
 				scsi_print_sense("", cmd);
 		}
-		blk_end_request(req, -EIO, blk_rq_bytes(req));
+		blk_end_request(req, error, blk_rq_bytes(req));
 		scsi_next_command(cmd);
 		break;
 	case ACTION_REPREP:

^ permalink raw reply related	[flat|nested] 66+ messages in thread

* Re: Large amount of scsi-sgpool objects
  2009-03-03 20:56           ` Ingo Molnar
@ 2009-03-03 21:06               ` Alan Stern
  0 siblings, 0 replies; 66+ messages in thread
From: Alan Stern @ 2009-03-03 21:06 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Boaz Harrosh, Jan Engelhardt, James Bottomley, linux-scsi,
	Linux Kernel Mailing List, linux-rt-users

On Tue, 3 Mar 2009, Ingo Molnar wrote:

> > > The above diff looks like something that was sent in the past to the mailing
> > > list, but only half of it. It was sent by Alan Stern. It might patch but
> > > it is not applicable any more because of changes made since. 
> > 
> > That's right; it is an old version of a patch which no longer applies 
> > to the current kernel (the __scsi_release_buffers() call was added 
> > after that patch was written).  An updated version of the patch has 
> > been submitted here:
> > 
> > 	http://marc.info/?l=linux-scsi&m=123507641620649&w=2
> 
> ah, i missed that patch because you used the exact same subject 
> line. (The standard lkml convention is to use "v2, v3" 
> postfixes, to make sure people notice that it's an update. It 
> helps avoid such mistakes.)

Sorry, I just forgot to update the subject line.

> I've picked up the v2 delta below into tip:out-of-tree - thanks 
> Alan! This will also get into the next -rt release.
> 
> Both patches will go away from the tip:out-of-tree hot-fixes 
> branch once the fix is upstream via the SCSI tree.
> 
> Any ETA for that? The lockup bug it fixes is very serious.

It's entirely up to James.

Alan Stern


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Large amount of scsi-sgpool objects
@ 2009-03-03 21:06               ` Alan Stern
  0 siblings, 0 replies; 66+ messages in thread
From: Alan Stern @ 2009-03-03 21:06 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Boaz Harrosh, Jan Engelhardt, James Bottomley, linux-scsi,
	Linux Kernel Mailing List, linux-rt-users

On Tue, 3 Mar 2009, Ingo Molnar wrote:

> > > The above diff looks like something that was sent in the past to the mailing
> > > list, but only half of it. It was sent by Alan Stern. It might patch but
> > > it is not applicable any more because of changes made since. 
> > 
> > That's right; it is an old version of a patch which no longer applies 
> > to the current kernel (the __scsi_release_buffers() call was added 
> > after that patch was written).  An updated version of the patch has 
> > been submitted here:
> > 
> > 	http://marc.info/?l=linux-scsi&m=123507641620649&w=2
> 
> ah, i missed that patch because you used the exact same subject 
> line. (The standard lkml convention is to use "v2, v3" 
> postfixes, to make sure people notice that it's an update. It 
> helps avoid such mistakes.)

Sorry, I just forgot to update the subject line.

> I've picked up the v2 delta below into tip:out-of-tree - thanks 
> Alan! This will also get into the next -rt release.
> 
> Both patches will go away from the tip:out-of-tree hot-fixes 
> branch once the fix is upstream via the SCSI tree.
> 
> Any ETA for that? The lockup bug it fixes is very serious.

It's entirely up to James.

Alan Stern


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Large amount of scsi-sgpool objects
  2009-03-03 19:22         ` Large amount of scsi-sgpool objects Ingo Molnar
@ 2009-03-03 21:25           ` James Bottomley
  2009-03-03 21:44             ` Ingo Molnar
  0 siblings, 1 reply; 66+ messages in thread
From: James Bottomley @ 2009-03-03 21:25 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Jan Engelhardt, Boaz Harrosh, linux-scsi,
	Linux Kernel Mailing List, linux-rt-users

On Tue, 2009-03-03 at 20:22 +0100, Ingo Molnar wrote:
> * James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
> 
> > On Tue, 2009-03-03 at 17:08 +0100, Jan Engelhardt wrote:
> > > On Tuesday 2009-03-03 16:21, James Bottomley wrote:
> > > >> > $ slabtop
> > > >> >   OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME                   
> > > >> > 818616 818616 100%    0.16K  34109       24    136436K sgpool-8
> > > >> > 253692 253692 100%    0.62K  42282        6    169128K sgpool-32
> > > >> >  52017  52016  99%    2.50K  17339        3    138712K sgpool-128
> > > >> >  26220  26219  99%    0.31K   2185       12      8740K sgpool-16
> > > >> >   8927   8574  96%    0.03K     79      113       316K size-32
> > > >> 
> > > >> Looks like a leak, by failing to call scsi_release_buffers()
> > > >> somehow. (Which was changed recently)
> > > >
> > > >Firstly, I have to say I don't see this in the mainline tree, so could
> > > >you try that with your setup just to verify (git head at 2.6.29-rc6).
> > > 
> > > Yes, looking at the rt patch (in broken-out it's in origin.diff),
> > > it seems a bit obvious - the scsi_release_buffers is not called anymore:
> > 
> > OK, this is a bad patch, so just revert it.  It was posted to 
> > linux-scsi initially in this form before the author posted a 
> > new one with the missing release buffers added.  It looks like 
> > the first incarnation got pulled into the -rt tree for some 
> > reasons.
> 
> Uhm. I applied a test-patch from Alan Stern, to possibly fix an 
> SCSI lockup with aic7xxx that _I_ reported to you and then to 
> the scsi-list.
> 
> You were Cc:-ed to that test patch and to my bugreport as well, 
> all the way. Do you claim that you dont remember it?

?  If I've just quoted the patch above why would I not remember it now.
I've just said it's the cause of the problems in the -rt tree.

> The saga is still documented in tip:out-of-tree (which is a 
> special branch with out-of-tree hotfixes):
> 
>  7e4cbd1: fix "scsi: aic7xxx hang since v2.6.28-rc1"
>  e027abc: scsi: temporarily undo scsi reverts
>  813104e: Revert "[SCSI] simplify scsi_io_completion()"
>  84db545: Revert "[SCSI] Fix uninitialized variable error in scsi_io_completion"
>  0eb6038: Revert "[SCSI] Fix error handling for DIF/DIX"
>  3cd94dd: Revert "[SCSI] scsi_lib: don't decrement busy counters when inserting commands"
>  c27aed5: Revert "[SCSI] scsi_lib: fix DID_RESET status problems"
> 
> I wasnt Cc:-ed on the updated patch AFAICS, so i didnt pick it 
> up.

Actually, you were cc'd, you probably just missed it.

> > So the real question is why does the -rt tree even have 
> > patches not in the vanilla SCSI tree?  This type of cockup 
> > clearly demonstrates why it's a bad idea.
> 
> Believe me, i have better things to do than to track down your 
> regressions. I applied a fix/test patch sent to me by SCSI 
> folks.

Look, I've no problem with you collecting random patches.  I have a
problem when you start pushing random SCSI patches into other trees.  I
think this thread clearly illustrates why it's a problem.

Even if *I* sent you a patch asking if it fixes your bug, it doesn't
mean you should be deploying it on production systems ... we have
process, like code inspection and integration testing to got through
before patches appear in the SCSI trees.

For the particular patch in question (the corrected form).  I don't
actually believe it's a bug fix.  I've gone over it and best I can tell,
it does what it says it does (merely reposition the code) ... there's no
actual logic alterations at all.  So I think, unfortunately, whatever
the bug is, it's timing related (the patch does shorten some code
paths).  Now, sure, we can cover it up and pretend it's gone, but it's
bound to show up again down the road, so I'd like to find it's root
cause now rather than later.

James



^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Large amount of scsi-sgpool objects
  2009-03-03 21:25           ` James Bottomley
@ 2009-03-03 21:44             ` Ingo Molnar
  2009-03-03 22:39               ` James Bottomley
  0 siblings, 1 reply; 66+ messages in thread
From: Ingo Molnar @ 2009-03-03 21:44 UTC (permalink / raw)
  To: James Bottomley
  Cc: Jan Engelhardt, Boaz Harrosh, linux-scsi,
	Linux Kernel Mailing List, linux-rt-users


* James Bottomley <James.Bottomley@HansenPartnership.com> wrote:

> On Tue, 2009-03-03 at 20:22 +0100, Ingo Molnar wrote:
> > * James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
> > 
> > > On Tue, 2009-03-03 at 17:08 +0100, Jan Engelhardt wrote:
> > > > On Tuesday 2009-03-03 16:21, James Bottomley wrote:
> > > > >> > $ slabtop
> > > > >> >   OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME                   
> > > > >> > 818616 818616 100%    0.16K  34109       24    136436K sgpool-8
> > > > >> > 253692 253692 100%    0.62K  42282        6    169128K sgpool-32
> > > > >> >  52017  52016  99%    2.50K  17339        3    138712K sgpool-128
> > > > >> >  26220  26219  99%    0.31K   2185       12      8740K sgpool-16
> > > > >> >   8927   8574  96%    0.03K     79      113       316K size-32
> > > > >> 
> > > > >> Looks like a leak, by failing to call scsi_release_buffers()
> > > > >> somehow. (Which was changed recently)
> > > > >
> > > > >Firstly, I have to say I don't see this in the mainline tree, so could
> > > > >you try that with your setup just to verify (git head at 2.6.29-rc6).
> > > > 
> > > > Yes, looking at the rt patch (in broken-out it's in origin.diff),
> > > > it seems a bit obvious - the scsi_release_buffers is not called anymore:
> > > 
> > > OK, this is a bad patch, so just revert it.  It was posted to 
> > > linux-scsi initially in this form before the author posted a 
> > > new one with the missing release buffers added.  It looks like 
> > > the first incarnation got pulled into the -rt tree for some 
> > > reasons.
> > 
> > Uhm. I applied a test-patch from Alan Stern, to possibly fix an 
> > SCSI lockup with aic7xxx that _I_ reported to you and then to 
> > the scsi-list.
> > 
> > You were Cc:-ed to that test patch and to my bugreport as well, 
> > all the way. Do you claim that you dont remember it?
> 
> ?  If I've just quoted the patch above why would I not remember it now.
> I've just said it's the cause of the problems in the -rt tree.
> 
> > The saga is still documented in tip:out-of-tree (which is a 
> > special branch with out-of-tree hotfixes):
> > 
> >  7e4cbd1: fix "scsi: aic7xxx hang since v2.6.28-rc1"
> >  e027abc: scsi: temporarily undo scsi reverts
> >  813104e: Revert "[SCSI] simplify scsi_io_completion()"
> >  84db545: Revert "[SCSI] Fix uninitialized variable error in scsi_io_completion"
> >  0eb6038: Revert "[SCSI] Fix error handling for DIF/DIX"
> >  3cd94dd: Revert "[SCSI] scsi_lib: don't decrement busy counters when inserting commands"
> >  c27aed5: Revert "[SCSI] scsi_lib: fix DID_RESET status problems"
> > 
> > I wasnt Cc:-ed on the updated patch AFAICS, so i didnt pick it 
> > up.
> 
> Actually, you were cc'd, you probably just missed it.

Correct, see my earlier reply to Alan Stern for the details of 
why it got missed.

> > > So the real question is why does the -rt tree even have 
> > > patches not in the vanilla SCSI tree?  This type of cockup 
> > > clearly demonstrates why it's a bad idea.
> > 
> > Believe me, i have better things to do than to track down your 
> > regressions. I applied a fix/test patch sent to me by SCSI 
> > folks.
> 
> Look, I've no problem with you collecting random patches.  I 
> have a problem when you start pushing random SCSI patches into 
> other trees. [...]

Both -tip and -rt are generic trees and there's a connection 
between them that the maintainers of both are one and the same 
set of people.

So i'm not sure on what grounds you purport to be able to 
prevent fixes from flowing from -tip into -rt and vice versa.

You have no monopoly on the propagation and testing of SCSI 
fixes. We picked up a v1 patch from the SCSI list and did not 
add nor remove anything from it.

Yes, v1 was buggy and I missed the v2 patch because it had the 
same subject line and no easily visible differentiator that it 
was a v2 patch.

In fact you should be happy that people are helping you out 
fixing your bugs. I never even thought of flaming anyone for 
having picked up a v1 scheduler fix and getting burned by it 
while a v2 existed. I find it unexplainable and irrational that 
you feel the need to do so for SCSI fixes.

	Ingo

^ permalink raw reply	[flat|nested] 66+ messages in thread

* [BUG] 2.6.29-rc6-2450cf in scsi_lib.c (was: Large amount of scsi-sgpool)objects
  2009-03-03 17:19         ` Thomas Gleixner
@ 2009-03-03 22:07             ` Thomas Gleixner
  0 siblings, 0 replies; 66+ messages in thread
From: Thomas Gleixner @ 2009-03-03 22:07 UTC (permalink / raw)
  To: James Bottomley
  Cc: Jan Engelhardt, Boaz Harrosh, linux-scsi, Linux Kernel Mailing List

On Tue, 3 Mar 2009, Thomas Gleixner wrote:
> My bad. I was playing with that to get rid of the aic7xxx wreckage on
> one of my test boxen and forgot to remove it.

While the one below is definitey not my fault. It's on Linus latest:

 commit 2450cf51a1bdba7037e91b1bcc494b01c58aaf66

While compiling a kernel I triggerred the BUG below. Not so nice as it
took a whole filesystem with it. fsck took more than 20 min to recover
the leftovers :(

Thanks,

	tglx


------------[ cut here ]------------
kernel BUG at /home/tglx/work/kernel/git/linux-2.6/drivers/scsi/scsi_lib.c:1141!
invalid opcode: 0000 [#1] SMP 
last sysfs file: /sys/devices/pci0000:00/0000:00:1c.5/0000:03:00.0/irq
CPU 0 
Modules linked in: ext2 ipt_MASQUERADE iptable_nat nf_nat bridge stp autofs4 coretemp lm85 hwmon_vid hwmon nfs lockd nfs_acl auth_rpcgss fuse sunrpc 8139too mii ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT ip6t_ipv6header xt_tcpudp nf_conntrack_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables x_tables ipv6 dm_mirror dm_region_hash dm_log dm_multipath dm_mod raid0 kvm_intel kvm uinput snd_hda_codec_idt snd_hda_intel snd_hda_codec snd_hwdep snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd ppdev i2c_i801 firewire_ohci soundcore parport_pc i82975x_edac e1000e firewire_core snd_page_alloc i2c_core iTCO_wdt sr_mod edac_core parport sg floppy crc_itu_t iTCO_vendor_support pcspkr serio_raw cdrom ata_piix sata_sil ahci libata sd_mod scsi_mod raid456 async_xor async_memcpy async_tx xor raid1 ext3 jbd mbcache uhci_hcd ohci_hcd ehci_hcd [last unloaded: freq_table]
Pid: 16556, comm: cc1 Not tainted 2.6.29-rc6 #155         
RIP: 0010:[<ffffffffa00ad30b>]  [<ffffffffa00ad30b>] scsi_init_sgtable+0x51/0x9f [scsi_mod]
RSP: 0000:ffffffff8163cb40  EFLAGS: 00010002
RAX: 000000000000002d RBX: ffff88004b7a3590 RCX: 00000000ffffffff
RDX: 000000000000000b RSI: ffff880034b4d0a0 RDI: 0000000000002000
RBP: ffffffff8163cb50 R08: ffff880001017b50 R09: ffffffff8163cb90
R10: ffff88007d281000 R11: 0000000000000000 R12: ffff88003b323c58
R13: 00000000082b73a7 R14: ffffffff8163cc48 R15: 00000000000003f0
FS:  00007fe47f3ea6f0(0000) GS:ffffffff81645080(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fe47a37f000 CR3: 00000000443a8000 CR4: 00000000000026e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process cc1 (pid: 16556, threadinfo ffff88001c332000, task ffff880008398000)
Stack:
 ffff88003b323c00 ffff88007d281000 ffffffff8163cb70 ffffffffa00ad550
 ffff88004b7a3590 ffff88007d281000 ffffffff8163cba0 ffffffffa00ad6b1
 ffff88007d1f1410 ffff88007d1f0000 ffff88004b7a3590 ffff88007d281000
Call Trace:
 <IRQ> <0> [<ffffffffa00ad550>] scsi_init_io+0x1f/0xc9 [scsi_mod]
 [<ffffffffa00ad6b1>] scsi_setup_fs_cmnd+0xb7/0xbf [scsi_mod]
 [<ffffffffa00d5175>] sd_prep_fn+0x65/0x81c [sd_mod]
 [<ffffffffa00a768c>] ? scsi_done+0x0/0x12 [scsi_mod]
 [<ffffffffa00a768c>] ? scsi_done+0x0/0x12 [scsi_mod]
 [<ffffffff8114345e>] elv_next_request+0xd7/0x19e
 [<ffffffffa00acc45>] scsi_request_fn+0x90/0x4cf [scsi_mod]
 [<ffffffff81144e86>] blk_invoke_request_fn+0x75/0x14a
 [<ffffffff8114547a>] __blk_run_queue+0x25/0x29
 [<ffffffff8114549f>] blk_run_queue+0x21/0x35
 [<ffffffffa00ac4f3>] scsi_run_queue+0x2e3/0x37d [scsi_mod]
 [<ffffffffa00ad2aa>] scsi_next_command+0x36/0x46 [scsi_mod]
 [<ffffffffa00addd7>] scsi_io_completion+0x3df/0x423 [scsi_mod]
 [<ffffffffa00df3ad>] ? __ata_qc_complete+0xc0/0xc9 [libata]
 [<ffffffffa00e1011>] ? ata_qc_complete+0x1ea/0x1f3 [libata]
 [<ffffffffa00a7683>] scsi_finish_command+0xe9/0xf2 [scsi_mod]
 [<ffffffffa00adf45>] scsi_softirq_done+0x11a/0x123 [scsi_mod]
 [<ffffffff81149275>] blk_done_softirq+0x69/0x79
 [<ffffffff810412b7>] __do_softirq+0x83/0x144
 [<ffffffff8100d23c>] call_softirq+0x1c/0x28
 [<ffffffff8100e17c>] do_softirq+0x34/0x76
 [<ffffffff81040fd1>] irq_exit+0x3f/0x79
 [<ffffffff8100e43f>] do_IRQ+0x130/0x155
 [<ffffffff8100cb13>] ret_from_intr+0x0/0xa
 <EOI> <0>Code: 80 00 00 00 4c 89 e7 e8 0a 0d 0b e1 85 c0 74 45 48 c7 c2 59 d3 0a a0 be 80 00 00 00 4c 89 e7 e8 78 09 0b e1 b8 02 00 00 00 eb 25 <0f> 0b eb fe 41 89 44 24 08 83 7b 4c 02 75 08 8b 83 18 01 00 00 
RIP  [<ffffffffa00ad30b>] scsi_init_sgtable+0x51/0x9f [scsi_mod]
 RSP <ffffffff8163cb40>
---[ end trace 1b0175d9b3d81293 ]---

^ permalink raw reply	[flat|nested] 66+ messages in thread

* [BUG] 2.6.29-rc6-2450cf in scsi_lib.c (was: Large amount of scsi-sgpool)objects
@ 2009-03-03 22:07             ` Thomas Gleixner
  0 siblings, 0 replies; 66+ messages in thread
From: Thomas Gleixner @ 2009-03-03 22:07 UTC (permalink / raw)
  To: James Bottomley
  Cc: Jan Engelhardt, Boaz Harrosh, linux-scsi, Linux Kernel Mailing List

On Tue, 3 Mar 2009, Thomas Gleixner wrote:
> My bad. I was playing with that to get rid of the aic7xxx wreckage on
> one of my test boxen and forgot to remove it.

While the one below is definitey not my fault. It's on Linus latest:

 commit 2450cf51a1bdba7037e91b1bcc494b01c58aaf66

While compiling a kernel I triggerred the BUG below. Not so nice as it
took a whole filesystem with it. fsck took more than 20 min to recover
the leftovers :(

Thanks,

	tglx


------------[ cut here ]------------
kernel BUG at /home/tglx/work/kernel/git/linux-2.6/drivers/scsi/scsi_lib.c:1141!
invalid opcode: 0000 [#1] SMP 
last sysfs file: /sys/devices/pci0000:00/0000:00:1c.5/0000:03:00.0/irq
CPU 0 
Modules linked in: ext2 ipt_MASQUERADE iptable_nat nf_nat bridge stp autofs4 coretemp lm85 hwmon_vid hwmon nfs lockd nfs_acl auth_rpcgss fuse sunrpc 8139too mii ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT ip6t_ipv6header xt_tcpudp nf_conntrack_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables x_tables ipv6 dm_mirror dm_region_hash dm_log dm_multipath dm_mod raid0 kvm_intel kvm uinput snd_hda_codec_idt snd_hda_intel snd_hda_codec snd_hwdep snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd ppdev i2c_i801 firewire_ohci soundcore parport_pc i82975x_edac e1000e firewire_core snd_page_alloc i2c_core iTCO_wdt sr_mod edac_core parport sg floppy crc_itu_t iTCO_vendor_support pcspkr serio_raw c
 drom ata_piix sata_sil ahci libata sd_mod scsi_mod raid456 async_xor async_memcpy async_tx xor raid1 ext3 jbd mbcache uhci_hcd ohci_hcd ehci_hcd [last unloaded: freq_table]
Pid: 16556, comm: cc1 Not tainted 2.6.29-rc6 #155         
RIP: 0010:[<ffffffffa00ad30b>]  [<ffffffffa00ad30b>] scsi_init_sgtable+0x51/0x9f [scsi_mod]
RSP: 0000:ffffffff8163cb40  EFLAGS: 00010002
RAX: 000000000000002d RBX: ffff88004b7a3590 RCX: 00000000ffffffff
RDX: 000000000000000b RSI: ffff880034b4d0a0 RDI: 0000000000002000
RBP: ffffffff8163cb50 R08: ffff880001017b50 R09: ffffffff8163cb90
R10: ffff88007d281000 R11: 0000000000000000 R12: ffff88003b323c58
R13: 00000000082b73a7 R14: ffffffff8163cc48 R15: 00000000000003f0
FS:  00007fe47f3ea6f0(0000) GS:ffffffff81645080(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fe47a37f000 CR3: 00000000443a8000 CR4: 00000000000026e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process cc1 (pid: 16556, threadinfo ffff88001c332000, task ffff880008398000)
Stack:
 ffff88003b323c00 ffff88007d281000 ffffffff8163cb70 ffffffffa00ad550
 ffff88004b7a3590 ffff88007d281000 ffffffff8163cba0 ffffffffa00ad6b1
 ffff88007d1f1410 ffff88007d1f0000 ffff88004b7a3590 ffff88007d281000
Call Trace:
 <IRQ> <0> [<ffffffffa00ad550>] scsi_init_io+0x1f/0xc9 [scsi_mod]
 [<ffffffffa00ad6b1>] scsi_setup_fs_cmnd+0xb7/0xbf [scsi_mod]
 [<ffffffffa00d5175>] sd_prep_fn+0x65/0x81c [sd_mod]
 [<ffffffffa00a768c>] ? scsi_done+0x0/0x12 [scsi_mod]
 [<ffffffffa00a768c>] ? scsi_done+0x0/0x12 [scsi_mod]
 [<ffffffff8114345e>] elv_next_request+0xd7/0x19e
 [<ffffffffa00acc45>] scsi_request_fn+0x90/0x4cf [scsi_mod]
 [<ffffffff81144e86>] blk_invoke_request_fn+0x75/0x14a
 [<ffffffff8114547a>] __blk_run_queue+0x25/0x29
 [<ffffffff8114549f>] blk_run_queue+0x21/0x35
 [<ffffffffa00ac4f3>] scsi_run_queue+0x2e3/0x37d [scsi_mod]
 [<ffffffffa00ad2aa>] scsi_next_command+0x36/0x46 [scsi_mod]
 [<ffffffffa00addd7>] scsi_io_completion+0x3df/0x423 [scsi_mod]
 [<ffffffffa00df3ad>] ? __ata_qc_complete+0xc0/0xc9 [libata]
 [<ffffffffa00e1011>] ? ata_qc_complete+0x1ea/0x1f3 [libata]
 [<ffffffffa00a7683>] scsi_finish_command+0xe9/0xf2 [scsi_mod]
 [<ffffffffa00adf45>] scsi_softirq_done+0x11a/0x123 [scsi_mod]
 [<ffffffff81149275>] blk_done_softirq+0x69/0x79
 [<ffffffff810412b7>] __do_softirq+0x83/0x144
 [<ffffffff8100d23c>] call_softirq+0x1c/0x28
 [<ffffffff8100e17c>] do_softirq+0x34/0x76
 [<ffffffff81040fd1>] irq_exit+0x3f/0x79
 [<ffffffff8100e43f>] do_IRQ+0x130/0x155
 [<ffffffff8100cb13>] ret_from_intr+0x0/0xa
 <EOI> <0>Code: 80 00 00 00 4c 89 e7 e8 0a 0d 0b e1 85 c0 74 45 48 c7 c2 59 d3 0a a0 be 80 00 00 00 4c 89 e7 e8 78 09 0b e1 b8 02 00 00 00 eb 25 <0f> 0b eb fe 41 89 44 24 08 83 7b 4c 02 75 08 8b 83 18 01 00 00 
RIP  [<ffffffffa00ad30b>] scsi_init_sgtable+0x51/0x9f [scsi_mod]
 RSP <ffffffff8163cb40>
---[ end trace 1b0175d9b3d81293 ]---

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [BUG] 2.6.29-rc6-2450cf in scsi_lib.c (was: Large amount of scsi-sgpool)objects
  2009-03-03 22:07             ` Thomas Gleixner
  (?)
@ 2009-03-03 22:22             ` Jan Engelhardt
  2009-03-03 23:07               ` Thomas Gleixner
  -1 siblings, 1 reply; 66+ messages in thread
From: Jan Engelhardt @ 2009-03-03 22:22 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: James Bottomley, Boaz Harrosh, linux-scsi, Linux Kernel Mailing List


On Tuesday 2009-03-03 23:07, Thomas Gleixner wrote:

>On Tue, 3 Mar 2009, Thomas Gleixner wrote:
>> My bad. I was playing with that to get rid of the aic7xxx wreckage on
>> one of my test boxen and forgot to remove it.
>
>While the one below is definitey not my fault. It's on Linus latest:
>
> commit 2450cf51a1bdba7037e91b1bcc494b01c58aaf66

That commit seems to have nothing to do with filesystems or scsi:

commit 2450cf51a1bdba7037e91b1bcc494b01c58aaf66
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date:   Mon Mar 2 16:23:33 2009 -0800

    Revert "menu: fix embedded menu snafu"

?

>While compiling a kernel I triggerred the BUG below. Not so nice as it
>took a whole filesystem with it. fsck took more than 20 min to recover
>the leftovers :(

Stop using ext3 ;-)

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [BUG] 2.6.29-rc6-2450cf in scsi_lib.c (was: Large amount of scsi-sgpool)objects
  2009-03-03 22:07             ` Thomas Gleixner
@ 2009-03-03 22:26               ` James Bottomley
  -1 siblings, 0 replies; 66+ messages in thread
From: James Bottomley @ 2009-03-03 22:26 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Jan Engelhardt, Boaz Harrosh, linux-scsi,
	Linux Kernel Mailing List, linux-ide

On Tue, 2009-03-03 at 23:07 +0100, Thomas Gleixner wrote:
> On Tue, 3 Mar 2009, Thomas Gleixner wrote:
> > My bad. I was playing with that to get rid of the aic7xxx wreckage on
> > one of my test boxen and forgot to remove it.
> 
> While the one below is definitey not my fault. It's on Linus latest:
> 
>  commit 2450cf51a1bdba7037e91b1bcc494b01c58aaf66
> 
> While compiling a kernel I triggerred the BUG below. Not so nice as it
> took a whole filesystem with it. fsck took more than 20 min to recover
> the leftovers :(
> 
> Thanks,
> 
> 	tglx
> 
> 
> ------------[ cut here ]------------
> kernel BUG at /home/tglx/work/kernel/git/linux-2.6/drivers/scsi/scsi_lib.c:1141!

This is BUG_ON(count > sdb->table.nents);

It looks like the sg list got split and grew in size ... I suspect this
might be libata related, so cc'ing the ide list.  I suspect either the
block layer initially parametrised this wrongly (tomo bug) or a sg list
got split then requeued (something in libata?).

James


> invalid opcode: 0000 [#1] SMP 
> last sysfs file: /sys/devices/pci0000:00/0000:00:1c.5/0000:03:00.0/irq
> CPU 0 
> Modules linked in: ext2 ipt_MASQUERADE iptable_nat nf_nat bridge stp autofs4 coretemp lm85 hwmon_vid hwmon nfs lockd nfs_acl auth_rpcgss fuse sunrpc 8139too mii ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT ip6t_ipv6header xt_tcpudp nf_conntrack_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables x_tables ipv6 dm_mirror dm_region_hash dm_log dm_multipath dm_mod raid0 kvm_intel kvm uinput snd_hda_codec_idt snd_hda_intel snd_hda_codec snd_hwdep snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd ppdev i2c_i801 firewire_ohci soundcore parport_pc i82975x_edac e1000e firewire_core snd_page_alloc i2c_core iTCO_wdt sr_mod edac_core parport sg floppy crc_itu_t iTCO_vendor_support pcspkr serio_raw
  cdrom ata_piix sata_sil ahci libata sd_mod scsi_mod raid456 async_xor async_memcpy async_tx xor raid1 ext3 jbd mbcache uhci_hcd ohci_hcd ehci_hcd [last unloaded: freq_table]
> Pid: 16556, comm: cc1 Not tainted 2.6.29-rc6 #155         
> RIP: 0010:[<ffffffffa00ad30b>]  [<ffffffffa00ad30b>] scsi_init_sgtable+0x51/0x9f [scsi_mod]
> RSP: 0000:ffffffff8163cb40  EFLAGS: 00010002
> RAX: 000000000000002d RBX: ffff88004b7a3590 RCX: 00000000ffffffff
> RDX: 000000000000000b RSI: ffff880034b4d0a0 RDI: 0000000000002000
> RBP: ffffffff8163cb50 R08: ffff880001017b50 R09: ffffffff8163cb90
> R10: ffff88007d281000 R11: 0000000000000000 R12: ffff88003b323c58
> R13: 00000000082b73a7 R14: ffffffff8163cc48 R15: 00000000000003f0
> FS:  00007fe47f3ea6f0(0000) GS:ffffffff81645080(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007fe47a37f000 CR3: 00000000443a8000 CR4: 00000000000026e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process cc1 (pid: 16556, threadinfo ffff88001c332000, task ffff880008398000)
> Stack:
>  ffff88003b323c00 ffff88007d281000 ffffffff8163cb70 ffffffffa00ad550
>  ffff88004b7a3590 ffff88007d281000 ffffffff8163cba0 ffffffffa00ad6b1
>  ffff88007d1f1410 ffff88007d1f0000 ffff88004b7a3590 ffff88007d281000
> Call Trace:
>  <IRQ> <0> [<ffffffffa00ad550>] scsi_init_io+0x1f/0xc9 [scsi_mod]
>  [<ffffffffa00ad6b1>] scsi_setup_fs_cmnd+0xb7/0xbf [scsi_mod]
>  [<ffffffffa00d5175>] sd_prep_fn+0x65/0x81c [sd_mod]
>  [<ffffffffa00a768c>] ? scsi_done+0x0/0x12 [scsi_mod]
>  [<ffffffffa00a768c>] ? scsi_done+0x0/0x12 [scsi_mod]
>  [<ffffffff8114345e>] elv_next_request+0xd7/0x19e
>  [<ffffffffa00acc45>] scsi_request_fn+0x90/0x4cf [scsi_mod]
>  [<ffffffff81144e86>] blk_invoke_request_fn+0x75/0x14a
>  [<ffffffff8114547a>] __blk_run_queue+0x25/0x29
>  [<ffffffff8114549f>] blk_run_queue+0x21/0x35
>  [<ffffffffa00ac4f3>] scsi_run_queue+0x2e3/0x37d [scsi_mod]
>  [<ffffffffa00ad2aa>] scsi_next_command+0x36/0x46 [scsi_mod]
>  [<ffffffffa00addd7>] scsi_io_completion+0x3df/0x423 [scsi_mod]
>  [<ffffffffa00df3ad>] ? __ata_qc_complete+0xc0/0xc9 [libata]
>  [<ffffffffa00e1011>] ? ata_qc_complete+0x1ea/0x1f3 [libata]
>  [<ffffffffa00a7683>] scsi_finish_command+0xe9/0xf2 [scsi_mod]
>  [<ffffffffa00adf45>] scsi_softirq_done+0x11a/0x123 [scsi_mod]
>  [<ffffffff81149275>] blk_done_softirq+0x69/0x79
>  [<ffffffff810412b7>] __do_softirq+0x83/0x144
>  [<ffffffff8100d23c>] call_softirq+0x1c/0x28
>  [<ffffffff8100e17c>] do_softirq+0x34/0x76
>  [<ffffffff81040fd1>] irq_exit+0x3f/0x79
>  [<ffffffff8100e43f>] do_IRQ+0x130/0x155
>  [<ffffffff8100cb13>] ret_from_intr+0x0/0xa
>  <EOI> <0>Code: 80 00 00 00 4c 89 e7 e8 0a 0d 0b e1 85 c0 74 45 48 c7 c2 59 d3 0a a0 be 80 00 00 00 4c 89 e7 e8 78 09 0b e1 b8 02 00 00 00 eb 25 <0f> 0b eb fe 41 89 44 24 08 83 7b 4c 02 75 08 8b 83 18 01 00 00 
> RIP  [<ffffffffa00ad30b>] scsi_init_sgtable+0x51/0x9f [scsi_mod]
>  RSP <ffffffff8163cb40>
> ---[ end trace 1b0175d9b3d81293 ]---


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [BUG] 2.6.29-rc6-2450cf in scsi_lib.c (was: Large amount of scsi-sgpool)objects
@ 2009-03-03 22:26               ` James Bottomley
  0 siblings, 0 replies; 66+ messages in thread
From: James Bottomley @ 2009-03-03 22:26 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Jan Engelhardt, Boaz Harrosh, linux-scsi,
	Linux Kernel Mailing List, linux-ide

On Tue, 2009-03-03 at 23:07 +0100, Thomas Gleixner wrote:
> On Tue, 3 Mar 2009, Thomas Gleixner wrote:
> > My bad. I was playing with that to get rid of the aic7xxx wreckage on
> > one of my test boxen and forgot to remove it.
> 
> While the one below is definitey not my fault. It's on Linus latest:
> 
>  commit 2450cf51a1bdba7037e91b1bcc494b01c58aaf66
> 
> While compiling a kernel I triggerred the BUG below. Not so nice as it
> took a whole filesystem with it. fsck took more than 20 min to recover
> the leftovers :(
> 
> Thanks,
> 
> 	tglx
> 
> 
> ------------[ cut here ]------------
> kernel BUG at /home/tglx/work/kernel/git/linux-2.6/drivers/scsi/scsi_lib.c:1141!

This is BUG_ON(count > sdb->table.nents);

It looks like the sg list got split and grew in size ... I suspect this
might be libata related, so cc'ing the ide list.  I suspect either the
block layer initially parametrised this wrongly (tomo bug) or a sg list
got split then requeued (something in libata?).

James


> invalid opcode: 0000 [#1] SMP 
> last sysfs file: /sys/devices/pci0000:00/0000:00:1c.5/0000:03:00.0/irq
> CPU 0 
> Modules linked in: ext2 ipt_MASQUERADE iptable_nat nf_nat bridge stp autofs4 coretemp lm85 hwmon_vid hwmon nfs lockd nfs_acl auth_rpcgss fuse sunrpc 8139too mii ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT ip6t_ipv6header xt_tcpudp nf_conntrack_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables x_tables ipv6 dm_mirror dm_region_hash dm_log dm_multipath dm_mod raid0 kvm_intel kvm uinput snd_hda_codec_idt snd_hda_intel snd_hda_codec snd_hwdep snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd ppdev i2c_i801 firewire_ohci soundcore parport_pc i82975x_edac e1000e firewire_core snd_page_alloc i2c_core iTCO_wdt sr_mod edac_core parport sg floppy crc_itu_t iTCO_vendor_support pcspkr serio_raw cdrom ata_piix sata_sil ahci libata sd_mod scsi_mod raid456 async_xor async_memcpy async_tx xor raid1 ext3 jbd mbcache uhci_hcd ohci_hcd ehci_hcd [last unloaded: freq_table]
> Pid: 16556, comm: cc1 Not tainted 2.6.29-rc6 #155         
> RIP: 0010:[<ffffffffa00ad30b>]  [<ffffffffa00ad30b>] scsi_init_sgtable+0x51/0x9f [scsi_mod]
> RSP: 0000:ffffffff8163cb40  EFLAGS: 00010002
> RAX: 000000000000002d RBX: ffff88004b7a3590 RCX: 00000000ffffffff
> RDX: 000000000000000b RSI: ffff880034b4d0a0 RDI: 0000000000002000
> RBP: ffffffff8163cb50 R08: ffff880001017b50 R09: ffffffff8163cb90
> R10: ffff88007d281000 R11: 0000000000000000 R12: ffff88003b323c58
> R13: 00000000082b73a7 R14: ffffffff8163cc48 R15: 00000000000003f0
> FS:  00007fe47f3ea6f0(0000) GS:ffffffff81645080(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007fe47a37f000 CR3: 00000000443a8000 CR4: 00000000000026e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process cc1 (pid: 16556, threadinfo ffff88001c332000, task ffff880008398000)
> Stack:
>  ffff88003b323c00 ffff88007d281000 ffffffff8163cb70 ffffffffa00ad550
>  ffff88004b7a3590 ffff88007d281000 ffffffff8163cba0 ffffffffa00ad6b1
>  ffff88007d1f1410 ffff88007d1f0000 ffff88004b7a3590 ffff88007d281000
> Call Trace:
>  <IRQ> <0> [<ffffffffa00ad550>] scsi_init_io+0x1f/0xc9 [scsi_mod]
>  [<ffffffffa00ad6b1>] scsi_setup_fs_cmnd+0xb7/0xbf [scsi_mod]
>  [<ffffffffa00d5175>] sd_prep_fn+0x65/0x81c [sd_mod]
>  [<ffffffffa00a768c>] ? scsi_done+0x0/0x12 [scsi_mod]
>  [<ffffffffa00a768c>] ? scsi_done+0x0/0x12 [scsi_mod]
>  [<ffffffff8114345e>] elv_next_request+0xd7/0x19e
>  [<ffffffffa00acc45>] scsi_request_fn+0x90/0x4cf [scsi_mod]
>  [<ffffffff81144e86>] blk_invoke_request_fn+0x75/0x14a
>  [<ffffffff8114547a>] __blk_run_queue+0x25/0x29
>  [<ffffffff8114549f>] blk_run_queue+0x21/0x35
>  [<ffffffffa00ac4f3>] scsi_run_queue+0x2e3/0x37d [scsi_mod]
>  [<ffffffffa00ad2aa>] scsi_next_command+0x36/0x46 [scsi_mod]
>  [<ffffffffa00addd7>] scsi_io_completion+0x3df/0x423 [scsi_mod]
>  [<ffffffffa00df3ad>] ? __ata_qc_complete+0xc0/0xc9 [libata]
>  [<ffffffffa00e1011>] ? ata_qc_complete+0x1ea/0x1f3 [libata]
>  [<ffffffffa00a7683>] scsi_finish_command+0xe9/0xf2 [scsi_mod]
>  [<ffffffffa00adf45>] scsi_softirq_done+0x11a/0x123 [scsi_mod]
>  [<ffffffff81149275>] blk_done_softirq+0x69/0x79
>  [<ffffffff810412b7>] __do_softirq+0x83/0x144
>  [<ffffffff8100d23c>] call_softirq+0x1c/0x28
>  [<ffffffff8100e17c>] do_softirq+0x34/0x76
>  [<ffffffff81040fd1>] irq_exit+0x3f/0x79
>  [<ffffffff8100e43f>] do_IRQ+0x130/0x155
>  [<ffffffff8100cb13>] ret_from_intr+0x0/0xa
>  <EOI> <0>Code: 80 00 00 00 4c 89 e7 e8 0a 0d 0b e1 85 c0 74 45 48 c7 c2 59 d3 0a a0 be 80 00 00 00 4c 89 e7 e8 78 09 0b e1 b8 02 00 00 00 eb 25 <0f> 0b eb fe 41 89 44 24 08 83 7b 4c 02 75 08 8b 83 18 01 00 00 
> RIP  [<ffffffffa00ad30b>] scsi_init_sgtable+0x51/0x9f [scsi_mod]
>  RSP <ffffffff8163cb40>
> ---[ end trace 1b0175d9b3d81293 ]---


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Large amount of scsi-sgpool objects
  2009-03-03 21:44             ` Ingo Molnar
@ 2009-03-03 22:39               ` James Bottomley
  2009-03-03 23:03                 ` Ingo Molnar
  2009-03-04  0:30                 ` Thomas Gleixner
  0 siblings, 2 replies; 66+ messages in thread
From: James Bottomley @ 2009-03-03 22:39 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Jan Engelhardt, Boaz Harrosh, linux-scsi,
	Linux Kernel Mailing List, linux-rt-users

On Tue, 2009-03-03 at 22:44 +0100, Ingo Molnar wrote:
> * James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
> > > > So the real question is why does the -rt tree even have 
> > > > patches not in the vanilla SCSI tree?  This type of cockup 
> > > > clearly demonstrates why it's a bad idea.
> > > 
> > > Believe me, i have better things to do than to track down your 
> > > regressions. I applied a fix/test patch sent to me by SCSI 
> > > folks.
> > 
> > Look, I've no problem with you collecting random patches.  I 
> > have a problem when you start pushing random SCSI patches into 
> > other trees. [...]
> 
> Both -tip and -rt are generic trees and there's a connection 
> between them that the maintainers of both are one and the same 
> set of people.
> 
> So i'm not sure on what grounds you purport to be able to 
> prevent fixes from flowing from -tip into -rt and vice versa.
> 
> You have no monopoly on the propagation and testing of SCSI 
> fixes. We picked up a v1 patch from the SCSI list and did not 
> add nor remove anything from it.

OK, let me try and make the problem simpler for you:  If you pick up
random patches outside of your area and apply them without any quality
control (like code inspections, or even understanding how the patches
work) you've created a suspect and quality compromised tree.   This is
fine for your own work and others to test and report back (although the
list will start to see your bug reports as correspondingly lower quality
if you have a high proportion of self induced failures like this one).

However, if you base a feature tree off this compromised tree, now
you're causing extra work for other maintainers who see problems
reported with this tree, and have to take the time to investigate what's
going on.

Worse, supposing there is a genuine SCSI bug exposed by the -rt tree
(say something timing or interrupt related).  So I ask the reporter to
retry with the regular kernel tree and the bug goes away. Now everyone
will think "Oh, it's just because of some SCSI crap Ingo put in his
tree".  Result: the bug goes undiagnosed until it bites several people
in the field, which is an avoidable result.

The executive summary is that your "it works for me, so I'm putting it
in my tree" attitude is damaging our quality process.

James



^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Large amount of scsi-sgpool objects
  2009-03-03 22:39               ` James Bottomley
@ 2009-03-03 23:03                 ` Ingo Molnar
  2009-03-03 23:32                   ` Stefan Richter
  2009-03-04  0:01                   ` James Bottomley
  2009-03-04  0:30                 ` Thomas Gleixner
  1 sibling, 2 replies; 66+ messages in thread
From: Ingo Molnar @ 2009-03-03 23:03 UTC (permalink / raw)
  To: James Bottomley
  Cc: Jan Engelhardt, Boaz Harrosh, linux-scsi,
	Linux Kernel Mailing List, linux-rt-users, Andrew Morton


* James Bottomley <James.Bottomley@HansenPartnership.com> wrote:

> On Tue, 2009-03-03 at 22:44 +0100, Ingo Molnar wrote:
> > * James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
> > > > > So the real question is why does the -rt tree even have 
> > > > > patches not in the vanilla SCSI tree?  This type of cockup 
> > > > > clearly demonstrates why it's a bad idea.
> > > > 
> > > > Believe me, i have better things to do than to track down your 
> > > > regressions. I applied a fix/test patch sent to me by SCSI 
> > > > folks.
> > > 
> > > Look, I've no problem with you collecting random patches.  I 
> > > have a problem when you start pushing random SCSI patches into 
> > > other trees. [...]
> > 
> > Both -tip and -rt are generic trees and there's a connection 
> > between them that the maintainers of both are one and the same 
> > set of people.
> > 
> > So i'm not sure on what grounds you purport to be able to 
> > prevent fixes from flowing from -tip into -rt and vice versa.
> > 
> > You have no monopoly on the propagation and testing of SCSI 
> > fixes. We picked up a v1 patch from the SCSI list and did not 
> > add nor remove anything from it.
> 
> OK, let me try and make the problem simpler for you:  If you pick up
> random patches outside of your area [...]

Dude, lets make it clear to you: it is not "your area" that you 
own in any way and you have no monopoly on SCSI fixes. We acted 
out of necessity because the SCSI tree is taking very long to 
get fixes upstream.

I reported this lockup bug to you on _January 15th_, more than 
one and a half months ago - and it's still unfixed even today. 
Alan sent his v2 fix on Feburary 19th - two weeks ago. We are 
not asking you for much, is it really that hard to act like a 
proper maintainer and get critical fixes upstream in a timely 
manner?

Again, i repeat, i picked up a v1 patch from the SCSI list that 
i reported and which patch was sent to me. I did that in the 
hope to fix a serious lockup bug that is still unfixed in the 
upstream kernel here and today. That kind of bug can cause data 
corruption and is serious and you should have given full, high 
priority attention to it - but you didnt.

A v2 patch was sent too but i missed it because it had the exact 
same subject line and no 'v2' in its title.

I did not do this out of fun - i did it to address a serious 
regression that was unfixed upstream.

If there's a failure here it's yours: your latency and 
unreliability in SCSI bug fixing is forcing generic trees like 
-tip or -rt (or -mm) to carry SCSI fixes while it should be 
_your_ responsibility to act quickly to bugreports and get 
critical fixes upstream as soon as possible.

	Ingo

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [BUG] 2.6.29-rc6-2450cf in scsi_lib.c (was: Large amount of scsi-sgpool)objects
  2009-03-03 22:22             ` Jan Engelhardt
@ 2009-03-03 23:07               ` Thomas Gleixner
  0 siblings, 0 replies; 66+ messages in thread
From: Thomas Gleixner @ 2009-03-03 23:07 UTC (permalink / raw)
  To: Jan Engelhardt
  Cc: James Bottomley, Boaz Harrosh, linux-scsi, Linux Kernel Mailing List

On Tue, 3 Mar 2009, Jan Engelhardt wrote:
> On Tuesday 2009-03-03 23:07, Thomas Gleixner wrote:
> 
> >On Tue, 3 Mar 2009, Thomas Gleixner wrote:
> >> My bad. I was playing with that to get rid of the aic7xxx wreckage on
> >> one of my test boxen and forgot to remove it.
> >
> >While the one below is definitey not my fault. It's on Linus latest:
> >
> > commit 2450cf51a1bdba7037e91b1bcc494b01c58aaf66
> 
> That commit seems to have nothing to do with filesystems or scsi:
> 
> commit 2450cf51a1bdba7037e91b1bcc494b01c58aaf66
> Author: Linus Torvalds <torvalds@linux-foundation.org>
> Date:   Mon Mar 2 16:23:33 2009 -0800
> 
>     Revert "menu: fix embedded menu snafu"
> 

Errm. The commit merily indicates which point of Linus tree I'm using
and nothing else. I'm well aware that this commit has nothing to do
with scsi.
 
> >While compiling a kernel I triggerred the BUG below. Not so nice as it
> >took a whole filesystem with it. fsck took more than 20 min to recover
> >the leftovers :(
> 
> Stop using ext3 ;-)

Thanks for the real useful advise.

       tglx



^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Large amount of scsi-sgpool objects
  2009-03-03 23:03                 ` Ingo Molnar
@ 2009-03-03 23:32                   ` Stefan Richter
  2009-03-03 23:48                     ` Ingo Molnar
  2009-03-04  0:01                   ` James Bottomley
  1 sibling, 1 reply; 66+ messages in thread
From: Stefan Richter @ 2009-03-03 23:32 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: James Bottomley, Jan Engelhardt, Boaz Harrosh, linux-scsi,
	Linux Kernel Mailing List, linux-rt-users, Andrew Morton

Ingo Molnar wrote:
> i picked up a v1 patch from the SCSI list that 
> i reported and which patch was sent to me. I did that in the 
> hope to fix a serious lockup bug that is still unfixed in the
> upstream kernel

Perhaps you could avoid putting such fixes into published merge
branches.  Have it in an off-topic branch and tell people with affected
systems to locally merge such a branch themselves if it is strictly
necessary to keep their development rolling.  (And if such an off-topic
branch turns out to be unexpectedly long-lived, ping the choke point.)
Just so that there is minimal surprise about which modifications your
trees carry.
-- 
Stefan Richter
-=====-==--= --== --=--
http://arcgraph.de/sr/

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Large amount of scsi-sgpool objects
  2009-03-03 23:32                   ` Stefan Richter
@ 2009-03-03 23:48                     ` Ingo Molnar
  2009-03-04  6:39                       ` Stefan Richter
  0 siblings, 1 reply; 66+ messages in thread
From: Ingo Molnar @ 2009-03-03 23:48 UTC (permalink / raw)
  To: Stefan Richter
  Cc: James Bottomley, Jan Engelhardt, Boaz Harrosh, linux-scsi,
	Linux Kernel Mailing List, linux-rt-users, Andrew Morton


* Stefan Richter <stefanr@s5r6.in-berlin.de> wrote:

> Ingo Molnar wrote:
> > i picked up a v1 patch from the SCSI list that 
> > i reported and which patch was sent to me. I did that in the 
> > hope to fix a serious lockup bug that is still unfixed in the
> > upstream kernel
> 
> Perhaps you could avoid putting such fixes into published 
> merge branches.

Yeah, these commits are in none of the topic branches that are 
the git base of development, they are all already in a separate 
branch named "tip:out-of-tree". That separate branch carries 
assorted fixlets for various bugs we trigger in -tip qa. It 
never goes upstream-wards, obviously. It's a bit like the 
fixlets portion of -mm.

	Ingo

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Large amount of scsi-sgpool objects
  2009-03-03 23:03                 ` Ingo Molnar
  2009-03-03 23:32                   ` Stefan Richter
@ 2009-03-04  0:01                   ` James Bottomley
  2009-03-04  0:39                     ` Ingo Molnar
  1 sibling, 1 reply; 66+ messages in thread
From: James Bottomley @ 2009-03-04  0:01 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Jan Engelhardt, Boaz Harrosh, linux-scsi,
	Linux Kernel Mailing List, linux-rt-users, Andrew Morton

On Wed, 2009-03-04 at 00:03 +0100, Ingo Molnar wrote:
> * James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
> 
> > On Tue, 2009-03-03 at 22:44 +0100, Ingo Molnar wrote:
> > > * James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
> > > > > > So the real question is why does the -rt tree even have 
> > > > > > patches not in the vanilla SCSI tree?  This type of cockup 
> > > > > > clearly demonstrates why it's a bad idea.
> > > > > 
> > > > > Believe me, i have better things to do than to track down your 
> > > > > regressions. I applied a fix/test patch sent to me by SCSI 
> > > > > folks.
> > > > 
> > > > Look, I've no problem with you collecting random patches.  I 
> > > > have a problem when you start pushing random SCSI patches into 
> > > > other trees. [...]
> > > 
> > > Both -tip and -rt are generic trees and there's a connection 
> > > between them that the maintainers of both are one and the same 
> > > set of people.
> > > 
> > > So i'm not sure on what grounds you purport to be able to 
> > > prevent fixes from flowing from -tip into -rt and vice versa.
> > > 
> > > You have no monopoly on the propagation and testing of SCSI 
> > > fixes. We picked up a v1 patch from the SCSI list and did not 
> > > add nor remove anything from it.
> > 
> > OK, let me try and make the problem simpler for you:  If you pick up
> > random patches outside of your area [...]
> 
> Dude, lets make it clear to you: it is not "your area" that you 
> own in any way and you have no monopoly on SCSI fixes.

The fact that I can spot a missing release buffers and you, apparently,
can't does lend credence to a claim of greater expertise in the SCSI
area.

>  We acted 
> out of necessity because the SCSI tree is taking very long to 
> get fixes upstream.

To get upstream, there first has to be a fix.  I've already explained
why I don't think the patch in question is a fix.

> I reported this lockup bug to you on _January 15th_, more than 
> one and a half months ago - and it's still unfixed even today. 
> Alan sent his v2 fix on Feburary 19th - two weeks ago.

OK, so why don't you humour me and try applying the diagnostic patch I
sent you on the 24th of January?  I think there's a loop in I/O
completion caused by some type of timing race, and starved list handling
is the prime candidate.

>  We are 
> not asking you for much, is it really that hard to act like a 
> proper maintainer and get critical fixes upstream in a timely 
> manner?

You're asking me to ignore root cause analysis and push patches as "bug
fixes" just because they hide the problem on your machine.

> Again, i repeat, i picked up a v1 patch from the SCSI list that 
> i reported and which patch was sent to me. I did that in the 
> hope to fix a serious lockup bug that is still unfixed in the 
> upstream kernel here and today. That kind of bug can cause data 
> corruption and is serious and you should have given full, high 
> priority attention to it - but you didnt.
> 
> A v2 patch was sent too but i missed it because it had the exact 
> same subject line and no 'v2' in its title.
> 
> I did not do this out of fun - i did it to address a serious 
> regression that was unfixed upstream.
> 
> If there's a failure here it's yours: your latency and 
> unreliability in SCSI bug fixing is forcing generic trees like 
> -tip or -rt (or -mm) to carry SCSI fixes while it should be 
> _your_ responsibility to act quickly to bugreports and get 
> critical fixes upstream as soon as possible.

Well, you not responding to requests for information for three weeks
(before deciding the it was unnecessary to provide it) may have helped
elongate the time scale ...

However, I'm not really interested in recriminations.  Since you already
know better than me what the bug is, simply explain here what the patch
actually fixes and it can go upstream with my blessing and whatever ego
boosting praise you require.

James



^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Large amount of scsi-sgpool objects
  2009-03-03 22:39               ` James Bottomley
  2009-03-03 23:03                 ` Ingo Molnar
@ 2009-03-04  0:30                 ` Thomas Gleixner
  2009-03-04  0:47                   ` Ingo Molnar
  1 sibling, 1 reply; 66+ messages in thread
From: Thomas Gleixner @ 2009-03-04  0:30 UTC (permalink / raw)
  To: James Bottomley
  Cc: Ingo Molnar, Jan Engelhardt, Boaz Harrosh, linux-scsi,
	Linux Kernel Mailing List, linux-rt-users

James,

On Tue, 3 Mar 2009, James Bottomley wrote:
> However, if you base a feature tree off this compromised tree, now
> you're causing extra work for other maintainers who see problems
> reported with this tree, and have to take the time to investigate what's
> going on.

Stop this nonsense. That bug stopped me and Ingo from testing code on
machines with AIC7xxx controllers for quite a while.

The patch sent by the author of the SCSI commit which caused the
breakage in the first place (b60af5b: [SCSI] simplify
scsi_io_completion()) specifically to Ingo in response to Ingo's bug
report seemed to fix it and it did not explode, so it was left in the
-rt origin.patch simply because I forgot about it and there was no fix
in the mainline kernel which would have made it obsolete.

Yes, I should have noticed the memory leak, but ...

> Worse, supposing there is a genuine SCSI bug exposed by the -rt tree
> (say something timing or interrupt related).  So I ask the reporter to
> retry with the regular kernel tree and the bug goes away. Now everyone
> will think "Oh, it's just because of some SCSI crap Ingo put in his
> tree".  Result: the bug goes undiagnosed until it bites several people
> in the field, which is an avoidable result.
> 
> The executive summary is that your "it works for me, so I'm putting it
> in my tree" attitude is damaging our quality process.

So according to your executive summary the full failure of test
machines due to a SCSI bug is not damaging anything. These machines
are^W have been part of the quality process and stopped working due to
wreckage induced by your tree in 29-rc1.

The executive summary is: stop development and testing until SCSI
comes up with a blessed fix.

You have my full understanding that quality processing defined by SCSI
has precedence over general testing.

Once I'm done with cleaning up the mess caused by the crash in linus
tree (w/o any unblessed patches) I'm going to fix the power supply of
the AIC7xxx system to make sure it is stand by when a blessed solution
pops up in the unforeseeable future.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Large amount of scsi-sgpool objects
  2009-03-04  0:01                   ` James Bottomley
@ 2009-03-04  0:39                     ` Ingo Molnar
  0 siblings, 0 replies; 66+ messages in thread
From: Ingo Molnar @ 2009-03-04  0:39 UTC (permalink / raw)
  To: James Bottomley
  Cc: Jan Engelhardt, Boaz Harrosh, linux-scsi,
	Linux Kernel Mailing List, linux-rt-users, Andrew Morton


* James Bottomley <James.Bottomley@HansenPartnership.com> wrote:

> > Dude, lets make it clear to you: it is not "your area" that 
> > you own in any way and you have no monopoly on SCSI fixes.
> 
> The fact that I can spot a missing release buffers and you, 
> apparently, can't does lend credence to a claim of greater 
> expertise in the SCSI area.

Uhm, i dont maintain SCSI and dont intend to. I got a fix patch 
from an SCSI person for a lockup bug i reported on the SCSI 
list, caused by a commit authored by that same SCSI person, and 
i applied the fix.

The real question is, why didnt your review and testing skills 
catch the serious lockups caused by:

    commit b60af5b0adf0da24c673598c8d3fb4d4189a15ce
    Date:   Mon Nov 3 15:56:47 2008 -0500
    [SCSI] simplify scsi_io_completion()

... and why is the bug still unfixed today in the upstream 
kernel?

The thing is, we triggered the lockups quite easily in -rt and 
-tip testing - and we are not set up to catch SCSI bugs at all. 
We are set up to test and catch bugs in the areas of the kernel 
we are interested in mainly.

> >  We acted out of necessity because the SCSI tree is taking 
> > very long to get fixes upstream.
> 
> To get upstream, there first has to be a fix.  I've already 
> explained why I don't think the patch in question is a fix.

Uhm, then why didnt you revert the original commit? I clearly 
identified it to you as the root trigger of the lockup and 
confirmed that reverting it fixes the hang, and i invested many 
days of testing into that.
 
When i reported this very serious and potentially data 
corrupting bug to you on January 15th you immediately suggested 
it's a problem with the commit above.

So you were in full knowledge of the root cause of the breakage 
- but why have you not reverted it? Doing a speedy revert is a 
must-have for any bug that has data corruption potential.

What you did is akin to playing russian roulette with other 
people's data. You knew about a serious, data corruptor 
breakage, you knew which change caused it, but still you didnt 
act to revert the patch or resolve the problem.

The expected behavior from an upstream maintainer is not to 
introduce breakages and wait until the "real cause" is 
understood. The expected behavior is to either fix or revert a 
commit that has been pinpointed as the source of a serious bug. 
You didnt do that - why?

> > I reported this lockup bug to you on _January 15th_, more 
> > than one and a half months ago - and it's still unfixed even 
> > today. Alan sent his v2 fix on Feburary 19th - two weeks 
> > ago.
> 
> OK, so why don't you humour me and try applying the diagnostic 
> patch I sent you on the 24th of January? [...]

Uhm, below is the so-called "diagnostic" patch you sent to me 
and the problem with that "patch" is that it does not tell us 
anything we did not know already.

It just confirms that the function locks up - and we already 
knew that much from all the logs i sent to you. I.e. that 
"patch" is an utter waste of my time and you should know that.

I bisect commits, i test fixes, but i dont test silly 
half-hearted "diagnostic" patches which must have taken you less 
than 1 minute to write and would waste 30 minutes of my time to 
test - for no objective purpose. It has no chance to give us any 
new information. I really have better things to do with my time.

James, you ned to try harder to get bugs fixed and you need to 
treat your testers better. Come up with a real diagnostic patch 
that tells us something useful and i'll run it. Until then i'll 
just ignore your rants.

	Ingo

--------------------->
From: James Bottomley

On Wed, 2009-01-21 at 16:44 +0100, Ingo Molnar wrote:
> > One thought I had is that somehow the starved list processing in 
> > scsi_run_queue might never be terminating.  This should be easy to 
> > verify: just put a counter inside the while(!list_empty(&starved_list)) 
> > and print if it gets ridiculously large (like > 1000).
> 
> i deal with about a dozen regressions a day, so it would really help if 
> you could send me a debug patch to apply. I can then turn off the reverts 
> for that testbox and wait for another hang to occur.

OK, try this.

But could we take this to the scsi-list so that others get a chance to
give suggestions and you're not single threaded on me.

Thanks,

James

---

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 940dc32..5919dd0 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -593,6 +593,7 @@ static void scsi_run_queue(struct request_queue *q)
 	struct Scsi_Host *shost = sdev->host;
 	LIST_HEAD(starved_list);
 	unsigned long flags;
+	int count = 0;
 
 	if (scsi_target(sdev)->single_lun)
 		scsi_single_lun_run(sdev);
@@ -603,6 +604,8 @@ static void scsi_run_queue(struct request_queue *q)
 	while (!list_empty(&starved_list)) {
 		int flagset;
 
+		BUG_ON(count++ > 1000);
+
 		/*
 		 * As long as shost is accepting commands and we have
 		 * starved queues, call blk_run_queue. scsi_request_fn


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* Re: Large amount of scsi-sgpool objects
  2009-03-04  0:30                 ` Thomas Gleixner
@ 2009-03-04  0:47                   ` Ingo Molnar
  0 siblings, 0 replies; 66+ messages in thread
From: Ingo Molnar @ 2009-03-04  0:47 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: James Bottomley, Jan Engelhardt, Boaz Harrosh, linux-scsi,
	Linux Kernel Mailing List, linux-rt-users, Rafael J. Wysocki,
	Andrew Morton


* Thomas Gleixner <tglx@linutronix.de> wrote:

> James,
> 
> On Tue, 3 Mar 2009, James Bottomley wrote:
> > However, if you base a feature tree off this compromised tree, now
> > you're causing extra work for other maintainers who see problems
> > reported with this tree, and have to take the time to investigate what's
> > going on.
> 
> Stop this nonsense. That bug stopped me and Ingo from testing code on
> machines with AIC7xxx controllers for quite a while.
> 
> The patch sent by the author of the SCSI commit which caused the
> breakage in the first place (b60af5b: [SCSI] simplify
> scsi_io_completion()) specifically to Ingo in response to Ingo's bug
> report seemed to fix it and it did not explode, so it was left in the
> -rt origin.patch simply because I forgot about it and there was no fix
> in the mainline kernel which would have made it obsolete.
> 
> Yes, I should have noticed the memory leak, but ...
> 
> > Worse, supposing there is a genuine SCSI bug exposed by the -rt tree
> > (say something timing or interrupt related).  So I ask the reporter to
> > retry with the regular kernel tree and the bug goes away. Now everyone
> > will think "Oh, it's just because of some SCSI crap Ingo put in his
> > tree".  Result: the bug goes undiagnosed until it bites several people
> > in the field, which is an avoidable result.
> > 
> > The executive summary is that your "it works for me, so I'm putting it
> > in my tree" attitude is damaging our quality process.
> 
> So according to your executive summary the full failure of 
> test machines due to a SCSI bug is not damaging anything. 
> These machines are^W have been part of the quality process and 
> stopped working due to wreckage induced by your tree in 
> 29-rc1.
> 
> The executive summary is: stop development and testing until 
> SCSI comes up with a blessed fix.
> 
> You have my full understanding that quality processing defined 
> by SCSI has precedence over general testing.
> 
> Once I'm done with cleaning up the mess caused by the crash in 
> linus tree (w/o any unblessed patches) I'm going to fix the 
> power supply of the AIC7xxx system to make sure it is stand by 
> when a blessed solution pops up in the unforeseeable future.

funny ...

I'd laugh about this if the matter wasnt so serious. According 
to two testers so far aic7xxx is essentially useless in 2.6.29. 

James, what is your action plan to deal with it? A bad commit 
was identified and reverting these commits:

c27aed5: Revert "[SCSI] scsi_lib: fix DID_RESET status problems"
3cd94dd: Revert "[SCSI] scsi_lib: don't decrement busy counters when inserting comma
0eb6038: Revert "[SCSI] Fix error handling for DIF/DIX"
84db545: Revert "[SCSI] Fix uninitialized variable error in scsi_io_completion"
813104e: Revert "[SCSI] simplify scsi_io_completion()"

fixed the bug according to my testing. Something needs to happen 
and you are the SCSI maintainer.

	Ingo

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [BUG] 2.6.29-rc6-2450cf in scsi_lib.c (was: Large amount of scsi-sgpool)objects
  2009-03-03 22:26               ` James Bottomley
  (?)
@ 2009-03-04  2:01               ` Thomas Gleixner
  2009-03-04 18:55                 ` James Bottomley
  2009-03-04 21:45                 ` Thomas Gleixner
  -1 siblings, 2 replies; 66+ messages in thread
From: Thomas Gleixner @ 2009-03-04  2:01 UTC (permalink / raw)
  To: James Bottomley
  Cc: Jan Engelhardt, Boaz Harrosh, linux-scsi,
	Linux Kernel Mailing List, linux-ide

On Tue, 3 Mar 2009, James Bottomley wrote:

> On Tue, 2009-03-03 at 23:07 +0100, Thomas Gleixner wrote:
> > On Tue, 3 Mar 2009, Thomas Gleixner wrote:
> > > My bad. I was playing with that to get rid of the aic7xxx wreckage on
> > > one of my test boxen and forgot to remove it.
> > 
> > While the one below is definitey not my fault. It's on Linus latest:
> > 
> >  commit 2450cf51a1bdba7037e91b1bcc494b01c58aaf66
> > 
> > While compiling a kernel I triggerred the BUG below. Not so nice as it
> > took a whole filesystem with it. fsck took more than 20 min to recover
> > the leftovers :(
> > 
> > Thanks,
> > 
> > 	tglx
> > 
> > 
> > ------------[ cut here ]------------
> > kernel BUG at /home/tglx/work/kernel/git/linux-2.6/drivers/scsi/scsi_lib.c:1141!
> 
> This is BUG_ON(count > sdb->table.nents);
> 
> It looks like the sg list got split and grew in size ... I suspect this
> might be libata related, so cc'ing the ide list.  I suspect either the
> block layer initially parametrised this wrongly (tomo bug) or a sg list
> got split then requeued (something in libata?).

FYI, after I've lost a full day of work including the results of four
"iozone -a -g 4G" runs I tried to reproduce the problem on that
machine - the leftovers of the filesystem are pretty useless anyway.

It took about 2hrs to trigger the bug again. Same back trace.

Anything I can do what might help to decode the problem ?

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Large amount of scsi-sgpool objects
  2009-03-03 23:48                     ` Ingo Molnar
@ 2009-03-04  6:39                       ` Stefan Richter
  2009-03-04  7:12                         ` Mike Galbraith
  0 siblings, 1 reply; 66+ messages in thread
From: Stefan Richter @ 2009-03-04  6:39 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: James Bottomley, Jan Engelhardt, Boaz Harrosh, linux-scsi,
	Linux Kernel Mailing List, linux-rt-users, Andrew Morton

Ingo Molnar wrote:
> * Stefan Richter <stefanr@s5r6.in-berlin.de> wrote:
>> Perhaps you could avoid putting such fixes into published 
>> merge branches.
[so that consumers of the trees use them only on demand]
> 
> Yeah, these commits are in none of the topic branches that are 
> the git base of development, they are all already in a separate 
> branch named "tip:out-of-tree".

So people should remember to retry without out-of-tree before reporting
problems == remember to report against the development base.

> That separate branch carries 
> assorted fixlets for various bugs we trigger in -tip qa. It 
> never goes upstream-wards, obviously. It's a bit like the 
> fixlets portion of -mm.

Great, if an eye is kept on issues & resolutions tracking --- like in
this thread, only with less confusion. ;-)
-- 
Stefan Richter
-=====-==--= --== --=--
http://arcgraph.de/sr/

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Large amount of scsi-sgpool objects
  2009-03-04  6:39                       ` Stefan Richter
@ 2009-03-04  7:12                         ` Mike Galbraith
  2009-03-04  7:50                             ` Stefan Richter
  0 siblings, 1 reply; 66+ messages in thread
From: Mike Galbraith @ 2009-03-04  7:12 UTC (permalink / raw)
  To: Stefan Richter
  Cc: Ingo Molnar, James Bottomley, Jan Engelhardt, Boaz Harrosh,
	linux-scsi, Linux Kernel Mailing List, linux-rt-users,
	Andrew Morton

On Wed, 2009-03-04 at 07:39 +0100, Stefan Richter wrote:
> Ingo Molnar wrote:
> > * Stefan Richter <stefanr@s5r6.in-berlin.de> wrote:
> >> Perhaps you could avoid putting such fixes into published 
> >> merge branches.
> [so that consumers of the trees use them only on demand]
> > 
> > Yeah, these commits are in none of the topic branches that are 
> > the git base of development, they are all already in a separate 
> > branch named "tip:out-of-tree".
> 
> So people should remember to retry without out-of-tree before reporting
> problems == remember to report against the development base.

I'll bite, how does a gut-fu white belt accomplish that?

	-Mike


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Large amount of scsi-sgpool objects
  2009-03-04  7:12                         ` Mike Galbraith
@ 2009-03-04  7:50                             ` Stefan Richter
  0 siblings, 0 replies; 66+ messages in thread
From: Stefan Richter @ 2009-03-04  7:50 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: Ingo Molnar, James Bottomley, Jan Engelhardt, Boaz Harrosh,
	linux-scsi, Linux Kernel Mailing List, linux-rt-users,
	Andrew Morton, Thomas Gleixner

Mike Galbraith wrote:
> On Wed, 2009-03-04 at 07:39 +0100, Stefan Richter wrote:
>> Ingo Molnar wrote:
>>> Yeah, these commits are in none of the topic branches that are 
>>> the git base of development, they are all already in a separate 
>>> branch named "tip:out-of-tree".
>> So people should remember to retry without out-of-tree before reporting
>> problems == remember to report against the development base.
> 
> I'll bite, how does a gut-fu white belt accomplish that?

How to do it is currently not quite obvious
  - for linux-2.6-x86.git users because the master branch contains
    out-of-tree.¹  Or maybe nobody should use the master branch,
    I don't know.
  - Ditto for linux-2.6.tip.git.²
It was also impossible for users of the -rt patchset because the faulty
patch was obviously included in a base patch in the -rt patch series.³

As I said, I recommend that people do not receive those off-topic
patches by default, only on demand.  (Because I sometimes work with -rt
users.)

¹)http://git.kernel.org/?p=linux/kernel/git/mingo/linux-2.6-x86.git;a=commit;h=4239438b08fe8ec149ddc530238ca131bf88d290
²)http://git.kernel.org/?p=linux/kernel/git/tip/linux-2.6-tip.git;a=commit;h=4239438b08fe8ec149ddc530238ca131bf88d290
³)http://lkml.org/lkml/2009/3/3/424
-- 
Stefan Richter
-=====-==--= --== --=--
http://arcgraph.de/sr/

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Large amount of scsi-sgpool objects
@ 2009-03-04  7:50                             ` Stefan Richter
  0 siblings, 0 replies; 66+ messages in thread
From: Stefan Richter @ 2009-03-04  7:50 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: Ingo Molnar, James Bottomley, Jan Engelhardt, Boaz Harrosh,
	linux-scsi, Linux Kernel Mailing List, linux-rt-users,
	Andrew Morton, Thomas Gleixner

Mike Galbraith wrote:
> On Wed, 2009-03-04 at 07:39 +0100, Stefan Richter wrote:
>> Ingo Molnar wrote:
>>> Yeah, these commits are in none of the topic branches that are 
>>> the git base of development, they are all already in a separate 
>>> branch named "tip:out-of-tree".
>> So people should remember to retry without out-of-tree before reporting
>> problems == remember to report against the development base.
> 
> I'll bite, how does a gut-fu white belt accomplish that?

How to do it is currently not quite obvious
  - for linux-2.6-x86.git users because the master branch contains
    out-of-tree.¹  Or maybe nobody should use the master branch,
    I don't know.
  - Ditto for linux-2.6.tip.git.²
It was also impossible for users of the -rt patchset because the faulty
patch was obviously included in a base patch in the -rt patch series.³

As I said, I recommend that people do not receive those off-topic
patches by default, only on demand.  (Because I sometimes work with -rt
users.)

¹)http://git.kernel.org/?p=linux/kernel/git/mingo/linux-2.6-x86.git;a=commit;h=4239438b08fe8ec149ddc530238ca131bf88d290
²)http://git.kernel.org/?p=linux/kernel/git/tip/linux-2.6-tip.git;a=commit;h=4239438b08fe8ec149ddc530238ca131bf88d290
³)http://lkml.org/lkml/2009/3/3/424
-- 
Stefan Richter
-=====-==--= --== --=--
http://arcgraph.de/sr/
--
To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Large amount of scsi-sgpool objects
  2009-03-04  7:50                             ` Stefan Richter
@ 2009-03-04  8:00                               ` Mike Galbraith
  -1 siblings, 0 replies; 66+ messages in thread
From: Mike Galbraith @ 2009-03-04  8:00 UTC (permalink / raw)
  To: Stefan Richter
  Cc: Ingo Molnar, James Bottomley, Jan Engelhardt, Boaz Harrosh,
	linux-scsi, Linux Kernel Mailing List, linux-rt-users,
	Andrew Morton, Thomas Gleixner

On Wed, 2009-03-04 at 08:50 +0100, Stefan Richter wrote:
> Mike Galbraith wrote:
> > On Wed, 2009-03-04 at 07:39 +0100, Stefan Richter wrote:
> >> Ingo Molnar wrote:
> >>> Yeah, these commits are in none of the topic branches that are 
> >>> the git base of development, they are all already in a separate 
> >>> branch named "tip:out-of-tree".
> >> So people should remember to retry without out-of-tree before reporting
> >> problems == remember to report against the development base.
> > 
> > I'll bite, how does a gut-fu white belt accomplish that?
> 
> How to do it is currently not quite obvious
>   - for linux-2.6-x86.git users because the master branch contains
>     out-of-tree.¹  Or maybe nobody should use the master branch,
>     I don't know.

Ah, ok.  I thought there might be an un-merge incantation or whatnot.

(seems to have just been an oopsie anyway)

	-Mike


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Large amount of scsi-sgpool objects
@ 2009-03-04  8:00                               ` Mike Galbraith
  0 siblings, 0 replies; 66+ messages in thread
From: Mike Galbraith @ 2009-03-04  8:00 UTC (permalink / raw)
  To: Stefan Richter
  Cc: Ingo Molnar, James Bottomley, Jan Engelhardt, Boaz Harrosh,
	linux-scsi, Linux Kernel Mailing List, linux-rt-users,
	Andrew Morton, Thomas Gleixner

On Wed, 2009-03-04 at 08:50 +0100, Stefan Richter wrote:
> Mike Galbraith wrote:
> > On Wed, 2009-03-04 at 07:39 +0100, Stefan Richter wrote:
> >> Ingo Molnar wrote:
> >>> Yeah, these commits are in none of the topic branches that are 
> >>> the git base of development, they are all already in a separate 
> >>> branch named "tip:out-of-tree".
> >> So people should remember to retry without out-of-tree before reporting
> >> problems == remember to report against the development base.
> > 
> > I'll bite, how does a gut-fu white belt accomplish that?
> 
> How to do it is currently not quite obvious
>   - for linux-2.6-x86.git users because the master branch contains
>     out-of-tree.¹  Or maybe nobody should use the master branch,
>     I don't know.

Ah, ok.  I thought there might be an un-merge incantation or whatnot.

(seems to have just been an oopsie anyway)

	-Mike

--
To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Large amount of scsi-sgpool objects
  2009-03-04  7:50                             ` Stefan Richter
  (?)
  (?)
@ 2009-03-04  9:06                             ` Thomas Gleixner
  2009-03-04 11:12                               ` Ingo Molnar
  -1 siblings, 1 reply; 66+ messages in thread
From: Thomas Gleixner @ 2009-03-04  9:06 UTC (permalink / raw)
  To: Stefan Richter
  Cc: Mike Galbraith, Ingo Molnar, James Bottomley, Jan Engelhardt,
	Boaz Harrosh, linux-scsi, Linux Kernel Mailing List,
	linux-rt-users, Andrew Morton

Stefan,

On Wed, 4 Mar 2009, Stefan Richter wrote:
> As I said, I recommend that people do not receive those off-topic
> patches by default, only on demand.  (Because I sometimes work with -rt
> users.)

And as I said before, I missed to remove the patch before pushing out
the -rt tree. So I really do not need repeated advise on how to do my
work.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Large amount of scsi-sgpool objects
  2009-03-04  9:06                             ` Thomas Gleixner
@ 2009-03-04 11:12                               ` Ingo Molnar
  2009-03-04 11:28                                 ` Stefan Richter
  0 siblings, 1 reply; 66+ messages in thread
From: Ingo Molnar @ 2009-03-04 11:12 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Stefan Richter, Mike Galbraith, James Bottomley, Jan Engelhardt,
	Boaz Harrosh, linux-scsi, Linux Kernel Mailing List,
	linux-rt-users, Andrew Morton


* Thomas Gleixner <tglx@linutronix.de> wrote:

> Stefan,
> 
> On Wed, 4 Mar 2009, Stefan Richter wrote:
> > As I said, I recommend that people do not receive those off-topic
> > patches by default, only on demand.  (Because I sometimes work with -rt
> > users.)
> 
> And as I said before, I missed to remove the patch before pushing out
> the -rt tree. So I really do not need repeated advise on how to do my
> work.

yeah. I dont think Stefan ever maintained a high-volume generic 
tree before. Kibitzing from the sidelines always makes it seem 
so easy ;-)

	Ingo

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Large amount of scsi-sgpool objects
  2009-03-04  7:50                             ` Stefan Richter
@ 2009-03-04 11:24                               ` Ingo Molnar
  -1 siblings, 0 replies; 66+ messages in thread
From: Ingo Molnar @ 2009-03-04 11:24 UTC (permalink / raw)
  To: Stefan Richter
  Cc: Mike Galbraith, James Bottomley, Jan Engelhardt, Boaz Harrosh,
	linux-scsi, Linux Kernel Mailing List, linux-rt-users,
	Andrew Morton, Thomas Gleixner


* Stefan Richter <stefanr@s5r6.in-berlin.de> wrote:

> Mike Galbraith wrote:
> > On Wed, 2009-03-04 at 07:39 +0100, Stefan Richter wrote:
> >> Ingo Molnar wrote:
> >>> Yeah, these commits are in none of the topic branches that are 
> >>> the git base of development, they are all already in a separate 
> >>> branch named "tip:out-of-tree".
> >> So people should remember to retry without out-of-tree before reporting
> >> problems == remember to report against the development base.
> > 
> > I'll bite, how does a gut-fu white belt accomplish that?
> 
> How to do it is currently not quite obvious
>   - for linux-2.6-x86.git users because the master branch contains
>     out-of-tree.¹  Or maybe nobody should use the master branch,
>     I don't know.
>   - Ditto for linux-2.6.tip.git.²
> It was also impossible for users of the -rt patchset because the faulty
> patch was obviously included in a base patch in the -rt patch series.³

Dont be silly. Have you never seen a faulty v1 fix patch in -mm 
or any of the other dozens of generic trees, replaced by a 
better working v2 patch in the next release?

FYI, we still have not tracked down the SCSI bug. Latest 
tip:master is able to boot and work on the affected systems, 
while the upstream kernel does not even boot because the fix (or 
the revert, should the fix be deemed unwanted) is stuck in the 
SCSI tree.

But even with all known SCSI fixes applied there's still a crash 
in the SCSI code that Thomas can trigger - reproduced on 
mainline as well.

Instead of spreading advice in this thread about how to run a 
high-volume generic tree [which you've never done before] you 
could also have used your time to look at the crashes that were 
posted, and you could have helped us fix the bug ;-)

	Ingo

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Large amount of scsi-sgpool objects
@ 2009-03-04 11:24                               ` Ingo Molnar
  0 siblings, 0 replies; 66+ messages in thread
From: Ingo Molnar @ 2009-03-04 11:24 UTC (permalink / raw)
  To: Stefan Richter
  Cc: Mike Galbraith, James Bottomley, Jan Engelhardt, Boaz Harrosh,
	linux-scsi, Linux Kernel Mailing List, linux-rt-users,
	Andrew Morton, Thomas Gleixner


* Stefan Richter <stefanr@s5r6.in-berlin.de> wrote:

> Mike Galbraith wrote:
> > On Wed, 2009-03-04 at 07:39 +0100, Stefan Richter wrote:
> >> Ingo Molnar wrote:
> >>> Yeah, these commits are in none of the topic branches that are 
> >>> the git base of development, they are all already in a separate 
> >>> branch named "tip:out-of-tree".
> >> So people should remember to retry without out-of-tree before reporting
> >> problems == remember to report against the development base.
> > 
> > I'll bite, how does a gut-fu white belt accomplish that?
> 
> How to do it is currently not quite obvious
>   - for linux-2.6-x86.git users because the master branch contains
>     out-of-tree.¹  Or maybe nobody should use the master branch,
>     I don't know.
>   - Ditto for linux-2.6.tip.git.²
> It was also impossible for users of the -rt patchset because the faulty
> patch was obviously included in a base patch in the -rt patch series.³

Dont be silly. Have you never seen a faulty v1 fix patch in -mm 
or any of the other dozens of generic trees, replaced by a 
better working v2 patch in the next release?

FYI, we still have not tracked down the SCSI bug. Latest 
tip:master is able to boot and work on the affected systems, 
while the upstream kernel does not even boot because the fix (or 
the revert, should the fix be deemed unwanted) is stuck in the 
SCSI tree.

But even with all known SCSI fixes applied there's still a crash 
in the SCSI code that Thomas can trigger - reproduced on 
mainline as well.

Instead of spreading advice in this thread about how to run a 
high-volume generic tree [which you've never done before] you 
could also have used your time to look at the crashes that were 
posted, and you could have helped us fix the bug ;-)

	Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Large amount of scsi-sgpool objects
  2009-03-04 11:12                               ` Ingo Molnar
@ 2009-03-04 11:28                                 ` Stefan Richter
  2009-03-04 11:47                                   ` Ingo Molnar
  0 siblings, 1 reply; 66+ messages in thread
From: Stefan Richter @ 2009-03-04 11:28 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Thomas Gleixner, Mike Galbraith, James Bottomley, Jan Engelhardt,
	Boaz Harrosh, linux-scsi, Linux Kernel Mailing List,
	linux-rt-users, Andrew Morton

Ingo Molnar wrote:
> * Thomas Gleixner <tglx@linutronix.de> wrote:
>> On Wed, 4 Mar 2009, Stefan Richter wrote:
>> > As I said, I recommend that people do not receive those off-topic
>> > patches by default, only on demand.  (Because I sometimes work with -rt
>> > users.)
>> 
>> And as I said before, I missed to remove the patch before pushing out
>> the -rt tree. So I really do not need repeated advise on how to do my
>> work.
> 
> yeah. I dont think Stefan ever maintained a high-volume generic 
> tree before. Kibitzing from the sidelines always makes it seem 
> so easy ;-)

You could also take this "I recommend..." not as advice how to do your
work but rather as a /user request/ regarding the outcome of your work.

(But you know all that already of course; you also know what's best for
those who pull your trees.)
-- 
Stefan Richter
-=====-==--= --== --=--
http://arcgraph.de/sr/

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Large amount of scsi-sgpool objects
  2009-03-04 11:28                                 ` Stefan Richter
@ 2009-03-04 11:47                                   ` Ingo Molnar
  2009-03-04 12:02                                     ` Stefan Richter
  0 siblings, 1 reply; 66+ messages in thread
From: Ingo Molnar @ 2009-03-04 11:47 UTC (permalink / raw)
  To: Stefan Richter
  Cc: Thomas Gleixner, Mike Galbraith, James Bottomley, Jan Engelhardt,
	Boaz Harrosh, linux-scsi, Linux Kernel Mailing List,
	linux-rt-users, Andrew Morton


* Stefan Richter <stefanr@s5r6.in-berlin.de> wrote:

> Ingo Molnar wrote:
> > * Thomas Gleixner <tglx@linutronix.de> wrote:
> >> On Wed, 4 Mar 2009, Stefan Richter wrote:
> >> > As I said, I recommend that people do not receive those off-topic
> >> > patches by default, only on demand.  (Because I sometimes work with -rt
> >> > users.)
> >> 
> >> And as I said before, I missed to remove the patch before pushing out
> >> the -rt tree. So I really do not need repeated advise on how to do my
> >> work.
> > 
> > yeah. I dont think Stefan ever maintained a high-volume generic 
> > tree before. Kibitzing from the sidelines always makes it seem 
> > so easy ;-)
> 
> You could also take this "I recommend..." not as advice how to 
> do your work but rather as a /user request/ regarding the 
> outcome of your work.

Oh, so you use -tip? I didnt know that ... never got any 
feedback from you i guess it's trouble-free. So you like its 
stability?

	Ingo

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Large amount of scsi-sgpool objects
  2009-03-04 11:47                                   ` Ingo Molnar
@ 2009-03-04 12:02                                     ` Stefan Richter
  0 siblings, 0 replies; 66+ messages in thread
From: Stefan Richter @ 2009-03-04 12:02 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Thomas Gleixner, Mike Galbraith, James Bottomley, Jan Engelhardt,
	Boaz Harrosh, linux-scsi, Linux Kernel Mailing List,
	linux-rt-users, Andrew Morton

Ingo Molnar wrote:
> Oh, so you use -tip?

As I wrote I work with people who use -rt.
That's the extent to which I personally "use" -tip and it's the sole
reason why I comment here.
-- 
Stefan Richter
-=====-==--= --== --=--
http://arcgraph.de/sr/

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [BUG] 2.6.29-rc6-2450cf in scsi_lib.c (was: Large amount of scsi-sgpool)objects
  2009-03-04  2:01               ` Thomas Gleixner
@ 2009-03-04 18:55                 ` James Bottomley
  2009-03-04 21:45                 ` Thomas Gleixner
  1 sibling, 0 replies; 66+ messages in thread
From: James Bottomley @ 2009-03-04 18:55 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Jan Engelhardt, Boaz Harrosh, linux-scsi,
	Linux Kernel Mailing List, linux-ide, FUJITA Tomonori

On Wed, 2009-03-04 at 03:01 +0100, Thomas Gleixner wrote: 
> On Tue, 3 Mar 2009, James Bottomley wrote:
> 
> > On Tue, 2009-03-03 at 23:07 +0100, Thomas Gleixner wrote:
> > > On Tue, 3 Mar 2009, Thomas Gleixner wrote:
> > > > My bad. I was playing with that to get rid of the aic7xxx wreckage on
> > > > one of my test boxen and forgot to remove it.
> > > 
> > > While the one below is definitey not my fault. It's on Linus latest:
> > > 
> > >  commit 2450cf51a1bdba7037e91b1bcc494b01c58aaf66
> > > 
> > > While compiling a kernel I triggerred the BUG below. Not so nice as it
> > > took a whole filesystem with it. fsck took more than 20 min to recover
> > > the leftovers :(
> > > 
> > > Thanks,
> > > 
> > > 	tglx
> > > 
> > > 
> > > ------------[ cut here ]------------
> > > kernel BUG at /home/tglx/work/kernel/git/linux-2.6/drivers/scsi/scsi_lib.c:1141!
> > 
> > This is BUG_ON(count > sdb->table.nents);
> > 
> > It looks like the sg list got split and grew in size ... I suspect this
> > might be libata related, so cc'ing the ide list.  I suspect either the
> > block layer initially parametrised this wrongly (tomo bug) or a sg list
> > got split then requeued (something in libata?).
> 
> FYI, after I've lost a full day of work including the results of four
> "iozone -a -g 4G" runs I tried to reproduce the problem on that
> machine - the leftovers of the filesystem are pretty useless anyway.
> 
> It took about 2hrs to trigger the bug again. Same back trace.
> 
> Anything I can do what might help to decode the problem ?

I discussed this with Fujita Tomonori ... we think it's probably in the
generic block merging code. 

Could you run with this debugging code added until the fault triggers so
we can get an exact view of what the layout of the request is and why
we're getting an extra segment on mapping?

Thanks,

James

P.S. I think if you take the BUG() statement out, as long as it's only
one segment over, the machine should stay up long enough for a clean
shutdown.

---

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 940dc32..5219153 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1139,7 +1139,33 @@ static int scsi_init_sgtable(struct request *req, struct scsi_data_buffer *sdb,
 	 * each segment.
 	 */
 	count = blk_rq_map_sg(req->q, req, sdb->table.sgl);
-	BUG_ON(count > sdb->table.nents);
+	if (unlikely(count > sdb->table.nents)) {
+		struct bio_vec *bvec;
+		struct req_iterator iter;
+		struct scatterlist *sg;
+		int i=0;
+
+		printk(KERN_ERR "MAPPING miscount %d phys maps to %d\n",
+		       sdb->table.nents, count);
+		blk_dump_rq_flags(req, "Request Flags");
+		
+		printk("DUMPING REQUEST LIST:\n");
+		rq_for_each_segment(bvec, req, iter) {
+			printk("[%d]: phys 0x%lx len 0x%x\n", i,
+			       (unsigned long)page_to_phys(bvec->bv_page) + bvec->bv_offset,
+			       bvec->bv_len);
+			i++;
+		}
+		printk("DUMPING MAPPED LIST:\n");
+		for_each_sg(sdb->table.sgl, sg, count, i) {
+			printk("[%d]: phys 0x%lx len 0x%x\n", i,
+			       (unsigned long)page_to_phys(sg_page(sg)) + sg->offset,
+			       sg->length);
+		}
+		BUG();
+	}
+			
+
 	sdb->table.nents = count;
 	if (blk_pc_request(req))
 		sdb->length = req->data_len;



^ permalink raw reply related	[flat|nested] 66+ messages in thread

* Re: Large amount of scsi-sgpool objects
  2009-03-04 11:24                               ` Ingo Molnar
  (?)
@ 2009-03-04 19:11                               ` Andrew Morton
  2009-03-04 20:09                                 ` Thomas Gleixner
  -1 siblings, 1 reply; 66+ messages in thread
From: Andrew Morton @ 2009-03-04 19:11 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: stefanr, efault, James.Bottomley, jengelh, bharrosh, linux-scsi,
	linux-kernel, linux-rt-users, tglx

On Wed, 4 Mar 2009 12:24:55 +0100
Ingo Molnar <mingo@elte.hu> wrote:

> FYI, we still have not tracked down the SCSI bug. Latest 
> tip:master is able to boot and work on the affected systems, 
> while the upstream kernel does not even boot because the fix (or 
> the revert, should the fix be deemed unwanted) is stuck in the 
> SCSI tree.

<wakes up>

There are fixes in the scsi tree?

Does the scsi tree fix all the regressions which you guys are seeing?

I must say that it seems to be awfully late in the cycle to have this
amount of breakage remaining in mainline when we know exactly which
patches need to be reverted to unbreak things.


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Large amount of scsi-sgpool objects
  2009-03-04 19:11                               ` Andrew Morton
@ 2009-03-04 20:09                                 ` Thomas Gleixner
  0 siblings, 0 replies; 66+ messages in thread
From: Thomas Gleixner @ 2009-03-04 20:09 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Ingo Molnar, stefanr, efault, James.Bottomley, jengelh, bharrosh,
	linux-scsi, linux-kernel, linux-rt-users

On Wed, 4 Mar 2009, Andrew Morton wrote:
> On Wed, 4 Mar 2009 12:24:55 +0100
> Ingo Molnar <mingo@elte.hu> wrote:
> 
> > FYI, we still have not tracked down the SCSI bug. Latest 
> > tip:master is able to boot and work on the affected systems, 
> > while the upstream kernel does not even boot because the fix (or 
> > the revert, should the fix be deemed unwanted) is stuck in the 
> > SCSI tree.
> 
> <wakes up>
> 
> There are fixes in the scsi tree?
> 
> Does the scsi tree fix all the regressions which you guys are seeing?
> 
> I must say that it seems to be awfully late in the cycle to have this
> amount of breakage remaining in mainline when we know exactly which
> patches need to be reverted to unbreak things.

We are looking at two breakages:

1) the hang which is observed on AIC7xxx, which is identified (Ingo
tracked it down to a bunch of commits). There is a tentative fix right
now, which needs more testing and the ack of James. James has some
objections which stalled the fix, but I did not come around to dig
into this yet as I'm busy with #2.

2) a crash caused by blk_rq_map_sg() using more sg entries than the
code which setup the request estimated. That one is dangerous, it
already trashed a complete filesystem. I'm in the process of providing
the necessary debug data, as I found a way to reproduce the problem
halfsways.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [BUG] 2.6.29-rc6-2450cf in scsi_lib.c (was: Large amount of scsi-sgpool)objects
  2009-03-04  2:01               ` Thomas Gleixner
  2009-03-04 18:55                 ` James Bottomley
@ 2009-03-04 21:45                 ` Thomas Gleixner
  2009-03-04 22:56                   ` James Bottomley
  1 sibling, 1 reply; 66+ messages in thread
From: Thomas Gleixner @ 2009-03-04 21:45 UTC (permalink / raw)
  To: James Bottomley
  Cc: Jan Engelhardt, Boaz Harrosh, linux-scsi,
	Linux Kernel Mailing List, linux-ide

On Wed, 4 Mar 2009, Thomas Gleixner wrote:

Instrumented the code and the result of the failing request is
below. Looks like the function which sets up the request gets
nr_phys_segments wrong by one. 

If you need further trace data feel free to ask.

Thanks,

	tglx

bvprv: bvec_to_phys(bvprv)
bvec:  bvec_to_phys(bvec)
max_segment_size is 65536

scsi_init_sgtable: ISG: nr_seg 9 nents 9

blk_rq_map_sg: MSG: nbytes 4096

blk_rq_map_sg: MSG: new segment 1: bvec 0x6a8cd000

blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6a8cd000 bvec 0x6a8ce000 sg->length 4096 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6a8ce000 bvec 0x6a8cf000 sg->length 8192 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6a8cf000 bvec 0x6a8d0000 sg->length 12288 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6a8d0000 bvec 0x6a8d1000 sg->length 16384 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6a8d1000 bvec 0x6a8d2000 sg->length 20480 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6a8d2000 bvec 0x6a8d3000 sg->length 24576 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6a8d3000 bvec 0x6a8d4000 sg->length 28672 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6a8d4000 bvec 0x6a8d5000 sg->length 32768 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6a8d5000 bvec 0x6a8d6000 sg->length 36864 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6a8d6000 bvec 0x6a8d7000 sg->length 40960 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6a8d7000 bvec 0x6a8d8000 sg->length 45056 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6a8d8000 bvec 0x6a8d9000 sg->length 49152 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6a8d9000 bvec 0x6a8da000 sg->length 53248 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6a8da000 bvec 0x6a8db000 sg->length 57344 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6a8db000 bvec 0x6a8dc000 sg->length 61440 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6a8dc000 bvec 0x6a8dd000 sg->length 65536 

blk_rq_map_sg: MSG: new segment 2: bvec 0x6a8dd000

blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6a8dd000 bvec 0x6a8de000 sg->length 4096 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6a8de000 bvec 0x6a8df000 sg->length 8192 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6a8df000 bvec 0x6a8e0000 sg->length 12288 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6a8e0000 bvec 0x6a8e1000 sg->length 16384 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6a8e1000 bvec 0x6a8e2000 sg->length 20480 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6a8e2000 bvec 0x6a8e3000 sg->length 24576 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6a8e3000 bvec 0x6a8e4000 sg->length 28672 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6a8e4000 bvec 0x6a8e5000 sg->length 32768 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6a8e5000 bvec 0x6a8e6000 sg->length 36864 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6a8e6000 bvec 0x6a8e7000 sg->length 40960 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6a8e7000 bvec 0x6a8e8000 sg->length 45056 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6a8e8000 bvec 0x6a8e9000 sg->length 49152 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6a8e9000 bvec 0x6a8ea000 sg->length 53248 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6a8ea000 bvec 0x6a8eb000 sg->length 57344 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6a8eb000 bvec 0x6a8ec000 sg->length 61440 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6a8ec000 bvec 0x6a8ed000 sg->length 65536 

blk_rq_map_sg: MSG: new segment 3: bvec 0x6a8ed000

blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6a8ed000 bvec 0x6a8ee000 sg->length 4096 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6a8ee000 bvec 0x6a8ef000 sg->length 8192 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6a8ef000 bvec 0x6a8f0000 sg->length 12288 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6a8f0000 bvec 0x6a8f1000 sg->length 16384 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6a8f1000 bvec 0x6a8f2000 sg->length 20480 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6a8f2000 bvec 0x6a8f3000 sg->length 24576 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6a8f3000 bvec 0x6a8f4000 sg->length 28672 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6a8f4000 bvec 0x6a8f5000 sg->length 32768 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6a8f5000 bvec 0x6a8f6000 sg->length 36864 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6a8f6000 bvec 0x6a8f7000 sg->length 40960 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6a8f7000 bvec 0x6a8f8000 sg->length 45056 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6a8f8000 bvec 0x6a8f9000 sg->length 49152 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6a8f9000 bvec 0x6a8fa000 sg->length 53248 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6a8fa000 bvec 0x6a8fb000 sg->length 57344 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6a8fb000 bvec 0x6a8fc000 sg->length 61440 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6a8fc000 bvec 0x6a8fd000 sg->length 65536 

blk_rq_map_sg: MSG: new segment 4: bvec 0x6a8fd000

blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6a8fd000 bvec 0x6a8fe000 sg->length 4096 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6a8fe000 bvec 0x6a8ff000 sg->length 8192 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6a8ff000 bvec 0x6ac80000 sg->length 12288 

blk_rq_map_sg: MSG: new segment 5: bvec 0x6ac80000

blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6ac80000 bvec 0x6ac81000 sg->length 4096 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6ac81000 bvec 0x6ac82000 sg->length 8192 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6ac82000 bvec 0x6ac83000 sg->length 12288 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6ac83000 bvec 0x6ac84000 sg->length 16384 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6ac84000 bvec 0x6ac85000 sg->length 20480 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6ac85000 bvec 0x6ac86000 sg->length 24576 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6ac86000 bvec 0x6ac87000 sg->length 28672 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6ac87000 bvec 0x6ac88000 sg->length 32768 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6ac88000 bvec 0x6ac89000 sg->length 36864 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6ac89000 bvec 0x6ac8a000 sg->length 40960 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6ac8a000 bvec 0x6ac8b000 sg->length 45056 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6ac8b000 bvec 0x6ac8c000 sg->length 49152 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6ac8c000 bvec 0x6accd000 sg->length 53248 

blk_rq_map_sg: MSG: new segment 6: bvec 0x6accd000

blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6accd000 bvec 0x6acce000 sg->length 4096 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6acce000 bvec 0x6accf000 sg->length 8192 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6accf000 bvec 0x6acd0000 sg->length 12288 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6acd0000 bvec 0x6acd1000 sg->length 16384 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6acd1000 bvec 0x6acd2000 sg->length 20480 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6acd2000 bvec 0x6acd3000 sg->length 24576 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6acd3000 bvec 0x6acd4000 sg->length 28672 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6acd4000 bvec 0x6acd5000 sg->length 32768 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6acd5000 bvec 0x6acd6000 sg->length 36864 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6acd6000 bvec 0x6acd7000 sg->length 40960 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6acd7000 bvec 0x6acd8000 sg->length 45056 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6acd8000 bvec 0x6acd9000 sg->length 49152 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6acd9000 bvec 0x6acda000 sg->length 53248 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6acda000 bvec 0x6acdb000 sg->length 57344 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6acdb000 bvec 0x6acdc000 sg->length 61440 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6acdc000 bvec 0x6acdd000 sg->length 65536 

blk_rq_map_sg: MSG: new segment 7: bvec 0x6acdd000

blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6acdd000 bvec 0x6acde000 sg->length 4096 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6acde000 bvec 0x6acdf000 sg->length 8192 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6acdf000 bvec 0x6ace0000 sg->length 12288 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6ace0000 bvec 0x6ace1000 sg->length 16384 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6ace1000 bvec 0x6ace2000 sg->length 20480 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6ace2000 bvec 0x6ace3000 sg->length 24576 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6ace3000 bvec 0x6ace4000 sg->length 28672 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6ace4000 bvec 0x6ace5000 sg->length 32768 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6ace5000 bvec 0x6ace6000 sg->length 36864 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6ace6000 bvec 0x6ace7000 sg->length 40960 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6ace7000 bvec 0x6ace8000 sg->length 45056 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6ace8000 bvec 0x6ace9000 sg->length 49152 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6ace9000 bvec 0x6acea000 sg->length 53248 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6acea000 bvec 0x6aceb000 sg->length 57344 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6aceb000 bvec 0x6acec000 sg->length 61440 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6acec000 bvec 0x6aced000 sg->length 65536 

blk_rq_map_sg: MSG: new segment 8: bvec 0x6aced000

blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6aced000 bvec 0x6acee000 sg->length 4096 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6acee000 bvec 0x6acef000 sg->length 8192 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6acef000 bvec 0x6acf0000 sg->length 12288 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6acf0000 bvec 0x6acf1000 sg->length 16384 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6acf1000 bvec 0x6acf2000 sg->length 20480 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6acf2000 bvec 0x6acf3000 sg->length 24576 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6acf3000 bvec 0x6acf4000 sg->length 28672 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6acf4000 bvec 0x6acf5000 sg->length 32768 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6acf5000 bvec 0x6acf6000 sg->length 36864 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6acf6000 bvec 0x6acf7000 sg->length 40960 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6acf7000 bvec 0x6acf8000 sg->length 45056 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6acf8000 bvec 0x6acf9000 sg->length 49152 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6acf9000 bvec 0x6acfa000 sg->length 53248 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6acfa000 bvec 0x6acfb000 sg->length 57344 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6acfb000 bvec 0x6acfc000 sg->length 61440 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6acfc000 bvec 0x6acfd000 sg->length 65536 

blk_rq_map_sg: MSG: new segment 9: bvec 0x6acfd000

blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6acfd000 bvec 0x6acfe000 sg->length 4096 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6acfe000 bvec 0x6acff000 sg->length 8192 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6acff000 bvec 0x6b080000 sg->length 12288 

blk_rq_map_sg: MSG: new segment 10: bvec 0x6b080000

blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6b080000 bvec 0x6b081000 sg->length 4096 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6b081000 bvec 0x6b082000 sg->length 8192 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6b082000 bvec 0x6b083000 sg->length 12288 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6b083000 bvec 0x6b084000 sg->length 16384 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6b084000 bvec 0x6b085000 sg->length 20480 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6b085000 bvec 0x6b086000 sg->length 24576 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6b086000 bvec 0x6b087000 sg->length 28672 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6b087000 bvec 0x6b088000 sg->length 32768 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6b088000 bvec 0x6b089000 sg->length 36864 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6b089000 bvec 0x6b08a000 sg->length 40960 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6b08a000 bvec 0x6b08b000 sg->length 45056 
blk_rq_map_sg: MSG: nbytes 4096
blk_rq_map_sg: MSG: bvprv 0x6b08b000 bvec 0x6b08c000 sg->length 49152 

scsi_init_sgtable: ISG: count 10

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [BUG] 2.6.29-rc6-2450cf in scsi_lib.c (was: Large amount of scsi-sgpool)objects
  2009-03-04 21:45                 ` Thomas Gleixner
@ 2009-03-04 22:56                   ` James Bottomley
  2009-03-05  0:13                     ` James Bottomley
  2009-03-05  8:36                     ` FUJITA Tomonori
  0 siblings, 2 replies; 66+ messages in thread
From: James Bottomley @ 2009-03-04 22:56 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Jan Engelhardt, Boaz Harrosh, linux-scsi,
	Linux Kernel Mailing List, linux-ide

On Wed, 2009-03-04 at 22:45 +0100, Thomas Gleixner wrote:
> On Wed, 4 Mar 2009, Thomas Gleixner wrote:
> 
> Instrumented the code and the result of the failing request is
> below. Looks like the function which sets up the request gets
> nr_phys_segments wrong by one. 
> 
> If you need further trace data feel free to ask.

OK, the mapping all checks out correctly ... there must be something
wrong with the way we count before mapping.

If you're tracing everything, could you add these static prints to the
trace ... they'll trigger a lot, but capturing how they applied to the
failing request might tell us why the count is wrong.

Thanks,

James

---

diff --git a/block/blk-merge.c b/block/blk-merge.c
index a104593..a529cba 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -127,21 +127,29 @@ static int blk_phys_contig_segment(struct request_queue *q, struct bio *bio,
 		return 0;
 
 	if (bio->bi_seg_back_size + nxt->bi_seg_front_size >
-	    q->max_segment_size)
+	    q->max_segment_size) {
+		printk("Refusing contig merge, over segment size\n");
 		return 0;
+	}
 
-	if (!bio_has_data(bio))
+	if (!bio_has_data(bio)) {
+		printk("Allowing contig merge, bio has no data\n");
 		return 1;
+	}
 
-	if (!BIOVEC_PHYS_MERGEABLE(__BVEC_END(bio), __BVEC_START(nxt)))
+	if (!BIOVEC_PHYS_MERGEABLE(__BVEC_END(bio), __BVEC_START(nxt))) {
+		printk("Refusing contig merge, bio not phys mergeable\n");
 		return 0;
+	}
 
 	/*
 	 * bio and nxt are contiguous in memory; check if the queue allows
 	 * these two to be merged into one
 	 */
-	if (BIO_SEG_BOUNDARY(q, bio, nxt))
+	if (BIO_SEG_BOUNDARY(q, bio, nxt)) {
+		printk("Allowing contig merge, not across segment boundary\n");
 		return 1;
+	}
 
 	return 0;
 }
@@ -325,6 +333,12 @@ static int ll_merge_requests_fn(struct request_queue *q, struct request *req,
 	if ((req->nr_sectors + next->nr_sectors) > q->max_sectors)
 		return 0;
 
+	printk("Merging end %lx (segs %d) with beginning %lx (segs %d)\n",
+	       bvec_to_phys(__BVEC_END(req->biotail)), req->nr_phys_segments,
+	       bvec_to_phys(__BVEC_START(next->bio)), next->nr_phys_segments);
+	printk("Front size is %d, back size is %d\n",
+	       next->bio->bi_seg_front_size, req->biotail->bi_seg_back_size);
+
 	total_phys_segments = req->nr_phys_segments + next->nr_phys_segments;
 	if (blk_phys_contig_segment(q, req->biotail, next->bio)) {
 		if (req->nr_phys_segments == 1)



^ permalink raw reply related	[flat|nested] 66+ messages in thread

* Re: [BUG] 2.6.29-rc6-2450cf in scsi_lib.c (was: Large amount of scsi-sgpool)objects
  2009-03-04 22:56                   ` James Bottomley
@ 2009-03-05  0:13                     ` James Bottomley
  2009-03-05  8:36                     ` FUJITA Tomonori
  1 sibling, 0 replies; 66+ messages in thread
From: James Bottomley @ 2009-03-05  0:13 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Jan Engelhardt, Boaz Harrosh, linux-scsi,
	Linux Kernel Mailing List, linux-ide

On Wed, 2009-03-04 at 22:56 +0000, James Bottomley wrote:
> On Wed, 2009-03-04 at 22:45 +0100, Thomas Gleixner wrote:
> > On Wed, 4 Mar 2009, Thomas Gleixner wrote:
> > 
> > Instrumented the code and the result of the failing request is
> > below. Looks like the function which sets up the request gets
> > nr_phys_segments wrong by one. 
> > 
> > If you need further trace data feel free to ask.
> 
> OK, the mapping all checks out correctly ... there must be something
> wrong with the way we count before mapping.
> 
> If you're tracing everything, could you add these static prints to the
> trace ... they'll trigger a lot, but capturing how they applied to the
> failing request might tell us why the count is wrong.

Debugging this on IRC, this is the point we reached:

ftrace debugging patch:

http://tglx.de/~tglx/dbg.patch

We're tracing both blk_recalc_rq_segments() and
blk_phys_contig_segment()

The results are here:

http://tglx.de/~tglx/t.txt.bz2

Although what they show is that we're missing the point where the
counting goes wrong (blk_recalc_rq_segments only goes up to 5 max).

James



^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [BUG] 2.6.29-rc6-2450cf in scsi_lib.c (was: Large amount of scsi-sgpool)objects
  2009-03-04 22:56                   ` James Bottomley
  2009-03-05  0:13                     ` James Bottomley
@ 2009-03-05  8:36                     ` FUJITA Tomonori
  2009-03-05  8:39                       ` FUJITA Tomonori
  1 sibling, 1 reply; 66+ messages in thread
From: FUJITA Tomonori @ 2009-03-05  8:36 UTC (permalink / raw)
  To: James.Bottomley
  Cc: tglx, jengelh, bharrosh, linux-scsi, linux-kernel, linux-ide

CC'ed Jens,

On Wed, 04 Mar 2009 22:56:29 +0000
James Bottomley <James.Bottomley@HansenPartnership.com> wrote:

> On Wed, 2009-03-04 at 22:45 +0100, Thomas Gleixner wrote:
> > On Wed, 4 Mar 2009, Thomas Gleixner wrote:
> > 
> > Instrumented the code and the result of the failing request is
> > below. Looks like the function which sets up the request gets
> > nr_phys_segments wrong by one. 
> > 
> > If you need further trace data feel free to ask.
> 
> OK, the mapping all checks out correctly ... there must be something
> wrong with the way we count before mapping.

Yeah, looks we miscalculate nr_phys_segments in the merging path.

blk_recount_segments() needs to set bi_seg_front_size and
bi_seg_back_size for ll_merge_requests_fn()?

=
diff --git a/block/blk-merge.c b/block/blk-merge.c
index a104593..efb65b6 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -111,12 +111,19 @@ void blk_recalc_rq_segments(struct request *rq)
 
 void blk_recount_segments(struct request_queue *q, struct bio *bio)
 {
+	unsigned int seg_size;
 	struct bio *nxt = bio->bi_next;
 
 	bio->bi_next = NULL;
-	bio->bi_phys_segments = __blk_recalc_rq_segments(q, bio, NULL);
+	bio->bi_phys_segments = __blk_recalc_rq_segments(q, bio, &seg_size);
 	bio->bi_next = nxt;
 	bio->bi_flags |= (1 << BIO_SEG_VALID);
+
+	if (bio->bi_phys_segments == 1 && seg_size > bio->bi_seg_front_size)
+		bio->bi_seg_front_size = seg_size;
+	if (bio->bi_phys_segments > bio->bi_seg_back_size)
+		bio->bi_seg_back_size = seg_size;
+
 }
 EXPORT_SYMBOL(blk_recount_segments);
 

^ permalink raw reply related	[flat|nested] 66+ messages in thread

* Re: [BUG] 2.6.29-rc6-2450cf in scsi_lib.c (was: Large amount of scsi-sgpool)objects
  2009-03-05  8:36                     ` FUJITA Tomonori
@ 2009-03-05  8:39                       ` FUJITA Tomonori
  2009-03-05  9:29                         ` FUJITA Tomonori
  0 siblings, 1 reply; 66+ messages in thread
From: FUJITA Tomonori @ 2009-03-05  8:39 UTC (permalink / raw)
  To: tglx
  Cc: James.Bottomley, jengelh, bharrosh, linux-scsi, linux-kernel, linux-ide

On Thu, 5 Mar 2009 17:36:13 +0900
FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> wrote:

> CC'ed Jens,
> 
> On Wed, 04 Mar 2009 22:56:29 +0000
> James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
> 
> > On Wed, 2009-03-04 at 22:45 +0100, Thomas Gleixner wrote:
> > > On Wed, 4 Mar 2009, Thomas Gleixner wrote:
> > > 
> > > Instrumented the code and the result of the failing request is
> > > below. Looks like the function which sets up the request gets
> > > nr_phys_segments wrong by one. 
> > > 
> > > If you need further trace data feel free to ask.
> > 
> > OK, the mapping all checks out correctly ... there must be something
> > wrong with the way we count before mapping.
> 
> Yeah, looks we miscalculate nr_phys_segments in the merging path.
> 
> blk_recount_segments() needs to set bi_seg_front_size and
> bi_seg_back_size for ll_merge_requests_fn()?
> 
> =
> diff --git a/block/blk-merge.c b/block/blk-merge.c
> index a104593..efb65b6 100644
> --- a/block/blk-merge.c
> +++ b/block/blk-merge.c
> @@ -111,12 +111,19 @@ void blk_recalc_rq_segments(struct request *rq)
>  
>  void blk_recount_segments(struct request_queue *q, struct bio *bio)
>  {
> +	unsigned int seg_size;
>  	struct bio *nxt = bio->bi_next;
>  
>  	bio->bi_next = NULL;
> -	bio->bi_phys_segments = __blk_recalc_rq_segments(q, bio, NULL);
> +	bio->bi_phys_segments = __blk_recalc_rq_segments(q, bio, &seg_size);
>  	bio->bi_next = nxt;
>  	bio->bi_flags |= (1 << BIO_SEG_VALID);
> +
> +	if (bio->bi_phys_segments == 1 && seg_size > bio->bi_seg_front_size)
> +		bio->bi_seg_front_size = seg_size;
> +	if (bio->bi_phys_segments > bio->bi_seg_back_size)
> +		bio->bi_seg_back_size = seg_size;
> +
>  }
>  EXPORT_SYMBOL(blk_recount_segments);

Duh, here's the proper patch.

diff --git a/block/blk-merge.c b/block/blk-merge.c
index a104593..06e0db4 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -111,12 +111,19 @@ void blk_recalc_rq_segments(struct request *rq)
 
 void blk_recount_segments(struct request_queue *q, struct bio *bio)
 {
+	unsigned int seg_size;
 	struct bio *nxt = bio->bi_next;
 
 	bio->bi_next = NULL;
-	bio->bi_phys_segments = __blk_recalc_rq_segments(q, bio, NULL);
+	bio->bi_phys_segments = __blk_recalc_rq_segments(q, bio, &seg_size);
 	bio->bi_next = nxt;
 	bio->bi_flags |= (1 << BIO_SEG_VALID);
+
+	if (bio->bi_phys_segments == 1 && seg_size > bio->bi_seg_front_size)
+		bio->bi_seg_front_size = seg_size;
+	if (seg_size > bio->bi_seg_back_size)
+		bio->bi_seg_back_size = seg_size;
+
 }
 EXPORT_SYMBOL(blk_recount_segments);
 

^ permalink raw reply related	[flat|nested] 66+ messages in thread

* Re: [BUG] 2.6.29-rc6-2450cf in scsi_lib.c (was: Large amount of scsi-sgpool)objects
  2009-03-05  8:39                       ` FUJITA Tomonori
@ 2009-03-05  9:29                         ` FUJITA Tomonori
  2009-03-05 10:09                           ` Jens Axboe
  0 siblings, 1 reply; 66+ messages in thread
From: FUJITA Tomonori @ 2009-03-05  9:29 UTC (permalink / raw)
  To: jens.axboe
  Cc: tglx, James.Bottomley, jengelh, bharrosh, linux-scsi,
	linux-kernel, linux-ide

Oops, somehow I forgot to CC Jens...

On Thu, 5 Mar 2009 17:39:17 +0900
FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> wrote:

> On Thu, 5 Mar 2009 17:36:13 +0900
> FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> wrote:
> 
> > CC'ed Jens,
> > 
> > On Wed, 04 Mar 2009 22:56:29 +0000
> > James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
> > 
> > > On Wed, 2009-03-04 at 22:45 +0100, Thomas Gleixner wrote:
> > > > On Wed, 4 Mar 2009, Thomas Gleixner wrote:
> > > > 
> > > > Instrumented the code and the result of the failing request is
> > > > below. Looks like the function which sets up the request gets
> > > > nr_phys_segments wrong by one. 
> > > > 
> > > > If you need further trace data feel free to ask.
> > > 
> > > OK, the mapping all checks out correctly ... there must be something
> > > wrong with the way we count before mapping.
> > 
> > Yeah, looks we miscalculate nr_phys_segments in the merging path.
> > 
> > blk_recount_segments() needs to set bi_seg_front_size and
> > bi_seg_back_size for ll_merge_requests_fn()?
> > 
> > =
> > diff --git a/block/blk-merge.c b/block/blk-merge.c
> > index a104593..efb65b6 100644
> > --- a/block/blk-merge.c
> > +++ b/block/blk-merge.c
> > @@ -111,12 +111,19 @@ void blk_recalc_rq_segments(struct request *rq)
> >  
> >  void blk_recount_segments(struct request_queue *q, struct bio *bio)
> >  {
> > +	unsigned int seg_size;
> >  	struct bio *nxt = bio->bi_next;
> >  
> >  	bio->bi_next = NULL;
> > -	bio->bi_phys_segments = __blk_recalc_rq_segments(q, bio, NULL);
> > +	bio->bi_phys_segments = __blk_recalc_rq_segments(q, bio, &seg_size);
> >  	bio->bi_next = nxt;
> >  	bio->bi_flags |= (1 << BIO_SEG_VALID);
> > +
> > +	if (bio->bi_phys_segments == 1 && seg_size > bio->bi_seg_front_size)
> > +		bio->bi_seg_front_size = seg_size;
> > +	if (bio->bi_phys_segments > bio->bi_seg_back_size)
> > +		bio->bi_seg_back_size = seg_size;
> > +
> >  }
> >  EXPORT_SYMBOL(blk_recount_segments);
> 
> Duh, here's the proper patch.
> 
> diff --git a/block/blk-merge.c b/block/blk-merge.c
> index a104593..06e0db4 100644
> --- a/block/blk-merge.c
> +++ b/block/blk-merge.c
> @@ -111,12 +111,19 @@ void blk_recalc_rq_segments(struct request *rq)
>  
>  void blk_recount_segments(struct request_queue *q, struct bio *bio)
>  {
> +	unsigned int seg_size;
>  	struct bio *nxt = bio->bi_next;
>  
>  	bio->bi_next = NULL;
> -	bio->bi_phys_segments = __blk_recalc_rq_segments(q, bio, NULL);
> +	bio->bi_phys_segments = __blk_recalc_rq_segments(q, bio, &seg_size);
>  	bio->bi_next = nxt;
>  	bio->bi_flags |= (1 << BIO_SEG_VALID);
> +
> +	if (bio->bi_phys_segments == 1 && seg_size > bio->bi_seg_front_size)
> +		bio->bi_seg_front_size = seg_size;
> +	if (seg_size > bio->bi_seg_back_size)
> +		bio->bi_seg_back_size = seg_size;
> +
>  }
>  EXPORT_SYMBOL(blk_recount_segments);
>  
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [BUG] 2.6.29-rc6-2450cf in scsi_lib.c (was: Large amount of scsi-sgpool)objects
  2009-03-05  9:29                         ` FUJITA Tomonori
@ 2009-03-05 10:09                           ` Jens Axboe
  2009-03-05 10:14                             ` Jens Axboe
  2009-03-05 10:15                             ` Ingo Molnar
  0 siblings, 2 replies; 66+ messages in thread
From: Jens Axboe @ 2009-03-05 10:09 UTC (permalink / raw)
  To: FUJITA Tomonori
  Cc: tglx, James.Bottomley, jengelh, bharrosh, linux-scsi,
	linux-kernel, linux-ide

On Thu, Mar 05 2009, FUJITA Tomonori wrote:
> Oops, somehow I forgot to CC Jens...
> 
> On Thu, 5 Mar 2009 17:39:17 +0900
> FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> wrote:
> 
> > On Thu, 5 Mar 2009 17:36:13 +0900
> > FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> wrote:
> > 
> > > CC'ed Jens,
> > > 
> > > On Wed, 04 Mar 2009 22:56:29 +0000
> > > James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
> > > 
> > > > On Wed, 2009-03-04 at 22:45 +0100, Thomas Gleixner wrote:
> > > > > On Wed, 4 Mar 2009, Thomas Gleixner wrote:
> > > > > 
> > > > > Instrumented the code and the result of the failing request is
> > > > > below. Looks like the function which sets up the request gets
> > > > > nr_phys_segments wrong by one. 
> > > > > 
> > > > > If you need further trace data feel free to ask.
> > > > 
> > > > OK, the mapping all checks out correctly ... there must be something
> > > > wrong with the way we count before mapping.
> > > 
> > > Yeah, looks we miscalculate nr_phys_segments in the merging path.
> > > 
> > > blk_recount_segments() needs to set bi_seg_front_size and
> > > bi_seg_back_size for ll_merge_requests_fn()?
> > > 
> > > =
> > > diff --git a/block/blk-merge.c b/block/blk-merge.c
> > > index a104593..efb65b6 100644
> > > --- a/block/blk-merge.c
> > > +++ b/block/blk-merge.c
> > > @@ -111,12 +111,19 @@ void blk_recalc_rq_segments(struct request *rq)
> > >  
> > >  void blk_recount_segments(struct request_queue *q, struct bio *bio)
> > >  {
> > > +	unsigned int seg_size;
> > >  	struct bio *nxt = bio->bi_next;
> > >  
> > >  	bio->bi_next = NULL;
> > > -	bio->bi_phys_segments = __blk_recalc_rq_segments(q, bio, NULL);
> > > +	bio->bi_phys_segments = __blk_recalc_rq_segments(q, bio, &seg_size);
> > >  	bio->bi_next = nxt;
> > >  	bio->bi_flags |= (1 << BIO_SEG_VALID);
> > > +
> > > +	if (bio->bi_phys_segments == 1 && seg_size > bio->bi_seg_front_size)
> > > +		bio->bi_seg_front_size = seg_size;
> > > +	if (bio->bi_phys_segments > bio->bi_seg_back_size)
> > > +		bio->bi_seg_back_size = seg_size;
> > > +
> > >  }
> > >  EXPORT_SYMBOL(blk_recount_segments);
> > 
> > Duh, here's the proper patch.
> > 
> > diff --git a/block/blk-merge.c b/block/blk-merge.c
> > index a104593..06e0db4 100644
> > --- a/block/blk-merge.c
> > +++ b/block/blk-merge.c
> > @@ -111,12 +111,19 @@ void blk_recalc_rq_segments(struct request *rq)
> >  
> >  void blk_recount_segments(struct request_queue *q, struct bio *bio)
> >  {
> > +	unsigned int seg_size;
> >  	struct bio *nxt = bio->bi_next;
> >  
> >  	bio->bi_next = NULL;
> > -	bio->bi_phys_segments = __blk_recalc_rq_segments(q, bio, NULL);
> > +	bio->bi_phys_segments = __blk_recalc_rq_segments(q, bio, &seg_size);
> >  	bio->bi_next = nxt;
> >  	bio->bi_flags |= (1 << BIO_SEG_VALID);
> > +
> > +	if (bio->bi_phys_segments == 1 && seg_size > bio->bi_seg_front_size)
> > +		bio->bi_seg_front_size = seg_size;
> > +	if (seg_size > bio->bi_seg_back_size)
> > +		bio->bi_seg_back_size = seg_size;
> > +
> >  }
> >  EXPORT_SYMBOL(blk_recount_segments);

Good catch, I merged it with a slight change of layout and clearing
seg_size initially, to avoid gcc silly errors.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [BUG] 2.6.29-rc6-2450cf in scsi_lib.c (was: Large amount of scsi-sgpool)objects
  2009-03-05 10:09                           ` Jens Axboe
@ 2009-03-05 10:14                             ` Jens Axboe
  2009-03-05 10:27                               ` FUJITA Tomonori
  2009-03-05 10:15                             ` Ingo Molnar
  1 sibling, 1 reply; 66+ messages in thread
From: Jens Axboe @ 2009-03-05 10:14 UTC (permalink / raw)
  To: FUJITA Tomonori
  Cc: tglx, James.Bottomley, jengelh, bharrosh, linux-scsi,
	linux-kernel, linux-ide

On Thu, Mar 05 2009, Jens Axboe wrote:
> On Thu, Mar 05 2009, FUJITA Tomonori wrote:
> > Oops, somehow I forgot to CC Jens...
> > 
> > On Thu, 5 Mar 2009 17:39:17 +0900
> > FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> wrote:
> > 
> > > On Thu, 5 Mar 2009 17:36:13 +0900
> > > FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> wrote:
> > > 
> > > > CC'ed Jens,
> > > > 
> > > > On Wed, 04 Mar 2009 22:56:29 +0000
> > > > James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
> > > > 
> > > > > On Wed, 2009-03-04 at 22:45 +0100, Thomas Gleixner wrote:
> > > > > > On Wed, 4 Mar 2009, Thomas Gleixner wrote:
> > > > > > 
> > > > > > Instrumented the code and the result of the failing request is
> > > > > > below. Looks like the function which sets up the request gets
> > > > > > nr_phys_segments wrong by one. 
> > > > > > 
> > > > > > If you need further trace data feel free to ask.
> > > > > 
> > > > > OK, the mapping all checks out correctly ... there must be something
> > > > > wrong with the way we count before mapping.
> > > > 
> > > > Yeah, looks we miscalculate nr_phys_segments in the merging path.
> > > > 
> > > > blk_recount_segments() needs to set bi_seg_front_size and
> > > > bi_seg_back_size for ll_merge_requests_fn()?
> > > > 
> > > > =
> > > > diff --git a/block/blk-merge.c b/block/blk-merge.c
> > > > index a104593..efb65b6 100644
> > > > --- a/block/blk-merge.c
> > > > +++ b/block/blk-merge.c
> > > > @@ -111,12 +111,19 @@ void blk_recalc_rq_segments(struct request *rq)
> > > >  
> > > >  void blk_recount_segments(struct request_queue *q, struct bio *bio)
> > > >  {
> > > > +	unsigned int seg_size;
> > > >  	struct bio *nxt = bio->bi_next;
> > > >  
> > > >  	bio->bi_next = NULL;
> > > > -	bio->bi_phys_segments = __blk_recalc_rq_segments(q, bio, NULL);
> > > > +	bio->bi_phys_segments = __blk_recalc_rq_segments(q, bio, &seg_size);
> > > >  	bio->bi_next = nxt;
> > > >  	bio->bi_flags |= (1 << BIO_SEG_VALID);
> > > > +
> > > > +	if (bio->bi_phys_segments == 1 && seg_size > bio->bi_seg_front_size)
> > > > +		bio->bi_seg_front_size = seg_size;
> > > > +	if (bio->bi_phys_segments > bio->bi_seg_back_size)
> > > > +		bio->bi_seg_back_size = seg_size;
> > > > +
> > > >  }
> > > >  EXPORT_SYMBOL(blk_recount_segments);
> > > 
> > > Duh, here's the proper patch.
> > > 
> > > diff --git a/block/blk-merge.c b/block/blk-merge.c
> > > index a104593..06e0db4 100644
> > > --- a/block/blk-merge.c
> > > +++ b/block/blk-merge.c
> > > @@ -111,12 +111,19 @@ void blk_recalc_rq_segments(struct request *rq)
> > >  
> > >  void blk_recount_segments(struct request_queue *q, struct bio *bio)
> > >  {
> > > +	unsigned int seg_size;
> > >  	struct bio *nxt = bio->bi_next;
> > >  
> > >  	bio->bi_next = NULL;
> > > -	bio->bi_phys_segments = __blk_recalc_rq_segments(q, bio, NULL);
> > > +	bio->bi_phys_segments = __blk_recalc_rq_segments(q, bio, &seg_size);
> > >  	bio->bi_next = nxt;
> > >  	bio->bi_flags |= (1 << BIO_SEG_VALID);
> > > +
> > > +	if (bio->bi_phys_segments == 1 && seg_size > bio->bi_seg_front_size)
> > > +		bio->bi_seg_front_size = seg_size;
> > > +	if (seg_size > bio->bi_seg_back_size)
> > > +		bio->bi_seg_back_size = seg_size;
> > > +
> > >  }
> > >  EXPORT_SYMBOL(blk_recount_segments);
> 
> Good catch, I merged it with a slight change of layout and clearing
> seg_size initially, to avoid gcc silly errors.

While merging that, I think we can do better than this. Essentially we
just need to have __blk_recalc_rq_segments() track the back bio as well,
then we don't have to pass in a pointer for segment sizes.

Totally untested, comments welcome...

diff --git a/block/blk-merge.c b/block/blk-merge.c
index 6be3797..5a244f0 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -39,14 +39,13 @@ void blk_recalc_rq_sectors(struct request *rq, int nsect)
 }
 
 static unsigned int __blk_recalc_rq_segments(struct request_queue *q,
-					     struct bio *bio,
-					     unsigned int *seg_size_ptr)
+					     struct bio *bio)
 {
 	unsigned int phys_size;
 	struct bio_vec *bv, *bvprv = NULL;
 	int cluster, i, high, highprv = 1;
 	unsigned int seg_size, nr_phys_segs;
-	struct bio *fbio;
+	struct bio *fbio, *bbio;
 
 	if (!bio)
 		return 0;
@@ -87,41 +86,28 @@ new_segment:
 			seg_size = bv->bv_len;
 			highprv = high;
 		}
+		bbio = bio;
 	}
 
-	if (seg_size_ptr)
-		*seg_size_ptr = seg_size;
+	if (nr_phys_segs == 1 && seg_size > fbio->bi_seg_front_size)
+		fbio->bi_seg_front_size = seg_size;
+	if (seg_size > bbio->bi_seg_back_size)
+		bbio->bi_seg_back_size = seg_size;
 
 	return nr_phys_segs;
 }
 
 void blk_recalc_rq_segments(struct request *rq)
 {
-	unsigned int seg_size = 0, phys_segs;
-
-	phys_segs = __blk_recalc_rq_segments(rq->q, rq->bio, &seg_size);
-
-	if (phys_segs == 1 && seg_size > rq->bio->bi_seg_front_size)
-		rq->bio->bi_seg_front_size = seg_size;
-	if (seg_size > rq->biotail->bi_seg_back_size)
-		rq->biotail->bi_seg_back_size = seg_size;
-
-	rq->nr_phys_segments = phys_segs;
+	rq->nr_phys_segments = __blk_recalc_rq_segments(rq->q, rq->bio);
 }
 
 void blk_recount_segments(struct request_queue *q, struct bio *bio)
 {
 	struct bio *nxt = bio->bi_next;
-	unsigned int seg_size = 0;
 
 	bio->bi_next = NULL;
-	bio->bi_phys_segments = __blk_recalc_rq_segments(q, bio, NULL);
-
-	if (bio->bi_phys_segments == 1 && seg_size > bio->bi_seg_front_size)
-		bio->bi_seg_front_size = seg_size;
-	if (seg_size > bio->bi_seg_back_size)
-		bio->bi_seg_back_size = seg_size;
-
+	bio->bi_phys_segments = __blk_recalc_rq_segments(q, bio);
 	bio->bi_next = nxt;
 	bio->bi_flags |= (1 << BIO_SEG_VALID);
 }

-- 
Jens Axboe


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* Re: [BUG] 2.6.29-rc6-2450cf in scsi_lib.c (was: Large amount of scsi-sgpool)objects
  2009-03-05 10:09                           ` Jens Axboe
  2009-03-05 10:14                             ` Jens Axboe
@ 2009-03-05 10:15                             ` Ingo Molnar
  1 sibling, 0 replies; 66+ messages in thread
From: Ingo Molnar @ 2009-03-05 10:15 UTC (permalink / raw)
  To: Jens Axboe
  Cc: FUJITA Tomonori, tglx, James.Bottomley, jengelh, bharrosh,
	linux-scsi, linux-kernel, linux-ide


* Jens Axboe <jens.axboe@oracle.com> wrote:

> On Thu, Mar 05 2009, FUJITA Tomonori wrote:
> > Oops, somehow I forgot to CC Jens...
> > 
> > On Thu, 5 Mar 2009 17:39:17 +0900
> > FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> wrote:
> > 
> > > On Thu, 5 Mar 2009 17:36:13 +0900
> > > FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> wrote:
> > > 
> > > > CC'ed Jens,
> > > > 
> > > > On Wed, 04 Mar 2009 22:56:29 +0000
> > > > James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
> > > > 
> > > > > On Wed, 2009-03-04 at 22:45 +0100, Thomas Gleixner wrote:
> > > > > > On Wed, 4 Mar 2009, Thomas Gleixner wrote:
> > > > > > 
> > > > > > Instrumented the code and the result of the failing request is
> > > > > > below. Looks like the function which sets up the request gets
> > > > > > nr_phys_segments wrong by one. 
> > > > > > 
> > > > > > If you need further trace data feel free to ask.
> > > > > 
> > > > > OK, the mapping all checks out correctly ... there must be something
> > > > > wrong with the way we count before mapping.
> > > > 
> > > > Yeah, looks we miscalculate nr_phys_segments in the merging path.
> > > > 
> > > > blk_recount_segments() needs to set bi_seg_front_size and
> > > > bi_seg_back_size for ll_merge_requests_fn()?
> > > > 
> > > > =
> > > > diff --git a/block/blk-merge.c b/block/blk-merge.c
> > > > index a104593..efb65b6 100644
> > > > --- a/block/blk-merge.c
> > > > +++ b/block/blk-merge.c
> > > > @@ -111,12 +111,19 @@ void blk_recalc_rq_segments(struct request *rq)
> > > >  
> > > >  void blk_recount_segments(struct request_queue *q, struct bio *bio)
> > > >  {
> > > > +	unsigned int seg_size;
> > > >  	struct bio *nxt = bio->bi_next;
> > > >  
> > > >  	bio->bi_next = NULL;
> > > > -	bio->bi_phys_segments = __blk_recalc_rq_segments(q, bio, NULL);
> > > > +	bio->bi_phys_segments = __blk_recalc_rq_segments(q, bio, &seg_size);
> > > >  	bio->bi_next = nxt;
> > > >  	bio->bi_flags |= (1 << BIO_SEG_VALID);
> > > > +
> > > > +	if (bio->bi_phys_segments == 1 && seg_size > bio->bi_seg_front_size)
> > > > +		bio->bi_seg_front_size = seg_size;
> > > > +	if (bio->bi_phys_segments > bio->bi_seg_back_size)
> > > > +		bio->bi_seg_back_size = seg_size;
> > > > +
> > > >  }
> > > >  EXPORT_SYMBOL(blk_recount_segments);
> > > 
> > > Duh, here's the proper patch.
> > > 
> > > diff --git a/block/blk-merge.c b/block/blk-merge.c
> > > index a104593..06e0db4 100644
> > > --- a/block/blk-merge.c
> > > +++ b/block/blk-merge.c
> > > @@ -111,12 +111,19 @@ void blk_recalc_rq_segments(struct request *rq)
> > >  
> > >  void blk_recount_segments(struct request_queue *q, struct bio *bio)
> > >  {
> > > +	unsigned int seg_size;
> > >  	struct bio *nxt = bio->bi_next;
> > >  
> > >  	bio->bi_next = NULL;
> > > -	bio->bi_phys_segments = __blk_recalc_rq_segments(q, bio, NULL);
> > > +	bio->bi_phys_segments = __blk_recalc_rq_segments(q, bio, &seg_size);
> > >  	bio->bi_next = nxt;
> > >  	bio->bi_flags |= (1 << BIO_SEG_VALID);
> > > +
> > > +	if (bio->bi_phys_segments == 1 && seg_size > bio->bi_seg_front_size)
> > > +		bio->bi_seg_front_size = seg_size;
> > > +	if (seg_size > bio->bi_seg_back_size)
> > > +		bio->bi_seg_back_size = seg_size;
> > > +
> > >  }
> > >  EXPORT_SYMBOL(blk_recount_segments);
> 
> Good catch, I merged it with a slight change of layout and 
> clearing seg_size initially, to avoid gcc silly errors.

thanks Jens for the super-quick action! I have tried the v2 
patch from Fujita and it's OK in -tip testing so far, on various 
pieces of x86 hardware with different storage systems - ata, 
sata, aic7xxx, etc.

Tested-by: Ingo Molnar <mingo@elte.hu>

	Ingo

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [BUG] 2.6.29-rc6-2450cf in scsi_lib.c (was: Large amount of scsi-sgpool)objects
  2009-03-05 10:14                             ` Jens Axboe
@ 2009-03-05 10:27                               ` FUJITA Tomonori
  2009-03-05 10:30                                 ` Jens Axboe
  0 siblings, 1 reply; 66+ messages in thread
From: FUJITA Tomonori @ 2009-03-05 10:27 UTC (permalink / raw)
  To: jens.axboe
  Cc: fujita.tomonori, tglx, James.Bottomley, jengelh, bharrosh,
	linux-scsi, linux-kernel, linux-ide

On Thu, 5 Mar 2009 11:14:36 +0100
Jens Axboe <jens.axboe@oracle.com> wrote:

> On Thu, Mar 05 2009, Jens Axboe wrote:
> > On Thu, Mar 05 2009, FUJITA Tomonori wrote:
> > > Oops, somehow I forgot to CC Jens...
> > > 
> > > On Thu, 5 Mar 2009 17:39:17 +0900
> > > FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> wrote:
> > > 
> > > > On Thu, 5 Mar 2009 17:36:13 +0900
> > > > FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> wrote:
> > > > 
> > > > > CC'ed Jens,
> > > > > 
> > > > > On Wed, 04 Mar 2009 22:56:29 +0000
> > > > > James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
> > > > > 
> > > > > > On Wed, 2009-03-04 at 22:45 +0100, Thomas Gleixner wrote:
> > > > > > > On Wed, 4 Mar 2009, Thomas Gleixner wrote:
> > > > > > > 
> > > > > > > Instrumented the code and the result of the failing request is
> > > > > > > below. Looks like the function which sets up the request gets
> > > > > > > nr_phys_segments wrong by one. 
> > > > > > > 
> > > > > > > If you need further trace data feel free to ask.
> > > > > > 
> > > > > > OK, the mapping all checks out correctly ... there must be something
> > > > > > wrong with the way we count before mapping.
> > > > > 
> > > > > Yeah, looks we miscalculate nr_phys_segments in the merging path.
> > > > > 
> > > > > blk_recount_segments() needs to set bi_seg_front_size and
> > > > > bi_seg_back_size for ll_merge_requests_fn()?
> > > > > 
> > > > > =
> > > > > diff --git a/block/blk-merge.c b/block/blk-merge.c
> > > > > index a104593..efb65b6 100644
> > > > > --- a/block/blk-merge.c
> > > > > +++ b/block/blk-merge.c
> > > > > @@ -111,12 +111,19 @@ void blk_recalc_rq_segments(struct request *rq)
> > > > >  
> > > > >  void blk_recount_segments(struct request_queue *q, struct bio *bio)
> > > > >  {
> > > > > +	unsigned int seg_size;
> > > > >  	struct bio *nxt = bio->bi_next;
> > > > >  
> > > > >  	bio->bi_next = NULL;
> > > > > -	bio->bi_phys_segments = __blk_recalc_rq_segments(q, bio, NULL);
> > > > > +	bio->bi_phys_segments = __blk_recalc_rq_segments(q, bio, &seg_size);
> > > > >  	bio->bi_next = nxt;
> > > > >  	bio->bi_flags |= (1 << BIO_SEG_VALID);
> > > > > +
> > > > > +	if (bio->bi_phys_segments == 1 && seg_size > bio->bi_seg_front_size)
> > > > > +		bio->bi_seg_front_size = seg_size;
> > > > > +	if (bio->bi_phys_segments > bio->bi_seg_back_size)
> > > > > +		bio->bi_seg_back_size = seg_size;
> > > > > +
> > > > >  }
> > > > >  EXPORT_SYMBOL(blk_recount_segments);
> > > > 
> > > > Duh, here's the proper patch.
> > > > 
> > > > diff --git a/block/blk-merge.c b/block/blk-merge.c
> > > > index a104593..06e0db4 100644
> > > > --- a/block/blk-merge.c
> > > > +++ b/block/blk-merge.c
> > > > @@ -111,12 +111,19 @@ void blk_recalc_rq_segments(struct request *rq)
> > > >  
> > > >  void blk_recount_segments(struct request_queue *q, struct bio *bio)
> > > >  {
> > > > +	unsigned int seg_size;
> > > >  	struct bio *nxt = bio->bi_next;
> > > >  
> > > >  	bio->bi_next = NULL;
> > > > -	bio->bi_phys_segments = __blk_recalc_rq_segments(q, bio, NULL);
> > > > +	bio->bi_phys_segments = __blk_recalc_rq_segments(q, bio, &seg_size);
> > > >  	bio->bi_next = nxt;
> > > >  	bio->bi_flags |= (1 << BIO_SEG_VALID);
> > > > +
> > > > +	if (bio->bi_phys_segments == 1 && seg_size > bio->bi_seg_front_size)
> > > > +		bio->bi_seg_front_size = seg_size;
> > > > +	if (seg_size > bio->bi_seg_back_size)
> > > > +		bio->bi_seg_back_size = seg_size;
> > > > +
> > > >  }
> > > >  EXPORT_SYMBOL(blk_recount_segments);
> > 
> > Good catch, I merged it with a slight change of layout and clearing
> > seg_size initially, to avoid gcc silly errors.
> 
> While merging that, I think we can do better than this. Essentially we
> just need to have __blk_recalc_rq_segments() track the back bio as well,
> then we don't have to pass in a pointer for segment sizes.
> 
> Totally untested, comments welcome...

Yeah, I think that updating bi_seg_front_size and bi_seg_back_size at
one place, __blk_recalc_rq_segments, is better. I thought about the
same way. But we are already in -rc7 and this must go into mainline
now. So I chose a less-intrusive way (similar to what we have done in
the past).

As you know, the merging code is really complicated and we could
overlook stuff easily. ;) It might be better to simplify the merging
code a bit.



^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [BUG] 2.6.29-rc6-2450cf in scsi_lib.c (was: Large amount of scsi-sgpool)objects
  2009-03-05 10:27                               ` FUJITA Tomonori
@ 2009-03-05 10:30                                 ` Jens Axboe
  2009-03-05 10:41                                   ` FUJITA Tomonori
  2009-03-05 10:41                                   ` Ingo Molnar
  0 siblings, 2 replies; 66+ messages in thread
From: Jens Axboe @ 2009-03-05 10:30 UTC (permalink / raw)
  To: FUJITA Tomonori
  Cc: tglx, James.Bottomley, jengelh, bharrosh, linux-scsi,
	linux-kernel, linux-ide

On Thu, Mar 05 2009, FUJITA Tomonori wrote:
> On Thu, 5 Mar 2009 11:14:36 +0100
> Jens Axboe <jens.axboe@oracle.com> wrote:
> 
> > On Thu, Mar 05 2009, Jens Axboe wrote:
> > > On Thu, Mar 05 2009, FUJITA Tomonori wrote:
> > > > Oops, somehow I forgot to CC Jens...
> > > > 
> > > > On Thu, 5 Mar 2009 17:39:17 +0900
> > > > FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> wrote:
> > > > 
> > > > > On Thu, 5 Mar 2009 17:36:13 +0900
> > > > > FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> wrote:
> > > > > 
> > > > > > CC'ed Jens,
> > > > > > 
> > > > > > On Wed, 04 Mar 2009 22:56:29 +0000
> > > > > > James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
> > > > > > 
> > > > > > > On Wed, 2009-03-04 at 22:45 +0100, Thomas Gleixner wrote:
> > > > > > > > On Wed, 4 Mar 2009, Thomas Gleixner wrote:
> > > > > > > > 
> > > > > > > > Instrumented the code and the result of the failing request is
> > > > > > > > below. Looks like the function which sets up the request gets
> > > > > > > > nr_phys_segments wrong by one. 
> > > > > > > > 
> > > > > > > > If you need further trace data feel free to ask.
> > > > > > > 
> > > > > > > OK, the mapping all checks out correctly ... there must be something
> > > > > > > wrong with the way we count before mapping.
> > > > > > 
> > > > > > Yeah, looks we miscalculate nr_phys_segments in the merging path.
> > > > > > 
> > > > > > blk_recount_segments() needs to set bi_seg_front_size and
> > > > > > bi_seg_back_size for ll_merge_requests_fn()?
> > > > > > 
> > > > > > =
> > > > > > diff --git a/block/blk-merge.c b/block/blk-merge.c
> > > > > > index a104593..efb65b6 100644
> > > > > > --- a/block/blk-merge.c
> > > > > > +++ b/block/blk-merge.c
> > > > > > @@ -111,12 +111,19 @@ void blk_recalc_rq_segments(struct request *rq)
> > > > > >  
> > > > > >  void blk_recount_segments(struct request_queue *q, struct bio *bio)
> > > > > >  {
> > > > > > +	unsigned int seg_size;
> > > > > >  	struct bio *nxt = bio->bi_next;
> > > > > >  
> > > > > >  	bio->bi_next = NULL;
> > > > > > -	bio->bi_phys_segments = __blk_recalc_rq_segments(q, bio, NULL);
> > > > > > +	bio->bi_phys_segments = __blk_recalc_rq_segments(q, bio, &seg_size);
> > > > > >  	bio->bi_next = nxt;
> > > > > >  	bio->bi_flags |= (1 << BIO_SEG_VALID);
> > > > > > +
> > > > > > +	if (bio->bi_phys_segments == 1 && seg_size > bio->bi_seg_front_size)
> > > > > > +		bio->bi_seg_front_size = seg_size;
> > > > > > +	if (bio->bi_phys_segments > bio->bi_seg_back_size)
> > > > > > +		bio->bi_seg_back_size = seg_size;
> > > > > > +
> > > > > >  }
> > > > > >  EXPORT_SYMBOL(blk_recount_segments);
> > > > > 
> > > > > Duh, here's the proper patch.
> > > > > 
> > > > > diff --git a/block/blk-merge.c b/block/blk-merge.c
> > > > > index a104593..06e0db4 100644
> > > > > --- a/block/blk-merge.c
> > > > > +++ b/block/blk-merge.c
> > > > > @@ -111,12 +111,19 @@ void blk_recalc_rq_segments(struct request *rq)
> > > > >  
> > > > >  void blk_recount_segments(struct request_queue *q, struct bio *bio)
> > > > >  {
> > > > > +	unsigned int seg_size;
> > > > >  	struct bio *nxt = bio->bi_next;
> > > > >  
> > > > >  	bio->bi_next = NULL;
> > > > > -	bio->bi_phys_segments = __blk_recalc_rq_segments(q, bio, NULL);
> > > > > +	bio->bi_phys_segments = __blk_recalc_rq_segments(q, bio, &seg_size);
> > > > >  	bio->bi_next = nxt;
> > > > >  	bio->bi_flags |= (1 << BIO_SEG_VALID);
> > > > > +
> > > > > +	if (bio->bi_phys_segments == 1 && seg_size > bio->bi_seg_front_size)
> > > > > +		bio->bi_seg_front_size = seg_size;
> > > > > +	if (seg_size > bio->bi_seg_back_size)
> > > > > +		bio->bi_seg_back_size = seg_size;
> > > > > +
> > > > >  }
> > > > >  EXPORT_SYMBOL(blk_recount_segments);
> > > 
> > > Good catch, I merged it with a slight change of layout and clearing
> > > seg_size initially, to avoid gcc silly errors.
> > 
> > While merging that, I think we can do better than this. Essentially we
> > just need to have __blk_recalc_rq_segments() track the back bio as well,
> > then we don't have to pass in a pointer for segment sizes.
> > 
> > Totally untested, comments welcome...
> 
> Yeah, I think that updating bi_seg_front_size and bi_seg_back_size at
> one place, __blk_recalc_rq_segments, is better. I thought about the
> same way. But we are already in -rc7 and this must go into mainline
> now. So I chose a less-intrusive way (similar to what we have done in
> the past).
> 
> As you know, the merging code is really complicated and we could
> overlook stuff easily. ;) It might be better to simplify the merging
> code a bit.

If someone (Ingo?) is willing to test the last variant, I'd much rather
add that. It does simplify it (imho), and it kills 23 lines while only
adding 9. But a quick response would be nice, then I can ask Linus to
pull it later today.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [BUG] 2.6.29-rc6-2450cf in scsi_lib.c (was: Large amount of scsi-sgpool)objects
  2009-03-05 10:30                                 ` Jens Axboe
@ 2009-03-05 10:41                                   ` FUJITA Tomonori
  2009-03-05 11:10                                     ` Jens Axboe
  2009-03-05 10:41                                   ` Ingo Molnar
  1 sibling, 1 reply; 66+ messages in thread
From: FUJITA Tomonori @ 2009-03-05 10:41 UTC (permalink / raw)
  To: jens.axboe
  Cc: fujita.tomonori, tglx, James.Bottomley, jengelh, bharrosh,
	linux-scsi, linux-kernel, linux-ide

On Thu, 5 Mar 2009 11:30:24 +0100
Jens Axboe <jens.axboe@oracle.com> wrote:

> > > While merging that, I think we can do better than this. Essentially we
> > > just need to have __blk_recalc_rq_segments() track the back bio as well,
> > > then we don't have to pass in a pointer for segment sizes.
> > > 
> > > Totally untested, comments welcome...
> > 
> > Yeah, I think that updating bi_seg_front_size and bi_seg_back_size at
> > one place, __blk_recalc_rq_segments, is better. I thought about the
> > same way. But we are already in -rc7 and this must go into mainline
> > now. So I chose a less-intrusive way (similar to what we have done in
> > the past).
> > 
> > As you know, the merging code is really complicated and we could
> > overlook stuff easily. ;) It might be better to simplify the merging
> > code a bit.
> 
> If someone (Ingo?) is willing to test the last variant, I'd much rather
> add that. It does simplify it (imho), and it kills 23 lines while only
> adding 9. But a quick response would be nice, then I can ask Linus to
> pull it later today.

I prefer to keep your change for 2.6.30 but if you want to push it
now, it's fine by me.

Ingo, you can quickly hit this bug without the patch?

I've not hit this bug while I've been performing intensive I/Os for
the last three hours. And I thought that Thomas took two hours to hit
this. So maybe it's too early to give 'Tested-by'. With 
max_segment_size decreased, we might hit this easier.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [BUG] 2.6.29-rc6-2450cf in scsi_lib.c (was: Large amount of scsi-sgpool)objects
  2009-03-05 10:30                                 ` Jens Axboe
  2009-03-05 10:41                                   ` FUJITA Tomonori
@ 2009-03-05 10:41                                   ` Ingo Molnar
  2009-03-05 11:05                                     ` Jens Axboe
  1 sibling, 1 reply; 66+ messages in thread
From: Ingo Molnar @ 2009-03-05 10:41 UTC (permalink / raw)
  To: Jens Axboe
  Cc: FUJITA Tomonori, tglx, James.Bottomley, jengelh, bharrosh,
	linux-scsi, linux-kernel, linux-ide


* Jens Axboe <jens.axboe@oracle.com> wrote:

> > > Totally untested, comments welcome...
> > 
> > Yeah, I think that updating bi_seg_front_size and 
> > bi_seg_back_size at one place, __blk_recalc_rq_segments, is 
> > better. I thought about the same way. But we are already in 
> > -rc7 and this must go into mainline now. So I chose a 
> > less-intrusive way (similar to what we have done in the 
> > past).
> > 
> > As you know, the merging code is really complicated and we 
> > could overlook stuff easily. ;) It might be better to 
> > simplify the merging code a bit.
> 
> If someone (Ingo?) is willing to test the last variant, I'd 
> much rather add that. It does simplify it (imho), and it kills 
> 23 lines while only adding 9. But a quick response would be 
> nice, then I can ask Linus to pull it later today.

sure, can give it a whirl.

Note that your patch in this thread does no apply cleanly. Could 
you please send a pull request of your latest fixes that i could 
pull into tip:out-of-tree for testing purposes?

	Ingo

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [BUG] 2.6.29-rc6-2450cf in scsi_lib.c (was: Large amount of scsi-sgpool)objects
  2009-03-05 10:41                                   ` Ingo Molnar
@ 2009-03-05 11:05                                     ` Jens Axboe
  2009-03-05 11:07                                       ` Ingo Molnar
  0 siblings, 1 reply; 66+ messages in thread
From: Jens Axboe @ 2009-03-05 11:05 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: FUJITA Tomonori, tglx, James.Bottomley, jengelh, bharrosh,
	linux-scsi, linux-kernel, linux-ide

On Thu, Mar 05 2009, Ingo Molnar wrote:
> 
> * Jens Axboe <jens.axboe@oracle.com> wrote:
> 
> > > > Totally untested, comments welcome...
> > > 
> > > Yeah, I think that updating bi_seg_front_size and 
> > > bi_seg_back_size at one place, __blk_recalc_rq_segments, is 
> > > better. I thought about the same way. But we are already in 
> > > -rc7 and this must go into mainline now. So I chose a 
> > > less-intrusive way (similar to what we have done in the 
> > > past).
> > > 
> > > As you know, the merging code is really complicated and we 
> > > could overlook stuff easily. ;) It might be better to 
> > > simplify the merging code a bit.
> > 
> > If someone (Ingo?) is willing to test the last variant, I'd 
> > much rather add that. It does simplify it (imho), and it kills 
> > 23 lines while only adding 9. But a quick response would be 
> > nice, then I can ask Linus to pull it later today.
> 
> sure, can give it a whirl.
> 
> Note that your patch in this thread does no apply cleanly. Could 
> you please send a pull request of your latest fixes that i could 
> pull into tip:out-of-tree for testing purposes?

Hmm that's odd, I have no changes in blk-merge.c in my tree against
Linus'. But you can pull:

  git://git.kernel.dk/linux-2.6-block.git for-linus

Jens Axboe (2):
      cciss: remove 30 second initial timeout on controller reset
      block: fix missing bio back/front segment size setting in blk_recount_segments()

Kris Shannon (1):
      Fix kernel NULL pointer dereference in xen-blkfront

Roel Kluin (1):
      loop: don't increment p->offset with (size_t) -EINVAL

 block/blk-merge.c            |   25 +++++++++----------------
 drivers/block/cciss.c        |    8 +++-----
 drivers/block/loop.c         |    3 +--
 drivers/block/xen-blkfront.c |    2 ++
 4 files changed, 15 insertions(+), 23 deletions(-)

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [BUG] 2.6.29-rc6-2450cf in scsi_lib.c (was: Large amount of scsi-sgpool)objects
  2009-03-05 11:05                                     ` Jens Axboe
@ 2009-03-05 11:07                                       ` Ingo Molnar
  2009-03-05 12:09                                         ` Thomas Gleixner
  2009-03-05 19:32                                         ` Ingo Molnar
  0 siblings, 2 replies; 66+ messages in thread
From: Ingo Molnar @ 2009-03-05 11:07 UTC (permalink / raw)
  To: Jens Axboe
  Cc: FUJITA Tomonori, tglx, James.Bottomley, jengelh, bharrosh,
	linux-scsi, linux-kernel, linux-ide


* Jens Axboe <jens.axboe@oracle.com> wrote:

> On Thu, Mar 05 2009, Ingo Molnar wrote:
> > 
> > * Jens Axboe <jens.axboe@oracle.com> wrote:
> > 
> > > > > Totally untested, comments welcome...
> > > > 
> > > > Yeah, I think that updating bi_seg_front_size and 
> > > > bi_seg_back_size at one place, __blk_recalc_rq_segments, is 
> > > > better. I thought about the same way. But we are already in 
> > > > -rc7 and this must go into mainline now. So I chose a 
> > > > less-intrusive way (similar to what we have done in the 
> > > > past).
> > > > 
> > > > As you know, the merging code is really complicated and we 
> > > > could overlook stuff easily. ;) It might be better to 
> > > > simplify the merging code a bit.
> > > 
> > > If someone (Ingo?) is willing to test the last variant, I'd 
> > > much rather add that. It does simplify it (imho), and it kills 
> > > 23 lines while only adding 9. But a quick response would be 
> > > nice, then I can ask Linus to pull it later today.
> > 
> > sure, can give it a whirl.
> > 
> > Note that your patch in this thread does no apply cleanly. Could 
> > you please send a pull request of your latest fixes that i could 
> > pull into tip:out-of-tree for testing purposes?
> 
> Hmm that's odd, I have no changes in blk-merge.c in my tree 
> against Linus'. But you can pull:
> 
>   git://git.kernel.dk/linux-2.6-block.git for-linus
> 
> Jens Axboe (2):
>       cciss: remove 30 second initial timeout on controller reset
>       block: fix missing bio back/front segment size setting in blk_recount_segments()
> 
> Kris Shannon (1):
>       Fix kernel NULL pointer dereference in xen-blkfront
> 
> Roel Kluin (1):
>       loop: don't increment p->offset with (size_t) -EINVAL
> 
>  block/blk-merge.c            |   25 +++++++++----------------
>  drivers/block/cciss.c        |    8 +++-----
>  drivers/block/loop.c         |    3 +--
>  drivers/block/xen-blkfront.c |    2 ++
>  4 files changed, 15 insertions(+), 23 deletions(-)

It pulled without conflicts so we are good. Started testing it. 
Once Thomas is around i suspect he'll be able to test it too 
with his reproducer.

	Ingo

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [BUG] 2.6.29-rc6-2450cf in scsi_lib.c (was: Large amount of scsi-sgpool)objects
  2009-03-05 10:41                                   ` FUJITA Tomonori
@ 2009-03-05 11:10                                     ` Jens Axboe
  2009-03-05 11:40                                       ` FUJITA Tomonori
  0 siblings, 1 reply; 66+ messages in thread
From: Jens Axboe @ 2009-03-05 11:10 UTC (permalink / raw)
  To: FUJITA Tomonori
  Cc: tglx, James.Bottomley, jengelh, bharrosh, linux-scsi,
	linux-kernel, linux-ide

On Thu, Mar 05 2009, FUJITA Tomonori wrote:
> On Thu, 5 Mar 2009 11:30:24 +0100
> Jens Axboe <jens.axboe@oracle.com> wrote:
> 
> > > > While merging that, I think we can do better than this. Essentially we
> > > > just need to have __blk_recalc_rq_segments() track the back bio as well,
> > > > then we don't have to pass in a pointer for segment sizes.
> > > > 
> > > > Totally untested, comments welcome...
> > > 
> > > Yeah, I think that updating bi_seg_front_size and bi_seg_back_size at
> > > one place, __blk_recalc_rq_segments, is better. I thought about the
> > > same way. But we are already in -rc7 and this must go into mainline
> > > now. So I chose a less-intrusive way (similar to what we have done in
> > > the past).
> > > 
> > > As you know, the merging code is really complicated and we could
> > > overlook stuff easily. ;) It might be better to simplify the merging
> > > code a bit.
> > 
> > If someone (Ingo?) is willing to test the last variant, I'd much rather
> > add that. It does simplify it (imho), and it kills 23 lines while only
> > adding 9. But a quick response would be nice, then I can ask Linus to
> > pull it later today.
> 
> I prefer to keep your change for 2.6.30 but if you want to push it
> now, it's fine by me.

I honestly can't see much of a difference in change complexity, so I
don't see much point in putting one fix in 2.6.29 and then doing another
for 2.6.30...

> Ingo, you can quickly hit this bug without the patch?
> 
> I've not hit this bug while I've been performing intensive I/Os for
> the last three hours. And I thought that Thomas took two hours to hit
> this. So maybe it's too early to give 'Tested-by'. With 
> max_segment_size decreased, we might hit this easier.

Yep, that may help. I haven't seen this thread until I was cc'ed on it,
so I haven't even read up on the generic problem yet...

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [BUG] 2.6.29-rc6-2450cf in scsi_lib.c (was: Large amount of scsi-sgpool)objects
  2009-03-05 11:10                                     ` Jens Axboe
@ 2009-03-05 11:40                                       ` FUJITA Tomonori
  0 siblings, 0 replies; 66+ messages in thread
From: FUJITA Tomonori @ 2009-03-05 11:40 UTC (permalink / raw)
  To: jens.axboe
  Cc: fujita.tomonori, tglx, James.Bottomley, jengelh, bharrosh,
	linux-scsi, linux-kernel, linux-ide

On Thu, 5 Mar 2009 12:10:40 +0100
Jens Axboe <jens.axboe@oracle.com> wrote:

> On Thu, Mar 05 2009, FUJITA Tomonori wrote:
> > On Thu, 5 Mar 2009 11:30:24 +0100
> > Jens Axboe <jens.axboe@oracle.com> wrote:
> > 
> > > > > While merging that, I think we can do better than this. Essentially we
> > > > > just need to have __blk_recalc_rq_segments() track the back bio as well,
> > > > > then we don't have to pass in a pointer for segment sizes.
> > > > > 
> > > > > Totally untested, comments welcome...
> > > > 
> > > > Yeah, I think that updating bi_seg_front_size and bi_seg_back_size at
> > > > one place, __blk_recalc_rq_segments, is better. I thought about the
> > > > same way. But we are already in -rc7 and this must go into mainline
> > > > now. So I chose a less-intrusive way (similar to what we have done in
> > > > the past).
> > > > 
> > > > As you know, the merging code is really complicated and we could
> > > > overlook stuff easily. ;) It might be better to simplify the merging
> > > > code a bit.
> > > 
> > > If someone (Ingo?) is willing to test the last variant, I'd much rather
> > > add that. It does simplify it (imho), and it kills 23 lines while only
> > > adding 9. But a quick response would be nice, then I can ask Linus to
> > > pull it later today.
> > 
> > I prefer to keep your change for 2.6.30 but if you want to push it
> > now, it's fine by me.
> 
> I honestly can't see much of a difference in change complexity, so I
> don't see much point in putting one fix in 2.6.29 and then doing another
> for 2.6.30...

My preference are:

1) simply reverting commit 1e42807918d17e8c93bf14fbb74be84b141334c1
(and blaming ext4 for now).

2) applying my patch, affecting only blk_recount_segments().

3) applying your patch, affecting blk_recount_segments() and
blk_recalc_rq_segments().


But as I said, the third options is fine by me. Your patch looks ok to
me.


> > Ingo, you can quickly hit this bug without the patch?
> > 
> > I've not hit this bug while I've been performing intensive I/Os for
> > the last three hours. And I thought that Thomas took two hours to hit
> > this. So maybe it's too early to give 'Tested-by'. With 
> > max_segment_size decreased, we might hit this easier.
> 
> Yep, that may help. I haven't seen this thread until I was cc'ed on it,
> so I haven't even read up on the generic problem yet...

I think that last night James and Thomas finally found that the block
layer miscalculates nr_phys_segments. And then I figured out exactly
where the bug is, I guess hopefully.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [BUG] 2.6.29-rc6-2450cf in scsi_lib.c (was: Large amount of scsi-sgpool)objects
  2009-03-05 11:07                                       ` Ingo Molnar
@ 2009-03-05 12:09                                         ` Thomas Gleixner
  2009-03-05 23:16                                           ` Thomas Gleixner
  2009-03-05 19:32                                         ` Ingo Molnar
  1 sibling, 1 reply; 66+ messages in thread
From: Thomas Gleixner @ 2009-03-05 12:09 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Jens Axboe, FUJITA Tomonori, James.Bottomley, jengelh, bharrosh,
	linux-scsi, linux-kernel, linux-ide

Jens,

On Thu, 5 Mar 2009, Ingo Molnar wrote:
> * Jens Axboe <jens.axboe@oracle.com> wrote:
> > On Thu, Mar 05 2009, Ingo Molnar wrote:
> It pulled without conflicts so we are good. Started testing it. 
> Once Thomas is around i suspect he'll be able to test it too 
> with his reproducer.

I started the tests on the reproducer machine and the first result
looks good so far. I let the tests run in a loop to make sure that it
fixes the problem for real. Will report later today what the long run
outcome is.

Thanks,

	tglx



^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [BUG] 2.6.29-rc6-2450cf in scsi_lib.c (was: Large amount of scsi-sgpool)objects
  2009-03-05 11:07                                       ` Ingo Molnar
  2009-03-05 12:09                                         ` Thomas Gleixner
@ 2009-03-05 19:32                                         ` Ingo Molnar
  1 sibling, 0 replies; 66+ messages in thread
From: Ingo Molnar @ 2009-03-05 19:32 UTC (permalink / raw)
  To: Jens Axboe
  Cc: FUJITA Tomonori, tglx, James.Bottomley, jengelh, bharrosh,
	linux-scsi, linux-kernel, linux-ide


* Ingo Molnar <mingo@elte.hu> wrote:

> 
> * Jens Axboe <jens.axboe@oracle.com> wrote:
> 
> > On Thu, Mar 05 2009, Ingo Molnar wrote:
> > > 
> > > * Jens Axboe <jens.axboe@oracle.com> wrote:
> > > 
> > > > > > Totally untested, comments welcome...
> > > > > 
> > > > > Yeah, I think that updating bi_seg_front_size and 
> > > > > bi_seg_back_size at one place, __blk_recalc_rq_segments, is 
> > > > > better. I thought about the same way. But we are already in 
> > > > > -rc7 and this must go into mainline now. So I chose a 
> > > > > less-intrusive way (similar to what we have done in the 
> > > > > past).
> > > > > 
> > > > > As you know, the merging code is really complicated and we 
> > > > > could overlook stuff easily. ;) It might be better to 
> > > > > simplify the merging code a bit.
> > > > 
> > > > If someone (Ingo?) is willing to test the last variant, I'd 
> > > > much rather add that. It does simplify it (imho), and it kills 
> > > > 23 lines while only adding 9. But a quick response would be 
> > > > nice, then I can ask Linus to pull it later today.
> > > 
> > > sure, can give it a whirl.
> > > 
> > > Note that your patch in this thread does no apply cleanly. Could 
> > > you please send a pull request of your latest fixes that i could 
> > > pull into tip:out-of-tree for testing purposes?
> > 
> > Hmm that's odd, I have no changes in blk-merge.c in my tree 
> > against Linus'. But you can pull:
> > 
> >   git://git.kernel.dk/linux-2.6-block.git for-linus
> > 
> > Jens Axboe (2):
> >       cciss: remove 30 second initial timeout on controller reset
> >       block: fix missing bio back/front segment size setting in blk_recount_segments()
> > 
> > Kris Shannon (1):
> >       Fix kernel NULL pointer dereference in xen-blkfront
> > 
> > Roel Kluin (1):
> >       loop: don't increment p->offset with (size_t) -EINVAL
> > 
> >  block/blk-merge.c            |   25 +++++++++----------------
> >  drivers/block/cciss.c        |    8 +++-----
> >  drivers/block/loop.c         |    3 +--
> >  drivers/block/xen-blkfront.c |    2 ++
> >  4 files changed, 15 insertions(+), 23 deletions(-)
> 
> It pulled without conflicts so we are good. Started testing 
> it. Once Thomas is around i suspect he'll be able to test it 
> too with his reproducer.

FYI, here the patches didnt cause any problems.

Tested-by: Ingo Molnar <mingo@elte.hu>

	Ingo

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [BUG] 2.6.29-rc6-2450cf in scsi_lib.c (was: Large amount of scsi-sgpool)objects
  2009-03-05 12:09                                         ` Thomas Gleixner
@ 2009-03-05 23:16                                           ` Thomas Gleixner
  0 siblings, 0 replies; 66+ messages in thread
From: Thomas Gleixner @ 2009-03-05 23:16 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Jens Axboe, FUJITA Tomonori, James.Bottomley, jengelh, bharrosh,
	linux-scsi, linux-kernel, linux-ide

On Thu, 5 Mar 2009, Thomas Gleixner wrote:
> Jens,
> 
> On Thu, 5 Mar 2009, Ingo Molnar wrote:
> > * Jens Axboe <jens.axboe@oracle.com> wrote:
> > > On Thu, Mar 05 2009, Ingo Molnar wrote:
> > It pulled without conflicts so we are good. Started testing it. 
> > Once Thomas is around i suspect he'll be able to test it too 
> > with his reproducer.
> 
> I started the tests on the reproducer machine and the first result
> looks good so far. I let the tests run in a loop to make sure that it
> fixes the problem for real. Will report later today what the long run
> outcome is.

Still up and running.

Tested-by: Thomas Gleixner <tglx@linutronix.de>

^ permalink raw reply	[flat|nested] 66+ messages in thread

end of thread, other threads:[~2009-03-05 23:16 UTC | newest]

Thread overview: 66+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-03-03  1:28 Large amount of scsi-sgpool objects Jan Engelhardt
2009-03-03  9:31 ` Boaz Harrosh
2009-03-03 15:21   ` James Bottomley
2009-03-03 16:08     ` Jan Engelhardt
2009-03-03 16:24       ` Boaz Harrosh
2009-03-03 17:59         ` Alan Stern
2009-03-03 17:59           ` Alan Stern
2009-03-03 20:56           ` Ingo Molnar
2009-03-03 21:06             ` Alan Stern
2009-03-03 21:06               ` Alan Stern
2009-03-03 16:25       ` James Bottomley
2009-03-03 17:19         ` Thomas Gleixner
2009-03-03 22:07           ` [BUG] 2.6.29-rc6-2450cf in scsi_lib.c (was: Large amount of scsi-sgpool)objects Thomas Gleixner
2009-03-03 22:07             ` Thomas Gleixner
2009-03-03 22:22             ` Jan Engelhardt
2009-03-03 23:07               ` Thomas Gleixner
2009-03-03 22:26             ` James Bottomley
2009-03-03 22:26               ` James Bottomley
2009-03-04  2:01               ` Thomas Gleixner
2009-03-04 18:55                 ` James Bottomley
2009-03-04 21:45                 ` Thomas Gleixner
2009-03-04 22:56                   ` James Bottomley
2009-03-05  0:13                     ` James Bottomley
2009-03-05  8:36                     ` FUJITA Tomonori
2009-03-05  8:39                       ` FUJITA Tomonori
2009-03-05  9:29                         ` FUJITA Tomonori
2009-03-05 10:09                           ` Jens Axboe
2009-03-05 10:14                             ` Jens Axboe
2009-03-05 10:27                               ` FUJITA Tomonori
2009-03-05 10:30                                 ` Jens Axboe
2009-03-05 10:41                                   ` FUJITA Tomonori
2009-03-05 11:10                                     ` Jens Axboe
2009-03-05 11:40                                       ` FUJITA Tomonori
2009-03-05 10:41                                   ` Ingo Molnar
2009-03-05 11:05                                     ` Jens Axboe
2009-03-05 11:07                                       ` Ingo Molnar
2009-03-05 12:09                                         ` Thomas Gleixner
2009-03-05 23:16                                           ` Thomas Gleixner
2009-03-05 19:32                                         ` Ingo Molnar
2009-03-05 10:15                             ` Ingo Molnar
2009-03-03 19:22         ` Large amount of scsi-sgpool objects Ingo Molnar
2009-03-03 21:25           ` James Bottomley
2009-03-03 21:44             ` Ingo Molnar
2009-03-03 22:39               ` James Bottomley
2009-03-03 23:03                 ` Ingo Molnar
2009-03-03 23:32                   ` Stefan Richter
2009-03-03 23:48                     ` Ingo Molnar
2009-03-04  6:39                       ` Stefan Richter
2009-03-04  7:12                         ` Mike Galbraith
2009-03-04  7:50                           ` Stefan Richter
2009-03-04  7:50                             ` Stefan Richter
2009-03-04  8:00                             ` Mike Galbraith
2009-03-04  8:00                               ` Mike Galbraith
2009-03-04  9:06                             ` Thomas Gleixner
2009-03-04 11:12                               ` Ingo Molnar
2009-03-04 11:28                                 ` Stefan Richter
2009-03-04 11:47                                   ` Ingo Molnar
2009-03-04 12:02                                     ` Stefan Richter
2009-03-04 11:24                             ` Ingo Molnar
2009-03-04 11:24                               ` Ingo Molnar
2009-03-04 19:11                               ` Andrew Morton
2009-03-04 20:09                                 ` Thomas Gleixner
2009-03-04  0:01                   ` James Bottomley
2009-03-04  0:39                     ` Ingo Molnar
2009-03-04  0:30                 ` Thomas Gleixner
2009-03-04  0:47                   ` Ingo Molnar

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.