linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 1/1] um: ubd: Fix data corruption
@ 2010-09-28 21:47 Richard Weinberger
  2010-09-28 22:00 ` Andrew Morton
  0 siblings, 1 reply; 22+ messages in thread
From: Richard Weinberger @ 2010-09-28 21:47 UTC (permalink / raw)
  To: akpm
  Cc: linux-kernel, jdike, user-mode-linux-devel, user-mode-linux-user,
	janjaap, geert, jaxboe, martin.petersen, adobriyan, cdfrey,
	Richard Weinberger

Under high load the file system gets corrupted.
This patch fixes the issue.

Many thanks to Janjaap Bos <janjaap@bos.nl>!

LKML-Reference: <AANLkTi=PTp7YW_eYxtF-H2QSxgei3whWH59wU0C9oCkz () mail ! gmail ! com>
Signed-off-by: Richard Weinberger <richard@nod.at>
---
 arch/um/drivers/ubd_kern.c |    7 +++++--
 1 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/arch/um/drivers/ubd_kern.c b/arch/um/drivers/ubd_kern.c
index 1bcd208..2874b83 100644
--- a/arch/um/drivers/ubd_kern.c
+++ b/arch/um/drivers/ubd_kern.c
@@ -748,9 +748,12 @@ static int ubd_open_dev(struct ubd *ubd_dev)
 	}
 	ubd_dev->fd = fd;
 
-	if(ubd_dev->cow.file != NULL){
-		blk_queue_max_hw_sectors(ubd_dev->queue, 8 * sizeof(long));
+	/* A setting higher than 1 sector currently (>= v2.6.31) generates
+		data loss, both for raw and cow ubd. */
+	blk_queue_max_hw_sectors(ubd_dev->queue, 1 * sizeof(long));
+	blk_queue_max_segments(ubd_dev->queue, 1 * sizeof(long));
 
+	if (ubd_dev->cow.file != NULL) {
 		err = -ENOMEM;
 		ubd_dev->cow.bitmap = vmalloc(ubd_dev->cow.bitmap_len);
 		if(ubd_dev->cow.bitmap == NULL){
-- 
1.6.6.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/1] um: ubd: Fix data corruption
  2010-09-28 21:47 [PATCH 1/1] um: ubd: Fix data corruption Richard Weinberger
@ 2010-09-28 22:00 ` Andrew Morton
  2010-09-28 22:13   ` Richard Weinberger
  0 siblings, 1 reply; 22+ messages in thread
From: Andrew Morton @ 2010-09-28 22:00 UTC (permalink / raw)
  To: Richard Weinberger
  Cc: linux-kernel, jdike, user-mode-linux-devel, user-mode-linux-user,
	janjaap, geert, jaxboe, martin.petersen, adobriyan, cdfrey

On Tue, 28 Sep 2010 23:47:36 +0200
Richard Weinberger <richard@nod.at> wrote:

> Under high load the file system gets corrupted.
> This patch fixes the issue.
> 
> Many thanks to Janjaap Bos <janjaap@bos.nl>!
> 
> LKML-Reference: <AANLkTi=PTp7YW_eYxtF-H2QSxgei3whWH59wU0C9oCkz () mail ! gmail ! com>
> Signed-off-by: Richard Weinberger <richard@nod.at>
> ---
>  arch/um/drivers/ubd_kern.c |    7 +++++--
>  1 files changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/um/drivers/ubd_kern.c b/arch/um/drivers/ubd_kern.c
> index 1bcd208..2874b83 100644
> --- a/arch/um/drivers/ubd_kern.c
> +++ b/arch/um/drivers/ubd_kern.c
> @@ -748,9 +748,12 @@ static int ubd_open_dev(struct ubd *ubd_dev)
>  	}
>  	ubd_dev->fd = fd;
>  
> -	if(ubd_dev->cow.file != NULL){
> -		blk_queue_max_hw_sectors(ubd_dev->queue, 8 * sizeof(long));
> +	/* A setting higher than 1 sector currently (>= v2.6.31) generates
> +		data loss, both for raw and cow ubd. */
> +	blk_queue_max_hw_sectors(ubd_dev->queue, 1 * sizeof(long));
> +	blk_queue_max_segments(ubd_dev->queue, 1 * sizeof(long));
>  
> +	if (ubd_dev->cow.file != NULL) {
>  		err = -ENOMEM;
>  		ubd_dev->cow.bitmap = vmalloc(ubd_dev->cow.bitmap_len);
>  		if(ubd_dev->cow.bitmap == NULL){

This is a workaround, I think?  Do we know what the actual bug is?

>From the comment it appears to be a regression?


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/1] um: ubd: Fix data corruption
  2010-09-28 22:00 ` Andrew Morton
@ 2010-09-28 22:13   ` Richard Weinberger
  2010-09-28 22:52     ` Chris Frey
  2010-09-29  3:30     ` Chris Frey
  0 siblings, 2 replies; 22+ messages in thread
From: Richard Weinberger @ 2010-09-28 22:13 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, jdike, user-mode-linux-devel, user-mode-linux-user,
	janjaap, geert, jaxboe, martin.petersen, adobriyan, cdfrey

Am Mittwoch 29 September 2010, 00:00:00 schrieb Andrew Morton:
> This is a workaround, I think?  Do we know what the actual bug is?
> From the comment it appears to be a regression?

Yes, it is a workaround.
For more details please have a look at this post:
http://lkml.org/lkml/2010/9/28/245

Cheers,
//richard

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/1] um: ubd: Fix data corruption
  2010-09-28 22:13   ` Richard Weinberger
@ 2010-09-28 22:52     ` Chris Frey
  2010-09-28 23:10       ` Jens Axboe
  2010-09-29  3:30     ` Chris Frey
  1 sibling, 1 reply; 22+ messages in thread
From: Chris Frey @ 2010-09-28 22:52 UTC (permalink / raw)
  To: Richard Weinberger
  Cc: Andrew Morton, linux-kernel, jdike, user-mode-linux-devel,
	user-mode-linux-user, janjaap, geert, jaxboe, martin.petersen,
	adobriyan, syzop

On Wed, Sep 29, 2010 at 12:13:10AM +0200, Richard Weinberger wrote:
> Am Mittwoch 29 September 2010, 00:00:00 schrieb Andrew Morton:
> > This is a workaround, I think?  Do we know what the actual bug is?
> > From the comment it appears to be a regression?
> 
> Yes, it is a workaround.
> For more details please have a look at this post:
> http://lkml.org/lkml/2010/9/28/245

I'm going to run some tests to see how this works on my end, but in
reading old threads, I found this response from Bram Matthys
that casts some doubt on an older version of this patch.

	http://marc.info/?l=user-mode-linux-user&m=126883932731656&w=2

I've added him to the CC.

- Chris


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/1] um: ubd: Fix data corruption
  2010-09-28 22:52     ` Chris Frey
@ 2010-09-28 23:10       ` Jens Axboe
  2010-09-29  0:48         ` Janjaap Bos
  2010-09-29  1:29         ` Chris Frey
  0 siblings, 2 replies; 22+ messages in thread
From: Jens Axboe @ 2010-09-28 23:10 UTC (permalink / raw)
  To: Chris Frey
  Cc: Richard Weinberger, Andrew Morton, linux-kernel, jdike,
	user-mode-linux-devel, user-mode-linux-user, janjaap, geert,
	martin.petersen, adobriyan, syzop

On 2010-09-29 07:52, Chris Frey wrote:
> On Wed, Sep 29, 2010 at 12:13:10AM +0200, Richard Weinberger wrote:
>> Am Mittwoch 29 September 2010, 00:00:00 schrieb Andrew Morton:
>>> This is a workaround, I think?  Do we know what the actual bug is?
>>> From the comment it appears to be a regression?
>>
>> Yes, it is a workaround.
>> For more details please have a look at this post:
>> http://lkml.org/lkml/2010/9/28/245
> 
> I'm going to run some tests to see how this works on my end, but in
> reading old threads, I found this response from Bram Matthys
> that casts some doubt on an older version of this patch.
> 
> 	http://marc.info/?l=user-mode-linux-user&m=126883932731656&w=2
> 
> I've added him to the CC.

It looks like that if we need to restart the requeue, then
we use the initial position and not the current index. Does
this help?

diff --git a/arch/um/drivers/ubd_kern.c b/arch/um/drivers/ubd_kern.c
index 1bcd208..81ee063 100644
--- a/arch/um/drivers/ubd_kern.c
+++ b/arch/um/drivers/ubd_kern.c
@@ -162,7 +162,7 @@ struct ubd {
 	spinlock_t lock;
 	struct scatterlist sg[MAX_SG];
 	struct request *request;
-	int start_sg, end_sg;
+	int start_sg, end_sg, rq_off;
 };
 
 #define DEFAULT_COW { \
@@ -187,6 +187,7 @@ struct ubd {
 	.request =		NULL, \
 	.start_sg =		0, \
 	.end_sg =		0, \
+	.rq_off =		0, \
 }
 
 /* Protected by ubd_lock */
@@ -1241,10 +1242,11 @@ static void do_ubd_request(struct request_queue *q)
 			dev->request = req;
 			dev->start_sg = 0;
 			dev->end_sg = blk_rq_map_sg(q, req, dev->sg);
+			dev->rq_off = 0;
 		}
 
 		req = dev->request;
-		sector = blk_rq_pos(req);
+		sector = blk_rq_pos(req) + dev->rq_off;
 		while(dev->start_sg < dev->end_sg){
 			struct scatterlist *sg = &dev->sg[dev->start_sg];
 
@@ -1273,6 +1275,7 @@ static void do_ubd_request(struct request_queue *q)
 			}
 
 			dev->start_sg++;
+			dev->rq_off += sg->length >> 9;
 		}
 		dev->end_sg = 0;
 		dev->request = NULL;

-- 
Jens Axboe


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/1] um: ubd: Fix data corruption
  2010-09-28 23:10       ` Jens Axboe
@ 2010-09-29  0:48         ` Janjaap Bos
  2010-09-29  1:29         ` Chris Frey
  1 sibling, 0 replies; 22+ messages in thread
From: Janjaap Bos @ 2010-09-29  0:48 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Chris Frey, Richard Weinberger, Andrew Morton, linux-kernel,
	jdike, user-mode-linux-devel, user-mode-linux-user, geert,
	martin.petersen, adobriyan, syzop

On Wed, 2010-09-29 at 08:10 +0900, Jens Axboe wrote:

> It looks like that if we need to restart the requeue, then
> we use the initial position and not the current index. Does
> this help?
> 
> diff --git a/arch/um/drivers/ubd_kern.c b/arch/um/drivers/ubd_kern.c
> index 1bcd208..81ee063 100644
> --- a/arch/um/drivers/ubd_kern.c
> +++ b/arch/um/drivers/ubd_kern.c
> @@ -162,7 +162,7 @@ struct ubd {
>  	spinlock_t lock;
>  	struct scatterlist sg[MAX_SG];
>  	struct request *request;
> -	int start_sg, end_sg;
> +	int start_sg, end_sg, rq_off;
>  };
>  
>  #define DEFAULT_COW { \
> @@ -187,6 +187,7 @@ struct ubd {
>  	.request =		NULL, \
>  	.start_sg =		0, \
>  	.end_sg =		0, \
> +	.rq_off =		0, \
>  }
>  
>  /* Protected by ubd_lock */
> @@ -1241,10 +1242,11 @@ static void do_ubd_request(struct request_queue *q)
>  			dev->request = req;
>  			dev->start_sg = 0;
>  			dev->end_sg = blk_rq_map_sg(q, req, dev->sg);
> +			dev->rq_off = 0;
>  		}
>  
>  		req = dev->request;
> -		sector = blk_rq_pos(req);
> +		sector = blk_rq_pos(req) + dev->rq_off;
>  		while(dev->start_sg < dev->end_sg){
>  			struct scatterlist *sg = &dev->sg[dev->start_sg];
>  
> @@ -1273,6 +1275,7 @@ static void do_ubd_request(struct request_queue *q)
>  			}
>  
>  			dev->start_sg++;
> +			dev->rq_off += sg->length >> 9;
>  		}
>  		dev->end_sg = 0;
>  		dev->request = NULL;
> 

Yes, this works. I quickly tested it on current git version, and it does
not require the work-around. So far no corruption seen.

Thanks!
-Janjaap



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/1] um: ubd: Fix data corruption
  2010-09-28 23:10       ` Jens Axboe
  2010-09-29  0:48         ` Janjaap Bos
@ 2010-09-29  1:29         ` Chris Frey
  2010-09-29  5:21           ` Jens Axboe
  1 sibling, 1 reply; 22+ messages in thread
From: Chris Frey @ 2010-09-29  1:29 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Richard Weinberger, Andrew Morton, linux-kernel, jdike,
	user-mode-linux-devel, user-mode-linux-user, janjaap, geert,
	martin.petersen, adobriyan, syzop

On Wed, Sep 29, 2010 at 08:10:06AM +0900, Jens Axboe wrote:
> It looks like that if we need to restart the requeue, then
> we use the initial position and not the current index. Does
> this help?
> 
> diff --git a/arch/um/drivers/ubd_kern.c b/arch/um/drivers/ubd_kern.c
> index 1bcd208..81ee063 100644
> --- a/arch/um/drivers/ubd_kern.c
> +++ b/arch/um/drivers/ubd_kern.c
> @@ -162,7 +162,7 @@ struct ubd {
>  	spinlock_t lock;
>  	struct scatterlist sg[MAX_SG];
>  	struct request *request;
> -	int start_sg, end_sg;
> +	int start_sg, end_sg, rq_off;
>  };
>  
>  #define DEFAULT_COW { \
> @@ -187,6 +187,7 @@ struct ubd {
>  	.request =		NULL, \
>  	.start_sg =		0, \
>  	.end_sg =		0, \
> +	.rq_off =		0, \
>  }
>  
>  /* Protected by ubd_lock */
> @@ -1241,10 +1242,11 @@ static void do_ubd_request(struct request_queue *q)
>  			dev->request = req;
>  			dev->start_sg = 0;
>  			dev->end_sg = blk_rq_map_sg(q, req, dev->sg);
> +			dev->rq_off = 0;
>  		}
>  
>  		req = dev->request;
> -		sector = blk_rq_pos(req);
> +		sector = blk_rq_pos(req) + dev->rq_off;
>  		while(dev->start_sg < dev->end_sg){
>  			struct scatterlist *sg = &dev->sg[dev->start_sg];
>  
> @@ -1273,6 +1275,7 @@ static void do_ubd_request(struct request_queue *q)
>  			}
>  
>  			dev->start_sg++;
> +			dev->rq_off += sg->length >> 9;
>  		}
>  		dev->end_sg = 0;
>  		dev->request = NULL;
> 
> -- 

This patch does not fix the corruption issue for me.  I applied the patch
to 2.6.35.5, and reproduced the "deleted inode referenced" errors
in both a gentoo and ubuntu guest OS.  It does take longer to reproduce
though, with this patch.

- Chris


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/1] um: ubd: Fix data corruption
  2010-09-28 22:13   ` Richard Weinberger
  2010-09-28 22:52     ` Chris Frey
@ 2010-09-29  3:30     ` Chris Frey
  1 sibling, 0 replies; 22+ messages in thread
From: Chris Frey @ 2010-09-29  3:30 UTC (permalink / raw)
  To: Richard Weinberger
  Cc: Andrew Morton, linux-kernel, jdike, user-mode-linux-devel,
	user-mode-linux-user, janjaap, geert, jaxboe, martin.petersen,
	adobriyan

On Wed, Sep 29, 2010 at 12:13:10AM +0200, Richard Weinberger wrote:
> Am Mittwoch 29 September 2010, 00:00:00 schrieb Andrew Morton:
> > This is a workaround, I think?  Do we know what the actual bug is?
> > From the comment it appears to be a regression?
> 
> Yes, it is a workaround.
> For more details please have a look at this post:
> http://lkml.org/lkml/2010/9/28/245

I am so far unable to reproduce the corruption with this workaround patch.
So it looks like we have at least one fix.

Thank you very much to everyone who worked on this.

- Chris


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/1] um: ubd: Fix data corruption
  2010-09-29  1:29         ` Chris Frey
@ 2010-09-29  5:21           ` Jens Axboe
  2010-09-29  6:34             ` Chris Frey
  2010-10-02 17:27             ` richard -rw- weinberger
  0 siblings, 2 replies; 22+ messages in thread
From: Jens Axboe @ 2010-09-29  5:21 UTC (permalink / raw)
  To: Chris Frey
  Cc: Richard Weinberger, Andrew Morton, linux-kernel, jdike,
	user-mode-linux-devel, user-mode-linux-user, janjaap, geert,
	martin.petersen, adobriyan, syzop

On 2010-09-29 10:29, Chris Frey wrote:
> On Wed, Sep 29, 2010 at 08:10:06AM +0900, Jens Axboe wrote:
>> It looks like that if we need to restart the requeue, then
>> we use the initial position and not the current index. Does
>> this help?
>>
>> diff --git a/arch/um/drivers/ubd_kern.c b/arch/um/drivers/ubd_kern.c
>> index 1bcd208..81ee063 100644
>> --- a/arch/um/drivers/ubd_kern.c
>> +++ b/arch/um/drivers/ubd_kern.c
>> @@ -162,7 +162,7 @@ struct ubd {
>>  	spinlock_t lock;
>>  	struct scatterlist sg[MAX_SG];
>>  	struct request *request;
>> -	int start_sg, end_sg;
>> +	int start_sg, end_sg, rq_off;
>>  };
>>  
>>  #define DEFAULT_COW { \
>> @@ -187,6 +187,7 @@ struct ubd {
>>  	.request =		NULL, \
>>  	.start_sg =		0, \
>>  	.end_sg =		0, \
>> +	.rq_off =		0, \
>>  }
>>  
>>  /* Protected by ubd_lock */
>> @@ -1241,10 +1242,11 @@ static void do_ubd_request(struct request_queue *q)
>>  			dev->request = req;
>>  			dev->start_sg = 0;
>>  			dev->end_sg = blk_rq_map_sg(q, req, dev->sg);
>> +			dev->rq_off = 0;
>>  		}
>>  
>>  		req = dev->request;
>> -		sector = blk_rq_pos(req);
>> +		sector = blk_rq_pos(req) + dev->rq_off;
>>  		while(dev->start_sg < dev->end_sg){
>>  			struct scatterlist *sg = &dev->sg[dev->start_sg];
>>  
>> @@ -1273,6 +1275,7 @@ static void do_ubd_request(struct request_queue *q)
>>  			}
>>  
>>  			dev->start_sg++;
>> +			dev->rq_off += sg->length >> 9;
>>  		}
>>  		dev->end_sg = 0;
>>  		dev->request = NULL;
>>
>> -- 
> 
> This patch does not fix the corruption issue for me.  I applied the patch
> to 2.6.35.5, and reproduced the "deleted inode referenced" errors
> in both a gentoo and ubuntu guest OS.  It does take longer to reproduce
> though, with this patch.

This seems to imply that the original commit pin pointed is not
the only issue we have in that code atm.

I think we need to find the real fix here, just disabling merging
is not a fix (it's just a nasty work-around for the real bug).

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/1] um: ubd: Fix data corruption
  2010-09-29  5:21           ` Jens Axboe
@ 2010-09-29  6:34             ` Chris Frey
  2010-10-04 16:37               ` Tejun Heo
  2010-10-02 17:27             ` richard -rw- weinberger
  1 sibling, 1 reply; 22+ messages in thread
From: Chris Frey @ 2010-09-29  6:34 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Richard Weinberger, Andrew Morton, linux-kernel, jdike,
	user-mode-linux-devel, user-mode-linux-user, janjaap, geert,
	martin.petersen, adobriyan, syzop

On Wed, Sep 29, 2010 at 02:21:07PM +0900, Jens Axboe wrote:
> This seems to imply that the original commit pin pointed is not
> the only issue we have in that code atm.
> 
> I think we need to find the real fix here, just disabling merging
> is not a fix (it's just a nasty work-around for the real bug).

You're probably right, and I'm quite willing to help test further patches
to help get to the bottom of this.

- Chris


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/1] um: ubd: Fix data corruption
  2010-09-29  5:21           ` Jens Axboe
  2010-09-29  6:34             ` Chris Frey
@ 2010-10-02 17:27             ` richard -rw- weinberger
  1 sibling, 0 replies; 22+ messages in thread
From: richard -rw- weinberger @ 2010-10-02 17:27 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Chris Frey, Richard Weinberger, Andrew Morton, linux-kernel,
	jdike, user-mode-linux-devel, user-mode-linux-user, janjaap,
	geert, martin.petersen, adobriyan, syzop

On Wed, Sep 29, 2010 at 7:21 AM, Jens Axboe <jaxboe@fusionio.com> wrote:
> I think we need to find the real fix here, just disabling merging
> is not a fix (it's just a nasty work-around for the real bug).
>

Jens,

Do you have an idea which parts of the code are buggy?

-- 
Cheers,
//richard

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/1] um: ubd: Fix data corruption
  2010-09-29  6:34             ` Chris Frey
@ 2010-10-04 16:37               ` Tejun Heo
  2010-10-04 19:51                 ` Chris Frey
  0 siblings, 1 reply; 22+ messages in thread
From: Tejun Heo @ 2010-10-04 16:37 UTC (permalink / raw)
  To: Chris Frey
  Cc: Jens Axboe, Richard Weinberger, Andrew Morton, linux-kernel,
	jdike, user-mode-linux-devel, user-mode-linux-user, janjaap,
	geert, martin.petersen, adobriyan, syzop

Hello, sorry about chiming in later.  I was off last week.

On 09/29/2010 08:34 AM, Chris Frey wrote:
> On Wed, Sep 29, 2010 at 02:21:07PM +0900, Jens Axboe wrote:
>> This seems to imply that the original commit pin pointed is not
>> the only issue we have in that code atm.
>>
>> I think we need to find the real fix here, just disabling merging
>> is not a fix (it's just a nasty work-around for the real bug).
> 
> You're probably right, and I'm quite willing to help test further patches
> to help get to the bottom of this.

I think we're on the right track.  The problem with Jens' patch was
that it didn't consider the fact that blk_end_request() now internally
updates the current position of the request, so if restart happens
after some of part of the request is complete it will end up adding
the offsets multiple times.  I think slightly modifying it to track
the current position instead of offset should do it.  Chris, can you
please try the following patch and see whether the problem goes away?

Thanks.

diff --git a/arch/um/drivers/ubd_kern.c b/arch/um/drivers/ubd_kern.c
index 1bcd208..d8a5d8c 100644
--- a/arch/um/drivers/ubd_kern.c
+++ b/arch/um/drivers/ubd_kern.c
@@ -163,6 +163,7 @@ struct ubd {
 	struct scatterlist sg[MAX_SG];
 	struct request *request;
 	int start_sg, end_sg;
+	sector_t rq_pos;
 };

 #define DEFAULT_COW { \
@@ -187,6 +188,7 @@ struct ubd {
 	.request =		NULL, \
 	.start_sg =		0, \
 	.end_sg =		0, \
+	.rq_pos =		0, \
 }

 /* Protected by ubd_lock */
@@ -1228,7 +1230,6 @@ static void do_ubd_request(struct request_queue *q)
 {
 	struct io_thread_req *io_req;
 	struct request *req;
-	sector_t sector;
 	int n;

 	while(1){
@@ -1244,7 +1245,7 @@ static void do_ubd_request(struct request_queue *q)
 		}

 		req = dev->request;
-		sector = blk_rq_pos(req);
+		dev->rq_pos = blk_rq_pos(req);
 		while(dev->start_sg < dev->end_sg){
 			struct scatterlist *sg = &dev->sg[dev->start_sg];

@@ -1256,10 +1257,10 @@ static void do_ubd_request(struct request_queue *q)
 				return;
 			}
 			prepare_request(req, io_req,
-					(unsigned long long)sector << 9,
+					(unsigned long long)dev->rq_pos << 9,
 					sg->offset, sg->length, sg_page(sg));

-			sector += sg->length >> 9;
+			dev->rq_pos += sg->length >> 9;
 			n = os_write_file(thread_fd, &io_req,
 					  sizeof(struct io_thread_req *));
 			if(n != sizeof(struct io_thread_req *)){

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/1] um: ubd: Fix data corruption
  2010-10-04 16:37               ` Tejun Heo
@ 2010-10-04 19:51                 ` Chris Frey
  2010-10-05  8:23                   ` Tejun Heo
  0 siblings, 1 reply; 22+ messages in thread
From: Chris Frey @ 2010-10-04 19:51 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Jens Axboe, Richard Weinberger, Andrew Morton, linux-kernel,
	jdike, user-mode-linux-devel, user-mode-linux-user, janjaap,
	geert, martin.petersen, adobriyan, syzop

On Mon, Oct 04, 2010 at 06:37:36PM +0200, Tejun Heo wrote:
> Hello, sorry about chiming in later.  I was off last week.

No problem, I'm eager to test patches to fix this.


> I think we're on the right track.  The problem with Jens' patch was
> that it didn't consider the fact that blk_end_request() now internally
> updates the current position of the request, so if restart happens
> after some of part of the request is complete it will end up adding
> the offsets multiple times.  I think slightly modifying it to track
> the current position instead of offset should do it.  Chris, can you
> please try the following patch and see whether the problem goes away?

Unfortunately, this patch does not fix it for me.  I applied your patch
to kernel 2.6.35.5, and got the usual error:

EXT3-fs error (device ubda): ext3_lookup: deleted inode referenced: 566188

Lots of them, actually.

- Chris


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/1] um: ubd: Fix data corruption
  2010-10-04 19:51                 ` Chris Frey
@ 2010-10-05  8:23                   ` Tejun Heo
  2010-10-05 20:31                     ` Chris Frey
  0 siblings, 1 reply; 22+ messages in thread
From: Tejun Heo @ 2010-10-05  8:23 UTC (permalink / raw)
  To: Chris Frey
  Cc: Jens Axboe, Richard Weinberger, Andrew Morton, linux-kernel,
	jdike, user-mode-linux-devel, user-mode-linux-user, janjaap,
	geert, martin.petersen, adobriyan, syzop

On 10/04/2010 09:51 PM, Chris Frey wrote:
> On Mon, Oct 04, 2010 at 06:37:36PM +0200, Tejun Heo wrote:
>> Hello, sorry about chiming in later.  I was off last week.
> 
> No problem, I'm eager to test patches to fix this.
> 
>> I think we're on the right track.  The problem with Jens' patch was
>> that it didn't consider the fact that blk_end_request() now internally
>> updates the current position of the request, so if restart happens
>> after some of part of the request is complete it will end up adding
>> the offsets multiple times.  I think slightly modifying it to track
>> the current position instead of offset should do it.  Chris, can you
>> please try the following patch and see whether the problem goes away?
> 
> Unfortunately, this patch does not fix it for me.  I applied your patch
> to kernel 2.6.35.5, and got the usual error:
> 
> EXT3-fs error (device ubda): ext3_lookup: deleted inode referenced: 566188
> 
> Lots of them, actually.

Hmmmm, can you please give a shot at the following one?  Thank you.

diff --git a/arch/um/drivers/ubd_kern.c b/arch/um/drivers/ubd_kern.c
index 1bcd208..0435d21 100644
--- a/arch/um/drivers/ubd_kern.c
+++ b/arch/um/drivers/ubd_kern.c
@@ -163,6 +163,7 @@ struct ubd {
 	struct scatterlist sg[MAX_SG];
 	struct request *request;
 	int start_sg, end_sg;
+	sector_t rq_pos;
 };

 #define DEFAULT_COW { \
@@ -187,6 +188,7 @@ struct ubd {
 	.request =		NULL, \
 	.start_sg =		0, \
 	.end_sg =		0, \
+	.rq_pos =		0, \
 }

 /* Protected by ubd_lock */
@@ -1228,7 +1230,6 @@ static void do_ubd_request(struct request_queue *q)
 {
 	struct io_thread_req *io_req;
 	struct request *req;
-	sector_t sector;
 	int n;

 	while(1){
@@ -1239,12 +1240,12 @@ static void do_ubd_request(struct request_queue *q)
 				return;

 			dev->request = req;
+			dev->rq_pos = blk_rq_pos(req);
 			dev->start_sg = 0;
 			dev->end_sg = blk_rq_map_sg(q, req, dev->sg);
 		}

 		req = dev->request;
-		sector = blk_rq_pos(req);
 		while(dev->start_sg < dev->end_sg){
 			struct scatterlist *sg = &dev->sg[dev->start_sg];

@@ -1256,10 +1257,10 @@ static void do_ubd_request(struct request_queue *q)
 				return;
 			}
 			prepare_request(req, io_req,
-					(unsigned long long)sector << 9,
+					(unsigned long long)dev->rq_pos << 9,
 					sg->offset, sg->length, sg_page(sg));

-			sector += sg->length >> 9;
+			dev->rq_pos += sg->length >> 9;
 			n = os_write_file(thread_fd, &io_req,
 					  sizeof(struct io_thread_req *));
 			if(n != sizeof(struct io_thread_req *)){

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/1] um: ubd: Fix data corruption
  2010-10-05  8:23                   ` Tejun Heo
@ 2010-10-05 20:31                     ` Chris Frey
  2010-10-07  7:58                       ` Jens Axboe
  0 siblings, 1 reply; 22+ messages in thread
From: Chris Frey @ 2010-10-05 20:31 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Jens Axboe, Richard Weinberger, Andrew Morton, linux-kernel,
	jdike, user-mode-linux-devel, user-mode-linux-user, janjaap,
	geert, martin.petersen, adobriyan, syzop

On Tue, Oct 05, 2010 at 10:23:19AM +0200, Tejun Heo wrote:
> Hmmmm, can you please give a shot at the following one?  Thank you.

I applied this patch on top of stock 2.6.35.5 as usual (no other patches)
and tested on my maverick image as before.  I ran a fsck.ext3 on the
filesystem image from the host before my test, just to make sure, and
there were no errors.

Unfortunately, this patch does not fix the issue either.  I get errors
in the guest like the following:

EXT3-fs error (device ubda): htree_dirblock_to_tree: bad entry in directory #1137497: rec_len is smaller than minimal - offset=0, inode=0, rec_len=0, name_len=0
EXT3-fs (ubda): warning: empty_dir: bad directory (dir #1137497) - no `.' or `..'
EXT3-fs error (device ubda): htree_dirblock_to_tree: bad entry in directory #1137587: rec_len is smaller than minimal - offset=0, inode=0, rec_len=0, name_len=0
EXT3-fs (ubda): warning: empty_dir: bad directory (dir #1137587) - no `.' or `..'
EXT3-fs error (device ubda): htree_dirblock_to_tree: bad entry in directory #1137619: rec_len is smaller than minimal - offset=0, inode=0, rec_len=0, name_len=0
EXT3-fs (ubda): warning: empty_dir: bad directory (dir #1137619) - no `.' or `..'
EXT3-fs error (device ubda): htree_dirblock_to_tree: bad entry in directory #1137532: rec_len is smaller than minimal - offset=0, inode=0, rec_len=0, name_len=0
EXT3-fs (ubda): warning: empty_dir: bad directory (dir #1137532) - no `.' or `..'
EXT3-fs (ubda): warning: ext3_rmdir: empty directory has nlink!=2 (6)
EXT3-fs (ubda): warning: ext3_rmdir: empty directory has nlink!=2 (3)


And later on:

EXT3-fs error (device ubda): ext3_lookup: deleted inode referenced: 867196
EXT3-fs error (device ubda): ext3_lookup: deleted inode referenced: 484932
EXT3-fs error (device ubda): ext3_lookup: deleted inode referenced: 484932


- Chris


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/1] um: ubd: Fix data corruption
  2010-10-05 20:31                     ` Chris Frey
@ 2010-10-07  7:58                       ` Jens Axboe
  2010-10-07 20:23                         ` Chris Frey
  0 siblings, 1 reply; 22+ messages in thread
From: Jens Axboe @ 2010-10-07  7:58 UTC (permalink / raw)
  To: Chris Frey
  Cc: Tejun Heo, Richard Weinberger, Andrew Morton, linux-kernel,
	jdike, user-mode-linux-devel, user-mode-linux-user, janjaap,
	geert, martin.petersen, adobriyan, syzop

On 2010-10-05 22:31, Chris Frey wrote:
> On Tue, Oct 05, 2010 at 10:23:19AM +0200, Tejun Heo wrote:
>> Hmmmm, can you please give a shot at the following one?  Thank you.
> 
> I applied this patch on top of stock 2.6.35.5 as usual (no other patches)
> and tested on my maverick image as before.  I ran a fsck.ext3 on the
> filesystem image from the host before my test, just to make sure, and
> there were no errors.
> 
> Unfortunately, this patch does not fix the issue either.  I get errors
> in the guest like the following:

So how about this? Note that I haven't even compiled this. The request
handling logic really should be fixed in there, it's horribly
inefficient.

diff --git a/arch/um/drivers/ubd_kern.c b/arch/um/drivers/ubd_kern.c
index 1bcd208..c272cf6 100644
--- a/arch/um/drivers/ubd_kern.c
+++ b/arch/um/drivers/ubd_kern.c
@@ -163,6 +163,7 @@ struct ubd {
 	struct scatterlist sg[MAX_SG];
 	struct request *request;
 	int start_sg, end_sg;
+	unsigned int rq_offset;
 };
 
 #define DEFAULT_COW { \
@@ -187,6 +188,7 @@ struct ubd {
 	.request =		NULL, \
 	.start_sg =		0, \
 	.end_sg =		0, \
+	.rq_offset		0, \
 }
 
 /* Protected by ubd_lock */
@@ -466,6 +468,8 @@ static void ubd_handler(void)
 	int n;
 
 	while(1){
+		struct ubd *dev;
+
 		n = os_read_file(thread_fd, &req,
 				 sizeof(struct io_thread_req *));
 		if(n != sizeof(req)){
@@ -476,6 +480,11 @@ static void ubd_handler(void)
 			return;
 		}
 
+		/*
+		 * Clear internal offset, block core will update request state
+		 */
+		dev = req->req->q->queuedata;
+		dev->rq_offset = 0;
 		blk_end_request(req->req, 0, req->length);
 		kfree(req);
 	}
@@ -1241,10 +1250,11 @@ static void do_ubd_request(struct request_queue *q)
 			dev->request = req;
 			dev->start_sg = 0;
 			dev->end_sg = blk_rq_map_sg(q, req, dev->sg);
+			dev->rq_offset = 0;
 		}
 
 		req = dev->request;
-		sector = blk_rq_pos(req);
+		sector = blk_rq_pos(req) + dev->rq_offset;
 		while(dev->start_sg < dev->end_sg){
 			struct scatterlist *sg = &dev->sg[dev->start_sg];
 
@@ -1272,6 +1282,7 @@ static void do_ubd_request(struct request_queue *q)
 				return;
 			}
 
+			dev->rq_offset += sg->length >> 9;
 			dev->start_sg++;
 		}
 		dev->end_sg = 0;

-- 
Jens Axboe


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/1] um: ubd: Fix data corruption
  2010-10-07  7:58                       ` Jens Axboe
@ 2010-10-07 20:23                         ` Chris Frey
  2010-10-14 13:14                           ` Tejun Heo
  0 siblings, 1 reply; 22+ messages in thread
From: Chris Frey @ 2010-10-07 20:23 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Tejun Heo, Richard Weinberger, Andrew Morton, linux-kernel,
	jdike, user-mode-linux-devel, user-mode-linux-user, janjaap,
	geert, martin.petersen, adobriyan, syzop

On Thu, Oct 07, 2010 at 09:58:16AM +0200, Jens Axboe wrote:
> So how about this? Note that I haven't even compiled this. The request
> handling logic really should be fixed in there, it's horribly
> inefficient.

Thanks.  I fixed the compile error with:

> +	.rq_offset		0, \

to

> +	.rq_offset =		0, \

But after testing, I still got the same errors:

EXT3-fs error (device ubda): ext3_lookup: deleted inode referenced: 964816
EXT3-fs error (device ubda): ext3_lookup: deleted inode referenced: 964862
EXT3-fs error (device ubda): ext3_lookup: deleted inode referenced: 964786
EXT3-fs error (device ubda): ext3_lookup: deleted inode referenced: 964815
...

- Chris


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/1] um: ubd: Fix data corruption
  2010-10-07 20:23                         ` Chris Frey
@ 2010-10-14 13:14                           ` Tejun Heo
  2010-10-14 14:20                             ` richard -rw- weinberger
  2010-10-15  4:47                             ` Chris Frey
  0 siblings, 2 replies; 22+ messages in thread
From: Tejun Heo @ 2010-10-14 13:14 UTC (permalink / raw)
  To: Chris Frey
  Cc: Jens Axboe, Richard Weinberger, Andrew Morton, linux-kernel,
	jdike, user-mode-linux-devel, user-mode-linux-user, janjaap,
	geert, martin.petersen, adobriyan, syzop

Hello,

Can you please try this one then?  It seems to work here but I can't
reproduce the original problem reliably so I'm not really sure.

Thanks.

diff --git a/arch/um/drivers/ubd_kern.c b/arch/um/drivers/ubd_kern.c
index 1bcd208..9734994 100644
--- a/arch/um/drivers/ubd_kern.c
+++ b/arch/um/drivers/ubd_kern.c
@@ -163,6 +163,7 @@ struct ubd {
 	struct scatterlist sg[MAX_SG];
 	struct request *request;
 	int start_sg, end_sg;
+	sector_t rq_pos;
 };

 #define DEFAULT_COW { \
@@ -187,6 +188,7 @@ struct ubd {
 	.request =		NULL, \
 	.start_sg =		0, \
 	.end_sg =		0, \
+	.rq_pos =		0, \
 }

 /* Protected by ubd_lock */
@@ -1228,7 +1230,6 @@ static void do_ubd_request(struct request_queue *q)
 {
 	struct io_thread_req *io_req;
 	struct request *req;
-	sector_t sector;
 	int n;

 	while(1){
@@ -1239,12 +1240,12 @@ static void do_ubd_request(struct request_queue *q)
 				return;

 			dev->request = req;
+			dev->rq_pos = blk_rq_pos(req);
 			dev->start_sg = 0;
 			dev->end_sg = blk_rq_map_sg(q, req, dev->sg);
 		}

 		req = dev->request;
-		sector = blk_rq_pos(req);
 		while(dev->start_sg < dev->end_sg){
 			struct scatterlist *sg = &dev->sg[dev->start_sg];

@@ -1256,10 +1257,9 @@ static void do_ubd_request(struct request_queue *q)
 				return;
 			}
 			prepare_request(req, io_req,
-					(unsigned long long)sector << 9,
+					(unsigned long long)dev->rq_pos << 9,
 					sg->offset, sg->length, sg_page(sg));

-			sector += sg->length >> 9;
 			n = os_write_file(thread_fd, &io_req,
 					  sizeof(struct io_thread_req *));
 			if(n != sizeof(struct io_thread_req *)){
@@ -1272,6 +1272,7 @@ static void do_ubd_request(struct request_queue *q)
 				return;
 			}

+			dev->rq_pos += sg->length >> 9;
 			dev->start_sg++;
 		}
 		dev->end_sg = 0;

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/1] um: ubd: Fix data corruption
  2010-10-14 13:14                           ` Tejun Heo
@ 2010-10-14 14:20                             ` richard -rw- weinberger
  2010-10-14 18:03                               ` Tejun Heo
  2010-10-15  4:47                             ` Chris Frey
  1 sibling, 1 reply; 22+ messages in thread
From: richard -rw- weinberger @ 2010-10-14 14:20 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Chris Frey, Jens Axboe, Richard Weinberger, Andrew Morton,
	linux-kernel, jdike, user-mode-linux-devel, user-mode-linux-user,
	janjaap, geert, martin.petersen, adobriyan, syzop

On Thu, Oct 14, 2010 at 3:14 PM, Tejun Heo <htejun@gmail.com> wrote:
> Hello,
>
> Can you please try this one then?  It seems to work here but I can't
> reproduce the original problem reliably so I'm not really sure.
>
> Thanks.
>
> diff --git a/arch/um/drivers/ubd_kern.c b/arch/um/drivers/ubd_kern.c
> index 1bcd208..9734994 100644
> --- a/arch/um/drivers/ubd_kern.c
> +++ b/arch/um/drivers/ubd_kern.c
> @@ -163,6 +163,7 @@ struct ubd {
>        struct scatterlist sg[MAX_SG];
>        struct request *request;
>        int start_sg, end_sg;
> +       sector_t rq_pos;
>  };
>
>  #define DEFAULT_COW { \
> @@ -187,6 +188,7 @@ struct ubd {
>        .request =              NULL, \
>        .start_sg =             0, \
>        .end_sg =               0, \
> +       .rq_pos =               0, \
>  }
>
>  /* Protected by ubd_lock */
> @@ -1228,7 +1230,6 @@ static void do_ubd_request(struct request_queue *q)
>  {
>        struct io_thread_req *io_req;
>        struct request *req;
> -       sector_t sector;
>        int n;
>
>        while(1){
> @@ -1239,12 +1240,12 @@ static void do_ubd_request(struct request_queue *q)
>                                return;
>
>                        dev->request = req;
> +                       dev->rq_pos = blk_rq_pos(req);
>                        dev->start_sg = 0;
>                        dev->end_sg = blk_rq_map_sg(q, req, dev->sg);
>                }
>
>                req = dev->request;
> -               sector = blk_rq_pos(req);
>                while(dev->start_sg < dev->end_sg){
>                        struct scatterlist *sg = &dev->sg[dev->start_sg];
>
> @@ -1256,10 +1257,9 @@ static void do_ubd_request(struct request_queue *q)
>                                return;
>                        }
>                        prepare_request(req, io_req,
> -                                       (unsigned long long)sector << 9,
> +                                       (unsigned long long)dev->rq_pos << 9,
>                                        sg->offset, sg->length, sg_page(sg));
>
> -                       sector += sg->length >> 9;
>                        n = os_write_file(thread_fd, &io_req,
>                                          sizeof(struct io_thread_req *));
>                        if(n != sizeof(struct io_thread_req *)){
> @@ -1272,6 +1272,7 @@ static void do_ubd_request(struct request_queue *q)
>                                return;
>                        }
>
> +                       dev->rq_pos += sg->length >> 9;
>                        dev->start_sg++;
>                }
>                dev->end_sg = 0;


It does not work for me.
But the error is a different one. :-)
Without your patch I've never got this kernel trace.

[   59.850000] kworker/0:1: page allocation failure. order:0, mode:0x20
[   59.850000] 0fe22d68:  [<081ca1da>] dump_stack+0x1c/0x20
[   59.850000] 0fe22d80:  [<080a4981>] __alloc_pages_nodemask+0x4a0/0x4cb
[   59.850000] 0fe22e00:  [<080bd2d7>] cache_alloc_refill+0x213/0x3fe
[   59.850000] 0fe22e58:  [<080bd5db>] kmem_cache_alloc+0x62/0x8f
[   59.850000] 0fe22e74:  [<08062a4c>] do_ubd_request+0x91/0x40c
[   59.850000] 0fe22edc:  [<08148f01>] __blk_run_queue+0x40/0x6e
[   59.850000] 0fe22ef0:  [<08151280>] cfq_kick_queue+0x24/0x39
[   59.850000] 0fe22f08:  [<08083c86>] process_one_work+0x1ea/0x2f0
[   59.850000] 0fe22f44:  [<08084202>] worker_thread+0x174/0x2e4
[   59.850000] 0fe22f74:  [<08087e5b>] kthread+0x78/0x7f
[   59.850000] 0fe22fb4:  [<08066117>] run_kernel_thread+0x37/0x3f
[   59.850000] 0fe22fe0:  [<08059bd1>] new_thread_handler+0x60/0x87
[   59.850000] 0fe22ffc:  [<00000000>] 0x0
[   59.850000]
[   59.850000] Mem-Info:
[   59.850000] Normal per-cpu:
[   59.850000] CPU    0: hi:   42, btch:   7 usd:  15
[   59.850000] active_anon:1146 inactive_anon:1194 isolated_anon:0
[   59.850000]  active_file:4314 inactive_file:19869 isolated_file:0
[   59.850000]  unevictable:0 dirty:1327 writeback:1837 unstable:0
[   59.850000]  free:131 slab_reclaimable:3618 slab_unreclaimable:1141
[   59.850000]  mapped:1105 shmem:10 pagetables:82 bounce:0
[   59.850000] Normal free:524kB min:1440kB low:1800kB high:2160kB
active_anon:4584kB inactive_anon:4776kB active_file:17256kB
inactive_file:79476kB unevictable:0kB isolated(anon):0kB
isolated(file):0kB present:130048kB mlocked:0kB dirty:5308kB
writeback:7348kB mapped:4420kB shmem:40kB slab_reclaimable:14472kB
slab_unreclaimable:4564kB kernel_stack:164kB pagetables:328kB
unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0
all_unreclaimable? no
[   59.850000] lowmem_reserve[]: 0 0
[   59.850000] Normal: 1*4kB 3*8kB 25*16kB 3*32kB 0*64kB 0*128kB
0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 524kB
[   59.850000] 24193 total pagecache pages
[   59.850000] 0 pages in swap cache
[   59.850000] Swap cache stats: add 0, delete 0, find 0/0
[   59.850000] Free swap  = 0kB
[   59.850000] Total swap = 0kB
[   59.850000] 32768 pages RAM
[   59.850000] 1132 pages reserved
[   59.850000] 26071 pages shared
[   59.850000] 8225 pages non-shared
[   94.930000] kworker/0:1: page allocation failure. order:0, mode:0x20
[   94.930000] 0fe22d68:  [<081ca1da>] dump_stack+0x1c/0x20
[   94.930000] 0fe22d80:  [<080a4981>] __alloc_pages_nodemask+0x4a0/0x4cb
[   94.930000] 0fe22e00:  [<080bd2d7>] cache_alloc_refill+0x213/0x3fe
[   94.930000] 0fe22e58:  [<080bd5db>] kmem_cache_alloc+0x62/0x8f
[   94.930000] 0fe22e74:  [<08062a4c>] do_ubd_request+0x91/0x40c
[   94.930000] 0fe22edc:  [<08148f01>] __blk_run_queue+0x40/0x6e
[   94.930000] 0fe22ef0:  [<08151280>] cfq_kick_queue+0x24/0x39
[   94.930000] 0fe22f08:  [<08083c86>] process_one_work+0x1ea/0x2f0
[   94.930000] 0fe22f44:  [<08084202>] worker_thread+0x174/0x2e4
[   94.930000] 0fe22f74:  [<08087e5b>] kthread+0x78/0x7f
[   94.930000] 0fe22fb4:  [<08066117>] run_kernel_thread+0x37/0x3f
[   94.930000] 0fe22fe0:  [<08059bd1>] new_thread_handler+0x60/0x87
[   94.930000] 0fe22ffc:  [<00000000>] 0x0
[   94.930000]
[   94.930000] Mem-Info:
[   94.930000] Normal per-cpu:
[   94.930000] CPU    0: hi:   42, btch:   7 usd:   7
[   94.930000] active_anon:1202 inactive_anon:1274 isolated_anon:0
[   94.930000]  active_file:6904 inactive_file:17913 isolated_file:0
[   94.930000]  unevictable:0 dirty:1341 writeback:2209 unstable:0
[   94.930000]  free:132 slab_reclaimable:2912 slab_unreclaimable:1077
[   94.930000]  mapped:1173 shmem:10 pagetables:86 bounce:0
[   94.930000] Normal free:528kB min:1440kB low:1800kB high:2160kB
active_anon:4808kB inactive_anon:5096kB active_file:27616kB
inactive_file:71652kB unevictable:0kB isolated(anon):0kB
isolated(file):0kB present:130048kB mlocked:0kB dirty:5364kB
writeback:8836kB mapped:4692kB shmem:40kB slab_reclaimable:11648kB
slab_unreclaimable:4308kB kernel_stack:168kB pagetables:344kB
unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0
all_unreclaimable? no
[   94.930000] lowmem_reserve[]: 0 0
[   94.930000] Normal: 0*4kB 0*8kB 17*16kB 8*32kB 0*64kB 0*128kB
0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 528kB
[   94.930000] 24827 total pagecache pages
[   94.930000] 0 pages in swap cache
[   94.930000] Swap cache stats: add 0, delete 0, find 0/0
[   94.930000] Free swap  = 0kB
[   94.930000] Total swap = 0kB
[   94.930000] 32768 pages RAM
[   94.930000] 1132 pages reserved
[   94.930000] 26123 pages shared
[   94.930000] 8586 pages non-shared
[   94.930000] kworker/0:1: page allocation failure. order:0, mode:0x20
[   94.930000] 0fe22c60:  [<081ca1da>] dump_stack+0x1c/0x20
[   94.930000] 0fe22c78:  [<080a4981>] __alloc_pages_nodemask+0x4a0/0x4cb
[   94.930000] 0fe22cf8:  [<080bd2d7>] cache_alloc_refill+0x213/0x3fe
[   94.930000] 0fe22d50:  [<080bd5db>] kmem_cache_alloc+0x62/0x8f
[   94.930000] 0fe22d6c:  [<08062a4c>] do_ubd_request+0x91/0x40c
[   94.930000] 0fe22dd4:  [<08062e5f>] ubd_intr+0x98/0xbf
[   94.930000] 0fe22df8:  [<0809bed1>] handle_IRQ_event+0x15/0x9b
[   94.930000] 0fe22e14:  [<0809bfbd>] __do_IRQ+0x66/0xb8
[   94.930000] 0fe22e34:  [<08059136>] do_IRQ+0x1f/0x34
[   94.930000] 0fe22e44:  [<080592e6>] sigio_handler+0x46/0x5c
[   94.930000] 0fe22e5c:  [<08066e6e>] sig_handler_common+0x61/0x70
[   94.930000] 0fe22ed4:  [<08066ec3>] unblock_signals+0x46/0x57
[   94.930000] 0fe22ee0:  [<081cc3c7>] _raw_spin_unlock_irq+0x10/0x13
[   94.930000] 0fe22eec:  [<0815128b>] cfq_kick_queue+0x2f/0x39
[   94.930000] 0fe22f08:  [<08083c86>] process_one_work+0x1ea/0x2f0
[   94.930000] 0fe22f44:  [<08084202>] worker_thread+0x174/0x2e4
[   94.930000] 0fe22f74:  [<08087e5b>] kthread+0x78/0x7f
[   94.930000] 0fe22fb4:  [<08066117>] run_kernel_thread+0x37/0x3f
[   94.930000] 0fe22fe0:  [<08059bd1>] new_thread_handler+0x60/0x87
[   94.930000] 0fe22ffc:  [<00000000>] 0x0
[   94.930000]
[   94.930000] Mem-Info:
[   94.930000] Normal per-cpu:
[   94.930000] CPU    0: hi:   42, btch:   7 usd:  18
[   94.930000] active_anon:1202 inactive_anon:1274 isolated_anon:0
[   94.930000]  active_file:6904 inactive_file:17913 isolated_file:0
[   94.930000]  unevictable:0 dirty:1341 writeback:1811 unstable:0
[   94.930000]  free:132 slab_reclaimable:2912 slab_unreclaimable:1066
[   94.930000]  mapped:1173 shmem:10 pagetables:86 bounce:0
[   94.930000] Normal free:528kB min:1440kB low:1800kB high:2160kB
active_anon:4808kB inactive_anon:5096kB active_file:27616kB
inactive_file:71652kB unevictable:0kB isolated(anon):0kB
isolated(file):0kB present:130048kB mlocked:0kB dirty:5364kB
writeback:7244kB mapped:4692kB shmem:40kB slab_reclaimable:11648kB
slab_unreclaimable:4264kB kernel_stack:168kB pagetables:344kB
unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0
all_unreclaimable? no
[   94.930000] lowmem_reserve[]: 0 0
[   94.930000] Normal: 0*4kB 0*8kB 17*16kB 8*32kB 0*64kB 0*128kB
0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 528kB
[   94.930000] 24827 total pagecache pages
[   94.930000] 0 pages in swap cache
[   94.930000] Swap cache stats: add 0, delete 0, find 0/0
[   94.930000] Free swap  = 0kB
[   94.930000] Total swap = 0kB
[   94.930000] 32768 pages RAM
[   94.930000] 1132 pages reserved
[   94.930000] 26123 pages shared
[   94.930000] 8575 pages non-shared

-- 
Thanks,
//richard

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/1] um: ubd: Fix data corruption
  2010-10-14 14:20                             ` richard -rw- weinberger
@ 2010-10-14 18:03                               ` Tejun Heo
  2010-10-14 21:24                                 ` Richard Weinberger
  0 siblings, 1 reply; 22+ messages in thread
From: Tejun Heo @ 2010-10-14 18:03 UTC (permalink / raw)
  To: richard -rw- weinberger
  Cc: Chris Frey, Jens Axboe, Richard Weinberger, Andrew Morton,
	linux-kernel, jdike, user-mode-linux-devel, user-mode-linux-user,
	janjaap, geert, martin.petersen, adobriyan, syzop

Hello,

On 10/14/2010 04:20 PM, richard -rw- weinberger wrote:
> It does not work for me.
> But the error is a different one. :-)
> Without your patch I've never got this kernel trace.
> 
> [   59.850000] kworker/0:1: page allocation failure. order:0, mode:0x20

Hmm... you're seeing out of memory condition.  If the code screws up
filesystem access, I suppose it could make that happen too but can you
please check the configuration just in case (especially the memory
size)?  Also, if you can reliably reproduce the filesystem corruption
w/o the patch, can you please tell me how to do it?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/1] um: ubd: Fix data corruption
  2010-10-14 18:03                               ` Tejun Heo
@ 2010-10-14 21:24                                 ` Richard Weinberger
  0 siblings, 0 replies; 22+ messages in thread
From: Richard Weinberger @ 2010-10-14 21:24 UTC (permalink / raw)
  To: Tejun Heo
  Cc: richard -rw- weinberger, Chris Frey, Jens Axboe, Andrew Morton,
	linux-kernel, jdike, user-mode-linux-devel, user-mode-linux-user,
	janjaap, geert, martin.petersen, adobriyan, syzop

Hi!

Am Donnerstag 14 Oktober 2010, 20:03:40 schrieb Tejun Heo:
> Hello,
> 
> On 10/14/2010 04:20 PM, richard -rw- weinberger wrote:
> > It does not work for me.
> > But the error is a different one. :-)
> > Without your patch I've never got this kernel trace.
> > 
> > [   59.850000] kworker/0:1: page allocation failure. order:0, mode:0x20
> 
> Hmm... you're seeing out of memory condition.  If the code screws up
> filesystem access, I suppose it could make that happen too but can you
> please check the configuration just in case (especially the memory
> size)?  Also, if you can reliably reproduce the filesystem corruption
> w/o the patch, can you please tell me how to do it?
> 

After rerunning my test case a few times with exactly the same configuration I 
was unable to reproduce neither a filesystem corruption nor a allocation 
failure.
Your patch seems to fix the issue.

Sorry for the false negative!

BTW: This is my test case:
tar xvf linux-2.6.36-rc7.tar.bz2 && rm linux-2.6.36-rc7.tar.bz2 && tar cvf 
linux-2.6.36-rc7.tar.bz2 linux-2.6.36-rc7/ && rm -rf linux-2.6.36-rc7/
Any big tar archive will do the job...

Thanks,
//richard

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/1] um: ubd: Fix data corruption
  2010-10-14 13:14                           ` Tejun Heo
  2010-10-14 14:20                             ` richard -rw- weinberger
@ 2010-10-15  4:47                             ` Chris Frey
  1 sibling, 0 replies; 22+ messages in thread
From: Chris Frey @ 2010-10-15  4:47 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Jens Axboe, Richard Weinberger, Andrew Morton, linux-kernel,
	jdike, user-mode-linux-devel, user-mode-linux-user, janjaap,
	geert, martin.petersen, adobriyan, syzop

On Thu, Oct 14, 2010 at 03:14:28PM +0200, Tejun Heo wrote:
> Hello,
> 
> Can you please try this one then?  It seems to work here but I can't
> reproduce the original problem reliably so I'm not really sure.
> 
> Thanks.
> 
> diff --git a/arch/um/drivers/ubd_kern.c b/arch/um/drivers/ubd_kern.c
> index 1bcd208..9734994 100644
> --- a/arch/um/drivers/ubd_kern.c
> +++ b/arch/um/drivers/ubd_kern.c
> @@ -163,6 +163,7 @@ struct ubd {
>  	struct scatterlist sg[MAX_SG];
>  	struct request *request;
>  	int start_sg, end_sg;
> +	sector_t rq_pos;
>  };
> 
>  #define DEFAULT_COW { \
> @@ -187,6 +188,7 @@ struct ubd {
>  	.request =		NULL, \
>  	.start_sg =		0, \
>  	.end_sg =		0, \
> +	.rq_pos =		0, \
>  }
> 
>  /* Protected by ubd_lock */
> @@ -1228,7 +1230,6 @@ static void do_ubd_request(struct request_queue *q)
>  {
>  	struct io_thread_req *io_req;
>  	struct request *req;
> -	sector_t sector;
>  	int n;
> 
>  	while(1){
> @@ -1239,12 +1240,12 @@ static void do_ubd_request(struct request_queue *q)
>  				return;
> 
>  			dev->request = req;
> +			dev->rq_pos = blk_rq_pos(req);
>  			dev->start_sg = 0;
>  			dev->end_sg = blk_rq_map_sg(q, req, dev->sg);
>  		}
> 
>  		req = dev->request;
> -		sector = blk_rq_pos(req);
>  		while(dev->start_sg < dev->end_sg){
>  			struct scatterlist *sg = &dev->sg[dev->start_sg];
> 
> @@ -1256,10 +1257,9 @@ static void do_ubd_request(struct request_queue *q)
>  				return;
>  			}
>  			prepare_request(req, io_req,
> -					(unsigned long long)sector << 9,
> +					(unsigned long long)dev->rq_pos << 9,
>  					sg->offset, sg->length, sg_page(sg));
> 
> -			sector += sg->length >> 9;
>  			n = os_write_file(thread_fd, &io_req,
>  					  sizeof(struct io_thread_req *));
>  			if(n != sizeof(struct io_thread_req *)){
> @@ -1272,6 +1272,7 @@ static void do_ubd_request(struct request_queue *q)
>  				return;
>  			}
> 
> +			dev->rq_pos += sg->length >> 9;
>  			dev->start_sg++;
>  		}
>  		dev->end_sg = 0;



I tested this patch, on 2.6.35.5, as heavily as I could today, and
was unable to reproduce the filesystem corruption.

Seems to be fixed. :-)

Thanks!
- Chris


^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2010-10-15  4:49 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-09-28 21:47 [PATCH 1/1] um: ubd: Fix data corruption Richard Weinberger
2010-09-28 22:00 ` Andrew Morton
2010-09-28 22:13   ` Richard Weinberger
2010-09-28 22:52     ` Chris Frey
2010-09-28 23:10       ` Jens Axboe
2010-09-29  0:48         ` Janjaap Bos
2010-09-29  1:29         ` Chris Frey
2010-09-29  5:21           ` Jens Axboe
2010-09-29  6:34             ` Chris Frey
2010-10-04 16:37               ` Tejun Heo
2010-10-04 19:51                 ` Chris Frey
2010-10-05  8:23                   ` Tejun Heo
2010-10-05 20:31                     ` Chris Frey
2010-10-07  7:58                       ` Jens Axboe
2010-10-07 20:23                         ` Chris Frey
2010-10-14 13:14                           ` Tejun Heo
2010-10-14 14:20                             ` richard -rw- weinberger
2010-10-14 18:03                               ` Tejun Heo
2010-10-14 21:24                                 ` Richard Weinberger
2010-10-15  4:47                             ` Chris Frey
2010-10-02 17:27             ` richard -rw- weinberger
2010-09-29  3:30     ` Chris Frey

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).