[Ocfs2-devel] [PATCH v2] ocfs2: Fix start offset to ocfs2_zero_range_for

All of lore.kernel.org
 help / color / mirror / Atom feed

* [Ocfs2-devel] [PATCH v2] ocfs2: Fix start offset to ocfs2_zero_range_for_truncate()
@ 2016-08-26 23:04 Ashish Samant
  2016-08-29  5:39 ` Eric Ren
  0 siblings, 1 reply; 9+ messages in thread
From: Ashish Samant @ 2016-08-26 23:04 UTC (permalink / raw)
  To: ocfs2-devel

If we punch a hole on a reflink such that following conditions are met:

1. start offset is on a cluster boundary
2. end offset is not on a cluster boundary
3. (end offset is somewhere in another extent) or
   (hole range > MAX_CONTIG_BYTES(1MB)),

we dont COW the first cluster starting at the start offset. But in this
case, we were wrongly passing this cluster to
ocfs2_zero_range_for_truncate() to zero out. This will modify the cluster
in place and zero it in the source too.

Fix this by skipping this cluster in such a scenario.

Reported-by: Saar Maoz <saar.maoz@oracle.com>
Signed-off-by: Ashish Samant <ashish.samant@oracle.com>
Reviewed-by: Srinivas Eeda <srinivas.eeda@oracle.com>
---
v1->v2:
-Changed the commit msg to include a better and generic description of 
 the problem, for all cluster sizes.
-Added Reported-by and Reviewed-by tags. 
  
 fs/ocfs2/file.c | 34 ++++++++++++++++++++++++----------
 1 file changed, 24 insertions(+), 10 deletions(-)

diff --git a/fs/ocfs2/file.c b/fs/ocfs2/file.c
index 4e7b0dc..0b055bf 100644
--- a/fs/ocfs2/file.c
+++ b/fs/ocfs2/file.c
@@ -1506,7 +1506,8 @@ static int ocfs2_zero_partial_clusters(struct inode *inode,
 				       u64 start, u64 len)
 {
 	int ret = 0;
-	u64 tmpend, end = start + len;
+	u64 tmpend = 0;
+	u64 end = start + len;
 	struct ocfs2_super *osb = OCFS2_SB(inode->i_sb);
 	unsigned int csize = osb->s_clustersize;
 	handle_t *handle;
@@ -1538,18 +1539,31 @@ static int ocfs2_zero_partial_clusters(struct inode *inode,
 	}
 
 	/*
-	 * We want to get the byte offset of the end of the 1st cluster.
+	 * If start is on a cluster boundary and end is somewhere in another
+	 * cluster, we have not COWed the cluster starting at start, unless
+	 * end is also within the same cluster. So, in this case, we skip this
+	 * first call to ocfs2_zero_range_for_truncate() truncate and move on
+	 * to the next one.
 	 */
-	tmpend = (u64)osb->s_clustersize + (start & ~(osb->s_clustersize - 1));
-	if (tmpend > end)
-		tmpend = end;
+	if ((start & (csize - 1)) != 0) {
+		/*
+		 * We want to get the byte offset of the end of the 1st
+		 * cluster.
+		 */
+		tmpend = (u64)osb->s_clustersize +
+			(start & ~(osb->s_clustersize - 1));
+		if (tmpend > end)
+			tmpend = end;
 
-	trace_ocfs2_zero_partial_clusters_range1((unsigned long long)start,
-						 (unsigned long long)tmpend);
+		trace_ocfs2_zero_partial_clusters_range1(
+			(unsigned long long)start,
+			(unsigned long long)tmpend);
 
-	ret = ocfs2_zero_range_for_truncate(inode, handle, start, tmpend);
-	if (ret)
-		mlog_errno(ret);
+		ret = ocfs2_zero_range_for_truncate(inode, handle, start,
+						    tmpend);
+		if (ret)
+			mlog_errno(ret);
+	}
 
 	if (tmpend < end) {
 		/*
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [Ocfs2-devel] [PATCH v2] ocfs2: Fix start offset to ocfs2_zero_range_for_truncate()
  2016-08-26 23:04 [Ocfs2-devel] [PATCH v2] ocfs2: Fix start offset to ocfs2_zero_range_for_truncate() Ashish Samant
@ 2016-08-29  5:39 ` Eric Ren
  2016-08-29 19:23   ` Ashish Samant
  0 siblings, 1 reply; 9+ messages in thread
From: Eric Ren @ 2016-08-29  5:39 UTC (permalink / raw)
  To: ocfs2-devel

Hi,

Thanks for this fix. I'd like to reproduce this issue locally and test this patch,
could you elaborate the detailed steps of reproduction?

Thanks,
Eric

On 08/27/2016 07:04 AM, Ashish Samant wrote:
> If we punch a hole on a reflink such that following conditions are met:
>
> 1. start offset is on a cluster boundary
> 2. end offset is not on a cluster boundary
> 3. (end offset is somewhere in another extent) or
>     (hole range > MAX_CONTIG_BYTES(1MB)),
>
> we dont COW the first cluster starting at the start offset. But in this
> case, we were wrongly passing this cluster to
> ocfs2_zero_range_for_truncate() to zero out. This will modify the cluster
> in place and zero it in the source too.
>
> Fix this by skipping this cluster in such a scenario.
>
> Reported-by: Saar Maoz <saar.maoz@oracle.com>
> Signed-off-by: Ashish Samant <ashish.samant@oracle.com>
> Reviewed-by: Srinivas Eeda <srinivas.eeda@oracle.com>
> ---
> v1->v2:
> -Changed the commit msg to include a better and generic description of
>   the problem, for all cluster sizes.
> -Added Reported-by and Reviewed-by tags.
>    
>   fs/ocfs2/file.c | 34 ++++++++++++++++++++++++----------
>   1 file changed, 24 insertions(+), 10 deletions(-)
>
> diff --git a/fs/ocfs2/file.c b/fs/ocfs2/file.c
> index 4e7b0dc..0b055bf 100644
> --- a/fs/ocfs2/file.c
> +++ b/fs/ocfs2/file.c
> @@ -1506,7 +1506,8 @@ static int ocfs2_zero_partial_clusters(struct inode *inode,
>   				       u64 start, u64 len)
>   {
>   	int ret = 0;
> -	u64 tmpend, end = start + len;
> +	u64 tmpend = 0;
> +	u64 end = start + len;
>   	struct ocfs2_super *osb = OCFS2_SB(inode->i_sb);
>   	unsigned int csize = osb->s_clustersize;
>   	handle_t *handle;
> @@ -1538,18 +1539,31 @@ static int ocfs2_zero_partial_clusters(struct inode *inode,
>   	}
>   
>   	/*
> -	 * We want to get the byte offset of the end of the 1st cluster.
> +	 * If start is on a cluster boundary and end is somewhere in another
> +	 * cluster, we have not COWed the cluster starting at start, unless
> +	 * end is also within the same cluster. So, in this case, we skip this
> +	 * first call to ocfs2_zero_range_for_truncate() truncate and move on
> +	 * to the next one.
>   	 */
> -	tmpend = (u64)osb->s_clustersize + (start & ~(osb->s_clustersize - 1));
> -	if (tmpend > end)
> -		tmpend = end;
> +	if ((start & (csize - 1)) != 0) {
> +		/*
> +		 * We want to get the byte offset of the end of the 1st
> +		 * cluster.
> +		 */
> +		tmpend = (u64)osb->s_clustersize +
> +			(start & ~(osb->s_clustersize - 1));
> +		if (tmpend > end)
> +			tmpend = end;
>   
> -	trace_ocfs2_zero_partial_clusters_range1((unsigned long long)start,
> -						 (unsigned long long)tmpend);
> +		trace_ocfs2_zero_partial_clusters_range1(
> +			(unsigned long long)start,
> +			(unsigned long long)tmpend);
>   
> -	ret = ocfs2_zero_range_for_truncate(inode, handle, start, tmpend);
> -	if (ret)
> -		mlog_errno(ret);
> +		ret = ocfs2_zero_range_for_truncate(inode, handle, start,
> +						    tmpend);
> +		if (ret)
> +			mlog_errno(ret);
> +	}
>   
>   	if (tmpend < end) {
>   		/*

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Ocfs2-devel] [PATCH v2] ocfs2: Fix start offset to ocfs2_zero_range_for_truncate()
  2016-08-29  5:39 ` Eric Ren
@ 2016-08-29 19:23   ` Ashish Samant
  2016-08-30  1:09     ` Junxiao Bi
  2016-08-30  3:33     ` Eric Ren
  0 siblings, 2 replies; 9+ messages in thread
From: Ashish Samant @ 2016-08-29 19:23 UTC (permalink / raw)
  To: ocfs2-devel

Hi Eric,

The easiest way to reproduce this is :

1. Create a random file of say 10 MB
     xfs_io -c 'pwrite -b 4k 0 10M' -f 10MBfile
2. Reflink  it
     reflink -f 10MBfile reflnktest
3. Punch a hole at starting at cluster boundary  with range greater that 
1MB. You can also use a range that will put the end offset in another 
extent.
     fallocate -p -o 0 -l 1048615 reflnktest
4. sync
5. Check the  first cluster in the source file. (It will be zeroed out).
    dd if=10MBfile iflag=direct bs=<cluster size> count=1 | hexdump -C

Thanks,
Ashish

On 08/28/2016 10:39 PM, Eric Ren wrote:
> Hi,
>
> Thanks for this fix. I'd like to reproduce this issue locally and test 
> this patch,
> could you elaborate the detailed steps of reproduction?
>
> Thanks,
> Eric
>
> On 08/27/2016 07:04 AM, Ashish Samant wrote:
>> If we punch a hole on a reflink such that following conditions are met:
>>
>> 1. start offset is on a cluster boundary
>> 2. end offset is not on a cluster boundary
>> 3. (end offset is somewhere in another extent) or
>>     (hole range > MAX_CONTIG_BYTES(1MB)),
>>
>> we dont COW the first cluster starting at the start offset. But in this
>> case, we were wrongly passing this cluster to
>> ocfs2_zero_range_for_truncate() to zero out. This will modify the 
>> cluster
>> in place and zero it in the source too.
>>
>> Fix this by skipping this cluster in such a scenario.
>>
>> Reported-by: Saar Maoz <saar.maoz@oracle.com>
>> Signed-off-by: Ashish Samant <ashish.samant@oracle.com>
>> Reviewed-by: Srinivas Eeda <srinivas.eeda@oracle.com>
>> ---
>> v1->v2:
>> -Changed the commit msg to include a better and generic description of
>>   the problem, for all cluster sizes.
>> -Added Reported-by and Reviewed-by tags.
>>      fs/ocfs2/file.c | 34 ++++++++++++++++++++++++----------
>>   1 file changed, 24 insertions(+), 10 deletions(-)
>>
>> diff --git a/fs/ocfs2/file.c b/fs/ocfs2/file.c
>> index 4e7b0dc..0b055bf 100644
>> --- a/fs/ocfs2/file.c
>> +++ b/fs/ocfs2/file.c
>> @@ -1506,7 +1506,8 @@ static int ocfs2_zero_partial_clusters(struct 
>> inode *inode,
>>                          u64 start, u64 len)
>>   {
>>       int ret = 0;
>> -    u64 tmpend, end = start + len;
>> +    u64 tmpend = 0;
>> +    u64 end = start + len;
>>       struct ocfs2_super *osb = OCFS2_SB(inode->i_sb);
>>       unsigned int csize = osb->s_clustersize;
>>       handle_t *handle;
>> @@ -1538,18 +1539,31 @@ static int ocfs2_zero_partial_clusters(struct 
>> inode *inode,
>>       }
>>         /*
>> -     * We want to get the byte offset of the end of the 1st cluster.
>> +     * If start is on a cluster boundary and end is somewhere in 
>> another
>> +     * cluster, we have not COWed the cluster starting at start, unless
>> +     * end is also within the same cluster. So, in this case, we 
>> skip this
>> +     * first call to ocfs2_zero_range_for_truncate() truncate and 
>> move on
>> +     * to the next one.
>>        */
>> -    tmpend = (u64)osb->s_clustersize + (start & ~(osb->s_clustersize 
>> - 1));
>> -    if (tmpend > end)
>> -        tmpend = end;
>> +    if ((start & (csize - 1)) != 0) {
>> +        /*
>> +         * We want to get the byte offset of the end of the 1st
>> +         * cluster.
>> +         */
>> +        tmpend = (u64)osb->s_clustersize +
>> +            (start & ~(osb->s_clustersize - 1));
>> +        if (tmpend > end)
>> +            tmpend = end;
>>   -    trace_ocfs2_zero_partial_clusters_range1((unsigned long 
>> long)start,
>> -                         (unsigned long long)tmpend);
>> +        trace_ocfs2_zero_partial_clusters_range1(
>> +            (unsigned long long)start,
>> +            (unsigned long long)tmpend);
>>   -    ret = ocfs2_zero_range_for_truncate(inode, handle, start, 
>> tmpend);
>> -    if (ret)
>> -        mlog_errno(ret);
>> +        ret = ocfs2_zero_range_for_truncate(inode, handle, start,
>> +                            tmpend);
>> +        if (ret)
>> +            mlog_errno(ret);
>> +    }
>>         if (tmpend < end) {
>>           /*
>
>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Ocfs2-devel] [PATCH v2] ocfs2: Fix start offset to ocfs2_zero_range_for_truncate()
  2016-08-29 19:23   ` Ashish Samant
@ 2016-08-30  1:09     ` Junxiao Bi
  2016-08-30  3:33     ` Eric Ren
  1 sibling, 0 replies; 9+ messages in thread
From: Junxiao Bi @ 2016-08-30  1:09 UTC (permalink / raw)
  To: ocfs2-devel

On 08/30/2016 03:23 AM, Ashish Samant wrote:
> Hi Eric,
> 
> The easiest way to reproduce this is :
> 
> 1. Create a random file of say 10 MB
>      xfs_io -c 'pwrite -b 4k 0 10M' -f 10MBfile
> 2. Reflink  it
>      reflink -f 10MBfile reflnktest
> 3. Punch a hole at starting at cluster boundary  with range greater that 
> 1MB. You can also use a range that will put the end offset in another 
> extent.
>      fallocate -p -o 0 -l 1048615 reflnktest
> 4. sync
> 5. Check the  first cluster in the source file. (It will be zeroed out).
>     dd if=10MBfile iflag=direct bs=<cluster size> count=1 | hexdump -C
> 
Cool, this reproduce step deserved to be add into patch log.

Thanks,
Junxiao.

> Thanks,
> Ashish
> 
> On 08/28/2016 10:39 PM, Eric Ren wrote:
>> Hi,
>>
>> Thanks for this fix. I'd like to reproduce this issue locally and test 
>> this patch,
>> could you elaborate the detailed steps of reproduction?
>>
>> Thanks,
>> Eric
>>
>> On 08/27/2016 07:04 AM, Ashish Samant wrote:
>>> If we punch a hole on a reflink such that following conditions are met:
>>>
>>> 1. start offset is on a cluster boundary
>>> 2. end offset is not on a cluster boundary
>>> 3. (end offset is somewhere in another extent) or
>>>     (hole range > MAX_CONTIG_BYTES(1MB)),
>>>
>>> we dont COW the first cluster starting at the start offset. But in this
>>> case, we were wrongly passing this cluster to
>>> ocfs2_zero_range_for_truncate() to zero out. This will modify the 
>>> cluster
>>> in place and zero it in the source too.
>>>
>>> Fix this by skipping this cluster in such a scenario.
>>>
>>> Reported-by: Saar Maoz <saar.maoz@oracle.com>
>>> Signed-off-by: Ashish Samant <ashish.samant@oracle.com>
>>> Reviewed-by: Srinivas Eeda <srinivas.eeda@oracle.com>
>>> ---
>>> v1->v2:
>>> -Changed the commit msg to include a better and generic description of
>>>   the problem, for all cluster sizes.
>>> -Added Reported-by and Reviewed-by tags.
>>>      fs/ocfs2/file.c | 34 ++++++++++++++++++++++++----------
>>>   1 file changed, 24 insertions(+), 10 deletions(-)
>>>
>>> diff --git a/fs/ocfs2/file.c b/fs/ocfs2/file.c
>>> index 4e7b0dc..0b055bf 100644
>>> --- a/fs/ocfs2/file.c
>>> +++ b/fs/ocfs2/file.c
>>> @@ -1506,7 +1506,8 @@ static int ocfs2_zero_partial_clusters(struct 
>>> inode *inode,
>>>                          u64 start, u64 len)
>>>   {
>>>       int ret = 0;
>>> -    u64 tmpend, end = start + len;
>>> +    u64 tmpend = 0;
>>> +    u64 end = start + len;
>>>       struct ocfs2_super *osb = OCFS2_SB(inode->i_sb);
>>>       unsigned int csize = osb->s_clustersize;
>>>       handle_t *handle;
>>> @@ -1538,18 +1539,31 @@ static int ocfs2_zero_partial_clusters(struct 
>>> inode *inode,
>>>       }
>>>         /*
>>> -     * We want to get the byte offset of the end of the 1st cluster.
>>> +     * If start is on a cluster boundary and end is somewhere in 
>>> another
>>> +     * cluster, we have not COWed the cluster starting at start, unless
>>> +     * end is also within the same cluster. So, in this case, we 
>>> skip this
>>> +     * first call to ocfs2_zero_range_for_truncate() truncate and 
>>> move on
>>> +     * to the next one.
>>>        */
>>> -    tmpend = (u64)osb->s_clustersize + (start & ~(osb->s_clustersize 
>>> - 1));
>>> -    if (tmpend > end)
>>> -        tmpend = end;
>>> +    if ((start & (csize - 1)) != 0) {
>>> +        /*
>>> +         * We want to get the byte offset of the end of the 1st
>>> +         * cluster.
>>> +         */
>>> +        tmpend = (u64)osb->s_clustersize +
>>> +            (start & ~(osb->s_clustersize - 1));
>>> +        if (tmpend > end)
>>> +            tmpend = end;
>>>   -    trace_ocfs2_zero_partial_clusters_range1((unsigned long 
>>> long)start,
>>> -                         (unsigned long long)tmpend);
>>> +        trace_ocfs2_zero_partial_clusters_range1(
>>> +            (unsigned long long)start,
>>> +            (unsigned long long)tmpend);
>>>   -    ret = ocfs2_zero_range_for_truncate(inode, handle, start, 
>>> tmpend);
>>> -    if (ret)
>>> -        mlog_errno(ret);
>>> +        ret = ocfs2_zero_range_for_truncate(inode, handle, start,
>>> +                            tmpend);
>>> +        if (ret)
>>> +            mlog_errno(ret);
>>> +    }
>>>         if (tmpend < end) {
>>>           /*
>>
>>
> 
> 
> _______________________________________________
> Ocfs2-devel mailing list
> Ocfs2-devel at oss.oracle.com
> https://oss.oracle.com/mailman/listinfo/ocfs2-devel
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Ocfs2-devel] [PATCH v2] ocfs2: Fix start offset to ocfs2_zero_range_for_truncate()
  2016-08-29 19:23   ` Ashish Samant
  2016-08-30  1:09     ` Junxiao Bi
@ 2016-08-30  3:33     ` Eric Ren
  2016-08-30  4:11       ` Ashish Samant
  1 sibling, 1 reply; 9+ messages in thread
From: Eric Ren @ 2016-08-30  3:33 UTC (permalink / raw)
  To: ocfs2-devel

Hello,

On 08/30/2016 03:23 AM, Ashish Samant wrote:
> Hi Eric,
>
> The easiest way to reproduce this is :
>
> 1. Create a random file of say 10 MB
>     xfs_io -c 'pwrite -b 4k 0 10M' -f 10MBfile
> 2. Reflink  it
>     reflink -f 10MBfile reflnktest
> 3. Punch a hole at starting at cluster boundary  with range greater that 1MB. You can also 
> use a range that will put the end offset in another extent.
>     fallocate -p -o 0 -l 1048615 reflnktest
> 4. sync
> 5. Check the  first cluster in the source file. (It will be zeroed out).
>    dd if=10MBfile iflag=direct bs=<cluster size> count=1 | hexdump -C

Thanks! I have a try myself, but I'm not sure what is our expected output and if the test 
result meet
it:

1. After applying this patch:
ocfs2dev1:/mnt/ocfs2 # rm 10MBfile reflnktest
ocfs2dev1:/mnt/ocfs2 # xfs_io -c 'pwrite -b 4k 0 10M' -f 10MBfile
wrote 10485760/10485760 bytes at offset 0
10 MiB, 2560 ops; 0.0000 sec (1.089 GiB/sec and 285427.5839 ops/sec)
ocfs2dev1:/mnt/ocfs2 # reflink -f 10MBfile reflnktest
ocfs2dev1:/mnt/ocfs2 # fallocate -p -o 0 -l 1048615 reflnktest
ocfs2dev1:/mnt/ocfs2 # dd if=10MBfile iflag=direct bs=1048576 count=1 | hexdump -C
00000000  cd cd cd cd cd cd cd cd  cd cd cd cd cd cd cd cd |................|
*
1+0 records in
1+0 records out
1048576 bytes (1.0 MB, 1.0 MiB) copied, 0.0952464 s, 11.0 MB/s
00100000

2. Before this patch:
....
ocfs2dev1:/mnt/ocfs2 # dd if=10MBfile iflag=direct bs=1048576 count=1 | hexdump -C
00000000  cd cd cd cd cd cd cd cd  cd cd cd cd cd cd cd cd |................|
*
1+0 records in
1+0 records out
1048576 bytes (1.0 MB, 1.0 MiB) copied, 0.0954648 s, 11.0 MB/s
00100000

3. debugfs.ocfs2 -R stats /dev/sdb
...
Block Size Bits: 12   Cluster Size Bits: 20
...

Eric
>
> Thanks,
> Ashish
>
> On 08/28/2016 10:39 PM, Eric Ren wrote:
>> Hi,
>>
>> Thanks for this fix. I'd like to reproduce this issue locally and test this patch,
>> could you elaborate the detailed steps of reproduction?
>>
>> Thanks,
>> Eric
>>
>> On 08/27/2016 07:04 AM, Ashish Samant wrote:
>>> If we punch a hole on a reflink such that following conditions are met:
>>>
>>> 1. start offset is on a cluster boundary
>>> 2. end offset is not on a cluster boundary
>>> 3. (end offset is somewhere in another extent) or
>>>     (hole range > MAX_CONTIG_BYTES(1MB)),
>>>
>>> we dont COW the first cluster starting at the start offset. But in this
>>> case, we were wrongly passing this cluster to
>>> ocfs2_zero_range_for_truncate() to zero out. This will modify the cluster
>>> in place and zero it in the source too.
>>>
>>> Fix this by skipping this cluster in such a scenario.
>>>
>>> Reported-by: Saar Maoz <saar.maoz@oracle.com>
>>> Signed-off-by: Ashish Samant <ashish.samant@oracle.com>
>>> Reviewed-by: Srinivas Eeda <srinivas.eeda@oracle.com>
>>> ---
>>> v1->v2:
>>> -Changed the commit msg to include a better and generic description of
>>>   the problem, for all cluster sizes.
>>> -Added Reported-by and Reviewed-by tags.
>>>      fs/ocfs2/file.c | 34 ++++++++++++++++++++++++----------
>>>   1 file changed, 24 insertions(+), 10 deletions(-)
>>>
>>> diff --git a/fs/ocfs2/file.c b/fs/ocfs2/file.c
>>> index 4e7b0dc..0b055bf 100644
>>> --- a/fs/ocfs2/file.c
>>> +++ b/fs/ocfs2/file.c
>>> @@ -1506,7 +1506,8 @@ static int ocfs2_zero_partial_clusters(struct inode *inode,
>>>                          u64 start, u64 len)
>>>   {
>>>       int ret = 0;
>>> -    u64 tmpend, end = start + len;
>>> +    u64 tmpend = 0;
>>> +    u64 end = start + len;
>>>       struct ocfs2_super *osb = OCFS2_SB(inode->i_sb);
>>>       unsigned int csize = osb->s_clustersize;
>>>       handle_t *handle;
>>> @@ -1538,18 +1539,31 @@ static int ocfs2_zero_partial_clusters(struct inode *inode,
>>>       }
>>>         /*
>>> -     * We want to get the byte offset of the end of the 1st cluster.
>>> +     * If start is on a cluster boundary and end is somewhere in another
>>> +     * cluster, we have not COWed the cluster starting at start, unless
>>> +     * end is also within the same cluster. So, in this case, we skip this
>>> +     * first call to ocfs2_zero_range_for_truncate() truncate and move on
>>> +     * to the next one.
>>>        */
>>> -    tmpend = (u64)osb->s_clustersize + (start & ~(osb->s_clustersize - 1));
>>> -    if (tmpend > end)
>>> -        tmpend = end;
>>> +    if ((start & (csize - 1)) != 0) {
>>> +        /*
>>> +         * We want to get the byte offset of the end of the 1st
>>> +         * cluster.
>>> +         */
>>> +        tmpend = (u64)osb->s_clustersize +
>>> +            (start & ~(osb->s_clustersize - 1));
>>> +        if (tmpend > end)
>>> +            tmpend = end;
>>>   -    trace_ocfs2_zero_partial_clusters_range1((unsigned long long)start,
>>> -                         (unsigned long long)tmpend);
>>> +        trace_ocfs2_zero_partial_clusters_range1(
>>> +            (unsigned long long)start,
>>> +            (unsigned long long)tmpend);
>>>   -    ret = ocfs2_zero_range_for_truncate(inode, handle, start, tmpend);
>>> -    if (ret)
>>> -        mlog_errno(ret);
>>> +        ret = ocfs2_zero_range_for_truncate(inode, handle, start,
>>> +                            tmpend);
>>> +        if (ret)
>>> +            mlog_errno(ret);
>>> +    }
>>>         if (tmpend < end) {
>>>           /*
>>
>>
>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Ocfs2-devel] [PATCH v2] ocfs2: Fix start offset to ocfs2_zero_range_for_truncate()
  2016-08-30  3:33     ` Eric Ren
@ 2016-08-30  4:11       ` Ashish Samant
  2016-08-30  7:38         ` Eric Ren
  0 siblings, 1 reply; 9+ messages in thread
From: Ashish Samant @ 2016-08-30  4:11 UTC (permalink / raw)
  To: ocfs2-devel

Hmm, thats weird. I see this on 4.7 kernel without the patch:

# xfs_io -c 'pwrite -b 4k 0 10M' -f 10MBfile
wrote 10485760/10485760 bytes at offset 0
10 MiB, 2560 ops; 0.0000 sec (683.995 MiB/sec and 175102.5992 ops/sec)
# reflink -f 10MBfile reflnktest
# fallocate -p -o 0 -l 1048615 reflnktest
# dd if=10MBfile iflag=direct bs=1048576 count=1 | hexdump -C
00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00 
|................|
*
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.0321517 s, 32.6 MB/s
00100000

and with patch
----
# dd if=10MBfile iflag=direct bs=1M count=1 | hexdump -C
00000000  cd cd cd cd cd cd cd cd  cd cd cd cd cd cd cd cd 
|................|
*
1+0 records in
1+0 records out
00100000

Thanks,
Ashish


On 08/29/2016 08:33 PM, Eric Ren wrote:
> Hello,
>
> On 08/30/2016 03:23 AM, Ashish Samant wrote:
>> Hi Eric,
>>
>> The easiest way to reproduce this is :
>>
>> 1. Create a random file of say 10 MB
>>     xfs_io -c 'pwrite -b 4k 0 10M' -f 10MBfile
>> 2. Reflink  it
>>     reflink -f 10MBfile reflnktest
>> 3. Punch a hole at starting at cluster boundary  with range greater 
>> that 1MB. You can also use a range that will put the end offset in 
>> another extent.
>>     fallocate -p -o 0 -l 1048615 reflnktest
>> 4. sync
>> 5. Check the  first cluster in the source file. (It will be zeroed out).
>>    dd if=10MBfile iflag=direct bs=<cluster size> count=1 | hexdump -C
>
> Thanks! I have a try myself, but I'm not sure what is our expected 
> output and if the test result meet
> it:
>
> 1. After applying this patch:
> ocfs2dev1:/mnt/ocfs2 # rm 10MBfile reflnktest
> ocfs2dev1:/mnt/ocfs2 # xfs_io -c 'pwrite -b 4k 0 10M' -f 10MBfile
> wrote 10485760/10485760 bytes at offset 0
> 10 MiB, 2560 ops; 0.0000 sec (1.089 GiB/sec and 285427.5839 ops/sec)
> ocfs2dev1:/mnt/ocfs2 # reflink -f 10MBfile reflnktest
> ocfs2dev1:/mnt/ocfs2 # fallocate -p -o 0 -l 1048615 reflnktest
> ocfs2dev1:/mnt/ocfs2 # dd if=10MBfile iflag=direct bs=1048576 count=1 
> | hexdump -C
> 00000000  cd cd cd cd cd cd cd cd  cd cd cd cd cd cd cd cd 
> |................|
> *
> 1+0 records in
> 1+0 records out
> 1048576 bytes (1.0 MB, 1.0 MiB) copied, 0.0952464 s, 11.0 MB/s
> 00100000
>
> 2. Before this patch:
> ....
> ocfs2dev1:/mnt/ocfs2 # dd if=10MBfile iflag=direct bs=1048576 count=1 
> | hexdump -C
> 00000000  cd cd cd cd cd cd cd cd  cd cd cd cd cd cd cd cd 
> |................|
> *
> 1+0 records in
> 1+0 records out
> 1048576 bytes (1.0 MB, 1.0 MiB) copied, 0.0954648 s, 11.0 MB/s
> 00100000
>
> 3. debugfs.ocfs2 -R stats /dev/sdb
> ...
> Block Size Bits: 12   Cluster Size Bits: 20
> ...
>
> Eric
>>
>> Thanks,
>> Ashish
>>
>> On 08/28/2016 10:39 PM, Eric Ren wrote:
>>> Hi,
>>>
>>> Thanks for this fix. I'd like to reproduce this issue locally and 
>>> test this patch,
>>> could you elaborate the detailed steps of reproduction?
>>>
>>> Thanks,
>>> Eric
>>>
>>> On 08/27/2016 07:04 AM, Ashish Samant wrote:
>>>> If we punch a hole on a reflink such that following conditions are 
>>>> met:
>>>>
>>>> 1. start offset is on a cluster boundary
>>>> 2. end offset is not on a cluster boundary
>>>> 3. (end offset is somewhere in another extent) or
>>>>     (hole range > MAX_CONTIG_BYTES(1MB)),
>>>>
>>>> we dont COW the first cluster starting at the start offset. But in 
>>>> this
>>>> case, we were wrongly passing this cluster to
>>>> ocfs2_zero_range_for_truncate() to zero out. This will modify the 
>>>> cluster
>>>> in place and zero it in the source too.
>>>>
>>>> Fix this by skipping this cluster in such a scenario.
>>>>
>>>> Reported-by: Saar Maoz <saar.maoz@oracle.com>
>>>> Signed-off-by: Ashish Samant <ashish.samant@oracle.com>
>>>> Reviewed-by: Srinivas Eeda <srinivas.eeda@oracle.com>
>>>> ---
>>>> v1->v2:
>>>> -Changed the commit msg to include a better and generic description of
>>>>   the problem, for all cluster sizes.
>>>> -Added Reported-by and Reviewed-by tags.
>>>>      fs/ocfs2/file.c | 34 ++++++++++++++++++++++++----------
>>>>   1 file changed, 24 insertions(+), 10 deletions(-)
>>>>
>>>> diff --git a/fs/ocfs2/file.c b/fs/ocfs2/file.c
>>>> index 4e7b0dc..0b055bf 100644
>>>> --- a/fs/ocfs2/file.c
>>>> +++ b/fs/ocfs2/file.c
>>>> @@ -1506,7 +1506,8 @@ static int ocfs2_zero_partial_clusters(struct 
>>>> inode *inode,
>>>>                          u64 start, u64 len)
>>>>   {
>>>>       int ret = 0;
>>>> -    u64 tmpend, end = start + len;
>>>> +    u64 tmpend = 0;
>>>> +    u64 end = start + len;
>>>>       struct ocfs2_super *osb = OCFS2_SB(inode->i_sb);
>>>>       unsigned int csize = osb->s_clustersize;
>>>>       handle_t *handle;
>>>> @@ -1538,18 +1539,31 @@ static int 
>>>> ocfs2_zero_partial_clusters(struct inode *inode,
>>>>       }
>>>>         /*
>>>> -     * We want to get the byte offset of the end of the 1st cluster.
>>>> +     * If start is on a cluster boundary and end is somewhere in 
>>>> another
>>>> +     * cluster, we have not COWed the cluster starting at start, 
>>>> unless
>>>> +     * end is also within the same cluster. So, in this case, we 
>>>> skip this
>>>> +     * first call to ocfs2_zero_range_for_truncate() truncate and 
>>>> move on
>>>> +     * to the next one.
>>>>        */
>>>> -    tmpend = (u64)osb->s_clustersize + (start & 
>>>> ~(osb->s_clustersize - 1));
>>>> -    if (tmpend > end)
>>>> -        tmpend = end;
>>>> +    if ((start & (csize - 1)) != 0) {
>>>> +        /*
>>>> +         * We want to get the byte offset of the end of the 1st
>>>> +         * cluster.
>>>> +         */
>>>> +        tmpend = (u64)osb->s_clustersize +
>>>> +            (start & ~(osb->s_clustersize - 1));
>>>> +        if (tmpend > end)
>>>> +            tmpend = end;
>>>>   -    trace_ocfs2_zero_partial_clusters_range1((unsigned long 
>>>> long)start,
>>>> -                         (unsigned long long)tmpend);
>>>> +        trace_ocfs2_zero_partial_clusters_range1(
>>>> +            (unsigned long long)start,
>>>> +            (unsigned long long)tmpend);
>>>>   -    ret = ocfs2_zero_range_for_truncate(inode, handle, start, 
>>>> tmpend);
>>>> -    if (ret)
>>>> -        mlog_errno(ret);
>>>> +        ret = ocfs2_zero_range_for_truncate(inode, handle, start,
>>>> +                            tmpend);
>>>> +        if (ret)
>>>> +            mlog_errno(ret);
>>>> +    }
>>>>         if (tmpend < end) {
>>>>           /*
>>>
>>>
>>
>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Ocfs2-devel] [PATCH v2] ocfs2: Fix start offset to ocfs2_zero_range_for_truncate()
  2016-08-30  4:11       ` Ashish Samant
@ 2016-08-30  7:38         ` Eric Ren
  2016-08-30 23:17           ` Ashish Samant
  0 siblings, 1 reply; 9+ messages in thread
From: Eric Ren @ 2016-08-30  7:38 UTC (permalink / raw)
  To: ocfs2-devel

Hi,

I'm on 4.8.0-rc3 kernel. Hope someone else can double-confirm this;-)

On 08/30/2016 12:11 PM, Ashish Samant wrote:
> Hmm, thats weird. I see this on 4.7 kernel without the patch:
>
> # xfs_io -c 'pwrite -b 4k 0 10M' -f 10MBfile
> wrote 10485760/10485760 bytes at offset 0
> 10 MiB, 2560 ops; 0.0000 sec (683.995 MiB/sec and 175102.5992 ops/sec)
> # reflink -f 10MBfile reflnktest
> # fallocate -p -o 0 -l 1048615 reflnktest
> # dd if=10MBfile iflag=direct bs=1048576 count=1 | hexdump -C
> 00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00 |................|
> *
> 1+0 records in
> 1+0 records out
> 1048576 bytes (1.0 MB) copied, 0.0321517 s, 32.6 MB/s
> 00100000
>
> and with patch
> ----
> # dd if=10MBfile iflag=direct bs=1M count=1 | hexdump -C
> 00000000  cd cd cd cd cd cd cd cd  cd cd cd cd cd cd cd cd |................|

I'm not familiar with this code.  So why is the output "cd ..."? because we didn't write 
anything
into "10MBfile". Is it a magic number when reading from a hole?

Eric

> *
> 1+0 records in
> 1+0 records out
> 00100000



>
> Thanks,
> Ashish
>
>
> On 08/29/2016 08:33 PM, Eric Ren wrote:
>> Hello,
>>
>> On 08/30/2016 03:23 AM, Ashish Samant wrote:
>>> Hi Eric,
>>>
>>> The easiest way to reproduce this is :
>>>
>>> 1. Create a random file of say 10 MB
>>>     xfs_io -c 'pwrite -b 4k 0 10M' -f 10MBfile
>>> 2. Reflink  it
>>>     reflink -f 10MBfile reflnktest
>>> 3. Punch a hole at starting at cluster boundary  with range greater that 1MB. You can 
>>> also use a range that will put the end offset in another extent.
>>>     fallocate -p -o 0 -l 1048615 reflnktest
>>> 4. sync
>>> 5. Check the  first cluster in the source file. (It will be zeroed out).
>>>    dd if=10MBfile iflag=direct bs=<cluster size> count=1 | hexdump -C
>>
>> Thanks! I have a try myself, but I'm not sure what is our expected output and if the test 
>> result meet
>> it:
>>
>> 1. After applying this patch:
>> ocfs2dev1:/mnt/ocfs2 # rm 10MBfile reflnktest
>> ocfs2dev1:/mnt/ocfs2 # xfs_io -c 'pwrite -b 4k 0 10M' -f 10MBfile
>> wrote 10485760/10485760 bytes at offset 0
>> 10 MiB, 2560 ops; 0.0000 sec (1.089 GiB/sec and 285427.5839 ops/sec)
>> ocfs2dev1:/mnt/ocfs2 # reflink -f 10MBfile reflnktest
>> ocfs2dev1:/mnt/ocfs2 # fallocate -p -o 0 -l 1048615 reflnktest
>> ocfs2dev1:/mnt/ocfs2 # dd if=10MBfile iflag=direct bs=1048576 count=1 | hexdump -C
>> 00000000  cd cd cd cd cd cd cd cd  cd cd cd cd cd cd cd cd |................|
>> *
>> 1+0 records in
>> 1+0 records out
>> 1048576 bytes (1.0 MB, 1.0 MiB) copied, 0.0952464 s, 11.0 MB/s
>> 00100000
>>
>> 2. Before this patch:
>> ....
>> ocfs2dev1:/mnt/ocfs2 # dd if=10MBfile iflag=direct bs=1048576 count=1 | hexdump -C
>> 00000000  cd cd cd cd cd cd cd cd  cd cd cd cd cd cd cd cd |................|
>> *
>> 1+0 records in
>> 1+0 records out
>> 1048576 bytes (1.0 MB, 1.0 MiB) copied, 0.0954648 s, 11.0 MB/s
>> 00100000
>>
>> 3. debugfs.ocfs2 -R stats /dev/sdb
>> ...
>> Block Size Bits: 12   Cluster Size Bits: 20
>> ...
>>
>> Eric
>>>
>>> Thanks,
>>> Ashish
>>>
>>> On 08/28/2016 10:39 PM, Eric Ren wrote:
>>>> Hi,
>>>>
>>>> Thanks for this fix. I'd like to reproduce this issue locally and test this patch,
>>>> could you elaborate the detailed steps of reproduction?
>>>>
>>>> Thanks,
>>>> Eric
>>>>
>>>> On 08/27/2016 07:04 AM, Ashish Samant wrote:
>>>>> If we punch a hole on a reflink such that following conditions are met:
>>>>>
>>>>> 1. start offset is on a cluster boundary
>>>>> 2. end offset is not on a cluster boundary
>>>>> 3. (end offset is somewhere in another extent) or
>>>>>     (hole range > MAX_CONTIG_BYTES(1MB)),
>>>>>
>>>>> we dont COW the first cluster starting at the start offset. But in this
>>>>> case, we were wrongly passing this cluster to
>>>>> ocfs2_zero_range_for_truncate() to zero out. This will modify the cluster
>>>>> in place and zero it in the source too.
>>>>>
>>>>> Fix this by skipping this cluster in such a scenario.
>>>>>
>>>>> Reported-by: Saar Maoz <saar.maoz@oracle.com>
>>>>> Signed-off-by: Ashish Samant <ashish.samant@oracle.com>
>>>>> Reviewed-by: Srinivas Eeda <srinivas.eeda@oracle.com>
>>>>> ---
>>>>> v1->v2:
>>>>> -Changed the commit msg to include a better and generic description of
>>>>>   the problem, for all cluster sizes.
>>>>> -Added Reported-by and Reviewed-by tags.
>>>>>      fs/ocfs2/file.c | 34 ++++++++++++++++++++++++----------
>>>>>   1 file changed, 24 insertions(+), 10 deletions(-)
>>>>>
>>>>> diff --git a/fs/ocfs2/file.c b/fs/ocfs2/file.c
>>>>> index 4e7b0dc..0b055bf 100644
>>>>> --- a/fs/ocfs2/file.c
>>>>> +++ b/fs/ocfs2/file.c
>>>>> @@ -1506,7 +1506,8 @@ static int ocfs2_zero_partial_clusters(struct inode *inode,
>>>>>                          u64 start, u64 len)
>>>>>   {
>>>>>       int ret = 0;
>>>>> -    u64 tmpend, end = start + len;
>>>>> +    u64 tmpend = 0;
>>>>> +    u64 end = start + len;
>>>>>       struct ocfs2_super *osb = OCFS2_SB(inode->i_sb);
>>>>>       unsigned int csize = osb->s_clustersize;
>>>>>       handle_t *handle;
>>>>> @@ -1538,18 +1539,31 @@ static int ocfs2_zero_partial_clusters(struct inode *inode,
>>>>>       }
>>>>>         /*
>>>>> -     * We want to get the byte offset of the end of the 1st cluster.
>>>>> +     * If start is on a cluster boundary and end is somewhere in another
>>>>> +     * cluster, we have not COWed the cluster starting at start, unless
>>>>> +     * end is also within the same cluster. So, in this case, we skip this
>>>>> +     * first call to ocfs2_zero_range_for_truncate() truncate and move on
>>>>> +     * to the next one.
>>>>>        */
>>>>> -    tmpend = (u64)osb->s_clustersize + (start & ~(osb->s_clustersize - 1));
>>>>> -    if (tmpend > end)
>>>>> -        tmpend = end;
>>>>> +    if ((start & (csize - 1)) != 0) {
>>>>> +        /*
>>>>> +         * We want to get the byte offset of the end of the 1st
>>>>> +         * cluster.
>>>>> +         */
>>>>> +        tmpend = (u64)osb->s_clustersize +
>>>>> +            (start & ~(osb->s_clustersize - 1));
>>>>> +        if (tmpend > end)
>>>>> +            tmpend = end;
>>>>>   -    trace_ocfs2_zero_partial_clusters_range1((unsigned long long)start,
>>>>> -                         (unsigned long long)tmpend);
>>>>> +        trace_ocfs2_zero_partial_clusters_range1(
>>>>> +            (unsigned long long)start,
>>>>> +            (unsigned long long)tmpend);
>>>>>   -    ret = ocfs2_zero_range_for_truncate(inode, handle, start, tmpend);
>>>>> -    if (ret)
>>>>> -        mlog_errno(ret);
>>>>> +        ret = ocfs2_zero_range_for_truncate(inode, handle, start,
>>>>> +                            tmpend);
>>>>> +        if (ret)
>>>>> +            mlog_errno(ret);
>>>>> +    }
>>>>>         if (tmpend < end) {
>>>>>           /*
>>>>
>>>>
>>>
>>
>
>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Ocfs2-devel] [PATCH v2] ocfs2: Fix start offset to ocfs2_zero_range_for_truncate()
  2016-08-30  7:38         ` Eric Ren
@ 2016-08-30 23:17           ` Ashish Samant
  2016-08-31  1:29             ` Eric Ren
  0 siblings, 1 reply; 9+ messages in thread
From: Ashish Samant @ 2016-08-30 23:17 UTC (permalink / raw)
  To: ocfs2-devel

Hi Eric,

I am able to reproduce this on 4.8.0-rc3 as well. Can you try again and 
issue a sync between fallocate and dd?

On 08/30/2016 12:38 AM, Eric Ren wrote:
> Hi,
>
> I'm on 4.8.0-rc3 kernel. Hope someone else can double-confirm this;-)
>
> On 08/30/2016 12:11 PM, Ashish Samant wrote:
>> Hmm, thats weird. I see this on 4.7 kernel without the patch:
>>
>> # xfs_io -c 'pwrite -b 4k 0 10M' -f 10MBfile
>> wrote 10485760/10485760 bytes at offset 0
>> 10 MiB, 2560 ops; 0.0000 sec (683.995 MiB/sec and 175102.5992 ops/sec)
>> # reflink -f 10MBfile reflnktest
>> # fallocate -p -o 0 -l 1048615 reflnktest
>> # dd if=10MBfile iflag=direct bs=1048576 count=1 | hexdump -C
>> 00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00 
>> |................|
>> *
>> 1+0 records in
>> 1+0 records out
>> 1048576 bytes (1.0 MB) copied, 0.0321517 s, 32.6 MB/s
>> 00100000
>>
>> and with patch
>> ----
>> # dd if=10MBfile iflag=direct bs=1M count=1 | hexdump -C
>> 00000000  cd cd cd cd cd cd cd cd  cd cd cd cd cd cd cd cd 
>> |................|
>
> I'm not familiar with this code.  So why is the output "cd ..."? 
> because we didn't write anything
> into "10MBfile". Is it a magic number when reading from a hole?
No, "cd" is what xfs_io wrote into the file. Those are the original 
contents of the file which are overwritten by 0 in the first cluster 
because of this bug.

Thanks,
Ashish
>
> Eric
>
>> *
>> 1+0 records in
>> 1+0 records out
>> 00100000
>
>
>
>>
>> Thanks,
>> Ashish
>>
>>
>> On 08/29/2016 08:33 PM, Eric Ren wrote:
>>> Hello,
>>>
>>> On 08/30/2016 03:23 AM, Ashish Samant wrote:
>>>> Hi Eric,
>>>>
>>>> The easiest way to reproduce this is :
>>>>
>>>> 1. Create a random file of say 10 MB
>>>>     xfs_io -c 'pwrite -b 4k 0 10M' -f 10MBfile
>>>> 2. Reflink  it
>>>>     reflink -f 10MBfile reflnktest
>>>> 3. Punch a hole at starting at cluster boundary  with range greater 
>>>> that 1MB. You can also use a range that will put the end offset in 
>>>> another extent.
>>>>     fallocate -p -o 0 -l 1048615 reflnktest
>>>> 4. sync
>>>> 5. Check the  first cluster in the source file. (It will be zeroed 
>>>> out).
>>>>    dd if=10MBfile iflag=direct bs=<cluster size> count=1 | hexdump -C
>>>
>>> Thanks! I have a try myself, but I'm not sure what is our expected 
>>> output and if the test result meet
>>> it:
>>>
>>> 1. After applying this patch:
>>> ocfs2dev1:/mnt/ocfs2 # rm 10MBfile reflnktest
>>> ocfs2dev1:/mnt/ocfs2 # xfs_io -c 'pwrite -b 4k 0 10M' -f 10MBfile
>>> wrote 10485760/10485760 bytes at offset 0
>>> 10 MiB, 2560 ops; 0.0000 sec (1.089 GiB/sec and 285427.5839 ops/sec)
>>> ocfs2dev1:/mnt/ocfs2 # reflink -f 10MBfile reflnktest
>>> ocfs2dev1:/mnt/ocfs2 # fallocate -p -o 0 -l 1048615 reflnktest
>>> ocfs2dev1:/mnt/ocfs2 # dd if=10MBfile iflag=direct bs=1048576 
>>> count=1 | hexdump -C
>>> 00000000  cd cd cd cd cd cd cd cd  cd cd cd cd cd cd cd cd 
>>> |................|
>>> *
>>> 1+0 records in
>>> 1+0 records out
>>> 1048576 bytes (1.0 MB, 1.0 MiB) copied, 0.0952464 s, 11.0 MB/s
>>> 00100000
>>>
>>> 2. Before this patch:
>>> ....
>>> ocfs2dev1:/mnt/ocfs2 # dd if=10MBfile iflag=direct bs=1048576 
>>> count=1 | hexdump -C
>>> 00000000  cd cd cd cd cd cd cd cd  cd cd cd cd cd cd cd cd 
>>> |................|
>>> *
>>> 1+0 records in
>>> 1+0 records out
>>> 1048576 bytes (1.0 MB, 1.0 MiB) copied, 0.0954648 s, 11.0 MB/s
>>> 00100000
>>>
>>> 3. debugfs.ocfs2 -R stats /dev/sdb
>>> ...
>>> Block Size Bits: 12   Cluster Size Bits: 20
>>> ...
>>>
>>> Eric
>>>>
>>>> Thanks,
>>>> Ashish
>>>>
>>>> On 08/28/2016 10:39 PM, Eric Ren wrote:
>>>>> Hi,
>>>>>
>>>>> Thanks for this fix. I'd like to reproduce this issue locally and 
>>>>> test this patch,
>>>>> could you elaborate the detailed steps of reproduction?
>>>>>
>>>>> Thanks,
>>>>> Eric
>>>>>
>>>>> On 08/27/2016 07:04 AM, Ashish Samant wrote:
>>>>>> If we punch a hole on a reflink such that following conditions 
>>>>>> are met:
>>>>>>
>>>>>> 1. start offset is on a cluster boundary
>>>>>> 2. end offset is not on a cluster boundary
>>>>>> 3. (end offset is somewhere in another extent) or
>>>>>>     (hole range > MAX_CONTIG_BYTES(1MB)),
>>>>>>
>>>>>> we dont COW the first cluster starting at the start offset. But 
>>>>>> in this
>>>>>> case, we were wrongly passing this cluster to
>>>>>> ocfs2_zero_range_for_truncate() to zero out. This will modify the 
>>>>>> cluster
>>>>>> in place and zero it in the source too.
>>>>>>
>>>>>> Fix this by skipping this cluster in such a scenario.
>>>>>>
>>>>>> Reported-by: Saar Maoz <saar.maoz@oracle.com>
>>>>>> Signed-off-by: Ashish Samant <ashish.samant@oracle.com>
>>>>>> Reviewed-by: Srinivas Eeda <srinivas.eeda@oracle.com>
>>>>>> ---
>>>>>> v1->v2:
>>>>>> -Changed the commit msg to include a better and generic 
>>>>>> description of
>>>>>>   the problem, for all cluster sizes.
>>>>>> -Added Reported-by and Reviewed-by tags.
>>>>>>      fs/ocfs2/file.c | 34 ++++++++++++++++++++++++----------
>>>>>>   1 file changed, 24 insertions(+), 10 deletions(-)
>>>>>>
>>>>>> diff --git a/fs/ocfs2/file.c b/fs/ocfs2/file.c
>>>>>> index 4e7b0dc..0b055bf 100644
>>>>>> --- a/fs/ocfs2/file.c
>>>>>> +++ b/fs/ocfs2/file.c
>>>>>> @@ -1506,7 +1506,8 @@ static int 
>>>>>> ocfs2_zero_partial_clusters(struct inode *inode,
>>>>>>                          u64 start, u64 len)
>>>>>>   {
>>>>>>       int ret = 0;
>>>>>> -    u64 tmpend, end = start + len;
>>>>>> +    u64 tmpend = 0;
>>>>>> +    u64 end = start + len;
>>>>>>       struct ocfs2_super *osb = OCFS2_SB(inode->i_sb);
>>>>>>       unsigned int csize = osb->s_clustersize;
>>>>>>       handle_t *handle;
>>>>>> @@ -1538,18 +1539,31 @@ static int 
>>>>>> ocfs2_zero_partial_clusters(struct inode *inode,
>>>>>>       }
>>>>>>         /*
>>>>>> -     * We want to get the byte offset of the end of the 1st 
>>>>>> cluster.
>>>>>> +     * If start is on a cluster boundary and end is somewhere in 
>>>>>> another
>>>>>> +     * cluster, we have not COWed the cluster starting at start, 
>>>>>> unless
>>>>>> +     * end is also within the same cluster. So, in this case, we 
>>>>>> skip this
>>>>>> +     * first call to ocfs2_zero_range_for_truncate() truncate 
>>>>>> and move on
>>>>>> +     * to the next one.
>>>>>>        */
>>>>>> -    tmpend = (u64)osb->s_clustersize + (start & 
>>>>>> ~(osb->s_clustersize - 1));
>>>>>> -    if (tmpend > end)
>>>>>> -        tmpend = end;
>>>>>> +    if ((start & (csize - 1)) != 0) {
>>>>>> +        /*
>>>>>> +         * We want to get the byte offset of the end of the 1st
>>>>>> +         * cluster.
>>>>>> +         */
>>>>>> +        tmpend = (u64)osb->s_clustersize +
>>>>>> +            (start & ~(osb->s_clustersize - 1));
>>>>>> +        if (tmpend > end)
>>>>>> +            tmpend = end;
>>>>>>   - trace_ocfs2_zero_partial_clusters_range1((unsigned long 
>>>>>> long)start,
>>>>>> -                         (unsigned long long)tmpend);
>>>>>> +        trace_ocfs2_zero_partial_clusters_range1(
>>>>>> +            (unsigned long long)start,
>>>>>> +            (unsigned long long)tmpend);
>>>>>>   -    ret = ocfs2_zero_range_for_truncate(inode, handle, start, 
>>>>>> tmpend);
>>>>>> -    if (ret)
>>>>>> -        mlog_errno(ret);
>>>>>> +        ret = ocfs2_zero_range_for_truncate(inode, handle, start,
>>>>>> +                            tmpend);
>>>>>> +        if (ret)
>>>>>> +            mlog_errno(ret);
>>>>>> +    }
>>>>>>         if (tmpend < end) {
>>>>>>           /*
>>>>>
>>>>>
>>>>
>>>
>>
>>
>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Ocfs2-devel] [PATCH v2] ocfs2: Fix start offset to ocfs2_zero_range_for_truncate()
  2016-08-30 23:17           ` Ashish Samant
@ 2016-08-31  1:29             ` Eric Ren
  0 siblings, 0 replies; 9+ messages in thread
From: Eric Ren @ 2016-08-31  1:29 UTC (permalink / raw)
  To: ocfs2-devel

Hi Ashish,

On 08/31/2016 07:17 AM, Ashish Samant wrote:
> Hi Eric,
>
> I am able to reproduce this on 4.8.0-rc3 as well. Can you try again and issue a sync 
> between fallocate and dd?

It works!
ocfs2dev2 is not patched:
====================
ocfs2dev2:/mnt/ocfs2 # reflink -f 10MBfile reflnktest
ocfs2dev2:/mnt/ocfs2 # fallocate -p -o 0 -l 1048615 reflnktest
ocfs2dev2:/mnt/ocfs2 # sync
ocfs2dev2:/mnt/ocfs2 # dd if=10MBfile iflag=direct bs=1048576 count=1 | hexdump -C
00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00 |................|
*
1+0 records in
1+0 records out
1048576 bytes (1.0 MB, 1.0 MiB) copied, 0.0936888 s, 11.2 MB/s
00100000

while ocfs2dev1 is patched:
======================
ocfs2dev1:/mnt/ocfs2 # xfs_io -c 'pwrite -b 4k 0 10M' -f 10MBfile
wrote 10485760/10485760 bytes at offset 0
10 MiB, 2560 ops; 0.0000 sec (183.137 MiB/sec and 46883.0122 ops/sec)
ocfs2dev1:/mnt/ocfs2 # reflink -f 10MBfile reflnktest
ocfs2dev1:/mnt/ocfs2 # fallocate -p -o 0 -l 1048615 reflnktest
ocfs2dev1:/mnt/ocfs2 # sync
ocfs2dev1:/mnt/ocfs2 # dd if=10MBfile iflag=direct bs=1048576 count=1 | hexdump -C
00000000  cd cd cd cd cd cd cd cd  cd cd cd cd cd cd cd cd |................|
*
1+0 records in
1+0 records out
1048576 bytes (1.0 MB, 1.0 MiB) copied, 0.0933082 s, 11.2 MB/s
00100000

>
> On 08/30/2016 12:38 AM, Eric Ren wrote:
>> Hi,
>>
>> I'm on 4.8.0-rc3 kernel. Hope someone else can double-confirm this;-)
>>
>> On 08/30/2016 12:11 PM, Ashish Samant wrote:
>>> Hmm, thats weird. I see this on 4.7 kernel without the patch:
>>>
>>> # xfs_io -c 'pwrite -b 4k 0 10M' -f 10MBfile
>>> wrote 10485760/10485760 bytes at offset 0
>>> 10 MiB, 2560 ops; 0.0000 sec (683.995 MiB/sec and 175102.5992 ops/sec)
>>> # reflink -f 10MBfile reflnktest
>>> # fallocate -p -o 0 -l 1048615 reflnktest
>>> # dd if=10MBfile iflag=direct bs=1048576 count=1 | hexdump -C
>>> 00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00 |................|
>>> *
>>> 1+0 records in
>>> 1+0 records out
>>> 1048576 bytes (1.0 MB) copied, 0.0321517 s, 32.6 MB/s
>>> 00100000
>>>
>>> and with patch
>>> ----
>>> # dd if=10MBfile iflag=direct bs=1M count=1 | hexdump -C
>>> 00000000  cd cd cd cd cd cd cd cd  cd cd cd cd cd cd cd cd |................|
>>
>> I'm not familiar with this code.  So why is the output "cd ..."? because we didn't write 
>> anything
>> into "10MBfile". Is it a magic number when reading from a hole?
> No, "cd" is what xfs_io wrote into the file. Those are the original contents of the file 
> which are overwritten by 0 in the first cluster because of this bug.

Ah, gotcha, thanks!

Eric

>
> Thanks,
> Ashish
>>
>> Eric
>>
>>> *
>>> 1+0 records in
>>> 1+0 records out
>>> 00100000
>>
>>
>>
>>>
>>> Thanks,
>>> Ashish
>>>
>>>
>>> On 08/29/2016 08:33 PM, Eric Ren wrote:
>>>> Hello,
>>>>
>>>> On 08/30/2016 03:23 AM, Ashish Samant wrote:
>>>>> Hi Eric,
>>>>>
>>>>> The easiest way to reproduce this is :
>>>>>
>>>>> 1. Create a random file of say 10 MB
>>>>>     xfs_io -c 'pwrite -b 4k 0 10M' -f 10MBfile
>>>>> 2. Reflink  it
>>>>>     reflink -f 10MBfile reflnktest
>>>>> 3. Punch a hole at starting at cluster boundary  with range greater that 1MB. You can 
>>>>> also use a range that will put the end offset in another extent.
>>>>>     fallocate -p -o 0 -l 1048615 reflnktest
>>>>> 4. sync
>>>>> 5. Check the  first cluster in the source file. (It will be zeroed out).
>>>>>    dd if=10MBfile iflag=direct bs=<cluster size> count=1 | hexdump -C
>>>>
>>>> Thanks! I have a try myself, but I'm not sure what is our expected output and if the 
>>>> test result meet
>>>> it:
>>>>
>>>> 1. After applying this patch:
>>>> ocfs2dev1:/mnt/ocfs2 # rm 10MBfile reflnktest
>>>> ocfs2dev1:/mnt/ocfs2 # xfs_io -c 'pwrite -b 4k 0 10M' -f 10MBfile
>>>> wrote 10485760/10485760 bytes at offset 0
>>>> 10 MiB, 2560 ops; 0.0000 sec (1.089 GiB/sec and 285427.5839 ops/sec)
>>>> ocfs2dev1:/mnt/ocfs2 # reflink -f 10MBfile reflnktest
>>>> ocfs2dev1:/mnt/ocfs2 # fallocate -p -o 0 -l 1048615 reflnktest
>>>> ocfs2dev1:/mnt/ocfs2 # dd if=10MBfile iflag=direct bs=1048576 count=1 | hexdump -C
>>>> 00000000  cd cd cd cd cd cd cd cd  cd cd cd cd cd cd cd cd |................|
>>>> *
>>>> 1+0 records in
>>>> 1+0 records out
>>>> 1048576 bytes (1.0 MB, 1.0 MiB) copied, 0.0952464 s, 11.0 MB/s
>>>> 00100000
>>>>
>>>> 2. Before this patch:
>>>> ....
>>>> ocfs2dev1:/mnt/ocfs2 # dd if=10MBfile iflag=direct bs=1048576 count=1 | hexdump -C
>>>> 00000000  cd cd cd cd cd cd cd cd  cd cd cd cd cd cd cd cd |................|
>>>> *
>>>> 1+0 records in
>>>> 1+0 records out
>>>> 1048576 bytes (1.0 MB, 1.0 MiB) copied, 0.0954648 s, 11.0 MB/s
>>>> 00100000
>>>>
>>>> 3. debugfs.ocfs2 -R stats /dev/sdb
>>>> ...
>>>> Block Size Bits: 12   Cluster Size Bits: 20
>>>> ...
>>>>
>>>> Eric
>>>>>
>>>>> Thanks,
>>>>> Ashish
>>>>>
>>>>> On 08/28/2016 10:39 PM, Eric Ren wrote:
>>>>>> Hi,
>>>>>>
>>>>>> Thanks for this fix. I'd like to reproduce this issue locally and test this patch,
>>>>>> could you elaborate the detailed steps of reproduction?
>>>>>>
>>>>>> Thanks,
>>>>>> Eric
>>>>>>
>>>>>> On 08/27/2016 07:04 AM, Ashish Samant wrote:
>>>>>>> If we punch a hole on a reflink such that following conditions are met:
>>>>>>>
>>>>>>> 1. start offset is on a cluster boundary
>>>>>>> 2. end offset is not on a cluster boundary
>>>>>>> 3. (end offset is somewhere in another extent) or
>>>>>>>     (hole range > MAX_CONTIG_BYTES(1MB)),
>>>>>>>
>>>>>>> we dont COW the first cluster starting at the start offset. But in this
>>>>>>> case, we were wrongly passing this cluster to
>>>>>>> ocfs2_zero_range_for_truncate() to zero out. This will modify the cluster
>>>>>>> in place and zero it in the source too.
>>>>>>>
>>>>>>> Fix this by skipping this cluster in such a scenario.
>>>>>>>
>>>>>>> Reported-by: Saar Maoz <saar.maoz@oracle.com>
>>>>>>> Signed-off-by: Ashish Samant <ashish.samant@oracle.com>
>>>>>>> Reviewed-by: Srinivas Eeda <srinivas.eeda@oracle.com>
>>>>>>> ---
>>>>>>> v1->v2:
>>>>>>> -Changed the commit msg to include a better and generic description of
>>>>>>>   the problem, for all cluster sizes.
>>>>>>> -Added Reported-by and Reviewed-by tags.
>>>>>>>      fs/ocfs2/file.c | 34 ++++++++++++++++++++++++----------
>>>>>>>   1 file changed, 24 insertions(+), 10 deletions(-)
>>>>>>>
>>>>>>> diff --git a/fs/ocfs2/file.c b/fs/ocfs2/file.c
>>>>>>> index 4e7b0dc..0b055bf 100644
>>>>>>> --- a/fs/ocfs2/file.c
>>>>>>> +++ b/fs/ocfs2/file.c
>>>>>>> @@ -1506,7 +1506,8 @@ static int ocfs2_zero_partial_clusters(struct inode *inode,
>>>>>>>                          u64 start, u64 len)
>>>>>>>   {
>>>>>>>       int ret = 0;
>>>>>>> -    u64 tmpend, end = start + len;
>>>>>>> +    u64 tmpend = 0;
>>>>>>> +    u64 end = start + len;
>>>>>>>       struct ocfs2_super *osb = OCFS2_SB(inode->i_sb);
>>>>>>>       unsigned int csize = osb->s_clustersize;
>>>>>>>       handle_t *handle;
>>>>>>> @@ -1538,18 +1539,31 @@ static int ocfs2_zero_partial_clusters(struct inode *inode,
>>>>>>>       }
>>>>>>>         /*
>>>>>>> -     * We want to get the byte offset of the end of the 1st cluster.
>>>>>>> +     * If start is on a cluster boundary and end is somewhere in another
>>>>>>> +     * cluster, we have not COWed the cluster starting at start, unless
>>>>>>> +     * end is also within the same cluster. So, in this case, we skip this
>>>>>>> +     * first call to ocfs2_zero_range_for_truncate() truncate and move on
>>>>>>> +     * to the next one.
>>>>>>>        */
>>>>>>> -    tmpend = (u64)osb->s_clustersize + (start & ~(osb->s_clustersize - 1));
>>>>>>> -    if (tmpend > end)
>>>>>>> -        tmpend = end;
>>>>>>> +    if ((start & (csize - 1)) != 0) {
>>>>>>> +        /*
>>>>>>> +         * We want to get the byte offset of the end of the 1st
>>>>>>> +         * cluster.
>>>>>>> +         */
>>>>>>> +        tmpend = (u64)osb->s_clustersize +
>>>>>>> +            (start & ~(osb->s_clustersize - 1));
>>>>>>> +        if (tmpend > end)
>>>>>>> +            tmpend = end;
>>>>>>>   - trace_ocfs2_zero_partial_clusters_range1((unsigned long long)start,
>>>>>>> -                         (unsigned long long)tmpend);
>>>>>>> +        trace_ocfs2_zero_partial_clusters_range1(
>>>>>>> +            (unsigned long long)start,
>>>>>>> +            (unsigned long long)tmpend);
>>>>>>>   -    ret = ocfs2_zero_range_for_truncate(inode, handle, start, tmpend);
>>>>>>> -    if (ret)
>>>>>>> -        mlog_errno(ret);
>>>>>>> +        ret = ocfs2_zero_range_for_truncate(inode, handle, start,
>>>>>>> +                            tmpend);
>>>>>>> +        if (ret)
>>>>>>> +            mlog_errno(ret);
>>>>>>> +    }
>>>>>>>         if (tmpend < end) {
>>>>>>>           /*
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>>
>>
>
>

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2016-08-31  1:29 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-08-26 23:04 [Ocfs2-devel] [PATCH v2] ocfs2: Fix start offset to ocfs2_zero_range_for_truncate() Ashish Samant
2016-08-29  5:39 ` Eric Ren
2016-08-29 19:23   ` Ashish Samant
2016-08-30  1:09     ` Junxiao Bi
2016-08-30  3:33     ` Eric Ren
2016-08-30  4:11       ` Ashish Samant
2016-08-30  7:38         ` Eric Ren
2016-08-30 23:17           ` Ashish Samant
2016-08-31  1:29             ` Eric Ren

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.