From: Eric Ren <zren@suse.com>
To: ocfs2-devel@oss.oracle.com
Subject: [Ocfs2-devel] [PATCH v2] ocfs2: Fix start offset to ocfs2_zero_range_for_truncate()
Date: Wed, 31 Aug 2016 09:29:26 +0800
Message-ID: <021c9340-4bbd-1f9c-6abc-d4bc783ab042@suse.com>
In-Reply-To: <57C61407.6060509@oracle.com>

Hi Ashish,

On 08/31/2016 07:17 AM, Ashish Samant wrote:
> Hi Eric,
>
> I am able to reproduce this on 4.8.0-rc3 as well. Can you try again and issue a sync 
> between fallocate and dd?

It works!
ocfs2dev2 is not patched:
====================
ocfs2dev2:/mnt/ocfs2 # reflink -f 10MBfile reflnktest
ocfs2dev2:/mnt/ocfs2 # fallocate -p -o 0 -l 1048615 reflnktest
ocfs2dev2:/mnt/ocfs2 # sync
ocfs2dev2:/mnt/ocfs2 # dd if=10MBfile iflag=direct bs=1048576 count=1 | hexdump -C
00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00 |................|
*
1+0 records in
1+0 records out
1048576 bytes (1.0 MB, 1.0 MiB) copied, 0.0936888 s, 11.2 MB/s
00100000

while ocfs2dev1 is patched:
======================
ocfs2dev1:/mnt/ocfs2 # xfs_io -c 'pwrite -b 4k 0 10M' -f 10MBfile
wrote 10485760/10485760 bytes at offset 0
10 MiB, 2560 ops; 0.0000 sec (183.137 MiB/sec and 46883.0122 ops/sec)
ocfs2dev1:/mnt/ocfs2 # reflink -f 10MBfile reflnktest
ocfs2dev1:/mnt/ocfs2 # fallocate -p -o 0 -l 1048615 reflnktest
ocfs2dev1:/mnt/ocfs2 # sync
ocfs2dev1:/mnt/ocfs2 # dd if=10MBfile iflag=direct bs=1048576 count=1 | hexdump -C
00000000  cd cd cd cd cd cd cd cd  cd cd cd cd cd cd cd cd |................|
*
1+0 records in
1+0 records out
1048576 bytes (1.0 MB, 1.0 MiB) copied, 0.0933082 s, 11.2 MB/s
00100000
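
For readers following the offsets, the sketch below works through why "-o 0 -l 1048615" hits the reported case. It assumes the 1 MiB cluster size shown by the debugfs.ocfs2 output quoted further down (Cluster Size Bits: 20). The helper names are hypothetical, this is user-space C rather than ocfs2 code, and it checks only the "hole range > 1MB" half of condition 3 from the commit message, because extent layout is not modeled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Byte offset of 'off' inside its cluster; csize must be a power of two. */
static uint64_t offset_in_cluster(uint64_t off, uint64_t csize)
{
    assert(csize != 0 && (csize & (csize - 1)) == 0);
    return off & (csize - 1);
}

/*
 * Hypothetical check for the conditions listed in the commit message:
 *   1. start offset is on a cluster boundary
 *   2. end offset is not on a cluster boundary
 *   3. hole range is larger than max_contig bytes
 * The "end offset is in another extent" alternative of condition 3
 * is NOT modeled.
 */
static bool matches_reported_case(uint64_t start, uint64_t len,
                                  uint64_t csize, uint64_t max_contig)
{
    uint64_t end = start + len;

    return offset_in_cluster(start, csize) == 0 &&
           offset_in_cluster(end, csize) != 0 &&
           len > max_contig;
}
```

With start 0, length 1048615 and a 1048576-byte cluster, the end offset lands 39 bytes into the second cluster, so matches_reported_case() returns true for the fallocate call used above.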

>
> On 08/30/2016 12:38 AM, Eric Ren wrote:
>> Hi,
>>
>> I'm on 4.8.0-rc3 kernel. Hope someone else can double-confirm this;-)
>>
>> On 08/30/2016 12:11 PM, Ashish Samant wrote:
>>> Hmm, that's weird. I see this on a 4.7 kernel without the patch:
>>>
>>> # xfs_io -c 'pwrite -b 4k 0 10M' -f 10MBfile
>>> wrote 10485760/10485760 bytes at offset 0
>>> 10 MiB, 2560 ops; 0.0000 sec (683.995 MiB/sec and 175102.5992 ops/sec)
>>> # reflink -f 10MBfile reflnktest
>>> # fallocate -p -o 0 -l 1048615 reflnktest
>>> # dd if=10MBfile iflag=direct bs=1048576 count=1 | hexdump -C
>>> 00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00 |................|
>>> *
>>> 1+0 records in
>>> 1+0 records out
>>> 1048576 bytes (1.0 MB) copied, 0.0321517 s, 32.6 MB/s
>>> 00100000
>>>
>>> and with patch
>>> ----
>>> # dd if=10MBfile iflag=direct bs=1M count=1 | hexdump -C
>>> 00000000  cd cd cd cd cd cd cd cd  cd cd cd cd cd cd cd cd |................|
>>
>> I'm not familiar with this code. Why is the output "cd ..."? We didn't write
>> anything into "10MBfile". Is it a magic number returned when reading from a hole?
> No, "cd" is what xfs_io wrote into the file. Those are the original contents of the file,
> which are overwritten with zeros in the first cluster because of this bug.

Ah, gotcha, thanks!

Eric
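
The hexdump check used throughout this thread boils down to asking whether the first cluster is still filled with the 0xcd bytes that xfs_io wrote, or has been replaced with zeros. A minimal sketch of that check in user-space C follows; filled_with() is a hypothetical helper name, and reading the cluster from the file (as dd does above) is left to the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Return true when every byte of buf equals 'pattern'. This mirrors
 * a hexdump -C output that shows one line followed by "*".
 */
static bool filled_with(const unsigned char *buf, size_t len,
                        unsigned char pattern)
{
    assert(buf != NULL || len == 0);

    for (size_t i = 0; i < len; i++) {
        if (buf[i] != pattern)
            return false;
    }
    return true;
}
```

In terms of the results in this thread, the source file's first cluster satisfies filled_with(buf, len, 0xcd) on the patched node, and filled_with(buf, len, 0x00) on the unpatched one.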

>
> Thanks,
> Ashish
>>
>> Eric
>>
>>> *
>>> 1+0 records in
>>> 1+0 records out
>>> 00100000
>>
>>
>>
>>>
>>> Thanks,
>>> Ashish
>>>
>>>
>>> On 08/29/2016 08:33 PM, Eric Ren wrote:
>>>> Hello,
>>>>
>>>> On 08/30/2016 03:23 AM, Ashish Samant wrote:
>>>>> Hi Eric,
>>>>>
>>>>> The easiest way to reproduce this is:
>>>>>
>>>>> 1. Create a file of, say, 10 MB
>>>>>     xfs_io -c 'pwrite -b 4k 0 10M' -f 10MBfile
>>>>> 2. Reflink it
>>>>>     reflink -f 10MBfile reflnktest
>>>>> 3. Punch a hole starting at a cluster boundary with a range greater than 1 MB. You can
>>>>> also use a range that will put the end offset in another extent.
>>>>>     fallocate -p -o 0 -l 1048615 reflnktest
>>>>> 4. sync
>>>>> 5. Check the first cluster in the source file. (It will be zeroed out.)
>>>>>    dd if=10MBfile iflag=direct bs=<cluster size> count=1 | hexdump -C
>>>>
>>>> Thanks! I tried it myself, but I'm not sure what the expected output is or whether
>>>> the test result meets it:
>>>>
>>>> 1. After applying this patch:
>>>> ocfs2dev1:/mnt/ocfs2 # rm 10MBfile reflnktest
>>>> ocfs2dev1:/mnt/ocfs2 # xfs_io -c 'pwrite -b 4k 0 10M' -f 10MBfile
>>>> wrote 10485760/10485760 bytes at offset 0
>>>> 10 MiB, 2560 ops; 0.0000 sec (1.089 GiB/sec and 285427.5839 ops/sec)
>>>> ocfs2dev1:/mnt/ocfs2 # reflink -f 10MBfile reflnktest
>>>> ocfs2dev1:/mnt/ocfs2 # fallocate -p -o 0 -l 1048615 reflnktest
>>>> ocfs2dev1:/mnt/ocfs2 # dd if=10MBfile iflag=direct bs=1048576 count=1 | hexdump -C
>>>> 00000000  cd cd cd cd cd cd cd cd  cd cd cd cd cd cd cd cd |................|
>>>> *
>>>> 1+0 records in
>>>> 1+0 records out
>>>> 1048576 bytes (1.0 MB, 1.0 MiB) copied, 0.0952464 s, 11.0 MB/s
>>>> 00100000
>>>>
>>>> 2. Before this patch:
>>>> ....
>>>> ocfs2dev1:/mnt/ocfs2 # dd if=10MBfile iflag=direct bs=1048576 count=1 | hexdump -C
>>>> 00000000  cd cd cd cd cd cd cd cd  cd cd cd cd cd cd cd cd |................|
>>>> *
>>>> 1+0 records in
>>>> 1+0 records out
>>>> 1048576 bytes (1.0 MB, 1.0 MiB) copied, 0.0954648 s, 11.0 MB/s
>>>> 00100000
>>>>
>>>> 3. debugfs.ocfs2 -R stats /dev/sdb
>>>> ...
>>>> Block Size Bits: 12   Cluster Size Bits: 20
>>>> ...
>>>>
>>>> Eric
>>>>>
>>>>> Thanks,
>>>>> Ashish
>>>>>
>>>>> On 08/28/2016 10:39 PM, Eric Ren wrote:
>>>>>> Hi,
>>>>>>
>>>>>> Thanks for this fix. I'd like to reproduce this issue locally and test this patch.
>>>>>> Could you describe the detailed steps to reproduce it?
>>>>>>
>>>>>> Thanks,
>>>>>> Eric
>>>>>>
>>>>>> On 08/27/2016 07:04 AM, Ashish Samant wrote:
>>>>>>> If we punch a hole on a reflink such that the following conditions are met:
>>>>>>>
>>>>>>> 1. start offset is on a cluster boundary
>>>>>>> 2. end offset is not on a cluster boundary
>>>>>>> 3. (end offset is somewhere in another extent) or
>>>>>>>     (hole range > MAX_CONTIG_BYTES(1MB)),
>>>>>>>
>>>>>>> we don't COW the first cluster starting at the start offset. But in this
>>>>>>> case, we were wrongly passing this cluster to
>>>>>>> ocfs2_zero_range_for_truncate() to zero out. This will modify the cluster
>>>>>>> in place and zero it in the source too.
>>>>>>>
>>>>>>> Fix this by skipping this cluster in such a scenario.
>>>>>>>
>>>>>>> Reported-by: Saar Maoz <saar.maoz@oracle.com>
>>>>>>> Signed-off-by: Ashish Samant <ashish.samant@oracle.com>
>>>>>>> Reviewed-by: Srinivas Eeda <srinivas.eeda@oracle.com>
>>>>>>> ---
>>>>>>> v1->v2:
>>>>>>> -Changed the commit msg to include a better and generic description of
>>>>>>>   the problem, for all cluster sizes.
>>>>>>> -Added Reported-by and Reviewed-by tags.
>>>>>>>      fs/ocfs2/file.c | 34 ++++++++++++++++++++++++----------
>>>>>>>   1 file changed, 24 insertions(+), 10 deletions(-)
>>>>>>>
>>>>>>> diff --git a/fs/ocfs2/file.c b/fs/ocfs2/file.c
>>>>>>> index 4e7b0dc..0b055bf 100644
>>>>>>> --- a/fs/ocfs2/file.c
>>>>>>> +++ b/fs/ocfs2/file.c
>>>>>>> @@ -1506,7 +1506,8 @@ static int ocfs2_zero_partial_clusters(struct inode *inode,
>>>>>>>                          u64 start, u64 len)
>>>>>>>   {
>>>>>>>       int ret = 0;
>>>>>>> -    u64 tmpend, end = start + len;
>>>>>>> +    u64 tmpend = 0;
>>>>>>> +    u64 end = start + len;
>>>>>>>       struct ocfs2_super *osb = OCFS2_SB(inode->i_sb);
>>>>>>>       unsigned int csize = osb->s_clustersize;
>>>>>>>       handle_t *handle;
>>>>>>> @@ -1538,18 +1539,31 @@ static int ocfs2_zero_partial_clusters(struct inode *inode,
>>>>>>>       }
>>>>>>>
>>>>>>>       /*
>>>>>>> -     * We want to get the byte offset of the end of the 1st cluster.
>>>>>>> +     * If start is on a cluster boundary and end is somewhere in another
>>>>>>> +     * cluster, we have not COWed the cluster starting at start, unless
>>>>>>> +     * end is also within the same cluster. So, in this case, we skip this
>>>>>>> +     * first call to ocfs2_zero_range_for_truncate() truncate and move on
>>>>>>> +     * to the next one.
>>>>>>>        */
>>>>>>> -    tmpend = (u64)osb->s_clustersize + (start & ~(osb->s_clustersize - 1));
>>>>>>> -    if (tmpend > end)
>>>>>>> -        tmpend = end;
>>>>>>> +    if ((start & (csize - 1)) != 0) {
>>>>>>> +        /*
>>>>>>> +         * We want to get the byte offset of the end of the 1st
>>>>>>> +         * cluster.
>>>>>>> +         */
>>>>>>> +        tmpend = (u64)osb->s_clustersize +
>>>>>>> +            (start & ~(osb->s_clustersize - 1));
>>>>>>> +        if (tmpend > end)
>>>>>>> +            tmpend = end;
>>>>>>>
>>>>>>> -    trace_ocfs2_zero_partial_clusters_range1((unsigned long long)start,
>>>>>>> -                         (unsigned long long)tmpend);
>>>>>>> +        trace_ocfs2_zero_partial_clusters_range1(
>>>>>>> +            (unsigned long long)start,
>>>>>>> +            (unsigned long long)tmpend);
>>>>>>>
>>>>>>> -    ret = ocfs2_zero_range_for_truncate(inode, handle, start, tmpend);
>>>>>>> -    if (ret)
>>>>>>> -        mlog_errno(ret);
>>>>>>> +        ret = ocfs2_zero_range_for_truncate(inode, handle, start,
>>>>>>> +                            tmpend);
>>>>>>> +        if (ret)
>>>>>>> +            mlog_errno(ret);
>>>>>>> +    }
>>>>>>>
>>>>>>>       if (tmpend < end) {
>>>>>>>           /*
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>>
>>
>
>
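
To make the effect of the quoted diff concrete, here is a user-space model of just the lines it touches: the computation of the first range handed to ocfs2_zero_range_for_truncate(). This is a sketch under assumptions, not the kernel function. struct zero_range and the first_range_old()/first_range_new() names are hypothetical, and the rest of ocfs2_zero_partial_clusters() (journal handle, early exits, the second range) is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result type: the byte range [start, end) to zero, if any. */
struct zero_range {
    bool valid;
    uint64_t start;
    uint64_t end;
};

/* Before the patch: the first range is always computed and zeroed. */
static struct zero_range first_range_old(uint64_t start, uint64_t len,
                                         uint64_t csize)
{
    uint64_t end = start + len;
    /* Byte offset of the end of the cluster that contains start. */
    uint64_t tmpend = csize + (start & ~(csize - 1));

    assert(csize != 0 && (csize & (csize - 1)) == 0);
    if (tmpend > end)
        tmpend = end;
    return (struct zero_range){ true, start, tmpend };
}

/* After the patch: skip the first range when start is cluster-aligned. */
static struct zero_range first_range_new(uint64_t start, uint64_t len,
                                         uint64_t csize)
{
    if ((start & (csize - 1)) == 0)
        return (struct zero_range){ false, 0, 0 };
    return first_range_old(start, len, csize);
}
```

For the reproducer (start 0, length 1048615, 1 MiB clusters), the old logic yields [0, 1048576), which is the whole first cluster that was never COWed, while the new logic yields no first range at all. For an unaligned start such as 4096, both versions yield [4096, 1048576).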

