From: Luis Henriques <lhenriques@suse.com>
To: Nikolay Borisov <nborisov@suse.com>
Cc: Dave Chinner <david@fromorbit.com>,
	fstests@vger.kernel.org, "Yan, Zheng" <zyan@redhat.com>,
	ceph-devel@vger.kernel.org
Subject: Re: [RFC PATCH 2/2] ceph: test basic ceph.quota.max_bytes quota
Date: Wed, 03 Apr 2019 14:19:11 +0100	[thread overview]
Message-ID: <874l7fdy5s.fsf@suse.com> (raw)
In-Reply-To: <d38a4d84-8df2-984e-cf1c-045d85644796@suse.com> (Nikolay Borisov's message of "Wed, 3 Apr 2019 15:17:23 +0300")

Nikolay Borisov <nborisov@suse.com> writes:

> On 3.04.19, 12:45, Luis Henriques wrote:
>> Dave Chinner <david@fromorbit.com> writes:
>> 
>>> On Tue, Apr 02, 2019 at 11:34:28AM +0100, Luis Henriques wrote:
>>>> Simple set of checks for CephFS max_bytes directory quota implementation.
>>>>
>>>> Signed-off-by: Luis Henriques <lhenriques@suse.com>
>>>> ---
>>>>  tests/ceph/002     | 147 +++++++++++++++++++++++++++++++++++++++++++++
>>>>  tests/ceph/002.out |   1 +
>>>>  tests/ceph/group   |   1 +
>>>>  3 files changed, 149 insertions(+)
>>>>  create mode 100755 tests/ceph/002
>>>>  create mode 100644 tests/ceph/002.out
>>>>
>>>> diff --git a/tests/ceph/002 b/tests/ceph/002
>>>> new file mode 100755
>>>> index 000000000000..313865dc639e
>>>> --- /dev/null
>>>> +++ b/tests/ceph/002
>>>> @@ -0,0 +1,147 @@
>>>> +#! /bin/bash
>>>> +# SPDX-License-Identifier: GPL-2.0
>>>> +# Copyright (c) 2019 SUSE LLC. All Rights Reserved.
>>>> +#
>>>> +# FS QA Test No. 002
>>>> +#
>>>> +# This tests basic ceph.quota.max_bytes quota features.
>>>> +#
>>>> +
>>>> +seq=`basename $0`
>>>> +seqres=$RESULT_DIR/$seq
>>>> +echo "QA output created by $seq"
>>>> +
>>>> +testdir=$TEST_DIR/quota-test
>>>
>>> Try not to name local variables the same as well-known global
>>> variables. When we talk about "test dir", we mean the mount point
>>> for the test device, not the local, test-specific work directory,
>>> i.e. this is a "work dir", not a "test dir".
>>>
>>> And, often, we just name them after the test that is running,
>>> so we can identify what test left them behind. i.e.
>>>
>>> workdir=$TEST_DIR/$seq
>>>
>>>> +
>>>> +tmp=/tmp/$$
>>>> +status=1    # failure is the default!
>>>> +trap "_cleanup; exit \$status" 0 1 2 3 15
>>>> +
>>>> +_cleanup()
>>>> +{
>>>> +	cd /
>>>> +	rm -rf $tmp.*
>>>> +	rm -rf $testdir
>>>
>>> Leave it behind for post-mortem analysis if necessary, remove it
>>> before starting this test execution....
>>>
>>>> +}
>>>> +
>>>> +# get standard environment, filters and checks
>>>> +. ./common/rc
>>>> +. ./common/filter
>>>> +. ./common/attr
>>>> +
>>>> +# real QA test starts here
>>>> +_supported_os Linux
>>>> +_supported_fs ceph
>>>> +
>>>> +_require_attrs
>>>> +
>>>> +set_quota()
>>>> +{
>>>> +	val=$1
>>>> +	dir=$2
>>>> +	$SETFATTR_PROG -n ceph.quota.max_bytes -v $val $dir >/dev/null 2>&1
>>>> +}
>>>> +
>>>> +get_quota()
>>>> +{
>>>> +	dir=$1
>>>> +	$GETFATTR_PROG --only-values -n ceph.quota.max_bytes $dir 2> /dev/null
>>>> +}
>>>> +
>>>> +# function to write a file.  We use a loop because quota enforcement in
>>>> +# CephFS is "best-effort", i.e. a write may actually be allowed even if
>>>> +# the quota is being exceeded.  Using a loop reduces the chances of this
>>>> +# happening.
>>>> +#
>>>> +# NOTE: 'size' parameter is in M
>>>
>>> xfs_io accepts "1m" as one megabyte.
>>>
>>>> +write_file()
>>>> +{
>>>> +	file=$1
>>>> +	size=$2 # size in M
>>>> +	for (( i = 1; i < $size; i++ )); do
>>>> +		$XFS_IO_PROG -f -c "pwrite -W $((i * 1048576)) 1048576" \
>>>> +			     $file >/dev/null 2>&1
>>>> +	done
>>>> +}
>>>
>>> Makes no sense to me. xfs_io already does a write() loop internally
>>> for this pwrite command, in 4kB chunks - the default buffer size. If
>>> you want xfs_io to loop doing 1MB-sized pwrite() calls, then all you
>>> need is this:
>>>
>>> 	$XFS_IO_PROG -f -c "pwrite -w -B 1m 0 ${size}m" $file | _filter_xfs_io
>>>
>> 
>> Thank you for your review, Dave.  I'll make sure the next revision of
>> these tests implements all your comments... except for this one.
>> 
>> The reason I'm using a loop for writing a file is the nature of the
>> (very!) loose definition of quotas in CephFS.  Basically, clients
>> will likely write some amount of data over the configured limit because
>> the servers they are communicating with to write the data (the OSDs)
>> have no idea about the concept of quotas (or files even); the filesystem
>> view in the cluster is managed at a different level, with the help of
>> the MDS and the client itself.
>> 
>> So, the loop in this function is simply to allow the metadata associated
>> with the file to be updated while we're writing the file.  If I use a
>
> But the metadata will be modified while writing the file even with a
> single invocation of xfs_io.

No, that's not true.  It would be too expensive to keep the metadata
server updated while writing to a file.  So, making sure there's
actually an open/close of the file (plus the fsync in pwrite) helps
ensure the metadata is flushed to the MDS.

(And yes, I _did_ try to use xfs_io with the 1m loop, and the test fails
because it simply writes all the data and gets no EDQUOT error.)

Cheers,
-- 
Luis

> It's just that you are moving the loop inside xfs_io rather than
> having to invoke xfs_io a lot of times. Also, just because you are
> using a single -c "pwrite" command doesn't mean it will translate to
> a single call to pwrite. As Dave mentioned, the default block size is
> 4k, meaning:
>
> "pwrite -w -B 1m 0 ${size}m"
>
> will result in 'size / 1m' writes of size 1m, each being a distinct call
> to pwrite.
>
>> single pwrite, the whole file will be written before we get a -EDQUOT.
>> 
>> If an admin wants to really enforce some hard quotas in the filesystem,
>> there are other means to do that, but not at the filesystem level.
>> There are some more details on the quota implementation in Ceph here:
>> 
>>   http://docs.ceph.com/docs/master/cephfs/quota/
>> 
>> I hope this makes sense and helps explain why a loop is needed in
>> this test.
>> 
>> Cheers
>> 


Thread overview: 17+ messages
2019-04-02 10:34 [RFC PATCH 0/2] Initial CephFS tests Luis Henriques
2019-04-02 10:34 ` [RFC PATCH 1/2] ceph: test basic ceph.quota.max_files quota Luis Henriques
2019-04-02 10:34 ` [RFC PATCH 2/2] ceph: test basic ceph.quota.max_bytes quota Luis Henriques
2019-04-02 21:09   ` Dave Chinner
2019-04-03  9:45     ` Luis Henriques
2019-04-03 12:17       ` Nikolay Borisov
2019-04-03 13:19         ` Luis Henriques [this message]
2019-04-03 21:47           ` Dave Chinner
2019-04-04 10:18             ` Luis Henriques
2019-04-12  1:15               ` Dave Chinner
2019-04-12  3:37                 ` Yan, Zheng
2019-04-12 11:04                   ` Luis Henriques
2019-04-14 22:15                   ` Dave Chinner
2019-04-15  2:16                     ` Yan, Zheng
2019-04-16  8:13                       ` Dave Chinner
2019-04-16 10:48                         ` Luis Henriques
2019-04-16 18:38                           ` Gregory Farnum
