* [PATCH v3] ceph: do not update snapshot context when there is no new snapshot
@ 2022-02-19 6:28 xiubli
2022-02-19 12:55 ` Jeff Layton
2022-02-21 16:43 ` Luís Henriques
0 siblings, 2 replies; 4+ messages in thread
From: xiubli @ 2022-02-19 6:28 UTC (permalink / raw)
To: jlayton; +Cc: idryomov, vshankar, ceph-devel, Xiubo Li
From: Xiubo Li <xiubli@redhat.com>
We will only track the uppermost parent snapshot realm from which we
need to rebuild the snapshot contexts _downward_ in the hierarchy. For
all the other realms that have no new snapshot, we will do nothing.
This fix will avoid calling ceph_queue_cap_snap() on some inodes
inappropriately. For example, with the code in mainline, suppose there
are 2 directory hierarchies (with 6 directories total), like this:
/dir_X1/dir_X2/dir_X3/
/dir_Y1/dir_Y2/dir_Y3/
First, make a snapshot under /dir_X1/dir_X2/.snap/snap_X2, then make a
root snapshot under /.snap/root_snap. Every time we make a snapshot under
/dir_Y1/..., the kclient will always try to rebuild the snap context for
the snap_X2 realm and will then try to queue cap snaps for dir_Y2 and
dir_Y3, which makes no sense.
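(For reference, a CephFS snapshot is just a mkdir under the hidden .snap
directory, so the scenario above can be reproduced with something like the
sketch below; the /mnt/cephfs mount point and the snap_Y1 name are only
assumptions for illustration.)

  #include <sys/stat.h>

  int main(void)
  {
          /* snapshot covering the X hierarchy only */
          mkdir("/mnt/cephfs/dir_X1/dir_X2/.snap/snap_X2", 0755);

          /* snapshot covering the whole tree */
          mkdir("/mnt/cephfs/.snap/root_snap", 0755);

          /*
           * Each later snapshot under the Y hierarchy used to trigger a
           * needless rebuild of the snap_X2 realm (error handling omitted).
           */
          mkdir("/mnt/cephfs/dir_Y1/.snap/snap_Y1", 0755);

          return 0;
  }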
That's because snap_X2's seq is 2 and root_snap's seq is 3. So when
creating a new snapshot under /dir_Y1/..., the new seq will be 4, and
the mds will send the kclient a snapshot backtrace in _downward_
order: seqs 4, 3.
When ceph_update_snap_trace() is called, it will always rebuild starting
from the last realm in the trace, which is the root_snap realm. So later,
when rebuilding the snap context, the current logic will always rebuild
the snap_X2 realm as well and then try to queue cap snaps for all the
inodes related to that realm, even though that's not necessary.
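The cost comes from rebuild_snap_realms(), which rebuilds the context for
the realm it is given and then walks all of its descendants, roughly like
the simplified sketch below (recursion shown for clarity; this is not meant
to be the exact mainline code):

  static void rebuild_snap_realms(struct ceph_snap_realm *realm,
                                  struct list_head *dirty_realms)
  {
          struct ceph_snap_realm *child;

          /*
           * Rebuild realm->cached_context and put the realm on dirty_realms;
           * cap snaps are queued later for every inode in each dirty realm.
           */
          build_snap_context(realm, dirty_realms);

          list_for_each_entry(child, &realm->children, child_item)
                  rebuild_snap_realms(child, dirty_realms);
  }

So starting that walk at the root realm always drags the unrelated dir_X
realms and their inodes back in.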
This is accompanied by a lot of these sorts of dout messages:
"ceph: queue_cap_snap 00000000a42b796b nothing dirty|writing"
Fix the logic to avoid this situation.
Also, the word 'invalidate' is not precise here. What actually happens is
that existing snapshot contexts are rebuilt, or missing ones are built from
scratch. Rename it to 'rebuild_snapcs'.
URL: https://tracker.ceph.com/issues/44100
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
V3:
- Fixed the crash issue reproduced by Luís.
fs/ceph/snap.c | 28 +++++++++++++++++++---------
1 file changed, 19 insertions(+), 9 deletions(-)
diff --git a/fs/ceph/snap.c b/fs/ceph/snap.c
index 32e246138793..25a29304b74d 100644
--- a/fs/ceph/snap.c
+++ b/fs/ceph/snap.c
@@ -736,7 +736,8 @@ int ceph_update_snap_trace(struct ceph_mds_client *mdsc,
__le64 *prior_parent_snaps; /* encoded */
struct ceph_snap_realm *realm = NULL;
struct ceph_snap_realm *first_realm = NULL;
- int invalidate = 0;
+ struct ceph_snap_realm *realm_to_rebuild = NULL;
+ int rebuild_snapcs;
int err = -ENOMEM;
LIST_HEAD(dirty_realms);
@@ -744,6 +745,7 @@ int ceph_update_snap_trace(struct ceph_mds_client *mdsc,
dout("update_snap_trace deletion=%d\n", deletion);
more:
+ rebuild_snapcs = 0;
ceph_decode_need(&p, e, sizeof(*ri), bad);
ri = p;
p += sizeof(*ri);
@@ -767,7 +769,7 @@ int ceph_update_snap_trace(struct ceph_mds_client *mdsc,
err = adjust_snap_realm_parent(mdsc, realm, le64_to_cpu(ri->parent));
if (err < 0)
goto fail;
- invalidate += err;
+ rebuild_snapcs += err;
if (le64_to_cpu(ri->seq) > realm->seq) {
dout("update_snap_trace updating %llx %p %lld -> %lld\n",
@@ -792,22 +794,30 @@ int ceph_update_snap_trace(struct ceph_mds_client *mdsc,
if (realm->seq > mdsc->last_snap_seq)
mdsc->last_snap_seq = realm->seq;
- invalidate = 1;
+ rebuild_snapcs = 1;
} else if (!realm->cached_context) {
dout("update_snap_trace %llx %p seq %lld new\n",
realm->ino, realm, realm->seq);
- invalidate = 1;
+ rebuild_snapcs = 1;
} else {
dout("update_snap_trace %llx %p seq %lld unchanged\n",
realm->ino, realm, realm->seq);
}
- dout("done with %llx %p, invalidated=%d, %p %p\n", realm->ino,
- realm, invalidate, p, e);
+ dout("done with %llx %p, rebuild_snapcs=%d, %p %p\n", realm->ino,
+ realm, rebuild_snapcs, p, e);
- /* invalidate when we reach the _end_ (root) of the trace */
- if (invalidate && p >= e)
- rebuild_snap_realms(realm, &dirty_realms);
+ /*
+ * this will always track the uppest parent realm from which
+ * we need to rebuild the snapshot contexts _downward_ in
+ * hierarchy.
+ */
+ if (rebuild_snapcs)
+ realm_to_rebuild = realm;
+
+ /* rebuild_snapcs when we reach the _end_ (root) of the trace */
+ if (realm_to_rebuild && p >= e)
+ rebuild_snap_realms(realm_to_rebuild, &dirty_realms);
if (!first_realm)
first_realm = realm;
--
2.27.0
* Re: [PATCH v3] ceph: do not update snapshot context when there is no new snapshot
2022-02-19 6:28 [PATCH v3] ceph: do not update snapshot context when there is no new snapshot xiubli
@ 2022-02-19 12:55 ` Jeff Layton
2022-02-21 16:43 ` Luís Henriques
1 sibling, 0 replies; 4+ messages in thread
From: Jeff Layton @ 2022-02-19 12:55 UTC (permalink / raw)
To: xiubli; +Cc: idryomov, vshankar, ceph-devel
On Sat, 2022-02-19 at 14:28 +0800, xiubli@redhat.com wrote:
> From: Xiubo Li <xiubli@redhat.com>
>
> We will only track the uppermost parent snapshot realm from which we
> need to rebuild the snapshot contexts _downward_ in the hierarchy. For
> all the other realms that have no new snapshot, we will do nothing.
>
> [...]
Looks good, Xiubo. Dropped v2 and merged this one.
Thanks,
--
Jeff Layton <jlayton@kernel.org>
* Re: [PATCH v3] ceph: do not update snapshot context when there is no new snapshot
2022-02-19 6:28 [PATCH v3] ceph: do not update snapshot context when there is no new snapshot xiubli
2022-02-19 12:55 ` Jeff Layton
@ 2022-02-21 16:43 ` Luís Henriques
2022-02-22 0:23 ` Xiubo Li
1 sibling, 1 reply; 4+ messages in thread
From: Luís Henriques @ 2022-02-21 16:43 UTC (permalink / raw)
To: xiubli; +Cc: jlayton, idryomov, vshankar, ceph-devel
xiubli@redhat.com writes:
> From: Xiubo Li <xiubli@redhat.com>
>
> [...]
>
> ---
>
> V3:
> - Fixed the crash issue reproduced by Luís.
Thanks, I can confirm I'm no longer seeing this issue.
Cheers,
--
Luís
* Re: [PATCH v3] ceph: do not update snapshot context when there is no new snapshot
2022-02-21 16:43 ` Luís Henriques
@ 2022-02-22 0:23 ` Xiubo Li
0 siblings, 0 replies; 4+ messages in thread
From: Xiubo Li @ 2022-02-22 0:23 UTC (permalink / raw)
To: Luís Henriques; +Cc: jlayton, idryomov, vshankar, ceph-devel
On 2/22/22 12:43 AM, Luís Henriques wrote:
> xiubli@redhat.com writes:
>
>> From: Xiubo Li <xiubli@redhat.com>
>>
>> [...]
>>
>> V3:
>> - Fixed the crash issue reproduced by Luís.
> Thanks, I can confirm I'm no longer seeing this issue.
Cool, thanks Luis.
- Xiubo
>
> Cheers,