From: Zdenek Kabelac <zkabelac@redhat.com>
To: LVM general discussion and development <linux-lvm@redhat.com>
Subject: Re: [linux-lvm] Possible bug in expanding thinpool: lvextend doens't expand the top-level dm-linear device
Date: Mon, 4 Jan 2016 14:27:35 +0100
Message-ID: <568A7347.4070208@redhat.com>
In-Reply-To: <CAAYit8RFg1hTUvQqNxv6Uip7ZgA2Dg5KtO1RjpxqpqwGO=io0A@mail.gmail.com>

Dne 4.1.2016 v 06:08 M.H. Tsai napsal(a):
> 2016-01-03 7:05 GMT+08:00 Zdenek Kabelac <zkabelac@redhat.com>:
>> Dne 1.1.2016 v 19:10 M.H. Tsai napsal(a):
>>> 2016-01-01 5:25 GMT+08:00 Zdenek Kabelac <zkabelac@redhat.com>:
>>>> There is even sequencing problem with creating snapshot in kernel target
>>>> which needs to be probably fixed first.
>>>> (the rule here should be - to never create/allocate something when
>>>> there is suspended device
>
> Excuse me, does the statement
> 'to never create/allocate something when there is suspended device'
> describe the case where the thin-pool is full and the volume is
> 'suspended with no flush'? Because there are no free blocks for
> allocation.

The reason for this rule: you could suspend a device that holds e.g.
swap or the root filesystem. If the kernel then needs a chunk of memory
during any allocation and has to use some 'swap/root' space on the
suspended disk, the kernel would block endlessly.

So the table reload (with the updated dm table line) should always happen
before the suspend (aka the PRELOAD phase in lvm2 code).

The following device resume should then just switch tables without any
memory allocations. Those should all have been resolved in the load phase,
where you always have 2 slots: active & inactive.

(And yes - there are some (known) problems with this rule in current lvm2 and 
some dm targets...)
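
To make the ordering concrete, here is a minimal sketch of the
load -> suspend -> resume sequence using plain dmsetup. It only
illustrates the rule above and is not what lvm2 itself executes. The
helper names (run, reload_table), the device name and the table line in
the usage note are made up for the example, and DRY_RUN=1 (the default
here) only prints the commands:

```shell
#!/bin/sh
# Sketch only: with DRY_RUN=1 (the default) commands are printed, not run.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# reload_table DEVICE TABLE   (hypothetical helper)
# 1. load:    put the new table into the inactive slot; any allocation
#             needed for the new table happens here, while I/O still flows
# 2. suspend: stop I/O on the device
# 3. resume:  switch the inactive table to active and restart I/O
reload_table() {
    dev=$1
    table=$2
    run dmsetup load "$dev" --table "$table" || return 1
    run dmsetup suspend "$dev" || return 1
    run dmsetup resume "$dev"
}
```

For example, reload_table vg-lv "0 2097152 linear /dev/sdb 0" prints the
three commands in that order. The sketch does no cleanup if a step fails.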

> Otherwise, it would be strange if we cannot do these operations when
> the pool is not full.

Extension of a device is 'special'. In fact we could enable 'suspend WITHOUT
flush' for any 'lvextend' operation, but that needs full re-validation of all
targets, so for now it's only enabled for thin-pool lvextend.

'Suspend with flush' is typically needed when you change the device type in
some way. With a pure lvextend, however (only new space is added, no
existing device space changes), there cannot be any BIO in flight routed into
the 'newly extended' space, so the flush is not needed. (Unsure if this
explanation makes sense.)
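
As a rough illustration of what 'pure extension' means, the sketch below
compares an old and a new single-line dm table and picks the suspend
variant accordingly. This is NOT the check lvm2 performs. The function
names (is_pure_extension, suspend_command) are hypothetical, and it
assumes a one-segment table where only the length field grows:

```shell
#!/bin/sh
# is_pure_extension OLD_TABLE NEW_TABLE   (hypothetical helper)
# Succeeds when both one-line tables ("start length target args...") are
# identical except that the new length is larger than the old one.
is_pure_extension() {
    read -r old_start old_len old_rest <<EOF
$1
EOF
    read -r new_start new_len new_rest <<EOF
$2
EOF
    [ "$old_start" = "$new_start" ] &&
        [ "$old_rest" = "$new_rest" ] &&
        [ "$new_len" -gt "$old_len" ]
}

# suspend_command DEVICE OLD_TABLE NEW_TABLE   (hypothetical helper)
# Prints the suspend command that would fit the change: no flush is
# requested when only new space is appended, a flushing suspend otherwise.
suspend_command() {
    if is_pure_extension "$2" "$3"; then
        echo "dmsetup suspend --noflush $1"
    else
        echo "dmsetup suspend $1"
    fi
}
```

Any change to the target type or its arguments makes the comparison fail,
so the sketch falls back to the flushing suspend.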

>
>>>> and this rule is broken with current thin
>>>> snapshot creation, so thin snap create message should go in front
>>>> to ensure there is a space in thin-pool ahead of origin suspend  - will
>>>> be addressed in some future version....)
>>>>
>>>> However when taking snapshot - only origin thin LV is now suspended and
>>>> should not influence rest of thin volumes (except for thin-pool commit
>>>> points)
>>>
>>> Does that mean in future version of dm-thin, the command sequence of
>>> snapshot creation will be:
>>>
>>> dmsetup message /dev/mapper/pool 0 "create_snap 1 0"
>>> dmsetup suspend /dev/mapper/thin
>>> dmsetup resume /dev/mapper/thin
>>>
>> Possibly different message - since everything must remain
>> fully backward compatible (i.e. create_snap_on_suspend,
>> or maybe some other mechanism will be there).
>> But yes something in this direction...
>
> I don't quite understand. Is the new message designed for the case where
> the thin-pool is nearly full?
> Because the pool's free data blocks might not be sufficient for 'suspend
> with flush' (i.e., 'suspend with flush' might fail if the pool is
> nearly full), we should move the create_snap message before the
> suspend. However, the created snapshots are then inconsistent.
> If the pool is full, then there's no difference between taking
> snapshots before or after 'suspend without flush'.
> Is that right?

As said, the solution is nontrivial and needs enhancements to the
suspend API. When you suspend a 'thinLV origin' you need to use
suspend with flush; however, ATM such a suspend may 'block' the
whole of lvm2 while lvm2 keeps holding the VG lock.

As a prevention, an lvm2 user can configure a threshold for autoresize
(e.g. 70%), and when the pool is above the threshold the user is not allowed
to create any new thinLV. This normally works quite OK, but it's obviously
not a 'bullet-proof' solution (you could construct a case where the gap
between time-of-check and time-of-use leads to an out-of-space pool).
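
The threshold check can be sketched in a few lines of shell. This only
illustrates the idea and is not lvm2's internal logic. The function name
is hypothetical, and 'vg/pool' in the usage comment is a placeholder for
your own pool:

```shell
#!/bin/sh
# pool_allows_new_thin_lv DATA_PERCENT THRESHOLD   (hypothetical helper)
# Succeeds while pool usage is at or below the threshold, fails once the
# pool is above it. awk does the comparison because usage is fractional.
pool_allows_new_thin_lv() {
    awk -v used="$1" -v limit="$2" 'BEGIN { exit !(used + 0 <= limit + 0) }'
}

# Possible usage (vg/pool is a placeholder):
#   used=$(lvs --noheadings -o data_percent vg/pool)
#   pool_allows_new_thin_lv "$used" 70 || echo "pool above threshold"
```

Note that the time-of-check/time-of-use gap is still there: the pool can
fill up between reading data_percent and creating the new thinLV.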

So far the rule is simple: at all costs, do not run a thin-pool when it's
full. An overfilled pool is NOT comparable to a 'single' write error.
When the admin is solving an overfilled pool, something went wrong earlier
(the admin failed to extend his VG)....

Thin-pool is about 'promising' space the user can deliver 'later'. It is not
about hitting the overfull corner case as a 'regular' use-case where the user
can expect some well-handled error behavior (but yes, we try to make a better
user experience here).

Regards

Zdenek
