* Re: [lustre-devel] Lustre log question(s)
       [not found] <CABnqofzHf0GdzqiXebSzbkDHHoGwxoAyCnja7_=YqWGVOnn7jA@mail.gmail.com>
@ 2021-01-29  9:22 ` Degremont, Aurelien via lustre-devel
  2021-01-30 17:07   ` Sudheendra Sampath
  0 siblings, 1 reply; 4+ messages in thread
From: Degremont, Aurelien via lustre-devel @ 2021-01-29  9:22 UTC (permalink / raw)
  To: Sudheendra Sampath, lustre-devel



Hi,

This is not entirely correct.

First, LLOG is the underlying technology used to store and handle Lustre Changelogs. But LLOG is also used by other Lustre mechanisms, such as Lustre configuration.
Second, Changelog is similar to an audit feature. Changelogs only record filesystem changes, mostly metadata changes, and definitely not changes to file contents. They play no role at all in transactions or failure recovery. This is purely an admin feature.

Finally, ZIL indeed cannot be used, and Lustre has its own mechanisms to guarantee that transactions are committed to disk and to handle crashes. Basically, I/Os are not acknowledged to Lustre clients before the data is actually on disk. If a server crashes, the Lustre clients replay all non-acknowledged I/Os to ensure that none of them are lost.
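
To make the acknowledge-and-replay idea above concrete, here is a toy model in Python. It is only a sketch under simplifying assumptions, not Lustre code: the `Server` and `Client` classes, the transaction numbers, and every method name are hypothetical and do not correspond to any Lustre API. The server applies requests in memory and commits them later; the client keeps each request until the server reports it as committed, and resends whatever is left after a server crash.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    """Toy server: applies requests in memory and commits them to disk later."""
    committed: dict = field(default_factory=dict)  # state that survives a crash
    pending: dict = field(default_factory=dict)    # applied but not yet on disk
    last_committed: int = 0                        # highest transaction number on disk
    next_transno: int = 1

    def handle(self, key, value, transno=None):
        # A replayed request reuses its original transaction number.
        if transno is None:
            transno = self.next_transno
        self.next_transno = max(self.next_transno, transno + 1)
        self.pending[transno] = (key, value)
        return transno

    def commit(self):
        # Flush everything pending to "disk" in transaction order.
        for transno in sorted(self.pending):
            key, value = self.pending[transno]
            self.committed[key] = value
            self.last_committed = transno
        self.pending.clear()
        return self.last_committed

    def crash(self):
        # Anything that was not committed is lost on the server side.
        self.pending.clear()
        self.next_transno = self.last_committed + 1


@dataclass
class Client:
    """Toy client: keeps every request until the server reports it committed."""
    unacked: dict = field(default_factory=dict)

    def write(self, server, key, value):
        transno = server.handle(key, value)
        self.unacked[transno] = (key, value)

    def on_commit(self, last_committed):
        # Drop the requests the server has confirmed are on disk.
        self.unacked = {t: r for t, r in self.unacked.items() if t > last_committed}

    def replay(self, server):
        # After a server restart, resend unacknowledged requests in order.
        for transno in sorted(self.unacked):
            key, value = self.unacked[transno]
            server.handle(key, value, transno)


server, client = Server(), Client()
client.write(server, "a", 1)
client.on_commit(server.commit())  # "a" is on disk and acknowledged
client.write(server, "b", 2)       # applied in memory only
server.crash()                     # the server forgets "b"
client.replay(server)              # the client resends "b"
client.on_commit(server.commit())
print(server.committed)            # {'a': 1, 'b': 2}
```

The point of the sketch is that no server-side intent log is needed for this guarantee: the client's own copy of the unacknowledged requests plays that role.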

Changelog is not needed in your case.

Aurélien

From: lustre-devel <lustre-devel-bounces@lists.lustre.org> on behalf of Sudheendra Sampath <sudheendra.sampath@gmail.com>
Date: Thursday, 28 January 2021 at 21:43
To: "lustre-devel@lists.lustre.org" <lustre-devel@lists.lustre.org>
Subject: [EXTERNAL] [lustre-devel] Lustre log question(s)




Hi,

I am trying to evaluate osd-zfs based MDS and OST deployment on a 2 node setup.

I have the following questions about Lustre log:
1.       Are changelog and llog the same thing, that is, are the two terms synonymous?
2.       I understand that ZIL is currently not supported in Lustre version 2.12.2.  My questions are:
    a.       My understanding is that transactions (in general) need some logging mechanism to work in 'all or none' scenarios.  Please correct me if my understanding is incorrect.   I understand that changelog has to be enabled so that filesystem changes are recorded and can be replayed after a crash.  How do Lustre transactions work if there is no intent log/changelog?
    b.       Does it mean that if changelog is NOT enabled and there is a crash, we risk losing all changes/updates to the filesystem?
I appreciate your timely response, and thank you for your help.

--
Regards

Sudheendra Sampath


_______________________________________________
lustre-devel mailing list
lustre-devel@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-devel-lustre.org


* Re: [lustre-devel] Lustre log question(s)
  2021-01-29  9:22 ` [lustre-devel] Lustre log question(s) Degremont, Aurelien via lustre-devel
@ 2021-01-30 17:07   ` Sudheendra Sampath
  2021-02-01 18:18     ` Spitz, Cory James
  0 siblings, 1 reply; 4+ messages in thread
From: Sudheendra Sampath @ 2021-01-30 17:07 UTC (permalink / raw)
  To: Degremont, Aurelien; +Cc: lustre-devel



Thank you for the explanation of LLOG and changelog.  With respect to the
following statement:

> Lustre has its own mechanisms to guarantee that transactions are committed
> to disk and to handle crashes. Basically, I/Os are not acknowledged to
> Lustre clients before the data is actually on disk. If a server crashes,
> the Lustre clients replay all non-acknowledged I/Os to ensure that none of
> them are lost.

For example:

Let us say that I have 4 clients (cli1, cli2, cli3 and cli4) and all are
writing and reading data.  I have 1 host with 4 disks (2 OSTs, 1 MDT, 1
MGT).

   1. cli1 issues a directory remove (rm -rf /mnt/lustre/dir1)
   2. cli1 loses connection with Lustre targets.
   3. cli2 wants to now create a file under /mnt/lustre/dir1/file100 and
   write some data to file100

All of these are happening in parallel.

   - Does cli2 get an error saying that /mnt/lustre/dir1 has been removed, so
   that it first has to issue additional I/O to recreate /mnt/lustre/dir1
   before reissuing the I/O to write file100?
   - If a transaction from cli2 happens before the one from cli1, then this
   would lead to a data loss situation for cli2 if cli2 tries to read from or
   write to file100 some time later.
   - What is the role of the last_rcvd file in this entire picture?

I am trying to get a 30,000 ft overview of how Lustre replay/recovery works.

Thanks again and appreciate your timely response.


-- 
Regards

Sudheendra Sampath


* Re: [lustre-devel] Lustre log question(s)
  2021-01-30 17:07   ` Sudheendra Sampath
@ 2021-02-01 18:18     ` Spitz, Cory James
  2021-02-02  4:14       ` Andreas Dilger
  0 siblings, 1 reply; 4+ messages in thread
From: Spitz, Cory James @ 2021-02-01 18:18 UTC (permalink / raw)
  To: Sudheendra Sampath, Degremont, Aurelien; +Cc: lustre-devel



> I am trying to get a 30,000 ft overview of how lustre replay/recovery works

This old slide deck might be useful to you:
https://wiki.lustre.org/images/0/00/A_Deep_Dive_into_Lustre_Recovery_Mechanisms.pdf

Granted, it may not be 100% correct any longer.

-Cory


* Re: [lustre-devel] Lustre log question(s)
  2021-02-01 18:18     ` Spitz, Cory James
@ 2021-02-02  4:14       ` Andreas Dilger
  0 siblings, 0 replies; 4+ messages in thread
From: Andreas Dilger @ 2021-02-02  4:14 UTC (permalink / raw)
  To: Spitz, Cory James; +Cc: Sudheendra Sampath, lustre-devel



It is worth noting that the proposed scenario is racy even on local filesystems, regardless of whether recovery is involved.

If (1) and (3) are happening on two clients at the same time:
- if (1) happens first, then (3) will fail with ENOENT ("No such file or directory") because "dir1" is gone.
- if (3) happens first, then (1) may delete "file100", but that is not the filesystem's fault: the user on cli1 asked for everything in "dir1" to be deleted.
- in some cases, "file100" may be created after (1) has passed that part of the directory traversal (it depends on how large the tree under "dir1" is), and then (1) will fail the final rmdir("dir1") with ENOTEMPTY ("Directory not empty") because "file100" still exists there.  This is the classic "TOCTOU" race ("Time of Check/Time of Use").
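
The first and third outcomes can be reproduced deterministically on a local filesystem, with no Lustre involved. The Python sketch below forces each ordering by hand instead of actually racing two processes; the helper names `try_call` and `create_file` are made up for the illustration, and the paths live in a temporary directory rather than under /mnt/lustre.

```python
import errno
import os
import tempfile


def try_call(func, *args):
    """Run func(*args); return "ok" or the symbolic errno name on failure."""
    try:
        func(*args)
        return "ok"
    except OSError as exc:
        return errno.errorcode[exc.errno]


def create_file(path):
    """Create the file at path and write a few bytes to it."""
    with open(path, "w") as handle:
        handle.write("data")


with tempfile.TemporaryDirectory() as root:
    dir1 = os.path.join(root, "dir1")
    file100 = os.path.join(dir1, "file100")

    # Ordering A: the remove (1) completes first, then the create (3) runs.
    os.mkdir(dir1)
    os.rmdir(dir1)
    create_after_remove = try_call(create_file, file100)

    # Ordering C: the create (3) lands after the traversal has emptied the
    # directory but before the final rmdir() issued by (1).
    os.mkdir(dir1)
    create_file(file100)
    rmdir_after_create = try_call(os.rmdir, dir1)

print(create_after_remove, rmdir_after_create)
```

On Linux this reports ENOENT for the late create and ENOTEMPTY for the final rmdir(); POSIX also allows EEXIST for the latter, so other systems may differ.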

Cheers, Andreas

--
Andreas Dilger
Principal Lustre Architect
Whamcloud







