Thank you for the explanation of LLOG and changelog.  With respect to the
following statement:

*>> Lustre has its own mechanisms to guarantee that transactions are
committed to disk and to handle crashes. Basically, I/Os are not
acknowledged to Lustre clients before the data is actually on disk. In case
of a server crash, the Lustre client will replay all non-acknowledged I/Os
to ensure none of them are lost.*
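
To make sure I read that statement correctly, here is a minimal Python
sketch of the idea as I understand it. It is a toy model, not Lustre code:
`ToyServer`, `ToyClient` and all of their methods are hypothetical names, and
I am assuming that each request gets an increasing transaction number
(`transno`), that the client keeps a request until the server reports it as
committed to disk, and that replay happens in `transno` order.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ToyServer:
    """Toy stand-in for a server target (hypothetical, not a Lustre API)."""
    next_transno: int = 1
    in_memory: dict = field(default_factory=dict)  # executed, not yet on disk
    on_disk: dict = field(default_factory=dict)    # survives a crash

    def execute(self, op: str, transno: Optional[int] = None) -> int:
        """Run an operation in memory; a replay reuses its original transno."""
        if transno is None:
            transno = self.next_transno
        self.next_transno = max(self.next_transno, transno + 1)
        self.in_memory[transno] = op
        return transno

    def commit(self) -> int:
        """Flush executed operations to disk; return the last committed transno."""
        self.on_disk.update(self.in_memory)
        self.in_memory.clear()
        return max(self.on_disk, default=0)

    def crash(self) -> int:
        """Lose uncommitted state; return the last committed transno."""
        self.in_memory.clear()
        last = max(self.on_disk, default=0)
        self.next_transno = last + 1
        return last


@dataclass
class ToyClient:
    """Toy client that remembers requests until they are known to be on disk."""
    pending: dict = field(default_factory=dict)  # transno -> operation

    def send(self, server: ToyServer, op: str) -> int:
        transno = server.execute(op)
        self.pending[transno] = op
        return transno

    def on_commit(self, last_committed: int) -> None:
        """Forget every request the server reports as committed."""
        for transno in [t for t in self.pending if t <= last_committed]:
            del self.pending[transno]

    def replay(self, server: ToyServer, last_committed: int) -> None:
        """Resend, in transno order, everything not known to be on disk."""
        self.on_commit(last_committed)
        for transno in sorted(self.pending):
            server.execute(self.pending[transno], transno=transno)


server, client = ToyServer(), ToyClient()
client.send(server, "mkdir dir1")            # transno 1
client.on_commit(server.commit())            # transno 1 is on disk, client forgets it
client.send(server, "create dir1/file100")   # transno 2, in server memory only
last = server.crash()                        # server loses transno 2
client.replay(server, last)                  # client resends transno 2
client.on_commit(server.commit())            # now both operations are on disk
print(sorted(server.on_disk.items()))
```

In this toy model nothing is lost because the client still holds every
request that the server had not yet reported as committed.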

For example:

Let us say that I have 4 clients (cli1, cli2, cli3 and cli4) and all are
writing and reading data.  I have 1 host with 4 disks (2 OSTs, 1 MDT, 1
MGT).

   1. cli1 issues a directory remove (rm -rf /mnt/lustre/dir1).
   2. cli1 loses its connection to the Lustre targets.
   3. cli2 now wants to create the file /mnt/lustre/dir1/file100 and
   write some data to it.

All of these are happening in parallel.

   - Does cli2 get an error saying that /mnt/lustre/dir1 has been removed,
   so that it first has to issue additional I/O to create /mnt/lustre/dir1
   before reissuing the I/O to write file100?
   - If cli2's transaction is applied before cli1's, this would lead to
   data loss for cli2 when it tries to read from or write to file100 some
   time later.
   - What is the role of the last_rcvd file in this entire picture?

I am trying to get a 30,000 ft overview of how Lustre replay/recovery works.

Thanks again; I appreciate your timely response.

On Fri, Jan 29, 2021 at 1:22 AM Degremont, Aurelien <degremoa@amazon.com>
wrote:

> Hi,
>
>
>
> This is not totally correct.
>
>
>
> First, LLOG is the underlying technology used to store and handle Lustre
> Changelogs. But LLOG is also used for other Lustre mechanisms, such as
> Lustre configuration.
>
> Second, Changelog is similar to an audit feature. Changelog only logs
> filesystem changes, mostly metadata changes, and definitely not file
> content changes. Changelogs play no role at all in transactions or
> failure recovery. This is only an admin feature.
>
>
>
> Finally, ZIL indeed cannot be used, and Lustre has its own mechanisms to
> guarantee that transactions are committed to disk and to handle crashes.
> Basically, I/Os are not acknowledged to Lustre clients before the data is
> actually on disk. In case of a server crash, the Lustre client will replay
> all non-acknowledged I/Os to ensure none of them are lost.
>
>
>
> Changelog is not needed in your case.
>
>
>
> Aurélien
>
>
>
> *From: *lustre-devel <lustre-devel-bounces@lists.lustre.org> on behalf of
> Sudheendra Sampath <sudheendra.sampath@gmail.com>
> *Date: *Thursday, January 28, 2021 at 21:43
> *To: *"lustre-devel@lists.lustre.org" <lustre-devel@lists.lustre.org>
> *Subject: *[EXTERNAL] [lustre-devel] Lustre log question(s)
>
>
>
> Hi,
>
>
>
> I am trying to evaluate osd-zfs based MDS and OST deployment on a 2 node
> setup.
>
>
>
> I have the following questions about Lustre log:
>
> 1.       *Are changelog and llog the same, in the sense that they are
> synonymous with each other?*
>
> 2.       I understand that ZIL is currently not supported in Lustre
> version 2.12.2.  My questions are:
>
>    a.    My understanding is that transactions (in general) need some
> logging mechanism to work in 'all or none' scenarios.  Please
> correct me if my understanding is incorrect.   I understand that changelog
> has to be enabled so that filesystem changes are recorded to be replayed
> after a crash.  *How do Lustre transactions work if there is no intent
> log/changelog?*
>
>    b.    Does it mean that if changelog is NOT enabled and there is a
> crash, we risk losing all changes/updates to the filesystem?
>
> I appreciate your timely response. Thank you for your help.
>
>
>
> --
>
> Regards
>
> Sudheendra Sampath
>


-- 
Regards

Sudheendra Sampath