Thank you for the explanation of LLOG and changelog.

With respect to the following statement:

*>> Lustre has its own mechanisms to guarantee that transactions are committed to disk and to handle crashes. Basically, I/Os are not acknowledged to Lustre clients before the data is actually on disk. In case of a server crash, the Lustre client will replay all unacknowledged I/Os to ensure none of them are lost.*

For example, let us say that I have 4 clients (cli1, cli2, cli3 and cli4), all writing and reading data. I have 1 host with 4 disks (2 OSTs, 1 MDT, 1 MGT).

1. cli1 issues a directory remove (rm -rf /mnt/lustre/dir1).
2. cli1 loses its connection to the Lustre targets.
3. cli2 now wants to create the file /mnt/lustre/dir1/file100 and write some data to it.

All of these are happening in parallel.

- Does cli2 get an error saying that /mnt/lustre/dir1 has been removed, so that it first has to issue additional I/O to recreate /mnt/lustre/dir1 before reissuing the I/O to write file100?
- If the transaction from cli2 happens before the one from cli1, does this lead to a data loss situation for cli2 when cli2 tries to read from or write to file100 some time later?
- What is the role of the last_rcvd file in this picture?

I am trying to get a 30,000 ft overview of how Lustre replay/recovery works.

Thanks again, and I appreciate your timely response.

On Fri, Jan 29, 2021 at 1:22 AM Degremont, Aurelien wrote:

> Hi,
>
> This is not totally correct.
>
> First, LLOG is the underlying technology used to store and handle Lustre
> Changelogs. But LLOG is also used for other Lustre mechanisms, like Lustre
> configuration.
>
> Second, Changelog is similar to an audit feature. Changelog only logs
> filesystem changes, mostly metadata changes, but definitely not file
> content changes. It plays no role at all in transactions or failure
> recovery. This is only an admin feature.
>
> Finally, ZIL indeed cannot be used, and Lustre has its own mechanisms to
> guarantee that transactions are committed to disk and to handle crashes.
> Basically, I/Os are not acknowledged to Lustre clients before the data is
> actually on disk. In case of a server crash, the Lustre client will replay
> all unacknowledged I/Os to ensure none of them are lost.
>
> Changelog is not needed in your case.
>
> Aurélien
>
> *From: *lustre-devel on behalf of Sudheendra Sampath
> *Date: *Thursday, January 28, 2021 at 21:43
> *To: *"lustre-devel@lists.lustre.org"
> *Subject: *[EXTERNAL] [lustre-devel] Lustre log question(s)
>
> Hi,
>
> I am trying to evaluate an osd-zfs based MDS and OST deployment on a
> 2 node setup.
>
> I have the following questions about the Lustre log:
>
> 1. *Are changelog and llog the same, in the sense that they are
>    synonymous with each other?*
> 2. I understand that ZIL is currently not supported in Lustre version
>    2.12.2. My questions are:
>    1. My understanding is that transactions (in general) need some
>       logging mechanism to work in 'all or none' scenarios. Please
>       correct me if my understanding is incorrect. I understand that
>       changelog has to be enabled so that filesystem changes are
>       recorded to be replayed after a crash. *How do Lustre
>       transactions work if there is no intent log/changelog?*
>    2. Does it mean that if changelog is NOT enabled and there is a
>       crash, we risk losing all changes/updates to the filesystem?
>
> I appreciate your timely response, and thank you for your help.
>
> --
> Regards
> Sudheendra Sampath

--
Regards
Sudheendra Sampath
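
To make the replay idea from the quoted reply concrete, here is a minimal toy model in Python. It is only a sketch under stated assumptions, not Lustre code: `ToyServer`, `ToyClient`, and all of their methods are hypothetical names made up for illustration. The assumption is that the client keeps every request it has sent until the server confirms it is on disk, and resends whatever is still unconfirmed after a server crash.

```python
from dataclasses import dataclass, field


@dataclass
class ToyServer:
    """Hypothetical server: applies requests in memory, commits them to 'disk' later."""
    next_transno: int = 1
    in_memory: dict = field(default_factory=dict)  # transno -> op, lost on crash
    on_disk: dict = field(default_factory=dict)    # transno -> op, survives a crash

    def handle(self, op, transno=None):
        # New requests get a fresh transaction number; replays keep their old one.
        if transno is None:
            transno = self.next_transno
        self.next_transno = max(self.next_transno, transno + 1)
        self.in_memory[transno] = op
        return transno

    def commit(self):
        # Flush everything to 'disk' and report the last committed number.
        self.on_disk.update(self.in_memory)
        self.in_memory.clear()
        return max(self.on_disk, default=0)

    def crash(self):
        # Everything that was not committed is gone after a restart.
        self.in_memory.clear()
        self.next_transno = max(self.on_disk, default=0) + 1


@dataclass
class ToyClient:
    """Hypothetical client: remembers requests until they are known to be on disk."""
    pending: dict = field(default_factory=dict)  # transno -> op

    def send(self, server, op):
        self.pending[server.handle(op)] = op

    def note_committed(self, last_committed):
        # Drop requests the server has confirmed as committed.
        for transno in [t for t in self.pending if t <= last_committed]:
            del self.pending[transno]

    def replay(self, server):
        # Resend unconfirmed requests in their original order.
        for transno in sorted(self.pending):
            server.handle(self.pending[transno], transno)


server, client = ToyServer(), ToyClient()
client.send(server, "mkdir dir1")
client.note_committed(server.commit())    # first request is safely on disk
client.send(server, "create dir1/file100")
server.crash()                            # second request was never committed
client.replay(server)                     # client resends it after the restart
server.commit()
print(sorted(server.on_disk.items()))
```

In this toy run the second request is lost by the crash and restored by the replay, so both operations end up on disk in their original order. How the real server tracks per-client state across a restart is outside what this sketch models.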