All of lore.kernel.org
 help / color / mirror / Atom feed
* UBIFS - ubifs_get_pnode.part.4: error -22 reading pnode
@ 2018-04-03  8:28 Erdrich, Frank
  2018-04-03 12:04 ` Richard Weinberger
  0 siblings, 1 reply; 6+ messages in thread
From: Erdrich, Frank @ 2018-04-03  8:28 UTC (permalink / raw)
  To: linux-mtd

Hello,

we are encountering an error on UBIFS that prevents mounting of a partition. Maybe one of you can tell me the directly the reason for that or can help me hunting this error down.

Some data of the system:

- Linux Kernel version 4.9 (unmodified)
- SAMA5D36
- Flash: Micron 29F8G0ABABA (PEB size of 512k)
- 256 MB of RAM

The error seems to occur after some power-cycles and is therefore hard to reproduce. We are seeing it also on only some of the systems.
The error looks like the following:

UBIFS (ubi2:0): background thread "ubifs_bgt2_0" started, PID 867 UBIFS (ubi2:0): recovery needed UBIFS error (ubi2:0 pid 865): ubifs_get_pnode.part.4: error -22 reading pnode at 6:188460 (pid 865) dumping pnode:
	address cd9bb7c0 parent cd9bb700 cnext 0
	flags 0 iip 0 level 0 num 0
	0: free 0 dirty 501576 flags 1 lnum 0
	1: free 0 dirty 254864 flags 1 lnum 0
	2: free 0 dirty 515672 flags 1 lnum 0
	3: free 507904 dirty 524128 flags 34 lnum 0
CPU: 0 PID: 865 Comm: mount Not tainted 4.9.51 #124 Hardware name: Atmel SAMA5 [<c0110570>] (unwind_backtrace) from [<c010d488>] (show_stack+0x20/0x24) [<c010d488>] (show_stack) from [<c0390c7c>] (dump_stack+0x24/0x28) [<c0390c7c>] (dump_stack) from [<c0319e54>] (ubifs_get_pnode.part.4+0x250/0x2b8)
[<c0319e54>] (ubifs_get_pnode.part.4) from [<c031bf08>] (ubifs_lpt_lookup_dirty+0x2f8/0x340)
[<c031bf08>] (ubifs_lpt_lookup_dirty) from [<c031f910>] (ubifs_update_one_lp+0x44/0x150) [<c031f910>] (ubifs_update_one_lp) from [<c0309540>] (ubifs_tnc_add+0x170/0x1ec) [<c0309540>] (ubifs_tnc_add) from [<c030d638>] (ubifs_replay_journal+0xfcc/0x1670)
[<c030d638>] (ubifs_replay_journal) from [<c0300380>] (ubifs_mount+0x1194/0x1e8c) [<c0300380>] (ubifs_mount) from [<c0209ea0>] (mount_fs+0x54/0x170) [<c0209ea0>] (mount_fs) from [<c0228dec>] (vfs_kern_mount+0x58/0x124) [<c0228dec>] (vfs_kern_mount) from [<c022c6e0>] (do_mount+0x1bc/0xcc0) [<c022c6e0>] (do_mount) from [<c022d548>] (SyS_mount+0x84/0xac) [<c022d548>] (SyS_mount) from [<c0109380>] (ret_fast_syscall+0x0/0x3c) UBIFS error (ubi2:0 pid 865): ubifs_get_pnode.part.4: calc num: 4 UBIFS error (ubi2:0 pid 865): ubifs_update_one_lp: cannot update properties of LEB 29, error -22 UBIFS (ubi2:0): background thread "ubifs_bgt2_0" stops

Output of ubinfo:

Volumes count:                           1
Logical eraseblock size:                 516096 bytes, 504.0 KiB
Total amount of logical eraseblocks:     140 (72253440 bytes, 68.9 MiB)
Amount of available logical eraseblocks: 0 (0 bytes)
Maximum count of volumes                 128
Count of bad physical eraseblocks:       0
Count of reserved physical eraseblocks:  40
Current maximum erase counter value:     11
Minimum input/output unit size:          4096 bytes
Character device major/minor:            247:0
Present volumes:                         0


I'm not deep enough in the ubifs system to completely understand what is happening here but for me it seems that there are more dirty data to write than the size of the LEB.
-> 3: free 507904 dirty 524128 flags 34 lnum 0
I've seen that the values are checked in validate_pnode() where the -EINVAL (-22) comes from. The question is, how can dirty data bigger that the LEB size?

Does someone have a suggestion what is happening here? Or do you need more information to rate that error?


Best Regards,
Frank Erdrich

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: UBIFS - ubifs_get_pnode.part.4: error -22 reading pnode
  2018-04-03  8:28 UBIFS - ubifs_get_pnode.part.4: error -22 reading pnode Erdrich, Frank
@ 2018-04-03 12:04 ` Richard Weinberger
       [not found]   ` <s1l9q5oc97q87j6d75pkp49b.1522781343251@email.android.com>
  0 siblings, 1 reply; 6+ messages in thread
From: Richard Weinberger @ 2018-04-03 12:04 UTC (permalink / raw)
  To: Erdrich, Frank; +Cc: linux-mtd

Frank,

On Tue, Apr 3, 2018 at 10:28 AM, Erdrich, Frank
<Frank.Erdrich@emtrion.de> wrote:
> Hello,
>
> we are encountering an error on UBIFS that prevents mounting of a partition. Maybe one of you can tell me the directly the reason for that or can help me hunting this error down.

UBIFS got confused wrt. free space accounting.
Can you tell me more on your setup? Do you use Fastmap? fscrypt? xattr?

> I'm not deep enough in the ubifs system to completely understand what is happening here but for me it seems that there are more dirty data to write than the size of the LEB.
> -> 3: free 507904 dirty 524128 flags 34 lnum 0
> I've seen that the values are checked in validate_pnode() where the -EINVAL (-22) comes from. The question is, how can dirty data bigger that the LEB size?

Yes, this can happen. That's why UBIFS does in some cases a fixup of
the used/free numbers.
These numbers are not always updated immediately and when a power-cut
happens their state might be inconsistent.

-- 
Thanks,
//richard

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: UBIFS - ubifs_get_pnode.part.4: error -22 reading pnode
       [not found]   ` <s1l9q5oc97q87j6d75pkp49b.1522781343251@email.android.com>
@ 2018-04-03 20:26     ` Frank Erdrich
  2018-04-03 21:28       ` Richard Weinberger
  0 siblings, 1 reply; 6+ messages in thread
From: Frank Erdrich @ 2018-04-03 20:26 UTC (permalink / raw)
  To: richard.weinberger; +Cc: linux-mtd

Hello Richard,

thanks for your quick response. And sorry for writing from another 
Mail-account, I hope that will not break the thread. Our setup at work 
is a little bit weird and I need an admin to send messages to a mailing 
list (disclaimer removal and stuff like that).

> Frank,
> 
> On Tue, Apr 3, 2018 at 10:28 AM, Erdrich, Frank
> <Frank.Erdrich@emtrion.de> wrote:
>  > Hello,
>  >
>  > we are encountering an error on UBIFS that prevents mounting of a 
> partition. Maybe one of you can tell me the directly the reason for that 
> or can help me hunting this error down.
> 
> UBIFS got confused wrt. free space accounting.
> Can you tell me more on your setup? Do you use Fastmap? fscrypt? xattr?

We use only extended attributes. fastmap and fscrypt are not used. That 
are the kernel options we have set:

CONFIG_MTD_UBI=y
CONFIG_MTD_UBI_WL_THRESHOLD=4096
CONFIG_MTD_UBI_BEB_LIMIT=20
# CONFIG_MTD_UBI_FASTMAP is not set
# CONFIG_MTD_UBI_GLUEBI is not set
# CONFIG_MTD_UBI_BLOCK is not set
CONFIG_UBIFS_FS=y
CONFIG_UBIFS_FS_ADVANCED_COMPR=y
CONFIG_UBIFS_FS_LZO=y
CONFIG_UBIFS_FS_ZLIB=y
# CONFIG_UBIFS_ATIME_SUPPORT is not set

The Flash itself is divided into three mtd partitions to get some sort 
of separation of critical data against other data. The partition that 
shows that error is a logging partition which is, of course, written on 
a regular base. We are not using the sync mount option at the moment. 
Writeback timers (for dirty data) are on default value.

>  > I'm not deep enough in the ubifs system to completely understand what 
> is happening here but for me it seems that there are more dirty data to 
> write than the size of the LEB.
>  > -> 3: free 507904 dirty 524128 flags 34 lnum 0
>  > I've seen that the values are checked in validate_pnode() where the 
> -EINVAL (-22) comes from. The question is, how can dirty data bigger 
> that the LEB size?
> 
> Yes, this can happen. That's why UBIFS does in some cases a fixup of
> the used/free numbers.
> These numbers are not always updated immediately and when a power-cut
> happens their state might be inconsistent.

I'm totally fine with that behaviour as a power-cut can happen anytime 
in an embedded system but I would assume that the ubifs-recovery would 
bring back the system to a operable state. Or is there an error in my 
reasoning?

Please let me know if you need more information or if I should do 
special testing.


Best regards,
Frank Erdrich

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: UBIFS - ubifs_get_pnode.part.4: error -22 reading pnode
  2018-04-03 20:26     ` Frank Erdrich
@ 2018-04-03 21:28       ` Richard Weinberger
  2018-04-04  4:58         ` Frank Erdrich
  0 siblings, 1 reply; 6+ messages in thread
From: Richard Weinberger @ 2018-04-03 21:28 UTC (permalink / raw)
  To: Frank Erdrich, linux-mtd

[-- Attachment #1: Type: text/plain, Size: 3456 bytes --]

Frank,

Am Dienstag, 3. April 2018, 22:26:32 CEST schrieb Frank Erdrich:
> Hello Richard,
> 
> thanks for your quick response. And sorry for writing from another 
> Mail-account, I hope that will not break the thread. Our setup at work 
> is a little bit weird and I need an admin to send messages to a mailing 
> list (disclaimer removal and stuff like that).

:-)
 
> > Frank,
> > 
> > On Tue, Apr 3, 2018 at 10:28 AM, Erdrich, Frank
> > <Frank.Erdrich@emtrion.de> wrote:
> >  > Hello,
> >  >
> >  > we are encountering an error on UBIFS that prevents mounting of a 
> > partition. Maybe one of you can tell me the directly the reason for that 
> > or can help me hunting this error down.
> > 
> > UBIFS got confused wrt. free space accounting.
> > Can you tell me more on your setup? Do you use Fastmap? fscrypt? xattr?
> 
> We use only extended attributes. fastmap and fscrypt are not used. That 
> are the kernel options we have set:
> 
> CONFIG_MTD_UBI=y
> CONFIG_MTD_UBI_WL_THRESHOLD=4096
> CONFIG_MTD_UBI_BEB_LIMIT=20
> # CONFIG_MTD_UBI_FASTMAP is not set
> # CONFIG_MTD_UBI_GLUEBI is not set
> # CONFIG_MTD_UBI_BLOCK is not set
> CONFIG_UBIFS_FS=y
> CONFIG_UBIFS_FS_ADVANCED_COMPR=y
> CONFIG_UBIFS_FS_LZO=y
> CONFIG_UBIFS_FS_ZLIB=y
> # CONFIG_UBIFS_ATIME_SUPPORT is not set

Good.

> The Flash itself is divided into three mtd partitions to get some sort 
> of separation of critical data against other data. The partition that 
> shows that error is a logging partition which is, of course, written on 
> a regular base. We are not using the sync mount option at the moment. 
> Writeback timers (for dirty data) are on default value.

Side note: Having multiple UBI instances on the same MTD is not a good idea.
Usually you want the wear leveling domain as large as possible.

> >  > I'm not deep enough in the ubifs system to completely understand what 
> > is happening here but for me it seems that there are more dirty data to 
> > write than the size of the LEB.
> >  > -> 3: free 507904 dirty 524128 flags 34 lnum 0
> >  > I've seen that the values are checked in validate_pnode() where the 
> > -EINVAL (-22) comes from. The question is, how can dirty data bigger 
> > that the LEB size?
> > 
> > Yes, this can happen. That's why UBIFS does in some cases a fixup of
> > the used/free numbers.
> > These numbers are not always updated immediately and when a power-cut
> > happens their state might be inconsistent.
> 
> I'm totally fine with that behaviour as a power-cut can happen anytime 
> in an embedded system but I would assume that the ubifs-recovery would 
> bring back the system to a operable state. Or is there an error in my 
> reasoning?

Of course UBIFS should be able to recover.
I suspect a very subtle but in UBIFS' xattr code.
Other bug reports point in that direction too.

Sadly so far I was not able to pinpoint the exact issue.
Therefore I started to review the code. So far I've found some odds but
I'm not 100% sure whether these cause the trouble you see.

> Please let me know if you need more information or if I should do 
> special testing.

Can you reproduce the issue?
If so, please retry with the attached patches.
They will not fix your already broken UBIFS but they (hopefully) will
make sure that the accounting problem will not happen again.

Can you also share me the broken UBIFS image?

Thanks,
//richard

-- 
sigma star gmbh - Eduard-Bodem-Gasse 6 - 6020 Innsbruck - Austria
ATU66964118 - FN 374287y

[-- Attachment #2: 0002-ubifs-Orphan-xattr-inodes-too.patch --]
[-- Type: text/x-patch, Size: 8220 bytes --]

>From 5667651fcd44f12079f7b6bd6fbc52b1c898f8a1 Mon Sep 17 00:00:00 2001
From: Richard Weinberger <richard@nod.at>
Date: Thu, 15 Feb 2018 20:18:46 +0100
Subject: [PATCH 2/2] ubifs: Orphan xattr inodes too.

WIP WIP WIP

Signed-off-by: Richard Weinberger <richard@nod.at>
---
 fs/ubifs/orphan.c | 208 ++++++++++++++++++++++++++++++++++++------------------
 fs/ubifs/ubifs.h  |   3 +
 2 files changed, 144 insertions(+), 67 deletions(-)

diff --git a/fs/ubifs/orphan.c b/fs/ubifs/orphan.c
index caf2d123e9ee..769e5cec04b3 100644
--- a/fs/ubifs/orphan.c
+++ b/fs/ubifs/orphan.c
@@ -54,30 +54,24 @@
 
 static int dbg_check_orphans(struct ubifs_info *c);
 
-/**
- * ubifs_add_orphan - add an orphan.
- * @c: UBIFS file-system description object
- * @inum: orphan inode number
- *
- * Add an orphan. This function is called when an inodes link count drops to
- * zero.
- */
-int ubifs_add_orphan(struct ubifs_info *c, ino_t inum)
+static struct ubifs_orphan *orphan_add(struct ubifs_info *c, ino_t inum,
+				       struct ubifs_orphan *parent_orphan)
 {
 	struct ubifs_orphan *orphan, *o;
 	struct rb_node **p, *parent = NULL;
 
 	orphan = kzalloc(sizeof(struct ubifs_orphan), GFP_NOFS);
 	if (!orphan)
-		return -ENOMEM;
+		return ERR_PTR(-ENOMEM);
 	orphan->inum = inum;
 	orphan->new = 1;
+	INIT_LIST_HEAD(&orphan->child_list);
 
 	spin_lock(&c->orphan_lock);
 	if (c->tot_orphans >= c->max_orphans) {
 		spin_unlock(&c->orphan_lock);
 		kfree(orphan);
-		return -ENFILE;
+		return ERR_PTR(-ENFILE);
 	}
 	p = &c->orph_tree.rb_node;
 	while (*p) {
@@ -91,7 +85,7 @@ int ubifs_add_orphan(struct ubifs_info *c, ino_t inum)
 			ubifs_err(c, "orphaned twice");
 			spin_unlock(&c->orphan_lock);
 			kfree(orphan);
-			return 0;
+			return ERR_PTR(-EINVAL);
 		}
 	}
 	c->tot_orphans += 1;
@@ -100,24 +94,22 @@ int ubifs_add_orphan(struct ubifs_info *c, ino_t inum)
 	rb_insert_color(&orphan->rb, &c->orph_tree);
 	list_add_tail(&orphan->list, &c->orph_list);
 	list_add_tail(&orphan->new_list, &c->orph_new);
+
+	if (parent_orphan) {
+		list_add_tail(&orphan->child_list,
+			      &parent_orphan->child_list);
+	}
+
 	spin_unlock(&c->orphan_lock);
 	dbg_gen("ino %lu", (unsigned long)inum);
-	return 0;
+	return orphan;
 }
 
-/**
- * ubifs_delete_orphan - delete an orphan.
- * @c: UBIFS file-system description object
- * @inum: orphan inode number
- *
- * Delete an orphan. This function is called when an inode is deleted.
- */
-void ubifs_delete_orphan(struct ubifs_info *c, ino_t inum)
+static struct ubifs_orphan *lookup_orphan(struct ubifs_info *c, ino_t inum)
 {
 	struct ubifs_orphan *o;
 	struct rb_node *p;
 
-	spin_lock(&c->orphan_lock);
 	p = c->orph_tree.rb_node;
 	while (p) {
 		o = rb_entry(p, struct ubifs_orphan, rb);
@@ -126,37 +118,124 @@ void ubifs_delete_orphan(struct ubifs_info *c, ino_t inum)
 		else if (inum > o->inum)
 			p = p->rb_right;
 		else {
-			if (o->del) {
-				spin_unlock(&c->orphan_lock);
-				dbg_gen("deleted twice ino %lu",
-					(unsigned long)inum);
-				return;
-			}
-			if (o->cmt) {
-				o->del = 1;
-				o->dnext = c->orph_dnext;
-				c->orph_dnext = o;
-				spin_unlock(&c->orphan_lock);
-				dbg_gen("delete later ino %lu",
-					(unsigned long)inum);
-				return;
-			}
-			rb_erase(p, &c->orph_tree);
-			list_del(&o->list);
-			c->tot_orphans -= 1;
-			if (o->new) {
-				list_del(&o->new_list);
-				c->new_orphans -= 1;
-			}
-			spin_unlock(&c->orphan_lock);
-			kfree(o);
-			dbg_gen("inum %lu", (unsigned long)inum);
-			return;
+			return o;
 		}
 	}
+	return NULL;
+}
+
+static void __orphan_drop(struct ubifs_info *c, struct ubifs_orphan *o)
+{
+	rb_erase(&o->rb, &c->orph_tree);
+	list_del(&o->list);
+	c->tot_orphans -= 1;
+
+	if (o->new) {
+		list_del(&o->new_list);
+		c->new_orphans -= 1;
+	}
+
+	kfree(o);
+}
+
+static void orphan_delete(struct ubifs_info *c, ino_t inum)
+{
+	struct ubifs_orphan *orph, *child_orph, *tmp_o;
+
+	spin_lock(&c->orphan_lock);
+
+	orph = lookup_orphan(c, inum);
+	if (!orph) {
+		spin_unlock(&c->orphan_lock);
+		ubifs_err(c, "missing orphan ino %lu", (unsigned long)inum);
+		dump_stack();
+
+		return;
+	}
+
+	if (orph->del) {
+		spin_unlock(&c->orphan_lock);
+		dbg_gen("deleted twice ino %lu",
+			(unsigned long)inum);
+		return;
+	}
+
+	if (orph->cmt) {
+		orph->del = 1;
+		orph->dnext = c->orph_dnext;
+		c->orph_dnext = orph;
+		spin_unlock(&c->orphan_lock);
+		dbg_gen("delete later ino %lu",
+			(unsigned long)inum);
+		return;
+	}
+
+	list_for_each_entry_safe(child_orph, tmp_o, &orph->child_list, child_list) {
+		list_del(&child_orph->child_list);
+		__orphan_drop(c, child_orph);
+	}
+
+	__orphan_drop(c, orph);
+
 	spin_unlock(&c->orphan_lock);
-	ubifs_err(c, "missing orphan ino %lu", (unsigned long)inum);
-	dump_stack();
+}
+
+/**
+ * ubifs_add_orphan - add an orphan.
+ * @c: UBIFS file-system description object
+ * @inum: orphan inode number
+ *
+ * Add an orphan. This function is called when an inodes link count drops to
+ * zero.
+ */
+int ubifs_add_orphan(struct ubifs_info *c, ino_t inum)
+{
+	int err = 0;
+	ino_t xattr_inum;
+	union ubifs_key key;
+	struct ubifs_dent_node *xent;
+	struct fscrypt_name nm = {0};
+	struct ubifs_orphan *xattr_orphan;
+	struct ubifs_orphan *orphan;
+
+	orphan = orphan_add(c, inum, NULL);
+	if (IS_ERR(orphan))
+		return PTR_ERR(orphan);
+
+	lowest_xent_key(c, &key, inum);
+	while (1) {
+		xent = ubifs_tnc_next_ent(c, &key, &nm);
+		if (IS_ERR(xent)) {
+			err = PTR_ERR(xent);
+			if (err == -ENOENT)
+				break;
+			return err;
+		}
+
+		fname_name(&nm) = xent->name;
+		fname_len(&nm) = le16_to_cpu(xent->nlen);
+		xattr_inum = le64_to_cpu(xent->inum);
+
+		xattr_orphan = orphan_add(c, xattr_inum, orphan);
+		if (IS_ERR(xattr_orphan))
+			return PTR_ERR(xattr_orphan);
+
+		key_read(c, &xent->key, &key);
+	}
+
+	return 0;
+}
+
+/**
+ * ubifs_delete_orphan - delete an orphan.
+ * @c: UBIFS file-system description object
+ * @inum: orphan inode number
+ *
+ * Delete an orphan. This function is called when an inode is deleted.
+ */
+void ubifs_delete_orphan(struct ubifs_info *c, ino_t inum)
+{
+	orphan_delete(c, inum);
 }
 
 /**
@@ -611,10 +690,16 @@ static int do_kill_orphans(struct ubifs_info *c, struct ubifs_scan_leb *sleb,
 
 		n = (le32_to_cpu(orph->ch.len) - UBIFS_ORPH_NODE_SZ) >> 3;
 		for (i = 0; i < n; i++) {
+			union ubifs_key key1, key2;
+
 			inum = le64_to_cpu(orph->inos[i]);
 			dbg_rcvry("deleting orphaned inode %lu",
 				  (unsigned long)inum);
-			err = ubifs_tnc_remove_ino(c, inum);
+
+			lowest_ino_key(c, &key1, inum);
+			highest_ino_key(c, &key2, inum);
+
+			err = ubifs_tnc_remove_range(c, &key1, &key2);
 			if (err)
 				return err;
 			err = insert_dead_orphan(c, inum);
@@ -744,26 +829,15 @@ struct check_info {
 	struct rb_root root;
 };
 
-static int dbg_find_orphan(struct ubifs_info *c, ino_t inum)
+static bool dbg_find_orphan(struct ubifs_info *c, ino_t inum)
 {
-	struct ubifs_orphan *o;
-	struct rb_node *p;
+	bool found = false;
 
 	spin_lock(&c->orphan_lock);
-	p = c->orph_tree.rb_node;
-	while (p) {
-		o = rb_entry(p, struct ubifs_orphan, rb);
-		if (inum < o->inum)
-			p = p->rb_left;
-		else if (inum > o->inum)
-			p = p->rb_right;
-		else {
-			spin_unlock(&c->orphan_lock);
-			return 1;
-		}
-	}
+	found = !!lookup_orphan(c, inum);
 	spin_unlock(&c->orphan_lock);
-	return 0;
+
+	return found;
 }
 
 static int dbg_ins_check_orphan(struct rb_root *root, ino_t inum)
diff --git a/fs/ubifs/ubifs.h b/fs/ubifs/ubifs.h
index 5ee7af879cc4..eb34bc0046c4 100644
--- a/fs/ubifs/ubifs.h
+++ b/fs/ubifs/ubifs.h
@@ -892,6 +892,8 @@ struct ubifs_budget_req {
  * @rb: rb-tree node of rb-tree of orphans sorted by inode number
  * @list: list head of list of orphans in order added
  * @new_list: list head of list of orphans added since the last commit
+ * @child_list: list of xattr childs if this orphan hosts xattrs, list head
+ * if this orphan is a xattr, not used otherwise.
  * @cnext: next orphan to commit
  * @dnext: next orphan to delete
  * @inum: inode number
@@ -903,6 +905,7 @@ struct ubifs_orphan {
 	struct rb_node rb;
 	struct list_head list;
 	struct list_head new_list;
+	struct list_head child_list;
 	struct ubifs_orphan *cnext;
 	struct ubifs_orphan *dnext;
 	ino_t inum;
-- 
2.13.6


[-- Attachment #3: 0001-ubifs-Fix-synced_i_size-calculation-for-xattr-inodes.patch --]
[-- Type: text/x-patch, Size: 1442 bytes --]

>From acb4805ca8816b83873861090f97fae0bfeb313b Mon Sep 17 00:00:00 2001
From: Richard Weinberger <richard@nod.at>
Date: Wed, 14 Feb 2018 11:53:22 +0100
Subject: [PATCH 1/2] ubifs: Fix synced_i_size calculation for xattr inodes

In ubifs_jnl_update() we sync parent and child inodes to the flash,
in case of xattrs, the parent inode (AKA host inode) has a non-zero
data_len. Therefore we need to adjust synced_i_size too.

This issue was reported by ubifs self tests unter a xattr related work
load.
UBIFS error (ubi0:0 pid 1896): dbg_check_synced_i_size: ui_size is 4, synced_i_size is 0, but inode is clean
UBIFS error (ubi0:0 pid 1896): dbg_check_synced_i_size: i_ino 65, i_mode 0x81a4, i_size 4

Cc: <stable@vger.kernel.org>
Fixes: 1e51764a3c2a ("UBIFS: add new flash file system")
Signed-off-by: Richard Weinberger <richard@nod.at>
---
 fs/ubifs/journal.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/fs/ubifs/journal.c b/fs/ubifs/journal.c
index 04c4ec6483e5..1fb123279bb5 100644
--- a/fs/ubifs/journal.c
+++ b/fs/ubifs/journal.c
@@ -665,6 +665,11 @@ int ubifs_jnl_update(struct ubifs_info *c, const struct inode *dir,
 	spin_lock(&ui->ui_lock);
 	ui->synced_i_size = ui->ui_size;
 	spin_unlock(&ui->ui_lock);
+	if (xent) {
+		spin_lock(&host_ui->ui_lock);
+		host_ui->synced_i_size = host_ui->ui_size;
+		spin_unlock(&host_ui->ui_lock);
+	}
 	mark_inode_clean(c, ui);
 	mark_inode_clean(c, host_ui);
 	return 0;
-- 
2.13.6


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: UBIFS - ubifs_get_pnode.part.4: error -22 reading pnode
  2018-04-03 21:28       ` Richard Weinberger
@ 2018-04-04  4:58         ` Frank Erdrich
  2018-04-04 22:36           ` Richard Weinberger
  0 siblings, 1 reply; 6+ messages in thread
From: Frank Erdrich @ 2018-04-04  4:58 UTC (permalink / raw)
  To: Richard Weinberger, linux-mtd

Hello Richard,

thanks again for the quick response.

Am 03.04.2018 um 23:28 schrieb Richard Weinberger:
>>> On Tue, Apr 3, 2018 at 10:28 AM, Erdrich, Frank
>>> <Frank.Erdrich@emtrion.de> wrote:
>>>   > Hello,
>>>   >
>>>   > we are encountering an error on UBIFS that prevents mounting of a
>>> partition. Maybe one of you can tell me the directly the reason for that
>>> or can help me hunting this error down.
>>>
>>> UBIFS got confused wrt. free space accounting.
>>> Can you tell me more on your setup? Do you use Fastmap? fscrypt? xattr?
>>
>> We use only extended attributes. fastmap and fscrypt are not used. That
>> are the kernel options we have set:
>>
>> CONFIG_MTD_UBI=y
>> CONFIG_MTD_UBI_WL_THRESHOLD=4096
>> CONFIG_MTD_UBI_BEB_LIMIT=20
>> # CONFIG_MTD_UBI_FASTMAP is not set
>> # CONFIG_MTD_UBI_GLUEBI is not set
>> # CONFIG_MTD_UBI_BLOCK is not set
>> CONFIG_UBIFS_FS=y
>> CONFIG_UBIFS_FS_ADVANCED_COMPR=y
>> CONFIG_UBIFS_FS_LZO=y
>> CONFIG_UBIFS_FS_ZLIB=y
>> # CONFIG_UBIFS_ATIME_SUPPORT is not set
> 
> Good.
> 
>> The Flash itself is divided into three mtd partitions to get some sort
>> of separation of critical data against other data. The partition that
>> shows that error is a logging partition which is, of course, written on
>> a regular base. We are not using the sync mount option at the moment.
>> Writeback timers (for dirty data) are on default value.
> 
> Side note: Having multiple UBI instances on the same MTD is not a good idea.
> Usually you want the wear leveling domain as large as possible.

Yeah, I know. But we had some more issues with the flash and the 
filesystem and that was the safest way to have a running system even if 
the log partition fails. That gives us the ability to do updates on a 
corrupted system an bring it back to a good state.

The easiest thing would be to disable the logging. I would do that 
because logging is totally unneeded on that device, but the customer 
gets what he wants...

>>>   > I'm not deep enough in the ubifs system to completely understand what
>>> is happening here but for me it seems that there are more dirty data to
>>> write than the size of the LEB.
>>>   > -> 3: free 507904 dirty 524128 flags 34 lnum 0
>>>   > I've seen that the values are checked in validate_pnode() where the
>>> -EINVAL (-22) comes from. The question is, how can dirty data bigger
>>> that the LEB size?
>>>
>>> Yes, this can happen. That's why UBIFS does in some cases a fixup of
>>> the used/free numbers.
>>> These numbers are not always updated immediately and when a power-cut
>>> happens their state might be inconsistent.
>>
>> I'm totally fine with that behaviour as a power-cut can happen anytime
>> in an embedded system but I would assume that the ubifs-recovery would
>> bring back the system to a operable state. Or is there an error in my
>> reasoning?
> 
> Of course UBIFS should be able to recover.
> I suspect a very subtle but in UBIFS' xattr code.
> Other bug reports point in that direction too.
> 
> Sadly so far I was not able to pinpoint the exact issue.
> Therefore I started to review the code. So far I've found some odds but
> I'm not 100% sure whether these cause the trouble you see.
> 
>> Please let me know if you need more information or if I should do
>> special testing.
> 
> Can you reproduce the issue?
> If so, please retry with the attached patches.
> They will not fix your already broken UBIFS but they (hopefully) will
> make sure that the accounting problem will not happen again.

That's the biggest problem. There is no way to reproduce it on a proper 
way. The only thing I can do at the moment is to write to the partition 
and do random power-cuts.

I will try the patches you have provided an will check if the error 
comes up with them or not. Unluckily I'm out of office for today, so I 
can start testing tomorrow.

> Can you also share me the broken UBIFS image?

I have to talk to our customer if that is ok for him, but I think that 
will not be a big problem. Can you give me a command line to extract 
that image from the flash in a way, that it is directly usable by you 
without do conversion stuff?.


Thanks,
Frank

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: UBIFS - ubifs_get_pnode.part.4: error -22 reading pnode
  2018-04-04  4:58         ` Frank Erdrich
@ 2018-04-04 22:36           ` Richard Weinberger
  0 siblings, 0 replies; 6+ messages in thread
From: Richard Weinberger @ 2018-04-04 22:36 UTC (permalink / raw)
  To: Frank Erdrich; +Cc: linux-mtd

Frank,

Am Mittwoch, 4. April 2018, 06:58:40 CEST schrieb Frank Erdrich:
> > Can you reproduce the issue?
> > If so, please retry with the attached patches.
> > They will not fix your already broken UBIFS but they (hopefully) will
> > make sure that the accounting problem will not happen again.
> 
> That's the biggest problem. There is no way to reproduce it on a proper 
> way. The only thing I can do at the moment is to write to the partition 
> and do random power-cuts.
> 
> I will try the patches you have provided an will check if the error 
> comes up with them or not. Unluckily I'm out of office for today, so I 
> can start testing tomorrow.
> 
> > Can you also share me the broken UBIFS image?
> 
> I have to talk to our customer if that is ok for him, but I think that 
> will not be a big problem. Can you give me a command line to extract 
> that image from the flash in a way, that it is directly usable by you 
> without do conversion stuff?.

In this case I'm just interested in UBIFS itself.
So you can directly dd/cat from /dev/ubiX_Y.

Thanks,
//richard

-- 
sigma star gmbh - Eduard-Bodem-Gasse 6 - 6020 Innsbruck - Austria
ATU66964118 - FN 374287y

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2018-04-04 22:36 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-04-03  8:28 UBIFS - ubifs_get_pnode.part.4: error -22 reading pnode Erdrich, Frank
2018-04-03 12:04 ` Richard Weinberger
     [not found]   ` <s1l9q5oc97q87j6d75pkp49b.1522781343251@email.android.com>
2018-04-03 20:26     ` Frank Erdrich
2018-04-03 21:28       ` Richard Weinberger
2018-04-04  4:58         ` Frank Erdrich
2018-04-04 22:36           ` Richard Weinberger

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.