All of lore.kernel.org
 help / color / mirror / Atom feed
* [Ocfs2-devel] dinode link count inconsistency in ocfs2_read_links_count() logic
@ 2022-11-18  9:47 Alexey Asemov (Alex/AT) via Ocfs2-devel
  2022-11-21  1:31 ` Joseph Qi via Ocfs2-devel
  0 siblings, 1 reply; 7+ messages in thread
From: Alexey Asemov (Alex/AT) via Ocfs2-devel @ 2022-11-18  9:47 UTC (permalink / raw)
  To: ocfs2-devel

Hello,

While diagnosing unlink hang issue, I have found an inconsistency in 
dinode hardlink count handling.

If we take a look at ocfs2.h, sections below the text, we see 
ocfs2_set_links_count() always stores high portion of link count into 
dinode, but ocfs2_read_links_count() retrieves and adds high portion 
*only* when inode has OCFS2_INDEXED_DIR_FL flag set.

The problem is, ocfs2_read_links_count() is used throughout all the code 
and not directories only. For files, OCFS2_INDEXED_DIR_FL flag is never 
present, so when number of hardlinks for file spills over 65535, it will 
be written as 65536 correctly to dinode, but then will always be read as 
0 to inode due to the check. This causes wrong link count of 0 in stat() 
and total hang on attempt to unlink the 'parent' inode directory entry 
somehow (this is the issue I hit but I have not diagnosed how it happens 
because I stumbled on wrong link count immediately and dug that direction).

While not sure about the internals, I suggest removing

- if (di->i_dyn_features & cpu_to_le16(OCFS2_INDEXED_DIR_FL))

check for the time being inside ocfs2_read_links_count() so it always 
reads the correct link count from dinode, be it file or directory.

I do not see dinode link count parts used outside ocfs2.h API, so it 
should be semantically correct thing to do, but please someone 
acquainted with the logic confirm.

KR,
Alex

---

static inline unsigned int ocfs2_read_links_count(struct ocfs2_dinode *di)
{
         u32 nlink = le16_to_cpu(di->i_links_count);
         u32 hi = le16_to_cpu(di->i_links_count_hi);

         if (di->i_dyn_features & cpu_to_le16(OCFS2_INDEXED_DIR_FL))
                 nlink |= (hi << OCFS2_LINKS_HI_SHIFT);

         return nlink;
}

static inline void ocfs2_set_links_count(struct ocfs2_dinode *di, u32 nlink)
{
         u16 lo, hi;

         lo = nlink;
         hi = nlink >> OCFS2_LINKS_HI_SHIFT;

         di->i_links_count = cpu_to_le16(lo);
         di->i_links_count_hi = cpu_to_le16(hi);
}


_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Ocfs2-devel] dinode link count inconsistency in ocfs2_read_links_count() logic
  2022-11-18  9:47 [Ocfs2-devel] dinode link count inconsistency in ocfs2_read_links_count() logic Alexey Asemov (Alex/AT) via Ocfs2-devel
@ 2022-11-21  1:31 ` Joseph Qi via Ocfs2-devel
  2022-11-21  7:27   ` Alexey Asemov (Alex/AT) via Ocfs2-devel
  2022-11-21 16:15   ` Alexey Asemov (Alex/AT) via Ocfs2-devel
  0 siblings, 2 replies; 7+ messages in thread
From: Joseph Qi via Ocfs2-devel @ 2022-11-21  1:31 UTC (permalink / raw)
  To: Alexey Asemov (Alex/AT), ocfs2-devel

Hi,

ocfs2_mknod()/ocfs2_link()/ocfs2_rename() will always check
ocfs2_link_max(), so it seems that it won't overflow.
I wonder if you've encounter another bug that leads to dinode link count
leaking.

Thanks,
Joseph

On 11/18/22 5:47 PM, Alexey Asemov (Alex/AT) via Ocfs2-devel wrote:
> Hello,
> 
> While diagnosing unlink hang issue, I have found an inconsistency in dinode hardlink count handling.
> 
> If we take a look at ocfs2.h, sections below the text, we see ocfs2_set_links_count() always stores high portion of link count into dinode, but ocfs2_read_links_count() retrieves and adds high portion *only* when inode has OCFS2_INDEXED_DIR_FL flag set.
> 
> The problem is, ocfs2_read_links_count() is used throughout all the code and not directories only. For files, OCFS2_INDEXED_DIR_FL flag is never present, so when number of hardlinks for file spills over 65535, it will be written as 65536 correctly to dinode, but then will always be read as 0 to inode due to the check. This causes wrong link count of 0 in stat() and total hang on attempt to unlink the 'parent' inode directory entry somehow (this is the issue I hit but I have not diagnosed how it happens because I stumbled on wrong link count immediately and dug that direction).
> 
> While not sure about the internals, I suggest removing
> 
> - if (di->i_dyn_features & cpu_to_le16(OCFS2_INDEXED_DIR_FL))
> 
> check for the time being inside ocfs2_read_links_count() so it always reads the correct link count from dinode, be it file or directory.
> 
> I do not see dinode link count parts used outside ocfs2.h API, so it should be semantically correct thing to do, but please someone acquainted with the logic confirm.
> 
> KR,
> Alex
> 
> ---
> 
> static inline unsigned int ocfs2_read_links_count(struct ocfs2_dinode *di)
> {
>         u32 nlink = le16_to_cpu(di->i_links_count);
>         u32 hi = le16_to_cpu(di->i_links_count_hi);
> 
>         if (di->i_dyn_features & cpu_to_le16(OCFS2_INDEXED_DIR_FL))
>                 nlink |= (hi << OCFS2_LINKS_HI_SHIFT);
> 
>         return nlink;
> }
> 
> static inline void ocfs2_set_links_count(struct ocfs2_dinode *di, u32 nlink)
> {
>         u16 lo, hi;
> 
>         lo = nlink;
>         hi = nlink >> OCFS2_LINKS_HI_SHIFT;
> 
>         di->i_links_count = cpu_to_le16(lo);
>         di->i_links_count_hi = cpu_to_le16(hi);
> }
> 
> 
> _______________________________________________
> Ocfs2-devel mailing list
> Ocfs2-devel@oss.oracle.com
> https://oss.oracle.com/mailman/listinfo/ocfs2-devel

_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Ocfs2-devel] dinode link count inconsistency in ocfs2_read_links_count() logic
  2022-11-21  1:31 ` Joseph Qi via Ocfs2-devel
@ 2022-11-21  7:27   ` Alexey Asemov (Alex/AT) via Ocfs2-devel
  2022-11-21 16:15   ` Alexey Asemov (Alex/AT) via Ocfs2-devel
  1 sibling, 0 replies; 7+ messages in thread
From: Alexey Asemov (Alex/AT) via Ocfs2-devel @ 2022-11-21  7:27 UTC (permalink / raw)
  To: Joseph Qi, ocfs2-devel


[-- Attachment #1.1: Type: text/plain, Size: 3306 bytes --]

Hello,

ocfs2_link_max() relies on filesystem wide indexed directories flag, 
while ocfs2_read_links_count() relies on per-dinode flag, and so this is 
the exact inconsistency in question.

I am not versed in code enough to be sure about exactly why this limit 
exists, but it seems supporting OCFS2_DX_LINK_MAX is no problem for the 
code.

It's just optionally not adding high part of link count just seems weird 
to me, the field is there/zeroed even with no indexed dirs, so always 
using it feels logical.

---

|

if  (ocfs2_supports_indexed_dirs(osb))
		return  OCFS2_DX_LINK_MAX;
	return  OCFS2_LINK_MAX;|

On 21.11.2022 4:31, Joseph Qi wrote:
> Hi,
>
> ocfs2_mknod()/ocfs2_link()/ocfs2_rename() will always check
> ocfs2_link_max(), so it seems that it won't overflow.
> I wonder if you've encounter another bug that leads to dinode link count
> leaking.
>
> Thanks,
> Joseph
>
> On 11/18/22 5:47 PM, Alexey Asemov (Alex/AT) via Ocfs2-devel wrote:
>> Hello,
>>
>> While diagnosing unlink hang issue, I have found an inconsistency in dinode hardlink count handling.
>>
>> If we take a look at ocfs2.h, sections below the text, we see ocfs2_set_links_count() always stores high portion of link count into dinode, but ocfs2_read_links_count() retrieves and adds high portion *only* when inode has OCFS2_INDEXED_DIR_FL flag set.
>>
>> The problem is, ocfs2_read_links_count() is used throughout all the code and not directories only. For files, OCFS2_INDEXED_DIR_FL flag is never present, so when number of hardlinks for file spills over 65535, it will be written as 65536 correctly to dinode, but then will always be read as 0 to inode due to the check. This causes wrong link count of 0 in stat() and total hang on attempt to unlink the 'parent' inode directory entry somehow (this is the issue I hit but I have not diagnosed how it happens because I stumbled on wrong link count immediately and dug that direction).
>>
>> While not sure about the internals, I suggest removing
>>
>> - if (di->i_dyn_features & cpu_to_le16(OCFS2_INDEXED_DIR_FL))
>>
>> check for the time being inside ocfs2_read_links_count() so it always reads the correct link count from dinode, be it file or directory.
>>
>> I do not see dinode link count parts used outside ocfs2.h API, so it should be semantically correct thing to do, but please someone acquainted with the logic confirm.
>>
>> KR,
>> Alex
>>
>> ---
>>
>> static inline unsigned int ocfs2_read_links_count(struct ocfs2_dinode *di)
>> {
>>          u32 nlink = le16_to_cpu(di->i_links_count);
>>          u32 hi = le16_to_cpu(di->i_links_count_hi);
>>
>>          if (di->i_dyn_features & cpu_to_le16(OCFS2_INDEXED_DIR_FL))
>>                  nlink |= (hi << OCFS2_LINKS_HI_SHIFT);
>>
>>          return nlink;
>> }
>>
>> static inline void ocfs2_set_links_count(struct ocfs2_dinode *di, u32 nlink)
>> {
>>          u16 lo, hi;
>>
>>          lo = nlink;
>>          hi = nlink >> OCFS2_LINKS_HI_SHIFT;
>>
>>          di->i_links_count = cpu_to_le16(lo);
>>          di->i_links_count_hi = cpu_to_le16(hi);
>> }
>>
>>
>> _______________________________________________
>> Ocfs2-devel mailing list
>> Ocfs2-devel@oss.oracle.com
>> https://oss.oracle.com/mailman/listinfo/ocfs2-devel

[-- Attachment #1.2: Type: text/html, Size: 4203 bytes --]

[-- Attachment #2: Type: text/plain, Size: 151 bytes --]

_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Ocfs2-devel] dinode link count inconsistency in ocfs2_read_links_count() logic
  2022-11-21  1:31 ` Joseph Qi via Ocfs2-devel
  2022-11-21  7:27   ` Alexey Asemov (Alex/AT) via Ocfs2-devel
@ 2022-11-21 16:15   ` Alexey Asemov (Alex/AT) via Ocfs2-devel
  2022-11-22  1:52     ` Joseph Qi via Ocfs2-devel
  1 sibling, 1 reply; 7+ messages in thread
From: Alexey Asemov (Alex/AT) via Ocfs2-devel @ 2022-11-21 16:15 UTC (permalink / raw)
  To: ocfs2-devel

The modification I mentioned surely fixes the thing (well, obviously).

If to reproduce, trivial as well:

- mkfs with indexed-dirs option (otherwise low hardlink limit will apply)
- Create a file
- Create 65535 hardlinks to it, stat will now show 65536 hardlinks (that 
comes from local inode cache)
- Clear inode cache (for me cat 3 > /proc/sys/vm/drop_caches is enough) 
or reboot
- stat again and you will see hardlink count = 0 that is the result of 
actual inconsistency



_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Ocfs2-devel] dinode link count inconsistency in ocfs2_read_links_count() logic
  2022-11-21 16:15   ` Alexey Asemov (Alex/AT) via Ocfs2-devel
@ 2022-11-22  1:52     ` Joseph Qi via Ocfs2-devel
  2022-11-25 11:25       ` Joseph Qi via Ocfs2-devel
  0 siblings, 1 reply; 7+ messages in thread
From: Joseph Qi via Ocfs2-devel @ 2022-11-22  1:52 UTC (permalink / raw)
  To: Alexey Asemov (Alex/AT), ocfs2-devel



On 11/22/22 12:15 AM, Alexey Asemov (Alex/AT) via Ocfs2-devel wrote:
> The modification I mentioned surely fixes the thing (well, obviously).
> 
> If to reproduce, trivial as well:
> 
> - mkfs with indexed-dirs option (otherwise low hardlink limit will apply)
> - Create a file
> - Create 65535 hardlinks to it, stat will now show 65536 hardlinks (that comes from local inode cache)
> - Clear inode cache (for me cat 3 > /proc/sys/vm/drop_caches is enough) or reboot
> - stat again and you will see hardlink count = 0 that is the result of actual inconsistency
> 

Will take a deep look into it.

Thanks,
Joseph

_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Ocfs2-devel] dinode link count inconsistency in ocfs2_read_links_count() logic
  2022-11-22  1:52     ` Joseph Qi via Ocfs2-devel
@ 2022-11-25 11:25       ` Joseph Qi via Ocfs2-devel
  2022-11-27  8:00         ` Alexey Asemov (Alex/AT) via Ocfs2-devel
  0 siblings, 1 reply; 7+ messages in thread
From: Joseph Qi via Ocfs2-devel @ 2022-11-25 11:25 UTC (permalink / raw)
  To: Alexey Asemov (Alex/AT), ocfs2-devel

Hi,

Checked the code again, since feature indexed dirs is now default on, so
the ocfs2_link_max() check will return OCFS2_DX_LINK_MAX, which can use
high 16 bits of links count.

Could you please send a formal patch?

Thanks,
Joseph

On 11/22/22 9:52 AM, Joseph Qi via Ocfs2-devel wrote:
> 
> 
> On 11/22/22 12:15 AM, Alexey Asemov (Alex/AT) via Ocfs2-devel wrote:
>> The modification I mentioned surely fixes the thing (well, obviously).
>>
>> If to reproduce, trivial as well:
>>
>> - mkfs with indexed-dirs option (otherwise low hardlink limit will apply)
>> - Create a file
>> - Create 65535 hardlinks to it, stat will now show 65536 hardlinks (that comes from local inode cache)
>> - Clear inode cache (for me cat 3 > /proc/sys/vm/drop_caches is enough) or reboot
>> - stat again and you will see hardlink count = 0 that is the result of actual inconsistency
>>
> 
> Will take a deep look into it.
> 
> Thanks,
> Joseph
> 
> _______________________________________________
> Ocfs2-devel mailing list
> Ocfs2-devel@oss.oracle.com
> https://oss.oracle.com/mailman/listinfo/ocfs2-devel

_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Ocfs2-devel] dinode link count inconsistency in ocfs2_read_links_count() logic
  2022-11-25 11:25       ` Joseph Qi via Ocfs2-devel
@ 2022-11-27  8:00         ` Alexey Asemov (Alex/AT) via Ocfs2-devel
  0 siblings, 0 replies; 7+ messages in thread
From: Alexey Asemov (Alex/AT) via Ocfs2-devel @ 2022-11-27  8:00 UTC (permalink / raw)
  To: Joseph Qi, ocfs2-devel

I've sent a formal patch, it is awaiting moderation according to the reply.

For how it happens: yes, ocfs2_link_max() returns OCFS2_DX_LINK_MAX 
indeed. And so large number of links is allowed.
But ocfs2_read_links_count() checks for dynamic di->i_dyn_features flag 
OCFS2_INDEXED_DIR_FL which is only valid and set for directories, not files.
So when reading file dinode, it drops the high part, and that is 
obviously wrong. Anyways, dropping high part seems terribly wrong to me.

KR,
Alex

On 25.11.2022 14:25, Joseph Qi wrote:
> Hi,
>
> Checked the code again, since feature indexed dirs is now default on, so
> the ocfs2_link_max() check will return OCFS2_DX_LINK_MAX, which can use
> high 16 bits of links count.
>
> Could you please send a formal patch?
>
> Thanks,
> Joseph
>

_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2022-11-28 14:18 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-11-18  9:47 [Ocfs2-devel] dinode link count inconsistency in ocfs2_read_links_count() logic Alexey Asemov (Alex/AT) via Ocfs2-devel
2022-11-21  1:31 ` Joseph Qi via Ocfs2-devel
2022-11-21  7:27   ` Alexey Asemov (Alex/AT) via Ocfs2-devel
2022-11-21 16:15   ` Alexey Asemov (Alex/AT) via Ocfs2-devel
2022-11-22  1:52     ` Joseph Qi via Ocfs2-devel
2022-11-25 11:25       ` Joseph Qi via Ocfs2-devel
2022-11-27  8:00         ` Alexey Asemov (Alex/AT) via Ocfs2-devel

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.