All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] xfs: fix some new memory allocation failures
@ 2013-09-02 10:52 Dave Chinner
  2013-09-02 10:52 ` [PATCH 1/2] xfs: fix memory allocation failures with ACLs Dave Chinner
                   ` (3 more replies)
  0 siblings, 4 replies; 13+ messages in thread
From: Dave Chinner @ 2013-09-02 10:52 UTC (permalink / raw)
  To: xfs

Hi folks,

These failures are a result of order-4 allocations being done on v5
filesystems to support the large ACL count xattrs. The first patch
puts out usual falbback to vmalloc workaround in place. The second
patch factors all the places we now have this fallback-to-vmalloc
and makes it transparent to the callers.

Cheers,

Dave.

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 1/2] xfs: fix memory allocation failures with ACLs
  2013-09-02 10:52 [PATCH 0/2] xfs: fix some new memory allocation failures Dave Chinner
@ 2013-09-02 10:52 ` Dave Chinner
  2013-09-06 19:59   ` Mark Tinguely
  2013-09-02 10:53 ` [PATCH 2/2] xfs: factor all the kmalloc-or-vmalloc fallback allocations Dave Chinner
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 13+ messages in thread
From: Dave Chinner @ 2013-09-02 10:52 UTC (permalink / raw)
  To: xfs

From: Dave Chinner <dchinner@redhat.com>

Ever since increasing the number of supported ACLs from 25 to as
many as can fit in an xattr, there have been reports of order 4
memory allocations failing in the ACL code. Fix it in the same way
we've fixed all the xattr read/write code that has the same problem.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 fs/xfs/xfs_acl.c | 32 ++++++++++++++++++++++----------
 1 file changed, 22 insertions(+), 10 deletions(-)

diff --git a/fs/xfs/xfs_acl.c b/fs/xfs/xfs_acl.c
index 6951896..4ea73cc 100644
--- a/fs/xfs/xfs_acl.c
+++ b/fs/xfs/xfs_acl.c
@@ -152,9 +152,12 @@ xfs_get_acl(struct inode *inode, int type)
 	 * go out to the disk.
 	 */
 	len = XFS_ACL_MAX_SIZE(ip->i_mount);
-	xfs_acl = kzalloc(len, GFP_KERNEL);
-	if (!xfs_acl)
-		return ERR_PTR(-ENOMEM);
+	xfs_acl = kmem_zalloc(len, KM_SLEEP | KM_MAYFAIL);
+	if (!xfs_acl) {
+		xfs_acl = kmem_zalloc_large(len);
+		if (!xfs_acl)
+			return ERR_PTR(-ENOMEM);
+	}
 
 	error = -xfs_attr_get(ip, ea_name, (unsigned char *)xfs_acl,
 							&len, ATTR_ROOT);
@@ -175,10 +178,13 @@ xfs_get_acl(struct inode *inode, int type)
 	if (IS_ERR(acl))
 		goto out;
 
- out_update_cache:
+out_update_cache:
 	set_cached_acl(inode, type, acl);
- out:
-	kfree(xfs_acl);
+out:
+	if (is_vmalloc_addr(xfs_acl))
+		kmem_free_large(xfs_acl);
+	else
+		kfree(xfs_acl);
 	return acl;
 }
 
@@ -209,9 +215,12 @@ xfs_set_acl(struct inode *inode, int type, struct posix_acl *acl)
 		struct xfs_acl *xfs_acl;
 		int len = XFS_ACL_MAX_SIZE(ip->i_mount);
 
-		xfs_acl = kzalloc(len, GFP_KERNEL);
-		if (!xfs_acl)
-			return -ENOMEM;
+		xfs_acl = kmem_zalloc(len, KM_SLEEP | KM_MAYFAIL);
+		if (!xfs_acl) {
+			xfs_acl = kmem_zalloc_large(len);
+			if (!xfs_acl)
+				return -ENOMEM;
+		}
 
 		xfs_acl_to_disk(xfs_acl, acl);
 
@@ -222,7 +231,10 @@ xfs_set_acl(struct inode *inode, int type, struct posix_acl *acl)
 		error = -xfs_attr_set(ip, ea_name, (unsigned char *)xfs_acl,
 				len, ATTR_ROOT);
 
-		kfree(xfs_acl);
+		if (is_vmalloc_addr(xfs_acl))
+			kmem_free_large(xfs_acl);
+		else
+			kfree(xfs_acl);
 	} else {
 		/*
 		 * A NULL ACL argument means we want to remove the ACL.
-- 
1.8.3.2

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 2/2] xfs: factor all the kmalloc-or-vmalloc fallback allocations
  2013-09-02 10:52 [PATCH 0/2] xfs: fix some new memory allocation failures Dave Chinner
  2013-09-02 10:52 ` [PATCH 1/2] xfs: fix memory allocation failures with ACLs Dave Chinner
@ 2013-09-02 10:53 ` Dave Chinner
  2013-09-06 20:37   ` Mark Tinguely
  2013-09-02 17:03 ` [PATCH 0/2] xfs: fix some new memory allocation failures Mark Tinguely
  2013-09-10 22:40 ` Ben Myers
  3 siblings, 1 reply; 13+ messages in thread
From: Dave Chinner @ 2013-09-02 10:53 UTC (permalink / raw)
  To: xfs

From: Dave Chinner <dchinner@redhat.com>

We have quite a few places now where we do:

	x = kmem_zalloc(large size)
	if (!x)
		x = kmem_zalloc_large(large size)

and do a similar dance when freeing the memory. kmem_free() already
does the correct freeing dance, and kmem_zalloc_large() is only ever
called in these constructs, so just factor it all into
kmem_zalloc_large() and kmem_free().

Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 fs/xfs/kmem.c          | 15 ++++++++++++---
 fs/xfs/kmem.h          |  9 +--------
 fs/xfs/xfs_acl.c       | 28 ++++++++--------------------
 fs/xfs/xfs_bmap_util.c | 15 ++++-----------
 fs/xfs/xfs_ioctl.c     | 34 +++++++++++-----------------------
 fs/xfs/xfs_ioctl32.c   | 18 ++++++------------
 fs/xfs/xfs_itable.c    |  2 +-
 7 files changed, 43 insertions(+), 78 deletions(-)

diff --git a/fs/xfs/kmem.c b/fs/xfs/kmem.c
index 4a7286c..a02cfb9 100644
--- a/fs/xfs/kmem.c
+++ b/fs/xfs/kmem.c
@@ -27,8 +27,6 @@
 
 /*
  * Greedy allocation.  May fail and may return vmalloced memory.
- *
- * Must be freed using kmem_free_large.
  */
 void *
 kmem_zalloc_greedy(size_t *size, size_t minsize, size_t maxsize)
@@ -36,7 +34,7 @@ kmem_zalloc_greedy(size_t *size, size_t minsize, size_t maxsize)
 	void		*ptr;
 	size_t		kmsize = maxsize;
 
-	while (!(ptr = kmem_zalloc_large(kmsize))) {
+	while (!(ptr = vzalloc(kmsize))) {
 		if ((kmsize >>= 1) <= minsize)
 			kmsize = minsize;
 	}
@@ -75,6 +73,17 @@ kmem_zalloc(size_t size, xfs_km_flags_t flags)
 	return ptr;
 }
 
+void *
+kmem_zalloc_large(size_t size, xfs_km_flags_t flags)
+{
+	void	*ptr;
+
+	ptr = kmem_zalloc(size, flags | KM_MAYFAIL);
+	if (ptr)
+		return ptr;
+	return vzalloc(size);
+}
+
 void
 kmem_free(const void *ptr)
 {
diff --git a/fs/xfs/kmem.h b/fs/xfs/kmem.h
index b2f2620..3a7371c 100644
--- a/fs/xfs/kmem.h
+++ b/fs/xfs/kmem.h
@@ -57,17 +57,10 @@ kmem_flags_convert(xfs_km_flags_t flags)
 
 extern void *kmem_alloc(size_t, xfs_km_flags_t);
 extern void *kmem_zalloc(size_t, xfs_km_flags_t);
+extern void *kmem_zalloc_large(size_t size, xfs_km_flags_t);
 extern void *kmem_realloc(const void *, size_t, size_t, xfs_km_flags_t);
 extern void  kmem_free(const void *);
 
-static inline void *kmem_zalloc_large(size_t size)
-{
-	return vzalloc(size);
-}
-static inline void kmem_free_large(void *ptr)
-{
-	vfree(ptr);
-}
 
 extern void *kmem_zalloc_greedy(size_t *, size_t, size_t);
 
diff --git a/fs/xfs/xfs_acl.c b/fs/xfs/xfs_acl.c
index 4ea73cc..0e2f37e 100644
--- a/fs/xfs/xfs_acl.c
+++ b/fs/xfs/xfs_acl.c
@@ -152,12 +152,9 @@ xfs_get_acl(struct inode *inode, int type)
 	 * go out to the disk.
 	 */
 	len = XFS_ACL_MAX_SIZE(ip->i_mount);
-	xfs_acl = kmem_zalloc(len, KM_SLEEP | KM_MAYFAIL);
-	if (!xfs_acl) {
-		xfs_acl = kmem_zalloc_large(len);
-		if (!xfs_acl)
-			return ERR_PTR(-ENOMEM);
-	}
+	xfs_acl = kmem_zalloc_large(len, KM_SLEEP);
+	if (!xfs_acl)
+		return ERR_PTR(-ENOMEM);
 
 	error = -xfs_attr_get(ip, ea_name, (unsigned char *)xfs_acl,
 							&len, ATTR_ROOT);
@@ -181,10 +178,7 @@ xfs_get_acl(struct inode *inode, int type)
 out_update_cache:
 	set_cached_acl(inode, type, acl);
 out:
-	if (is_vmalloc_addr(xfs_acl))
-		kmem_free_large(xfs_acl);
-	else
-		kfree(xfs_acl);
+	kmem_free(xfs_acl);
 	return acl;
 }
 
@@ -215,12 +209,9 @@ xfs_set_acl(struct inode *inode, int type, struct posix_acl *acl)
 		struct xfs_acl *xfs_acl;
 		int len = XFS_ACL_MAX_SIZE(ip->i_mount);
 
-		xfs_acl = kmem_zalloc(len, KM_SLEEP | KM_MAYFAIL);
-		if (!xfs_acl) {
-			xfs_acl = kmem_zalloc_large(len);
-			if (!xfs_acl)
-				return -ENOMEM;
-		}
+		xfs_acl = kmem_zalloc_large(len, KM_SLEEP);
+		if (!xfs_acl)
+			return -ENOMEM;
 
 		xfs_acl_to_disk(xfs_acl, acl);
 
@@ -231,10 +222,7 @@ xfs_set_acl(struct inode *inode, int type, struct posix_acl *acl)
 		error = -xfs_attr_set(ip, ea_name, (unsigned char *)xfs_acl,
 				len, ATTR_ROOT);
 
-		if (is_vmalloc_addr(xfs_acl))
-			kmem_free_large(xfs_acl);
-		else
-			kfree(xfs_acl);
+		kmem_free(xfs_acl);
 	} else {
 		/*
 		 * A NULL ACL argument means we want to remove the ACL.
diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
index c6dc551..97f952c 100644
--- a/fs/xfs/xfs_bmap_util.c
+++ b/fs/xfs/xfs_bmap_util.c
@@ -612,13 +612,9 @@ xfs_getbmap(
 
 	if (bmv->bmv_count > ULONG_MAX / sizeof(struct getbmapx))
 		return XFS_ERROR(ENOMEM);
-	out = kmem_zalloc(bmv->bmv_count * sizeof(struct getbmapx), KM_MAYFAIL);
-	if (!out) {
-		out = kmem_zalloc_large(bmv->bmv_count *
-					sizeof(struct getbmapx));
-		if (!out)
-			return XFS_ERROR(ENOMEM);
-	}
+	out = kmem_zalloc_large(bmv->bmv_count * sizeof(struct getbmapx), 0);
+	if (!out)
+		return XFS_ERROR(ENOMEM);
 
 	xfs_ilock(ip, XFS_IOLOCK_SHARED);
 	if (whichfork == XFS_DATA_FORK && !(iflags & BMV_IF_DELALLOC)) {
@@ -754,10 +750,7 @@ xfs_getbmap(
 			break;
 	}
 
-	if (is_vmalloc_addr(out))
-		kmem_free_large(out);
-	else
-		kmem_free(out);
+	kmem_free(out);
 	return error;
 }
 
diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
index 21d9c9df..668e8f4 100644
--- a/fs/xfs/xfs_ioctl.c
+++ b/fs/xfs/xfs_ioctl.c
@@ -456,12 +456,9 @@ xfs_attrlist_by_handle(
 	if (IS_ERR(dentry))
 		return PTR_ERR(dentry);
 
-	kbuf = kmem_zalloc(al_hreq.buflen, KM_SLEEP | KM_MAYFAIL);
-	if (!kbuf) {
-		kbuf = kmem_zalloc_large(al_hreq.buflen);
-		if (!kbuf)
-			goto out_dput;
-	}
+	kbuf = kmem_zalloc_large(al_hreq.buflen, KM_SLEEP);
+	if (!kbuf)
+		goto out_dput;
 
 	cursor = (attrlist_cursor_kern_t *)&al_hreq.pos;
 	error = -xfs_attr_list(XFS_I(dentry->d_inode), kbuf, al_hreq.buflen,
@@ -472,12 +469,9 @@ xfs_attrlist_by_handle(
 	if (copy_to_user(al_hreq.buffer, kbuf, al_hreq.buflen))
 		error = -EFAULT;
 
- out_kfree:
-	if (is_vmalloc_addr(kbuf))
-		kmem_free_large(kbuf);
-	else
-		kmem_free(kbuf);
- out_dput:
+out_kfree:
+	kmem_free(kbuf);
+out_dput:
 	dput(dentry);
 	return error;
 }
@@ -495,12 +489,9 @@ xfs_attrmulti_attr_get(
 
 	if (*len > XATTR_SIZE_MAX)
 		return EINVAL;
-	kbuf = kmem_zalloc(*len, KM_SLEEP | KM_MAYFAIL);
-	if (!kbuf) {
-		kbuf = kmem_zalloc_large(*len);
-		if (!kbuf)
-			return ENOMEM;
-	}
+	kbuf = kmem_zalloc_large(*len, KM_SLEEP);
+	if (!kbuf)
+		return ENOMEM;
 
 	error = xfs_attr_get(XFS_I(inode), name, kbuf, (int *)len, flags);
 	if (error)
@@ -509,11 +500,8 @@ xfs_attrmulti_attr_get(
 	if (copy_to_user(ubuf, kbuf, *len))
 		error = EFAULT;
 
- out_kfree:
-	if (is_vmalloc_addr(kbuf))
-		kmem_free_large(kbuf);
-	else
-		kmem_free(kbuf);
+out_kfree:
+	kmem_free(kbuf);
 	return error;
 }
 
diff --git a/fs/xfs/xfs_ioctl32.c b/fs/xfs/xfs_ioctl32.c
index d3ab953..f671f7e 100644
--- a/fs/xfs/xfs_ioctl32.c
+++ b/fs/xfs/xfs_ioctl32.c
@@ -371,12 +371,9 @@ xfs_compat_attrlist_by_handle(
 		return PTR_ERR(dentry);
 
 	error = -ENOMEM;
-	kbuf = kmem_zalloc(al_hreq.buflen, KM_SLEEP | KM_MAYFAIL);
-	if (!kbuf) {
-		kbuf = kmem_zalloc_large(al_hreq.buflen);
-		if (!kbuf)
-			goto out_dput;
-	}
+	kbuf = kmem_zalloc_large(al_hreq.buflen, KM_SLEEP);
+	if (!kbuf)
+		goto out_dput;
 
 	cursor = (attrlist_cursor_kern_t *)&al_hreq.pos;
 	error = -xfs_attr_list(XFS_I(dentry->d_inode), kbuf, al_hreq.buflen,
@@ -387,12 +384,9 @@ xfs_compat_attrlist_by_handle(
 	if (copy_to_user(compat_ptr(al_hreq.buffer), kbuf, al_hreq.buflen))
 		error = -EFAULT;
 
- out_kfree:
-	if (is_vmalloc_addr(kbuf))
-		kmem_free_large(kbuf);
-	else
-		kmem_free(kbuf);
- out_dput:
+out_kfree:
+	kmem_free(kbuf);
+out_dput:
 	dput(dentry);
 	return error;
 }
diff --git a/fs/xfs/xfs_itable.c b/fs/xfs/xfs_itable.c
index 8a67d53..084b3e1 100644
--- a/fs/xfs/xfs_itable.c
+++ b/fs/xfs/xfs_itable.c
@@ -495,7 +495,7 @@ xfs_bulkstat(
 	/*
 	 * Done, we're either out of filesystem or space to put the data.
 	 */
-	kmem_free_large(irbuf);
+	kmem_free(irbuf);
 	*ubcountp = ubelem;
 	/*
 	 * Found some inodes, return them now and return the error next time.
-- 
1.8.3.2

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/2] xfs: fix some new memory allocation failures
  2013-09-02 10:52 [PATCH 0/2] xfs: fix some new memory allocation failures Dave Chinner
  2013-09-02 10:52 ` [PATCH 1/2] xfs: fix memory allocation failures with ACLs Dave Chinner
  2013-09-02 10:53 ` [PATCH 2/2] xfs: factor all the kmalloc-or-vmalloc fallback allocations Dave Chinner
@ 2013-09-02 17:03 ` Mark Tinguely
  2013-09-02 22:20   ` Dave Chinner
  2013-09-10 22:40 ` Ben Myers
  3 siblings, 1 reply; 13+ messages in thread
From: Mark Tinguely @ 2013-09-02 17:03 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs

On 09/02/13 05:52, Dave Chinner wrote:
> Hi folks,
>
> These failures are a result of order-4 allocations being done on v5
> filesystems to support the large ACL count xattrs. The first patch
> puts out usual falbback to vmalloc workaround in place. The second
> patch factors all the places we now have this fallback-to-vmalloc
> and makes it transparent to the callers.
>
> Cheers,
>
> Dave.

Thanks for clean up. Broken record time: Do we really need order 
allocation in the filesystem? Esp in xfs_ioctl.c.

--Mark.

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/2] xfs: fix some new memory allocation failures
  2013-09-02 17:03 ` [PATCH 0/2] xfs: fix some new memory allocation failures Mark Tinguely
@ 2013-09-02 22:20   ` Dave Chinner
  2013-09-03 13:07     ` Mark Tinguely
  0 siblings, 1 reply; 13+ messages in thread
From: Dave Chinner @ 2013-09-02 22:20 UTC (permalink / raw)
  To: Mark Tinguely; +Cc: xfs

On Mon, Sep 02, 2013 at 12:03:37PM -0500, Mark Tinguely wrote:
> On 09/02/13 05:52, Dave Chinner wrote:
> >Hi folks,
> >
> >These failures are a result of order-4 allocations being done on v5
> >filesystems to support the large ACL count xattrs. The first patch
> >puts out usual falbback to vmalloc workaround in place. The second
> >patch factors all the places we now have this fallback-to-vmalloc
> >and makes it transparent to the callers.
> >
> >Cheers,
> >
> >Dave.
> 
> Thanks for clean up. Broken record time: Do we really need order
> allocation in the filesystem? Esp in xfs_ioctl.c.

I don't understand your question. Are you asking why we need high
order allocation?

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/2] xfs: fix some new memory allocation failures
  2013-09-02 22:20   ` Dave Chinner
@ 2013-09-03 13:07     ` Mark Tinguely
  2013-09-03 20:04       ` Dave Chinner
  0 siblings, 1 reply; 13+ messages in thread
From: Mark Tinguely @ 2013-09-03 13:07 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs

On 09/02/13 17:20, Dave Chinner wrote:
> On Mon, Sep 02, 2013 at 12:03:37PM -0500, Mark Tinguely wrote:
>> On 09/02/13 05:52, Dave Chinner wrote:
>>> Hi folks,
>>>
>>> These failures are a result of order-4 allocations being done on v5
>>> filesystems to support the large ACL count xattrs. The first patch
>>> puts out usual falbback to vmalloc workaround in place. The second
>>> patch factors all the places we now have this fallback-to-vmalloc
>>> and makes it transparent to the callers.
>>>
>>> Cheers,
>>>
>>> Dave.
>>
>> Thanks for clean up. Broken record time: Do we really need order
>> allocation in the filesystem? Esp in xfs_ioctl.c.
>
> I don't understand your question. Are you asking why we need high
> order allocation?
>
> Cheers,
>
> Dave.

In patch 2, why not drop the physically contiguous allocation attempt 
and just do the virtually contiguous allocation?

Things that now call kmem_zalloc_large() do not need a physically 
contiguous memory, it will simplify the allocation, it will leave the 
physically contiguous pieces for other Linux code that really need it.

--Mark.

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/2] xfs: fix some new memory allocation failures
  2013-09-03 13:07     ` Mark Tinguely
@ 2013-09-03 20:04       ` Dave Chinner
  2013-09-03 20:46         ` Mark Tinguely
  0 siblings, 1 reply; 13+ messages in thread
From: Dave Chinner @ 2013-09-03 20:04 UTC (permalink / raw)
  To: Mark Tinguely; +Cc: xfs

On Tue, Sep 03, 2013 at 08:07:19AM -0500, Mark Tinguely wrote:
> On 09/02/13 17:20, Dave Chinner wrote:
> >On Mon, Sep 02, 2013 at 12:03:37PM -0500, Mark Tinguely wrote:
> >>On 09/02/13 05:52, Dave Chinner wrote:
> >>>Hi folks,
> >>>
> >>>These failures are a result of order-4 allocations being done on v5
> >>>filesystems to support the large ACL count xattrs. The first patch
> >>>puts out usual falbback to vmalloc workaround in place. The second
> >>>patch factors all the places we now have this fallback-to-vmalloc
> >>>and makes it transparent to the callers.
> >>>
> >>>Cheers,
> >>>
> >>>Dave.
> >>
> >>Thanks for clean up. Broken record time: Do we really need order
> >>allocation in the filesystem? Esp in xfs_ioctl.c.
> >
> >I don't understand your question. Are you asking why we need high
> >order allocation?
> >
> >Cheers,
> >
> >Dave.
> 
> In patch 2, why not drop the physically contiguous allocation
> attempt and just do the virtually contiguous allocation?

Because:

	a) virtual memory space is extremely limited on some
	platforms - we regularly get people reporting that they've
	exhausted vmalloc space on 32 bit systems.
	b) when there is free contiguous memory, allocating that
	contiguous memory is much faster than allocating
	virtual memory.
	c) virtual memory access is slower than physical memory
	access and it puts pressure on the page tables.

IOWs, we want to avoid allocating virtual memory if at all possible.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/2] xfs: fix some new memory allocation failures
  2013-09-03 20:04       ` Dave Chinner
@ 2013-09-03 20:46         ` Mark Tinguely
  2013-09-03 21:16           ` Dave Chinner
  0 siblings, 1 reply; 13+ messages in thread
From: Mark Tinguely @ 2013-09-03 20:46 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs

On 09/03/13 15:04, Dave Chinner wrote:
> On Tue, Sep 03, 2013 at 08:07:19AM -0500, Mark Tinguely wrote:
>> On 09/02/13 17:20, Dave Chinner wrote:
>>> On Mon, Sep 02, 2013 at 12:03:37PM -0500, Mark Tinguely wrote:
>>>> On 09/02/13 05:52, Dave Chinner wrote:
>>>>> Hi folks,
>>>>>
>>>>> These failures are a result of order-4 allocations being done on v5
>>>>> filesystems to support the large ACL count xattrs. The first patch
>>>>> puts out usual falbback to vmalloc workaround in place. The second
>>>>> patch factors all the places we now have this fallback-to-vmalloc
>>>>> and makes it transparent to the callers.
>>>>>
>>>>> Cheers,
>>>>>
>>>>> Dave.
>>>>
>>>> Thanks for clean up. Broken record time: Do we really need order
>>>> allocation in the filesystem? Esp in xfs_ioctl.c.
>>>
>>> I don't understand your question. Are you asking why we need high
>>> order allocation?
>>>
>>> Cheers,
>>>
>>> Dave.
>>
>> In patch 2, why not drop the physically contiguous allocation
>> attempt and just do the virtually contiguous allocation?
>
> Because:
>
> 	a) virtual memory space is extremely limited on some
> 	platforms - we regularly get people reporting that they've
> 	exhausted vmalloc space on 32 bit systems.
> 	b) when there is free contiguous memory, allocating that
> 	contiguous memory is much faster than allocating
> 	virtual memory.
> 	c) virtual memory access is slower than physical memory
> 	access and it puts pressure on the page tables.
>
> IOWs, we want to avoid allocating virtual memory if at all possible.
>
> Cheers,
>
> Dave.

Ummm, It is all virtual memory it all runs through page tables. The MMU 
works on virtual addresses.

It appears Linux has a special range of kernel virtual memory for the 
physical contiguous allocations and range for sparse memory allocation.

XFS does not need the physical space that backs the kernel virtual 
address to be contiguous - other parts of the kernel do. Why put 
pressure on the drivers that need order allocations when we do not need it?

--Mark.

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/2] xfs: fix some new memory allocation failures
  2013-09-03 20:46         ` Mark Tinguely
@ 2013-09-03 21:16           ` Dave Chinner
  2013-09-03 22:38             ` Mark Tinguely
  0 siblings, 1 reply; 13+ messages in thread
From: Dave Chinner @ 2013-09-03 21:16 UTC (permalink / raw)
  To: Mark Tinguely; +Cc: xfs

On Tue, Sep 03, 2013 at 03:46:24PM -0500, Mark Tinguely wrote:
> On 09/03/13 15:04, Dave Chinner wrote:
> >On Tue, Sep 03, 2013 at 08:07:19AM -0500, Mark Tinguely wrote:
> >>On 09/02/13 17:20, Dave Chinner wrote:
> >>>On Mon, Sep 02, 2013 at 12:03:37PM -0500, Mark Tinguely wrote:
> >>>>On 09/02/13 05:52, Dave Chinner wrote:
> >>>>>Hi folks,
> >>>>>
> >>>>>These failures are a result of order-4 allocations being done on v5
> >>>>>filesystems to support the large ACL count xattrs. The first patch
> >>>>>puts out usual falbback to vmalloc workaround in place. The second
> >>>>>patch factors all the places we now have this fallback-to-vmalloc
> >>>>>and makes it transparent to the callers.
> >>>>>
> >>>>>Cheers,
> >>>>>
> >>>>>Dave.
> >>>>
> >>>>Thanks for clean up. Broken record time: Do we really need order
> >>>>allocation in the filesystem? Esp in xfs_ioctl.c.
> >>>
> >>>I don't understand your question. Are you asking why we need high
> >>>order allocation?
> >>>
> >>>Cheers,
> >>>
> >>>Dave.
> >>
> >>In patch 2, why not drop the physically contiguous allocation
> >>attempt and just do the virtually contiguous allocation?
> >
> >Because:
> >
> >	a) virtual memory space is extremely limited on some
> >	platforms - we regularly get people reporting that they've
> >	exhausted vmalloc space on 32 bit systems.
> >	b) when there is free contiguous memory, allocating that
> >	contiguous memory is much faster than allocating
> >	virtual memory.
> >	c) virtual memory access is slower than physical memory
> >	access and it puts pressure on the page tables.
> >
> >IOWs, we want to avoid allocating virtual memory if at all possible.
> >
> >Cheers,
> >
> >Dave.
> 
> Ummm, It is all virtual memory it all runs through page tables. The
> MMU works on virtual addresses.

Sure, but there's a massive difference between kmalloc and
vmalloc()'d memory...

> It appears Linux has a special range of kernel virtual memory for
> the physical contiguous allocations and range for sparse memory
> allocation.

... kernel memory is directly mapped as there is a 1:!  relationship
between the virtual address of kernel memory and the physical
location of the memory. vmalloc memory has an arbitrary virtual to
physical mapping that has to be looked in a separate structure on
every page fault, same as for any userspace page fault. vmalloc
space can become fragmented, be exhausted, etc, just like kmalloc
memory can be. Indeed, there are situations where vmalloc can fail
yet kmalloc will succeed...

> XFS does not need the physical space that backs the kernel virtual
> address to be contiguous - other parts of the kernel do. Why put
> pressure on the drivers that need order allocations when we do not
> need it?

Let's just quote Linus from 2003, shall we:

http://yarchive.net/comp/linux/vmalloc.html

| > I think it'd make more sense to only use vmalloc when it's explicitly
| > too big for kmalloc - or simply switch on num_online_cpus > 100 or
| > whatever a sensible cutoff is (ie nobody but you would ever see this ;-))
| 
| No, please please please don't do these things.
| 
| vmalloc() is NOT SOMETHING YOU SHOULD EVER USE! It's only valid for when
| you _need_ a big array, and you don't have any choice. It's slow, and it's
| a very restricted resource: it's a global resource that is literally
| restricted to a few tens of megabytes. It should be _very_ carefully used.
| 
| There are basically no valid new uses of it. There's a few valid legacy
| users (I think the file descriptor array), and there are some drivers that
| use it (which is crap, but drivers are drivers), and it's _really_ valid
| only for modules. Nothing else.
| 
| Basically: if you think you need more memory than a kmalloc() can give,
| you need to re-organize your data structures. To either not need a big
| area, or to be able to allocate it in chunks.
| 
| 		Linus 

Linus will say exactly the same thing today....

And that doesn't take into account that vmalloc() or vm_map_ram()
inside an XFS transaction context can deadlock as page table entries
use GFP_KERNEL allocation, and this can happen in a GFP_NOFS context
in XFS...

Seriously, there are excellent reasons for vmalloc being considered
a bad thing and hence it's use is actively discouraged across
the entire kernel space. We only use it where absolutely necessary
to function correctly.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/2] xfs: fix some new memory allocation failures
  2013-09-03 21:16           ` Dave Chinner
@ 2013-09-03 22:38             ` Mark Tinguely
  0 siblings, 0 replies; 13+ messages in thread
From: Mark Tinguely @ 2013-09-03 22:38 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs

On 09/03/13 16:16, Dave Chinner wrote:
> Let's just quote Linus from 2003, shall we:
>
> http://yarchive.net/comp/linux/vmalloc.html
>
> |>  I think it'd make more sense to only use vmalloc when it's explicitly
> |>  too big for kmalloc - or simply switch on num_online_cpus>  100 or
> |>  whatever a sensible cutoff is (ie nobody but you would ever see this ;-))
> |
> | No, please please please don't do these things.
> |
> | vmalloc() is NOT SOMETHING YOU SHOULD EVER USE! It's only valid for when
> | you_need_  a big array, and you don't have any choice. It's slow, and it's
> | a very restricted resource: it's a global resource that is literally
> | restricted to a few tens of megabytes. It should be_very_  carefully used.
> |
> | There are basically no valid new uses of it. There's a few valid legacy
> | users (I think the file descriptor array), and there are some drivers that
> | use it (which is crap, but drivers are drivers), and it's_really_  valid
> | only for modules. Nothing else.
> |
> | Basically: if you think you need more memory than a kmalloc() can give,
> | you need to re-organize your data structures. To either not need a big
> | area, or to be able to allocate it in chunks.
> |
> | 		Linus
>
> Linus will say exactly the same thing today....

And that is very very sad that is it funny.

--Mark.

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/2] xfs: fix memory allocation failures with ACLs
  2013-09-02 10:52 ` [PATCH 1/2] xfs: fix memory allocation failures with ACLs Dave Chinner
@ 2013-09-06 19:59   ` Mark Tinguely
  0 siblings, 0 replies; 13+ messages in thread
From: Mark Tinguely @ 2013-09-06 19:59 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs

On 09/02/13 05:52, Dave Chinner wrote:
> From: Dave Chinner<dchinner@redhat.com>
>
> Ever since increasing the number of supported ACLs from 25 to as
> many as can fit in an xattr, there have been reports of order 4
> memory allocations failing in the ACL code. Fix it in the same way
> we've fixed all the xattr read/write code that has the same problem.
>
> Signed-off-by: Dave Chinner<dchinner@redhat.com>
> ---

Looks good.

Reviewed-by: Mark Tinguely <tinguely@sgi.com>

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/2] xfs: factor all the kmalloc-or-vmalloc fallback allocations
  2013-09-02 10:53 ` [PATCH 2/2] xfs: factor all the kmalloc-or-vmalloc fallback allocations Dave Chinner
@ 2013-09-06 20:37   ` Mark Tinguely
  0 siblings, 0 replies; 13+ messages in thread
From: Mark Tinguely @ 2013-09-06 20:37 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs

On 09/02/13 05:53, Dave Chinner wrote:
> From: Dave Chinner<dchinner@redhat.com>
>
> We have quite a few places now where we do:
>
> 	x = kmem_zalloc(large size)
> 	if (!x)
> 		x = kmem_zalloc_large(large size)
>
> and do a similar dance when freeing the memory. kmem_free() already
> does the correct freeing dance, and kmem_zalloc_large() is only ever
> called in these constructs, so just factor it all into
> kmem_zalloc_large() and kmem_free().
>
> Signed-off-by: Dave Chinner<dchinner@redhat.com>
> ---

Good start on the allocation changes.

Reviewed-by: Mark Tinguely <tinguely@sgi.com>

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/2] xfs: fix some new memory allocation failures
  2013-09-02 10:52 [PATCH 0/2] xfs: fix some new memory allocation failures Dave Chinner
                   ` (2 preceding siblings ...)
  2013-09-02 17:03 ` [PATCH 0/2] xfs: fix some new memory allocation failures Mark Tinguely
@ 2013-09-10 22:40 ` Ben Myers
  3 siblings, 0 replies; 13+ messages in thread
From: Ben Myers @ 2013-09-10 22:40 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs

On Mon, Sep 02, 2013 at 08:52:58PM +1000, Dave Chinner wrote:
> Hi folks,
> 
> These failures are a result of order-4 allocations being done on v5
> filesystems to support the large ACL count xattrs. The first patch
> puts out usual falbback to vmalloc workaround in place. The second
> patch factors all the places we now have this fallback-to-vmalloc
> and makes it transparent to the callers.

Applied these 2.

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2013-09-10 22:40 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-09-02 10:52 [PATCH 0/2] xfs: fix some new memory allocation failures Dave Chinner
2013-09-02 10:52 ` [PATCH 1/2] xfs: fix memory allocation failures with ACLs Dave Chinner
2013-09-06 19:59   ` Mark Tinguely
2013-09-02 10:53 ` [PATCH 2/2] xfs: factor all the kmalloc-or-vmalloc fallback allocations Dave Chinner
2013-09-06 20:37   ` Mark Tinguely
2013-09-02 17:03 ` [PATCH 0/2] xfs: fix some new memory allocation failures Mark Tinguely
2013-09-02 22:20   ` Dave Chinner
2013-09-03 13:07     ` Mark Tinguely
2013-09-03 20:04       ` Dave Chinner
2013-09-03 20:46         ` Mark Tinguely
2013-09-03 21:16           ` Dave Chinner
2013-09-03 22:38             ` Mark Tinguely
2013-09-10 22:40 ` Ben Myers

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.