* [LSF/MM TOPIC] linux servers as a storage server - what's missing?
@ 2011-12-21 15:59 Ric Wheeler
  2011-12-22  8:14   ` Shyam_Iyer
                   ` (2 more replies)
  0 siblings, 3 replies; 50+ messages in thread
From: Ric Wheeler @ 2011-12-21 15:59 UTC (permalink / raw)
  To: linux-fsdevel, linux-scsi


One common thing that I see a lot of these days is an increasing number of 
platforms that are built on our stack as storage servers. Ranging from the 
common linux based storage/NAS devices up to various distributed systems.  
Almost all of them use our common stack - software RAID, LVM, XFS/ext4 and samba.

At last year's SNIA developers conference, it was clear that Microsoft is 
putting a lot of effort into enhancing windows 8 server as a storage server with 
both support for a pNFS server and of course SMB. I think that linux (+samba) is 
ahead of the windows based storage appliances today, but they are putting 
together a very aggressive list of features.

I think that it would be useful and interesting to take a slot at this year's 
LSF to see how we are doing in this space. How large do we need to scale for an 
appliance?  What kind of work is needed (support for the copy offload system 
call? better support for out of band notifications like those used in "thinly 
provisioned" SCSI devices? management API's? Ease of use CLI work? SMB2.2 support?).

The goal would be to see what technical gaps we have that need more active 
development in, not just a wish list :)
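
To make the copy offload item concrete: today a "server side" copy still drags 
every byte through server memory, roughly like the untested sketch below, even 
when source and destination live on the same array. A copy offload interface 
would let the kernel hand that loop to the storage instead.

/* What a server-side copy looks like today: read() and write() move
 * all of the data through server memory.  A copy offload call would
 * let the kernel push this down to the storage (e.g. SCSI EXTENDED
 * COPY) when the backing store supports it.  Untested sketch. */
#include <unistd.h>
#include <sys/types.h>

static ssize_t copy_by_hand(int src_fd, int dst_fd, size_t len)
{
	char buf[64 * 1024];
	size_t done = 0;

	while (done < len) {
		size_t want = len - done < sizeof(buf) ? len - done : sizeof(buf);
		ssize_t n = read(src_fd, buf, want);

		if (n < 0)
			return -1;
		if (n == 0)
			break;			/* source ended early */
		if (write(dst_fd, buf, n) != n)
			return -1;		/* short write: give up in this sketch */
		done += n;
	}
	return done;
}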

Ric


^ permalink raw reply	[flat|nested] 50+ messages in thread

* RE: [LSF/MM TOPIC] linux servers as a storage server - what's missing?
  2011-12-21 15:59 [LSF/MM TOPIC] linux servers as a storage server - what's missing? Ric Wheeler
@ 2011-12-22  8:14   ` Shyam_Iyer
  2012-01-03 19:26 ` [LSF/MM TOPIC][ATTEND] linux servers as a storage server - what's missing? Jeff Layton
  2012-01-18 17:00 ` [LSF/MM TOPIC] " Roland Dreier
  2 siblings, 0 replies; 50+ messages in thread
From: Shyam_Iyer @ 2011-12-22  8:14 UTC (permalink / raw)
  To: rwheeler, linux-fsdevel, linux-scsi



> -----Original Message-----
> From: linux-scsi-owner@vger.kernel.org [mailto:linux-scsi-
> owner@vger.kernel.org] On Behalf Of Ric Wheeler
> Sent: Wednesday, December 21, 2011 11:00 AM
> To: linux-fsdevel@vger.kernel.org; linux-scsi@vger.kernel.org
> Subject: [LSF/MM TOPIC] linux servers as a storage server - what's
> missing?
> 
> 
> One common thing that I see a lot of these days is an increasing number
> of
> platforms that are built on our stack as storage servers. Ranging from
> the
> common linux based storage/NAS devices up to various distributed
> systems.
> Almost all of them use our common stack - software RAID, LVM, XFS/ext4
> and samba.
> 
> At last year's SNIA developers conference, it was clear that Microsoft
> is
> putting a lot of effort into enhancing windows 8 server as a storage
> server with
> both support for a pNFS server and of course SMB. I think that linux
> (+samba) is
> ahead of the windows based storage appliances today, but they are
> putting
> together a very aggressive list of features.
> 
> I think that it would be useful and interesting to take a slot at this
> year's
> LSF to see how we are doing in this space. How large do we need to
> scale for an
> appliance?  What kind of work is needed (support for the copy offload
> system
> call? better support for out of band notifications like those used in
> "thinly
> provisioned" SCSI devices? management API's? Ease of use CLI work?
> SMB2.2 support?).
> 
> The goal would be to see what technical gaps we have that need more
> active
> development in, not just a wish list :)
> 
> Ric

Working for a company that works with different OS vendors, I get involved in discussions about what Linux offers and what it doesn't, and where the gaps are, both at the code level and in customer usage patterns.

A few things stand out:

- Management models vs. performance models.

I tend to think that we (Linux folks) focus on the performance paradigm in the kernel and leave the management paradigms for the big vendors to play with, which leaves sysadmins with a lot of inconsistency in storage management.

Think of the analogy of traffic with rules versus traffic without rules.
Traffic without rules lets a skilled expert driver navigate swiftly and reach the destination much faster than everyone else, but it leaves the non-driving passenger with a bad feeling in the stomach.
With Linux, the customer is that non-driving passenger.

For example: if someone wanted to build a decent use case combining a clustered framework, NFS/pNFS, and an iSCSI storage backend supporting copy offload, while also managing backups, they would end up juggling a separate management window for each piece - unless a vendor is willing to earn some extra brownie points from the customer by wiring the whole thing up and packaging it as a framework. And when a feature such as copy offload is not implemented in a particular filesystem or kernel subsystem, it takes a lot of synchronization across components, which means the feature generally takes a long time to evolve.

The kernel feature is usually implemented with performance in mind, but the management of the feature is usually left to the user.

"Vendor" here includes OS distributions and stakeholder storage companies.

If I flip this over to what other OSes offer:

1) A consistent clustered filesystem that supports performance-oriented features like copy offload and optimization features like thin provisioning
2) A management API for things like thin provisioning, with well-documented hooks for writing vendor-specific plugins
3) GUI/CLI support
4) Backup management/API with hooks for vendor plugins

Usually all of this sits within a common framework or a single management window, providing a consistent view.

Simple asks -
1) Provide a consistent storage and fs management library that discourages folks from writing their own userspace storage libraries. Include things like fs formatting (fs profiles), transport configuration (e.g. iscsiadm as a library), thin provisioning watermarks, cluster management, APIs for cgroups, etc. The library should provide a clean set of rules/interfaces to build management apps against (a rough sketch follows at the end of this mail).
Think of the Android marketplace providing a well-defined framework for app writers. Let the distributions/storage companies write their own cool apps on top of this framework.

2) View kernel implementations like copy offload, thin provisioning, snapshots and watermarks in conjunction with this storage library. That is, a use case would be discussed for inclusion in the library before the kernel work starts.

3) And this may sound controversial, but in spite of being a long-time Linux fan, user and observer, I would say: provide hooks for folks to write clean plugins that let them protect their proprietary work by bundling binary blobs.
Usually folks want to keep plugins in this area proprietary because -
    a) No other storage vendor provides an open source plugin. So if you are a storage vendor listening, this might be your cue to start the avalanche.
    b) They have an IP protection agreement with another OS vendor.
    c) A startup is protecting its IP.
The benefits of open sourcing are usually realized when maintaining code.. :-) not when pitching it against the simpler management frameworks offered by other OS vendors, who are able to offer the feature because the vendors mutually want to keep it proprietary.
(The last one being my personal opinion and not that of my employer, an increasingly storage-focused company.)

/me fully expects brickbats, but as they say where I come from - a fool can always try his luck a few times and get wise in the process.. :-)
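
To give ask 1) a concrete shape (as promised above) - every name below is 
invented purely for illustration; this is a sketch of the kind of surface such 
a library could expose, not a proposal for the actual API:

/* storage_mgmt.h - hypothetical unified storage management library.
 * All names are made up for illustration.  The point is that fs
 * creation, transport setup, thin provisioning watermarks and
 * notifications would hang off one library with one set of rules. */
#ifndef STORAGE_MGMT_H
#define STORAGE_MGMT_H

#include <stdint.h>

struct sm_fs;			/* a managed filesystem instance */
struct sm_transport;		/* an iSCSI/FC/FCoE session handle */

/* filesystem management: profiles instead of hand-rolled mkfs flags */
struct sm_fs *sm_fs_create(const char *dev, const char *profile);
int sm_fs_snapshot(struct sm_fs *fs, const char *snap_name);

/* transport configuration: "iscsiadm as a library" instead of exec() */
struct sm_transport *sm_iscsi_session_create(const char *target_iqn,
					     const char *portal);

/* thin provisioning: watermarks and out-of-band notifications */
int sm_thinp_set_watermark(struct sm_fs *fs, uint8_t percent_used);
int sm_thinp_register_alarm(struct sm_fs *fs,
			    void (*cb)(struct sm_fs *fs, uint8_t used));

#endif /* STORAGE_MGMT_H */

Distributions and storage companies could then build their cool apps against 
one set of headers instead of shelling out to mkfs, iscsiadm and friends.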



^ permalink raw reply	[flat|nested] 50+ messages in thread


* Re: [LSF/MM TOPIC] linux servers as a storage server - what's missing?
  2011-12-22  8:14   ` Shyam_Iyer
  (?)
@ 2011-12-22 15:58   ` Vivek Goyal
  2011-12-22 20:54       ` Shyam_Iyer
  -1 siblings, 1 reply; 50+ messages in thread
From: Vivek Goyal @ 2011-12-22 15:58 UTC (permalink / raw)
  To: Shyam_Iyer; +Cc: rwheeler, linux-fsdevel, linux-scsi

On Thu, Dec 22, 2011 at 01:44:16PM +0530, Shyam_Iyer@Dell.com wrote:

[..]

> Simple asks -
> 1) Provide a consistent storage and fs management library that discourages folks to write their own usespace storage library. Include things like fs formatting(fs profiles), transport configuration(eg: iscsiadm as a library), thin provisioning watermarks, cluster management, apis for cgroups etc.
                                      ^^^^^^^^^^^^^^^^
For cgroups, we have the libcgroup library. Not many people like to use it,
though, as cgroups are exported as a filesystem and they prefer to use the
normal libc API to traverse and configure cgroups (instead of going through
another library). Some examples include libvirt and systemd.
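
For example, the direct route is just a mkdir and a couple of writes into the
cgroup filesystem. A minimal, untested sketch, assuming the v1 blkio
controller is mounted at /sys/fs/cgroup/blkio:

/* Put the calling process into a new blkio cgroup with a reduced IO
 * weight, using nothing but libc against the cgroup filesystem. */
#include <errno.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

static int setup_blkio_group(void)
{
	const char *grp = "/sys/fs/cgroup/blkio/storage_app";
	char path[128];
	FILE *f;

	if (mkdir(grp, 0755) && errno != EEXIST)
		return -1;

	/* half of the default blkio weight of 500 */
	snprintf(path, sizeof(path), "%s/blkio.weight", grp);
	if (!(f = fopen(path, "w")))
		return -1;
	fprintf(f, "250\n");
	fclose(f);

	/* move ourselves into the group */
	snprintf(path, sizeof(path), "%s/tasks", grp);
	if (!(f = fopen(path, "w")))
		return -1;
	fprintf(f, "%d\n", getpid());
	fclose(f);
	return 0;
}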

Thanks
Vivek

^ permalink raw reply	[flat|nested] 50+ messages in thread

* RE: [LSF/MM TOPIC] linux servers as a storage server - what's missing?
  2011-12-22 15:58   ` Vivek Goyal
@ 2011-12-22 20:54       ` Shyam_Iyer
  0 siblings, 0 replies; 50+ messages in thread
From: Shyam_Iyer @ 2011-12-22 20:54 UTC (permalink / raw)
  To: vgoyal; +Cc: rwheeler, linux-fsdevel, linux-scsi



> -----Original Message-----
> From: linux-scsi-owner@vger.kernel.org [mailto:linux-scsi-
> owner@vger.kernel.org] On Behalf Of Vivek Goyal
> Sent: Thursday, December 22, 2011 10:59 AM
> To: Iyer, Shyam
> Cc: rwheeler@redhat.com; linux-fsdevel@vger.kernel.org; linux-
> scsi@vger.kernel.org
> Subject: Re: [LSF/MM TOPIC] linux servers as a storage server - what's
> missing?
> 
> On Thu, Dec 22, 2011 at 01:44:16PM +0530, Shyam_Iyer@Dell.com wrote:
> 
> [..]
> 
> > Simple asks -
> > 1) Provide a consistent storage and fs management library that
> discourages folks to write their own usespace storage library. Include
> things like fs formatting(fs profiles), transport configuration(eg:
> iscsiadm as a library), thin provisioning watermarks, cluster
> management, apis for cgroups etc.
>                                       ^^^^^^^^^^^^^^^^
> For cgroups, we have libcgroup library. Not many people like to use it
> though as cgroup is exported as a filesystem and they prefer to use
> normal
> libc api to traverse and configure cgroups (Instead of going through
> another library). Some examples include libvrit, systemd.
> 
> Thanks
> Vivek

Well, honestly, I think that is a libvirt/systemd issue - and libvirt also invokes things like iscsiadm, dcb, etc. as external binaries :-/

Someone could always use qemu command lines to invoke KVM/Xen directly, but libvirt has saved me many a day by letting me do a quick operation without puzzling over a qemu command line.
 
I am also asking for ideas on how to avoid this fragmentation, because in the absence of a common storage management framework others are encouraged to do their own libc thing, just like libvirt did.

Does the standard interface for Linux end at the user/kernel boundary or at the user/libc boundary? If it does, I feel we will continue to lag behind other OSes in features because of that model.


^ permalink raw reply	[flat|nested] 50+ messages in thread


* Re: [LSF/MM TOPIC] linux servers as a storage server - what's missing?
  2011-12-22 20:54       ` Shyam_Iyer
  (?)
@ 2011-12-23  3:06       ` Vivek Goyal
  2011-12-23  4:35           ` Shyam_Iyer
  -1 siblings, 1 reply; 50+ messages in thread
From: Vivek Goyal @ 2011-12-23  3:06 UTC (permalink / raw)
  To: Shyam_Iyer; +Cc: rwheeler, linux-fsdevel, linux-scsi

On Fri, Dec 23, 2011 at 02:24:42AM +0530, Shyam_Iyer@Dell.com wrote:
> 
> 
> > -----Original Message-----
> > From: linux-scsi-owner@vger.kernel.org [mailto:linux-scsi-
> > owner@vger.kernel.org] On Behalf Of Vivek Goyal
> > Sent: Thursday, December 22, 2011 10:59 AM
> > To: Iyer, Shyam
> > Cc: rwheeler@redhat.com; linux-fsdevel@vger.kernel.org; linux-
> > scsi@vger.kernel.org
> > Subject: Re: [LSF/MM TOPIC] linux servers as a storage server - what's
> > missing?
> > 
> > On Thu, Dec 22, 2011 at 01:44:16PM +0530, Shyam_Iyer@Dell.com wrote:
> > 
> > [..]
> > 
> > > Simple asks -
> > > 1) Provide a consistent storage and fs management library that
> > discourages folks to write their own usespace storage library. Include
> > things like fs formatting(fs profiles), transport configuration(eg:
> > iscsiadm as a library), thin provisioning watermarks, cluster
> > management, apis for cgroups etc.
> >                                       ^^^^^^^^^^^^^^^^
> > For cgroups, we have libcgroup library. Not many people like to use it
> > though as cgroup is exported as a filesystem and they prefer to use
> > normal
> > libc api to traverse and configure cgroups (Instead of going through
> > another library). Some examples include libvrit, systemd.
> > 
> > Thanks
> > Vivek
> 
> Well honestly I think that is a libvirt/systemd issue and libvirt also invokes things like iscsiadm, dcb etc as a binary :-/
> 
> Some one could always use qemu command lines to invoke KVM/XEN but libvirt has saved me one too many days in doing a quick operation without wondering about a qemu commandline.
>  
> I am also asking for ideas on how to avoid this fragmentation because just like libvirt others are also encouraged to do their own libc thing in the absence of a common storage management framework..
> 
> Does the standard interface for linux end at the user/kernel boundary or the user/libc boundary? If so I feel we would continue to lag behind other OSes in features because of the model. 

This is true only for IO cgroup management, and there is not much to be done
there. For basic management, an application can just write 500 lines of code
and be done with it.

libcgroup offers a bunch of command-line operations too.

Do you have something in mind - what do applications expect out of an IO
cgroup library, and what do other OSes support? Don't extend this libc
argument to iSCSI and the other storage management requirements.
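
For comparison, the couple-of-writes job from your earlier mail looks roughly
like the sketch below through libcgroup - written from memory and untested,
so check the current <libcgroup.h> before trusting the exact calls:

/* Rough libcgroup equivalent of doing the cgroupfs writes by hand. */
#include <libcgroup.h>
#include <unistd.h>

static int setup_blkio_group_libcg(void)
{
	struct cgroup *cg;
	struct cgroup_controller *blkio;

	if (cgroup_init())
		return -1;
	if (!(cg = cgroup_new_cgroup("storage_app")))
		return -1;

	if (!(blkio = cgroup_add_controller(cg, "blkio")) ||
	    cgroup_add_value_uint64(blkio, "blkio.weight", 250) ||
	    cgroup_create_cgroup(cg, 0) ||
	    cgroup_attach_task_pid(cg, getpid())) {
		cgroup_free(&cg);
		return -1;
	}

	cgroup_free(&cg);
	return 0;
}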

Thanks
Vivek 

^ permalink raw reply	[flat|nested] 50+ messages in thread

* RE: [LSF/MM TOPIC] linux servers as a storage server - what's missing?
  2011-12-23  3:06       ` Vivek Goyal
@ 2011-12-23  4:35           ` Shyam_Iyer
  0 siblings, 0 replies; 50+ messages in thread
From: Shyam_Iyer @ 2011-12-23  4:35 UTC (permalink / raw)
  To: vgoyal; +Cc: rwheeler, linux-fsdevel, linux-scsi

> -----Original Message-----
> From: Vivek Goyal [mailto:vgoyal@redhat.com]
> Sent: Thursday, December 22, 2011 10:07 PM
> To: Iyer, Shyam
> Cc: rwheeler@redhat.com; linux-fsdevel@vger.kernel.org; linux-
> scsi@vger.kernel.org
> Subject: Re: [LSF/MM TOPIC] linux servers as a storage server - what's
> missing?
> 
> On Fri, Dec 23, 2011 at 02:24:42AM +0530, Shyam_Iyer@Dell.com wrote:
> >
> >
> > > -----Original Message-----
> > > From: linux-scsi-owner@vger.kernel.org [mailto:linux-scsi-
> > > owner@vger.kernel.org] On Behalf Of Vivek Goyal
> > > Sent: Thursday, December 22, 2011 10:59 AM
> > > To: Iyer, Shyam
> > > Cc: rwheeler@redhat.com; linux-fsdevel@vger.kernel.org; linux-
> > > scsi@vger.kernel.org
> > > Subject: Re: [LSF/MM TOPIC] linux servers as a storage server -
> what's
> > > missing?
> > >
> > > On Thu, Dec 22, 2011 at 01:44:16PM +0530, Shyam_Iyer@Dell.com
> wrote:
> > >
> > > [..]
> > >
> > > > Simple asks -
> > > > 1) Provide a consistent storage and fs management library that
> > > discourages folks to write their own usespace storage library.
> Include
> > > things like fs formatting(fs profiles), transport configuration(eg:
> > > iscsiadm as a library), thin provisioning watermarks, cluster
> > > management, apis for cgroups etc.
> > >                                       ^^^^^^^^^^^^^^^^
> > > For cgroups, we have libcgroup library. Not many people like to use
> it
> > > though as cgroup is exported as a filesystem and they prefer to use
> > > normal
> > > libc api to traverse and configure cgroups (Instead of going
> through
> > > another library). Some examples include libvrit, systemd.
> > >
> > > Thanks
> > > Vivek
> >
> > Well honestly I think that is a libvirt/systemd issue and libvirt
> also invokes things like iscsiadm, dcb etc as a binary :-/
> >
> > Some one could always use qemu command lines to invoke KVM/XEN but
> libvirt has saved me one too many days in doing a quick operation
> without wondering about a qemu commandline.
> >
> > I am also asking for ideas on how to avoid this fragmentation because
> just like libvirt others are also encouraged to do their own libc thing
> in the absence of a common storage management framework..
> >
> > Does the standard interface for linux end at the user/kernel boundary
> or the user/libc boundary? If so I feel we would continue to lag behind
> other OSes in features because of the model.
> 
> This is true only for IO cgroup management. There is not much to be
> done. For
> basic management, an applicatoin can just write 500 lines of code and
> be
> done with it.
> 
> libcgroup does offer bunch of commnad lines operations too.
> 
> Do you have something in mind, what applications expect out of a IO
> cgroup
> library and what other OSes are supporting. Don't extend this libc
> thing
> to iscsi, and other storage management requirements.

Sorry Vivek, but that is just one of the points in my original post.

I am trying to make points about how to improve Linux as a storage server, so I don't want to restrict the discussion to IO cgroups alone.

The problem is the lack of a framework that looks somewhat like this (I hope the formatting is preserved):



	Fs management:        Fs_create(), Snap_create, ...
	IO cgroup management: bw, ...
	Monitoring APIs:      log_dump, scsi_log, fs_log, io_log
	HBA management:       HBA APIs - hba_create, iscsi_session_create, fc_login
	Thin provisioning:    Watermark (High, ...), Alarms/Notifications

------------------------------------------------------------ Storage API ------------------------------------------------------------




Double clicking on FS management:

	VFS-level calls (e.g. Fs_clone, Copy Offload)
	                  |
	   +--------------+--------------+
	   |              |              |
	  ext4          btrfs           nfs


For example: if ext4 did not support discard, it would return an error from the corresponding fs API.

Unless you have a unified interface, there won't be uniformity of features, or the coordination needed to make Linux a complete storage server with apps doing things at the higher layers.

Today, whether xfs supports copy offload is an intuitive guess on the part of the app running on top of it.
And if you want to create thin provisioning watermarks, snapshot management is possibly a different tool altogether.
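
In other words an app should be able to probe rather than guess. Continuing
the invented sm_* names from the sketch in my earlier mail (still purely
hypothetical):

/* Hypothetical capability probe - every name invented for illustration. */
#include <stddef.h>

struct sm_fs;

enum sm_fs_feature {
	SM_FEAT_DISCARD,
	SM_FEAT_COPY_OFFLOAD,
	SM_FEAT_THIN_WATERMARK,
};

int sm_fs_supports(struct sm_fs *fs, enum sm_fs_feature feat);
int sm_fs_clone_range(struct sm_fs *fs, int src_fd, int dst_fd, size_t len);
int plain_copy(int src_fd, int dst_fd, size_t len);

/* A backup app can then stop guessing about the filesystem underneath: */
static int backup_extent(struct sm_fs *fs, int src_fd, int dst_fd, size_t len)
{
	if (sm_fs_supports(fs, SM_FEAT_COPY_OFFLOAD))
		return sm_fs_clone_range(fs, src_fd, dst_fd, len); /* offloaded */
	return plain_copy(src_fd, dst_fd, len);                    /* fall back */
}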


I think the reason libvirt grew so many features so quickly is its modular architecture and the clear interface it provides for creating and managing virtual machines.

Today, if I had to write a virtualization management tool with libvirt as the back end for VM management, it is really only a question of whether the API support is there or not.

So, in that sense, Linux has matured as a virtualization server. I can't say the same for storage.

While we are talking about storage servers, it is important to note whether we mean servers with local storage or with attached remote storage, which is why I added transport management APIs to the framework along with the HBA management APIs.

This is why I think storage management deserves a better interface than raw libc/ioctl calls and sysfs operations.


On your question about what applications would like out of IO cgroups..

I think that is a different discussion ;-) but nevertheless, here is something I was thinking about.

APIs around the following (there can be more):
- IO timeouts: driver/controller, SCSI, block, fs, etc.
- IO scheduler tunings: io_delay, read_ahead, ...
- Bandwidth profiles
- IOPS stats
- IO resource monitoring, e.g. spindle movement per application run (this would help with data placement)

Folks do a lot of tuning to qualify a storage server. Having APIs to manage these knobs and auto-calibrate them against the application performance load would be awesome (a sketch of the sort of scraping this would replace follows below).
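
For the stats item, today a tool ends up parsing /proc and sysfs by hand. For
example, per-device IOPS means scraping /proc/diskstats like the sketch below
(field layout per Documentation/iostats.txt); sample it twice and divide the
delta by the interval:

/* Read the completed read/write counts for one block device from
 * /proc/diskstats.  Two samples a known interval apart give IOPS. */
#include <stdio.h>
#include <string.h>

static int disk_io_completed(const char *dev,
			     unsigned long long *reads,
			     unsigned long long *writes)
{
	char line[256], name[32];
	unsigned long long r, rmerge, rsect, rticks, w;
	FILE *f = fopen("/proc/diskstats", "r");

	if (!f)
		return -1;
	while (fgets(line, sizeof(line), f)) {
		/* major minor name reads rmerge rsect rticks writes ... */
		if (sscanf(line, " %*u %*u %31s %llu %llu %llu %llu %llu",
			   name, &r, &rmerge, &rsect, &rticks, &w) == 6 &&
		    strcmp(name, dev) == 0) {
			*reads = r;
			*writes = w;
			fclose(f);
			return 0;
		}
	}
	fclose(f);
	return -1;
}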



^ permalink raw reply	[flat|nested] 50+ messages in thread


* Re: [LSF/MM TOPIC][ATTEND] linux servers as a storage server - what's missing?
  2011-12-21 15:59 [LSF/MM TOPIC] linux servers as a storage server - what's missing? Ric Wheeler
  2011-12-22  8:14   ` Shyam_Iyer
@ 2012-01-03 19:26 ` Jeff Layton
  2012-01-03 19:32   ` Chuck Lever
  2012-01-24 21:36   ` J. Bruce Fields
  2012-01-18 17:00 ` [LSF/MM TOPIC] " Roland Dreier
  2 siblings, 2 replies; 50+ messages in thread
From: Jeff Layton @ 2012-01-03 19:26 UTC (permalink / raw)
  To: Ric Wheeler; +Cc: linux-fsdevel, linux-scsi

On Wed, 21 Dec 2011 10:59:43 -0500
Ric Wheeler <rwheeler@redhat.com> wrote:

> 
> One common thing that I see a lot of these days is an increasing number of 
> platforms that are built on our stack as storage servers. Ranging from the 
> common linux based storage/NAS devices up to various distributed systems.  
> Almost all of them use our common stack - software RAID, LVM, XFS/ext4 and samba.
> 
> At last year's SNIA developers conference, it was clear that Microsoft is 
> putting a lot of effort into enhancing windows 8 server as a storage server with 
> both support for a pNFS server and of course SMB. I think that linux (+samba) is 
> ahead of the windows based storage appliances today, but they are putting 
> together a very aggressive list of features.
> 
> I think that it would be useful and interesting to take a slot at this year's 
> LSF to see how we are doing in this space. How large do we need to scale for an 
> appliance?  What kind of work is needed (support for the copy offload system 
> call? better support for out of band notifications like those used in "thinly 
> provisioned" SCSI devices? management API's? Ease of use CLI work? SMB2.2 support?).
> 
> The goal would be to see what technical gaps we have that need more active 
> development in, not just a wish list :)
> 
> Ric

Unfortunately, w/o a wishlist of sorts, it's hard to know what needs
more active development ;).

While HCH will probably disagree, being able to support more
NFSv4/Windows API features at the VFS layer would make it a lot easier
to do a more unified serving appliance. Right now, both knfsd and samba
track too much info internally, and that makes it very difficult to
serve the same data via multiple protocols.

Off the top of my head, my "wishlist" for better NFSv4 serving would be:

- RichACLs
- Share/Deny mode support on open
- mandatory locking that doesn't rely on weirdo file modes

It's always going to be hard for us to compete with dedicated
appliances. Where Linux can shine though is in allowing for more
innovative combinations.

Being able to do active/active NFS serving from clustered filesystems,
for instance, is something that we can eventually attain but that would
be harder to do in an appliance. This sort of discussion might also
dovetail with Benny's proposal about pNFS serving.

-- 
Jeff Layton <jlayton@redhat.com>

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [LSF/MM TOPIC][ATTEND] linux servers as a storage server - what's missing?
  2012-01-03 19:26 ` [LSF/MM TOPIC][ATTEND] linux servers as a storage server - what's missing? Jeff Layton
@ 2012-01-03 19:32   ` Chuck Lever
  2012-01-17 21:16     ` J. Bruce Fields
  2012-01-24 21:36   ` J. Bruce Fields
  1 sibling, 1 reply; 50+ messages in thread
From: Chuck Lever @ 2012-01-03 19:32 UTC (permalink / raw)
  To: Jeff Layton; +Cc: Ric Wheeler, linux-fsdevel, linux-scsi


On Jan 3, 2012, at 2:26 PM, Jeff Layton wrote:

> On Wed, 21 Dec 2011 10:59:43 -0500
> Ric Wheeler <rwheeler@redhat.com> wrote:
> 
>> 
>> One common thing that I see a lot of these days is an increasing number of 
>> platforms that are built on our stack as storage servers. Ranging from the 
>> common linux based storage/NAS devices up to various distributed systems.  
>> Almost all of them use our common stack - software RAID, LVM, XFS/ext4 and samba.
>> 
>> At last year's SNIA developers conference, it was clear that Microsoft is 
>> putting a lot of effort into enhancing windows 8 server as a storage server with 
>> both support for a pNFS server and of course SMB. I think that linux (+samba) is 
>> ahead of the windows based storage appliances today, but they are putting 
>> together a very aggressive list of features.
>> 
>> I think that it would be useful and interesting to take a slot at this year's 
>> LSF to see how we are doing in this space. How large do we need to scale for an 
>> appliance?  What kind of work is needed (support for the copy offload system 
>> call? better support for out of band notifications like those used in "thinly 
>> provisioned" SCSI devices? management API's? Ease of use CLI work? SMB2.2 support?).
>> 
>> The goal would be to see what technical gaps we have that need more active 
>> development in, not just a wish list :)
>> 
>> Ric
> 
> Unfortunately, w/o a wishlist of sorts, it's hard to know what needs
> more active development ;).
> 
> While HCH will probably disagree, being able to support more
> NFSv4/Windows API features at the VFS layer would make it a lot easier
> to do a more unified serving appliance. Right now, both knfsd and samba
> track too much info internally, and that makes it very difficult to
> serve the same data via multiple protocols.
> 
> Off the top of my head, my "wishlist" for better NFSv4 serving would be:
> 
> - RichACLs
> - Share/Deny mode support on open
> - mandatory locking that doesn't rely on weirdo file modes

To add a few more NFSv4 related items:

 - Simplified ID mapping and security configuration
 - Support for NFSv4 migration and replication
 - Better server observability (for operational and performance debugging in the field)
 - FedFS and NFS basic junctions (already under way)

> It's always going to be hard for us to compete with dedicated
> appliances. Where Linux can shine though is in allowing for more
> innovative combinations.
> 
> Being able to do active/active NFS serving from clustered filesystems,
> for instance is something that we can eventually attain but that would
> be harder to do in an appliance. This sort of discussion might also
> dovetail with Benny's proposal about pNFS serving.
> 
> -- 
> Jeff Layton <jlayton@redhat.com>

-- 
Chuck Lever
chuck[dot]lever[at]oracle[dot]com





^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [LSF/MM TOPIC] linux servers as a storage server - what's missing?
  2011-12-22 20:54       ` Shyam_Iyer
  (?)
  (?)
@ 2012-01-09 12:18       ` Hannes Reinecke
  2012-01-09 12:59         ` Tom Coughlan
  -1 siblings, 1 reply; 50+ messages in thread
From: Hannes Reinecke @ 2012-01-09 12:18 UTC (permalink / raw)
  To: Shyam_Iyer; +Cc: vgoyal, rwheeler, linux-fsdevel, linux-scsi

On 12/22/2011 09:54 PM, Shyam_Iyer@Dell.com wrote:
> 
> 
>> -----Original Message-----
>> From: linux-scsi-owner@vger.kernel.org [mailto:linux-scsi-
>> owner@vger.kernel.org] On Behalf Of Vivek Goyal
>> Sent: Thursday, December 22, 2011 10:59 AM
>> To: Iyer, Shyam
>> Cc: rwheeler@redhat.com; linux-fsdevel@vger.kernel.org; linux-
>> scsi@vger.kernel.org
>> Subject: Re: [LSF/MM TOPIC] linux servers as a storage server - what's
>> missing?
>>
>> On Thu, Dec 22, 2011 at 01:44:16PM +0530, Shyam_Iyer@Dell.com wrote:
>>
>> [..]
>>
>>> Simple asks -
>>> 1) Provide a consistent storage and fs management library that
>> discourages folks to write their own usespace storage library. Include
>> things like fs formatting(fs profiles), transport configuration(eg:
>> iscsiadm as a library), thin provisioning watermarks, cluster
>> management, apis for cgroups etc.
>>                                       ^^^^^^^^^^^^^^^^
>> For cgroups, we have libcgroup library. Not many people like to use it
>> though as cgroup is exported as a filesystem and they prefer to use
>> normal
>> libc api to traverse and configure cgroups (Instead of going through
>> another library). Some examples include libvrit, systemd.
>>
>> Thanks
>> Vivek
> 
> Well honestly I think that is a libvirt/systemd issue and libvirt also
> invokes things like iscsiadm, dcb etc as a binary :-/
> 
> Some one could always use qemu command lines to invoke KVM/XEN but
> libvirt has saved me one too many days in doing a quick operation
> without wondering about a qemu commandline.
>  
> I am also asking for ideas on how to avoid this fragmentation because
> just like libvirt others are also encouraged to do their own libc thing
> in the absence of a common storage management framework..
> 
> Does the standard interface for linux end at the user/kernel boundary
> or the user/libc boundary? If so I feel we would continue to lag behind
> other OSes in features because of the model.
> 
StorageAPI _again_.

I was under the impression RH had someone working on it.
(Actually I was trying to give it a go, but then got buried under
customer escalations).

So yes, we know there is a shortcoming.
And yes, we should improve things.

But I feel another discussion about this will only give us more
insight without actually moving things forward.

What about having a separate session at the storage summit (or even
at the collab summit) to hammer out the requirements here?

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@suse.de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [LSF/MM TOPIC] linux servers as a storage server - what's missing?
  2012-01-09 12:18       ` Hannes Reinecke
@ 2012-01-09 12:59         ` Tom Coughlan
  2012-01-10  6:53           ` Ric Wheeler
  2012-01-19 16:17             ` Loke, Chetan
  0 siblings, 2 replies; 50+ messages in thread
From: Tom Coughlan @ 2012-01-09 12:59 UTC (permalink / raw)
  To: Hannes Reinecke, tasleson
  Cc: Shyam_Iyer, vgoyal, rwheeler, linux-fsdevel, linux-scsi

On Mon, 2012-01-09 at 13:18 +0100, Hannes Reinecke wrote:
> On 12/22/2011 09:54 PM, Shyam_Iyer@Dell.com wrote:
> > 
> > 
> >> -----Original Message-----
> >> From: linux-scsi-owner@vger.kernel.org [mailto:linux-scsi-
> >> owner@vger.kernel.org] On Behalf Of Vivek Goyal
> >> Sent: Thursday, December 22, 2011 10:59 AM
> >> To: Iyer, Shyam
> >> Cc: rwheeler@redhat.com; linux-fsdevel@vger.kernel.org; linux-
> >> scsi@vger.kernel.org
> >> Subject: Re: [LSF/MM TOPIC] linux servers as a storage server - what's
> >> missing?
> >>
> >> On Thu, Dec 22, 2011 at 01:44:16PM +0530, Shyam_Iyer@Dell.com wrote:
> >>
> >> [..]
> >>
> >>> Simple asks -
> >>> 1) Provide a consistent storage and fs management library that
> >> discourages folks to write their own usespace storage library. Include
> >> things like fs formatting(fs profiles), transport configuration(eg:
> >> iscsiadm as a library), thin provisioning watermarks, cluster
> >> management, apis for cgroups etc.
> >>                                       ^^^^^^^^^^^^^^^^
> >> For cgroups, we have libcgroup library. Not many people like to use it
> >> though as cgroup is exported as a filesystem and they prefer to use
> >> normal
> >> libc api to traverse and configure cgroups (Instead of going through
> >> another library). Some examples include libvrit, systemd.
> >>
> >> Thanks
> >> Vivek
> > 
> > Well honestly I think that is a libvirt/systemd issue and libvirt also
> > invokes things like iscsiadm, dcb etc as a binary :-/
> > 
> > Some one could always use qemu command lines to invoke KVM/XEN but
> > libvirt has saved me one too many days in doing a quick operation
> > without wondering about a qemu commandline.
> >  
> > I am also asking for ideas on how to avoid this fragmentation because
> > just like libvirt others are also encouraged to do their own libc thing
> > in the absence of a common storage management framework..
> > 
> > Does the standard interface for linux end at the user/kernel boundary
> > or the user/libc boundary? If so I feel we would continue to lag behind
> > other OSes in features because of the model.
> > 
> StorageAPI _again_.
> 
> I was under the impression RH had someone working on it.

Yes, Red Hat does. Tony Asleson. libStorageMgmt:

http://sourceforge.net/apps/trac/libstoragemgmt

The current focus is on managing external storage (SMI-S, etc.). This
focus can be expanded over time. Contributions welcome.  

> (Actually I was trying to give it a go, but then got buried under
> customer escalations).
> 
> So yes, we know there is a shortcoming.
> And yes, we should improve things.
> 
> But I feel another discussion about this will only give us more
> insight, but not moving things forward.
> 
> What about having a separate session at the storage summit (or even
> at the collab summit) to hammer out the requirements here?

That would be fine, although as you say, we need more than talk.

Tom  



^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [LSF/MM TOPIC] linux servers as a storage server - what's missing?
  2012-01-09 12:59         ` Tom Coughlan
@ 2012-01-10  6:53           ` Ric Wheeler
  2012-01-20  8:55             ` Hannes Reinecke
  2012-01-19 16:17             ` Loke, Chetan
  1 sibling, 1 reply; 50+ messages in thread
From: Ric Wheeler @ 2012-01-10  6:53 UTC (permalink / raw)
  To: Tom Coughlan
  Cc: Hannes Reinecke, tasleson, Shyam_Iyer, vgoyal, linux-fsdevel, linux-scsi

On 01/09/2012 02:59 PM, Tom Coughlan wrote:
> On Mon, 2012-01-09 at 13:18 +0100, Hannes Reinecke wrote:
>> On 12/22/2011 09:54 PM, Shyam_Iyer@Dell.com wrote:
>>>
>>>> -----Original Message-----
>>>> From: linux-scsi-owner@vger.kernel.org [mailto:linux-scsi-
>>>> owner@vger.kernel.org] On Behalf Of Vivek Goyal
>>>> Sent: Thursday, December 22, 2011 10:59 AM
>>>> To: Iyer, Shyam
>>>> Cc: rwheeler@redhat.com; linux-fsdevel@vger.kernel.org; linux-
>>>> scsi@vger.kernel.org
>>>> Subject: Re: [LSF/MM TOPIC] linux servers as a storage server - what's
>>>> missing?
>>>>
>>>> On Thu, Dec 22, 2011 at 01:44:16PM +0530, Shyam_Iyer@Dell.com wrote:
>>>>
>>>> [..]
>>>>
>>>>> Simple asks -
>>>>> 1) Provide a consistent storage and fs management library that
>>>> discourages folks to write their own usespace storage library. Include
>>>> things like fs formatting(fs profiles), transport configuration(eg:
>>>> iscsiadm as a library), thin provisioning watermarks, cluster
>>>> management, apis for cgroups etc.
>>>>                                        ^^^^^^^^^^^^^^^^
>>>> For cgroups, we have libcgroup library. Not many people like to use it
>>>> though as cgroup is exported as a filesystem and they prefer to use
>>>> normal
>>>> libc api to traverse and configure cgroups (Instead of going through
>>>> another library). Some examples include libvrit, systemd.
>>>>
>>>> Thanks
>>>> Vivek
>>> Well honestly I think that is a libvirt/systemd issue and libvirt also
>>> invokes things like iscsiadm, dcb etc as a binary :-/
>>>
>>> Some one could always use qemu command lines to invoke KVM/XEN but
>>> libvirt has saved me one too many days in doing a quick operation
>>> without wondering about a qemu commandline.
>>>
>>> I am also asking for ideas on how to avoid this fragmentation because
>>> just like libvirt others are also encouraged to do their own libc thing
>>> in the absence of a common storage management framework..
>>>
>>> Does the standard interface for linux end at the user/kernel boundary
>>> or the user/libc boundary? If so I feel we would continue to lag behind
>>> other OSes in features because of the model.
>>>
>> StorageAPI _again_.
>>
>> I was under the impression RH had someone working on it.
> Yes, Red Hat does. Tony Asleson. libStorageMgmt:
>
> http://sourceforge.net/apps/trac/libstoragemgmt
>
> The current focus is on managing external storage (SMI-S, etc.). This
> focus can be expanded over time. Contributions welcome.
>
>> (Actually I was trying to give it a go, but then got buried under
>> customer escalations).
>>
>> So yes, we know there is a shortcoming.
>> And yes, we should improve things.
>>
>> But I feel another discussion about this will only give us more
>> insight, but not moving things forward.
>>
>> What about having a separate session at the storage summit (or even
>> at the collab summit) to hammer out the requirements here?
> That would be fine, although as you say, we need more than talk.
>
> Tom
>

Having a special session would really be a good idea - given the size of the 
discussion, we might want to do both a talk and a breakout at the collab summit...

ric


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [LSF/MM TOPIC][ATTEND] linux servers as a storage server - what's missing?
  2012-01-03 19:32   ` Chuck Lever
@ 2012-01-17 21:16     ` J. Bruce Fields
  2012-01-17 21:25       ` Chuck Lever
  0 siblings, 1 reply; 50+ messages in thread
From: J. Bruce Fields @ 2012-01-17 21:16 UTC (permalink / raw)
  To: Chuck Lever; +Cc: Jeff Layton, Ric Wheeler, linux-fsdevel, linux-scsi

On Tue, Jan 03, 2012 at 02:32:40PM -0500, Chuck Lever wrote:
> 
> On Jan 3, 2012, at 2:26 PM, Jeff Layton wrote:
> 
> > On Wed, 21 Dec 2011 10:59:43 -0500
> > Ric Wheeler <rwheeler@redhat.com> wrote:
> > 
> >> 
> >> One common thing that I see a lot of these days is an increasing number of 
> >> platforms that are built on our stack as storage servers. Ranging from the 
> >> common linux based storage/NAS devices up to various distributed systems.  
> >> Almost all of them use our common stack - software RAID, LVM, XFS/ext4 and samba.
> >> 
> >> At last year's SNIA developers conference, it was clear that Microsoft is 
> >> putting a lot of effort into enhancing windows 8 server as a storage server with 
> >> both support for a pNFS server and of course SMB. I think that linux (+samba) is 
> >> ahead of the windows based storage appliances today, but they are putting 
> >> together a very aggressive list of features.
> >> 
> >> I think that it would be useful and interesting to take a slot at this year's 
> >> LSF to see how we are doing in this space. How large do we need to scale for an 
> >> appliance?  What kind of work is needed (support for the copy offload system 
> >> call? better support for out of band notifications like those used in "thinly 
> >> provisioned" SCSI devices? management API's? Ease of use CLI work? SMB2.2 support?).
> >> 
> >> The goal would be to see what technical gaps we have that need more active 
> >> development in, not just a wish list :)
> >> 
> >> Ric
> > 
> > Unfortunately, w/o a wishlist of sorts, it's hard to know what needs
> > more active development ;).
> > 
> > While HCH will probably disagree, being able to support more
> > NFSv4/Windows API features at the VFS layer would make it a lot easier
> > to do a more unified serving appliance. Right now, both knfsd and samba
> > track too much info internally, and that makes it very difficult to
> > serve the same data via multiple protocols.
> > 
> > Off the top of my head, my "wishlist" for better NFSv4 serving would be:
> > 
> > - RichACLs
> > - Share/Deny mode support on open
> > - mandatory locking that doesn't rely on weirdo file modes
> 
> To add a few more NFSv4 related items:
> 
>  - Simplified ID mapping

What are you thinking of here?

--b.

>    and security configuration
>  - Support for NFSv4 migration and replication
>  - Better server observability (for operational and performance debugging in the field)
>  - FedFS and NFS basic junctions (already under way)
> 
> > It's always going to be hard for us to compete with dedicated
> > appliances. Where Linux can shine though is in allowing for more
> > innovative combinations.
> > 
> > Being able to do active/active NFS serving from clustered filesystems,
> > for instance is something that we can eventually attain but that would
> > be harder to do in an appliance. This sort of discussion might also
> > dovetail with Benny's proposal about pNFS serving.
> > 
> > -- 
> > Jeff Layton <jlayton@redhat.com>
> 
> -- 
> Chuck Lever
> chuck[dot]lever[at]oracle[dot]com
> 
> 
> 
> 

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [LSF/MM TOPIC][ATTEND] linux servers as a storage server - what's missing?
  2012-01-17 21:16     ` J. Bruce Fields
@ 2012-01-17 21:25       ` Chuck Lever
  0 siblings, 0 replies; 50+ messages in thread
From: Chuck Lever @ 2012-01-17 21:25 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: Jeff Layton, Ric Wheeler, linux-fsdevel, linux-scsi


On Jan 17, 2012, at 4:16 PM, J. Bruce Fields wrote:

> On Tue, Jan 03, 2012 at 02:32:40PM -0500, Chuck Lever wrote:
>> 
>> On Jan 3, 2012, at 2:26 PM, Jeff Layton wrote:
>> 
>>> On Wed, 21 Dec 2011 10:59:43 -0500
>>> Ric Wheeler <rwheeler@redhat.com> wrote:
>>> 
>>>> 
>>>> One common thing that I see a lot of these days is an increasing number of 
>>>> platforms that are built on our stack as storage servers. Ranging from the 
>>>> common linux based storage/NAS devices up to various distributed systems.  
>>>> Almost all of them use our common stack - software RAID, LVM, XFS/ext4 and samba.
>>>> 
>>>> At last year's SNIA developers conference, it was clear that Microsoft is 
>>>> putting a lot of effort into enhancing windows 8 server as a storage server with 
>>>> both support for a pNFS server and of course SMB. I think that linux (+samba) is 
>>>> ahead of the windows based storage appliances today, but they are putting 
>>>> together a very aggressive list of features.
>>>> 
>>>> I think that it would be useful and interesting to take a slot at this year's 
>>>> LSF to see how we are doing in this space. How large do we need to scale for an 
>>>> appliance?  What kind of work is needed (support for the copy offload system 
>>>> call? better support for out of band notifications like those used in "thinly 
>>>> provisioned" SCSI devices? management API's? Ease of use CLI work? SMB2.2 support?).
>>>> 
>>>> The goal would be to see what technical gaps we have that need more active 
>>>> development in, not just a wish list :)
>>>> 
>>>> Ric
>>> 
>>> Unfortunately, w/o a wishlist of sorts, it's hard to know what needs
>>> more active development ;).
>>> 
>>> While HCH will probably disagree, being able to support more
>>> NFSv4/Windows API features at the VFS layer would make it a lot easier
>>> to do a more unified serving appliance. Right now, both knfsd and samba
>>> track too much info internally, and that makes it very difficult to
>>> serve the same data via multiple protocols.
>>> 
>>> Off the top of my head, my "wishlist" for better NFSv4 serving would be:
>>> 
>>> - RichACLs
>>> - Share/Deny mode support on open
>>> - mandatory locking that doesn't rely on weirdo file modes
>> 
>> To add a few more NFSv4 related items:
>> 
>> - Simplified ID mapping
> 
> What are you thinking of here?
> 
> --b.
> 
>>   and security configuration

Trond has already made the NFSv3 to NFSv4 transition easier by having the client send numeric UIDs and GIDs in idmap strings when servers can deal with that.

It would be even better if we had some kind of GUI like the "Users and Groups" tool that could combine the configuration of ID mapping and security configuration, and maybe provide some nice preset configurations (all local IDs, Kerberos only, LDAP, and so on).

This also needs to integrate well with network services like FreeIPA.  And it would probably need to work on both NFS clients and servers. But what if we had some way of automatically configuring clients, on first contact with a server or realm, with a Kerberos keytab and the correct ID mapping and security setup?

>> - Support for NFSv4 migration and replication
>> - Better server observability (for operational and performance debugging in the field)
>> - FedFS and NFS basic junctions (already under way)
>> 
>>> It's always going to be hard for us to compete with dedicated
>>> appliances. Where Linux can shine though is in allowing for more
>>> innovative combinations.
>>> 
>>> Being able to do active/active NFS serving from clustered filesystems,
>>> for instance is something that we can eventually attain but that would
>>> be harder to do in an appliance. This sort of discussion might also
>>> dovetail with Benny's proposal about pNFS serving.
>>> 
>>> -- 
>>> Jeff Layton <jlayton@redhat.com>
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> 
>> -- 
>> Chuck Lever
>> chuck[dot]lever[at]oracle[dot]com
>> 
>> 
>> 
>> 
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html

-- 
Chuck Lever
chuck[dot]lever[at]oracle[dot]com





^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [LSF/MM TOPIC] linux servers as a storage server - what's missing?
  2011-12-21 15:59 [LSF/MM TOPIC] linux servers as a storage server - what's missing? Ric Wheeler
  2011-12-22  8:14   ` Shyam_Iyer
  2012-01-03 19:26 ` [LSF/MM TOPIC][ATTEND] linux servers as a storage server - what's missing? Jeff Layton
@ 2012-01-18 17:00 ` Roland Dreier
  2012-01-18 17:51   ` Ric Wheeler
  2 siblings, 1 reply; 50+ messages in thread
From: Roland Dreier @ 2012-01-18 17:00 UTC (permalink / raw)
  To: Ric Wheeler; +Cc: linux-fsdevel, linux-scsi

On Wed, Dec 21, 2011 at 7:59 AM, Ric Wheeler <rwheeler@redhat.com> wrote:
> One common thing that I see a lot of these days is an increasing number of
> platforms that are built on our stack as storage servers. Ranging from the
> common linux based storage/NAS devices up to various distributed systems.
>  Almost all of them use our common stack - software RAID, LVM, XFS/ext4 and
> samba.
>
> At last year's SNIA developers conference, it was clear that Microsoft is
> putting a lot of effort into enhancing windows 8 server as a storage server
> with both support for a pNFS server and of course SMB. I think that linux
> (+samba) is ahead of the windows based storage appliances today, but they
> are putting together a very aggressive list of features.
>
> I think that it would be useful and interesting to take a slot at this
> year's LSF to see how we are doing in this space. How large do we need to
> scale for an appliance?  What kind of work is needed (support for the copy
> offload system call? better support for out of band notifications like those
> used in "thinly provisioned" SCSI devices? management API's? Ease of use CLI
> work? SMB2.2 support?).
>
> The goal would be to see what technical gaps we have that need more active
> development in, not just a wish list :)

I see a technical gap in the robustness of our basic SCSI/block stack.  In a
pretty standard low- to midrange setup, i.e. a standard server with a couple of
SAS HBAs connected to an external SAS JBOD, it's quite easy to run into problems
like oopses or other issues that kill the whole system, even from faults that
should affect only part of it.  For example, losing one path to the JBOD,
losing one drive, or having a SCSI reservation conflict can lead to the whole
system crashing.

Which is not good for an HA storage server built on redundant hardware.

 - R.
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [LSF/MM TOPIC] linux servers as a storage server - what's missing?
  2012-01-18 17:00 ` [LSF/MM TOPIC] " Roland Dreier
@ 2012-01-18 17:51   ` Ric Wheeler
  2012-01-18 18:46     ` Roland Dreier
  0 siblings, 1 reply; 50+ messages in thread
From: Ric Wheeler @ 2012-01-18 17:51 UTC (permalink / raw)
  To: Roland Dreier; +Cc: linux-fsdevel, linux-scsi

On 01/18/2012 12:00 PM, Roland Dreier wrote:
> On Wed, Dec 21, 2011 at 7:59 AM, Ric Wheeler<rwheeler@redhat.com>  wrote:
>> One common thing that I see a lot of these days is an increasing number of
>> platforms that are built on our stack as storage servers. Ranging from the
>> common linux based storage/NAS devices up to various distributed systems.
>>   Almost all of them use our common stack - software RAID, LVM, XFS/ext4 and
>> samba.
>>
>> At last year's SNIA developers conference, it was clear that Microsoft is
>> putting a lot of effort into enhancing windows 8 server as a storage server
>> with both support for a pNFS server and of course SMB. I think that linux
>> (+samba) is ahead of the windows based storage appliances today, but they
>> are putting together a very aggressive list of features.
>>
>> I think that it would be useful and interesting to take a slot at this
>> year's LSF to see how we are doing in this space. How large do we need to
>> scale for an appliance?  What kind of work is needed (support for the copy
>> offload system call? better support for out of band notifications like those
>> used in "thinly provisioned" SCSI devices? management API's? Ease of use CLI
>> work? SMB2.2 support?).
>>
>> The goal would be to see what technical gaps we have that need more active
>> development in, not just a wish list :)
> I see a technical gap in the robustness of our basic SCSI/block stack.  In a
> pretty standard low to midrange setup, ie standard server with a couple of SAS
> HBAs connected to an external SAS JBOD, it's quite easy to run into problems
> like oopses or other issues that kill the whole system, even from faults that
> should affect only part of the system.  For example losing one path to the JBOD,
> or losing one drive, or having a SCSI reservation conflict can lead to the whole
> system crashing.
>
> Which is not good for an HA storage server built on redundant hardware.
>
>   - R.

Why would you crash if you have device mapper multipath configured to handle 
path failover? We have tons of enterprise customers that use that...

On the broader topic of error handling and so on, I do agree that this is always an 
area of concern (how many times to retry, how long timeouts need to be, when to 
panic/reboot or propagate an error code up).

ric


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [LSF/MM TOPIC] linux servers as a storage server - what's missing?
  2012-01-18 17:51   ` Ric Wheeler
@ 2012-01-18 18:46     ` Roland Dreier
  2012-01-18 18:51       ` Bart Van Assche
  2012-01-19 17:50         ` Loke, Chetan
  0 siblings, 2 replies; 50+ messages in thread
From: Roland Dreier @ 2012-01-18 18:46 UTC (permalink / raw)
  To: Ric Wheeler; +Cc: linux-fsdevel, linux-scsi

> Why would you crash if you have device mapper multipath configured to handle
> path failover? We have tons of enterprise customers that use that...

cf http://www.spinics.net/lists/linux-scsi/msg56254.html

Basically hot unplug of an sdX can oops on any recent kernel, no
matter what dm stuff you have on top.

> On the broader topic of error handling and so on, I do agree that is always
> an area of concern (how many times to retry, how long time outs need to be,
> when to panic/reboot or propagate up an error code)

Yes, especially the scsi eh stuff escalating to a host reset when
a single drive has gone bad -- even if the HBA is happily doing IO
to other drives, we'll kill access to the whole SAS fabric.

- R.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [LSF/MM TOPIC] linux servers as a storage server - what's missing?
  2012-01-18 18:46     ` Roland Dreier
@ 2012-01-18 18:51       ` Bart Van Assche
  2012-01-18 19:00         ` Roland Dreier
  2012-01-19  8:16         ` Rolf Eike Beer
  2012-01-19 17:50         ` Loke, Chetan
  1 sibling, 2 replies; 50+ messages in thread
From: Bart Van Assche @ 2012-01-18 18:51 UTC (permalink / raw)
  To: Roland Dreier; +Cc: Ric Wheeler, linux-fsdevel, linux-scsi

On Wed, Jan 18, 2012 at 6:46 PM, Roland Dreier <roland@purestorage.com> wrote:
> > Why would you crash if you have device mapper multipath configured to handle
> > path failover? We have tons of enterprise customers that use that...
>
> cf http://www.spinics.net/lists/linux-scsi/msg56254.html
>
> Basically hot unplug of an sdX can oops on any recent kernel, no
> matter what dm stuff you have on top.
>
> > On the broader topic of error handling and so on, I do agree that is always
> > an area of concern (how many times to retry, how long time outs need to be,
> > when to panic/reboot or propagate up an error code)
>
> Yes, especially the scsi eh stuff escalating to a host reset when
> a single drive has gone bad -- even if the HBA is happily doing IO
> to other drives, we'll kill access to the whole SAS fabric.

With which SCSI low-level driver does that occur, and what does the call
stack look like? I haven't encountered any such issues while testing
the srp-ha patch set. However, I have to admit that the issues
mentioned in the description of commit 3308511 were discovered while
testing the srp-ha patch set.

Bart.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [LSF/MM TOPIC] linux servers as a storage server - what's missing?
  2012-01-18 18:51       ` Bart Van Assche
@ 2012-01-18 19:00         ` Roland Dreier
  2012-01-19  8:16         ` Rolf Eike Beer
  1 sibling, 0 replies; 50+ messages in thread
From: Roland Dreier @ 2012-01-18 19:00 UTC (permalink / raw)
  To: Bart Van Assche; +Cc: Ric Wheeler, linux-fsdevel, linux-scsi

On Wed, Jan 18, 2012 at 10:51 AM, Bart Van Assche <bvanassche@acm.org> wrote:
>> cf http://www.spinics.net/lists/linux-scsi/msg56254.html

> With which SCSI low-level driver does that occur, and what does the call
> stack look like? I haven't encountered any such issues while testing
> the srp-ha patch set. However, I have to admit that the issues
> mentioned in the description of commit 3308511 were discovered while
> testing the srp-ha patch set.

I assume you're asking about the crash on hot unplug...

Anyway, I've seen it with mpt2sas, looking like

 [<ffffffff8130078b>] scsi_setup_blk_pc_cmnd+0x2b/0x170
 [<ffffffff8130091d>] scsi_prep_fn+0x4d/0x60
 [<ffffffff81227794>] blk_peek_request+0xb4/0x240
 [<ffffffff813010cd>] scsi_request_fn+0x47d/0x4b0
 [<ffffffff81223bff>] __blk_run_queue+0x1f/0x80
 [<ffffffff8122b808>] blk_execute_rq_nowait+0x68/0xc0

In my case I think there is some SG_IO stuff in flight
for SCSI reservations; the cdrom case I linked to also
has scsi_setup_blk_pc_cmnd in the trace and mentions
an ioctl for medium presence.  So maybe regular block
IO is OK now, but not non-IO commands.
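
For reference, a minimal sketch of the kind of non-IO (block PC) command that
can be in flight here - a TEST UNIT READY sent through the SG_IO ioctl, which
ends up in scsi_setup_blk_pc_cmnd rather than the regular read/write path. The
device name is only an example, and this is an illustration rather than the
exact commands that were actually in flight:

/* Sketch: issue a TEST UNIT READY via SG_IO on an sd device.  If the
 * device is hot-unplugged while such a request is queued, the request
 * still has to pass through the block PC prep path seen in the trace. */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <scsi/sg.h>

int main(void)
{
	unsigned char cdb[6] = { 0x00, 0, 0, 0, 0, 0 };	/* TEST UNIT READY */
	unsigned char sense[32];
	struct sg_io_hdr hdr;
	int fd;

	fd = open("/dev/sdb", O_RDWR | O_NONBLOCK);	/* device name is only an example */
	if (fd < 0) {
		perror("open");
		return 1;
	}

	memset(&hdr, 0, sizeof(hdr));
	hdr.interface_id = 'S';
	hdr.cmdp = cdb;
	hdr.cmd_len = sizeof(cdb);
	hdr.sbp = sense;
	hdr.mx_sb_len = sizeof(sense);
	hdr.dxfer_direction = SG_DXFER_NONE;	/* no data transfer: a pure "non-IO" command */
	hdr.timeout = 10000;			/* milliseconds */

	if (ioctl(fd, SG_IO, &hdr) < 0)
		perror("SG_IO");
	else
		printf("status 0x%x host 0x%x driver 0x%x\n",
		       hdr.status, hdr.host_status, hdr.driver_status);

	close(fd);
	return 0;
}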

 - R.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [LSF/MM TOPIC] linux servers as a storage server - what's missing?
  2012-01-18 18:51       ` Bart Van Assche
  2012-01-18 19:00         ` Roland Dreier
@ 2012-01-19  8:16         ` Rolf Eike Beer
  1 sibling, 0 replies; 50+ messages in thread
From: Rolf Eike Beer @ 2012-01-19  8:16 UTC (permalink / raw)
  To: Bart Van Assche; +Cc: Roland Dreier, Ric Wheeler, linux-fsdevel, linux-scsi

> On Wed, Jan 18, 2012 at 6:46 PM, Roland Dreier <roland@purestorage.com>
> wrote:
>> > Why would you crash if you have device mapper multipath configured to
>> > handle path failover? We have tons of enterprise customers that use that...
>>
>> cf http://www.spinics.net/lists/linux-scsi/msg56254.html
>>
>> Basically hot unplug of an sdX can oops on any recent kernel, no
>> matter what dm stuff you have on top.
>>
>> > On the broader topic of error handling and so on, I do agree that is
>> always
>> > an area of concern (how many times to retry, how long time outs need
>> to be,
>> > when to panic/reboot or propagate up an error code)
>>
>> Yes, especially the scsi eh stuff escalating to a host reset when
>> a single drive has gone bad -- even if the HBA is happily doing IO
>> to other drives, we'll kill access to the whole SAS fabric.
>
> With which SCSI low-level diver does that occur and how does the call
> stack look like ? I haven't encountered any such issues while testing
> the srp-ha patch set. However, I have to admit that the issues
> mentioned in the description of commit 3308511 were discovered while
> testing the srp-ha patch set.

Likely unrelated to the stuff above, but this has happened for me. I was
changing the USB devices while sending the machine to s2disk and this was
what it came up with on resume:

[91794.875373] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
[91794.875385] IP: [<ffffffff813c46c1>] sd_revalidate_disk+0x31/0x320
[91794.875396] PGD 3fe33f067 PUD 3fff84067 PMD 0
[91794.875403] Oops: 0000 [#1] PREEMPT SMP
[91794.875410] CPU 7
[91794.875412] Modules linked in: autofs4 fuse ip6t_LOG xt_tcpudp xt_pkttype ipt_LOG xt_limit af_packet edd ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_raw xt_NOTRACK ipt_REJECT iptable_raw iptable_filter ip6table_mangle nf_conntrack_netbios_ns nf_conntrack_broadcast nf_conntrack_ipv4 nf_defrag_ipv4 ip_tables xt_conntrack nf_conntrack ip6table_filter ip6_tables x_tables snd_pcm_oss snd_mixer_oss snd_seq snd_seq_device cpufreq_conservative cpufreq_userspace cpufreq_powersave acpi_cpufreq mperf snd_hda_codec_hdmi snd_hda_codec_realtek pl2303 usbserial kvm_intel kvm snd_hda_intel e1000e snd_hda_codec iTCO_wdt shpchp mei(C) xhci_hcd i2c_i801 pci_hotplug iTCO_vendor_support snd_hwdep snd_pcm snd_timer snd soundcore snd_page_alloc sr_mod cdrom sg serio_raw pcspkr linear raid456 async_raid6_recov async_pq raid6_pq async_xor xor async_memcpy async_tx raid10 raid1 raid0 i915 drm_kms_helper drm i2c_algo_bit button video dm_snapshot dm_mod fan processor thermal thermal_sys pata_amd ata_generic sata_nv [last unloaded: preloadtrace]
[91794.875522]
[91794.875525] Pid: 5242, comm: udisks-daemon Tainted: G         C  3.1.0-46-desktop #1  /DH67CL
[91794.875534] RIP: 0010:[<ffffffff813c46c1>]  [<ffffffff813c46c1>] sd_revalidate_disk+0x31/0x320
[91794.875543] RSP: 0018:ffff88040399dbb8  EFLAGS: 00010293
[91794.875547] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000001
[91794.875552] RDX: ffff8803fa9ba740 RSI: ffff8803fa9ba760 RDI: ffff8800d3975c00
[91794.875557] RBP: ffff8800d3975c00 R08: ffff88040399db84 R09: ffff8803fb546400
[91794.875561] R10: 0000000000000001 R11: 0000000000000001 R12: 00000000ffffff85
[91794.875565] R13: ffff88041efcb818 R14: ffff8800d3975c00 R15: ffff88040399dc08
[91794.875718] FS:  00007fb7921067a0(0000) GS:ffff88041fbc0000(0000) knlGS:0000000000000000
[91794.875863] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[91794.876016] CR2: 0000000000000008 CR3: 00000003fe33e000 CR4: 00000000000406e0
[91794.876172] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[91794.876321] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[91794.876473] Process udisks-daemon (pid: 5242, threadinfo ffff88040399c000, task ffff8804035fa500)
[91794.876596] done.
[91794.876772] Stack:
[91794.876774]  ffff88040399dc08 ffff88041efcb800 0000000000000000 00000000ffffff85
[91794.876777]  ffff88041efcb818 ffffffff811c7a98 ffff88041efcb800 000000001efcb800
[91794.876779]  ffff8800d3975c78 ffff8800d3975c0c ffff8800d3975c00 0000000000000000
[91794.876782] Call Trace:
[91794.876791]  [<ffffffff811c7a98>] rescan_partitions+0xa8/0x320
[91794.876797]  [<ffffffff811928ee>] __blkdev_get+0x2be/0x420
[91794.876802]  [<ffffffff81192ab2>] blkdev_get+0x62/0x2d0
[91794.876807]  [<ffffffff81159ffa>] __dentry_open+0x23a/0x3f0
[91794.876812]  [<ffffffff8116b668>] do_last+0x3f8/0x7b0
[91794.876816]  [<ffffffff8116bb4b>] path_openat+0xdb/0x400
[91794.876819]  [<ffffffff8116bedd>] do_filp_open+0x4d/0xc0
[91794.876823]  [<ffffffff8115b511>] do_sys_open+0x101/0x1e0
[91794.876827]  [<ffffffff815ae692>] system_call_fastpath+0x16/0x1b
[91794.876840]  [<00007fb79189fb20>] 0x7fb79189fb1f
[91794.876841] Code: 86 b0 9e 00 48 89 6c 24 10 48 89 5c 24 08 48 89 fd 4c 89 64 24 18 4c 89 6c 24 20 c1 e8 15 48 8b 9f 28 03 00 00 83 e0 07 83 f8 03 <4c> 8b 63 08 0f 87 8e 02 00 00 41 8b 84 24 50 06 00 00 31 d2 83
[91794.876857] RIP  [<ffffffff813c46c1>] sd_revalidate_disk+0x31/0x320
[91794.876860]  RSP <ffff88040399dbb8>
[91794.876861] CR2: 0000000000000008

Kernel is from openSuSE 12.1:

Linux devpool02 3.1.0-46-desktop #1 SMP PREEMPT Mon Oct 24 20:49:37 UTC 2011 (1cba112) x86_64 x86_64 x86_64 GNU/Linux

Greetings,

Eike

^ permalink raw reply	[flat|nested] 50+ messages in thread

* RE: [LSF/MM TOPIC] linux servers as a storage server - what'smissing?
  2012-01-09 12:59         ` Tom Coughlan
@ 2012-01-19 16:17             ` Loke, Chetan
  2012-01-19 16:17             ` Loke, Chetan
  1 sibling, 0 replies; 50+ messages in thread
From: Loke, Chetan @ 2012-01-19 16:17 UTC (permalink / raw)
  To: Tom Coughlan, Hannes Reinecke, tasleson
  Cc: Shyam_Iyer, vgoyal, rwheeler, linux-fsdevel, linux-scsi

> From: linux-scsi-owner@vger.kernel.org
[mailto:linux-scsi-owner@vger.kernel.org] On Behalf Of Tom Coughlan
> 


> Yes, Red Hat does. Tony Asleson. libStorageMgmt:
> 
> http://sourceforge.net/apps/trac/libstoragemgmt
> 
> The current focus is on managing external storage (SMI-S, etc.). This
> focus can be expanded over time. Contributions welcome.
> 

Device management (scalability/feature management):

1) Scalability:
I may be wrong, but storage boxes like Symmetrix (and others) support a
large number of LUNs, so device management scalability is a place where
linux can shine. By scalability I mean efficiently managing, let's say,
32K(?) LUNs.

2) Feature management:
As Shyam mentioned in earlier emails - provide an ecosystem similar to
smartphones. Let me provide an example (and then you guys can correct me
by saying this already exists, or bits and pieces exist):

a) As a kernel developer - export all the features we can (to the upper
layers) by querying the target.

b) As an app developer - a call like GetLUNFeatures() could return {Thin
Provisioning, RAID level, ...}.

c) As a sys-admin - if I right-click on a volume/LUN in my management
GUI, I should be able to tell whether my volume supports thin
provisioning, is backed by DRBD [proxy?], etc.

So once we provide b) from above, tens (if not hundreds) of apps will be
developed in a short period of time. I feel libstoragemgmt is an
excellent place to get all of this organized.
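
To make b) concrete, here is a rough sketch of what such a call could look
like from the app side. This is purely hypothetical - not the existing
libstoragemgmt API, and all names and values are made up - just to show the
shape of it:

/* Hypothetical sketch only - NOT the current libstoragemgmt interface. */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define LUN_FEAT_THIN_PROVISIONED (1u << 0)	/* backed by a thin pool        */
#define LUN_FEAT_REPLICATED       (1u << 1)	/* e.g. DRBD / array mirroring  */
#define LUN_FEAT_SNAPSHOT_CAPABLE (1u << 2)

struct lun_features {
	uint32_t flags;			/* LUN_FEAT_* bits     */
	uint32_t raid_level;		/* 0, 1, 5, 6, 10, ... */
	uint64_t provisioned_bytes;
};

/* A real implementation would fill this in by querying the target
 * (INQUIRY/VPD pages, SMI-S, vendor APIs, ...); stubbed out here. */
static int GetLUNFeatures(const char *lun_id, struct lun_features *out)
{
	(void)lun_id;
	memset(out, 0, sizeof(*out));
	out->flags = LUN_FEAT_THIN_PROVISIONED;
	out->raid_level = 5;
	out->provisioned_bytes = 2ULL << 40;	/* 2 TiB, made up */
	return 0;
}

int main(void)
{
	struct lun_features f;

	/* the LUN id string below is made up */
	if (GetLUNFeatures("600601604f021c00deadbeef", &f) == 0)
		printf("thin=%d raid=%u bytes=%llu\n",
		       (f.flags & LUN_FEAT_THIN_PROVISIONED) != 0,
		       f.raid_level,
		       (unsigned long long)f.provisioned_bytes);
	return 0;
}

The point is that the app never needs to know whether the answer came from a
VPD page, SMI-S, or a vendor CLI.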


Chetan Loke

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [LSF/MM TOPIC] linux servers as a storage server - what'smissing?
  2012-01-19 16:17             ` Loke, Chetan
  (?)
@ 2012-01-19 16:19             ` Ric Wheeler
  2012-01-19 16:26                 ` Loke, Chetan
  -1 siblings, 1 reply; 50+ messages in thread
From: Ric Wheeler @ 2012-01-19 16:19 UTC (permalink / raw)
  To: Loke, Chetan
  Cc: Tom Coughlan, Hannes Reinecke, tasleson, Shyam_Iyer, vgoyal,
	linux-fsdevel, linux-scsi

On 01/19/2012 11:17 AM, Loke, Chetan wrote:
>> From: linux-scsi-owner@vger.kernel.org
> [mailto:linux-scsi-owner@vger.kernel.org] On Behalf Of Tom Coughlan
>
>> Yes, Red Hat does. Tony Asleson. libStorageMgmt:
>>
>> http://sourceforge.net/apps/trac/libstoragemgmt
>>
>> The current focus is on managing external storage (SMI-S, etc.). This
>> focus can be expanded over time. Contributions welcome.
>>
> Device management(scalability/feature mgmt) -
>
> 1) scalability:
> I may be wrong but storage boxes like symmetrix(and others) support
> large number of LUNs.
> So device management 'scalability' will make linux shine. By scalability
> I mean efficiently managing let's say 32K(?) LUNs.

You can actually have even more LUNs than that for each array in a SAN. From a 
host, you will probably not see all of the LUNs all of the time, but you 
certainly need good support for a large LUN address space that will be 
sparsely populated.


>
> 2) feature management:
> As Shyam mentioned in earlier emails - provide an ecosystem similar to
> smart-phones. Let me provide an example(and then
> you guys can correct me by saying this already exists or bits and pieces
> exist):
>
> a) As a kernel developer - export all the features(to the upper layers)
> we can by querying the target.
>
> b) As an app developer - GetLUNFeatures(), could return - {Thin
> Provision, RAID-level, ... ).
>
> c) As a sys-admin - If I right click(for a volume/LUN) on my management
> GUI, I should be able to tell if my volume supports thin-prov, backed by
> DRBD[proxy?] etc.
>
> So once we provide b) from above, tens(if not hundreds) of apps will be
> developed in a short period of time.
> I feel libstoragemgmt is an excellent place to get all of this
> organized.
>
>
> Chetan Loke


^ permalink raw reply	[flat|nested] 50+ messages in thread

* RE: [LSF/MM TOPIC] linux servers as a storage server - what'smissing?
  2012-01-19 16:19             ` Ric Wheeler
@ 2012-01-19 16:26                 ` Loke, Chetan
  0 siblings, 0 replies; 50+ messages in thread
From: Loke, Chetan @ 2012-01-19 16:26 UTC (permalink / raw)
  To: Ric Wheeler
  Cc: Tom Coughlan, Hannes Reinecke, tasleson, Shyam_Iyer, vgoyal,
	linux-fsdevel, linux-scsi

> -----Original Message-----
> From: Ric Wheeler [mailto:rwheeler@redhat.com]
> Sent: January 19, 2012 11:20 AM
> To: Loke, Chetan
> Cc: Tom Coughlan; Hannes Reinecke; tasleson@redhat.com;
> Shyam_Iyer@Dell.com; vgoyal@redhat.com; linux-fsdevel@vger.kernel.org;
> linux-scsi@vger.kernel.org
> Subject: Re: [LSF/MM TOPIC] linux servers as a storage server -
> what'smissing?
> 
> On 01/19/2012 11:17 AM, Loke, Chetan wrote:
> >> From: linux-scsi-owner@vger.kernel.org
> > [mailto:linux-scsi-owner@vger.kernel.org] On Behalf Of Tom Coughlan
> >
> >> Yes, Red Hat does. Tony Asleson. libStorageMgmt:
> >>
> >> http://sourceforge.net/apps/trac/libstoragemgmt
> >>
> >> The current focus is on managing external storage (SMI-S, etc.).
> This
> >> focus can be expanded over time. Contributions welcome.
> >>
> > Device management(scalability/feature mgmt) -
> >
> > 1) scalability:
> > I may be wrong but storage boxes like symmetrix(and others) support
> > large number of LUNs.
> > So device management 'scalability' will make linux shine. By
> scalability
> > I mean efficiently managing let's say 32K(?) LUNs.
> 
> You can actually have (even more) luns than that for each array in a
> SAN. From a host, you will probably not see all of the luns all of the
time, but
> you certainly need to have good support for a large lun address space
that
> will be sparsely populated.
> 
> 

True, a single front-end won't see all of those LUNs/devices, so that is not
a big concern for the front-end hosts.

I am thinking of a use-case where folks can use a linux box to manage
their different storage arrays. So this linux box with 'libstoragemgmt +
app' needs to manage (scan/create/delete/and so on) all those LUNs.


Chetan Loke

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [LSF/MM TOPIC] linux servers as a storage server - what'smissing?
  2012-01-19 16:26                 ` Loke, Chetan
  (?)
@ 2012-01-19 16:29                 ` Ric Wheeler
  2012-01-19 17:32                     ` Loke, Chetan
  -1 siblings, 1 reply; 50+ messages in thread
From: Ric Wheeler @ 2012-01-19 16:29 UTC (permalink / raw)
  To: Loke, Chetan
  Cc: Tom Coughlan, Hannes Reinecke, tasleson, Shyam_Iyer, vgoyal,
	linux-fsdevel, linux-scsi

On 01/19/2012 11:26 AM, Loke, Chetan wrote:
>> -----Original Message-----
>> From: Ric Wheeler [mailto:rwheeler@redhat.com]
>> Sent: January 19, 2012 11:20 AM
>> To: Loke, Chetan
>> Cc: Tom Coughlan; Hannes Reinecke; tasleson@redhat.com;
>> Shyam_Iyer@Dell.com; vgoyal@redhat.com; linux-fsdevel@vger.kernel.org;
>> linux-scsi@vger.kernel.org
>> Subject: Re: [LSF/MM TOPIC] linux servers as a storage server -
>> what'smissing?
>>
>> On 01/19/2012 11:17 AM, Loke, Chetan wrote:
>>>> From: linux-scsi-owner@vger.kernel.org
>>> [mailto:linux-scsi-owner@vger.kernel.org] On Behalf Of Tom Coughlan
>>>
>>>> Yes, Red Hat does. Tony Asleson. libStorageMgmt:
>>>>
>>>> http://sourceforge.net/apps/trac/libstoragemgmt
>>>>
>>>> The current focus is on managing external storage (SMI-S, etc.).
>> This
>>>> focus can be expanded over time. Contributions welcome.
>>>>
>>> Device management(scalability/feature mgmt) -
>>>
>>> 1) scalability:
>>> I may be wrong but storage boxes like symmetrix(and others) support
>>> large number of LUNs.
>>> So device management 'scalability' will make linux shine. By
>> scalability
>>> I mean efficiently managing let's say 32K(?) LUNs.
>> You can actually have (even more) luns than that for each array in a
>> SAN. From a host, you will probably not see all of the luns all of the
> time, but
>> you certainly need to have good support for a large lun address space
> that
>> will be sparsely populated.
>>
>>
> True, a single front-end won't see all of those LUNs/devices. So not a
> big concern
> about the front-end hosts.
>
> I am thinking of a use-case where folks can use a linux-box to manage
> their different storage arrays.
> So this linux box with 'libstoragemgmt + app' needs to
> manage(scan/create/delete/so on) all those LUNs.
>
>
> Chetan Loke

People do have boxes with thousands of LUNs and file systems in active use, 
though, both for SAN and NAS volumes.

One of the challenges is what to do when just one LUN (or NFS server) crashes 
and burns. You simply cannot "reboot" the server to clean up after one bad mount 
when you have thousands of other happy users running on hundreds or thousands of 
other mount points :)

Ric


^ permalink raw reply	[flat|nested] 50+ messages in thread

* RE: [LSF/MM TOPIC] linux servers as a storage server - what'smissing?
  2012-01-19 16:29                 ` Ric Wheeler
@ 2012-01-19 17:32                     ` Loke, Chetan
  0 siblings, 0 replies; 50+ messages in thread
From: Loke, Chetan @ 2012-01-19 17:32 UTC (permalink / raw)
  To: Ric Wheeler
  Cc: Tom Coughlan, Hannes Reinecke, tasleson, Shyam_Iyer, vgoyal,
	linux-fsdevel, linux-scsi

> > True, a single front-end won't see all of those LUNs/devices. So not
> a
> > big concern
> > about the front-end hosts.
> >
> > I am thinking of a use-case where folks can use a linux-box to
manage
> > their different storage arrays.
> > So this linux box with 'libstoragemgmt + app' needs to
> > manage(scan/create/delete/so on) all those LUNs.
> >
> 
> People do have boxes with thousands of luns though & file systems in
> active use.
> Both for SAN and NAS volumes.
> 
> One of the challenges is what to do when just one LUN (or NFS server)
> crashes
> and burns. 

The FS needs to go read-only (plain & simple) because you don't know
what's going on. You can't risk writing data anymore. Let the apps fail.
You can make it happen even today; it's a simple exercise.

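For ext3/ext4 at least, the "flip to read-only on the first error" knob
already exists at mount time (it covers detected metadata errors rather than
the whole device-gone case). A minimal sketch of selecting it via mount(2);
the device and mount point are just examples:

/* Sketch: mount an ext4 volume so it goes read-only on the first detected
 * error (ext4's errors= option).  Device and mount point are examples. */
#include <stdio.h>
#include <sys/mount.h>

int main(void)
{
	if (mount("/dev/sdb1", "/mnt/data", "ext4", 0,
		  "errors=remount-ro") != 0) {
		perror("mount");
		return 1;
	}
	return 0;
}
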
Like others, I have seen/debugged enough weirdness when it comes to
resets/aborts (FYI - 200+ hosts in a cluster). For NDA reasons I can't
disclose a whole lot, but folks have fixed/enhanced the SCSI stack to
make resets/aborts fully robust. And you need folks who can debug
'apps/FS/block/initiator/wire-protocol/target-side' in one shot. Simple.
So when you say 'crash & burn', then some or all of the above (minus
the protocol handling) might need fixing.

> You simply cannot "reboot" the server to clean up after one
> bad mount when you have thousands of other happy users runs on
thousands/hundreds
> of other mount points :)

Again, can't the front-end go read-only and limit the outage w/o
disturbing thousands of users?


Chetan Loke

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [LSF/MM TOPIC] linux servers as a storage server - what'smissing?
  2012-01-19 17:32                     ` Loke, Chetan
  (?)
@ 2012-01-19 17:44                     ` Ric Wheeler
  2012-01-19 21:30                         ` Loke, Chetan
  -1 siblings, 1 reply; 50+ messages in thread
From: Ric Wheeler @ 2012-01-19 17:44 UTC (permalink / raw)
  To: Loke, Chetan
  Cc: Tom Coughlan, Hannes Reinecke, tasleson, Shyam_Iyer, vgoyal,
	linux-fsdevel, linux-scsi

On 01/19/2012 12:32 PM, Loke, Chetan wrote:
>>> True, a single front-end won't see all of those LUNs/devices. So not
>> a
>>> big concern
>>> about the front-end hosts.
>>>
>>> I am thinking of a use-case where folks can use a linux-box to
> manage
>>> their different storage arrays.
>>> So this linux box with 'libstoragemgmt + app' needs to
>>> manage(scan/create/delete/so on) all those LUNs.
>>>
>> People do have boxes with thousands of luns though&  file systems in
>> active use.
>> Both for SAN and NAS volumes.
>>
>> One of the challenges is what to do when just one LUN (or NFS server)
>> crashes
>> and burns.
> The FS needs to go read-only(plain&  simple) because you don't know
> what's going on.
> You can't risk writing data anymore. Let the apps fail. You can make it
> happen even today.
> It's a simple exercise.

Nope - it needs to be torn down and we need to be able to cleanly unmount it.

Letting an application see and read-only file system when the disk is gone or 
server down is not very useful since you won't get any non-cached data back.

Also, if you have an ability to migrate that mount (same mount point) to another 
server or clone LUN, you want to unmount the source so you can remount the data 
under that same mount point/namespace....

ric

>
> Like others, I have seen/debugged enough weirdness when it comes to
> resets/aborts(FYI - 200+ hosts in a cluster).
> Because of NDA reasons I can't disclose a whole lot but folks have
> fixed/enhanced
> scsi stack to make resets/aborts fully robust. And you need folks who
> can debug
> 'apps/FS/block/initiator/wire-protocol/target-side' in one shot. Simple.
> So when you say 'crash&  burn' then either or 'all' of the above(minus
> the protocol handling) might need fixing.
>
>> You simply cannot "reboot" the server to clean up after one
>> bad mount when you have thousands of other happy users runs on
> thousands/hundreds
>> of other mount points :)
> Again, can't front-end can go read only and limit the outage w/o
> disturbing thousands of users?
>
>
> Chetan Loke


^ permalink raw reply	[flat|nested] 50+ messages in thread

* RE: [LSF/MM TOPIC] linux servers as a storage server - what's missing?
  2012-01-18 18:46     ` Roland Dreier
@ 2012-01-19 17:50         ` Loke, Chetan
  2012-01-19 17:50         ` Loke, Chetan
  1 sibling, 0 replies; 50+ messages in thread
From: Loke, Chetan @ 2012-01-19 17:50 UTC (permalink / raw)
  To: Roland Dreier; +Cc: linux-fsdevel, linux-scsi

> From: linux-scsi-owner@vger.kernel.org [mailto:linux-scsi-
> owner@vger.kernel.org] On Behalf Of Roland Dreier
> Sent: January 18, 2012 1:46 PM
> To: Ric Wheeler
> Cc: linux-fsdevel@vger.kernel.org; linux-scsi@vger.kernel.org
> Subject: Re: [LSF/MM TOPIC] linux servers as a storage server - what's
> missing?
> 
> > Why would you crash if you have device mapper multipath configured to
> > handle path failover? We have tons of enterprise customers that use
> > that...
> 
> cf http://www.spinics.net/lists/linux-scsi/msg56254.html
> 
> Basically hot unplug of an sdX can oops on any recent kernel, no
> matter what dm stuff you have on top.


Depends on the use-case. In the particular (SAN) use-case that I'm
talking about, you shouldn't unplug a live sdX which is still mapped on
the target (array) side. I've attempted to fix it somewhere and it
works. If you need more info about the EH and initiator fixes, email me
offline.

When would this case arise?
a) You unknowingly unmap *hot* LUNs from the target (array) while
managing the array.

End result - a reset storm on the front-end. You just need a smarter way
to isolate/localize this front-end storm.


Chetan Loke

^ permalink raw reply	[flat|nested] 50+ messages in thread

* RE: [LSF/MM TOPIC] linux servers as a storage server - what'smissing?
  2012-01-19 17:44                     ` Ric Wheeler
@ 2012-01-19 21:30                         ` Loke, Chetan
  0 siblings, 0 replies; 50+ messages in thread
From: Loke, Chetan @ 2012-01-19 21:30 UTC (permalink / raw)
  To: Ric Wheeler
  Cc: Tom Coughlan, Hannes Reinecke, tasleson, Shyam_Iyer, vgoyal,
	linux-fsdevel, linux-scsi

> From: Ric Wheeler [mailto:rwheeler@redhat.com]
> Sent: January 19, 2012 12:44 PM
> To: Loke, Chetan
> Cc: Tom Coughlan; Hannes Reinecke; tasleson@redhat.com;
> Shyam_Iyer@Dell.com; vgoyal@redhat.com; linux-fsdevel@vger.kernel.org;
> linux-scsi@vger.kernel.org
> Subject: Re: [LSF/MM TOPIC] linux servers as a storage server -
> what'smissing?
> 
> On 01/19/2012 12:32 PM, Loke, Chetan wrote:
> >>> True, a single front-end won't see all of those LUNs/devices. So
> not
> >> a
> >>> big concern
> >>> about the front-end hosts.
> >>>
> >>> I am thinking of a use-case where folks can use a linux-box to
> > manage
> >>> their different storage arrays.
> >>> So this linux box with 'libstoragemgmt + app' needs to
> >>> manage(scan/create/delete/so on) all those LUNs.
> >>>
> >> People do have boxes with thousands of luns though&  file systems
in
> >> active use.
> >> Both for SAN and NAS volumes.
> >>
> >> One of the challenges is what to do when just one LUN (or NFS
> server)
> >> crashes
> >> and burns.
> > The FS needs to go read-only(plain&  simple) because you don't know
> > what's going on.
> > You can't risk writing data anymore. Let the apps fail. You can make
> it
> > happen even today.
> > It's a simple exercise.
> 
> Nope - it needs to be torn down and we need to be able to cleanly
> unmount it.
> 
> Letting an application see and read-only file system when the disk is
> gone or
> server down is not very useful since you won't get any non-cached data
> back.
> 

Sure, it's just a partial snapshot (aka the cached data) of the file-system.

But writes that have to fetch non-cached data will unnecessarily issue
I/O to the fabric, and these orphaned I/Os cause more pain in the
cleanup. And if caching is enabled on the front side, then it's all the
more painful.

We can go one extra step and make the FS fail read I/O for non-cached
data too, to avoid more orphan I/Os.

Tearing down will happen sometime later. But don't you agree that
something needs to happen before that? And that something is read-only,
which will eventually propagate to the users (for example, when you are
copying a new file). Users will then report it to their IT/admins. This
approach of serving the snapshot (cached) file-system could serve some
users, for what it's worth. It's better than surprise removal and
issuing needless I/Os (read: EH race conditions).

> Also, if you have an ability to migrate that mount (same mount point)
> to another
> server or clone LUN, you want to unmount the source so you can remount
> the data
> under that same mount point/namespace....
> 

Won't this be protocol-specific?


> ric

Chetan Loke


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [LSF/MM TOPIC] linux servers as a storage server - what'smissing?
  2012-01-19 21:30                         ` Loke, Chetan
  (?)
@ 2012-01-19 21:39                         ` Ric Wheeler
  2012-01-24 17:05                             ` Loke, Chetan
  -1 siblings, 1 reply; 50+ messages in thread
From: Ric Wheeler @ 2012-01-19 21:39 UTC (permalink / raw)
  To: Loke, Chetan
  Cc: Tom Coughlan, Hannes Reinecke, tasleson, Shyam_Iyer, vgoyal,
	linux-fsdevel, linux-scsi

On 01/19/2012 04:30 PM, Loke, Chetan wrote:
>> From: Ric Wheeler [mailto:rwheeler@redhat.com]
>> Sent: January 19, 2012 12:44 PM
>> To: Loke, Chetan
>> Cc: Tom Coughlan; Hannes Reinecke; tasleson@redhat.com;
>> Shyam_Iyer@Dell.com; vgoyal@redhat.com; linux-fsdevel@vger.kernel.org;
>> linux-scsi@vger.kernel.org
>> Subject: Re: [LSF/MM TOPIC] linux servers as a storage server -
>> what'smissing?
>>
>> On 01/19/2012 12:32 PM, Loke, Chetan wrote:
>>>>> True, a single front-end won't see all of those LUNs/devices. So
>> not
>>>> a
>>>>> big concern
>>>>> about the front-end hosts.
>>>>>
>>>>> I am thinking of a use-case where folks can use a linux-box to
>>> manage
>>>>> their different storage arrays.
>>>>> So this linux box with 'libstoragemgmt + app' needs to
>>>>> manage(scan/create/delete/so on) all those LUNs.
>>>>>
>>>> People do have boxes with thousands of luns though&   file systems
> in
>>>> active use.
>>>> Both for SAN and NAS volumes.
>>>>
>>>> One of the challenges is what to do when just one LUN (or NFS
>> server)
>>>> crashes
>>>> and burns.
>>> The FS needs to go read-only(plain&   simple) because you don't know
>>> what's going on.
>>> You can't risk writing data anymore. Let the apps fail. You can make
>> it
>>> happen even today.
>>> It's a simple exercise.
>> Nope - it needs to be torn down and we need to be able to cleanly
>> unmount it.
>>
>> Letting an application see and read-only file system when the disk is
>> gone or
>> server down is not very useful since you won't get any non-cached data
>> back.
>>
> Sure, it's just a partial snapshot(aka cached-data) of the file-system.
>
> But writes that have to fetch the non-cached data, will unnecessarily
> issue I/O to the fabric. These orphaned I/O's cause more pain in the
> cleanup.
> And if caching is enabled on the front-side then it's all the more
> painful.
>
> We can go one extra step and make FS fail read I/O for non-cached data
> too
> to avoid more orphan IOs.

I don't really see this as a useful state. Read-only without a real backing file 
system or LUN is hit or miss; that file system should go offline :)

>
> Tearing down will happen sometime later. But don't you agree that
> something needs
> to happen before that? And that something is, read-only, which will
> eventually
> propagate to the users(example when you are copying a new file).
> Users will then report it to their IT/admins.
> This approach of serving the snap-shot(cached) file-system could serve
> some users for what it's worth. It's better than surprise-removal and
> issuing
> needless IOs(read - eh race conditions).
>
>> Also, if you have an ability to migrate that mount (same mount point)
>> to another
>> server or clone LUN, you want to unmount the source so you can remount
>> the data
>> under that same mount point/namespace....
>>
> Won't this be protocol specific.
>

Not really protocol-specific. We need to be able to do a forced unmount and then 
do fail over (that varies depending on many things, like your HA framework and 
certainly the type of thing you are attempting to fail over).
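
For reference, a minimal sketch of the unmount primitives we have today -
umount2(2) with MNT_FORCE (mostly useful for network file systems) and
MNT_DETACH (lazy detach). The mount point name is just an example; the gap is
making these reliable enough for HA tooling to depend on when the backing
device is really gone:

/* Sketch: forced unmount with a lazy-detach fallback.  MNT_FORCE aborts
 * pending requests where the filesystem supports it; MNT_DETACH detaches
 * the mount point so new lookups fail while existing users drain. */
#include <stdio.h>
#include <sys/mount.h>

int main(void)
{
	const char *mnt = "/mnt/failed-lun";	/* mount point is only an example */

	if (umount2(mnt, MNT_FORCE) != 0) {
		perror("umount2(MNT_FORCE)");
		/* fall back to a lazy detach so the namespace is usable again */
		if (umount2(mnt, MNT_DETACH) != 0) {
			perror("umount2(MNT_DETACH)");
			return 1;
		}
	}
	return 0;
}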

Ric


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [LSF/MM TOPIC] linux servers as a storage server - what's missing?
  2012-01-10  6:53           ` Ric Wheeler
@ 2012-01-20  8:55             ` Hannes Reinecke
  0 siblings, 0 replies; 50+ messages in thread
From: Hannes Reinecke @ 2012-01-20  8:55 UTC (permalink / raw)
  To: Ric Wheeler
  Cc: Tom Coughlan, tasleson, Shyam_Iyer, vgoyal, linux-fsdevel, linux-scsi

On 01/10/2012 07:53 AM, Ric Wheeler wrote:
> On 01/09/2012 02:59 PM, Tom Coughlan wrote:
>> On Mon, 2012-01-09 at 13:18 +0100, Hannes Reinecke wrote:
>>> On 12/22/2011 09:54 PM, Shyam_Iyer@Dell.com wrote:
>>>>
>>>>> -----Original Message-----
>>>>> From: linux-scsi-owner@vger.kernel.org [mailto:linux-scsi-
>>>>> owner@vger.kernel.org] On Behalf Of Vivek Goyal
>>>>> Sent: Thursday, December 22, 2011 10:59 AM
>>>>> To: Iyer, Shyam
>>>>> Cc: rwheeler@redhat.com; linux-fsdevel@vger.kernel.org; linux-
>>>>> scsi@vger.kernel.org
>>>>> Subject: Re: [LSF/MM TOPIC] linux servers as a storage server -
>>>>> what's
>>>>> missing?
>>>>>
>>>>> On Thu, Dec 22, 2011 at 01:44:16PM +0530, Shyam_Iyer@Dell.com
>>>>> wrote:
>>>>>
>>>>> [..]
>>>>>
>>>>>> Simple asks -
>>>>>> 1) Provide a consistent storage and fs management library that
>>>>> discourages folks to write their own usespace storage library.
>>>>> Include
>>>>> things like fs formatting(fs profiles), transport
>>>>> configuration(eg:
>>>>> iscsiadm as a library), thin provisioning watermarks, cluster
>>>>> management, apis for cgroups etc.
>>>>>                                        ^^^^^^^^^^^^^^^^
>>>>> For cgroups, we have libcgroup library. Not many people like to
>>>>> use it
>>>>> though as cgroup is exported as a filesystem and they prefer to
>>>>> use
>>>>> normal
>>>>> libc api to traverse and configure cgroups (Instead of going
>>>>> through
>>>>> another library). Some examples include libvrit, systemd.
>>>>>
>>>>> Thanks
>>>>> Vivek
>>>> Well honestly I think that is a libvirt/systemd issue and
>>>> libvirt also
>>>> invokes things like iscsiadm, dcb etc as a binary :-/
>>>>
>>>> Some one could always use qemu command lines to invoke KVM/XEN but
>>>> libvirt has saved me one too many days in doing a quick operation
>>>> without wondering about a qemu commandline.
>>>>
>>>> I am also asking for ideas on how to avoid this fragmentation
>>>> because
>>>> just like libvirt others are also encouraged to do their own libc
>>> thing
>>>> in the absence of a common storage management framework..
>>>>
>>>> Does the standard interface for linux end at the user/kernel
>>>> boundary
>>>> or the user/libc boundary? If so I feel we would continue to lag
>>> behind
>>>> other OSes in features because of the model.
>>>>
>>> StorageAPI _again_.
>>>
>>> I was under the impression RH had someone working on it.
>> Yes, Red Hat does. Tony Asleson. libStorageMgmt:
>>
>> http://sourceforge.net/apps/trac/libstoragemgmt
>>
>> The current focus is on managing external storage (SMI-S, etc.). This
>> focus can be expanded over time. Contributions welcome.
>>
>>> (Actually I was trying to give it a go, but then got buried under
>>> customer escalations).
>>>
>>> So yes, we know there is a shortcoming.
>>> And yes, we should improve things.
>>>
>>> But I feel another discussion about this will only give us more
>>> insight, but not moving things forward.
>>>
>>> What about having a separate session at the storage summit (or even
>>> at the collab summit) to hammer out the requirements here?
>> That would be fine, although as you say, we need more than talk.
>>
>> Tom
>>
> 
> Having a special session would be really a good idea - given the
> size of the discussion, we might want to do both a talk and a
> breakout at collab summit...
> 
To advance this further, I've submitted a BoF proposal 'Storage
Management on Linux' for the Collab Summit.

I would be available for a joint talk as an introduction if someone
would be interested ...

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@suse.de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 50+ messages in thread

* RE: [LSF/MM TOPIC] linux servers as a storage server - what'smissing?
  2012-01-19 21:39                         ` Ric Wheeler
@ 2012-01-24 17:05                             ` Loke, Chetan
  0 siblings, 0 replies; 50+ messages in thread
From: Loke, Chetan @ 2012-01-24 17:05 UTC (permalink / raw)
  To: Ric Wheeler
  Cc: Tom Coughlan, Hannes Reinecke, tasleson, Shyam_Iyer, vgoyal,
	linux-fsdevel, linux-scsi

> > But writes that have to fetch the non-cached data, will
unnecessarily
> > issue I/O to the fabric. These orphaned I/O's cause more pain in the
> > cleanup.
> > And if caching is enabled on the front-side then it's all the more
> > painful.
> >
> > We can go one extra step and make FS fail read I/O for non-cached
data
> > too to avoid more orphan IOs.
> 
> I don't really see this as a useful state. Read-only without a real
> backing file
> system or LUN is hit or miss, that file system should go offline :)
> 

Last year when I checked (I forget which), I think xfs (or ext4) was
going read-only. If this is still the case, then you are basically
asking FSes to modify that behavior.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [LSF/MM TOPIC] linux servers as a storage server - what's missing?
  2012-01-24 17:05                             ` Loke, Chetan
  (?)
@ 2012-01-24 18:13                             ` Ric Wheeler
  -1 siblings, 0 replies; 50+ messages in thread
From: Ric Wheeler @ 2012-01-24 18:13 UTC (permalink / raw)
  To: Loke, Chetan
  Cc: Tom Coughlan, Hannes Reinecke, tasleson, Shyam_Iyer, vgoyal,
	linux-fsdevel, linux-scsi

On 01/24/2012 12:05 PM, Loke, Chetan wrote:
>>> But writes that have to fetch the non-cached data, will unnecessarily
>>> issue I/O to the fabric. These orphaned I/O's cause more pain in the
>>> cleanup.
>>> And if caching is enabled on the front-side then it's all the more
>>> painful.
>>>
>>> We can go one extra step and make FS fail read I/O for non-cached data
>>> too to avoid more orphan IOs.
>> I don't really see this as a useful state. Read-only without a real
>> backing file
>> system or LUN is hit or miss, that file system should go offline :)
>>
> Last year when I checked, I forget but I think xfs(or ext4) was going
> read-only.
> If this is still the case then you are basically asking FS's to modify
> that behavior.

Not really.

I think that at that state, we need to be able to cleanly umount those file 
systems and clean up the system without requiring a reboot.

Read-only state is not a usable condition for a file system without a backing 
device, just a side effect of our error handling...
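
To make "clean up without a reboot" concrete, a minimal userspace sketch of
the forced-detach piece that exists today. The mount point name is made up,
and this only shows the existing umount2(2)/MNT_DETACH path, not the fuller
teardown being asked for here:

/*
 * Minimal sketch: detach a mount whose backing device has gone away,
 * without a reboot.  The mount point name is hypothetical, and this is
 * just the existing umount2(2) interface, not new kernel work.
 */
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mount.h>

int main(void)
{
    const char *mnt = "/mnt/dead-lun";    /* hypothetical mount point */

    /*
     * MNT_DETACH lazily removes the mount from the namespace even if it
     * is busy; existing users keep getting errors until they close their
     * file descriptors, but the rest of the system keeps running.
     */
    if (umount2(mnt, MNT_DETACH) < 0) {
        fprintf(stderr, "umount2(%s): %s\n", mnt, strerror(errno));
        return 1;
    }
    return 0;
}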

ric


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [LSF/MM TOPIC][ATTEND] linux servers as a storage server - what's missing?
  2012-01-03 19:26 ` [LSF/MM TOPIC][ATTEND] linux servers as a storage server - what's missing? Jeff Layton
  2012-01-03 19:32   ` Chuck Lever
@ 2012-01-24 21:36   ` J. Bruce Fields
  2012-01-24 23:13     ` Ric Wheeler
  1 sibling, 1 reply; 50+ messages in thread
From: J. Bruce Fields @ 2012-01-24 21:36 UTC (permalink / raw)
  To: Jeff Layton; +Cc: Ric Wheeler, linux-fsdevel, linux-scsi

On Tue, Jan 03, 2012 at 02:26:09PM -0500, Jeff Layton wrote:
> On Wed, 21 Dec 2011 10:59:43 -0500
> Ric Wheeler <rwheeler@redhat.com> wrote:
> 
> > 
> > One common thing that I see a lot of these days is an increasing number of 
> > platforms that are built on our stack as storage servers. Ranging from the 
> > common linux based storage/NAS devices up to various distributed systems.  
> > Almost all of them use our common stack - software RAID, LVM, XFS/ext4 and samba.
> > 
> > At last year's SNIA developers conference, it was clear that Microsoft is 
> > putting a lot of effort into enhancing windows 8 server as a storage server with 
> > both support for a pNFS server and of course SMB. I think that linux (+samba) is 
> > ahead of the windows based storage appliances today, but they are putting 
> > together a very aggressive list of features.
> > 
> > I think that it would be useful and interesting to take a slot at this year's 
> > LSF to see how we are doing in this space. How large do we need to scale for an 
> > appliance?  What kind of work is needed (support for the copy offload system 
> > call? better support for out of band notifications like those used in "thinly 
> > provisioned" SCSI devices? management API's? Ease of use CLI work? SMB2.2 support?).
> > 
> > The goal would be to see what technical gaps we have that need more active 
> > development in, not just a wish list :)
> > 
> > Ric
> 
> Unfortunately, w/o a wishlist of sorts, it's hard to know what needs
> more active development ;).
> 
> While HCH will probably disagree, being able to support more
> NFSv4/Windows API features at the VFS layer would make it a lot easier
> to do a more unified serving appliance. Right now, both knfsd and samba
> track too much info internally, and that makes it very difficult to
> serve the same data via multiple protocols.

By the way, we could really use a
Windows/Samba expert if we're going to discuss that.

I don't think their list(s) got the announcement?

--b.

> 
> Off the top of my head, my "wishlist" for better NFSv4 serving would be:
> 
> - RichACLs
> - Share/Deny mode support on open
> - mandatory locking that doesn't rely on weirdo file modes
> 
> It's always going to be hard for us to compete with dedicated
> appliances. Where Linux can shine though is in allowing for more
> innovative combinations.
> 
> Being able to do active/active NFS serving from clustered filesystems,
> for instance is something that we can eventually attain but that would
> be harder to do in an appliance. This sort of discussion might also
> dovetail with Benny's proposal about pNFS serving.
> 
> -- 
> Jeff Layton <jlayton@redhat.com>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [LSF/MM TOPIC][ATTEND] linux servers as a storage server - what's missing?
  2012-01-24 21:36   ` J. Bruce Fields
@ 2012-01-24 23:13     ` Ric Wheeler
  2012-01-25 19:05       ` Christopher R. Hertel
  2012-01-25 20:25       ` Christopher R. Hertel
  0 siblings, 2 replies; 50+ messages in thread
From: Ric Wheeler @ 2012-01-24 23:13 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Jeff Layton, linux-fsdevel, linux-scsi, Jeremy Allison,
	Simo Sorce, Christopher R. Hertel

On 01/24/2012 04:36 PM, J. Bruce Fields wrote:
> On Tue, Jan 03, 2012 at 02:26:09PM -0500, Jeff Layton wrote:
>> On Wed, 21 Dec 2011 10:59:43 -0500
>> Ric Wheeler<rwheeler@redhat.com>  wrote:
>>
>>> One common thing that I see a lot of these days is an increasing number of
>>> platforms that are built on our stack as storage servers. Ranging from the
>>> common linux based storage/NAS devices up to various distributed systems.
>>> Almost all of them use our common stack - software RAID, LVM, XFS/ext4 and samba.
>>>
>>> At last year's SNIA developers conference, it was clear that Microsoft is
>>> putting a lot of effort into enhancing windows 8 server as a storage server with
>>> both support for a pNFS server and of course SMB. I think that linux (+samba) is
>>> ahead of the windows based storage appliances today, but they are putting
>>> together a very aggressive list of features.
>>>
>>> I think that it would be useful and interesting to take a slot at this year's
>>> LSF to see how we are doing in this space. How large do we need to scale for an
>>> appliance?  What kind of work is needed (support for the copy offload system
>>> call? better support for out of band notifications like those used in "thinly
>>> provisioned" SCSI devices? management API's? Ease of use CLI work? SMB2.2 support?).
>>>
>>> The goal would be to see what technical gaps we have that need more active
>>> development in, not just a wish list :)
>>>
>>> Ric
>> Unfortunately, w/o a wishlist of sorts, it's hard to know what needs
>> more active development ;).
>>
>> While HCH will probably disagree, being able to support more
>> NFSv4/Windows API features at the VFS layer would make it a lot easier
>> to do a more unified serving appliance. Right now, both knfsd and samba
>> track too much info internally, and that makes it very difficult to
>> serve the same data via multiple protocols.
> By the way, we could really use a
> Windows/Samba expert if we're going to discuss that.
>
> I don't think their list(s) got the announcement?
>
> --b.

Adding in three windows/samba people that I know of :)

Ric

>> Off the top of my head, my "wishlist" for better NFSv4 serving would be:
>>
>> - RichACLs
>> - Share/Deny mode support on open
>> - mandatory locking that doesn't rely on weirdo file modes
>>
>> It's always going to be hard for us to compete with dedicated
>> appliances. Where Linux can shine though is in allowing for more
>> innovative combinations.
>>
>> Being able to do active/active NFS serving from clustered filesystems,
>> for instance is something that we can eventually attain but that would
>> be harder to do in an appliance. This sort of discussion might also
>> dovetail with Benny's proposal about pNFS serving.
>>
>> -- 
>> Jeff Layton<jlayton@redhat.com>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [LSF/MM TOPIC][ATTEND] linux servers as a storage server - what's missing?
  2012-01-24 23:13     ` Ric Wheeler
@ 2012-01-25 19:05       ` Christopher R. Hertel
  2012-01-25 20:25       ` Christopher R. Hertel
  1 sibling, 0 replies; 50+ messages in thread
From: Christopher R. Hertel @ 2012-01-25 19:05 UTC (permalink / raw)
  To: Ric Wheeler
  Cc: J. Bruce Fields, Jeff Layton, linux-fsdevel, linux-scsi,
	Jeremy Allison, Simo Sorce

Ric, et. al.,

I want to think of these questions from the perspective of the storage
systems vendor who is (or may be interested in) using the Linux storage
stack in a product.
  * How do we make things easy for those vendors (without compromising
    the licensing--theirs and ours)?
  * How do we ensure that they have all of the features that they need,
    and can add their own software "skin" to present to the end-user?

We have talked about some of these issues already, and I know that there is
work being done, but the key will be to make that work consistent so that we
are not reinventing wheels but also not forcing customers to use this or
that specific piece or part.

I think we want to look at our storage stack from the bottom up, and
consider what can be done within the kernel and/or in user-space to bring
the pieces together.

Okay... so all of that is wildly vague and magic-waving-of-hands-ish.  The
questions raised in the thread below require a bit of thought to answer.

Chris -)-----

Ric Wheeler wrote:
> On 01/24/2012 04:36 PM, J. Bruce Fields wrote:
>> On Tue, Jan 03, 2012 at 02:26:09PM -0500, Jeff Layton wrote:
>>> On Wed, 21 Dec 2011 10:59:43 -0500
>>> Ric Wheeler<rwheeler@redhat.com>  wrote:
>>>
>>>> One common thing that I see a lot of these days is an increasing
>>>> number of
>>>> platforms that are built on our stack as storage servers. Ranging
>>>> from the
>>>> common linux based storage/NAS devices up to various distributed
>>>> systems.
>>>> Almost all of them use our common stack - software RAID, LVM,
>>>> XFS/ext4 and samba.
>>>>
>>>> At last year's SNIA developers conference, it was clear that
>>>> Microsoft is
>>>> putting a lot of effort into enhancing windows 8 server as a storage
>>>> server with
>>>> both support for a pNFS server and of course SMB. I think that linux
>>>> (+samba) is
>>>> ahead of the windows based storage appliances today, but they are
>>>> putting
>>>> together a very aggressive list of features.
>>>>
>>>> I think that it would be useful and interesting to take a slot at
>>>> this year's
>>>> LSF to see how we are doing in this space. How large do we need to
>>>> scale for an
>>>> appliance?  What kind of work is needed (support for the copy
>>>> offload system
>>>> call? better support for out of band notifications like those used
>>>> in "thinly
>>>> provisioned" SCSI devices? management API's? Ease of use CLI work?
>>>> SMB2.2 support?).
>>>>
>>>> The goal would be to see what technical gaps we have that need more
>>>> active
>>>> development in, not just a wish list :)
>>>>
>>>> Ric
>>> Unfortunately, w/o a wishlist of sorts, it's hard to know what needs
>>> more active development ;).
>>>
>>> While HCH will probably disagree, being able to support more
>>> NFSv4/Windows API features at the VFS layer would make it a lot easier
>>> to do a more unified serving appliance. Right now, both knfsd and samba
>>> track too much info internally, and that makes it very difficult to
>>> serve the same data via multiple protocols.
>> By the way, we could really use a
>> Windows/Samba expert if we're going to discuss that.
>>
>> I don't think their list(s) got the announcement?
>>
>> --b.
> 
> Adding in three windows/samba people that I know of :)
> 
> Ric
> 
>>> Off the top of my head, my "wishlist" for better NFSv4 serving would be:
>>>
>>> - RichACLs
>>> - Share/Deny mode support on open
>>> - mandatory locking that doesn't rely on weirdo file modes
>>>
>>> It's always going to be hard for us to compete with dedicated
>>> appliances. Where Linux can shine though is in allowing for more
>>> innovative combinations.
>>>
>>> Being able to do active/active NFS serving from clustered filesystems,
>>> for instance is something that we can eventually attain but that would
>>> be harder to do in an appliance. This sort of discussion might also
>>> dovetail with Benny's proposal about pNFS serving.
>>>
>>> -- 
>>> Jeff Layton<jlayton@redhat.com>
>>> -- 
>>> To unsubscribe from this list: send the line "unsubscribe
>>> linux-fsdevel" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

-- 
"Implementing CIFS - the Common Internet FileSystem" ISBN: 013047116X
Samba Team -- http://www.samba.org/     -)-----   Christopher R. Hertel
jCIFS Team -- http://jcifs.samba.org/   -)-----   ubiqx development, uninq.
ubiqx Team -- http://www.ubiqx.org/     -)-----   crh@ubiqx.mn.org
OnLineBook -- http://ubiqx.org/cifs/    -)-----   crh@ubiqx.org

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [LSF/MM TOPIC][ATTEND] linux servers as a storage server - what's missing?
  2012-01-24 23:13     ` Ric Wheeler
  2012-01-25 19:05       ` Christopher R. Hertel
@ 2012-01-25 20:25       ` Christopher R. Hertel
  2012-01-25 21:56         ` Roland Dreier
  2012-01-26 11:15         ` Bart Van Assche
  1 sibling, 2 replies; 50+ messages in thread
From: Christopher R. Hertel @ 2012-01-25 20:25 UTC (permalink / raw)
  To: Ric Wheeler
  Cc: J. Bruce Fields, Jeff Layton, linux-fsdevel, linux-scsi,
	Jeremy Allison, Simo Sorce

I have actually been mulling this over a good deal in my mind, but I keep
approaching it from different perspectives.

Here are some random questions that pop up in my mind...

Can we put FibreChannel HBAs in *target* mode into a Linux-based PC and
export LUNs via FibreChannel?
  - Do we support multipath in this configuration?
  - Zoning?  Multi-access LUNs for Clustering?

Where are we on FCoE support?

Where do we stand on ATAoE?
  - Coraid, the creators of ATAoE, were recently listed 3rd on a
    "Most Promising Storage Start-Ups in 2012" list[1].
  - ATAoE support has been standard in Linux for years.
  - The target software is very low-level.  It needs an overhaul
    and a powerful configuration API.

What about other block-level protocols?  My familiarity with these varies.

Chris -)-----


[1]http://www.storagenewsletter.com/news/startups/most-promising-storage-start-ups


Ric Wheeler wrote:
> On 01/24/2012 04:36 PM, J. Bruce Fields wrote:
>> On Tue, Jan 03, 2012 at 02:26:09PM -0500, Jeff Layton wrote:
>>> On Wed, 21 Dec 2011 10:59:43 -0500
>>> Ric Wheeler<rwheeler@redhat.com>  wrote:
>>>
>>>> One common thing that I see a lot of these days is an increasing
>>>> number of
>>>> platforms that are built on our stack as storage servers. Ranging
>>>> from the
>>>> common linux based storage/NAS devices up to various distributed
>>>> systems.
>>>> Almost all of them use our common stack - software RAID, LVM,
>>>> XFS/ext4 and samba.
>>>>
>>>> At last year's SNIA developers conference, it was clear that
>>>> Microsoft is
>>>> putting a lot of effort into enhancing windows 8 server as a storage
>>>> server with
>>>> both support for a pNFS server and of course SMB. I think that linux
>>>> (+samba) is
>>>> ahead of the windows based storage appliances today, but they are
>>>> putting
>>>> together a very aggressive list of features.
>>>>
>>>> I think that it would be useful and interesting to take a slot at
>>>> this year's
>>>> LSF to see how we are doing in this space. How large do we need to
>>>> scale for an
>>>> appliance?  What kind of work is needed (support for the copy
>>>> offload system
>>>> call? better support for out of band notifications like those used
>>>> in "thinly
>>>> provisioned" SCSI devices? management API's? Ease of use CLI work?
>>>> SMB2.2 support?).
>>>>
>>>> The goal would be to see what technical gaps we have that need more
>>>> active
>>>> development in, not just a wish list :)
>>>>
>>>> Ric
>>> Unfortunately, w/o a wishlist of sorts, it's hard to know what needs
>>> more active development ;).
>>>
>>> While HCH will probably disagree, being able to support more
>>> NFSv4/Windows API features at the VFS layer would make it a lot easier
>>> to do a more unified serving appliance. Right now, both knfsd and samba
>>> track too much info internally, and that makes it very difficult to
>>> serve the same data via multiple protocols.
>> By the way, we could really use a
>> Windows/Samba expert if we're going to discuss that.
>>
>> I don't think their list(s) got the announcement?
>>
>> --b.
> 
> Adding in three windows/samba people that I know of :)
> 
> Ric
> 
>>> Off the top of my head, my "wishlist" for better NFSv4 serving would be:
>>>
>>> - RichACLs
>>> - Share/Deny mode support on open
>>> - mandatory locking that doesn't rely on weirdo file modes
>>>
>>> It's always going to be hard for us to compete with dedicated
>>> appliances. Where Linux can shine though is in allowing for more
>>> innovative combinations.
>>>
>>> Being able to do active/active NFS serving from clustered filesystems,
>>> for instance is something that we can eventually attain but that would
>>> be harder to do in an appliance. This sort of discussion might also
>>> dovetail with Benny's proposal about pNFS serving.
>>>
>>> -- 
>>> Jeff Layton<jlayton@redhat.com>
>>> -- 
>>> To unsubscribe from this list: send the line "unsubscribe
>>> linux-fsdevel" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

-- 
"Implementing CIFS - the Common Internet FileSystem" ISBN: 013047116X
Samba Team -- http://www.samba.org/     -)-----   Christopher R. Hertel
jCIFS Team -- http://jcifs.samba.org/   -)-----   ubiqx development, uninq.
ubiqx Team -- http://www.ubiqx.org/     -)-----   crh@ubiqx.mn.org
OnLineBook -- http://ubiqx.org/cifs/    -)-----   crh@ubiqx.org

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [LSF/MM TOPIC][ATTEND] linux servers as a storage server - what's missing?
  2012-01-25 20:25       ` Christopher R. Hertel
@ 2012-01-25 21:56         ` Roland Dreier
  2012-01-25 22:09           ` Christopher R. Hertel
  2012-01-26 11:15         ` Bart Van Assche
  1 sibling, 1 reply; 50+ messages in thread
From: Roland Dreier @ 2012-01-25 21:56 UTC (permalink / raw)
  To: Christopher R. Hertel
  Cc: Ric Wheeler, J. Bruce Fields, Jeff Layton, linux-fsdevel,
	linux-scsi, Jeremy Allison, Simo Sorce

On Wed, Jan 25, 2012 at 12:25 PM, Christopher R. Hertel <crh@samba.org> wrote:
> Can we put FibreChannel HBAs in *target* mode into a Linux-based PC and
> export LUNs via FibreChannel?
>  - Do we support multipath in this configuration?
>  - Zoning?  Multi-access LUNs for Clustering?

Yes to everything, although the low-level FC driver patches are not
quite upstream yet.

 - R.
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [LSF/MM TOPIC][ATTEND] linux servers as a storage server - what's missing?
  2012-01-25 21:56         ` Roland Dreier
@ 2012-01-25 22:09           ` Christopher R. Hertel
  2012-01-26 21:52             ` Andy Grover
  0 siblings, 1 reply; 50+ messages in thread
From: Christopher R. Hertel @ 2012-01-25 22:09 UTC (permalink / raw)
  To: Roland Dreier
  Cc: Ric Wheeler, J. Bruce Fields, Jeff Layton, linux-fsdevel,
	linux-scsi, Jeremy Allison, Simo Sorce

Roland Dreier wrote:
> On Wed, Jan 25, 2012 at 12:25 PM, Christopher R. Hertel <crh@samba.org> wrote:
>> Can we put FibreChannel HBAs in *target* mode into a Linux-based PC and
>> export LUNs via FibreChannel?
>>  - Do we support multipath in this configuration?
>>  - Zoning?  Multi-access LUNs for Clustering?
> 
> Yes to everything, although the low-level FC driver patches are not
> quite upstream yet.

Cool!
Thanks for the update.

I assume that the management API for this is similar to or the same as the
FCoE API?  That would make it easier to produce a consistent look-and-feel
in whatever end-user management tools a third party may create.

Chris -)-----

-- 
"Implementing CIFS - the Common Internet FileSystem" ISBN: 013047116X
Samba Team -- http://www.samba.org/     -)-----   Christopher R. Hertel
jCIFS Team -- http://jcifs.samba.org/   -)-----   ubiqx development, uninq.
ubiqx Team -- http://www.ubiqx.org/     -)-----   crh@ubiqx.mn.org
OnLineBook -- http://ubiqx.org/cifs/    -)-----   crh@ubiqx.org

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [LSF/MM TOPIC][ATTEND] linux servers as a storage server - what's missing?
  2012-01-25 20:25       ` Christopher R. Hertel
  2012-01-25 21:56         ` Roland Dreier
@ 2012-01-26 11:15         ` Bart Van Assche
  1 sibling, 0 replies; 50+ messages in thread
From: Bart Van Assche @ 2012-01-26 11:15 UTC (permalink / raw)
  To: Christopher R. Hertel
  Cc: Ric Wheeler, J. Bruce Fields, Jeff Layton, linux-fsdevel,
	linux-scsi, Jeremy Allison, Simo Sorce

On Wed, Jan 25, 2012 at 9:25 PM, Christopher R. Hertel <crh@samba.org> wrote:
> Where are we on FCoE support?

Initiator and target mode should work, except that VN2VN mode is
broken. See also
http://lists.open-fcoe.org/pipermail/devel/2011-December/011892.html.

Bart.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [LSF/MM TOPIC][ATTEND] linux servers as a storage server - what's missing?
  2012-01-25 22:09           ` Christopher R. Hertel
@ 2012-01-26 21:52             ` Andy Grover
  0 siblings, 0 replies; 50+ messages in thread
From: Andy Grover @ 2012-01-26 21:52 UTC (permalink / raw)
  To: Christopher R. Hertel
  Cc: Roland Dreier, Ric Wheeler, J. Bruce Fields, Jeff Layton,
	linux-fsdevel, linux-scsi, Jeremy Allison, Simo Sorce

On 01/25/2012 02:09 PM, Christopher R. Hertel wrote:
> Roland Dreier wrote:
>> On Wed, Jan 25, 2012 at 12:25 PM, Christopher R. Hertel <crh@samba.org> wrote:
>>> Can we put FibreChannel HBAs in *target* mode into a Linux-based PC and
>>> export LUNs via FibreChannel?
>>>  - Do we support multipath in this configuration?
>>>  - Zoning?  Multi-access LUNs for Clustering?
>>
>> Yes to everything, although the low-level FC driver patches are not
>> quite upstream yet.
> 
> Cool!
> Thanks for the update.
> 
> I assume that the management API for this is similar to or the same as the
> FCoE API?  That would make it easier to produce a consistent look-and-feel
> in whatever end-user management tools a third party may create.

If you're talking about the kernel target stuff, you need to do the
FCoE-specific setup using the same fcoe tools, and then configure the
kernel target via configfs, so that's probably going to be a little
different.
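
Roughly, "configure the kernel target via configfs" looks like the sketch
below (shown in C, equivalent to a couple of mkdir/echo commands). The
fileio backstore directory layout and the fd_dev_name/fd_dev_size control
strings are quoted from memory and may differ between kernel versions, so
treat them as illustrative rather than authoritative; a fabric module
(iSCSI, FCoE, ...) then exposes its own tree under
/sys/kernel/config/target/ where LUNs are linked to this backstore.

/*
 * Rough sketch only: create a fileio backstore through the target
 * configfs interface.  Names follow the LIO layout as I remember it
 * (a fileio_N HBA dir, then a device dir with "control" and "enable"
 * attributes) and may vary by kernel version.
 */
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

static int write_attr(const char *path, const char *val)
{
    FILE *f = fopen(path, "w");

    if (!f) {
        fprintf(stderr, "open %s: %s\n", path, strerror(errno));
        return -1;
    }
    fputs(val, f);
    return fclose(f);
}

int main(void)
{
    const char *hba = "/sys/kernel/config/target/core/fileio_0";
    const char *dev = "/sys/kernel/config/target/core/fileio_0/disk0";
    char path[256];

    /* mkdir asks the target core to instantiate the HBA and device. */
    if (mkdir(hba, 0755) && errno != EEXIST)
        return 1;
    if (mkdir(dev, 0755) && errno != EEXIST)
        return 1;

    /* Point the backstore at a backing file (made-up name), then enable. */
    snprintf(path, sizeof(path), "%s/control", dev);
    write_attr(path, "fd_dev_name=/tmp/disk0.img,fd_dev_size=1073741824");

    snprintf(path, sizeof(path), "%s/enable", dev);
    write_attr(path, "1");

    return 0;
}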

Regards -- Andy

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [LSF/MM TOPIC] linux servers as a storage server - what's missing?
  2012-01-24 17:05                             ` Loke, Chetan
  (?)
  (?)
@ 2012-01-26 22:24                             ` Dave Chinner
  2012-01-26 22:29                               ` Ric Wheeler
  -1 siblings, 1 reply; 50+ messages in thread
From: Dave Chinner @ 2012-01-26 22:24 UTC (permalink / raw)
  To: Loke, Chetan
  Cc: Ric Wheeler, Tom Coughlan, Hannes Reinecke, tasleson, Shyam_Iyer,
	vgoyal, linux-fsdevel, linux-scsi

On Tue, Jan 24, 2012 at 12:05:23PM -0500, Loke, Chetan wrote:
> > > But writes that have to fetch the non-cached data, will unnecessarily
> > > issue I/O to the fabric. These orphaned I/O's cause more pain in the
> > > cleanup.
> > > And if caching is enabled on the front-side then it's all the more
> > > painful.
> > >
> > > We can go one extra step and make FS fail read I/O for non-cached data
> > > too to avoid more orphan IOs.
> > 
> > I don't really see this as a useful state. Read-only without a real
> > backing file
> > system or LUN is hit or miss, that file system should go offline :)
> > 
> 
> Last year when I checked, I forget but I think xfs(or ext4) was going
> read-only.
> If this is still the case then you are basically asking FS's to modify
> that behavior.

ext4 goes read-only on error by default. I think that behaviour can
be changed by a mount option.

XFS shuts the filesystem down (takes it offline) preventing all
dirty data and metadata from being issued to disk, aborts any
journal IO that might be going on, etc. It then returns fatal errors
(EIO or EUCLEAN "structure needs cleaning" errors) to all callers
that attempt to write or modify the filesystem, and EIO to all
attempts to read from it.

IOWs, the XFS filesystem is *gone* once it shuts down and requires
administrator intervention to get it back to a working state....
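
To make that concrete, a minimal sketch of what this looks like from
userspace (device, mount point and file names below are made up; ext4's
policy is selected with the errors= mount option, and a shut-down
filesystem is recognisable by EIO/EUCLEAN coming back from ordinary I/O):

/*
 * Minimal sketch of both behaviours, with made-up device and mount
 * point names.  Needs root; error handling kept short on purpose.
 */
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/mount.h>

int main(void)
{
    /* ext4's policy on metadata errors is a mount option:
     * errors=remount-ro, errors=continue or errors=panic. */
    if (mount("/dev/sdX1", "/srv/export", "ext4", 0,
              "errors=remount-ro") < 0) {
        fprintf(stderr, "mount: %s\n", strerror(errno));
        return 1;
    }

    int fd = open("/srv/export/file", O_WRONLY | O_CREAT, 0644);
    if (fd < 0)
        return 1;

    if (write(fd, "x", 1) < 0 && (errno == EIO || errno == EUCLEAN)) {
        /* A shut-down XFS filesystem answers everything this way;
         * the only way forward is unmount and administrator repair. */
        fprintf(stderr, "filesystem appears to be shut down\n");
    }
    close(fd);
    return 0;
}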

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [LSF/MM TOPIC] linux servers as a storage server - what's missing?
  2012-01-26 22:24                             ` Dave Chinner
@ 2012-01-26 22:29                               ` Ric Wheeler
  0 siblings, 0 replies; 50+ messages in thread
From: Ric Wheeler @ 2012-01-26 22:29 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Loke, Chetan, Tom Coughlan, Hannes Reinecke, tasleson,
	Shyam_Iyer, vgoyal, linux-fsdevel, linux-scsi

On 01/26/2012 05:24 PM, Dave Chinner wrote:
> On Tue, Jan 24, 2012 at 12:05:23PM -0500, Loke, Chetan wrote:
>>>> But writes that have to fetch the non-cached data, will unnecessarily
>>>> issue I/O to the fabric. These orphaned I/O's cause more pain in the
>>>> cleanup.
>>>> And if caching is enabled on the front-side then it's all the more
>>>> painful.
>>>>
>>>> We can go one extra step and make FS fail read I/O for non-cached data
>>>> too to avoid more orphan IOs.
>>> I don't really see this as a useful state. Read-only without a real
>>> backing file
>>> system or LUN is hit or miss, that file system should go offline :)
>>>
>> Last year when I checked, I forget but I think xfs(or ext4) was going
>> read-only.
>> If this is still the case then you are basically asking FS's to modify
>> that behavior.
> ext4 goes read-only on error by default. I think that behaviour can
> be changed by a mount option.
>
> XFS shuts the filesystem down (takes it offline) preventing all
> dirty data and metadata from being issued to disk, aborts any
> journal IO that might be going on, etc. It then returns fatal errors
> (EIO or EUCLEAN "structure needs cleaning" errors) to all callers
> that attempt to write or modify the filesystem, and EIO to all
> attempts to read from it.
>
> IOWs, the XFS filesystem is *gone* once it shuts down and requires
> administrator intervention to get it back to a working state....
>
> Cheers,
>
> Dave.

If it isn't obvious, I like the XFS approach :)

Ric


^ permalink raw reply	[flat|nested] 50+ messages in thread

end of thread, other threads:[~2012-01-26 22:29 UTC | newest]

Thread overview: 50+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-12-21 15:59 [LSF/MM TOPIC] linux servers as a storage server - what's missing? Ric Wheeler
2011-12-22  8:14 ` Shyam_Iyer
2011-12-22  8:14   ` Shyam_Iyer
2011-12-22 15:58   ` Vivek Goyal
2011-12-22 20:54     ` Shyam_Iyer
2011-12-22 20:54       ` Shyam_Iyer
2011-12-23  3:06       ` Vivek Goyal
2011-12-23  4:35         ` Shyam_Iyer
2011-12-23  4:35           ` Shyam_Iyer
2012-01-09 12:18       ` Hannes Reinecke
2012-01-09 12:59         ` Tom Coughlan
2012-01-10  6:53           ` Ric Wheeler
2012-01-20  8:55             ` Hannes Reinecke
2012-01-19 16:17           ` [LSF/MM TOPIC] linux servers as a storage server - what's missing? Loke, Chetan
2012-01-19 16:17             ` Loke, Chetan
2012-01-19 16:19             ` Ric Wheeler
2012-01-19 16:26               ` Loke, Chetan
2012-01-19 16:26                 ` Loke, Chetan
2012-01-19 16:29                 ` Ric Wheeler
2012-01-19 17:32                   ` Loke, Chetan
2012-01-19 17:32                     ` Loke, Chetan
2012-01-19 17:44                     ` Ric Wheeler
2012-01-19 21:30                       ` Loke, Chetan
2012-01-19 21:30                         ` Loke, Chetan
2012-01-19 21:39                         ` Ric Wheeler
2012-01-24 17:05                           ` Loke, Chetan
2012-01-24 17:05                             ` Loke, Chetan
2012-01-24 18:13                             ` Ric Wheeler
2012-01-26 22:24                             ` Dave Chinner
2012-01-26 22:29                               ` Ric Wheeler
2012-01-03 19:26 ` [LSF/MM TOPIC][ATTEND] linux servers as a storage server - what's missing? Jeff Layton
2012-01-03 19:32   ` Chuck Lever
2012-01-17 21:16     ` J. Bruce Fields
2012-01-17 21:25       ` Chuck Lever
2012-01-24 21:36   ` J. Bruce Fields
2012-01-24 23:13     ` Ric Wheeler
2012-01-25 19:05       ` Christopher R. Hertel
2012-01-25 20:25       ` Christopher R. Hertel
2012-01-25 21:56         ` Roland Dreier
2012-01-25 22:09           ` Christopher R. Hertel
2012-01-26 21:52             ` Andy Grover
2012-01-26 11:15         ` Bart Van Assche
2012-01-18 17:00 ` [LSF/MM TOPIC] " Roland Dreier
2012-01-18 17:51   ` Ric Wheeler
2012-01-18 18:46     ` Roland Dreier
2012-01-18 18:51       ` Bart Van Assche
2012-01-18 19:00         ` Roland Dreier
2012-01-19  8:16         ` Rolf Eike Beer
2012-01-19 17:50       ` Loke, Chetan
2012-01-19 17:50         ` Loke, Chetan
