* [LSF/MM TOPIC] linux servers as a storage server - what's missing? @ 2011-12-21 15:59 Ric Wheeler 2011-12-22 8:14 ` Shyam_Iyer ` (2 more replies) 0 siblings, 3 replies; 41+ messages in thread From: Ric Wheeler @ 2011-12-21 15:59 UTC (permalink / raw) To: linux-fsdevel, linux-scsi One common thing that I see a lot of these days is an increasing number of platforms that are built on our stack as storage servers, ranging from the common linux-based storage/NAS devices up to various distributed systems. Almost all of them use our common stack - software RAID, LVM, XFS/ext4 and samba. At last year's SNIA developers conference, it was clear that Microsoft is putting a lot of effort into enhancing windows 8 server as a storage server, with both support for a pNFS server and of course SMB. I think that linux (+samba) is ahead of the windows-based storage appliances today, but they are putting together a very aggressive list of features. I think that it would be useful and interesting to take a slot at this year's LSF to see how we are doing in this space. How large do we need to scale for an appliance? What kind of work is needed (support for the copy offload system call? better support for out-of-band notifications like those used in "thinly provisioned" SCSI devices? management APIs? ease-of-use CLI work? SMB2.2 support?). The goal would be to see what technical gaps we have that need more active development, not just a wish list :) Ric ^ permalink raw reply [flat|nested] 41+ messages in thread
* RE: [LSF/MM TOPIC] linux servers as a storage server - what's missing? 2011-12-21 15:59 [LSF/MM TOPIC] linux servers as a storage server - what's missing? Ric Wheeler @ 2011-12-22 8:14 ` Shyam_Iyer 2011-12-22 15:58 ` Vivek Goyal 2012-01-03 19:26 ` [LSF/MM TOPIC][ATTEND] linux servers as a storage server - what's missing? Jeff Layton 2012-01-18 17:00 ` [LSF/MM TOPIC] " Roland Dreier 2 siblings, 1 reply; 41+ messages in thread From: Shyam_Iyer @ 2011-12-22 8:14 UTC (permalink / raw) To: rwheeler, linux-fsdevel, linux-scsi > -----Original Message----- > From: linux-scsi-owner@vger.kernel.org [mailto:linux-scsi- > owner@vger.kernel.org] On Behalf Of Ric Wheeler > Sent: Wednesday, December 21, 2011 11:00 AM > To: linux-fsdevel@vger.kernel.org; linux-scsi@vger.kernel.org > Subject: [LSF/MM TOPIC] linux servers as a storage server - what's > missing? > > > One common thing that I see a lot of these days is an increasing number > of > platforms that are built on our stack as storage servers. Ranging from > the > common linux based storage/NAS devices up to various distributed > systems. > Almost all of them use our common stack - software RAID, LVM, XFS/ext4 > and samba. > > At last year's SNIA developers conference, it was clear that Microsoft > is > putting a lot of effort into enhancing windows 8 server as a storage > server with > both support for a pNFS server and of course SMB. I think that linux > (+samba) is > ahead of the windows based storage appliances today, but they are > putting > together a very aggressive list of features. > > I think that it would be useful and interesting to take a slot at this > year's > LSF to see how we are doing in this space. How large do we need to > scale for an > appliance? What kind of work is needed (support for the copy offload > system > call? better support for out of band notifications like those used in > "thinly > provisioned" SCSI devices? management API's? Ease of use CLI work? > SMB2.2 support?). 
> > The goal would be to see what technical gaps we have that need more > active > development in, not just a wish list :) > > Ric Working for a company that works with different OS vendors, I get involved in such discussions about what linux offers and what it doesn't, and where the gaps are, both at the code level and in customer usage patterns. A few things that stand out: - Management models vs. performance models. I tend to think that we (linux folks) focus on the performance paradigm in the kernel and leave the management paradigms for the big vendors to play with, leaving a lot of inconsistency in how sysadmins manage storage. The analogy is a traffic scenario with rules vs. one without. Traffic without rules generally lets a skilled expert driver navigate swiftly and reach the destination much faster than the others, but at the same time leaves the non-driving passenger with a bad feeling in the stomach. In the case of linux, the customer is that non-driving passenger. For example: if someone wanted to build a decent setup that uses a clustered framework with nfs/pnfs on an iSCSI storage backend supporting copy offload, while also managing backup, all you would end up with is a set of separate management windows for setting up this whole framework, unless you are a vendor willing to earn some extra brownie points from the customer by writing the whole thing up and packaging it into a framework. And if a feature like copy offload is not implemented in a particular filesystem/kernel subsystem, it needs a lot of synchronization, which means the feature generally takes a long time to evolve. The kernel feature is usually implemented with performance in mind, but management of the feature is usually left to the user. In this context "vendor" includes OS distributions and stakeholder storage companies. If I flip this over to what other OSes offer:
1) A consistent clustered filesystem that supports performance-oriented features like copy offload and optimization features like thin provisioning 2) A management API for things like thin provisioning, with well-documented hooks for writing a vendor-specific plugin 3) GUI/CLI support 4) Backup management/API with hooks for vendor plugins Usually all of this is within a common framework or single management window... providing a consistent view. Simple asks - 1) Provide a consistent storage and fs management library that discourages folks from writing their own userspace storage library. Include things like fs formatting (fs profiles), transport configuration (e.g. iscsiadm as a library), thin provisioning watermarks, cluster management, APIs for cgroups, etc. The library should provide a clean set of rules/interfaces to build management apps against. Think of the Android marketplace providing a well-defined framework for app writers. Let the distributions/storage companies write their own cool apps on this framework. 2) View implementations like copy offload, thin provisioning, snapshots, and watermarks in the kernel in conjunction with this storage library. So a use case has to be discussed for inclusion in this library before working on it in the kernel 3) And this may sound controversial, but in spite of being a long-time linux fan, user and observer, I would say: provide hooks for folks to write clean plugins that let them protect their proprietary work by allowing them to bundle binary blobs. Usually folks want to keep proprietary plugins in this area because - a) No other storage vendor provides an open source plugin. So if you are a storage vendor listening, this might be your cue to start the avalanche b) They are in an IP protection agreement with another OS vendor c) A startup protecting its IP The benefits of open sourcing are usually realized when maintaining code..
:-) not when pitching it against the simpler management frameworks offered by other OS vendors, who are able to offer the feature precisely because vendors mutually want to keep it proprietary. (The last one is my personal opinion and not that of my employer, an increasingly storage-focused company.) /me fully expects brickbats, but then, as they say where I come from: a fool can always try his luck a few times and get wise in the process.. :-) ^ permalink raw reply [flat|nested] 41+ messages in thread
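The plugin-framework ask above (a common library that vendors extend, open source or binary-only) can be sketched in a few lines. This is purely hypothetical: none of these class or function names exist in any real library, and the "example-array" plugin is invented for illustration.

```python
# Hypothetical sketch of a vendor-plugin registry for a unified storage
# management library; all names here are illustrative, not a real API.

class StoragePlugin:
    """Base class a vendor plugin (open source or binary blob) would subclass."""
    name = "generic"

    def capabilities(self):
        # e.g. {"thin_provisioning", "copy_offload", "snapshots"}
        return set()

    def create_fs(self, device, profile):
        raise NotImplementedError


_plugins = {}


def register_plugin(plugin):
    """Register a plugin instance; a real library might load these dynamically."""
    _plugins[plugin.name] = plugin


def find_plugin(feature):
    """Return a registered plugin advertising a feature, or None."""
    for plugin in _plugins.values():
        if feature in plugin.capabilities():
            return plugin
    return None


class ExampleArrayPlugin(StoragePlugin):
    """Illustrative plugin for a fictional storage array."""
    name = "example-array"

    def capabilities(self):
        return {"thin_provisioning", "snapshots"}


register_plugin(ExampleArrayPlugin())
```

With a registry like this, a management app asks for a capability rather than hard-coding a vendor tool, which is the "Android marketplace" shape of the proposal.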
* Re: [LSF/MM TOPIC] linux servers as a storage server - what's missing? 2011-12-22 8:14 ` Shyam_Iyer @ 2011-12-22 15:58 ` Vivek Goyal 2011-12-22 20:54 ` Shyam_Iyer 0 siblings, 1 reply; 41+ messages in thread From: Vivek Goyal @ 2011-12-22 15:58 UTC (permalink / raw) To: Shyam_Iyer; +Cc: rwheeler, linux-fsdevel, linux-scsi On Thu, Dec 22, 2011 at 01:44:16PM +0530, Shyam_Iyer@Dell.com wrote: [..] > Simple asks - > 1) Provide a consistent storage and fs management library that discourages folks to write their own usespace storage library. Include things like fs formatting(fs profiles), transport configuration(eg: iscsiadm as a library), thin provisioning watermarks, cluster management, apis for cgroups etc. ^^^^^^^^^^^^^^^^ For cgroups, we have the libcgroup library. Not many people like to use it, though, as cgroup is exported as a filesystem and they prefer to use the normal libc API to traverse and configure cgroups (instead of going through another library). Some examples include libvirt and systemd. Thanks Vivek ^ permalink raw reply [flat|nested] 41+ messages in thread
* RE: [LSF/MM TOPIC] linux servers as a storage server - what's missing? 2011-12-22 15:58 ` Vivek Goyal @ 2011-12-22 20:54 ` Shyam_Iyer 2011-12-23 3:06 ` Vivek Goyal 2012-01-09 12:18 ` Hannes Reinecke 0 siblings, 2 replies; 41+ messages in thread From: Shyam_Iyer @ 2011-12-22 20:54 UTC (permalink / raw) To: vgoyal; +Cc: rwheeler, linux-fsdevel, linux-scsi > -----Original Message----- > From: linux-scsi-owner@vger.kernel.org [mailto:linux-scsi- > owner@vger.kernel.org] On Behalf Of Vivek Goyal > Sent: Thursday, December 22, 2011 10:59 AM > To: Iyer, Shyam > Cc: rwheeler@redhat.com; linux-fsdevel@vger.kernel.org; linux- > scsi@vger.kernel.org > Subject: Re: [LSF/MM TOPIC] linux servers as a storage server - what's > missing? > > On Thu, Dec 22, 2011 at 01:44:16PM +0530, Shyam_Iyer@Dell.com wrote: > > [..] > > > Simple asks - > > 1) Provide a consistent storage and fs management library that > discourages folks to write their own usespace storage library. Include > things like fs formatting(fs profiles), transport configuration(eg: > iscsiadm as a library), thin provisioning watermarks, cluster > management, apis for cgroups etc. > ^^^^^^^^^^^^^^^^ > For cgroups, we have libcgroup library. Not many people like to use it > though as cgroup is exported as a filesystem and they prefer to use > normal > libc api to traverse and configure cgroups (Instead of going through > another library). Some examples include libvrit, systemd. > > Thanks > Vivek Well, honestly, I think that is a libvirt/systemd issue, and libvirt also invokes things like iscsiadm, dcb etc. as binaries :-/ Someone could always use qemu command lines to invoke KVM/Xen, but libvirt has saved me many a day by letting me do a quick operation without puzzling over a qemu command line. I am also asking for ideas on how to avoid this fragmentation, because just like libvirt, others are also encouraged to do their own libc thing in the absence of a common storage management framework.
Does the standard interface for linux end at the user/kernel boundary or the user/libc boundary? Either way, I feel we will continue to lag behind other OSes in features because of that model. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [LSF/MM TOPIC] linux servers as a storage server - what's missing? 2011-12-22 20:54 ` Shyam_Iyer @ 2011-12-23 3:06 ` Vivek Goyal 2011-12-23 4:35 ` Shyam_Iyer 2012-01-09 12:18 ` Hannes Reinecke 1 sibling, 1 reply; 41+ messages in thread From: Vivek Goyal @ 2011-12-23 3:06 UTC (permalink / raw) To: Shyam_Iyer; +Cc: rwheeler, linux-fsdevel, linux-scsi On Fri, Dec 23, 2011 at 02:24:42AM +0530, Shyam_Iyer@Dell.com wrote: > > > > -----Original Message----- > > From: linux-scsi-owner@vger.kernel.org [mailto:linux-scsi- > > owner@vger.kernel.org] On Behalf Of Vivek Goyal > > Sent: Thursday, December 22, 2011 10:59 AM > > To: Iyer, Shyam > > Cc: rwheeler@redhat.com; linux-fsdevel@vger.kernel.org; linux- > > scsi@vger.kernel.org > > Subject: Re: [LSF/MM TOPIC] linux servers as a storage server - what's > > missing? > > > > On Thu, Dec 22, 2011 at 01:44:16PM +0530, Shyam_Iyer@Dell.com wrote: > > > > [..] > > > > > Simple asks - > > > 1) Provide a consistent storage and fs management library that > > discourages folks to write their own usespace storage library. Include > > things like fs formatting(fs profiles), transport configuration(eg: > > iscsiadm as a library), thin provisioning watermarks, cluster > > management, apis for cgroups etc. > > ^^^^^^^^^^^^^^^^ > > For cgroups, we have libcgroup library. Not many people like to use it > > though as cgroup is exported as a filesystem and they prefer to use > > normal > > libc api to traverse and configure cgroups (Instead of going through > > another library). Some examples include libvrit, systemd. > > > > Thanks > > Vivek > > Well honestly I think that is a libvirt/systemd issue and libvirt also invokes things like iscsiadm, dcb etc as a binary :-/ > > Some one could always use qemu command lines to invoke KVM/XEN but libvirt has saved me one too many days in doing a quick operation without wondering about a qemu commandline. 
> > I am also asking for ideas on how to avoid this fragmentation because just like libvirt others are also encouraged to do their own libc thing in the absence of a common storage management framework.. > > Does the standard interface for linux end at the user/kernel boundary or the user/libc boundary? If so I feel we would continue to lag behind other OSes in features because of the model. This is true only for IO cgroup management, and there is not much to be done there: for basic management, an application can just write 500 lines of code and be done with it. libcgroup does offer a bunch of command-line operations too. Do you have something in mind? What do applications expect out of an IO cgroup library, and what are other OSes supporting? Don't extend this libc argument to iscsi and other storage management requirements. Thanks Vivek ^ permalink raw reply [flat|nested] 41+ messages in thread
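Vivek's point that basic cgroup management needs only plain file operations (no extra library) can be sketched in a handful of lines: a cgroup is configured by writing values into control files under the cgroup mount. The sketch below takes the mount point as a parameter so it can run against any directory; `blkio.weight` is a real cgroup-v1 knob, but nothing here touches a live cgroup hierarchy.

```python
# Sketch of the "just use the filesystem" style of cgroup management.
# Against a stand-in directory rather than a real cgroup mount.
import os


def set_cgroup_knob(mount, group, knob, value):
    """Create the group directory if needed and write one control value.

    On a real system, creating the group directory under the cgroup
    mount is what creates the cgroup, and its control files appear
    automatically; here we simply create the file.
    """
    group_dir = os.path.join(mount, group)
    os.makedirs(group_dir, exist_ok=True)
    path = os.path.join(group_dir, knob)
    with open(path, "w") as f:
        f.write(str(value))
    return path
```

Against a real cgroup-v1 mount this would look like `set_cgroup_knob("/cgroup", "blkio/mygroup", "blkio.weight", 500)` run as root, which is roughly the "500 lines of code" class of tooling Vivek describes.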
* RE: [LSF/MM TOPIC] linux servers as a storage server - what's missing? 2011-12-23 3:06 ` Vivek Goyal @ 2011-12-23 4:35 ` Shyam_Iyer 0 siblings, 0 replies; 41+ messages in thread From: Shyam_Iyer @ 2011-12-23 4:35 UTC (permalink / raw) To: vgoyal; +Cc: rwheeler, linux-fsdevel, linux-scsi > -----Original Message----- > From: Vivek Goyal [mailto:vgoyal@redhat.com] > Sent: Thursday, December 22, 2011 10:07 PM > To: Iyer, Shyam > Cc: rwheeler@redhat.com; linux-fsdevel@vger.kernel.org; linux- > scsi@vger.kernel.org > Subject: Re: [LSF/MM TOPIC] linux servers as a storage server - what's > missing? > > On Fri, Dec 23, 2011 at 02:24:42AM +0530, Shyam_Iyer@Dell.com wrote: > > > > > > > -----Original Message----- > > > From: linux-scsi-owner@vger.kernel.org [mailto:linux-scsi- > > > owner@vger.kernel.org] On Behalf Of Vivek Goyal > > > Sent: Thursday, December 22, 2011 10:59 AM > > > To: Iyer, Shyam > > > Cc: rwheeler@redhat.com; linux-fsdevel@vger.kernel.org; linux- > > > scsi@vger.kernel.org > > > Subject: Re: [LSF/MM TOPIC] linux servers as a storage server - > what's > > > missing? > > > > > > On Thu, Dec 22, 2011 at 01:44:16PM +0530, Shyam_Iyer@Dell.com > wrote: > > > > > > [..] > > > > > > > Simple asks - > > > > 1) Provide a consistent storage and fs management library that > > > discourages folks to write their own usespace storage library. > Include > > > things like fs formatting(fs profiles), transport configuration(eg: > > > iscsiadm as a library), thin provisioning watermarks, cluster > > > management, apis for cgroups etc. > > > ^^^^^^^^^^^^^^^^ > > > For cgroups, we have libcgroup library. Not many people like to use > it > > > though as cgroup is exported as a filesystem and they prefer to use > > > normal > > > libc api to traverse and configure cgroups (Instead of going > through > > > another library). Some examples include libvrit, systemd. 
> > > > > > Thanks > > > Vivek > > > > Well honestly I think that is a libvirt/systemd issue and libvirt > also invokes things like iscsiadm, dcb etc as a binary :-/ > > > > Some one could always use qemu command lines to invoke KVM/XEN but > libvirt has saved me one too many days in doing a quick operation > without wondering about a qemu commandline. > > > > I am also asking for ideas on how to avoid this fragmentation because > just like libvirt others are also encouraged to do their own libc thing > in the absence of a common storage management framework.. > > > > Does the standard interface for linux end at the user/kernel boundary > or the user/libc boundary? If so I feel we would continue to lag behind > other OSes in features because of the model. > > This is true only for IO cgroup management. There is not much to be > done. For > basic management, an applicatoin can just write 500 lines of code and > be > done with it. > > libcgroup does offer bunch of commnad lines operations too. > > Do you have something in mind, what applications expect out of a IO > cgroup > library and what other OSes are supporting. Don't extend this libc > thing > to iscsi, and other storage management requirements. Sorry Vivek, but that is just one of the points in my original post. I am providing points on how to improve linux as a storage server, and therefore I don't want to restrict the discussion to the io cgroup alone. The problem is the lack of a framework that looks somewhat like this (I hope the formatting is preserved):

  FS management   IO cgroup mgmt   Monitoring APIs        HBA management         Thin provisioning
  fs_create()     bw               log_dump               HBA APIs               watermark
  snap_create()   ...              scsi_log, fs_log       hba_create             (high, ...)
                                   io_log                 iscsi_session_create
                                   alarms/notifications   fc_login

  -------------------------------------- Storage API --------------------------------------

Drilling down on FS management:

  VFS
    fs_clone
    copy offload
    ------------------
    ext4 | btrfs | nfs

For example: if ext4 did not support discard, it would return an error from one of the fs APIs. Unless you have a unified interface, there won't be uniformity of features, or the coordination needed to make linux a complete storage server with apps doing things at the higher layers. Today, whether xfs supports copy offload is an intuitive guess on the part of the app running on top of it, and if you want to create thin provisioning watermarks, snapshot management is possibly a different tool. I think the reason libvirt gained so many features so quickly is the modular architecture and clear interface it provides for creating and managing virtual machines. If I had to write a virtualization management tool today with libvirt as the backend for VM management, it is really just a question of whether the API support is there or not. So, in that sense, linux has matured as a virtualization server. I don't think I can say the same for storage. While we are talking about storage servers, it is important to note whether we mean servers with local storage or with attached remote storage, which is why I added transport management APIs to the framework along with the HBA management APIs. This is why I think storage management deserves a better interface than bare libc/ioctl calls/sysfs operations. On your question of what applications would like out of io cgroups: I think that is a different discussion ;-) but nevertheless, here is something I was thinking of. APIs around the following (there can be more):
- io timeouts (driver/controller, scsi, blk, fs, etc.)
- io scheduler tunings: io_delay, read_ahead
- bw profile
- IOPS stats
- io resource monitoring, e.g. spindle movement per application run; this will help with data placement

For a storage server, folks do a lot of tuning to qualify it. Having APIs to manage these and auto-calibrate them against the application's performance load would be awesome. ^ permalink raw reply [flat|nested] 41+ messages in thread
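The unified-interface argument above (features are discoverable and unsupported ones fail uniformly, instead of each app guessing per filesystem) can be sketched as a capability table plus a single dispatch point. Everything here is hypothetical: the function names are invented, and the feature table is illustrative, not an accurate survey of what ext4/xfs/nfs supported at the time.

```python
# Hedged sketch of a unified fs-feature API. A real library would probe
# these capabilities at runtime rather than hard-code them.

class FsError(Exception):
    """Uniform error type for unsupported operations."""


# Hypothetical capability table, for illustration only.
FS_FEATURES = {
    "ext4": {"discard"},
    "xfs": {"discard", "copy_offload"},
    "nfs": set(),
}


def fs_feature_call(fstype, feature):
    """Dispatch a feature operation, or raise a uniform error.

    This is the behavior argued for above: "if ext4 did not support
    discard it would return an error with one of the fs apis."
    """
    if feature not in FS_FEATURES.get(fstype, set()):
        raise FsError("%s: %s not supported" % (fstype, feature))
    return "%s:%s:ok" % (fstype, feature)
```

An app built on such an API can fall back gracefully (catch `FsError`, pick another backend) instead of shipping per-filesystem special cases.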
* Re: [LSF/MM TOPIC] linux servers as a storage server - what's missing? 2011-12-22 20:54 ` Shyam_Iyer 2011-12-23 3:06 ` Vivek Goyal @ 2012-01-09 12:18 ` Hannes Reinecke 2012-01-09 12:59 ` Tom Coughlan 1 sibling, 1 reply; 41+ messages in thread From: Hannes Reinecke @ 2012-01-09 12:18 UTC (permalink / raw) To: Shyam_Iyer; +Cc: vgoyal, rwheeler, linux-fsdevel, linux-scsi On 12/22/2011 09:54 PM, Shyam_Iyer@Dell.com wrote: > > >> -----Original Message----- >> From: linux-scsi-owner@vger.kernel.org [mailto:linux-scsi- >> owner@vger.kernel.org] On Behalf Of Vivek Goyal >> Sent: Thursday, December 22, 2011 10:59 AM >> To: Iyer, Shyam >> Cc: rwheeler@redhat.com; linux-fsdevel@vger.kernel.org; linux- >> scsi@vger.kernel.org >> Subject: Re: [LSF/MM TOPIC] linux servers as a storage server - what's >> missing? >> >> On Thu, Dec 22, 2011 at 01:44:16PM +0530, Shyam_Iyer@Dell.com wrote: >> >> [..] >> >>> Simple asks - >>> 1) Provide a consistent storage and fs management library that >> discourages folks to write their own usespace storage library. Include >> things like fs formatting(fs profiles), transport configuration(eg: >> iscsiadm as a library), thin provisioning watermarks, cluster >> management, apis for cgroups etc. >> ^^^^^^^^^^^^^^^^ >> For cgroups, we have libcgroup library. Not many people like to use it >> though as cgroup is exported as a filesystem and they prefer to use >> normal >> libc api to traverse and configure cgroups (Instead of going through >> another library). Some examples include libvrit, systemd. >> >> Thanks >> Vivek > > Well honestly I think that is a libvirt/systemd issue and libvirt also > invokes things like iscsiadm, dcb etc as a binary :-/ > > Some one could always use qemu command lines to invoke KVM/XEN but > libvirt has saved me one too many days in doing a quick operation > without wondering about a qemu commandline. 
> > I am also asking for ideas on how to avoid this fragmentation because > just like libvirt others are also encouraged to do their own libc thing > in the absence of a common storage management framework.. > > Does the standard interface for linux end at the user/kernel boundary > or the user/libc boundary? If so I feel we would continue to lag behind > other OSes in features because of the model. > StorageAPI _again_. I was under the impression RH had someone working on it. (Actually I was trying to give it a go, but then got buried under customer escalations). So yes, we know there is a shortcoming. And yes, we should improve things. But I feel another discussion about this will only give us more insight without actually moving things forward. What about having a separate session at the storage summit (or even at the collab summit) to hammer out the requirements here? Cheers, Hannes -- Dr. Hannes Reinecke zSeries & Storage hare@suse.de +49 911 74053 688 SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg) -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [LSF/MM TOPIC] linux servers as a storage server - what's missing? 2012-01-09 12:18 ` Hannes Reinecke @ 2012-01-09 12:59 ` Tom Coughlan 2012-01-10 6:53 ` Ric Wheeler 2012-01-19 16:17 ` [LSF/MM TOPIC] linux servers as a storage server - what'smissing? Loke, Chetan 0 siblings, 2 replies; 41+ messages in thread From: Tom Coughlan @ 2012-01-09 12:59 UTC (permalink / raw) To: Hannes Reinecke, tasleson Cc: Shyam_Iyer, vgoyal, rwheeler, linux-fsdevel, linux-scsi On Mon, 2012-01-09 at 13:18 +0100, Hannes Reinecke wrote: > On 12/22/2011 09:54 PM, Shyam_Iyer@Dell.com wrote: > > > > > >> -----Original Message----- > >> From: linux-scsi-owner@vger.kernel.org [mailto:linux-scsi- > >> owner@vger.kernel.org] On Behalf Of Vivek Goyal > >> Sent: Thursday, December 22, 2011 10:59 AM > >> To: Iyer, Shyam > >> Cc: rwheeler@redhat.com; linux-fsdevel@vger.kernel.org; linux- > >> scsi@vger.kernel.org > >> Subject: Re: [LSF/MM TOPIC] linux servers as a storage server - what's > >> missing? > >> > >> On Thu, Dec 22, 2011 at 01:44:16PM +0530, Shyam_Iyer@Dell.com wrote: > >> > >> [..] > >> > >>> Simple asks - > >>> 1) Provide a consistent storage and fs management library that > >> discourages folks to write their own usespace storage library. Include > >> things like fs formatting(fs profiles), transport configuration(eg: > >> iscsiadm as a library), thin provisioning watermarks, cluster > >> management, apis for cgroups etc. > >> ^^^^^^^^^^^^^^^^ > >> For cgroups, we have libcgroup library. Not many people like to use it > >> though as cgroup is exported as a filesystem and they prefer to use > >> normal > >> libc api to traverse and configure cgroups (Instead of going through > >> another library). Some examples include libvrit, systemd. 
> >> > >> Thanks > >> Vivek > > > > Well honestly I think that is a libvirt/systemd issue and libvirt also > > invokes things like iscsiadm, dcb etc as a binary :-/ > > > > Some one could always use qemu command lines to invoke KVM/XEN but > > libvirt has saved me one too many days in doing a quick operation > > without wondering about a qemu commandline. > > > > I am also asking for ideas on how to avoid this fragmentation because > > just like libvirt others are also encouraged to do their own libc > thing > > in the absence of a common storage management framework.. > > > > Does the standard interface for linux end at the user/kernel boundary > > or the user/libc boundary? If so I feel we would continue to lag > behind > > other OSes in features because of the model. > > > StorageAPI _again_. > > I was under the impression RH had someone working on it. Yes, Red Hat does. Tony Asleson. libStorageMgmt: http://sourceforge.net/apps/trac/libstoragemgmt The current focus is on managing external storage (SMI-S, etc.). This focus can be expanded over time. Contributions welcome. > (Actually I was trying to give it a go, but then got buried under > customer escalations). > > So yes, we know there is a shortcoming. > And yes, we should improve things. > > But I feel another discussion about this will only give us more > insight, but not moving things forward. > > What about having a separate session at the storage summit (or even > at the collab summit) to hammer out the requirements here? That would be fine, although as you say, we need more than talk. Tom ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [LSF/MM TOPIC] linux servers as a storage server - what's missing? 2012-01-09 12:59 ` Tom Coughlan @ 2012-01-10 6:53 ` Ric Wheeler 2012-01-20 8:55 ` Hannes Reinecke 2012-01-19 16:17 ` [LSF/MM TOPIC] linux servers as a storage server - what'smissing? Loke, Chetan 1 sibling, 1 reply; 41+ messages in thread From: Ric Wheeler @ 2012-01-10 6:53 UTC (permalink / raw) To: Tom Coughlan Cc: Hannes Reinecke, tasleson, Shyam_Iyer, vgoyal, linux-fsdevel, linux-scsi On 01/09/2012 02:59 PM, Tom Coughlan wrote: > On Mon, 2012-01-09 at 13:18 +0100, Hannes Reinecke wrote: >> On 12/22/2011 09:54 PM, Shyam_Iyer@Dell.com wrote: >>> >>>> -----Original Message----- >>>> From: linux-scsi-owner@vger.kernel.org [mailto:linux-scsi- >>>> owner@vger.kernel.org] On Behalf Of Vivek Goyal >>>> Sent: Thursday, December 22, 2011 10:59 AM >>>> To: Iyer, Shyam >>>> Cc: rwheeler@redhat.com; linux-fsdevel@vger.kernel.org; linux- >>>> scsi@vger.kernel.org >>>> Subject: Re: [LSF/MM TOPIC] linux servers as a storage server - what's >>>> missing? >>>> >>>> On Thu, Dec 22, 2011 at 01:44:16PM +0530, Shyam_Iyer@Dell.com wrote: >>>> >>>> [..] >>>> >>>>> Simple asks - >>>>> 1) Provide a consistent storage and fs management library that >>>> discourages folks to write their own usespace storage library. Include >>>> things like fs formatting(fs profiles), transport configuration(eg: >>>> iscsiadm as a library), thin provisioning watermarks, cluster >>>> management, apis for cgroups etc. >>>> ^^^^^^^^^^^^^^^^ >>>> For cgroups, we have libcgroup library. Not many people like to use it >>>> though as cgroup is exported as a filesystem and they prefer to use >>>> normal >>>> libc api to traverse and configure cgroups (Instead of going through >>>> another library). Some examples include libvrit, systemd. 
>>>> >>>> Thanks >>>> Vivek >>> Well honestly I think that is a libvirt/systemd issue and libvirt also >>> invokes things like iscsiadm, dcb etc as a binary :-/ >>> >>> Some one could always use qemu command lines to invoke KVM/XEN but >>> libvirt has saved me one too many days in doing a quick operation >>> without wondering about a qemu commandline. >>> >>> I am also asking for ideas on how to avoid this fragmentation because >>> just like libvirt others are also encouraged to do their own libc >> thing >>> in the absence of a common storage management framework.. >>> >>> Does the standard interface for linux end at the user/kernel boundary >>> or the user/libc boundary? If so I feel we would continue to lag >> behind >>> other OSes in features because of the model. >>> >> StorageAPI _again_. >> >> I was under the impression RH had someone working on it. > Yes, Red Hat does. Tony Asleson. libStorageMgmt: > > http://sourceforge.net/apps/trac/libstoragemgmt > > The current focus is on managing external storage (SMI-S, etc.). This > focus can be expanded over time. Contributions welcome. > >> (Actually I was trying to give it a go, but then got buried under >> customer escalations). >> >> So yes, we know there is a shortcoming. >> And yes, we should improve things. >> >> But I feel another discussion about this will only give us more >> insight, but not moving things forward. >> >> What about having a separate session at the storage summit (or even >> at the collab summit) to hammer out the requirements here? > That would be fine, although as you say, we need more than talk. > > Tom > Having a special session would be really a good idea - given the size of the discussion, we might want to do both a talk and a breakout at collab summit... ric ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [LSF/MM TOPIC] linux servers as a storage server - what's missing? 2012-01-10 6:53 ` Ric Wheeler @ 2012-01-20 8:55 ` Hannes Reinecke 0 siblings, 0 replies; 41+ messages in thread From: Hannes Reinecke @ 2012-01-20 8:55 UTC (permalink / raw) To: Ric Wheeler Cc: Tom Coughlan, tasleson, Shyam_Iyer, vgoyal, linux-fsdevel, linux-scsi On 01/10/2012 07:53 AM, Ric Wheeler wrote: > On 01/09/2012 02:59 PM, Tom Coughlan wrote: >> On Mon, 2012-01-09 at 13:18 +0100, Hannes Reinecke wrote: >>> On 12/22/2011 09:54 PM, Shyam_Iyer@Dell.com wrote: >>>> >>>>> -----Original Message----- >>>>> From: linux-scsi-owner@vger.kernel.org [mailto:linux-scsi- >>>>> owner@vger.kernel.org] On Behalf Of Vivek Goyal >>>>> Sent: Thursday, December 22, 2011 10:59 AM >>>>> To: Iyer, Shyam >>>>> Cc: rwheeler@redhat.com; linux-fsdevel@vger.kernel.org; linux- >>>>> scsi@vger.kernel.org >>>>> Subject: Re: [LSF/MM TOPIC] linux servers as a storage server - >>>>> what's >>>>> missing? >>>>> >>>>> On Thu, Dec 22, 2011 at 01:44:16PM +0530, Shyam_Iyer@Dell.com >>>>> wrote: >>>>> >>>>> [..] >>>>> >>>>>> Simple asks - >>>>>> 1) Provide a consistent storage and fs management library that >>>>> discourages folks to write their own usespace storage library. >>>>> Include >>>>> things like fs formatting(fs profiles), transport >>>>> configuration(eg: >>>>> iscsiadm as a library), thin provisioning watermarks, cluster >>>>> management, apis for cgroups etc. >>>>> ^^^^^^^^^^^^^^^^ >>>>> For cgroups, we have libcgroup library. Not many people like to >>>>> use it >>>>> though as cgroup is exported as a filesystem and they prefer to >>>>> use >>>>> normal >>>>> libc api to traverse and configure cgroups (Instead of going >>>>> through >>>>> another library). Some examples include libvrit, systemd. 
>>>>> >>>>> Thanks >>>>> Vivek >>>> Well honestly I think that is a libvirt/systemd issue and >>>> libvirt also >>>> invokes things like iscsiadm, dcb etc as a binary :-/ >>>> >>>> Some one could always use qemu command lines to invoke KVM/XEN but >>>> libvirt has saved me one too many days in doing a quick operation >>>> without wondering about a qemu commandline. >>>> >>>> I am also asking for ideas on how to avoid this fragmentation >>>> because >>>> just like libvirt others are also encouraged to do their own libc >>> thing >>>> in the absence of a common storage management framework.. >>>> >>>> Does the standard interface for linux end at the user/kernel >>>> boundary >>>> or the user/libc boundary? If so I feel we would continue to lag >>> behind >>>> other OSes in features because of the model. >>>> >>> StorageAPI _again_. >>> >>> I was under the impression RH had someone working on it. >> Yes, Red Hat does. Tony Asleson. libStorageMgmt: >> >> http://sourceforge.net/apps/trac/libstoragemgmt >> >> The current focus is on managing external storage (SMI-S, etc.). This >> focus can be expanded over time. Contributions welcome. >> >>> (Actually I was trying to give it a go, but then got buried under >>> customer escalations). >>> >>> So yes, we know there is a shortcoming. >>> And yes, we should improve things. >>> >>> But I feel another discussion about this will only give us more >>> insight, but not moving things forward. >>> >>> What about having a separate session at the storage summit (or even >>> at the collab summit) to hammer out the requirements here? >> That would be fine, although as you say, we need more than talk. >> >> Tom >> > > Having a special session would be really a good idea - given the > size of the discussion, we might want to do both a talk and a > breakout at collab summit... > To advance this further, I've submitted a BoF proposal 'Storage Management on Linux' for the Collab Summit. 
I would be available for a joint talk as an introduction if someone would be interested ... Cheers, Hannes -- Dr. Hannes Reinecke zSeries & Storage hare@suse.de +49 911 74053 688 SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg) -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 41+ messages in thread
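[Editor's note] Vivek's point above — that cgroups are exported as a filesystem, so consumers skip libcgroup and just use the ordinary libc file API — looks roughly like this in practice. A sketch assuming the cgroup v1 blkio controller layout; the helper names are ours, not an existing API:

```python
import os

def set_blkio_weight(cgroup, weight, root="/sys/fs/cgroup/blkio"):
    """Create a cgroup and set its I/O weight using nothing but plain
    file operations - the pattern libvirt and systemd follow instead
    of linking against libcgroup."""
    path = os.path.join(root, cgroup)
    os.makedirs(path, exist_ok=True)      # mkdir creates the cgroup
    with open(os.path.join(path, "blkio.weight"), "w") as f:
        f.write(str(weight))              # a plain write configures it

def add_task(cgroup, pid, root="/sys/fs/cgroup/blkio"):
    """Move a task into the cgroup - again just an append to a file."""
    with open(os.path.join(root, cgroup, "tasks"), "a") as f:
        f.write("%d\n" % pid)
```

Since the whole interface is mkdir/open/write, it is easy to see why libvirt and systemd saw little value in an extra library layer.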
* RE: [LSF/MM TOPIC] linux servers as a storage server - what'smissing? 2012-01-09 12:59 ` Tom Coughlan 2012-01-10 6:53 ` Ric Wheeler @ 2012-01-19 16:17 ` Loke, Chetan 2012-01-19 16:19 ` Ric Wheeler 1 sibling, 1 reply; 41+ messages in thread From: Loke, Chetan @ 2012-01-19 16:17 UTC (permalink / raw) To: Tom Coughlan, Hannes Reinecke, tasleson Cc: Shyam_Iyer, vgoyal, rwheeler, linux-fsdevel, linux-scsi > From: linux-scsi-owner@vger.kernel.org [mailto:linux-scsi-owner@vger.kernel.org] On Behalf Of Tom Coughlan > > Yes, Red Hat does. Tony Asleson. libStorageMgmt: > > http://sourceforge.net/apps/trac/libstoragemgmt > > The current focus is on managing external storage (SMI-S, etc.). This > focus can be expanded over time. Contributions welcome. > Device management(scalability/feature mgmt) - 1) scalability: I may be wrong but storage boxes like symmetrix(and others) support large number of LUNs. So device management 'scalability' will make linux shine. By scalability I mean efficiently managing let's say 32K(?) LUNs. 2) feature management: As Shyam mentioned in earlier emails - provide an ecosystem similar to smart-phones. Let me provide an example(and then you guys can correct me by saying this already exists or bits and pieces exist): a) As a kernel developer - export all the features(to the upper layers) we can by querying the target. b) As an app developer - GetLUNFeatures(), could return - {Thin Provision, RAID-level, ... ). c) As a sys-admin - If I right click(for a volume/LUN) on my management GUI, I should be able to tell if my volume supports thin-prov, backed by DRBD[proxy?] etc. So once we provide b) from above, tens(if not hundreds) of apps will be developed in a short period of time. I feel libstoragemgmt is an excellent place to get all of this organized. Chetan Loke ^ permalink raw reply [flat|nested] 41+ messages in thread
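[Editor's note] Chetan's GetLUNFeatures() idea is not far from what the kernel already exports for attached LUNs: the queue limits under /sys/block/<dev>/queue/ advertise, among other things, discard support — which is how thinly provisioned LUNs typically show up to the host. A minimal sketch of step b) above, assuming nothing beyond sysfs (the function name and feature set are ours, not an existing libstoragemgmt call):

```python
import os

def get_lun_features(dev, sysfs="/sys/block"):
    """Return a small feature dict for one block device, in the spirit
    of the GetLUNFeatures() call sketched above.  Only reads the queue
    limits the kernel already exports under /sys/block/<dev>/queue/."""
    q = os.path.join(sysfs, dev, "queue")

    def read_int(name, default=0):
        try:
            with open(os.path.join(q, name)) as f:
                return int(f.read().strip())
        except (OSError, ValueError):
            return default

    return {
        # a nonzero discard limit is how thin-provisioned (UNMAP/TRIM
        # capable) devices usually advertise themselves to the host
        "thin_provisioned": read_int("discard_max_bytes") > 0,
        "rotational": read_int("rotational", 1) == 1,
        "logical_block_size": read_int("logical_block_size", 512),
    }

if __name__ == "__main__" and os.path.isdir("/sys/block"):
    for dev in sorted(os.listdir("/sys/block")):
        print(dev, get_lun_features(dev))
```

Features that only the array knows (RAID level, replication backing such as DRBD) would still need the out-of-band management path libstoragemgmt targets.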
* Re: [LSF/MM TOPIC] linux servers as a storage server - what'smissing? 2012-01-19 16:17 ` [LSF/MM TOPIC] linux servers as a storage server - what'smissing? Loke, Chetan @ 2012-01-19 16:19 ` Ric Wheeler 2012-01-19 16:26 ` Loke, Chetan 0 siblings, 1 reply; 41+ messages in thread From: Ric Wheeler @ 2012-01-19 16:19 UTC (permalink / raw) To: Loke, Chetan Cc: Tom Coughlan, Hannes Reinecke, tasleson, Shyam_Iyer, vgoyal, linux-fsdevel, linux-scsi On 01/19/2012 11:17 AM, Loke, Chetan wrote: >> From: linux-scsi-owner@vger.kernel.org > [mailto:linux-scsi-owner@vger.kernel.org] On Behalf Of Tom Coughlan > >> Yes, Red Hat does. Tony Asleson. libStorageMgmt: >> >> http://sourceforge.net/apps/trac/libstoragemgmt >> >> The current focus is on managing external storage (SMI-S, etc.). This >> focus can be expanded over time. Contributions welcome. >> > Device management(scalability/feature mgmt) - > > 1) scalability: > I may be wrong but storage boxes like symmetrix(and others) support > large number of LUNs. > So device management 'scalability' will make linux shine. By scalability > I mean efficiently managing let's say 32K(?) LUNs. You can actually have (even more) luns than that for each array in a SAN. From a host, you will probably not see all of the luns all of the time, but you certainly need to have good support for a large lun address space that will be sparsely populated. > > 2) feature management: > As Shyam mentioned in earlier emails - provide an ecosystem similar to > smart-phones. Let me provide an example(and then > you guys can correct me by saying this already exists or bits and pieces > exist): > > a) As a kernel developer - export all the features(to the upper layers) > we can by querying the target. > > b) As an app developer - GetLUNFeatures(), could return - {Thin > Provision, RAID-level, ... ). 
> > c) As a sys-admin - If I right click(for a volume/LUN) on my management > GUI, I should be able to tell if my volume supports thin-prov, backed by > DRBD[proxy?] etc. > > So once we provide b) from above, tens(if not hundreds) of apps will be > developed in a short period of time. > I feel libstoragemgmt is an excellent place to get all of this > organized. > > > Chetan Loke ^ permalink raw reply [flat|nested] 41+ messages in thread
* RE: [LSF/MM TOPIC] linux servers as a storage server - what'smissing? 2012-01-19 16:19 ` Ric Wheeler @ 2012-01-19 16:26 ` Loke, Chetan 2012-01-19 16:29 ` Ric Wheeler 0 siblings, 1 reply; 41+ messages in thread From: Loke, Chetan @ 2012-01-19 16:26 UTC (permalink / raw) To: Ric Wheeler Cc: Tom Coughlan, Hannes Reinecke, tasleson, Shyam_Iyer, vgoyal, linux-fsdevel, linux-scsi > -----Original Message----- > From: Ric Wheeler [mailto:rwheeler@redhat.com] > Sent: January 19, 2012 11:20 AM > To: Loke, Chetan > Cc: Tom Coughlan; Hannes Reinecke; tasleson@redhat.com; > Shyam_Iyer@Dell.com; vgoyal@redhat.com; linux-fsdevel@vger.kernel.org; > linux-scsi@vger.kernel.org > Subject: Re: [LSF/MM TOPIC] linux servers as a storage server - > what'smissing? > > On 01/19/2012 11:17 AM, Loke, Chetan wrote: > >> From: linux-scsi-owner@vger.kernel.org > > [mailto:linux-scsi-owner@vger.kernel.org] On Behalf Of Tom Coughlan > > > >> Yes, Red Hat does. Tony Asleson. libStorageMgmt: > >> > >> http://sourceforge.net/apps/trac/libstoragemgmt > >> > >> The current focus is on managing external storage (SMI-S, etc.). > This > >> focus can be expanded over time. Contributions welcome. > >> > > Device management(scalability/feature mgmt) - > > > > 1) scalability: > > I may be wrong but storage boxes like symmetrix(and others) support > > large number of LUNs. > > So device management 'scalability' will make linux shine. By > scalability > > I mean efficiently managing let's say 32K(?) LUNs. > > You can actually have (even more) luns than that for each array in a > SAN. From a host, you will probably not see all of the luns all of the time, but > you certainly need to have good support for a large lun address space that > will be sparsely populated. > > True, a single front-end won't see all of those LUNs/devices. So not a big concern about the front-end hosts. I am thinking of a use-case where folks can use a linux-box to manage their different storage arrays. 
So this linux box with 'libstoragemgmt + app' needs to manage(scan/create/delete/so on) all those LUNs. Chetan Loke ^ permalink raw reply [flat|nested] 41+ messages in thread
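[Editor's note] For the scan half of "manage(scan/create/delete/so on)", the stock initiator-side mechanism is the scan attribute under /sys/class/scsi_host. A sketch (the helper names are ours; the actual write needs root, hence the dry-run default):

```python
import glob, os

def scsi_hosts(sysfs="/sys/class/scsi_host"):
    """List the SCSI hosts the kernel currently knows about."""
    return sorted(os.path.basename(p)
                  for p in glob.glob(os.path.join(sysfs, "host*")))

def rescan(host, sysfs="/sys/class/scsi_host", dry_run=True):
    """Ask the SCSI midlayer to probe every channel/target/LUN on one
    host.  Writing '- - -' (channel/target/lun wildcards) to the scan
    attribute is the stock sysfs interface; the real write needs root,
    so by default just return the equivalent shell command."""
    path = os.path.join(sysfs, host, "scan")
    if dry_run:
        return "echo '- - -' > " + path
    with open(path, "w") as f:
        f.write("- - -")

if __name__ == "__main__":
    hosts = scsi_hosts()
    print("%d SCSI host(s) visible" % len(hosts))
    for h in hosts:
        print(rescan(h))
```

A full wildcard scan over a sparsely populated 32K-LUN address space is exactly where the scalability concern above bites: the midlayer probes every possible address rather than only the mapped ones.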
* Re: [LSF/MM TOPIC] linux servers as a storage server - what'smissing? 2012-01-19 16:26 ` Loke, Chetan @ 2012-01-19 16:29 ` Ric Wheeler 2012-01-19 17:32 ` Loke, Chetan 0 siblings, 1 reply; 41+ messages in thread From: Ric Wheeler @ 2012-01-19 16:29 UTC (permalink / raw) To: Loke, Chetan Cc: Tom Coughlan, Hannes Reinecke, tasleson, Shyam_Iyer, vgoyal, linux-fsdevel, linux-scsi On 01/19/2012 11:26 AM, Loke, Chetan wrote: >> -----Original Message----- >> From: Ric Wheeler [mailto:rwheeler@redhat.com] >> Sent: January 19, 2012 11:20 AM >> To: Loke, Chetan >> Cc: Tom Coughlan; Hannes Reinecke; tasleson@redhat.com; >> Shyam_Iyer@Dell.com; vgoyal@redhat.com; linux-fsdevel@vger.kernel.org; >> linux-scsi@vger.kernel.org >> Subject: Re: [LSF/MM TOPIC] linux servers as a storage server - >> what'smissing? >> >> On 01/19/2012 11:17 AM, Loke, Chetan wrote: >>>> From: linux-scsi-owner@vger.kernel.org >>> [mailto:linux-scsi-owner@vger.kernel.org] On Behalf Of Tom Coughlan >>> >>>> Yes, Red Hat does. Tony Asleson. libStorageMgmt: >>>> >>>> http://sourceforge.net/apps/trac/libstoragemgmt >>>> >>>> The current focus is on managing external storage (SMI-S, etc.). >> This >>>> focus can be expanded over time. Contributions welcome. >>>> >>> Device management(scalability/feature mgmt) - >>> >>> 1) scalability: >>> I may be wrong but storage boxes like symmetrix(and others) support >>> large number of LUNs. >>> So device management 'scalability' will make linux shine. By >> scalability >>> I mean efficiently managing let's say 32K(?) LUNs. >> You can actually have (even more) luns than that for each array in a >> SAN. From a host, you will probably not see all of the luns all of the > time, but >> you certainly need to have good support for a large lun address space > that >> will be sparsely populated. >> >> > True, a single front-end won't see all of those LUNs/devices. So not a > big concern > about the front-end hosts. 
> > I am thinking of a use-case where folks can use a linux-box to manage > their different storage arrays. > So this linux box with 'libstoragemgmt + app' needs to > manage(scan/create/delete/so on) all those LUNs. > > > Chetan Loke People do have boxes with thousands of luns though & file systems in active use. Both for SAN and NAS volumes. One of the challenges is what to do when just one LUN (or NFS server) crashes and burns. You simply cannot "reboot" the server to clean up after one bad mount when you have thousands of other happy users running on thousands/hundreds of other mount points :) Ric ^ permalink raw reply [flat|nested] 41+ messages in thread
* RE: [LSF/MM TOPIC] linux servers as a storage server - what'smissing? 2012-01-19 16:29 ` Ric Wheeler @ 2012-01-19 17:32 ` Loke, Chetan 2012-01-19 17:44 ` Ric Wheeler 0 siblings, 1 reply; 41+ messages in thread From: Loke, Chetan @ 2012-01-19 17:32 UTC (permalink / raw) To: Ric Wheeler Cc: Tom Coughlan, Hannes Reinecke, tasleson, Shyam_Iyer, vgoyal, linux-fsdevel, linux-scsi > > True, a single front-end won't see all of those LUNs/devices. So not > a > > big concern > > about the front-end hosts. > > > > I am thinking of a use-case where folks can use a linux-box to manage > > their different storage arrays. > > So this linux box with 'libstoragemgmt + app' needs to > > manage(scan/create/delete/so on) all those LUNs. > > > > People do have boxes with thousands of luns though & file systems in > active use. > Both for SAN and NAS volumes. > > One of the challenges is what to do when just one LUN (or NFS server) > crashes > and burns. The FS needs to go read-only(plain & simple) because you don't know what's going on. You can't risk writing data anymore. Let the apps fail. You can make it happen even today. It's a simple exercise. Like others, I have seen/debugged enough weirdness when it comes to resets/aborts(FYI - 200+ hosts in a cluster). Because of NDA reasons I can't disclose a whole lot but folks have fixed/enhanced scsi stack to make resets/aborts fully robust. And you need folks who can debug 'apps/FS/block/initiator/wire-protocol/target-side' in one shot. Simple. So when you say 'crash & burn' then either or 'all' of the above(minus the protocol handling) might need fixing. > You simply cannot "reboot" the server to clean up after one > bad mount when you have thousands of other happy users runs on thousands/hundreds > of other mount points :) Again, can't the front-end go read-only and limit the outage w/o disturbing thousands of users? Chetan Loke ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [LSF/MM TOPIC] linux servers as a storage server - what'smissing? 2012-01-19 17:32 ` Loke, Chetan @ 2012-01-19 17:44 ` Ric Wheeler 2012-01-19 21:30 ` Loke, Chetan 0 siblings, 1 reply; 41+ messages in thread From: Ric Wheeler @ 2012-01-19 17:44 UTC (permalink / raw) To: Loke, Chetan Cc: Tom Coughlan, Hannes Reinecke, tasleson, Shyam_Iyer, vgoyal, linux-fsdevel, linux-scsi On 01/19/2012 12:32 PM, Loke, Chetan wrote: >>> True, a single front-end won't see all of those LUNs/devices. So not >> a >>> big concern >>> about the front-end hosts. >>> >>> I am thinking of a use-case where folks can use a linux-box to > manage >>> their different storage arrays. >>> So this linux box with 'libstoragemgmt + app' needs to >>> manage(scan/create/delete/so on) all those LUNs. >>> >> People do have boxes with thousands of luns though& file systems in >> active use. >> Both for SAN and NAS volumes. >> >> One of the challenges is what to do when just one LUN (or NFS server) >> crashes >> and burns. > The FS needs to go read-only(plain& simple) because you don't know > what's going on. > You can't risk writing data anymore. Let the apps fail. You can make it > happen even today. > It's a simple exercise. Nope - it needs to be torn down and we need to be able to cleanly unmount it. Letting an application see a read-only file system when the disk is gone or server down is not very useful since you won't get any non-cached data back. Also, if you have an ability to migrate that mount (same mount point) to another server or clone LUN, you want to unmount the source so you can remount the data under that same mount point/namespace.... ric > > Like others, I have seen/debugged enough weirdness when it comes to > resets/aborts(FYI - 200+ hosts in a cluster). > Because of NDA reasons I can't disclose a whole lot but folks have > fixed/enhanced > scsi stack to make resets/aborts fully robust. 
And you need folks who > can debug > 'apps/FS/block/initiator/wire-protocol/target-side' in one shot. Simple. > So when you say 'crash& burn' then either or 'all' of the above(minus > the protocol handling) might need fixing. > >> You simply cannot "reboot" the server to clean up after one >> bad mount when you have thousands of other happy users runs on > thousands/hundreds >> of other mount points :) > Again, can't front-end can go read only and limit the outage w/o > disturbing thousands of users? > > > Chetan Loke ^ permalink raw reply [flat|nested] 41+ messages in thread
* RE: [LSF/MM TOPIC] linux servers as a storage server - what'smissing? 2012-01-19 17:44 ` Ric Wheeler @ 2012-01-19 21:30 ` Loke, Chetan 2012-01-19 21:39 ` Ric Wheeler 0 siblings, 1 reply; 41+ messages in thread From: Loke, Chetan @ 2012-01-19 21:30 UTC (permalink / raw) To: Ric Wheeler Cc: Tom Coughlan, Hannes Reinecke, tasleson, Shyam_Iyer, vgoyal, linux-fsdevel, linux-scsi > From: Ric Wheeler [mailto:rwheeler@redhat.com] > Sent: January 19, 2012 12:44 PM > To: Loke, Chetan > Cc: Tom Coughlan; Hannes Reinecke; tasleson@redhat.com; > Shyam_Iyer@Dell.com; vgoyal@redhat.com; linux-fsdevel@vger.kernel.org; > linux-scsi@vger.kernel.org > Subject: Re: [LSF/MM TOPIC] linux servers as a storage server - > what'smissing? > > On 01/19/2012 12:32 PM, Loke, Chetan wrote: > >>> True, a single front-end won't see all of those LUNs/devices. So > not > >> a > >>> big concern > >>> about the front-end hosts. > >>> > >>> I am thinking of a use-case where folks can use a linux-box to > > manage > >>> their different storage arrays. > >>> So this linux box with 'libstoragemgmt + app' needs to > >>> manage(scan/create/delete/so on) all those LUNs. > >>> > >> People do have boxes with thousands of luns though& file systems in > >> active use. > >> Both for SAN and NAS volumes. > >> > >> One of the challenges is what to do when just one LUN (or NFS > server) > >> crashes > >> and burns. > > The FS needs to go read-only(plain& simple) because you don't know > > what's going on. > > You can't risk writing data anymore. Let the apps fail. You can make > it > > happen even today. > > It's a simple exercise. > > Nope - it needs to be torn down and we need to be able to cleanly > unmount it. > > Letting an application see and read-only file system when the disk is > gone or > server down is not very useful since you won't get any non-cached data > back. > Sure, it's just a partial snapshot(aka cached-data) of the file-system. 
But writes that have to fetch the non-cached data, will unnecessarily issue I/O to the fabric. These orphaned I/O's cause more pain in the cleanup. And if caching is enabled on the front-side then it's all the more painful. We can go one extra step and make FS fail read I/O for non-cached data too to avoid more orphan IOs. Tearing down will happen sometime later. But don't you agree that something needs to happen before that? And that something is, read-only, which will eventually propagate to the users(example when you are copying a new file). Users will then report it to their IT/admins. This approach of serving the snap-shot(cached) file-system could serve some users for what it's worth. It's better than surprise-removal and issuing needless IOs(read - eh race conditions). > Also, if you have an ability to migrate that mount (same mount point) > to another > server or clone LUN, you want to unmount the source so you can remount > the data > under that same mount point/namespace.... > Won't this be protocol specific. > ric Chetan Loke ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [LSF/MM TOPIC] linux servers as a storage server - what'smissing? 2012-01-19 21:30 ` Loke, Chetan @ 2012-01-19 21:39 ` Ric Wheeler 2012-01-24 17:05 ` Loke, Chetan 0 siblings, 1 reply; 41+ messages in thread From: Ric Wheeler @ 2012-01-19 21:39 UTC (permalink / raw) To: Loke, Chetan Cc: Tom Coughlan, Hannes Reinecke, tasleson, Shyam_Iyer, vgoyal, linux-fsdevel, linux-scsi On 01/19/2012 04:30 PM, Loke, Chetan wrote: >> From: Ric Wheeler [mailto:rwheeler@redhat.com] >> Sent: January 19, 2012 12:44 PM >> To: Loke, Chetan >> Cc: Tom Coughlan; Hannes Reinecke; tasleson@redhat.com; >> Shyam_Iyer@Dell.com; vgoyal@redhat.com; linux-fsdevel@vger.kernel.org; >> linux-scsi@vger.kernel.org >> Subject: Re: [LSF/MM TOPIC] linux servers as a storage server - >> what'smissing? >> >> On 01/19/2012 12:32 PM, Loke, Chetan wrote: >>>>> True, a single front-end won't see all of those LUNs/devices. So >> not >>>> a >>>>> big concern >>>>> about the front-end hosts. >>>>> >>>>> I am thinking of a use-case where folks can use a linux-box to >>> manage >>>>> their different storage arrays. >>>>> So this linux box with 'libstoragemgmt + app' needs to >>>>> manage(scan/create/delete/so on) all those LUNs. >>>>> >>>> People do have boxes with thousands of luns though& file systems > in >>>> active use. >>>> Both for SAN and NAS volumes. >>>> >>>> One of the challenges is what to do when just one LUN (or NFS >> server) >>>> crashes >>>> and burns. >>> The FS needs to go read-only(plain& simple) because you don't know >>> what's going on. >>> You can't risk writing data anymore. Let the apps fail. You can make >> it >>> happen even today. >>> It's a simple exercise. >> Nope - it needs to be torn down and we need to be able to cleanly >> unmount it. >> >> Letting an application see and read-only file system when the disk is >> gone or >> server down is not very useful since you won't get any non-cached data >> back. 
>> > Sure, it's just a partial snapshot(aka cached-data) of the file-system. > > But writes that have to fetch the non-cached data, will unnecessarily > issue I/O to the fabric. These orphaned I/O's cause more pain in the > cleanup. > And if caching is enabled on the front-side then it's all the more > painful. > > We can go one extra step and make FS fail read I/O for non-cached data > too > to avoid more orphan IOs. I don't really see this as a useful state. Read-only without a real backing file system or LUN is hit or miss, that file system should go offline :) > > Tearing down will happen sometime later. But don't you agree that > something needs > to happen before that? And that something is, read-only, which will > eventually > propagate to the users(example when you are copying a new file). > Users will then report it to their IT/admins. > This approach of serving the snap-shot(cached) file-system could serve > some users for what it's worth. It's better than surprise-removal and > issuing > needless IOs(read - eh race conditions). > >> Also, if you have an ability to migrate that mount (same mount point) >> to another >> server or clone LUN, you want to unmount the source so you can remount >> the data >> under that same mount point/namespace.... >> > Won't this be protocol specific. > Not really protocol specific. We need to be able to do a forced unmount and then do fail over (that varies depending on many things like your HA frame work and certainly the type of thing you are attempting to fail over) Ric ^ permalink raw reply [flat|nested] 41+ messages in thread
* RE: [LSF/MM TOPIC] linux servers as a storage server - what'smissing? 2012-01-19 21:39 ` Ric Wheeler @ 2012-01-24 17:05 ` Loke, Chetan 2012-01-24 18:13 ` Ric Wheeler 2012-01-26 22:24 ` Dave Chinner 0 siblings, 2 replies; 41+ messages in thread From: Loke, Chetan @ 2012-01-24 17:05 UTC (permalink / raw) To: Ric Wheeler Cc: Tom Coughlan, Hannes Reinecke, tasleson, Shyam_Iyer, vgoyal, linux-fsdevel, linux-scsi > > But writes that have to fetch the non-cached data, will unnecessarily > > issue I/O to the fabric. These orphaned I/O's cause more pain in the > > cleanup. > > And if caching is enabled on the front-side then it's all the more > > painful. > > > > We can go one extra step and make FS fail read I/O for non-cached data > > too to avoid more orphan IOs. > > I don't really see this as a useful state. Read-only without a real > backing file > system or LUN is hit or miss, that file system should go offline :) > Last year when I checked, I forget but I think xfs(or ext4) was going read-only. If this is still the case then you are basically asking FS's to modify that behavior. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [LSF/MM TOPIC] linux servers as a storage server - what'smissing? 2012-01-24 17:05 ` Loke, Chetan @ 2012-01-24 18:13 ` Ric Wheeler 2012-01-26 22:24 ` Dave Chinner 0 siblings, 0 replies; 41+ messages in thread From: Ric Wheeler @ 2012-01-24 18:13 UTC (permalink / raw) To: Loke, Chetan Cc: Tom Coughlan, Hannes Reinecke, tasleson, Shyam_Iyer, vgoyal, linux-fsdevel, linux-scsi On 01/24/2012 12:05 PM, Loke, Chetan wrote: >>> But writes that have to fetch the non-cached data, will > unnecessarily >>> issue I/O to the fabric. These orphaned I/O's cause more pain in the >>> cleanup. >>> And if caching is enabled on the front-side then it's all the more >>> painful. >>> >>> We can go one extra step and make FS fail read I/O for non-cached > data >>> too to avoid more orphan IOs. >> I don't really see this as a useful state. Read-only without a real >> backing file >> system or LUN is hit or miss, that file system should go offline :) >> > Last year when I checked, I forget but I think xfs(or ext4) was going > read-only. > If this is still the case then you are basically asking FS's to modify > that behavior. Not really. I think that at that state, we need to be able to cleanly umount those file systems and clean up the system without requiring a reboot. Read-only state is not a usable condition for a file system without a backing device, just a side effect of our error handling... ric ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [LSF/MM TOPIC] linux servers as a storage server - what'smissing? 2012-01-24 17:05 ` Loke, Chetan 2012-01-24 18:13 ` Ric Wheeler @ 2012-01-26 22:24 ` Dave Chinner 2012-01-26 22:29 ` Ric Wheeler 1 sibling, 1 reply; 41+ messages in thread From: Dave Chinner @ 2012-01-26 22:24 UTC (permalink / raw) To: Loke, Chetan Cc: Ric Wheeler, Tom Coughlan, Hannes Reinecke, tasleson, Shyam_Iyer, vgoyal, linux-fsdevel, linux-scsi On Tue, Jan 24, 2012 at 12:05:23PM -0500, Loke, Chetan wrote: > > > But writes that have to fetch the non-cached data, will > unnecessarily > > > issue I/O to the fabric. These orphaned I/O's cause more pain in the > > > cleanup. > > > And if caching is enabled on the front-side then it's all the more > > > painful. > > > > > > We can go one extra step and make FS fail read I/O for non-cached > data > > > too to avoid more orphan IOs. > > > > I don't really see this as a useful state. Read-only without a real > > backing file > > system or LUN is hit or miss, that file system should go offline :) > > > > Last year when I checked, I forget but I think xfs(or ext4) was going > read-only. > If this is still the case then you are basically asking FS's to modify > that behavior. ext4 goes read-only on error by default. I think that behaviour can be changed by a mount option. XFS shuts the filesystem down (takes it offline) preventing all dirty data and metadata from being issued to disk, aborts any journal IO that might be going on, etc. It then returns fatal errors (EIO or EUCLEAN "structure needs cleaning" errors) to all callers that attempt to write or modify the filesystem, and EIO to all attempts to read from it. IOWs, the XFS filesystem is *gone* once it shuts down and requires administrator intervention to get it back to a working state.... Cheers, Dave. -- Dave Chinner david@fromorbit.com ^ permalink raw reply [flat|nested] 41+ messages in thread
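[Editor's note] From an application's point of view, the two error styles Dave describes are distinguishable by errno alone: a shut-down XFS hands back EUCLEAN ("structure needs cleaning") or EIO, while an ext4 that remounted read-only produces EROFS on writes. A small Linux-specific sketch (EUCLEAN is a Linux errno; the classify() helper is purely illustrative):

```python
import errno, os

# EUCLEAN is the Linux errno rendered as "Structure needs cleaning",
# the message quoted above for a shut-down XFS filesystem
assert "cleaning" in os.strerror(errno.EUCLEAN).lower()

def classify(exc):
    """Illustrative triage of an OSError from a failed read/write."""
    if exc.errno == errno.EUCLEAN:
        return "filesystem shut down / needs repair"
    if exc.errno == errno.EIO:
        return "I/O error (device or transport)"
    if exc.errno == errno.EROFS:
        return "filesystem remounted read-only"
    return "other"

print(classify(OSError(errno.EUCLEAN, "demo")))
```

Either way the distinction is visible to management tooling, which could use it to decide between a forced unmount/failover and a plain remount.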
* Re: [LSF/MM TOPIC] linux servers as a storage server - what'smissing? 2012-01-26 22:24 ` Dave Chinner @ 2012-01-26 22:29 ` Ric Wheeler 0 siblings, 0 replies; 41+ messages in thread From: Ric Wheeler @ 2012-01-26 22:29 UTC (permalink / raw) To: Dave Chinner Cc: Loke, Chetan, Tom Coughlan, Hannes Reinecke, tasleson, Shyam_Iyer, vgoyal, linux-fsdevel, linux-scsi On 01/26/2012 05:24 PM, Dave Chinner wrote: > On Tue, Jan 24, 2012 at 12:05:23PM -0500, Loke, Chetan wrote: >>>> But writes that have to fetch the non-cached data, will >> unnecessarily >>>> issue I/O to the fabric. These orphaned I/O's cause more pain in the >>>> cleanup. >>>> And if caching is enabled on the front-side then it's all the more >>>> painful. >>>> >>>> We can go one extra step and make FS fail read I/O for non-cached >> data >>>> too to avoid more orphan IOs. >>> I don't really see this as a useful state. Read-only without a real >>> backing file >>> system or LUN is hit or miss, that file system should go offline :) >>> >> Last year when I checked, I forget but I think xfs(or ext4) was going >> read-only. >> If this is still the case then you are basically asking FS's to modify >> that behavior. > ext4 goes read only on erro by default. I think that behaviour can > be changed by a mount option. > > XFS shuts the filesystem down (takes it offline) preventing all > dirty data and metadata from being issued to disk, aborts any > journal IO that might be going on, etc. It the returns fatal errors > (EIO or EUCLEAN "structure needs cleaning" errors) to all callers > that attempt to write or modify the filesystem, and EIO to all > attempts to read from it. > > IOWs, the XFS filesystem is *gone* once it shuts down and requires > administrator intervention to get it back to a working state.... > > Cheers, > > Dave. If it isn't obvious, I like the XFS approach :) Ric ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [LSF/MM TOPIC][ATTEND] linux servers as a storage server - what's missing? 2011-12-21 15:59 [LSF/MM TOPIC] linux servers as a storage server - what's missing? Ric Wheeler 2011-12-22 8:14 ` Shyam_Iyer @ 2012-01-03 19:26 ` Jeff Layton 2012-01-03 19:32 ` Chuck Lever 2012-01-24 21:36 ` J. Bruce Fields 2012-01-18 17:00 ` [LSF/MM TOPIC] " Roland Dreier 2 siblings, 2 replies; 41+ messages in thread From: Jeff Layton @ 2012-01-03 19:26 UTC (permalink / raw) To: Ric Wheeler; +Cc: linux-fsdevel, linux-scsi On Wed, 21 Dec 2011 10:59:43 -0500 Ric Wheeler <rwheeler@redhat.com> wrote: > > One common thing that I see a lot of these days is an increasing number of > platforms that are built on our stack as storage servers. Ranging from the > common linux based storage/NAS devices up to various distributed systems. > Almost all of them use our common stack - software RAID, LVM, XFS/ext4 and samba. > > At last year's SNIA developers conference, it was clear that Microsoft is > putting a lot of effort into enhancing windows 8 server as a storage server with > both support for a pNFS server and of course SMB. I think that linux (+samba) is > ahead of the windows based storage appliances today, but they are putting > together a very aggressive list of features. > > I think that it would be useful and interesting to take a slot at this year's > LSF to see how we are doing in this space. How large do we need to scale for an > appliance? What kind of work is needed (support for the copy offload system > call? better support for out of band notifications like those used in "thinly > provisioned" SCSI devices? management API's? Ease of use CLI work? SMB2.2 support?). > > The goal would be to see what technical gaps we have that need more active > development in, not just a wish list :) > > Ric Unfortunately, w/o a wishlist of sorts, it's hard to know what needs more active development ;). 
While HCH will probably disagree, being able to support more NFSv4/Windows API features at the VFS layer would make it a lot easier to do a more unified serving appliance. Right now, both knfsd and samba track too much info internally, and that makes it very difficult to serve the same data via multiple protocols. Off the top of my head, my "wishlist" for better NFSv4 serving would be: - RichACLs - Share/Deny mode support on open - mandatory locking that doesn't rely on weirdo file modes It's always going to be hard for us to compete with dedicated appliances. Where Linux can shine though is in allowing for more innovative combinations. Being able to do active/active NFS serving from clustered filesystems, for instance is something that we can eventually attain but that would be harder to do in an appliance. This sort of discussion might also dovetail with Benny's proposal about pNFS serving. -- Jeff Layton <jlayton@redhat.com> ^ permalink raw reply [flat|nested] 41+ messages in thread
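[Editor's note] The "weirdo file modes" in Jeff's third item are the legacy System V convention Linux inherited: fcntl()/lockf() locks on a file are enforced (mandatory) only when the filesystem is mounted with -o mand and the file's mode has the setgid bit set with group-execute clear. A sketch of marking a file that way (the helper name is ours):

```python
import os, stat, tempfile

def mark_mandatory_lock_candidate(path):
    """Apply the legacy 'mandatory locking' mode: setgid bit on,
    group-execute bit off.  fcntl()/lockf() locks on such a file are
    actually enforced only if the filesystem is mounted with -o mand."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    os.chmod(path, (mode | stat.S_ISGID) & ~stat.S_IXGRP)

if __name__ == "__main__":
    fd, path = tempfile.mkstemp()   # mkstemp creates the file mode 0600
    os.close(fd)
    mark_mandatory_lock_candidate(path)
    # typically shows 0o2600: setgid set, group-execute clear
    print(oct(stat.S_IMODE(os.stat(path).st_mode)))
    os.unlink(path)
```

Overloading permission bits this way is exactly why the wishlist asks for mandatory locking that does not rely on file modes: the NFSv4 deny-mode semantics cannot be expressed through chmod at all.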
* Re: [LSF/MM TOPIC][ATTEND] linux servers as a storage server - what's missing? 2012-01-03 19:26 ` [LSF/MM TOPIC][ATTEND] linux servers as a storage server - what's missing? Jeff Layton @ 2012-01-03 19:32 ` Chuck Lever 2012-01-17 21:16 ` J. Bruce Fields 2012-01-24 21:36 ` J. Bruce Fields 1 sibling, 1 reply; 41+ messages in thread From: Chuck Lever @ 2012-01-03 19:32 UTC (permalink / raw) To: Jeff Layton; +Cc: Ric Wheeler, linux-fsdevel, linux-scsi On Jan 3, 2012, at 2:26 PM, Jeff Layton wrote: > On Wed, 21 Dec 2011 10:59:43 -0500 > Ric Wheeler <rwheeler@redhat.com> wrote: > >> >> One common thing that I see a lot of these days is an increasing number of >> platforms that are built on our stack as storage servers. Ranging from the >> common linux based storage/NAS devices up to various distributed systems. >> Almost all of them use our common stack - software RAID, LVM, XFS/ext4 and samba. >> >> At last year's SNIA developers conference, it was clear that Microsoft is >> putting a lot of effort into enhancing windows 8 server as a storage server with >> both support for a pNFS server and of course SMB. I think that linux (+samba) is >> ahead of the windows based storage appliances today, but they are putting >> together a very aggressive list of features. >> >> I think that it would be useful and interesting to take a slot at this year's >> LSF to see how we are doing in this space. How large do we need to scale for an >> appliance? What kind of work is needed (support for the copy offload system >> call? better support for out of band notifications like those used in "thinly >> provisioned" SCSI devices? management API's? Ease of use CLI work? SMB2.2 support?). >> >> The goal would be to see what technical gaps we have that need more active >> development in, not just a wish list :) >> >> Ric > > Unfortunately, w/o a wishlist of sorts, it's hard to know what needs > more active development ;). 
> > While HCH will probably disagree, being able to support more > NFSv4/Windows API features at the VFS layer would make it a lot easier > to do a more unified serving appliance. Right now, both knfsd and samba > track too much info internally, and that makes it very difficult to > serve the same data via multiple protocols. > > Off the top of my head, my "wishlist" for better NFSv4 serving would be: > > - RichACLs > - Share/Deny mode support on open > - mandatory locking that doesn't rely on weirdo file modes To add a few more NFSv4 related items: - Simplified ID mapping and security configuration - Support for NFSv4 migration and replication - Better server observability (for operational and performance debugging in the field) - FedFS and NFS basic junctions (already under way) > It's always going to be hard for us to compete with dedicated > appliances. Where Linux can shine though is in allowing for more > innovative combinations. > > Being able to do active/active NFS serving from clustered filesystems, > for instance is something that we can eventually attain but that would > be harder to do in an appliance. This sort of discussion might also > dovetail with Benny's proposal about pNFS serving. > > -- > Jeff Layton <jlayton@redhat.com> > -- > To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- Chuck Lever chuck[dot]lever[at]oracle[dot]com ^ permalink raw reply [flat|nested] 41+ messages in thread
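[Editor's note] As a concrete illustration of the ID-mapping configuration Chuck would like to see simplified: today both ends of an NFSv4 mount typically carry an /etc/idmapd.conf along these lines (the domain value is site-specific, shown here only as an example):

```ini
[General]
# Must match on client and server, or every owner maps to nobody
Domain = example.com

[Mapping]
Nobody-User = nobody
Nobody-Group = nobody
```

Getting this (plus Kerberos realm settings, when sec=krb5 is in play) consistent across a fleet is a large part of the configuration burden the wishlist item refers to.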
* Re: [LSF/MM TOPIC][ATTEND] linux servers as a storage server - what's missing? 2012-01-03 19:32 ` Chuck Lever @ 2012-01-17 21:16 ` J. Bruce Fields 2012-01-17 21:25 ` Chuck Lever 0 siblings, 1 reply; 41+ messages in thread From: J. Bruce Fields @ 2012-01-17 21:16 UTC (permalink / raw) To: Chuck Lever; +Cc: Jeff Layton, Ric Wheeler, linux-fsdevel, linux-scsi On Tue, Jan 03, 2012 at 02:32:40PM -0500, Chuck Lever wrote: > > On Jan 3, 2012, at 2:26 PM, Jeff Layton wrote: > > > On Wed, 21 Dec 2011 10:59:43 -0500 > > Ric Wheeler <rwheeler@redhat.com> wrote: > > > >> > >> One common thing that I see a lot of these days is an increasing number of > >> platforms that are built on our stack as storage servers. Ranging from the > >> common linux based storage/NAS devices up to various distributed systems. > >> Almost all of them use our common stack - software RAID, LVM, XFS/ext4 and samba. > >> > >> At last year's SNIA developers conference, it was clear that Microsoft is > >> putting a lot of effort into enhancing windows 8 server as a storage server with > >> both support for a pNFS server and of course SMB. I think that linux (+samba) is > >> ahead of the windows based storage appliances today, but they are putting > >> together a very aggressive list of features. > >> > >> I think that it would be useful and interesting to take a slot at this year's > >> LSF to see how we are doing in this space. How large do we need to scale for an > >> appliance? What kind of work is needed (support for the copy offload system > >> call? better support for out of band notifications like those used in "thinly > >> provisioned" SCSI devices? management API's? Ease of use CLI work? SMB2.2 support?). > >> > >> The goal would be to see what technical gaps we have that need more active > >> development in, not just a wish list :) > >> > >> Ric > > > > Unfortunately, w/o a wishlist of sorts, it's hard to know what needs > > more active development ;). 
> > > > While HCH will probably disagree, being able to support more > > NFSv4/Windows API features at the VFS layer would make it a lot easier > > to do a more unified serving appliance. Right now, both knfsd and samba > > track too much info internally, and that makes it very difficult to > > serve the same data via multiple protocols. > > > > Off the top of my head, my "wishlist" for better NFSv4 serving would be: > > > > - RichACLs > > - Share/Deny mode support on open > > - mandatory locking that doesn't rely on weirdo file modes > > To add a few more NFSv4 related items: > > - Simplified ID mapping What are you thinking of here? --b. > and security configuration > - Support for NFSv4 migration and replication > - Better server observability (for operational and performance debugging in the field) > - FedFS and NFS basic junctions (already under way) > > > It's always going to be hard for us to compete with dedicated > > appliances. Where Linux can shine though is in allowing for more > > innovative combinations. > > > > Being able to do active/active NFS serving from clustered filesystems, > > for instance is something that we can eventually attain but that would > > be harder to do in an appliance. This sort of discussion might also > > dovetail with Benny's proposal about pNFS serving. > > > > -- > > Jeff Layton <jlayton@redhat.com> > > -- > > To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in > > the body of a message to majordomo@vger.kernel.org > > More majordomo info at http://vger.kernel.org/majordomo-info.html > > -- > Chuck Lever > chuck[dot]lever[at]oracle[dot]com > > > > > -- > To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [LSF/MM TOPIC][ATTEND] linux servers as a storage server - what's missing? 2012-01-17 21:16 ` J. Bruce Fields @ 2012-01-17 21:25 ` Chuck Lever 0 siblings, 0 replies; 41+ messages in thread From: Chuck Lever @ 2012-01-17 21:25 UTC (permalink / raw) To: J. Bruce Fields; +Cc: Jeff Layton, Ric Wheeler, linux-fsdevel, linux-scsi On Jan 17, 2012, at 4:16 PM, J. Bruce Fields wrote: > On Tue, Jan 03, 2012 at 02:32:40PM -0500, Chuck Lever wrote: >> >> On Jan 3, 2012, at 2:26 PM, Jeff Layton wrote: >> >>> On Wed, 21 Dec 2011 10:59:43 -0500 >>> Ric Wheeler <rwheeler@redhat.com> wrote: >>> >>>> >>>> One common thing that I see a lot of these days is an increasing number of >>>> platforms that are built on our stack as storage servers. Ranging from the >>>> common linux based storage/NAS devices up to various distributed systems. >>>> Almost all of them use our common stack - software RAID, LVM, XFS/ext4 and samba. >>>> >>>> At last year's SNIA developers conference, it was clear that Microsoft is >>>> putting a lot of effort into enhancing windows 8 server as a storage server with >>>> both support for a pNFS server and of course SMB. I think that linux (+samba) is >>>> ahead of the windows based storage appliances today, but they are putting >>>> together a very aggressive list of features. >>>> >>>> I think that it would be useful and interesting to take a slot at this year's >>>> LSF to see how we are doing in this space. How large do we need to scale for an >>>> appliance? What kind of work is needed (support for the copy offload system >>>> call? better support for out of band notifications like those used in "thinly >>>> provisioned" SCSI devices? management API's? Ease of use CLI work? SMB2.2 support?). >>>> >>>> The goal would be to see what technical gaps we have that need more active >>>> development in, not just a wish list :) >>>> >>>> Ric >>> >>> Unfortunately, w/o a wishlist of sorts, it's hard to know what needs >>> more active development ;). 
>>> >>> While HCH will probably disagree, being able to support more >>> NFSv4/Windows API features at the VFS layer would make it a lot easier >>> to do a more unified serving appliance. Right now, both knfsd and samba >>> track too much info internally, and that makes it very difficult to >>> serve the same data via multiple protocols. >>> >>> Off the top of my head, my "wishlist" for better NFSv4 serving would be: >>> >>> - RichACLs >>> - Share/Deny mode support on open >>> - mandatory locking that doesn't rely on weirdo file modes >> >> To add a few more NFSv4 related items: >> >> - Simplified ID mapping > > What are you thinking of here? > > --b. > >> and security configuration Trond has already made things easier for the NFSv3 to NFSv4 transition by having the client send numeric UIDs and GIDs in idmap strings when servers can deal with that. It would be even better if we had some kind of GUI like the "Users and Groups" tool that could combine the configuration of ID mapping and security configuration, and maybe provide some nice preset configurations (all local IDs, Kerberos only, LDAP, and so on). This also needs to integrate well with network services like FreeIPA. And it would probably need to work on both NFS clients and servers, but what if we had some way of automatically configuring clients, on first contact with a server or realm, with a Kerberos keytab and the correct ID mapping and security set up? >> - Support for NFSv4 migration and replication >> - Better server observability (for operational and performance debugging in the field) >> - FedFS and NFS basic junctions (already under way) >> >>> It's always going to be hard for us to compete with dedicated >>> appliances. Where Linux can shine though is in allowing for more >>> innovative combinations. >>> >>> Being able to do active/active NFS serving from clustered filesystems, >>> for instance is something that we can eventually attain but that would >>> be harder to do in an appliance. 
This sort of discussion might also >>> dovetail with Benny's proposal about pNFS serving. >>> >>> -- >>> Jeff Layton <jlayton@redhat.com> >>> -- >>> To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in >>> the body of a message to majordomo@vger.kernel.org >>> More majordomo info at http://vger.kernel.org/majordomo-info.html >> >> -- >> Chuck Lever >> chuck[dot]lever[at]oracle[dot]com >> >> >> >> >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html -- Chuck Lever chuck[dot]lever[at]oracle[dot]com ^ permalink raw reply [flat|nested] 41+ messages in thread
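The numeric UID/GID fallback Chuck describes is exposed in contemporary kernels as an NFS client module parameter. A rough sketch of how an administrator would exercise it follows; it requires root, the paths assume a sysfs-era distribution, and the modprobe.d filename is made up:

```shell
# Sketch, not a recipe: check whether the NFSv4 client will fall back to
# numeric UIDs/GIDs under AUTH_SYS instead of name@domain idmap strings.
cat /sys/module/nfs/parameters/nfs4_disable_idmapping

# Pin the behavior at module load time (the .conf filename is arbitrary):
echo "options nfs nfs4_disable_idmapping=1" > /etc/modprobe.d/nfs-numeric-ids.conf

# Name-based mapping still requires the same Domain on client and server:
grep -i '^Domain' /etc/idmapd.conf
```

With the fallback enabled and `sec=sys`, an NFSv3-style deployment keeps working over v4 without any idmapd configuration at all, which is part of what makes the "simplified ID mapping" item tractable.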
* Re: [LSF/MM TOPIC][ATTEND] linux servers as a storage server - what's missing? 2012-01-03 19:26 ` [LSF/MM TOPIC][ATTEND] linux servers as a storage server - what's missing? Jeff Layton 2012-01-03 19:32 ` Chuck Lever @ 2012-01-24 21:36 ` J. Bruce Fields 2012-01-24 23:13 ` Ric Wheeler 1 sibling, 1 reply; 41+ messages in thread From: J. Bruce Fields @ 2012-01-24 21:36 UTC (permalink / raw) To: Jeff Layton; +Cc: Ric Wheeler, linux-fsdevel, linux-scsi On Tue, Jan 03, 2012 at 02:26:09PM -0500, Jeff Layton wrote: > On Wed, 21 Dec 2011 10:59:43 -0500 > Ric Wheeler <rwheeler@redhat.com> wrote: > > > > > One common thing that I see a lot of these days is an increasing number of > > platforms that are built on our stack as storage servers. Ranging from the > > common linux based storage/NAS devices up to various distributed systems. > > Almost all of them use our common stack - software RAID, LVM, XFS/ext4 and samba. > > > > At last year's SNIA developers conference, it was clear that Microsoft is > > putting a lot of effort into enhancing windows 8 server as a storage server with > > both support for a pNFS server and of course SMB. I think that linux (+samba) is > > ahead of the windows based storage appliances today, but they are putting > > together a very aggressive list of features. > > > > I think that it would be useful and interesting to take a slot at this year's > > LSF to see how we are doing in this space. How large do we need to scale for an > > appliance? What kind of work is needed (support for the copy offload system > > call? better support for out of band notifications like those used in "thinly > > provisioned" SCSI devices? management API's? Ease of use CLI work? SMB2.2 support?). > > > > The goal would be to see what technical gaps we have that need more active > > development in, not just a wish list :) > > > > Ric > > Unfortunately, w/o a wishlist of sorts, it's hard to know what needs > more active development ;). 
> > While HCH will probably disagree, being able to support more > NFSv4/Windows API features at the VFS layer would make it a lot easier > to do a more unified serving appliance. Right now, both knfsd and samba > track too much info internally, and that makes it very difficult to > serve the same data via multiple protocols. By the way, we could really use a Windows/Samba expert if we're going to discuss that. I don't think their list(s) got the announcement? --b. > > Off the top of my head, my "wishlist" for better NFSv4 serving would be: > > - RichACLs > - Share/Deny mode support on open > - mandatory locking that doesn't rely on weirdo file modes > > It's always going to be hard for us to compete with dedicated > appliances. Where Linux can shine though is in allowing for more > innovative combinations. > > Being able to do active/active NFS serving from clustered filesystems, > for instance is something that we can eventually attain but that would > be harder to do in an appliance. This sort of discussion might also > dovetail with Benny's proposal about pNFS serving. > > -- > Jeff Layton <jlayton@redhat.com> > -- > To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [LSF/MM TOPIC][ATTEND] linux servers as a storage server - what's missing? 2012-01-24 21:36 ` J. Bruce Fields @ 2012-01-24 23:13 ` Ric Wheeler 2012-01-25 19:05 ` Christopher R. Hertel 2012-01-25 20:25 ` Christopher R. Hertel 0 siblings, 2 replies; 41+ messages in thread From: Ric Wheeler @ 2012-01-24 23:13 UTC (permalink / raw) To: J. Bruce Fields Cc: Jeff Layton, linux-fsdevel, linux-scsi, Jeremy Allison, Simo Sorce, Christopher R. Hertel On 01/24/2012 04:36 PM, J. Bruce Fields wrote: > On Tue, Jan 03, 2012 at 02:26:09PM -0500, Jeff Layton wrote: >> On Wed, 21 Dec 2011 10:59:43 -0500 >> Ric Wheeler<rwheeler@redhat.com> wrote: >> >>> One common thing that I see a lot of these days is an increasing number of >>> platforms that are built on our stack as storage servers. Ranging from the >>> common linux based storage/NAS devices up to various distributed systems. >>> Almost all of them use our common stack - software RAID, LVM, XFS/ext4 and samba. >>> >>> At last year's SNIA developers conference, it was clear that Microsoft is >>> putting a lot of effort into enhancing windows 8 server as a storage server with >>> both support for a pNFS server and of course SMB. I think that linux (+samba) is >>> ahead of the windows based storage appliances today, but they are putting >>> together a very aggressive list of features. >>> >>> I think that it would be useful and interesting to take a slot at this year's >>> LSF to see how we are doing in this space. How large do we need to scale for an >>> appliance? What kind of work is needed (support for the copy offload system >>> call? better support for out of band notifications like those used in "thinly >>> provisioned" SCSI devices? management API's? Ease of use CLI work? SMB2.2 support?). 
>>> >>> The goal would be to see what technical gaps we have that need more active >>> development in, not just a wish list :) >>> >>> Ric >> Unfortunately, w/o a wishlist of sorts, it's hard to know what needs >> more active development ;). >> >> While HCH will probably disagree, being able to support more >> NFSv4/Windows API features at the VFS layer would make it a lot easier >> to do a more unified serving appliance. Right now, both knfsd and samba >> track too much info internally, and that makes it very difficult to >> serve the same data via multiple protocols. > By the way, we could really use a > Windows/Samba expert if we're going to discuss that. > > I don't think their list(s) got the announcement? > > --b. Adding in three windows/samba people that I know of :) Ric >> Off the top of my head, my "wishlist" for better NFSv4 serving would be: >> >> - RichACLs >> - Share/Deny mode support on open >> - mandatory locking that doesn't rely on weirdo file modes >> >> It's always going to be hard for us to compete with dedicated >> appliances. Where Linux can shine though is in allowing for more >> innovative combinations. >> >> Being able to do active/active NFS serving from clustered filesystems, >> for instance is something that we can eventually attain but that would >> be harder to do in an appliance. This sort of discussion might also >> dovetail with Benny's proposal about pNFS serving. >> >> -- >> Jeff Layton<jlayton@redhat.com> >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [LSF/MM TOPIC][ATTEND] linux servers as a storage server - what's missing? 2012-01-24 23:13 ` Ric Wheeler @ 2012-01-25 19:05 ` Christopher R. Hertel 2012-01-25 20:25 ` Christopher R. Hertel 1 sibling, 0 replies; 41+ messages in thread From: Christopher R. Hertel @ 2012-01-25 19:05 UTC (permalink / raw) To: Ric Wheeler Cc: J. Bruce Fields, Jeff Layton, linux-fsdevel, linux-scsi, Jeremy Allison, Simo Sorce Ric, et al., I want to think of these questions from the perspective of the storage systems vendor who is (or may be interested in) using the Linux storage stack in a product. * How do we make things easy for those vendors (without compromising the licensing--theirs and ours)? * How do we ensure that they have all of the features that they need, and can add their own software "skin" to present to the end-user? We have talked about some of these issues already, and I know that there is work being done, but the key will be to make that work consistent so that we are not reinventing wheels but also not forcing customers to use this or that specific piece or part. I think we want to look at our storage stack from the bottom up, and consider what can be done within the kernel and/or in user-space to bring the pieces together. Okay... so all of that is wildly vague and magic-waving-of-hands-ish. The questions raised in the thread below require a bit of thought to answer. Chris -)----- Ric Wheeler wrote: > On 01/24/2012 04:36 PM, J. Bruce Fields wrote: >> On Tue, Jan 03, 2012 at 02:26:09PM -0500, Jeff Layton wrote: >>> On Wed, 21 Dec 2011 10:59:43 -0500 >>> Ric Wheeler<rwheeler@redhat.com> wrote: >>> >>>> One common thing that I see a lot of these days is an increasing >>>> number of >>>> platforms that are built on our stack as storage servers. Ranging >>>> from the >>>> common linux based storage/NAS devices up to various distributed >>>> systems. >>>> Almost all of them use our common stack - software RAID, LVM, >>>> XFS/ext4 and samba. 
>>>> >>>> At last year's SNIA developers conference, it was clear that >>>> Microsoft is >>>> putting a lot of effort into enhancing windows 8 server as a storage >>>> server with >>>> both support for a pNFS server and of course SMB. I think that linux >>>> (+samba) is >>>> ahead of the windows based storage appliances today, but they are >>>> putting >>>> together a very aggressive list of features. >>>> >>>> I think that it would be useful and interesting to take a slot at >>>> this year's >>>> LSF to see how we are doing in this space. How large do we need to >>>> scale for an >>>> appliance? What kind of work is needed (support for the copy >>>> offload system >>>> call? better support for out of band notifications like those used >>>> in "thinly >>>> provisioned" SCSI devices? management API's? Ease of use CLI work? >>>> SMB2.2 support?). >>>> >>>> The goal would be to see what technical gaps we have that need more >>>> active >>>> development in, not just a wish list :) >>>> >>>> Ric >>> Unfortunately, w/o a wishlist of sorts, it's hard to know what needs >>> more active development ;). >>> >>> While HCH will probably disagree, being able to support more >>> NFSv4/Windows API features at the VFS layer would make it a lot easier >>> to do a more unified serving appliance. Right now, both knfsd and samba >>> track too much info internally, and that makes it very difficult to >>> serve the same data via multiple protocols. >> By the way, we could really use a >> Windows/Samba expert if we're going to discuss that. >> >> I don't think their list(s) got the announcement? >> >> --b. > > Adding in three windows/samba people that I know of :) > > Ric > >>> Off the top of my head, my "wishlist" for better NFSv4 serving would be: >>> >>> - RichACLs >>> - Share/Deny mode support on open >>> - mandatory locking that doesn't rely on weirdo file modes >>> >>> It's always going to be hard for us to compete with dedicated >>> appliances. 
Where Linux can shine though is in allowing for more >>> innovative combinations. >>> >>> Being able to do active/active NFS serving from clustered filesystems, >>> for instance is something that we can eventually attain but that would >>> be harder to do in an appliance. This sort of discussion might also >>> dovetail with Benny's proposal about pNFS serving. >>> >>> -- >>> Jeff Layton<jlayton@redhat.com> >>> -- >>> To unsubscribe from this list: send the line "unsubscribe >>> linux-fsdevel" in >>> the body of a message to majordomo@vger.kernel.org >>> More majordomo info at http://vger.kernel.org/majordomo-info.html > -- "Implementing CIFS - the Common Internet FileSystem" ISBN: 013047116X Samba Team -- http://www.samba.org/ -)----- Christopher R. Hertel jCIFS Team -- http://jcifs.samba.org/ -)----- ubiqx development, uninq. ubiqx Team -- http://www.ubiqx.org/ -)----- crh@ubiqx.mn.org OnLineBook -- http://ubiqx.org/cifs/ -)----- crh@ubiqx.org ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [LSF/MM TOPIC][ATTEND] linux servers as a storage server - what's missing? 2012-01-24 23:13 ` Ric Wheeler 2012-01-25 19:05 ` Christopher R. Hertel @ 2012-01-25 20:25 ` Christopher R. Hertel 2012-01-25 21:56 ` Roland Dreier 2012-01-26 11:15 ` Bart Van Assche 1 sibling, 2 replies; 41+ messages in thread From: Christopher R. Hertel @ 2012-01-25 20:25 UTC (permalink / raw) To: Ric Wheeler Cc: J. Bruce Fields, Jeff Layton, linux-fsdevel, linux-scsi, Jeremy Allison, Simo Sorce I have actually been mulling this over a good deal in my mind, but I keep approaching it from different perspectives. Here are some random questions that pop up in my mind... Can we put FibreChannel HBAs in *target* mode into a Linux-based PC and export LUNs via FibreChannel? - Do we support multipath in this configuration? - Zoning? Multi-access LUNs for Clustering? Where are we on FCoE support? Where do we stand on ATAoE? - Coraid, the creators of ATAoE, were recently listed 3rd on a "Most Promising Storage Start-Ups in 2012" list[1]. - ATAoE support has been standard in Linux for years. - The target software is very low-level. It needs an overhaul and a powerful configuration API. What about other block-level protocols? My familiarity with these varies. Chris -)----- [1]http://www.storagenewsletter.com/news/startups/most-promising-storage-start-ups Ric Wheeler wrote: > On 01/24/2012 04:36 PM, J. Bruce Fields wrote: >> On Tue, Jan 03, 2012 at 02:26:09PM -0500, Jeff Layton wrote: >>> On Wed, 21 Dec 2011 10:59:43 -0500 >>> Ric Wheeler<rwheeler@redhat.com> wrote: >>> >>>> One common thing that I see a lot of these days is an increasing >>>> number of >>>> platforms that are built on our stack as storage servers. Ranging >>>> from the >>>> common linux based storage/NAS devices up to various distributed >>>> systems. >>>> Almost all of them use our common stack - software RAID, LVM, >>>> XFS/ext4 and samba. 
>>>> >>>> At last year's SNIA developers conference, it was clear that >>>> Microsoft is >>>> putting a lot of effort into enhancing windows 8 server as a storage >>>> server with >>>> both support for a pNFS server and of course SMB. I think that linux >>>> (+samba) is >>>> ahead of the windows based storage appliances today, but they are >>>> putting >>>> together a very aggressive list of features. >>>> >>>> I think that it would be useful and interesting to take a slot at >>>> this year's >>>> LSF to see how we are doing in this space. How large do we need to >>>> scale for an >>>> appliance? What kind of work is needed (support for the copy >>>> offload system >>>> call? better support for out of band notifications like those used >>>> in "thinly >>>> provisioned" SCSI devices? management API's? Ease of use CLI work? >>>> SMB2.2 support?). >>>> >>>> The goal would be to see what technical gaps we have that need more >>>> active >>>> development in, not just a wish list :) >>>> >>>> Ric >>> Unfortunately, w/o a wishlist of sorts, it's hard to know what needs >>> more active development ;). >>> >>> While HCH will probably disagree, being able to support more >>> NFSv4/Windows API features at the VFS layer would make it a lot easier >>> to do a more unified serving appliance. Right now, both knfsd and samba >>> track too much info internally, and that makes it very difficult to >>> serve the same data via multiple protocols. >> By the way, we could really use a >> Windows/Samba expert if we're going to discuss that. >> >> I don't think their list(s) got the announcement? >> >> --b. > > Adding in three windows/samba people that I know of :) > > Ric > >>> Off the top of my head, my "wishlist" for better NFSv4 serving would be: >>> >>> - RichACLs >>> - Share/Deny mode support on open >>> - mandatory locking that doesn't rely on weirdo file modes >>> >>> It's always going to be hard for us to compete with dedicated >>> appliances. 
Where Linux can shine though is in allowing for more >>> innovative combinations. >>> >>> Being able to do active/active NFS serving from clustered filesystems, >>> for instance is something that we can eventually attain but that would >>> be harder to do in an appliance. This sort of discussion might also >>> dovetail with Benny's proposal about pNFS serving. >>> >>> -- >>> Jeff Layton<jlayton@redhat.com> >>> -- >>> To unsubscribe from this list: send the line "unsubscribe >>> linux-fsdevel" in >>> the body of a message to majordomo@vger.kernel.org >>> More majordomo info at http://vger.kernel.org/majordomo-info.html > -- "Implementing CIFS - the Common Internet FileSystem" ISBN: 013047116X Samba Team -- http://www.samba.org/ -)----- Christopher R. Hertel jCIFS Team -- http://jcifs.samba.org/ -)----- ubiqx development, uninq. ubiqx Team -- http://www.ubiqx.org/ -)----- crh@ubiqx.mn.org OnLineBook -- http://ubiqx.org/cifs/ -)----- crh@ubiqx.org ^ permalink raw reply [flat|nested] 41+ messages in thread
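For reference, the ATAoE target Chris mentions really is minimal: the userspace vblade tool from the aoetools package exports a block device in one command. A sketch, assuming root on both hosts and placeholder device/interface names:

```shell
# On the exporting host: serve /dev/sdb as AoE shelf 0, slot 1 on eth0.
# vbladed is the daemonized form of vblade; there is no config file,
# persistence, or access control beyond MAC filtering options.
vbladed 0 1 eth0 /dev/sdb

# On an initiator: load the aoe driver; the LUN shows up as a block
# device named after its shelf.slot address.
modprobe aoe
aoe-discover          # from aoetools; forces a discovery broadcast
ls /dev/etherd/       # expect e0.1 once the export is seen
```

The one-command simplicity is the appeal, but it also illustrates Chris's complaint: everything past "export one device on one interface" (management API, persistence, monitoring) is left to the integrator.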
* Re: [LSF/MM TOPIC][ATTEND] linux servers as a storage server - what's missing? 2012-01-25 20:25 ` Christopher R. Hertel @ 2012-01-25 21:56 ` Roland Dreier 2012-01-25 22:09 ` Christopher R. Hertel 2012-01-26 11:15 ` Bart Van Assche 1 sibling, 1 reply; 41+ messages in thread From: Roland Dreier @ 2012-01-25 21:56 UTC (permalink / raw) To: Christopher R. Hertel Cc: Ric Wheeler, J. Bruce Fields, Jeff Layton, linux-fsdevel, linux-scsi, Jeremy Allison, Simo Sorce On Wed, Jan 25, 2012 at 12:25 PM, Christopher R. Hertel <crh@samba.org> wrote: > Can we put FibreChannel HBAs in *target* mode into a Linux-based PC and > export LUNs via FibreChannel? > - Do we support multipath in this configuration? > - Zoning? Muli-access LUNs for Clustering? Yes to everything, although the low-level FC driver patches are not quite upstream yet. - R. -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [LSF/MM TOPIC][ATTEND] linux servers as a storage server - what's missing? 2012-01-25 21:56 ` Roland Dreier @ 2012-01-25 22:09 ` Christopher R. Hertel 2012-01-26 21:52 ` Andy Grover 0 siblings, 1 reply; 41+ messages in thread From: Christopher R. Hertel @ 2012-01-25 22:09 UTC (permalink / raw) To: Roland Dreier Cc: Ric Wheeler, J. Bruce Fields, Jeff Layton, linux-fsdevel, linux-scsi, Jeremy Allison, Simo Sorce Roland Dreier wrote: > On Wed, Jan 25, 2012 at 12:25 PM, Christopher R. Hertel <crh@samba.org> wrote: >> Can we put FibreChannel HBAs in *target* mode into a Linux-based PC and >> export LUNs via FibreChannel? >> - Do we support multipath in this configuration? >> - Zoning? Muli-access LUNs for Clustering? > > Yes to everything, although the low-level FC driver patches are not > quite upstream yet. Cool! Thanks for the update. I assume that the management API for this is similar to or the same as the FCoE API? That would make it easier to produce a consistent look-and-feel in whatever end-user management tools a third party may create. Chris -)----- -- "Implementing CIFS - the Common Internet FileSystem" ISBN: 013047116X Samba Team -- http://www.samba.org/ -)----- Christopher R. Hertel jCIFS Team -- http://jcifs.samba.org/ -)----- ubiqx development, uninq. ubiqx Team -- http://www.ubiqx.org/ -)----- crh@ubiqx.mn.org OnLineBook -- http://ubiqx.org/cifs/ -)----- crh@ubiqx.org ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [LSF/MM TOPIC][ATTEND] linux servers as a storage server - what's missing? 2012-01-25 22:09 ` Christopher R. Hertel @ 2012-01-26 21:52 ` Andy Grover 0 siblings, 0 replies; 41+ messages in thread From: Andy Grover @ 2012-01-26 21:52 UTC (permalink / raw) To: Christopher R. Hertel Cc: Roland Dreier, Ric Wheeler, J. Bruce Fields, Jeff Layton, linux-fsdevel, linux-scsi, Jeremy Allison, Simo Sorce On 01/25/2012 02:09 PM, Christopher R. Hertel wrote: > Roland Dreier wrote: >> On Wed, Jan 25, 2012 at 12:25 PM, Christopher R. Hertel <crh@samba.org> wrote: >>> Can we put FibreChannel HBAs in *target* mode into a Linux-based PC and >>> export LUNs via FibreChannel? >>> - Do we support multipath in this configuration? >>> - Zoning? Muli-access LUNs for Clustering? >> >> Yes to everything, although the low-level FC driver patches are not >> quite upstream yet. > > Cool! > Thanks for the update. > > I assume that the management API for this is similar to or the same as the > FCoE API? That would make it easier to produce a consistent look-and-feel > in whatever end-user management tools a third-party my create. If you're talking about the kernel target stuff, you need to do the FCoE-specific setup using the same fcoe tools, and then configure the kernel target via configfs, so that's probably going to be a little different. Regards -- Andy ^ permalink raw reply [flat|nested] 41+ messages in thread
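To illustrate Andy's point, raw configfs configuration of the in-kernel target looks nothing like the fcoe tools. A minimal backstore setup is sketched below; the directory and attribute names follow the LIO target's configfs layout of this era and may differ between kernel versions, and the backing-file path and size are made up:

```shell
# Create a fileio backstore by hand through configfs; higher-level tools
# such as targetcli wrap exactly these operations.
modprobe target_core_mod
mkdir -p /sys/kernel/config/target/core/fileio_0/disk01

# Point the backstore at a 1 GiB backing file and bring it online.
echo "fd_dev_name=/srv/disk01.img,fd_dev_size=1073741824" \
    > /sys/kernel/config/target/core/fileio_0/disk01/control
echo 1 > /sys/kernel/config/target/core/fileio_0/disk01/enable

# Fabric modules (iscsi, tcm_fc for FC/FCoE, ...) get their own trees
# under /sys/kernel/config/target/, where LUNs are symlinked back to
# backstores like the one above.
```

So a unified management story would have to bridge two models: netlink/ioctl-driven fcoe utilities for the transport, and mkdir/echo-driven configfs for the target itself.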
* Re: [LSF/MM TOPIC][ATTEND] linux servers as a storage server - what's missing? 2012-01-25 20:25 ` Christopher R. Hertel 2012-01-25 21:56 ` Roland Dreier @ 2012-01-26 11:15 ` Bart Van Assche 1 sibling, 0 replies; 41+ messages in thread From: Bart Van Assche @ 2012-01-26 11:15 UTC (permalink / raw) To: Christopher R. Hertel Cc: Ric Wheeler, J. Bruce Fields, Jeff Layton, linux-fsdevel, linux-scsi, Jeremy Allison, Simo Sorce On Wed, Jan 25, 2012 at 9:25 PM, Christopher R. Hertel <crh@samba.org> wrote: > Where are we on FCoE support? Initiator and target mode should work, except that VN2VN mode is broken. See also http://lists.open-fcoe.org/pipermail/devel/2011-December/011892.html. Bart. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [LSF/MM TOPIC] linux servers as a storage server - what's missing? 2011-12-21 15:59 [LSF/MM TOPIC] linux servers as a storage server - what's missing? Ric Wheeler 2011-12-22 8:14 ` Shyam_Iyer 2012-01-03 19:26 ` [LSF/MM TOPIC][ATTEND] linux servers as a storage server - what's missing? Jeff Layton @ 2012-01-18 17:00 ` Roland Dreier 2012-01-18 17:51 ` Ric Wheeler 2 siblings, 1 reply; 41+ messages in thread From: Roland Dreier @ 2012-01-18 17:00 UTC (permalink / raw) To: Ric Wheeler; +Cc: linux-fsdevel, linux-scsi On Wed, Dec 21, 2011 at 7:59 AM, Ric Wheeler <rwheeler@redhat.com> wrote: > One common thing that I see a lot of these days is an increasing number of > platforms that are built on our stack as storage servers. Ranging from the > common linux based storage/NAS devices up to various distributed systems. > Almost all of them use our common stack - software RAID, LVM, XFS/ext4 and > samba. > > At last year's SNIA developers conference, it was clear that Microsoft is > putting a lot of effort into enhancing windows 8 server as a storage server > with both support for a pNFS server and of course SMB. I think that linux > (+samba) is ahead of the windows based storage appliances today, but they > are putting together a very aggressive list of features. > > I think that it would be useful and interesting to take a slot at this > year's LSF to see how we are doing in this space. How large do we need to > scale for an appliance? What kind of work is needed (support for the copy > offload system call? better support for out of band notifications like those > used in "thinly provisioned" SCSI devices? management API's? Ease of use CLI > work? SMB2.2 support?). > > The goal would be to see what technical gaps we have that need more active > development in, not just a wish list :) I see a technical gap in the robustness of our basic SCSI/block stack. 
In a pretty standard low to midrange setup, ie standard server with a couple of SAS HBAs connected to an external SAS JBOD, it's quite easy to run into problems like oopses or other issues that kill the whole system, even from faults that should affect only part of the system. For example losing one path to the JBOD, or losing one drive, or having a SCSI reservation conflict can lead to the whole system crashing. Which is not good for an HA storage server built on redundant hardware. - R. -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [LSF/MM TOPIC] linux servers as a storage server - what's missing? 2012-01-18 17:00 ` [LSF/MM TOPIC] " Roland Dreier @ 2012-01-18 17:51 ` Ric Wheeler 2012-01-18 18:46 ` Roland Dreier 0 siblings, 1 reply; 41+ messages in thread From: Ric Wheeler @ 2012-01-18 17:51 UTC (permalink / raw) To: Roland Dreier; +Cc: linux-fsdevel, linux-scsi On 01/18/2012 12:00 PM, Roland Dreier wrote: > On Wed, Dec 21, 2011 at 7:59 AM, Ric Wheeler<rwheeler@redhat.com> wrote: >> One common thing that I see a lot of these days is an increasing number of >> platforms that are built on our stack as storage servers. Ranging from the >> common linux based storage/NAS devices up to various distributed systems. >> Almost all of them use our common stack - software RAID, LVM, XFS/ext4 and >> samba. >> >> At last year's SNIA developers conference, it was clear that Microsoft is >> putting a lot of effort into enhancing windows 8 server as a storage server >> with both support for a pNFS server and of course SMB. I think that linux >> (+samba) is ahead of the windows based storage appliances today, but they >> are putting together a very aggressive list of features. >> >> I think that it would be useful and interesting to take a slot at this >> year's LSF to see how we are doing in this space. How large do we need to >> scale for an appliance? What kind of work is needed (support for the copy >> offload system call? better support for out of band notifications like those >> used in "thinly provisioned" SCSI devices? management API's? Ease of use CLI >> work? SMB2.2 support?). >> >> The goal would be to see what technical gaps we have that need more active >> development in, not just a wish list :) > I see a technical gap in the robustness of our basic SCSI/block stack. 
In a > pretty standard low to midrange setup, ie standard server with a couple of SAS > HBAs connected to an external SAS JBOD, it's quite easy to run into problems > like oopses or other issues that kill the whole system, even from faults that > should affect only part of the system. For example losing one path to the JBOD, > or losing one drive, or having a SCSI reservation conflict can lead to the whole > system crashing. > > Which is not good for an HA storage server built on redundant hardware. > > - R. Why would you crash if you have device-mapper multipath configured to handle path failover? We have tons of enterprise customers that use that... On the broader topic of error handling and so on, I do agree that it is always an area of concern (how many times to retry, how long timeouts need to be, when to panic/reboot or propagate an error code up). ric ^ permalink raw reply [flat|nested] 41+ messages in thread
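The dm-multipath setup Ric refers to is driven by /etc/multipath.conf; a minimal sketch of an active/passive failover configuration might look like the following (the WWID and alias are placeholders, not values from this thread):

```conf
# Hypothetical /etc/multipath.conf fragment -- WWID and alias are examples only.
defaults {
    user_friendly_names yes
}

multipaths {
    multipath {
        wwid                 3600508b4000156d700012000000b0000  # placeholder WWID
        alias                jbod0
        path_grouping_policy failover   # one active path, others on standby
        no_path_retry        queue      # queue I/O while all paths are down
    }
}
```

With `path_grouping_policy failover`, losing one path fails I/O over to the standby path rather than returning errors upward; whether the kernel survives the underlying sd hot-unplug is exactly the robustness question Roland raises below.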
* Re: [LSF/MM TOPIC] linux servers as a storage server - what's missing? 2012-01-18 17:51 ` Ric Wheeler @ 2012-01-18 18:46 ` Roland Dreier 2012-01-18 18:51 ` Bart Van Assche 2012-01-19 17:50 ` Loke, Chetan 0 siblings, 2 replies; 41+ messages in thread From: Roland Dreier @ 2012-01-18 18:46 UTC (permalink / raw) To: Ric Wheeler; +Cc: linux-fsdevel, linux-scsi > Why would you crash is you have device mapper multipath configured to handle > path fail over? We have tons of enterprise customers that use that... cf http://www.spinics.net/lists/linux-scsi/msg56254.html Basically hot unplug of an sdX can oops on any recent kernel, no matter what dm stuff you have on top. > On the broader topic of error handling and so on, I do agree that is always > an area of concern (how many times to retry, how long time outs need to be, > when to panic/reboot or propagate up an error code) Yes, especially the scsi eh stuff escalating to a host reset when a single drive has gone bad -- even if the HBA is happily doing IO to other drives, we'll kill access to the whole SAS fabric. - R. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [LSF/MM TOPIC] linux servers as a storage server - what's missing? 2012-01-18 18:46 ` Roland Dreier @ 2012-01-18 18:51 ` Bart Van Assche 2012-01-18 19:00 ` Roland Dreier 2012-01-19 8:16 ` Rolf Eike Beer 1 sibling, 2 replies; 41+ messages in thread From: Bart Van Assche @ 2012-01-18 18:51 UTC (permalink / raw) To: Roland Dreier; +Cc: Ric Wheeler, linux-fsdevel, linux-scsi On Wed, Jan 18, 2012 at 6:46 PM, Roland Dreier <roland@purestorage.com> wrote: > > Why would you crash is you have device mapper multipath configured to handle > > path fail over? We have tons of enterprise customers that use that... > > cf http://www.spinics.net/lists/linux-scsi/msg56254.html > > Basically hot unplug of an sdX can oops on any recent kernel, no > matter what dm stuff you have on top. > > > On the broader topic of error handling and so on, I do agree that is always > > an area of concern (how many times to retry, how long time outs need to be, > > when to panic/reboot or propagate up an error code) > > Yes, especially the scsi eh stuff escalating to a host reset when > a single drive has gone bad -- even if the HBA is happily doing IO > to other drives, we'll kill access to the whole SAS fabric. With which SCSI low-level driver does that occur, and what does the call stack look like? I haven't encountered any such issues while testing the srp-ha patch set. However, I have to admit that the issues mentioned in the description of commit 3308511 were discovered while testing the srp-ha patch set. Bart. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [LSF/MM TOPIC] linux servers as a storage server - what's missing? 2012-01-18 18:51 ` Bart Van Assche @ 2012-01-18 19:00 ` Roland Dreier 2012-01-19 8:16 ` Rolf Eike Beer 1 sibling, 0 replies; 41+ messages in thread From: Roland Dreier @ 2012-01-18 19:00 UTC (permalink / raw) To: Bart Van Assche; +Cc: Ric Wheeler, linux-fsdevel, linux-scsi On Wed, Jan 18, 2012 at 10:51 AM, Bart Van Assche <bvanassche@acm.org> wrote: >> cf http://www.spinics.net/lists/linux-scsi/msg56254.html > With which SCSI low-level diver does that occur and how does the call > stack look like ? I haven't encountered any such issues while testing > the srp-ha patch set. However, I have to admit that the issues > mentioned in the description of commit 3308511 were discovered while > testing the srp-ha patch set. I assume you're asking about the crash on hot unplug... Anyway, I've seen it with mpt2sas, looking like
[<ffffffff8130078b>] scsi_setup_blk_pc_cmnd+0x2b/0x170
[<ffffffff8130091d>] scsi_prep_fn+0x4d/0x60
[<ffffffff81227794>] blk_peek_request+0xb4/0x240
[<ffffffff813010cd>] scsi_request_fn+0x47d/0x4b0
[<ffffffff81223bff>] __blk_run_queue+0x1f/0x80
[<ffffffff8122b808>] blk_execute_rq_nowait+0x68/0xc0
in my case I think there is some SG_IO stuff in flight for SCSI reservations; the cdrom case I linked to also has scsi_setup_blk_pc_cmnd in the trace and mentions an ioctl for medium presence. So maybe regular block IO is OK now but not non-IO commands. - R. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [LSF/MM TOPIC] linux servers as a storage server - what's missing? 2012-01-18 18:51 ` Bart Van Assche 2012-01-18 19:00 ` Roland Dreier @ 2012-01-19 8:16 ` Rolf Eike Beer 1 sibling, 0 replies; 41+ messages in thread From: Rolf Eike Beer @ 2012-01-19 8:16 UTC (permalink / raw) To: Bart Van Assche; +Cc: Roland Dreier, Ric Wheeler, linux-fsdevel, linux-scsi > On Wed, Jan 18, 2012 at 6:46 PM, Roland Dreier <roland@purestorage.com> > wrote: >> > Why would you crash is you have device mapper multipath configured to >> handle >> > path fail over? We have tons of enterprise customers that use that... >> >> cf http://www.spinics.net/lists/linux-scsi/msg56254.html >> >> Basically hot unplug of an sdX can oops on any recent kernel, no >> matter what dm stuff you have on top. >> >> > On the broader topic of error handling and so on, I do agree that is >> always >> > an area of concern (how many times to retry, how long time outs need >> to be, >> > when to panic/reboot or propagate up an error code) >> >> Yes, especially the scsi eh stuff escalating to a host reset when >> a single drive has gone bad -- even if the HBA is happily doing IO >> to other drives, we'll kill access to the whole SAS fabric. > > With which SCSI low-level diver does that occur and how does the call > stack look like ? I haven't encountered any such issues while testing > the srp-ha patch set. However, I have to admit that the issues > mentioned in the description of commit 3308511 were discovered while > testing the srp-ha patch set. Likely unrelated to the stuff above, but this has happened for me. 
I was changing the USB devices while sending the machine to s2disk and this was what it came up with on resume:

[91794.875373] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
[91794.875385] IP: [<ffffffff813c46c1>] sd_revalidate_disk+0x31/0x320
[91794.875396] PGD 3fe33f067 PUD 3fff84067 PMD 0
[91794.875403] Oops: 0000 [#1] PREEMPT SMP
[91794.875410] CPU 7
[91794.875412] Modules linked in: autofs4 fuse ip6t_LOG xt_tcpudp xt_pkttype ipt_LOG xt_limit af_packet edd ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_raw xt_NOTRACK ipt_REJECT iptable_raw iptable_filter ip6table_mangle nf_conntrack_netbios_ns nf_conntrack_broadcast nf_conntrack_ipv4 nf_defrag_ipv4 ip_tables xt_conntrack nf_conntrack ip6table_filter ip6_tables x_tables snd_pcm_oss snd_mixer_oss snd_seq snd_seq_device cpufreq_conservative cpufreq_userspace cpufreq_powersave acpi_cpufreq mperf snd_hda_codec_hdmi snd_hda_codec_realtek pl2303 usbserial kvm_intel kvm snd_hda_intel e1000e snd_hda_codec iTCO_wdt shpchp mei(C) xhci_hcd i2c_i801 pci_hotplug iTCO_vendor_support snd_hwdep snd_pcm snd_timer snd soundcore snd_page_alloc sr_mod cdrom sg serio_raw pcspkr linear raid456 async_raid6_recov async_pq raid6_pq async_xor xor async_memcpy async_tx raid10 raid1 raid0 i915 drm_kms_helper drm i2c_algo_bit button video dm_snapshot dm_mod fan processor thermal thermal_sys pata_amd ata_generic sata_nv [last unloaded: preloadtrace]
[91794.875522]
[91794.875525] Pid: 5242, comm: udisks-daemon Tainted: G C 3.1.0-46-desktop #1 /DH67CL
[91794.875534] RIP: 0010:[<ffffffff813c46c1>] [<ffffffff813c46c1>] sd_revalidate_disk+0x31/0x320
[91794.875543] RSP: 0018:ffff88040399dbb8 EFLAGS: 00010293
[91794.875547] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000001
[91794.875552] RDX: ffff8803fa9ba740 RSI: ffff8803fa9ba760 RDI: ffff8800d3975c00
[91794.875557] RBP: ffff8800d3975c00 R08: ffff88040399db84 R09: ffff8803fb546400
[91794.875561] R10: 0000000000000001 R11: 0000000000000001 R12: 00000000ffffff85
[91794.875565] R13: ffff88041efcb818 R14: ffff8800d3975c00 R15: ffff88040399dc08
[91794.875718] FS: 00007fb7921067a0(0000) GS:ffff88041fbc0000(0000) knlGS:0000000000000000
[91794.875863] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[91794.876016] CR2: 0000000000000008 CR3: 00000003fe33e000 CR4: 00000000000406e0
[91794.876172] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[91794.876321] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[91794.876473] Process udisks-daemon (pid: 5242, threadinfo ffff88040399c000, task ffff8804035fa500)
[91794.876596] done.
[91794.876772] Stack:
[91794.876774]  ffff88040399dc08 ffff88041efcb800 0000000000000000 00000000ffffff85
[91794.876777]  ffff88041efcb818 ffffffff811c7a98 ffff88041efcb800 000000001efcb800
[91794.876779]  ffff8800d3975c78 ffff8800d3975c0c ffff8800d3975c00 0000000000000000
[91794.876782] Call Trace:
[91794.876791]  [<ffffffff811c7a98>] rescan_partitions+0xa8/0x320
[91794.876797]  [<ffffffff811928ee>] __blkdev_get+0x2be/0x420
[91794.876802]  [<ffffffff81192ab2>] blkdev_get+0x62/0x2d0
[91794.876807]  [<ffffffff81159ffa>] __dentry_open+0x23a/0x3f0
[91794.876812]  [<ffffffff8116b668>] do_last+0x3f8/0x7b0
[91794.876816]  [<ffffffff8116bb4b>] path_openat+0xdb/0x400
[91794.876819]  [<ffffffff8116bedd>] do_filp_open+0x4d/0xc0
[91794.876823]  [<ffffffff8115b511>] do_sys_open+0x101/0x1e0
[91794.876827]  [<ffffffff815ae692>] system_call_fastpath+0x16/0x1b
[91794.876840]  [<00007fb79189fb20>] 0x7fb79189fb1f
[91794.876841] Code: 86 b0 9e 00 48 89 6c 24 10 48 89 5c 24 08 48 89 fd 4c 89 64 24 18 4c 89 6c 24 20 c1 e8 15 48 8b 9f 28 03 00 00 83 e0 07 83 f8 03 <4c> 8b 63 08 0f 87 8e 02 00 00 41 8b 84 24 50 06 00 00 31 d2 83
[91794.876857] RIP [<ffffffff813c46c1>] sd_revalidate_disk+0x31/0x320
[91794.876860] RSP <ffff88040399dbb8>
[91794.876861] CR2: 0000000000000008

Kernel is from openSuSE 12.1: Linux devpool02 3.1.0-46-desktop #1 SMP PREEMPT Mon Oct 24 20:49:37 UTC 2011 (1cba112) x86_64 x86_64 x86_64 GNU/Linux

Greetings,

Eike ^ permalink raw reply [flat|nested] 41+ messages in thread
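When triaging reports like Eike's, it often helps to strip an oops down to just its call chain. A small sed filter can do that (a sketch, assuming the usual `[<address>] symbol+0xoff/0xlen` frame format of dmesg output):

```shell
# Print just the function names from the call-trace frames of a dmesg-style
# oops.  Frames look like:
#   [91794.876791]  [<ffffffff811c7a98>] rescan_partitions+0xa8/0x320
extract_calls() {
  sed -n 's/.*\[<[0-9a-f]*>\] \([A-Za-z_][A-Za-z0-9_.]*\)+0x.*/\1/p'
}

# Example, using two frames from the trace above:
extract_calls <<'EOF'
[91794.876791]  [<ffffffff811c7a98>] rescan_partitions+0xa8/0x320
[91794.876797]  [<ffffffff811928ee>] __blkdev_get+0x2be/0x420
EOF
# prints:
# rescan_partitions
# __blkdev_get
```

In practice you would pipe the whole dmesg capture through `extract_calls`; lines without a `[<address>] symbol+offset` frame (module list, register dump) are silently skipped.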
* RE: [LSF/MM TOPIC] linux servers as a storage server - what's missing? 2012-01-18 18:46 ` Roland Dreier 2012-01-18 18:51 ` Bart Van Assche @ 2012-01-19 17:50 ` Loke, Chetan 1 sibling, 0 replies; 41+ messages in thread From: Loke, Chetan @ 2012-01-19 17:50 UTC (permalink / raw) To: Roland Dreier; +Cc: linux-fsdevel, linux-scsi > From: linux-scsi-owner@vger.kernel.org [mailto:linux-scsi- > owner@vger.kernel.org] On Behalf Of Roland Dreier > Sent: January 18, 2012 1:46 PM > To: Ric Wheeler > Cc: linux-fsdevel@vger.kernel.org; linux-scsi@vger.kernel.org > Subject: Re: [LSF/MM TOPIC] linux servers as a storage server - what's > missing? > > > Why would you crash is you have device mapper multipath configured to > handle > > path fail over? We have tons of enterprise customers that use that... > > cf http://www.spinics.net/lists/linux-scsi/msg56254.html > > Basically hot unplug of an sdX can oops on any recent kernel, no > matter what dm stuff you have on top. Depends on the use-case. In the particular (SAN) use-case that I'm talking about, you shouldn't unplug a live sdX which is still mapped on the target (array) side. I've attempted to fix it somewhere and it works. If you need more info about eh and initiator fixes, email me offline. When would this case arise? a) You unknowingly unmap *hot* LUNs from the target (array) while managing the array. End result: a reset storm on the front-end. You just need a smarter way to isolate/localize this front-end storm. Chetan Loke ^ permalink raw reply [flat|nested] 41+ messages in thread
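The reset storm Chetan describes can often be avoided operationally by tearing the SCSI device down on the host *before* unmapping the LUN on the array, so the initiator never fires error recovery at a vanished LUN. A sketch using the standard sysfs delete hook follows; the `SYSFS_ROOT` override and the helper name are my own additions so the function can be exercised against a fake tree rather than real hardware:

```shell
# Remove an sd device from the host before its LUN is unmapped on the array.
# Writing 1 to /sys/block/<dev>/device/delete is the standard SCSI removal
# hook; SYSFS_ROOT is overridable here purely so the sketch is testable.
SYSFS_ROOT=${SYSFS_ROOT:-/sys}

remove_sd() {
  dev=$1
  node="$SYSFS_ROOT/block/$dev/device/delete"
  if [ ! -e "$node" ]; then
    echo "remove_sd: no such device: $dev" >&2
    return 1
  fi
  # Flush dirty pages first, then ask the kernel to tear the device down.
  sync
  echo 1 > "$node"
}

# Usage on a real system (as root), before unmapping the LUN on the array:
#   remove_sd sdc
```

This only addresses the planned-unmapping case (a) above; an unexpected unmap still lands in the error handler, which is where the isolation work Chetan mentions would be needed.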
end of thread, other threads:[~2012-01-26 22:29 UTC | newest]

Thread overview: 41+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-12-21 15:59 [LSF/MM TOPIC] linux servers as a storage server - what's missing? Ric Wheeler
2011-12-22  8:14 ` Shyam_Iyer
2011-12-22 15:58 ` Vivek Goyal
2011-12-22 20:54 ` Shyam_Iyer
2011-12-23  3:06 ` Vivek Goyal
2011-12-23  4:35 ` Shyam_Iyer
2012-01-09 12:18 ` Hannes Reinecke
2012-01-09 12:59 ` Tom Coughlan
2012-01-10  6:53 ` Ric Wheeler
2012-01-20  8:55 ` Hannes Reinecke
2012-01-19 16:17 ` [LSF/MM TOPIC] linux servers as a storage server - what'smissing? Loke, Chetan
2012-01-19 16:19 ` Ric Wheeler
2012-01-19 16:26 ` Loke, Chetan
2012-01-19 16:29 ` Ric Wheeler
2012-01-19 17:32 ` Loke, Chetan
2012-01-19 17:44 ` Ric Wheeler
2012-01-19 21:30 ` Loke, Chetan
2012-01-19 21:39 ` Ric Wheeler
2012-01-24 17:05 ` Loke, Chetan
2012-01-24 18:13 ` Ric Wheeler
2012-01-26 22:24 ` Dave Chinner
2012-01-26 22:29 ` Ric Wheeler
2012-01-03 19:26 ` [LSF/MM TOPIC][ATTEND] linux servers as a storage server - what's missing? Jeff Layton
2012-01-03 19:32 ` Chuck Lever
2012-01-17 21:16 ` J. Bruce Fields
2012-01-17 21:25 ` Chuck Lever
2012-01-24 21:36 ` J. Bruce Fields
2012-01-24 23:13 ` Ric Wheeler
2012-01-25 19:05 ` Christopher R. Hertel
2012-01-25 20:25 ` Christopher R. Hertel
2012-01-25 21:56 ` Roland Dreier
2012-01-25 22:09 ` Christopher R. Hertel
2012-01-26 21:52 ` Andy Grover
2012-01-26 11:15 ` Bart Van Assche
2012-01-18 17:00 ` [LSF/MM TOPIC] " Roland Dreier
2012-01-18 17:51 ` Ric Wheeler
2012-01-18 18:46 ` Roland Dreier
2012-01-18 18:51 ` Bart Van Assche
2012-01-18 19:00 ` Roland Dreier
2012-01-19  8:16 ` Rolf Eike Beer
2012-01-19 17:50 ` Loke, Chetan