* Issuing custom IOCTLs to SCSI LLD
@ 2016-05-09  9:27 Naveen
  2016-05-11  4:18 ` Naveen
  0 siblings, 1 reply; 10+ messages in thread
From: Naveen @ 2016-05-09  9:27 UTC (permalink / raw)
  To: ceph-devel

Hi,

I'm new to Ceph and trying to find out whether, on CentOS, there is a way 
to issue IOCTLs or similar custom requests (other than fops requests such 
as OS-issued reads/writes) from the Ceph OSD/RBD down to the underlying 
SCSI LLD. From an initial reading of the docs, it looks like client-issued 
I/Os would reach the actual device through the I/O stack as below:

Client I/O req
 VFS
  ceph fs
   Ceph OSD daemon
     RBD
      SCSI midlayer
       SCSI LLD (say mpt3sas)
         Disk

Ref: 
https://en.wikipedia.org/wiki/NVM_Express#/media/File:IO_stack_of_the_Linux_kernel.svg

My question is:
The SCSI LLD supports the read/write entry points for I/O requests issued 
by the filesystem/block layer, but it also supports custom requests via 
IOCTLs. So how can Ceph support issuing such IOCTL requests to the device 
when a user asks for them? Power-cycling the drive is one example; it 
could also be a pass-through request sent down to the device.
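
For concreteness, by pass-through I mean something along the lines of the 
SG_IO ioctl that tools such as sg3_utils use. Below is a rough, untested 
sketch of that kind of request (a plain INQUIRY; the device name is only 
an example) issued straight at the device node, which is exactly the sort 
of bypass I would like to avoid:

/* Rough sketch only: issue a standard INQUIRY via the SG_IO pass-through
 * ioctl, straight to the SCSI device node rather than through any Ceph
 * layer.  Error handling is minimal; /dev/sda is just an example. */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <scsi/sg.h>

int main(void)
{
    unsigned char cdb[6]    = { 0x12, 0, 0, 0, 96, 0 };  /* INQUIRY, 96 bytes */
    unsigned char buf[96]   = { 0 };
    unsigned char sense[32] = { 0 };
    struct sg_io_hdr hdr;

    int fd = open("/dev/sda", O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    memset(&hdr, 0, sizeof(hdr));
    hdr.interface_id    = 'S';
    hdr.cmd_len         = sizeof(cdb);
    hdr.cmdp            = cdb;
    hdr.dxfer_direction = SG_DXFER_FROM_DEV;
    hdr.dxferp          = buf;
    hdr.dxfer_len       = sizeof(buf);
    hdr.sbp             = sense;
    hdr.mx_sb_len       = sizeof(sense);
    hdr.timeout         = 5000;             /* milliseconds */

    if (ioctl(fd, SG_IO, &hdr) < 0) { perror("SG_IO"); return 1; }
    printf("vendor/product: %.24s\n", buf + 8);  /* bytes 8..31 of INQUIRY data */
    return 0;
}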

Appreciate the help and thanks in advance.

thanks & regards,
naveen



* Re: Issuing custom IOCTLs to SCSI LLD
  2016-05-09  9:27 Issuing custom IOCTLs to SCSI LLD Naveen
@ 2016-05-11  4:18 ` Naveen
  2016-05-11  5:27   ` Christoph Hellwig
  2016-05-11 12:41   ` Sage Weil
  0 siblings, 2 replies; 10+ messages in thread
From: Naveen @ 2016-05-11  4:18 UTC (permalink / raw)
  To: ceph-devel



Naveen <Naveen.Chandrasekaran <at> radisys.com> writes:

> 
> Hi,
> 
> I'm new to ceph and trying to find if on CentOS there is a way to issue 
> IOCTL or similar custom requests (other than fops request like OS issued 
> read/write requests) from ceph OSD/RBD to the underlying SCSI LLD. 
> From initial reading of docs it looks like in the I/O stack user/client 
> issued I/Os would reach the actual device as below:
> 
> Client I/O req
>  VFS
>   ceph fs
>    Ceph OSD daemon
>      RBD
>       SCSI midlayer
>        SCSI LLD (say mpt3sas)
>          Disk
> 
> Ref: 
> https://en.wikipedia.org/wiki/NVM_Express#/media/File:IO_stack_of_the_Linux_kernel.svg
> 
> My question is:
> The SCSI LLD would support both read /write entry points for I/O 
> requests issued by the filesystem/block I/O but they also support some
> custom requests using IOCTLs. So how can ceph support issuing of such 
> IOCTL requests to the device if user issues such request. Say for 
> example power cycling the drive etc. It can also be a passthro request 
> down to the device.
> 
> Appreciate the help and thanks in advance.
> 


Hi,

Still waiting for a response. Can any of the experts clarify this?

thanks,
naveen





* Re: Issuing custom IOCTLs to SCSI LLD
  2016-05-11  4:18 ` Naveen
@ 2016-05-11  5:27   ` Christoph Hellwig
  2016-05-12  8:52     ` Naveen
  2016-05-11 12:41   ` Sage Weil
  1 sibling, 1 reply; 10+ messages in thread
From: Christoph Hellwig @ 2016-05-11  5:27 UTC (permalink / raw)
  To: Naveen; +Cc: ceph-devel

Short answer is: don't do it.  Use the SES interface for your example,
or ensure a proper API exists if the example was just made up.
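
For the locate/identify sort of case the kernel's ses driver already 
exposes enclosure slots under /sys/class/enclosure, so no custom LLD ioctl 
is needed. A rough, untested sketch of driving a slot's locate LED from 
userspace (the slot path is only an example and assumes the disk sits 
behind a SES-capable enclosure):

/* Rough sketch: turn on the locate LED of an enclosure slot through the
 * ses driver's sysfs interface.  Real slot names depend on the enclosure. */
#include <stdio.h>

int main(void)
{
    const char *path = "/sys/class/enclosure/0:0:8:0/Slot 3/locate"; /* example */
    FILE *f = fopen(path, "w");
    if (!f) { perror("fopen"); return 1; }
    fputs("1", f);          /* "1" = locate LED on, "0" = off */
    fclose(f);
    return 0;
}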


* Re: Issuing custom IOCTLs to SCSI LLD
  2016-05-11  4:18 ` Naveen
  2016-05-11  5:27   ` Christoph Hellwig
@ 2016-05-11 12:41   ` Sage Weil
  2016-05-12  9:32     ` Naveen
  1 sibling, 1 reply; 10+ messages in thread
From: Sage Weil @ 2016-05-11 12:41 UTC (permalink / raw)
  To: Naveen; +Cc: ceph-devel

On Wed, 11 May 2016, Naveen wrote:
> Naveen <Naveen.Chandrasekaran <at> radisys.com> writes:
> > Hi,
> > 
> > I'm new to ceph and trying to find if on CentOS there is a way to issue 
> > IOCTL or similar custom requests (other than fops request like OS issued 
> > read/write requests) from ceph OSD/RBD to the underlying SCSI LLD. 
> > From initial reading of docs it looks like in the I/O stack user/client 
> > issued I/Os would reach the actual device as below:
> > 
> > Client I/O req
> >  VFS
> >   ceph fs
> >    Ceph OSD daemon
> >      RBD
> >       SCSI midlayer
> >        SCSI LLD (say mpt3sas)
> >          Disk
> > 
> > Ref: 
> > https://en.wikipedia.org/wiki/NVM_Express#/media/File:IO_stack_of_the_Linux_kernel.svg
> > 
> > My question is:
> > The SCSI LLD would support both read /write entry points for I/O 
> > requests issued by the filesystem/block I/O but they also support some
> > custom requests using IOCTLs. So how can ceph support issuing of such 
> > IOCTL requests to the device if user issues such request. Say for 
> > example power cycling the drive etc. It can also be a passthro request 
> > down to the device.

Can you give an example of such an operation?

In general, any operation is generalized at the librados level. 
For example, in order to get write same and cmpxchg block operations, we 
added librados operations with similar semantics and implemented them there.  
It is unlikely that passing a complex operation down to the SCSI layer 
will work in unison with the other steps involved in committing 
an operation (e.g., updating metadata indicating the object 
version has changed).
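
To make that concrete, here is a rough, untested sketch of a compound 
operation through the librados C API (pool and object names are made up). 
The xattr comparison and the write are submitted as a single op and applied 
atomically by the OSD together with its own metadata updates, which is the 
property a raw SCSI pass-through underneath the OSD could not preserve:

/* Rough sketch (untested): a compound librados write operation.  The
 * xattr comparison and the write below travel as one op and the OSD
 * applies them atomically.  Link with -lrados. */
#include <rados/librados.h>
#include <stdio.h>

int main(void)
{
    rados_t cluster;
    rados_ioctx_t ioctx;
    rados_write_op_t op;
    const char data[] = "new contents";
    int r;

    rados_create(&cluster, NULL);            /* default client id */
    rados_conf_read_file(cluster, NULL);     /* read the usual ceph.conf */
    if ((r = rados_connect(cluster)) < 0) { fprintf(stderr, "connect: %d\n", r); return 1; }
    rados_ioctx_create(cluster, "rbd", &ioctx);   /* pool name is an example */

    op = rados_create_write_op();
    /* only proceed if the object's "version" xattr still equals "1" */
    rados_write_op_cmpxattr(op, "version", LIBRADOS_CMPXATTR_OP_EQ, "1", 1);
    rados_write_op_write(op, data, sizeof(data) - 1, 0);
    r = rados_write_op_operate(op, ioctx, "some-object", NULL, 0);
    rados_release_write_op(op);

    printf("compound op returned %d\n", r);
    rados_ioctx_destroy(ioctx);
    rados_shutdown(cluster);
    return 0;
}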

sage



* Re: Issuing custom IOCTLs to SCSI LLD
  2016-05-11  5:27   ` Christoph Hellwig
@ 2016-05-12  8:52     ` Naveen
  0 siblings, 0 replies; 10+ messages in thread
From: Naveen @ 2016-05-12  8:52 UTC (permalink / raw)
  To: ceph-devel

Christoph Hellwig <hch <at> infradead.org> writes:

> 
> Short answer is: don't do it.  Use the SES interface for your example,
> or ensure a proper API exists if the example was just made up.

Thanks Christoph. Power-cycling the drive was an example of how some of 
the management IOCTLs provided by the SCSI LLD could be used. So your 
suggestion is: don't use Ceph for this purpose (even though it may be 
possible), but instead write a utility (or use an existing one, if the HBA 
vendor provides it) that issues IOCTLs to the SCSI/SAS HBA LLD directly? 

A related question: if a device like /dev/sda[1..n] is assigned to an OSD 
and is being used by Ceph, can another user application still open and 
read/write the /dev/sda1 device directly? How does Ceph prevent such 
scenarios?

thanks,
Naveen






* Re: Issuing custom IOCTLs to SCSI LLD
  2016-05-11 12:41   ` Sage Weil
@ 2016-05-12  9:32     ` Naveen
  2016-05-12 12:37       ` Sage Weil
  0 siblings, 1 reply; 10+ messages in thread
From: Naveen @ 2016-05-12  9:32 UTC (permalink / raw)
  To: ceph-devel

Sage Weil <sage <at> newdream.net> writes:


> > > My question is:
> > > The SCSI LLD would support both read /write entry points for I/O 
> > > requests issued by the filesystem/block I/O but they also support some
> > > custom requests using IOCTLs. So how can ceph support issuing of such 
> > > IOCTL requests to the device if user issues such request. Say for 
> > > example power cycling the drive etc. It can also be a passthro request 
> > > down to the device.
> 
> Can you give an example of such an operation?
> 
> In general, any operation is generalized at the librados level. 
> For example, in order to get write same and cmpxchg block operations, we 
> added librados operations with similar semantics and implement them there.  
> It is unlikely that passing a complex operation down to the SCSI layer 
> will work in unison with the other steps involved in committing 
> an operation (e.g., updating metadata indicating the object 
> version has changed).
> 
> sage
> 

Thanks for the response, Sage. Example IOCTL operations would be things 
like downloading new firmware to the drive/HBA, task-management requests 
such as a hard reset, power-cycling the drive, or issuing a SAS/SMP/STP 
pass-through command to the drive for querying, etc. All of these would 
have to be initiated through Ceph (if supported) rather than bypassing it.
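
As an illustration of the task-management case, a hard reset can already 
be requested from userspace with the sg driver's SG_SCSI_RESET ioctl; a 
rough, untested sketch (the device node is only an example, and this 
typically requires root):

/* Rough sketch: ask the SCSI midlayer for a device reset via the sg
 * driver's SG_SCSI_RESET ioctl.  /dev/sg1 is just an example node. */
#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <scsi/sg.h>

int main(void)
{
    int op = SG_SCSI_RESET_DEVICE;       /* or ..._BUS / ..._HOST */
    int fd = open("/dev/sg1", O_RDWR | O_NONBLOCK);
    if (fd < 0) { perror("open"); return 1; }
    if (ioctl(fd, SG_SCSI_RESET, &op) < 0) { perror("SG_SCSI_RESET"); return 1; }
    printf("device reset requested\n");
    return 0;
}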

I asked a related question in another post too: can a physical disk 
(/dev/sda1) assigned to a Ceph OSD still be used by other apps in the 
system that issue I/O directly via the /dev/sda1 interface? Does Ceph 
prevent this, since such operations may corrupt data?


Thanks in advance,
naveen



* Re: Issuing custom IOCTLs to SCSI LLD
  2016-05-12  9:32     ` Naveen
@ 2016-05-12 12:37       ` Sage Weil
  2016-05-12 13:21         ` John Spray
  0 siblings, 1 reply; 10+ messages in thread
From: Sage Weil @ 2016-05-12 12:37 UTC (permalink / raw)
  To: Naveen; +Cc: ceph-devel

On Thu, 12 May 2016, Naveen wrote:
> Sage Weil <sage <at> newdream.net> writes:
> 
> 
> > > > My question is:
> > > > The SCSI LLD would support both read /write entry points for I/O 
> > > > requests issued by the filesystem/block I/O but they also support some
> > > > custom requests using IOCTLs. So how can ceph support issuing of such 
> > > > IOCTL requests to the device if user issues such request. Say for 
> > > > example power cycling the drive etc. It can also be a passthro request 
> > > > down to the device.
> > 
> > Can you give an example of such an operation?
> > 
> > In general, any operation is generalized at the librados level. 
> > For example, in order to get write same and cmpxchg block operations, we 
> > added librados operations with similar semantics and implement them there.  
> > It is unlikely that passing a complex operation down to the SCSI layer 
> > will work in unison with the other steps involved in committing 
> > an operation (e.g., updating metadata indicating the object 
> > version has changed).
> > 
> > sage
> > 
> 
> Thanks for the response Sage. Example IOCTL operations would be like 
> downloading a new FW to the drive/HBA, task management requests like 
> Hard reset, power cycling the drive, issue a SAS/SMP/STP pass thro 
> command to the drive for querying etc. All these would have to be 
> initiated and go through ceph (if supported) and not bypassing it.

Ah.  I think these kinds of management functions should be performed while 
the ceph-osd daemon for that drive is offline.  We would probably want 
some hardware management layer that coexists with ceph or that perhaps has 
some minimal integration with the ceph osds to do this sort of thing.  
It's not something that a client (user) would initiate, though.
 
> I asked a related question in another post too: Can a physical disk 
> (/dev/sda1) assigned to a ceph OSD object be continued to used by other 
> apps in the system to issue I/O directly via (/dev/sda1) interface? Does 
> ceph prevent it, as such operations may corrupt data?

It depends on what privileges the other app has.  If it's root or user 
ceph, it can step all over the disk (and the rest of the system) and wreak 
havoc.  With the current backend, we are storing data as files, so you 
could have other apps using other directories on the file system--this is 
generally a bad idea for real deployments, though, as performance and disk 
utilization will be unpredictable.
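
For what it's worth, a well-behaved application can at least see that the 
device is already claimed: on Linux, opening a whole block device with 
O_EXCL fails with EBUSY if it is mounted or otherwise in use. A rough 
sketch of such a check (the device name is only an example); nothing 
forces a root process to bother with it, of course:

/* Rough sketch: probe whether the kernel considers a block device in use
 * (mounted, claimed by md/LVM, or held with O_EXCL by someone else). */
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    const char *dev = "/dev/sda1";          /* example device */
    int fd = open(dev, O_RDONLY | O_EXCL);
    if (fd < 0) {
        if (errno == EBUSY)
            printf("%s is in use (e.g. mounted for an OSD)\n", dev);
        else
            fprintf(stderr, "open %s: %s\n", dev, strerror(errno));
        return 1;
    }
    printf("%s is not claimed by anyone\n", dev);
    close(fd);
    return 0;
}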

sage


* Re: Issuing custom IOCTLs to SCSI LLD
  2016-05-12 12:37       ` Sage Weil
@ 2016-05-12 13:21         ` John Spray
  2016-05-12 13:38           ` Handzik, Joseph
  0 siblings, 1 reply; 10+ messages in thread
From: John Spray @ 2016-05-12 13:21 UTC (permalink / raw)
  To: Sage Weil; +Cc: Naveen, Ceph Development

On Thu, May 12, 2016 at 1:37 PM, Sage Weil <sage@newdream.net> wrote:
> On Thu, 12 May 2016, Naveen wrote:
>> Sage Weil <sage <at> newdream.net> writes:
>>
>>
>> > > > My question is:
>> > > > The SCSI LLD would support both read /write entry points for I/O
>> > > > requests issued by the filesystem/block I/O but they also support some
>> > > > custom requests using IOCTLs. So how can ceph support issuing of such
>> > > > IOCTL requests to the device if user issues such request. Say for
>> > > > example power cycling the drive etc. It can also be a passthro request
>> > > > down to the device.
>> >
>> > Can you give an example of such an operation?
>> >
>> > In general, any operation is generalized at the librados level.
>> > For example, in order to get write same and cmpxchg block operations, we
>> > added librados operations with similar semantics and implement them there.
>> > It is unlikely that passing a complex operation down to the SCSI layer
>> > will work in unison with the other steps involved in committing
>> > an operation (e.g., updating metadata indicating the object
>> > version has changed).
>> >
>> > sage
>> >
>>
>> Thanks for the response Sage. Example IOCTL operations would be like
>> downloading a new FW to the drive/HBA, task management requests like
>> Hard reset, power cycling the drive, issue a SAS/SMP/STP pass thro
>> command to the drive for querying etc. All these would have to be
>> initiated and go through ceph (if supported) and not bypassing it.
>
> Ah.  I think these kind of management functions should be performed while
> the ceph-osd daemon for that drive is offline.  We would probably want
> some hardware management layer that coexists with ceph or that perhaps has
> some minimal integration with the ceph osds to do this sort of thing.
> It's not something that a client (user) would initiate, though.

This is all pretty relevant to Joe Handzik's stuff:
https://github.com/joehandzik/ceph/commits/wip-hw-mgmt-cli
http://www.spinics.net/lists/ceph-devel/msg30126.html

The idea there though is to enable passing libstoragemgmt calls
through the OSD, as opposed to arbitrary SCSI operations.

Although libstoragemgmt is fairly young, I'm a fan of the idea that we
could use it internally within Ceph, and then have the same tools/libs
used by out-of-ceph management platforms when they want to do
equivalent stuff while the OSD is offline.

John


>> I asked a related question in another post too: Can a physical disk
>> (/dev/sda1) assigned to a ceph OSD object be continued to used by other
>> apps in the system to issue I/O directly via (/dev/sda1) interface? Does
>> ceph prevent it, as such operations may corrupt data?
>
> It depends on what privileges the other app has.  If it's root or user
> ceph, it can step all over the disk (and the rest of the system) and wreak
> havoc.  With the current backend, we are storing data as files, so you
> could have other apps using other directories on the file system--this is
> generally a bad idea for real deployments, though, as performance and disk
> utilization will be unpredictable.
>
> sage


* Re: Issuing custom IOCTLs to SCSI LLD
  2016-05-12 13:21         ` John Spray
@ 2016-05-12 13:38           ` Handzik, Joseph
  2016-05-13 15:01             ` Naveen
  0 siblings, 1 reply; 10+ messages in thread
From: Handzik, Joseph @ 2016-05-12 13:38 UTC (permalink / raw)
  To: John Spray; +Cc: Sage Weil, Naveen, Ceph Development

I was intentionally staying quiet to hear other opinions, but I agree that the same mechanisms I'm building for LED operations could extend here (within Ceph if we want it to, but at the very least in libstoragemgmt).

Naveen, if you have interest in helping push this along I'd encourage you to voice your interest over on the libstoragemgmt GitHub: https://github.com/libstorage/libstoragemgmt

Joe

> On May 12, 2016, at 8:22 AM, John Spray <jspray@redhat.com> wrote:
> 
>> On Thu, May 12, 2016 at 1:37 PM, Sage Weil <sage@newdream.net> wrote:
>>> On Thu, 12 May 2016, Naveen wrote:
>>> Sage Weil <sage <at> newdream.net> writes:
>>> 
>>> 
>>>>>> My question is:
>>>>>> The SCSI LLD would support both read /write entry points for I/O
>>>>>> requests issued by the filesystem/block I/O but they also support some
>>>>>> custom requests using IOCTLs. So how can ceph support issuing of such
>>>>>> IOCTL requests to the device if user issues such request. Say for
>>>>>> example power cycling the drive etc. It can also be a passthro request
>>>>>> down to the device.
>>>> 
>>>> Can you give an example of such an operation?
>>>> 
>>>> In general, any operation is generalized at the librados level.
>>>> For example, in order to get write same and cmpxchg block operations, we
>>>> added librados operations with similar semantics and implement them there.
>>>> It is unlikely that passing a complex operation down to the SCSI layer
>>>> will work in unison with the other steps involved in committing
>>>> an operation (e.g., updating metadata indicating the object
>>>> version has changed).
>>>> 
>>>> sage
>>>> 
>>> 
>>> Thanks for the response Sage. Example IOCTL operations would be like
>>> downloading a new FW to the drive/HBA, task management requests like
>>> Hard reset, power cycling the drive, issue a SAS/SMP/STP pass thro
>>> command to the drive for querying etc. All these would have to be
>>> initiated and go through ceph (if supported) and not bypassing it.
>> 
>> Ah.  I think these kind of management functions should be performed while
>> the ceph-osd daemon for that drive is offline.  We would probably want
>> some hardware management layer that coexists with ceph or that perhaps has
>> some minimal integration with the ceph osds to do this sort of thing.
>> It's not something that a client (user) would initiate, though.
> 
> This is all pretty relevant to Joe Handzik's stuff:
> https://github.com/joehandzik/ceph/commits/wip-hw-mgmt-cli
> http://www.spinics.net/lists/ceph-devel/msg30126.html
> 
> The idea there though is to enable passing libstoragemgmt calls
> through the OSD, as opposed to arbitrary SCSI operations.
> 
> Although libstoragemgmt is fairly young, I'm a fan of the idea that we
> could use it internally within Ceph, and then have the same tools/libs
> used by out-of-ceph management platforms when they want to do
> equivalent stuff while the OSD is offline.
> 
> John
> 
> 
>>> I asked a related question in another post too: Can a physical disk
>>> (/dev/sda1) assigned to a ceph OSD object be continued to used by other
>>> apps in the system to issue I/O directly via (/dev/sda1) interface? Does
>>> ceph prevent it, as such operations may corrupt data?
>> 
>> It depends on what privileges the other app has.  If it's root or user
>> ceph, it can step all over the disk (and the rest of the system) and wreak
>> havoc.  With the current backend, we are storing data as files, so you
>> could have other apps using other directories on the file system--this is
>> generally a bad idea for real deployments, though, as performance and disk
>> utilization will be unpredictable.
>> 
>> sage


* Re: Issuing custom IOCTLs to SCSI LLD
  2016-05-12 13:38           ` Handzik, Joseph
@ 2016-05-13 15:01             ` Naveen
  0 siblings, 0 replies; 10+ messages in thread
From: Naveen @ 2016-05-13 15:01 UTC (permalink / raw)
  To: ceph-devel

Handzik, Joseph <joseph.t.handzik <at> hpe.com> writes:

> 
> I was intentionally staying quiet to hear other opinions, but I agree that 
> the same mechanisms I'm building for LED operations could extend here 
> (within Ceph if we want it to, but at the very least in libstoragemgmt).
> 
> Naveen, if you have interest in helping push this along I'd encourage you 
> to voice your interest over on the libstoragemgmt GitHub: 
> https://github.com/libstorage/libstoragemgmt
> 
> Joe
> 
> > 
> > This is all pretty relevant to Joe Handzik's stuff:
> > https://github.com/joehandzik/ceph/commits/wip-hw-mgmt-cli
> > http://www.spinics.net/lists/ceph-devel/msg30126.html
> > 
> > The idea there though is to enable passing libstoragemgmt calls
> > through the OSD, as opposed to arbitrary SCSI operations.
> > 
> > Although libstoragemgmt is fairly young, I'm a fan of the idea that we
> > could use it internally within Ceph, and then have the same tools/libs
> > used by out-of-ceph management platforms when they want to do
> > equivalent stuff while the OSD is offline.
> > 
> > John
> > 
> > 
> >>> I asked a related question in another post too: Can a physical disk
> >>> (/dev/sda1) assigned to a ceph OSD object be continued to used by other
> >>> apps in the system to issue I/O directly via (/dev/sda1) interface? Does
> >>> ceph prevent it, as such operations may corrupt data?
> >> 
> >> It depends on what privileges the other app has.  If it's root or user
> >> ceph, it can step all over the disk (and the rest of the system) and wreak
> >> havoc.  With the current backend, we are storing data as files, so you
> >> could have other apps using other directories on the file system--this is
> >> generally a bad idea for real deployments, though, as performance and disk
> >> utilization will be unpredictable.
> >> 
> >> sage

Thanks everyone for the inputs and clarifications. It looks like 
libstoragemgmt is key to a lot of these features. I will explore that 
next and raise any questions on the GitHub project.

thanks,
naveen



Thread overview: 10+ messages
2016-05-09  9:27 Issuing custom IOCTLs to SCSI LLD Naveen
2016-05-11  4:18 ` Naveen
2016-05-11  5:27   ` Christoph Hellwig
2016-05-12  8:52     ` Naveen
2016-05-11 12:41   ` Sage Weil
2016-05-12  9:32     ` Naveen
2016-05-12 12:37       ` Sage Weil
2016-05-12 13:21         ` John Spray
2016-05-12 13:38           ` Handzik, Joseph
2016-05-13 15:01             ` Naveen
