All of lore.kernel.org
 help / color / mirror / Atom feed
* Re: [ceph-users] Log message --> "bdev(/var/lib/ceph/osd/ceph-x/block) aio_submit retries"
       [not found]   ` <CALWuvY_9uWE_DKg5UaYZrc_GchfMK1EpX1rkWrJEuAV+Y3+Uww@mail.gmail.com>
@ 2017-03-15  8:38     ` Brad Hubbard
       [not found]       ` <CAF-wwdFvgzkYMnkheomsWMikZoZU+THyrxzovORx3bCDfPZ7Ng-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 8+ messages in thread
From: Brad Hubbard @ 2017-03-15  8:38 UTC (permalink / raw)
  To: nokia ceph; +Cc: Ceph Users, ceph-devel, Weil, Sage, Mark Nelson

+ceph-devel

On Wed, Mar 15, 2017 at 5:25 PM, nokia ceph <nokiacephusers@gmail.com> wrote:
> Hello,
>
> We suspect these messages not only at the time of OSD creation. But in idle
> conditions also. May I know what is the impact of these error? Can we safely
> ignore this? Or is there any way/config to fix this problem
>
> Few occurrence for these events as follows:---
>
> ====
> 2017-03-14 17:16:09.500370 7fedeba61700  4 rocksdb: (Original Log Time
> 2017/03/14-17:16:09.453130) [default] Level-0 commit table #60 started
> 2017-03-14 17:16:09.500374 7fedeba61700  4 rocksdb: (Original Log Time
> 2017/03/14-17:16:09.500273) [default] Level-0 commit table #60: memtable #1
> done
> 2017-03-14 17:16:09.500376 7fedeba61700  4 rocksdb: (Original Log Time
> 2017/03/14-17:16:09.500297) EVENT_LOG_v1 {"time_micros": 1489511769500289,
> "job": 17, "event": "flush_finished", "lsm_state": [2, 4, 6, 0, 0, 0, 0],
> "immutable_memtables": 0}
> 2017-03-14 17:16:09.500382 7fedeba61700  4 rocksdb: (Original Log Time
> 2017/03/14-17:16:09.500330) [default] Level summary: base level 1 max bytes
> base 268435456 files[2 4 6 0 0 0 0] max score 0.76
>
> 2017-03-14 17:16:09.500390 7fedeba61700  4 rocksdb: [JOB 17] Try to delete
> WAL files size 244090350, prev total WAL file size 247331500, number of live
> WAL files 2.
>
> 2017-03-14 17:34:11.610513 7fedf3a71700 -1
> bdev(/var/lib/ceph/osd/ceph-73/block) aio_submit retries 6

These errors come from here.

void KernelDevice::aio_submit(IOContext *ioc)
{
...
    int r = aio_queue.submit(*cur, &retries);
    if (retries)
      derr << __func__ << " retries " << retries << dendl;

The submit function is this one which calls libaio's io_submit
function directly and increments retries if it receives EAGAIN.

#if defined(HAVE_LIBAIO)
int FS::aio_queue_t::submit(aio_t &aio, int *retries)
{
  // 2^16 * 125us = ~8 seconds, so max sleep is ~16 seconds
  int attempts = 16;
  int delay = 125;
  iocb *piocb = &aio.iocb;
  while (true) {
    int r = io_submit(ctx, 1, &piocb);     <-------------NOTE
    if (r < 0) {
      if (r == -EAGAIN && attempts-- > 0) {     <-------------NOTE
        usleep(delay);
        delay *= 2;
        (*retries)++;
        continue;
      }
      return r;
    }
    assert(r == 1);
    break;
  }
  return 0;
}


From the man page.

IO_SUBMIT(2)                                   Linux Programmer's
Manual                                  IO_SUBMIT(2)

NAME
       io_submit - submit asynchronous I/O blocks for processing
...
RETURN VALUE
       On success, io_submit() returns the number of iocbs submitted
(which may be 0 if nr is zero).  For the  failure
       return, see NOTES.

ERRORS
       EAGAIN Insufficient resources are available to queue any iocbs.

I suspect increasing bdev_aio_max_queue_depth may help here but some
of the other devs may have more/better ideas.

> 2017-03-14 17:34:11.672123 7fedf226e700 -1
> bdev(/var/lib/ceph/osd/ceph-73/block) aio_submit retries 6
> 2017-03-14 17:34:11.696253 7fedf2a6f700 -1
> bdev(/var/lib/ceph/osd/ceph-73/block) aio_submit retries 10
> 2017-03-14 17:34:11.739016 7fedf3a71700 -1
> bdev(/var/lib/ceph/osd/ceph-73/block) aio_submit retries 10
> 2017-03-14 17:51:08.808848 7fee05294700  4 rocksdb: reusing log 57 from
> recycle list
> =====
>
>
> =====
> 2017-03-14 18:32:06.910308 7fedeb260700  4 rocksdb: EVENT_LOG_v1
> {"time_micros": 1489516326910303, "job": 27, "event": "table_file_deletion",
> "file_number": 69}
> 2017-03-14 18:32:06.910326 7fedeb260700  4 rocksdb: EVENT_LOG_v1
> {"time_micros": 1489516326910325, "job": 27, "event": "table_file_deletion",
> "file_number": 51}
> 2017-03-14 18:32:06.910343 7fedeb260700  4 rocksdb: EVENT_LOG_v1
> {"time_micros": 1489516326910342, "job": 27, "event": "table_file_deletion",
> "file_number": 50}
> 2017-03-14 18:37:37.139963 7fedf3270700 -1
> bdev(/var/lib/ceph/osd/ceph-73/block) aio_submit retries 10
> 2017-03-14 18:37:37.144147 7fedf4a73700 -1
> bdev(/var/lib/ceph/osd/ceph-73/block) aio_submit retries 9
> 2017-03-14 18:37:37.179302 7fedf226e700 -1
> bdev(/var/lib/ceph/osd/ceph-73/block) aio_submit retries 10
> 2017-03-14 18:37:37.205677 7fedf2a6f700 -1
> bdev(/var/lib/ceph/osd/ceph-73/block) aio_submit retries 11
> 2017-03-14 18:37:37.226650 7fedf5274700 -1
> bdev(/var/lib/ceph/osd/ceph-73/block) aio_submit retries 11
> 2017-03-14 18:37:37.234589 7fedf3a71700 -1
> bdev(/var/lib/ceph/osd/ceph-73/block) aio_submit retries 11
> 2017-03-14 19:01:53.494364 7fee05294700  4 rocksdb: reusing log 61 from
> recycle list
> ======
>
> ======
> 2017-03-14 19:38:04.946439 7fedeba61700  4 rocksdb: (Original Log Time
> 2017/03/14-19:38:04.946355) EVENT_LOG_v1 {"time_micros": 1489520284946348,
> "job": 29, "event": "flush_finished", "lsm_state": [2, 4, 13, 0, 0, 0, 0],
> "immutable_memtables": 0}
> 2017-03-14 19:38:04.946441 7fedeba61700  4 rocksdb: (Original Log Time
> 2017/03/14-19:38:04.946383) [default] Level summary: base level 1 max bytes
> base 268435456 files[2 4 13 0 0 0 0] max score 0.85
>
> 2017-03-14 19:38:04.946461 7fedeba61700  4 rocksdb: [JOB 29] Try to delete
> WAL files size 244060984, prev total WAL file size 247581163, number of live
> WAL files 2.
>
> 2017-03-14 20:01:43.916418 7fedf4272700 -1
> bdev(/var/lib/ceph/osd/ceph-73/block) aio_submit retries 10
> 2017-03-14 20:01:43.951939 7fedf126c700 -1
> bdev(/var/lib/ceph/osd/ceph-73/block) aio_submit retries 11
> 2017-03-14 20:01:43.960599 7fedf3270700 -1
> bdev(/var/lib/ceph/osd/ceph-73/block) aio_submit retries 11
> 2017-03-14 20:01:43.969409 7fedf3a71700 -1
> bdev(/var/lib/ceph/osd/ceph-73/block) aio_submit retries 11
> 2017-03-14 20:13:04.291160 7fee05294700  4 rocksdb: reusing log 85 from
> recycle list
>
> 2017-03-14 20:13:04.291254 7fee05294700  4 rocksdb: [default] New memtable
> created with log file: #89. Immutable memtables: 0.
> =====
>
> Thanks
>
>
> On Wed, Mar 15, 2017 at 11:18 AM, nokia ceph <nokiacephusers@gmail.com>
> wrote:
>>
>> Hello,
>>
>> Can we get any update for this problem?
>>
>> Thanks
>>
>>
>> On Thu, Mar 2, 2017 at 2:16 PM, nokia ceph <nokiacephusers@gmail.com>
>> wrote:
>>>
>>> Hello,
>>>
>>> Env:- v11.2.0 - bluestore - EC 3 + 1
>>>
>>> We are getting below entries both in /var/log/messages and osd logs. May
>>> I know what is the impact of the below message and as these message were
>>> flooded in osd and sys logs.
>>>
>>> ~~~
>>>
>>> 2017-03-01 13:00:59.938839 7f6c96915700 -1
>>> bdev(/var/lib/ceph/osd/ceph-0/block) aio_submit retries 2
>>>
>>> 2017-03-01 13:00:59.940939 7f6c96915700 -1
>>> bdev(/var/lib/ceph/osd/ceph-0/block) aio_submit retries 4
>>>
>>> 2017-03-01 13:00:59.941126 7f6c96915700 -1
>>> bdev(/var/lib/ceph/osd/ceph-0/block) aio_submit retries 1
>>>
>>> ~~~
>>>
>>>
>>> I found these messages during the activation of ceph-osd's
>>>
>>>
>>> ~~~ From /var/log/messages
>>>
>>> Feb 27 13:59:46 PL6-CN2 sh: command_check_call: Running command:
>>> /usr/bin/systemctl start ceph-osd@39
>>>
>>> Feb 27 13:59:46 PL6-CN2 systemd: Started Ceph disk activation: /dev/sdk2.
>>>
>>> Feb 27 13:59:46 PL6-CN2 ceph-osd: 2017-02-27 13:59:46.540781 7f1f3f016700
>>> -1 bdev(/var/lib/ceph/osd/ceph-39/block) aio_submit retries 9
>>>
>>> Feb 27 13:59:46 PL6-CN2 ceph-osd: 2017-02-27 13:59:46.544670 7f1f3f016700
>>> -1 bdev(/var/lib/ceph/osd/ceph-39/block) aio_submit retries 2
>>>
>>> Feb 27 13:59:46 PL6-CN2 ceph-osd: 2017-02-27 13:59:46.544854 7f1f3f016700
>>> -1 bdev(/var/lib/ceph/osd/ceph-39/block) aio_submit retries 1
>>>
>>> Feb 27 13:59:47 PL6-CN2 kernel: sdl: sdl1
>>>
>>> Feb 27 13:59:47 PL6-CN2 kernel: sdl: sdl1
>>>
>>> Feb 27 13:59:48 PL6-CN2 kernel: sdl: sdl1 sdl2
>>>
>>> Feb 27 13:59:48 PL6-CN2 systemd: Cannot add dependency job for unit
>>> microcode.service, ignoring: Unit is not loaded properly: Invalid argument.
>>>
>>> Feb 27 13:59:48 PL6-CN2 systemd: Starting Ceph disk activation:
>>> /dev/sdl2...
>>>
>>> ~~~
>>>
>>> At the same time on OSD logs:-- /var/log/ceph/ceph-osd.43.log-20170228.gz
>>>
>>> 2017-02-27 14:00:17.121460 7f147351e940  0 osd.43 0 crush map has
>>> features 2199057072128, adjusting msgr requires for clients
>>> 2017-02-27 14:00:17.121466 7f147351e940  0 osd.43 0 crush map has
>>> features 2199057072128 was 8705, adjusting msgr requires for mons
>>> 2017-02-27 14:00:17.121468 7f147351e940  0 osd.43 0 crush map has
>>> features 2199057072128, adjusting msgr requires for osds
>>> 2017-02-27 14:00:17.121511 7f147351e940  0 osd.43 0 load_pgs
>>> 2017-02-27 14:00:17.121514 7f147351e940  0 osd.43 0 load_pgs opened 0 pgs
>>> 2017-02-27 14:00:17.121517 7f147351e940  0 osd.43 0 using 1 op queue with
>>> priority op cut off at 64.
>>> 2017-02-27 14:00:17.122364 7f147351e940 -1 osd.43 0 log_to_monitors
>>> {default=true}
>>> 2017-02-27 14:00:18.371762 7f147351e940  0 osd.43 0 done with init,
>>> starting boot process
>>> 2017-02-27 14:00:18.486559 7f1459952700 -1
>>> bdev(/var/lib/ceph/osd/ceph-43/block) aio_submit retries 7
>>> 2017-02-27 14:00:18.488770 7f1459952700 -1
>>> bdev(/var/lib/ceph/osd/ceph-43/block) aio_submit retries 4
>>> 2017-02-27 14:00:18.489306 7f1459952700 -1
>>> bdev(/var/lib/ceph/osd/ceph-43/block) aio_submit retries 2
>>> 2017-02-27 14:00:18.489826 7f1459952700 -1
>>> bdev(/var/lib/ceph/osd/ceph-43/block) aio_submit retries 2
>>> <snip>
>>> ..
>>> <snip>
>>> 2017-02-27 14:00:18.583234 7f145814f700  0 osd.43 93 crush map has
>>> features 2200130813952, adjusting msgr requires for clients
>>> 2017-02-27 14:00:18.583257 7f145814f700  0 osd.43 93 crush map has
>>> features 2200130813952 was 2199057080833, adjusting msgr requires for mons
>>> 2017-02-27 14:00:18.583271 7f145814f700  0 osd.43 93 crush map has
>>> features 2200130813952, adjusting msgr requires for osds
>>> ~~~~
>>>
>>> As per my understanding on a bluestore device, we can write with both
>>> O_DIRECT and aio way. For some reasons, bdev can't able to commit with aio
>>> type.
>>>
>>> ~~~
>>> "ms_type": "async"
>>> ~~~
>>>
>>> Need your suggestion how to skip these messages.
>>>
>>> Thanks
>>>
>>
>
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>



-- 
Cheers,
Brad

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Log message --> "bdev(/var/lib/ceph/osd/ceph-x/block) aio_submit retries"
       [not found]       ` <CAF-wwdFvgzkYMnkheomsWMikZoZU+THyrxzovORx3bCDfPZ7Ng-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2017-03-15 14:24         ` Sage Weil
       [not found]           ` <alpine.DEB.2.11.1703151422110.10776-qHenpvqtifaMSRpgCs4c+g@public.gmane.org>
  0 siblings, 1 reply; 8+ messages in thread
From: Sage Weil @ 2017-03-15 14:24 UTC (permalink / raw)
  To: Brad Hubbard; +Cc: Ceph Users, ceph-devel

On Wed, 15 Mar 2017, Brad Hubbard wrote:
> +ceph-devel
> 
> On Wed, Mar 15, 2017 at 5:25 PM, nokia ceph <nokiacephusers-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> > Hello,
> >
> > We suspect these messages not only at the time of OSD creation. But in idle
> > conditions also. May I know what is the impact of these error? Can we safely
> > ignore this? Or is there any way/config to fix this problem
> >
> > Few occurrence for these events as follows:---
> >
> > ====
> > 2017-03-14 17:16:09.500370 7fedeba61700  4 rocksdb: (Original Log Time
> > 2017/03/14-17:16:09.453130) [default] Level-0 commit table #60 started
> > 2017-03-14 17:16:09.500374 7fedeba61700  4 rocksdb: (Original Log Time
> > 2017/03/14-17:16:09.500273) [default] Level-0 commit table #60: memtable #1
> > done
> > 2017-03-14 17:16:09.500376 7fedeba61700  4 rocksdb: (Original Log Time
> > 2017/03/14-17:16:09.500297) EVENT_LOG_v1 {"time_micros": 1489511769500289,
> > "job": 17, "event": "flush_finished", "lsm_state": [2, 4, 6, 0, 0, 0, 0],
> > "immutable_memtables": 0}
> > 2017-03-14 17:16:09.500382 7fedeba61700  4 rocksdb: (Original Log Time
> > 2017/03/14-17:16:09.500330) [default] Level summary: base level 1 max bytes
> > base 268435456 files[2 4 6 0 0 0 0] max score 0.76
> >
> > 2017-03-14 17:16:09.500390 7fedeba61700  4 rocksdb: [JOB 17] Try to delete
> > WAL files size 244090350, prev total WAL file size 247331500, number of live
> > WAL files 2.
> >
> > 2017-03-14 17:34:11.610513 7fedf3a71700 -1
> > bdev(/var/lib/ceph/osd/ceph-73/block) aio_submit retries 6
> 
> These errors come from here.
> 
> void KernelDevice::aio_submit(IOContext *ioc)
> {
> ...
>     int r = aio_queue.submit(*cur, &retries);
>     if (retries)
>       derr << __func__ << " retries " << retries << dendl;
> 
> The submit function is this one which calls libaio's io_submit
> function directly and increments retries if it receives EAGAIN.
> 
> #if defined(HAVE_LIBAIO)
> int FS::aio_queue_t::submit(aio_t &aio, int *retries)
> {
>   // 2^16 * 125us = ~8 seconds, so max sleep is ~16 seconds
>   int attempts = 16;
>   int delay = 125;
>   iocb *piocb = &aio.iocb;
>   while (true) {
>     int r = io_submit(ctx, 1, &piocb);     <-------------NOTE
>     if (r < 0) {
>       if (r == -EAGAIN && attempts-- > 0) {     <-------------NOTE
>         usleep(delay);
>         delay *= 2;
>         (*retries)++;
>         continue;
>       }
>       return r;
>     }
>     assert(r == 1);
>     break;
>   }
>   return 0;
> }
> 
> 
> From the man page.
> 
> IO_SUBMIT(2)                                   Linux Programmer's
> Manual                                  IO_SUBMIT(2)
> 
> NAME
>        io_submit - submit asynchronous I/O blocks for processing
> ...
> RETURN VALUE
>        On success, io_submit() returns the number of iocbs submitted
> (which may be 0 if nr is zero).  For the  failure
>        return, see NOTES.
> 
> ERRORS
>        EAGAIN Insufficient resources are available to queue any iocbs.
> 
> I suspect increasing bdev_aio_max_queue_depth may help here but some
> of the other devs may have more/better ideas.

Yes--try increasing bdev_aio_max_queue_depth.  It defaults to 32; try 
changing it to 128, 1024, or 4096 and see if these errors go away.

I've never been able to trigger this on my test boxes, but I put in the 
warning to help ensure we pick a good default.

What kernel version are you running?

Thanks!
sage

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Log message --> "bdev(/var/lib/ceph/osd/ceph-x/block) aio_submit retries"
       [not found]           ` <alpine.DEB.2.11.1703151422110.10776-qHenpvqtifaMSRpgCs4c+g@public.gmane.org>
@ 2017-03-16  6:15             ` nokia ceph
       [not found]               ` <CALWuvY8f1j9_ccbzrchfdhZ66RYDTqRuL2A2qy9XFHTARitPBQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 8+ messages in thread
From: nokia ceph @ 2017-03-16  6:15 UTC (permalink / raw)
  To: Sage Weil; +Cc: Ceph Users, ceph-devel


[-- Attachment #1.1: Type: text/plain, Size: 3956 bytes --]

Hello,

We are running latest kernel - 3.10.0-514.2.2.el7.x86_64 { RHEL 7.3 }

Sure I will try to alter this directive - bdev_aio_max_queue_depth and will
share our results.

Could you please explain how this calculation happens?

Thanks


On Wed, Mar 15, 2017 at 7:54 PM, Sage Weil <sage-BnTBU8nroG7k1uMJSBkQmQ@public.gmane.org> wrote:

> On Wed, 15 Mar 2017, Brad Hubbard wrote:
> > +ceph-devel
> >
> > On Wed, Mar 15, 2017 at 5:25 PM, nokia ceph <nokiacephusers-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> wrote:
> > > Hello,
> > >
> > > We suspect these messages not only at the time of OSD creation. But in
> idle
> > > conditions also. May I know what is the impact of these error? Can we
> safely
> > > ignore this? Or is there any way/config to fix this problem
> > >
> > > Few occurrence for these events as follows:---
> > >
> > > ====
> > > 2017-03-14 17:16:09.500370 7fedeba61700  4 rocksdb: (Original Log Time
> > > 2017/03/14-17:16:09.453130) [default] Level-0 commit table #60 started
> > > 2017-03-14 17:16:09.500374 7fedeba61700  4 rocksdb: (Original Log Time
> > > 2017/03/14-17:16:09.500273) [default] Level-0 commit table #60:
> memtable #1
> > > done
> > > 2017-03-14 17:16:09.500376 7fedeba61700  4 rocksdb: (Original Log Time
> > > 2017/03/14-17:16:09.500297) EVENT_LOG_v1 {"time_micros":
> 1489511769500289,
> > > "job": 17, "event": "flush_finished", "lsm_state": [2, 4, 6, 0, 0, 0,
> 0],
> > > "immutable_memtables": 0}
> > > 2017-03-14 17:16:09.500382 7fedeba61700  4 rocksdb: (Original Log Time
> > > 2017/03/14-17:16:09.500330) [default] Level summary: base level 1 max
> bytes
> > > base 268435456 files[2 4 6 0 0 0 0] max score 0.76
> > >
> > > 2017-03-14 17:16:09.500390 7fedeba61700  4 rocksdb: [JOB 17] Try to
> delete
> > > WAL files size 244090350, prev total WAL file size 247331500, number
> of live
> > > WAL files 2.
> > >
> > > 2017-03-14 17:34:11.610513 7fedf3a71700 -1
> > > bdev(/var/lib/ceph/osd/ceph-73/block) aio_submit retries 6
> >
> > These errors come from here.
> >
> > void KernelDevice::aio_submit(IOContext *ioc)
> > {
> > ...
> >     int r = aio_queue.submit(*cur, &retries);
> >     if (retries)
> >       derr << __func__ << " retries " << retries << dendl;
> >
> > The submit function is this one which calls libaio's io_submit
> > function directly and increments retries if it receives EAGAIN.
> >
> > #if defined(HAVE_LIBAIO)
> > int FS::aio_queue_t::submit(aio_t &aio, int *retries)
> > {
> >   // 2^16 * 125us = ~8 seconds, so max sleep is ~16 seconds
> >   int attempts = 16;
> >   int delay = 125;
> >   iocb *piocb = &aio.iocb;
> >   while (true) {
> >     int r = io_submit(ctx, 1, &piocb);     <-------------NOTE
> >     if (r < 0) {
> >       if (r == -EAGAIN && attempts-- > 0) {     <-------------NOTE
> >         usleep(delay);
> >         delay *= 2;
> >         (*retries)++;
> >         continue;
> >       }
> >       return r;
> >     }
> >     assert(r == 1);
> >     break;
> >   }
> >   return 0;
> > }
> >
> >
> > From the man page.
> >
> > IO_SUBMIT(2)                                   Linux Programmer's
> > Manual                                  IO_SUBMIT(2)
> >
> > NAME
> >        io_submit - submit asynchronous I/O blocks for processing
> > ...
> > RETURN VALUE
> >        On success, io_submit() returns the number of iocbs submitted
> > (which may be 0 if nr is zero).  For the  failure
> >        return, see NOTES.
> >
> > ERRORS
> >        EAGAIN Insufficient resources are available to queue any iocbs.
> >
> > I suspect increasing bdev_aio_max_queue_depth may help here but some
> > of the other devs may have more/better ideas.
>
> Yes--try increasing bdev_aio_max_queue_depth.  It defaults to 32; try
> changing it to 128, 1024, or 4096 and see if these errors go away.
>
> I've never been able to trigger this on my test boxes, but I put in the
> warning to help ensure we pick a good default.
>
> What kernel version are you running?
>
> Thanks!
> sage
>

[-- Attachment #1.2: Type: text/html, Size: 5655 bytes --]

[-- Attachment #2: Type: text/plain, Size: 178 bytes --]

_______________________________________________
ceph-users mailing list
ceph-users-idqoXFIVOFJgJs9I8MT0rw@public.gmane.org
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Log message --> "bdev(/var/lib/ceph/osd/ceph-x/block) aio_submit retries"
       [not found]               ` <CALWuvY8f1j9_ccbzrchfdhZ66RYDTqRuL2A2qy9XFHTARitPBQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2017-03-16  6:23                 ` Brad Hubbard
       [not found]                   ` <CAF-wwdGBF5yDMOpdxm0S5fXcYYt94LaEiAXw8qXT2g66pB30xg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 8+ messages in thread
From: Brad Hubbard @ 2017-03-16  6:23 UTC (permalink / raw)
  To: nokia ceph; +Cc: Ceph Users, ceph-devel

On Thu, Mar 16, 2017 at 4:15 PM, nokia ceph <nokiacephusers-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> Hello,
>
> We are running latest kernel - 3.10.0-514.2.2.el7.x86_64 { RHEL 7.3 }
>
> Sure I will try to alter this directive - bdev_aio_max_queue_depth and will
> share our results.
>
> Could you please explain how this calculation happens?

What calculation are you referring to?

> Thanks
>
>
> On Wed, Mar 15, 2017 at 7:54 PM, Sage Weil <sage-BnTBU8nroG7k1uMJSBkQmQ@public.gmane.org> wrote:
>>
>> On Wed, 15 Mar 2017, Brad Hubbard wrote:
>> > +ceph-devel
>> >
>> > On Wed, Mar 15, 2017 at 5:25 PM, nokia ceph <nokiacephusers-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
>> > wrote:
>> > > Hello,
>> > >
>> > > We suspect these messages not only at the time of OSD creation. But in
>> > > idle
>> > > conditions also. May I know what is the impact of these error? Can we
>> > > safely
>> > > ignore this? Or is there any way/config to fix this problem
>> > >
>> > > Few occurrence for these events as follows:---
>> > >
>> > > ====
>> > > 2017-03-14 17:16:09.500370 7fedeba61700  4 rocksdb: (Original Log Time
>> > > 2017/03/14-17:16:09.453130) [default] Level-0 commit table #60 started
>> > > 2017-03-14 17:16:09.500374 7fedeba61700  4 rocksdb: (Original Log Time
>> > > 2017/03/14-17:16:09.500273) [default] Level-0 commit table #60:
>> > > memtable #1
>> > > done
>> > > 2017-03-14 17:16:09.500376 7fedeba61700  4 rocksdb: (Original Log Time
>> > > 2017/03/14-17:16:09.500297) EVENT_LOG_v1 {"time_micros":
>> > > 1489511769500289,
>> > > "job": 17, "event": "flush_finished", "lsm_state": [2, 4, 6, 0, 0, 0,
>> > > 0],
>> > > "immutable_memtables": 0}
>> > > 2017-03-14 17:16:09.500382 7fedeba61700  4 rocksdb: (Original Log Time
>> > > 2017/03/14-17:16:09.500330) [default] Level summary: base level 1 max
>> > > bytes
>> > > base 268435456 files[2 4 6 0 0 0 0] max score 0.76
>> > >
>> > > 2017-03-14 17:16:09.500390 7fedeba61700  4 rocksdb: [JOB 17] Try to
>> > > delete
>> > > WAL files size 244090350, prev total WAL file size 247331500, number
>> > > of live
>> > > WAL files 2.
>> > >
>> > > 2017-03-14 17:34:11.610513 7fedf3a71700 -1
>> > > bdev(/var/lib/ceph/osd/ceph-73/block) aio_submit retries 6
>> >
>> > These errors come from here.
>> >
>> > void KernelDevice::aio_submit(IOContext *ioc)
>> > {
>> > ...
>> >     int r = aio_queue.submit(*cur, &retries);
>> >     if (retries)
>> >       derr << __func__ << " retries " << retries << dendl;
>> >
>> > The submit function is this one which calls libaio's io_submit
>> > function directly and increments retries if it receives EAGAIN.
>> >
>> > #if defined(HAVE_LIBAIO)
>> > int FS::aio_queue_t::submit(aio_t &aio, int *retries)
>> > {
>> >   // 2^16 * 125us = ~8 seconds, so max sleep is ~16 seconds
>> >   int attempts = 16;
>> >   int delay = 125;
>> >   iocb *piocb = &aio.iocb;
>> >   while (true) {
>> >     int r = io_submit(ctx, 1, &piocb);     <-------------NOTE
>> >     if (r < 0) {
>> >       if (r == -EAGAIN && attempts-- > 0) {     <-------------NOTE
>> >         usleep(delay);
>> >         delay *= 2;
>> >         (*retries)++;
>> >         continue;
>> >       }
>> >       return r;
>> >     }
>> >     assert(r == 1);
>> >     break;
>> >   }
>> >   return 0;
>> > }
>> >
>> >
>> > From the man page.
>> >
>> > IO_SUBMIT(2)                                   Linux Programmer's
>> > Manual                                  IO_SUBMIT(2)
>> >
>> > NAME
>> >        io_submit - submit asynchronous I/O blocks for processing
>> > ...
>> > RETURN VALUE
>> >        On success, io_submit() returns the number of iocbs submitted
>> > (which may be 0 if nr is zero).  For the  failure
>> >        return, see NOTES.
>> >
>> > ERRORS
>> >        EAGAIN Insufficient resources are available to queue any iocbs.
>> >
>> > I suspect increasing bdev_aio_max_queue_depth may help here but some
>> > of the other devs may have more/better ideas.
>>
>> Yes--try increasing bdev_aio_max_queue_depth.  It defaults to 32; try
>> changing it to 128, 1024, or 4096 and see if these errors go away.
>>
>> I've never been able to trigger this on my test boxes, but I put in the
>> warning to help ensure we pick a good default.
>>
>> What kernel version are you running?
>>
>> Thanks!
>> sage
>
>



-- 
Cheers,
Brad

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Log message --> "bdev(/var/lib/ceph/osd/ceph-x/block) aio_submit retries"
       [not found]                   ` <CAF-wwdGBF5yDMOpdxm0S5fXcYYt94LaEiAXw8qXT2g66pB30xg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2017-03-16  6:33                     ` nokia ceph
       [not found]                       ` <CALWuvY-m4JyaohTJCfzeC==CnfJ2cGK1n8P+49pS65u5n8hnfQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 8+ messages in thread
From: nokia ceph @ 2017-03-16  6:33 UTC (permalink / raw)
  To: Brad Hubbard; +Cc: Ceph Users, ceph-devel


[-- Attachment #1.1: Type: text/plain, Size: 4881 bytes --]

Hello Brad,

I meant for this parameter bdev_aio_max_queue_depth , Sage suggested try
diff values, 128,1024 , 4096 . So my doubt how this calculation happens? Is
this  related to memory?

Thanks




On Thu, Mar 16, 2017 at 11:53 AM, Brad Hubbard <bhubbard-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:

> On Thu, Mar 16, 2017 at 4:15 PM, nokia ceph <nokiacephusers-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> wrote:
> > Hello,
> >
> > We are running latest kernel - 3.10.0-514.2.2.el7.x86_64 { RHEL 7.3 }
> >
> > Sure I will try to alter this directive - bdev_aio_max_queue_depth and
> will
> > share our results.
> >
> > Could you please explain how this calculation happens?
>
> What calculation are you referring to?
>
> > Thanks
> >
> >
> > On Wed, Mar 15, 2017 at 7:54 PM, Sage Weil <sage-BnTBU8nroG7k1uMJSBkQmQ@public.gmane.org> wrote:
> >>
> >> On Wed, 15 Mar 2017, Brad Hubbard wrote:
> >> > +ceph-devel
> >> >
> >> > On Wed, Mar 15, 2017 at 5:25 PM, nokia ceph <nokiacephusers-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org
> >
> >> > wrote:
> >> > > Hello,
> >> > >
> >> > > We suspect these messages not only at the time of OSD creation. But
> in
> >> > > idle
> >> > > conditions also. May I know what is the impact of these error? Can
> we
> >> > > safely
> >> > > ignore this? Or is there any way/config to fix this problem
> >> > >
> >> > > Few occurrence for these events as follows:---
> >> > >
> >> > > ====
> >> > > 2017-03-14 17:16:09.500370 7fedeba61700  4 rocksdb: (Original Log
> Time
> >> > > 2017/03/14-17:16:09.453130) [default] Level-0 commit table #60
> started
> >> > > 2017-03-14 17:16:09.500374 7fedeba61700  4 rocksdb: (Original Log
> Time
> >> > > 2017/03/14-17:16:09.500273) [default] Level-0 commit table #60:
> >> > > memtable #1
> >> > > done
> >> > > 2017-03-14 17:16:09.500376 7fedeba61700  4 rocksdb: (Original Log
> Time
> >> > > 2017/03/14-17:16:09.500297) EVENT_LOG_v1 {"time_micros":
> >> > > 1489511769500289,
> >> > > "job": 17, "event": "flush_finished", "lsm_state": [2, 4, 6, 0, 0,
> 0,
> >> > > 0],
> >> > > "immutable_memtables": 0}
> >> > > 2017-03-14 17:16:09.500382 7fedeba61700  4 rocksdb: (Original Log
> Time
> >> > > 2017/03/14-17:16:09.500330) [default] Level summary: base level 1
> max
> >> > > bytes
> >> > > base 268435456 files[2 4 6 0 0 0 0] max score 0.76
> >> > >
> >> > > 2017-03-14 17:16:09.500390 7fedeba61700  4 rocksdb: [JOB 17] Try to
> >> > > delete
> >> > > WAL files size 244090350, prev total WAL file size 247331500, number
> >> > > of live
> >> > > WAL files 2.
> >> > >
> >> > > 2017-03-14 17:34:11.610513 7fedf3a71700 -1
> >> > > bdev(/var/lib/ceph/osd/ceph-73/block) aio_submit retries 6
> >> >
> >> > These errors come from here.
> >> >
> >> > void KernelDevice::aio_submit(IOContext *ioc)
> >> > {
> >> > ...
> >> >     int r = aio_queue.submit(*cur, &retries);
> >> >     if (retries)
> >> >       derr << __func__ << " retries " << retries << dendl;
> >> >
> >> > The submit function is this one which calls libaio's io_submit
> >> > function directly and increments retries if it receives EAGAIN.
> >> >
> >> > #if defined(HAVE_LIBAIO)
> >> > int FS::aio_queue_t::submit(aio_t &aio, int *retries)
> >> > {
> >> >   // 2^16 * 125us = ~8 seconds, so max sleep is ~16 seconds
> >> >   int attempts = 16;
> >> >   int delay = 125;
> >> >   iocb *piocb = &aio.iocb;
> >> >   while (true) {
> >> >     int r = io_submit(ctx, 1, &piocb);     <-------------NOTE
> >> >     if (r < 0) {
> >> >       if (r == -EAGAIN && attempts-- > 0) {     <-------------NOTE
> >> >         usleep(delay);
> >> >         delay *= 2;
> >> >         (*retries)++;
> >> >         continue;
> >> >       }
> >> >       return r;
> >> >     }
> >> >     assert(r == 1);
> >> >     break;
> >> >   }
> >> >   return 0;
> >> > }
> >> >
> >> >
> >> > From the man page.
> >> >
> >> > IO_SUBMIT(2)                                   Linux Programmer's
> >> > Manual                                  IO_SUBMIT(2)
> >> >
> >> > NAME
> >> >        io_submit - submit asynchronous I/O blocks for processing
> >> > ...
> >> > RETURN VALUE
> >> >        On success, io_submit() returns the number of iocbs submitted
> >> > (which may be 0 if nr is zero).  For the  failure
> >> >        return, see NOTES.
> >> >
> >> > ERRORS
> >> >        EAGAIN Insufficient resources are available to queue any iocbs.
> >> >
> >> > I suspect increasing bdev_aio_max_queue_depth may help here but some
> >> > of the other devs may have more/better ideas.
> >>
> >> Yes--try increasing bdev_aio_max_queue_depth.  It defaults to 32; try
> >> changing it to 128, 1024, or 4096 and see if these errors go away.
> >>
> >> I've never been able to trigger this on my test boxes, but I put in the
> >> warning to help ensure we pick a good default.
> >>
> >> What kernel version are you running?
> >>
> >> Thanks!
> >> sage
> >
> >
>
>
>
> --
> Cheers,
> Brad
>

[-- Attachment #1.2: Type: text/html, Size: 7559 bytes --]

[-- Attachment #2: Type: text/plain, Size: 178 bytes --]

_______________________________________________
ceph-users mailing list
ceph-users-idqoXFIVOFJgJs9I8MT0rw@public.gmane.org
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Log message --> "bdev(/var/lib/ceph/osd/ceph-x/block) aio_submit retries"
       [not found]                       ` <CALWuvY-m4JyaohTJCfzeC==CnfJ2cGK1n8P+49pS65u5n8hnfQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2017-03-16  7:12                         ` Brad Hubbard
       [not found]                           ` <CAF-wwdGHgO6etLromGd_5-UTBFmXD1JeOAiX_eFDB8z7bXZJ6A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 8+ messages in thread
From: Brad Hubbard @ 2017-03-16  7:12 UTC (permalink / raw)
  To: nokia ceph; +Cc: Ceph Users, ceph-devel

On Thu, Mar 16, 2017 at 4:33 PM, nokia ceph <nokiacephusers-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> Hello Brad,
>
> I meant for this parameter bdev_aio_max_queue_depth , Sage suggested try
> diff values, 128,1024 , 4096 . So my doubt how this calculation happens? Is
> this  related to memory?

The bdev_aio_max_queue_depth parameter represents the nr_events
argument to the libaio io_setup function.

int io_setup(unsigned nr_events, aio_context_t *ctx_idp);

From the man page for io_setup:

"The io_setup() system call creates an asynchronous I/O context
suitable for concurrently processing nr_events operations."

The current theory we are working with is that io_submit is returning
EAGAIN because nr_events is too small at the default of 32. Therefore
we have suggested raising this value. There is no real calculation
involved in the values Sage is suggesting other than they are
*larger*. It's a matter of playing with the value to see if, and when,
the error messages go away. If we know a larger value reduces or
eradicates the error we can then turn our focus more to *why*. Longer
term this can assist us in setting a more reasonable default.

>
> Thanks
>
>
>
>
> On Thu, Mar 16, 2017 at 11:53 AM, Brad Hubbard <bhubbard-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
>>
>> On Thu, Mar 16, 2017 at 4:15 PM, nokia ceph <nokiacephusers-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
>> wrote:
>> > Hello,
>> >
>> > We are running latest kernel - 3.10.0-514.2.2.el7.x86_64 { RHEL 7.3 }
>> >
>> > Sure I will try to alter this directive - bdev_aio_max_queue_depth and
>> > will
>> > share our results.
>> >
>> > Could you please explain how this calculation happens?
>>
>> What calculation are you referring to?
>>
>> > Thanks
>> >
>> >
>> > On Wed, Mar 15, 2017 at 7:54 PM, Sage Weil <sage-BnTBU8nroG7k1uMJSBkQmQ@public.gmane.org> wrote:
>> >>
>> >> On Wed, 15 Mar 2017, Brad Hubbard wrote:
>> >> > +ceph-devel
>> >> >
>> >> > On Wed, Mar 15, 2017 at 5:25 PM, nokia ceph
>> >> > <nokiacephusers-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
>> >> > wrote:
>> >> > > Hello,
>> >> > >
>> >> > > We suspect these messages not only at the time of OSD creation. But
>> >> > > in
>> >> > > idle
>> >> > > conditions also. May I know what is the impact of these error? Can
>> >> > > we
>> >> > > safely
>> >> > > ignore this? Or is there any way/config to fix this problem
>> >> > >
>> >> > > Few occurrence for these events as follows:---
>> >> > >
>> >> > > ====
>> >> > > 2017-03-14 17:16:09.500370 7fedeba61700  4 rocksdb: (Original Log
>> >> > > Time
>> >> > > 2017/03/14-17:16:09.453130) [default] Level-0 commit table #60
>> >> > > started
>> >> > > 2017-03-14 17:16:09.500374 7fedeba61700  4 rocksdb: (Original Log
>> >> > > Time
>> >> > > 2017/03/14-17:16:09.500273) [default] Level-0 commit table #60:
>> >> > > memtable #1
>> >> > > done
>> >> > > 2017-03-14 17:16:09.500376 7fedeba61700  4 rocksdb: (Original Log
>> >> > > Time
>> >> > > 2017/03/14-17:16:09.500297) EVENT_LOG_v1 {"time_micros":
>> >> > > 1489511769500289,
>> >> > > "job": 17, "event": "flush_finished", "lsm_state": [2, 4, 6, 0, 0,
>> >> > > 0,
>> >> > > 0],
>> >> > > "immutable_memtables": 0}
>> >> > > 2017-03-14 17:16:09.500382 7fedeba61700  4 rocksdb: (Original Log
>> >> > > Time
>> >> > > 2017/03/14-17:16:09.500330) [default] Level summary: base level 1
>> >> > > max
>> >> > > bytes
>> >> > > base 268435456 files[2 4 6 0 0 0 0] max score 0.76
>> >> > >
>> >> > > 2017-03-14 17:16:09.500390 7fedeba61700  4 rocksdb: [JOB 17] Try to
>> >> > > delete
>> >> > > WAL files size 244090350, prev total WAL file size 247331500,
>> >> > > number
>> >> > > of live
>> >> > > WAL files 2.
>> >> > >
>> >> > > 2017-03-14 17:34:11.610513 7fedf3a71700 -1
>> >> > > bdev(/var/lib/ceph/osd/ceph-73/block) aio_submit retries 6
>> >> >
>> >> > These errors come from here.
>> >> >
>> >> > void KernelDevice::aio_submit(IOContext *ioc)
>> >> > {
>> >> > ...
>> >> >     int r = aio_queue.submit(*cur, &retries);
>> >> >     if (retries)
>> >> >       derr << __func__ << " retries " << retries << dendl;
>> >> >
>> >> > The submit function is this one which calls libaio's io_submit
>> >> > function directly and increments retries if it receives EAGAIN.
>> >> >
>> >> > #if defined(HAVE_LIBAIO)
>> >> > int FS::aio_queue_t::submit(aio_t &aio, int *retries)
>> >> > {
>> >> >   // 2^16 * 125us = ~8 seconds, so max sleep is ~16 seconds
>> >> >   int attempts = 16;
>> >> >   int delay = 125;
>> >> >   iocb *piocb = &aio.iocb;
>> >> >   while (true) {
>> >> >     int r = io_submit(ctx, 1, &piocb);     <-------------NOTE
>> >> >     if (r < 0) {
>> >> >       if (r == -EAGAIN && attempts-- > 0) {     <-------------NOTE
>> >> >         usleep(delay);
>> >> >         delay *= 2;
>> >> >         (*retries)++;
>> >> >         continue;
>> >> >       }
>> >> >       return r;
>> >> >     }
>> >> >     assert(r == 1);
>> >> >     break;
>> >> >   }
>> >> >   return 0;
>> >> > }
>> >> >
>> >> >
>> >> > From the man page.
>> >> >
>> >> > IO_SUBMIT(2)                                   Linux Programmer's
>> >> > Manual                                  IO_SUBMIT(2)
>> >> >
>> >> > NAME
>> >> >        io_submit - submit asynchronous I/O blocks for processing
>> >> > ...
>> >> > RETURN VALUE
>> >> >        On success, io_submit() returns the number of iocbs submitted
>> >> > (which may be 0 if nr is zero).  For the  failure
>> >> >        return, see NOTES.
>> >> >
>> >> > ERRORS
>> >> >        EAGAIN Insufficient resources are available to queue any
>> >> > iocbs.
>> >> >
>> >> > I suspect increasing bdev_aio_max_queue_depth may help here but some
>> >> > of the other devs may have more/better ideas.
>> >>
>> >> Yes--try increasing bdev_aio_max_queue_depth.  It defaults to 32; try
>> >> changing it to 128, 1024, or 4096 and see if these errors go away.
>> >>
>> >> I've never been able to trigger this on my test boxes, but I put in the
>> >> warning to help ensure we pick a good default.
>> >>
>> >> What kernel version are you running?
>> >>
>> >> Thanks!
>> >> sage
>> >
>> >
>>
>>
>>
>> --
>> Cheers,
>> Brad
>
>



-- 
Cheers,
Brad

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Log message --> "bdev(/var/lib/ceph/osd/ceph-x/block) aio_submit retries"
       [not found]                           ` <CAF-wwdGHgO6etLromGd_5-UTBFmXD1JeOAiX_eFDB8z7bXZJ6A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2017-03-16  7:19                             ` nokia ceph
  2017-03-16 13:48                             ` Sage Weil
  1 sibling, 0 replies; 8+ messages in thread
From: nokia ceph @ 2017-03-16  7:19 UTC (permalink / raw)
  To: Brad Hubbard; +Cc: Ceph Users, ceph-devel


[-- Attachment #1.1: Type: text/plain, Size: 6678 bytes --]

Sounds good :), Brad many thanks for the explanation .

On Thu, Mar 16, 2017 at 12:42 PM, Brad Hubbard <bhubbard-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:

> On Thu, Mar 16, 2017 at 4:33 PM, nokia ceph <nokiacephusers-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> wrote:
> > Hello Brad,
> >
> > I meant for this parameter bdev_aio_max_queue_depth , Sage suggested try
> > diff values, 128,1024 , 4096 . So my doubt how this calculation happens?
> Is
> > this  related to memory?
>
> The bdev_aio_max_queue_depth parameter represents the nr_events
> argument to the libaio io_setup function.
>
> int io_setup(unsigned nr_events, aio_context_t *ctx_idp);
>
> From the man page for io_setup:
>
> "The io_setup() system call creates an asynchronous I/O context
> suitable for concurrently processing nr_events operations."
>
> The current theory we are working with is that io_submit is returning
> EAGAIN because nr_events is too small at the default of 32. Therefore
> we have suggested raising this value. There is no real calculation
> involved in the values Sage is suggesting other than they are
> *larger*. It's a matter of playing with the value to see if, and when,
> the error messages go away. If we know a larger value reduces or
> eradicates the error we can then turn our focus more to *why*. Longer
> term this can assist us in setting a more reasonable default.
>
> >
> > Thanks
> >
> >
> >
> >
> > On Thu, Mar 16, 2017 at 11:53 AM, Brad Hubbard <bhubbard-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
> wrote:
> >>
> >> On Thu, Mar 16, 2017 at 4:15 PM, nokia ceph <nokiacephusers-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> >> wrote:
> >> > Hello,
> >> >
> >> > We are running latest kernel - 3.10.0-514.2.2.el7.x86_64 { RHEL 7.3 }
> >> >
> >> > Sure I will try to alter this directive - bdev_aio_max_queue_depth and
> >> > will
> >> > share our results.
> >> >
> >> > Could you please explain how this calculation happens?
> >>
> >> What calculation are you referring to?
> >>
> >> > Thanks
> >> >
> >> >
> >> > On Wed, Mar 15, 2017 at 7:54 PM, Sage Weil <sage-BnTBU8nroG7k1uMJSBkQmQ@public.gmane.org> wrote:
> >> >>
> >> >> On Wed, 15 Mar 2017, Brad Hubbard wrote:
> >> >> > +ceph-devel
> >> >> >
> >> >> > On Wed, Mar 15, 2017 at 5:25 PM, nokia ceph
> >> >> > <nokiacephusers-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> >> >> > wrote:
> >> >> > > Hello,
> >> >> > >
> >> >> > > We suspect these messages not only at the time of OSD creation.
> But
> >> >> > > in
> >> >> > > idle
> >> >> > > conditions also. May I know what is the impact of these error?
> Can
> >> >> > > we
> >> >> > > safely
> >> >> > > ignore this? Or is there any way/config to fix this problem
> >> >> > >
> >> >> > > Few occurrence for these events as follows:---
> >> >> > >
> >> >> > > ====
> >> >> > > 2017-03-14 17:16:09.500370 7fedeba61700  4 rocksdb: (Original Log
> >> >> > > Time
> >> >> > > 2017/03/14-17:16:09.453130) [default] Level-0 commit table #60
> >> >> > > started
> >> >> > > 2017-03-14 17:16:09.500374 7fedeba61700  4 rocksdb: (Original Log
> >> >> > > Time
> >> >> > > 2017/03/14-17:16:09.500273) [default] Level-0 commit table #60:
> >> >> > > memtable #1
> >> >> > > done
> >> >> > > 2017-03-14 17:16:09.500376 7fedeba61700  4 rocksdb: (Original Log
> >> >> > > Time
> >> >> > > 2017/03/14-17:16:09.500297) EVENT_LOG_v1 {"time_micros":
> >> >> > > 1489511769500289,
> >> >> > > "job": 17, "event": "flush_finished", "lsm_state": [2, 4, 6, 0,
> 0,
> >> >> > > 0,
> >> >> > > 0],
> >> >> > > "immutable_memtables": 0}
> >> >> > > 2017-03-14 17:16:09.500382 7fedeba61700  4 rocksdb: (Original Log
> >> >> > > Time
> >> >> > > 2017/03/14-17:16:09.500330) [default] Level summary: base level 1
> >> >> > > max
> >> >> > > bytes
> >> >> > > base 268435456 files[2 4 6 0 0 0 0] max score 0.76
> >> >> > >
> >> >> > > 2017-03-14 17:16:09.500390 7fedeba61700  4 rocksdb: [JOB 17] Try
> to
> >> >> > > delete
> >> >> > > WAL files size 244090350, prev total WAL file size 247331500,
> >> >> > > number
> >> >> > > of live
> >> >> > > WAL files 2.
> >> >> > >
> >> >> > > 2017-03-14 17:34:11.610513 7fedf3a71700 -1
> >> >> > > bdev(/var/lib/ceph/osd/ceph-73/block) aio_submit retries 6
> >> >> >
> >> >> > These errors come from here.
> >> >> >
> >> >> > void KernelDevice::aio_submit(IOContext *ioc)
> >> >> > {
> >> >> > ...
> >> >> >     int r = aio_queue.submit(*cur, &retries);
> >> >> >     if (retries)
> >> >> >       derr << __func__ << " retries " << retries << dendl;
> >> >> >
> >> >> > The submit function is this one which calls libaio's io_submit
> >> >> > function directly and increments retries if it receives EAGAIN.
> >> >> >
> >> >> > #if defined(HAVE_LIBAIO)
> >> >> > int FS::aio_queue_t::submit(aio_t &aio, int *retries)
> >> >> > {
> >> >> >   // 2^16 * 125us = ~8 seconds, so max sleep is ~16 seconds
> >> >> >   int attempts = 16;
> >> >> >   int delay = 125;
> >> >> >   iocb *piocb = &aio.iocb;
> >> >> >   while (true) {
> >> >> >     int r = io_submit(ctx, 1, &piocb);     <-------------NOTE
> >> >> >     if (r < 0) {
> >> >> >       if (r == -EAGAIN && attempts-- > 0) {     <-------------NOTE
> >> >> >         usleep(delay);
> >> >> >         delay *= 2;
> >> >> >         (*retries)++;
> >> >> >         continue;
> >> >> >       }
> >> >> >       return r;
> >> >> >     }
> >> >> >     assert(r == 1);
> >> >> >     break;
> >> >> >   }
> >> >> >   return 0;
> >> >> > }
> >> >> >
> >> >> >
> >> >> > From the man page.
> >> >> >
> >> >> > IO_SUBMIT(2)                                   Linux Programmer's
> >> >> > Manual                                  IO_SUBMIT(2)
> >> >> >
> >> >> > NAME
> >> >> >        io_submit - submit asynchronous I/O blocks for processing
> >> >> > ...
> >> >> > RETURN VALUE
> >> >> >        On success, io_submit() returns the number of iocbs
> submitted
> >> >> > (which may be 0 if nr is zero).  For the  failure
> >> >> >        return, see NOTES.
> >> >> >
> >> >> > ERRORS
> >> >> >        EAGAIN Insufficient resources are available to queue any
> >> >> > iocbs.
> >> >> >
> >> >> > I suspect increasing bdev_aio_max_queue_depth may help here but
> some
> >> >> > of the other devs may have more/better ideas.
> >> >>
> >> >> Yes--try increasing bdev_aio_max_queue_depth.  It defaults to 32; try
> >> >> changing it to 128, 1024, or 4096 and see if these errors go away.
> >> >>
> >> >> I've never been able to trigger this on my test boxes, but I put in
> the
> >> >> warning to help ensure we pick a good default.
> >> >>
> >> >> What kernel version are you running?
> >> >>
> >> >> Thanks!
> >> >> sage
> >> >
> >> >
> >>
> >>
> >>
> >> --
> >> Cheers,
> >> Brad
> >
> >
>
>
>
> --
> Cheers,
> Brad
>

[-- Attachment #1.2: Type: text/html, Size: 10542 bytes --]

[-- Attachment #2: Type: text/plain, Size: 178 bytes --]

_______________________________________________
ceph-users mailing list
ceph-users-idqoXFIVOFJgJs9I8MT0rw@public.gmane.org
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Log message --> "bdev(/var/lib/ceph/osd/ceph-x/block) aio_submit retries"
       [not found]                           ` <CAF-wwdGHgO6etLromGd_5-UTBFmXD1JeOAiX_eFDB8z7bXZJ6A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2017-03-16  7:19                             ` nokia ceph
@ 2017-03-16 13:48                             ` Sage Weil
  1 sibling, 0 replies; 8+ messages in thread
From: Sage Weil @ 2017-03-16 13:48 UTC (permalink / raw)
  To: Brad Hubbard; +Cc: Ceph Users, ceph-devel

On Thu, 16 Mar 2017, Brad Hubbard wrote:
> On Thu, Mar 16, 2017 at 4:33 PM, nokia ceph <nokiacephusers-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> > Hello Brad,
> >
> > I meant for this parameter bdev_aio_max_queue_depth , Sage suggested try
> > diff values, 128,1024 , 4096 . So my doubt how this calculation happens? Is
> > this  related to memory?
> 
> The bdev_aio_max_queue_depth parameter represents the nr_events
> argument to the libaio io_setup function.
> 
> int io_setup(unsigned nr_events, aio_context_t *ctx_idp);
> 
> From the man page for io_setup:
> 
> "The io_setup() system call creates an asynchronous I/O context
> suitable for concurrently processing nr_events operations."
> 
> The current theory we are working with is that io_submit is returning
> EAGAIN because nr_events is too small at the default of 32. Therefore
> we have suggested raising this value. There is no real calculation
> involved in the values Sage is suggesting other than they are
> *larger*. It's a matter of playing with the value to see if, and when,
> the error messages go away. If we know a larger value reduces or
> eradicates the error we can then turn our focus more to *why*. Longer
> term this can assist us in setting a more reasonable default.

One nuance is that small values are equivalent because the kernel 
apparently rounds up to a page size-aligned buffer full of struct iocb's 
(or whatever the kernel equivalent is).  My guess is that we want a 
default that equates to 2 or 4 pages instead of 1 page.

We do lots of testing of the same kernel in the lab; I'm adding an item to 
my list to look for these messages in our logs too.  FWIW, as long as the 
retry is succeeding this is pretty harmless (we're basic just limiting the 
depth of the io queue at the kernel and device to some probably-reasoanble 
value).  My curiousity here is somewhat academic.  :)

Thanks!
sage


> 
> >
> > Thanks
> >
> >
> >
> >
> > On Thu, Mar 16, 2017 at 11:53 AM, Brad Hubbard <bhubbard-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
> >>
> >> On Thu, Mar 16, 2017 at 4:15 PM, nokia ceph <nokiacephusers-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> >> wrote:
> >> > Hello,
> >> >
> >> > We are running latest kernel - 3.10.0-514.2.2.el7.x86_64 { RHEL 7.3 }
> >> >
> >> > Sure I will try to alter this directive - bdev_aio_max_queue_depth and
> >> > will
> >> > share our results.
> >> >
> >> > Could you please explain how this calculation happens?
> >>
> >> What calculation are you referring to?
> >>
> >> > Thanks
> >> >
> >> >
> >> > On Wed, Mar 15, 2017 at 7:54 PM, Sage Weil <sage-BnTBU8nroG7k1uMJSBkQmQ@public.gmane.org> wrote:
> >> >>
> >> >> On Wed, 15 Mar 2017, Brad Hubbard wrote:
> >> >> > +ceph-devel
> >> >> >
> >> >> > On Wed, Mar 15, 2017 at 5:25 PM, nokia ceph
> >> >> > <nokiacephusers-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> >> >> > wrote:
> >> >> > > Hello,
> >> >> > >
> >> >> > > We suspect these messages not only at the time of OSD creation. But
> >> >> > > in
> >> >> > > idle
> >> >> > > conditions also. May I know what is the impact of these error? Can
> >> >> > > we
> >> >> > > safely
> >> >> > > ignore this? Or is there any way/config to fix this problem
> >> >> > >
> >> >> > > Few occurrence for these events as follows:---
> >> >> > >
> >> >> > > ====
> >> >> > > 2017-03-14 17:16:09.500370 7fedeba61700  4 rocksdb: (Original Log
> >> >> > > Time
> >> >> > > 2017/03/14-17:16:09.453130) [default] Level-0 commit table #60
> >> >> > > started
> >> >> > > 2017-03-14 17:16:09.500374 7fedeba61700  4 rocksdb: (Original Log
> >> >> > > Time
> >> >> > > 2017/03/14-17:16:09.500273) [default] Level-0 commit table #60:
> >> >> > > memtable #1
> >> >> > > done
> >> >> > > 2017-03-14 17:16:09.500376 7fedeba61700  4 rocksdb: (Original Log
> >> >> > > Time
> >> >> > > 2017/03/14-17:16:09.500297) EVENT_LOG_v1 {"time_micros":
> >> >> > > 1489511769500289,
> >> >> > > "job": 17, "event": "flush_finished", "lsm_state": [2, 4, 6, 0, 0,
> >> >> > > 0,
> >> >> > > 0],
> >> >> > > "immutable_memtables": 0}
> >> >> > > 2017-03-14 17:16:09.500382 7fedeba61700  4 rocksdb: (Original Log
> >> >> > > Time
> >> >> > > 2017/03/14-17:16:09.500330) [default] Level summary: base level 1
> >> >> > > max
> >> >> > > bytes
> >> >> > > base 268435456 files[2 4 6 0 0 0 0] max score 0.76
> >> >> > >
> >> >> > > 2017-03-14 17:16:09.500390 7fedeba61700  4 rocksdb: [JOB 17] Try to
> >> >> > > delete
> >> >> > > WAL files size 244090350, prev total WAL file size 247331500,
> >> >> > > number
> >> >> > > of live
> >> >> > > WAL files 2.
> >> >> > >
> >> >> > > 2017-03-14 17:34:11.610513 7fedf3a71700 -1
> >> >> > > bdev(/var/lib/ceph/osd/ceph-73/block) aio_submit retries 6
> >> >> >
> >> >> > These errors come from here.
> >> >> >
> >> >> > void KernelDevice::aio_submit(IOContext *ioc)
> >> >> > {
> >> >> > ...
> >> >> >     int r = aio_queue.submit(*cur, &retries);
> >> >> >     if (retries)
> >> >> >       derr << __func__ << " retries " << retries << dendl;
> >> >> >
> >> >> > The submit function is this one which calls libaio's io_submit
> >> >> > function directly and increments retries if it receives EAGAIN.
> >> >> >
> >> >> > #if defined(HAVE_LIBAIO)
> >> >> > int FS::aio_queue_t::submit(aio_t &aio, int *retries)
> >> >> > {
> >> >> >   // 2^16 * 125us = ~8 seconds, so max sleep is ~16 seconds
> >> >> >   int attempts = 16;
> >> >> >   int delay = 125;
> >> >> >   iocb *piocb = &aio.iocb;
> >> >> >   while (true) {
> >> >> >     int r = io_submit(ctx, 1, &piocb);     <-------------NOTE
> >> >> >     if (r < 0) {
> >> >> >       if (r == -EAGAIN && attempts-- > 0) {     <-------------NOTE
> >> >> >         usleep(delay);
> >> >> >         delay *= 2;
> >> >> >         (*retries)++;
> >> >> >         continue;
> >> >> >       }
> >> >> >       return r;
> >> >> >     }
> >> >> >     assert(r == 1);
> >> >> >     break;
> >> >> >   }
> >> >> >   return 0;
> >> >> > }
> >> >> >
> >> >> >
> >> >> > From the man page.
> >> >> >
> >> >> > IO_SUBMIT(2)                                   Linux Programmer's
> >> >> > Manual                                  IO_SUBMIT(2)
> >> >> >
> >> >> > NAME
> >> >> >        io_submit - submit asynchronous I/O blocks for processing
> >> >> > ...
> >> >> > RETURN VALUE
> >> >> >        On success, io_submit() returns the number of iocbs submitted
> >> >> > (which may be 0 if nr is zero).  For the  failure
> >> >> >        return, see NOTES.
> >> >> >
> >> >> > ERRORS
> >> >> >        EAGAIN Insufficient resources are available to queue any
> >> >> > iocbs.
> >> >> >
> >> >> > I suspect increasing bdev_aio_max_queue_depth may help here but some
> >> >> > of the other devs may have more/better ideas.
> >> >>
> >> >> Yes--try increasing bdev_aio_max_queue_depth.  It defaults to 32; try
> >> >> changing it to 128, 1024, or 4096 and see if these errors go away.
> >> >>
> >> >> I've never been able to trigger this on my test boxes, but I put in the
> >> >> warning to help ensure we pick a good default.
> >> >>
> >> >> What kernel version are you running?
> >> >>
> >> >> Thanks!
> >> >> sage
> >> >
> >> >
> >>
> >>
> >>
> >> --
> >> Cheers,
> >> Brad
> >
> >
> 
> 
> 
> -- 
> Cheers,
> Brad
> 
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2017-03-16 13:48 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <CALWuvY8dEyWiUxg9VDaMkfQGwPSsWG0adrtzvGa9k7fhNmdsGw@mail.gmail.com>
     [not found] ` <CALWuvY9wY08n2AE9HMfZgZCU9ZEtk2XOwa353_stxT-7x7L30A@mail.gmail.com>
     [not found]   ` <CALWuvY_9uWE_DKg5UaYZrc_GchfMK1EpX1rkWrJEuAV+Y3+Uww@mail.gmail.com>
2017-03-15  8:38     ` [ceph-users] Log message --> "bdev(/var/lib/ceph/osd/ceph-x/block) aio_submit retries" Brad Hubbard
     [not found]       ` <CAF-wwdFvgzkYMnkheomsWMikZoZU+THyrxzovORx3bCDfPZ7Ng-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2017-03-15 14:24         ` Sage Weil
     [not found]           ` <alpine.DEB.2.11.1703151422110.10776-qHenpvqtifaMSRpgCs4c+g@public.gmane.org>
2017-03-16  6:15             ` nokia ceph
     [not found]               ` <CALWuvY8f1j9_ccbzrchfdhZ66RYDTqRuL2A2qy9XFHTARitPBQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2017-03-16  6:23                 ` Brad Hubbard
     [not found]                   ` <CAF-wwdGBF5yDMOpdxm0S5fXcYYt94LaEiAXw8qXT2g66pB30xg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2017-03-16  6:33                     ` nokia ceph
     [not found]                       ` <CALWuvY-m4JyaohTJCfzeC==CnfJ2cGK1n8P+49pS65u5n8hnfQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2017-03-16  7:12                         ` Brad Hubbard
     [not found]                           ` <CAF-wwdGHgO6etLromGd_5-UTBFmXD1JeOAiX_eFDB8z7bXZJ6A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2017-03-16  7:19                             ` nokia ceph
2017-03-16 13:48                             ` Sage Weil

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.