[RFC] [CTDB] Optimized memory handling.

[RFC] [CTDB] Optimized memory handling.

Samba - samba-technical mailing list
As mentioned in an earlier mail, I'm new to SAMBA and CTDB.
During my on-boarding (reading code) I was wondering whether the memory
handling is really optimal.

The attached patch shows a more extensive usage of the 
memory pool concepts provided by the talloc API.
In addition, I re-organized the processing a bit so that the memory
requirements are gathered first and the allocation happens afterwards,
instead of allocating first and possibly failing later.
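
To illustrate the general idea (just a toy sketch, not code from the patch;
the sizes and names here are made up):

#include <stdint.h>
#include <talloc.h>

/* Toy example: create a pool once, then serve small, short-lived
 * allocations from it instead of hitting malloc() for each one. */
static void pool_example(void)
{
	/* 16 KiB is an arbitrary example size, not a value from the patch. */
	TALLOC_CTX *pool = talloc_pool(NULL, 16 * 1024);
	if (pool == NULL) {
		return;
	}

	/* Carved out of the pool's pre-allocated chunk while space remains;
	 * larger requests fall back to a normal talloc allocation. */
	uint8_t *pkt = talloc_size(pool, 512);
	char *name = talloc_strdup(pool, "example");

	/* Freeing children is cheap; the pool chunk itself is only
	 * reclaimed once everything allocated from it has been freed. */
	TALLOC_FREE(pkt);
	TALLOC_FREE(name);
	talloc_free(pool);
}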

The currently used magic numbers for the pool sizes are still under
investigation... hints and recommendations are welcome.

However, initial tests showed big improvements.

Please comment.

Cheers Swen

Attachment: mem_opt_rfc.patch (8K)

Re: [RFC] [CTDB] Optimized memory handling.

Samba - samba-technical mailing list
On Tue, Dec 12, 2017 at 12:49:08PM +0100, Swen Schillig via samba-technical wrote:

> As mentioned in an earlier mail, I'm new to SAMBA and CTDB.
> During my on-boarding (reading code) I was wondering if the memory
> handling is really so fortunate.
>
> The attached patch shows a more extensive usage of the 
> memory pool concepts provided by the talloc API.
> In addition I re-organized the processing a bit in that way that I'm
> trying to first gather the memory requirements and then allocate,
> instead of the other way around and then fail.
>
> The currently used magic numbers for the pool sizes are still under
> investigation...hints and recommendations are welcome.
>
> However, initial tests showed big improvements.

Can you give more details on what you mean by "big improvements"?
Some numbers would be good :-).

Thanks,

Jeremy.

> diff --git a/ctdb/common/ctdb_io.c b/ctdb/common/ctdb_io.c
> index 2b43287..f0a73d1 100644
> --- a/ctdb/common/ctdb_io.c
> +++ b/ctdb/common/ctdb_io.c
> @@ -70,7 +70,8 @@ struct ctdb_queue {
>   uint32_t buffer_size;
>  };
>  
> -
> +#define DEFAULT_BUFFER_SIZE 1024
> +static TALLOC_CTX *data_pool = NULL;
>  
>  int ctdb_queue_length(struct ctdb_queue *queue)
>  {
> @@ -98,7 +99,9 @@ static void queue_process(struct ctdb_queue *queue)
>  {
>   uint32_t pkt_size;
>   uint8_t *data;
> + uint8_t *tmp_bdata;
>  
> + /* Did we at least read the size into the buffer */
>   if (queue->buffer.length < sizeof(pkt_size)) {
>   return;
>   }
> @@ -117,12 +120,11 @@ static void queue_process(struct ctdb_queue *queue)
>   }
>  
>   /* Extract complete packet */
> - data = talloc_size(queue, pkt_size);
> + data = talloc_memdup(data_pool, queue->buffer.data, pkt_size);
>   if (data == NULL) {
> - DEBUG(DEBUG_ERR, ("read error alloc failed for %u\n", pkt_size));
> + DEBUG(DEBUG_ERR, ("data alloc failed for %u\n", pkt_size));
>   return;
>   }
> - memcpy(data, queue->buffer.data, pkt_size);
>  
>   /* Shift packet out from buffer */
>   if (queue->buffer.length > pkt_size) {
> @@ -136,10 +138,15 @@ static void queue_process(struct ctdb_queue *queue)
>   /* There is more data to be processed, schedule an event */
>   tevent_schedule_immediate(queue->im, queue->ctdb->ev,
>    queue_process_event, queue);
> - } else {
> - if (queue->buffer.size > queue->buffer_size) {
> - TALLOC_FREE(queue->buffer.data);
> - queue->buffer.size = 0;
> + } else if (queue->buffer.size > (queue->buffer_size << 1)) {
> + /* only reduce size if buffer grew to more than double */
> + // SS should that be done at all ? Why not leaving it as it is.
> +
> + tmp_bdata = talloc_realloc_size(queue, queue->buffer.data,
> + queue->buffer_size);
> + if (likely(tmp_bdata)) {
> + queue->buffer.data = tmp_bdata;
> + queue->buffer.size = queue->buffer_size;
>   }
>   }
>  
> @@ -152,17 +159,15 @@ failed:
>  
>  }
>  
> -
>  /*
>    called when an incoming connection is readable
>    This function MUST be safe for reentry via the queue callback!
>  */
>  static void queue_io_read(struct ctdb_queue *queue)
>  {
> - int num_ready = 0;
>   ssize_t nread;
>   uint8_t *data;
> - int navail;
> + int num_ready;
>  
>   /* check how much data is available on the socket for immediately
>     guaranteed nonblocking access.
> @@ -173,24 +178,24 @@ static void queue_io_read(struct ctdb_queue *queue)
>   if (ioctl(queue->fd, FIONREAD, &num_ready) != 0) {
>   return;
>   }
> +
>   if (num_ready == 0) {
>   /* the descriptor has been closed */
>   goto failed;
>   }
>  
> - if (queue->buffer.data == NULL) {
> - /* starting fresh, allocate buf to read data */
> - queue->buffer.data = talloc_size(queue, queue->buffer_size);
> - if (queue->buffer.data == NULL) {
> - DEBUG(DEBUG_ERR, ("read error alloc failed for %u\n", num_ready));
> - goto failed;
> - }
> - queue->buffer.size = queue->buffer_size;
> - } else if (queue->buffer.extend > 0) {
> + if (num_ready > (queue->buffer.size - queue->buffer.length)) {
> + queue->buffer.extend = MAX(queue->buffer.extend,
> +   (num_ready + queue->buffer.length));
> + }
> +
> + if (queue->buffer.extend > 0) {
>   /* extending buffer */
> - data = talloc_realloc_size(queue, queue->buffer.data, queue->buffer.extend);
> + data = talloc_realloc_size(queue, queue->buffer.data,
> +   queue->buffer.extend);
>   if (data == NULL) {
> - DEBUG(DEBUG_ERR, ("read error realloc failed for %u\n", queue->buffer.extend));
> + DEBUG(DEBUG_ERR, ("read error realloc failed for %u\n",
> +  queue->buffer.extend));
>   goto failed;
>   }
>   queue->buffer.data = data;
> @@ -198,22 +203,17 @@ static void queue_io_read(struct ctdb_queue *queue)
>   queue->buffer.extend = 0;
>   }
>  
> - navail = queue->buffer.size - queue->buffer.length;
> - if (num_ready > navail) {
> - num_ready = navail;
> - }
> + nread = sys_read(queue->fd, queue->buffer.data + queue->buffer.length,
> + num_ready);
>  
> - if (num_ready > 0) {
> - nread = sys_read(queue->fd,
> - queue->buffer.data + queue->buffer.length,
> - num_ready);
> - if (nread <= 0) {
> - DEBUG(DEBUG_ERR, ("read error nread=%d\n", (int)nread));
> - goto failed;
> - }
> - queue->buffer.length += nread;
> + if (nread <= 0) {
> + DEBUG(DEBUG_ERR, ("read error %d (%s)\n", errno,
> +  strerror(errno)));
> + goto failed;
>   }
>  
> + queue->buffer.length += nread;
> +
>   queue_process(queue);
>   return;
>  
> @@ -392,8 +392,7 @@ int ctdb_queue_send(struct ctdb_queue *queue, uint8_t *data, uint32_t length)
>  int ctdb_queue_set_fd(struct ctdb_queue *queue, int fd)
>  {
>   queue->fd = fd;
> - talloc_free(queue->fde);
> - queue->fde = NULL;
> + TALLOC_FREE(queue->fde);
>  
>   if (fd != -1) {
>   queue->fde = tevent_add_fd(queue->ctdb->ev, queue, fd,
> @@ -423,34 +422,47 @@ struct ctdb_queue *ctdb_queue_setup(struct ctdb_context *ctdb,
>   struct ctdb_queue *queue;
>   va_list ap;
>  
> - queue = talloc_zero(mem_ctx, struct ctdb_queue);
> + if (unlikely(data_pool == NULL)) {
> + data_pool = talloc_pool(NULL, 32 * DEFAULT_BUFFER_SIZE);
> + if (data_pool == NULL) {
> + DEBUG(DEBUG_ERR, ("data_pool alloc failed for %p\n",
> +  private_data));
> + return NULL;
> + }
> + }
> +
> + queue = talloc_pooled_object(ctx, struct ctdb_queue, 4, 2048);
>   CTDB_NO_MEMORY_NULL(ctdb, queue);
> + memset(queue, 0, sizeof(struct ctdb_queue));
>  
>   va_start(ap, fmt);
>   queue->name = talloc_vasprintf(queue, fmt, ap);
>   va_end(ap);
> - CTDB_NO_MEMORY_NULL(ctdb, queue->name);
> -
> - queue->im= tevent_create_immediate(queue);
> - CTDB_NO_MEMORY_NULL(ctdb, queue->im);
>  
> + queue->im = tevent_create_immediate(queue);
>   queue->ctdb = ctdb;
>   queue->alignment = alignment;
>   queue->private_data = private_data;
>   queue->callback = callback;
>  
>   if (ctdb_queue_set_fd(queue, fd) != 0) {
> - talloc_free(queue);
> - return NULL;
> + goto failed;
>   }
>  
> - queue->buffer_size = ctdb->tunable.queue_buffer_size;
>   /* In client code, ctdb->tunable is not initialized.
>   * This does not affect recovery daemon.
>   */
> - if (queue->buffer_size == 0) {
> - queue->buffer_size = 1024;
> + queue->buffer_size = ctdb->tunable.queue_buffer_size ?
> + ctdb->tunable.queue_buffer_size :
> + DEFAULT_BUFFER_SIZE;
> +
> + queue->buffer.data = talloc_size(queue, queue->buffer_size);
> + if (likely(queue->buffer.data != NULL)) {
> + queue->buffer.size = queue->buffer_size;
> + return queue;
>   }
>  
> - return queue;
> +failed:
> + talloc_free(ctx);
> + return NULL;
>  }



Re: [RFC] [CTDB] Optimized memory handling.

Samba - samba-technical mailing list
On Wed, Dec 13, 2017 at 10:52 AM, Jeremy Allison via samba-technical
<[hidden email]> wrote:

> On Tue, Dec 12, 2017 at 12:49:08PM +0100, Swen Schillig via samba-technical wrote:
>> As mentioned in an earlier mail, I'm new to SAMBA and CTDB.
>> During my on-boarding (reading code) I was wondering if the memory
>> handling is really so fortunate.
>>
>> The attached patch shows a more extensive usage of the
>> memory pool concepts provided by the talloc API.
>> In addition I re-organized the processing a bit in that way that I'm
>> trying to first gather the memory requirements and then allocate,
>> instead of the other way around and then fail.
>>
>> The currently used magic numbers for the pool sizes are still under
>> investigation...hints and recommendations are welcome.
>>
>> However, initial tests showed big improvements.
>
> Can you give more details on what you mean by "big improvements" ?
> Some numbers would be good :-).
>
> Thanks,
>
> Jeremy.
>

Even without seeing the proof I can imagine the improvements in packet
handling using the talloc pool.  That code is in the hot path and
avoiding extra malloc() calls will definitely result in reducing the
load on the ctdb daemon.  However, as Jeremy mentioned, it would be good
to verify with real numbers.

As outlined in another email, I would prefer if you make these changes
in sock_io.c.  It will be easier to spin up a self-contained
performance test using sock_io without needing to run an expensive
cluster test.

Regarding the actual changes, I wish I had looked at talloc pools
before.  Packet handling is the ideal place for using talloc pools.  I
would avoid adding static globals (data_pool) and instead stick the pool
on a context.
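
Something along these lines, for example (just a sketch; the struct and
names are made up, not actual CTDB code):

#include <stddef.h>
#include <talloc.h>

/* Hypothetical context-owned pool instead of a file-level static:
 * the pool lives and dies with the structure that owns it. */
struct queue_ctx {
	TALLOC_CTX *data_pool;
};

static struct queue_ctx *queue_ctx_create(TALLOC_CTX *mem_ctx,
					   size_t pool_size)
{
	struct queue_ctx *qc = talloc_zero(mem_ctx, struct queue_ctx);
	if (qc == NULL) {
		return NULL;
	}

	/* The pool is a talloc child of qc, so freeing qc tears it
	 * down as well; no global state to clean up. */
	qc->data_pool = talloc_pool(qc, pool_size);
	if (qc->data_pool == NULL) {
		TALLOC_FREE(qc);
		return NULL;
	}
	return qc;
}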

Amitay.


Re: [RFC] [CTDB] Optimized memory handling.

Samba - samba-technical mailing list
On Thu, Dec 14, 2017 at 10:42:58AM +1100, Amitay Isaacs wrote:

>
> Even without seeing the proof I can imagine the improvements in packet
> handling using the talloc pool.  That code is in the hot path and
> avoiding extra malloc() calls will definitely result in reducing the
> load on ctdb daemon.  However, as Jeremy mentioned, it would be good
> to verify with real numbers.
>
> As outlined in another email, I would prefer if you make these changes
> in sock_io.c.  It will be easier to spin up a self-contained
> performance test using sock_io without needing to run an expensive
> cluster test.
>
> Regarding the actual changes, I wish I had looked at talloc pools
> before.  Packet handling is the ideal place for using talloc pools.  I

Yep. That's why Volker invented them for smbd :-).


Re: [RFC] [CTDB] Optimized memory handling.

Samba - samba-technical mailing list
In reply to this post by Samba - samba-technical mailing list
On Thu, 2017-12-14 at 10:42 +1100, Amitay Isaacs via samba-technical
wrote:

> On Wed, Dec 13, 2017 at 10:52 AM, Jeremy Allison via samba-technical
> <[hidden email]> wrote:
> > On Tue, Dec 12, 2017 at 12:49:08PM +0100, Swen Schillig via samba-
> > technical wrote:
> > > ...
> > > The currently used magic numbers for the pool sizes are still
> > > under
> > > investigation...hints and recommendations are welcome.
> > >
> > > However, initial tests showed big improvements.
> >
> > Can you give more details on what you mean by "big improvements" ?
> > Some numbers would be good :-).
> >
> > Thanks,
> >
> > Jeremy.
> >
>
> Even without seeing the proof I can imagine the improvements in
> packet
> handling using the talloc pool.  That code is in the hot path and
> avoiding extra malloc() calls will definitely result in reducing the
> load on ctdb daemon.  However, as Jeremy mentioned, it would be good
> to verify with real numbers.
> ...
>
> Amitay.
>
Thank you guys for having a look, it's really appreciated.

As for the numbers, I do not have any "real" numbers yet.
But an isolated test of the code in question did show an improvement of
approx. 40% for the pool-allocation code vs. the non-pool code.
That is only for the code paths in question, not under a
real-life load.
Anyhow, as Amitay stated, this is the hot path, and if we're "only"
getting 5% in the end that would be "a big improvement" :-)

I'm in the process of getting a cluster ready to perform
real-life tests to get an idea of some good initial pool-size values
(especially for the data pool).

@Amitay: I will try to transfer the changes to sock_io today so you
guys can have another look and make sure the direction is right.

Regarding the "static" data pool.
I'm not a fan of those either, but I couldn't find a decent context to
attach the pool to. Therefore, I decided to setup "one" pool once
for all data-buffer requirements instead of being dependent on some
upper structure which is not related in any way to the data mem.

Cheers Swen



Re: [RFC] [CTDB] Optimized memory handling.

Samba - samba-technical mailing list
On Thu, 2017-12-14 at 08:47 +0100, Swen Schillig via samba-technical
wrote:

> On Thu, 2017-12-14 at 10:42 +1100, Amitay Isaacs via samba-technical
> wrote:
> > On Wed, Dec 13, 2017 at 10:52 AM, Jeremy Allison via samba-
> > technical
> > <[hidden email]> wrote:
> > > On Tue, Dec 12, 2017 at 12:49:08PM +0100, Swen Schillig via
> > > samba-
> > > technical wrote:
> > > > ...
> > > > The currently used magic numbers for the pool sizes are still
> > > > under
> > > > investigation...hints and recommendations are welcome.
> > > >
> > > > However, initial tests showed big improvements.
> > >
> > > Can you give more details on what you mean by "big improvements"
> > > ?
> > > Some numbers would be good :-).
> > >
> > > Thanks,
> > >
> > > Jeremy.
> > >
> >
> > Even without seeing the proof I can imagine the improvements in
> > packet
> > handling using the talloc pool.  That code is in the hot path and
> > avoiding extra malloc() calls will definitely result in reducing
> > the
> > load on ctdb daemon.  However, as Jeremy mentioned, it would be
> > good
> > to verify with real numbers.
> > ...
> >
> > Amitay.
> >
>
> Thank you guys for having a look, it's really appreciated.
>
> As for the numbers, I do not have any "real" numbers yet.
> But an isolated test of the code in question did show improvements of
> approx. 40% of pool-alloc-code vs. non-pool-alloc-code.
> This means just for the areas in question and not running a 
> real-life load.
> Anyhow, as Amitay stated, this is the hot path and if we're "only"
> getting 5% in the end that would be "a big improvement" :-)
>
> I'm in the process of getting a cluster ready to perform 
> real-life-tests to get an idea of some good initial pool-size values,
> (especially for the data pool).
>
> @Amitay: I will try and transfer the changes to sock_io today so you
> guys can have a look again.... making sure the direction is right.
>
> Regarding the "static" data pool. 
> I'm not a fan of those either, but I couldn't find a decent context
> to
> attach the pool to. Therefore, I decided to setup "one" pool once
> for all data-buffer requirements instead of being dependent on some
> upper structure which is not related in any way to the data mem.
>
> Cheers Swen
Now that's embarrassing.

The changes I proposed are almost all there in
Amitay's code (sock_io.c) already.
I should have read more before posting, sorry.

From the patches/code I posted so far, I guess I have to re-order the
changes for them to make sense again:
first the memory-opt changes (now in sock_io.c) and then the ctdb_io.c adaptations.

A few comments on the code.
1) Why do we have to free the memory for queue->buf every time?
   I don't think this is necessary (talloc_realloc if needed), and
   skipping it would save time again; see the sketch after this list.
   Once done with the queue it'll be freed anyway.
2) I still believe we don't need a destructor.
   Nothing it does is either required (queue-> = -1) or
   not already done automatically (TALLOC_FREE(queue->fde)).
3) @Amitay: I really like the removal (in comparison to ctdb_io) of
   the really expensive "data" memory / memcpy thing; that on its own
   should save time already. I just have to make sure to change all
   the callbacks for ctdb_io to NOT free that memory anymore.
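
Roughly what I mean for (1), as a toy sketch (the buffer struct and names
are made up, not the actual sock_io.c code):

#include <stddef.h>
#include <stdint.h>
#include <talloc.h>

struct io_buffer {		/* hypothetical, for illustration only */
	uint8_t *data;
	size_t size;		/* allocated */
	size_t length;		/* currently used */
};

/* Grow the buffer only when the incoming data does not fit; never shrink
 * or free it here.  It is released when its talloc parent goes away. */
static int buffer_ensure_space(TALLOC_CTX *mem_ctx, struct io_buffer *buf,
			       size_t needed)
{
	uint8_t *tmp;

	if (buf->size - buf->length >= needed) {
		return 0;	/* enough room already */
	}

	tmp = talloc_realloc_size(mem_ctx, buf->data, buf->length + needed);
	if (tmp == NULL) {
		return -1;	/* old buffer is still valid */
	}
	buf->data = tmp;
	buf->size = buf->length + needed;
	return 0;
}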

Attached you can find two patches; the memory-opt patch is pretty small
and straightforward. I could prepare a committable patch by tomorrow,
if it's wanted. Please let me know if you'd be ok with the two additional
changes I suggested above (no destructor / no mem-free).

Regarding the other patch (the ctdb_sock_io replacement), this is WIP and
won't come out of this state until next year.
It does compile but most likely has a good few bugs in it.

Please comment.

Cheers Swen

Attachment: memory_opt.patch (2K)
Attachment: ctdb_sock_io.patch (29K)

Re: [RFC] [CTDB] Optimized memory handling.

Samba - samba-technical mailing list
On Thu, 2017-12-14 at 14:46 +0100, Swen Schillig via samba-technical
wrote:
>
> 3) @Amitay: I really like the removal (in comparison to ctdb_io) of 
>    the really expensive "data" memory / memcpy thing, that on it's
> own 
>    should save time already. I just have to make sure to change all
>    the callbacks for ctdb_io to NOT free that memory anymore.
>
I'm afraid I was a bit too quick here.

One thing I forgot about, even though I pointed it out as a ToDo, is
that the callbacks are responsible for freeing the memory behind "data".

But they must not do that anymore, because the data is no longer a copy.
The memory is stolen at some point in the call chain, which unfortunately
isn't always at the first level.
For now I decided to re-introduce the data_pool again (now in
sock_io.c).
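
Roughly the ownership transfer I mean, as a sketch (illustrative names
only, not the actual call chain):

#include <stdint.h>
#include <talloc.h>

/* A packet allocated from the data pool is handed to a callback, which
 * may move ("steal") it onto a longer-lived structure instead of freeing
 * it immediately; from then on the packet is released with its new
 * parent, not with the pool context. */
static void take_ownership(TALLOC_CTX *long_lived_ctx, uint8_t *pkt)
{
	talloc_steal(long_lived_ctx, pkt);
}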

Cheers Swen



Re: [RFC] [CTDB] Optimized memory handling.

Samba - samba-technical mailing list
In reply to this post by Samba - samba-technical mailing list
On Fri, Dec 15, 2017 at 12:46 AM, Swen Schillig <[hidden email]> wrote:

> On Thu, 2017-12-14 at 08:47 +0100, Swen Schillig via samba-technical
> wrote:
>> On Thu, 2017-12-14 at 10:42 +1100, Amitay Isaacs via samba-technical
>> wrote:
>> > On Wed, Dec 13, 2017 at 10:52 AM, Jeremy Allison via samba-
>> > technical
>> > <[hidden email]> wrote:
>> > > On Tue, Dec 12, 2017 at 12:49:08PM +0100, Swen Schillig via
>> > > samba-
>> > > technical wrote:
>> > > > ...
>> > > > The currently used magic numbers for the pool sizes are still
>> > > > under
>> > > > investigation...hints and recommendations are welcome.
>> > > >
>> > > > However, initial tests showed big improvements.
>> > >
>> > > Can you give more details on what you mean by "big improvements"
>> > > ?
>> > > Some numbers would be good :-).
>> > >
>> > > Thanks,
>> > >
>> > > Jeremy.
>> > >
>> >
>> > Even without seeing the proof I can imagine the improvements in
>> > packet
>> > handling using the talloc pool.  That code is in the hot path and
>> > avoiding extra malloc() calls will definitely result in reducing
>> > the
>> > load on ctdb daemon.  However, as Jeremy mentioned, it would be
>> > good
>> > to verify with real numbers.
>> > ...
>> >
>> > Amitay.
>> >
>>
>> Thank you guys for having a look, it's really appreciated.
>>
>> As for the numbers, I do not have any "real" numbers yet.
>> But an isolated test of the code in question did show improvements of
>> approx. 40% of pool-alloc-code vs. non-pool-alloc-code.
>> This means just for the areas in question and not running a
>> real-life load.
>> Anyhow, as Amitay stated, this is the hot path and if we're "only"
>> getting 5% in the end that would be "a big improvement" :-)
>>
>> I'm in the process of getting a cluster ready to perform
>> real-life-tests to get an idea of some good initial pool-size values,
>> (especially for the data pool).
>>
>> @Amitay: I will try and transfer the changes to sock_io today so you
>> guys can have a look again.... making sure the direction is right.
>>
>> Regarding the "static" data pool.
>> I'm not a fan of those either, but I couldn't find a decent context
>> to
>> attach the pool to. Therefore, I decided to setup "one" pool once
>> for all data-buffer requirements instead of being dependent on some
>> upper structure which is not related in any way to the data mem.
>>
>> Cheers Swen
> Now that's embarrassing.
>
> The changes I proposed are almost all there in
> Amitay's code (socket_io.c) already.
> I should have read more before posting, sorry.
>
> From the patches/code I posted so far, I guess I have to re-order the
> changes for them to make sense again.
> First memory-opt (now in socket_io.c) and then ctdb_io.c adoptions.
>
> A few comments on the code.
> 1) why do we have to free the memory for queue->buf everytime ?
>    I don't think this is necessary (talloc_realloc if needed) and
>    would save time again. Once done with the queue it'll be
>    free'd anyway.

To avoid the buffer sizes growing.  Instead of adding some heuristic
to shrink the buffer size, this was simpler.

> 2) I still believe we don't need a destructor.
>    There's nothing done which is either required (queue-> = -1) or
>    not done automatically (TALLOC_FREE(queue->fde)).

Sure.

> 3) @Amitay: I really like the removal (in comparison to ctdb_io) of
>    the really expensive "data" memory / memcpy thing, that on it's own
>    should save time already. I just have to make sure to change all
>    the callbacks for ctdb_io to NOT free that memory anymore.

That may not be possible.  The model used here is that the callback is
supposed to free the memory once it has processed the
packet.  Usually the packets are processed quickly and freed, so we
can still use talloc pools.
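
Roughly the model, as a sketch (the names are illustrative; the real
ctdb_io callback signature may differ):

#include <stddef.h>
#include <stdint.h>
#include <talloc.h>

/* Hypothetical callback shape: the queue hands over a packet it allocated
 * (possibly from a talloc pool) and the callback owns it from then on. */
typedef void (*packet_cb_fn)(uint8_t *data, size_t length,
			     void *private_data);

static void example_packet_cb(uint8_t *data, size_t length,
			      void *private_data)
{
	(void)length;
	(void)private_data;

	/* ... process the packet ... */

	/* The callback frees the packet when done.  Because packets are
	 * usually processed quickly, pool-backed memory is recycled fast. */
	talloc_free(data);
}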

>
> Attached you can find two patches, the memory-opt patch is pretty small
> and straight forward. I could prepare a committable patch by tomorrow,
> if it's wanted. Please let me know if you'd be ok with the 2 addtl.
> changes I suggested above( no destructor / no mem-free).
>
> Regarding the other patch (ctdb_sock_io replacement) this is WIP and
> won't come out of this state until next year.
> It does compile but has most likely a good few bugs in it.

Can you send the patches created with the "git format-patch" command?  I
cannot "git am" the attached patches.

Amitay.


Re: [RFC] [CTDB] Optimized memory handling.

Samba - samba-technical mailing list
On Fri, 2017-12-15 at 04:28 +1100, Amitay Isaacs via samba-technical
wrote:
> A few comments on the code.
> > 1) why do we have to free the memory for queue->buf everytime ?
> >    I don't think this is necessary (talloc_realloc if needed) and
> >    would save time again. Once done with the queue it'll be
> >    free'd anyway.
>
> To avoid the buffer sizes growing.  Instead of adding some heuristic
> to shrink the buffer size, this was simpler.

The buffer would only grow if needed, and the lifetime of the queues
(including their attached buffers) is limited as well.
Therefore I was thinking we could leave out the shrinking AND the
explicit freeing of the buffer.
What do you think?
>
> > 2) I still believe we don't need a destructor.
> >    There's nothing done which is either required (queue-> = -1) or
> >    not done automatically (TALLOC_FREE(queue->fde)).
>
> Sure.
Done.

>
> > 3) @Amitay: I really like the removal (in comparison to ctdb_io) of
> >    the really expensive "data" memory / memcpy thing, that on it's
> > own
> >    should save time already. I just have to make sure to change all
> >    the callbacks for ctdb_io to NOT free that memory anymore.
>
> That may not be possible.  The model used here is that the callback
> is
> supposed to free the memory once the callback has processed the
> packet.  Usually the packets are processed quickly and freed, so we
> can still use talloc pools.
Ok, as mentioned in an earlier mail, I re-introduced the
data_pool.

>
> >
> > Attached you can find two patches, the memory-opt patch is pretty
> > small
> > and straight forward. I could prepare a committable patch by
> > tomorrow,
> > if it's wanted. Please let me know if you'd be ok with the 2 addtl.
> > changes I suggested above( no destructor / no mem-free).
> >
> > Regarding the other patch (ctdb_sock_io replacement) this is WIP
> > and
> > won't come out of this state until next year.
> > It does compile but has most likely a good few bugs in it.
>
> Can you send the patches created with "git format-patch" command?  I
> cannot "git am" the attached patches.
Yep, see attachment.
I wasn't sure whether the changes were even wanted, therefore I chose to send
just a diff.

Thanks again for reviewing.

Cheers Swen

Attachment: ctdb_sock_io_rebase_series.patch (34K)

Re: [RFC] [CTDB] Optimized memory handling.

Samba - samba-technical mailing list
Shouldn't the 32 in '32 * DEFAULT_OBJ_SIZE' in the talloc_pool call be '8 *
DEFAULT_OBJ_NUMBER' or a separate define?

On 12/14/2017 2:04 PM, Swen Schillig via samba-technical wrote:

> On Fri, 2017-12-15 at 04:28 +1100, Amitay Isaacs via samba-technical
> wrote:
>> A few comments on the code.
>>> 1) why do we have to free the memory for queue->buf everytime ?
>>>     I don't think this is necessary (talloc_realloc if needed) and
>>>     would save time again. Once done with the queue it'll be
>>>     free'd anyway.
>> To avoid the buffer sizes growing.  Instead of adding some heuristic
>> to shrink the buffer size, this was simpler.
> The buffer would only grow if needed and the lifetime of the queues
> (including their attached buffers) is limited as well.
> Therefore I was thinking we could leave out the shrinking AND the
> explicit free'ing of the buffer.
> What do you think ?
>>> 2) I still believe we don't need a destructor.
>>>     There's nothing done which is either required (queue-> = -1) or
>>>     not done automatically (TALLOC_FREE(queue->fde)).
>> Sure.
> Done.
>>> 3) @Amitay: I really like the removal (in comparison to ctdb_io) of
>>>     the really expensive "data" memory / memcpy thing, that on it's
>>> own
>>>     should save time already. I just have to make sure to change all
>>>     the callbacks for ctdb_io to NOT free that memory anymore.
>> That may not be possible.  The model used here is that the callback
>> is
>> supposed to free the memory once the callback has processed the
>> packet.  Usually the packets are processed quickly and freed, so we
>> can still use talloc pools.
> Ok, as mentioned in an earlier mail, I re-introduced the idea of the
> data_pool again.
>>> Attached you can find two patches, the memory-opt patch is pretty
>>> small
>>> and straight forward. I could prepare a committable patch by
>>> tomorrow,
>>> if it's wanted. Please let me know if you'd be ok with the 2 addtl.
>>> changes I suggested above( no destructor / no mem-free).
>>>
>>> Regarding the other patch (ctdb_sock_io replacement) this is WIP
>>> and
>>> won't come out of this state until next year.
>>> It does compile but has most likely a good few bugs in it.
>> Can you send the patches created with "git format-patch" command?  I
>> cannot "git am" the attached patches.
> Yep, see attachment.
> Wasn't sure if the changes are even wanted, therefore I chose to send
> just a diff.
>
> Thanks again for reviewing.
>
> Cheers Swen



Re: [RFC] [CTDB] Optimized memory handling.

Samba - samba-technical mailing list
In reply to this post by Samba - samba-technical mailing list
You should not remove the error checks.
The errors won't happen but that is no reason not to keep the checks.
The code might change in the future.

On 12/14/2017 2:04 PM, Swen Schillig via samba-technical wrote:

>   queue->im = tevent_create_immediate(queue);
> - if (queue->im == NULL) {
> - talloc_free(queue);
> - return NULL;
> - }
> -
>   queue->queue = tevent_queue_create(queue, "out-queue");
> - if (queue->queue == NULL) {
> - talloc_free(queue);
> - return NULL;
> - }



Re: [RFC] [CTDB] Optimized memory handling.

Samba - samba-technical mailing list
In reply to this post by Samba - samba-technical mailing list
On Thu, 2017-12-14 at 14:50 -0500, jim via samba-technical wrote:
> Shouldn't the 32 in '32 * DEFAULT_OBJ_SIZE' in talloc_pool call be '8
> * 
> DEFAULT_OBJ_NUMBER' or a separate define?
You're right, but this is not a committable patch yet and the numbers
are subject to change once I'm ready to run some real-life scenarios.

Cheers Swen

>
> On 12/14/2017 2:04 PM, Swen Schillig via samba-technical wrote:
> > On Fri, 2017-12-15 at 04:28 +1100, Amitay Isaacs via samba-
> > technical
> > wrote:
> > > A few comments on the code.
> > > > 1) why do we have to free the memory for queue->buf everytime ?
> > > >     I don't think this is necessary (talloc_realloc if needed)
> > > > and
> > > >     would save time again. Once done with the queue it'll be
> > > >     free'd anyway.
> > >
> > > To avoid the buffer sizes growing.  Instead of adding some
> > > heuristic
> > > to shrink the buffer size, this was simpler.
> >
> > The buffer would only grow if needed and the lifetime of the queues
> > (including their attached buffers) is limited as well.
> > Therefore I was thinking we could leave out the shrinking AND the
> > explicit free'ing of the buffer.
> > What do you think ?
> > > > 2) I still believe we don't need a destructor.
> > > >     There's nothing done which is either required (queue-> =
> > > > -1) or
> > > >     not done automatically (TALLOC_FREE(queue->fde)).
> > >
> > > Sure.
> >
> > Done.
> > > > 3) @Amitay: I really like the removal (in comparison to
> > > > ctdb_io) of
> > > >     the really expensive "data" memory / memcpy thing, that on
> > > > it's
> > > > own
> > > >     should save time already. I just have to make sure to
> > > > change all
> > > >     the callbacks for ctdb_io to NOT free that memory anymore.
> > >
> > > That may not be possible.  The model used here is that the
> > > callback
> > > is
> > > supposed to free the memory once the callback has processed the
> > > packet.  Usually the packets are processed quickly and freed, so
> > > we
> > > can still use talloc pools.
> >
> > Ok, as mentioned in an earlier mail, I re-introduced the idea of
> > the
> > data_pool again.
> > > > Attached you can find two patches, the memory-opt patch is
> > > > pretty
> > > > small
> > > > and straight forward. I could prepare a committable patch by
> > > > tomorrow,
> > > > if it's wanted. Please let me know if you'd be ok with the 2
> > > > addtl.
> > > > changes I suggested above( no destructor / no mem-free).
> > > >
> > > > Regarding the other patch (ctdb_sock_io replacement) this is
> > > > WIP
> > > > and
> > > > won't come out of this state until next year.
> > > > It does compile but has most likely a good few bugs in it.
> > >
> > > Can you send the patches created with "git format-patch"
> > > command?  I
> > > cannot "git am" the attached patches.
> >
> > Yep, see attachment.
> > Wasn't sure if the changes are even wanted, therefore I chose to
> > send
> > just a diff.
> >
> > Thanks again for reviewing.
> >
> > Cheers Swen
>
>



Re: [RFC] [CTDB] Optimized memory handling.

Samba - samba-technical mailing list
In reply to this post by Samba - samba-technical mailing list
On Thu, 2017-12-14 at 14:53 -0500, jim via samba-technical wrote:
> You should not remove the error checks.
> The errors won't happen but that is no reason not to keep the checks.
> The code might change in the future.

I disagree; if a situation cannot happen, then there's no reason to
check for it. If things change in the future, then this can too.

Cheers Swen

>
> On 12/14/2017 2:04 PM, Swen Schillig via samba-technical wrote:
> >    queue->im = tevent_create_immediate(queue);
> > - if (queue->im == NULL) {
> > - talloc_free(queue);
> > - return NULL;
> > - }
> > -
> >    queue->queue = tevent_queue_create(queue, "out-queue");
> > - if (queue->queue == NULL) {
> > - talloc_free(queue);
> > - return NULL;
> > - }
>
>



Re: [RFC] [CTDB] Optimized memory handling.

Samba - samba-technical mailing list
On Fri, Dec 15, 2017 at 5:16 PM, Swen Schillig via samba-technical
<[hidden email]> wrote:
> On Thu, 2017-12-14 at 14:53 -0500, jim via samba-technical wrote:
>> You should not remove the error checks.
>> The errors won't happen but that is no reason not to keep the checks.
>> The code might change in the future.
>
> I disagree, if a situation cannot happen, then there's no reason to
> check for it. If things change in future, then this can too.
>

That's not correct.  As long as we are making any library calls (even
though it's tevent), you cannot assume they will always succeed.

Amitay.


Re: [RFC] [CTDB] Optimized memory handling.

Samba - samba-technical mailing list
In reply to this post by Samba - samba-technical mailing list
On Fri, Dec 15, 2017 at 6:04 AM, Swen Schillig <[hidden email]> wrote:

> On Fri, 2017-12-15 at 04:28 +1100, Amitay Isaacs via samba-technical
> wrote:
>> A few comments on the code.
>> > 1) why do we have to free the memory for queue->buf everytime ?
>> >    I don't think this is necessary (talloc_realloc if needed) and
>> >    would save time again. Once done with the queue it'll be
>> >    free'd anyway.
>>
>> To avoid the buffer sizes growing.  Instead of adding some heuristic
>> to shrink the buffer size, this was simpler.
>
> The buffer would only grow if needed and the lifetime of the queues
> (including their attached buffers) is limited as well.
> Therefore I was thinking we could leave out the shrinking AND the
> explicit free'ing of the buffer.
> What do you think ?
>>
>> > 2) I still believe we don't need a destructor.
>> >    There's nothing done which is either required (queue-> = -1) or
>> >    not done automatically (TALLOC_FREE(queue->fde)).
>>
>> Sure.
> Done.
>>
>> > 3) @Amitay: I really like the removal (in comparison to ctdb_io) of
>> >    the really expensive "data" memory / memcpy thing, that on it's
>> > own
>> >    should save time already. I just have to make sure to change all
>> >    the callbacks for ctdb_io to NOT free that memory anymore.
>>
>> That may not be possible.  The model used here is that the callback
>> is
>> supposed to free the memory once the callback has processed the
>> packet.  Usually the packets are processed quickly and freed, so we
>> can still use talloc pools.
> Ok, as mentioned in an earlier mail, I re-introduced the idea of the
> data_pool again.
>>
>> >
>> > Attached you can find two patches, the memory-opt patch is pretty
>> > small
>> > and straight forward. I could prepare a committable patch by
>> > tomorrow,
>> > if it's wanted. Please let me know if you'd be ok with the 2 addtl.
>> > changes I suggested above( no destructor / no mem-free).
>> >
>> > Regarding the other patch (ctdb_sock_io replacement) this is WIP
>> > and
>> > won't come out of this state until next year.
>> > It does compile but has most likely a good few bugs in it.
>>
>> Can you send the patches created with "git format-patch" command?  I
>> cannot "git am" the attached patches.
> Yep, see attachment.
> Wasn't sure if the changes are even wanted, therefore I chose to send
> just a diff.

Even though it's a WIP, it's much easier to look at patches which
can be applied with "git am".

Please send patches generated with "git format-patch ...".

Amitay.


Re: [RFC] [CTDB] Optimized memory handling.

Samba - samba-technical mailing list
On Fri, 2017-12-15 at 18:57 +1100, Amitay Isaacs via samba-technical
wrote:
>
> Please send patches generated with "git format-patch ...".
>
> Amitay.
The last one was.

Is it not working for you?

Cheers Swen



Re: [RFC] [CTDB] Optimized memory handling.

Samba - samba-technical mailing list
In reply to this post by Samba - samba-technical mailing list
On Fri, 2017-12-15 at 18:56 +1100, Amitay Isaacs via samba-technical
wrote:

> On Fri, Dec 15, 2017 at 5:16 PM, Swen Schillig via samba-technical
> <[hidden email]> wrote:
> > On Thu, 2017-12-14 at 14:53 -0500, jim via samba-technical wrote:
> > > You should not remove the error checks.
> > > The errors won't happen but that is no reason not to keep the
> > > checks.
> > > The code might change in the future.
> >
> > I disagree, if a situation cannot happen, then there's no reason to
> > check for it. If things change in future, then this can too.
> >
>
> That's not correct.  As long as we are making any library calls (even
> though it's tevent), you cannot assume it will always succeed.
>
> Amitay.
No, not always, but here I can, because the only way for those
calls to fail is if there's no memory for a talloc... and that cannot
happen.
But if you all insist, I can re-add them.

Cheers Swen



Re: [RFC] [CTDB] Optimized memory handling.

Samba - samba-technical mailing list
On Fri, Dec 15, 2017 at 7:20 PM, Swen Schillig <[hidden email]> wrote:

> On Fri, 2017-12-15 at 18:56 +1100, Amitay Isaacs via samba-technical
> wrote:
>> On Fri, Dec 15, 2017 at 5:16 PM, Swen Schillig via samba-technical
>> <[hidden email]> wrote:
>> > On Thu, 2017-12-14 at 14:53 -0500, jim via samba-technical wrote:
>> > > You should not remove the error checks.
>> > > The errors won't happen but that is no reason not to keep the
>> > > checks.
>> > > The code might change in the future.
>> >
>> > I disagree, if a situation cannot happen, then there's no reason to
>> > check for it. If things change in future, then this can too.
>> >
>>
>> That's not correct.  As long as we are making any library calls (even
>> though it's tevent), you cannot assume it will always succeed.
>>
>> Amitay.
> No, not always, but here I can because the only possibility for those
> calls to fail is if there's no memory for a talloc...and that is cannot
> happen.
> But if you all insist, I can re-add.

You cannot assume what the tevent library does in tevent_queue_create()
and tevent_create_immediate() calls.  Currently it might only be doing
talloc calls, but it might also make some system calls which can fail.

Looking at the API documentation for tevent_queue_create(), it says it can
return NULL on error.  So we *must* check for it.
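
Something like this, for example (simplified sketch with a hypothetical
struct; not the actual ctdb_queue_setup()):

#include <talloc.h>
#include <tevent.h>

struct example_queue {		/* hypothetical */
	struct tevent_immediate *im;
	struct tevent_queue *queue;
};

/* Even calls that "cannot fail" today get their return values checked,
 * so the function stays correct if the library behaviour ever changes. */
static struct example_queue *example_queue_setup(TALLOC_CTX *mem_ctx)
{
	struct example_queue *q = talloc_zero(mem_ctx, struct example_queue);
	if (q == NULL) {
		return NULL;
	}

	q->im = tevent_create_immediate(q);
	if (q->im == NULL) {
		talloc_free(q);
		return NULL;
	}

	q->queue = tevent_queue_create(q, "out-queue");
	if (q->queue == NULL) {
		talloc_free(q);
		return NULL;
	}
	return q;
}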

Amitay.


Re: [RFC] [CTDB] Optimized memory handling.

Samba - samba-technical mailing list
In reply to this post by Samba - samba-technical mailing list
On Fri, Dec 15, 2017 at 7:14 PM, Swen Schillig <[hidden email]> wrote:

> On Fri, 2017-12-15 at 18:57 +1100, Amitay Isaacs via samba-technical
> wrote:
>>
>> Please send patches generated with "git format-patch ...".
>>
>> Amitay.
> The last one was.
>
> Is it not working for you ?
>
> Cheers Swen
>

Maybe it's not based on the latest master?  Or you have some other
patches in your tree.

Here is what I get when I try to apply the patch.

$ git am ~/Downloads/ctdb_sock_io_rebase_series.patch
Applying: memory optimizations for sock_io.
error: patch failed: ctdb/common/sock_io.c:31
error: ctdb/common/sock_io.c: patch does not apply
Patch failed at 0001 memory optimizations for sock_io.
The copy of the patch that failed is found in: .git/rebase-apply/patch
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

Amitay.


Re: [RFC] [CTDB] Optimized memory handling.

Samba - samba-technical mailing list
On Fri, 2017-12-15 at 19:48 +1100, Amitay Isaacs via samba-technical
wrote:

> On Fri, Dec 15, 2017 at 7:14 PM, Swen Schillig <[hidden email]>
> wrote:
> > On Fri, 2017-12-15 at 18:57 +1100, Amitay Isaacs via samba-
> > technical
> > wrote:
> > >
> > > Please send patches generated with "git format-patch ...".
> > >
> > > Amitay.
> >
> > The last one was.
> >
> > Is it not working for you ?
> >
> > Cheers Swen
> >
>
> May be it's not based on the latest master?  Or you have some other
> patches in your tree.
Ok, sorry for that.

Let's do this one by one.
Attached is the first, smaller one, which introduces the pools
to sock_io.

I will rebase the other one later today.

Cheers Swen

Attachment: sock_io_mem_opt.patch (4K)