
[PATCH] Fix an abort in transaction_loop under transaction_loop_recovery test (bug 12580)


[PATCH] Fix an abort in transaction_loop under transaction_loop_recovery test (bug 12580)

Amitay Isaacs
Hi,

What's the best practice when using tevent_queue with tevent_req?

comm_write_send/recv computation is used to send data to an fd.  There can
be multiple write requests occurring simultaneously, so comm_write_send
uses a tevent_queue to serialize all those requests.  However, what happens
if a tevent_req that is in the queue is freed?  What's the best way to track
the queue entry corresponding to that tevent_req?

Here's an attempt to track tevent_req and corresponding tevent_queue_entry,
so when the tevent_req is freed, the corresponding tevent_queue_entry also
gets freed and the entry is removed from the queue.

Please review and push.

Amitay.

Attachment: ctdb.patches (20K)

Re: [PATCH] Fix an abort in transaction_loop under transaction_loop_recovery test (bug 12580)

Martin Schwenke
On Wed, 15 Feb 2017 11:26:55 +1100, Amitay Isaacs <[hidden email]>
wrote:

> What's the best practice when using tevent_queue with tevent_req?
>
> comm_write_send/recv computation is used to send data to an fd.  There can
> be multiple write requests occurring simultaneously, so comm_write_send
> uses a tevent_queue to serialize all those requests.  However, what happens
> if a tevent_req that is in the queue is freed?  What's the best way to track
> the queue entry corresponding to that tevent_req?
>
> Here's an attempt to track tevent_req and corresponding tevent_queue_entry,
> so when the tevent_req is freed, the corresponding tevent_queue_entry also
> gets freed and the entry is removed from the queue.
>
> Please review and push.

I've run the relevant CTDB test in a loop for 4 days and it is now rock
solid. This clearly fixes the problem, so...

Reviewed-by: Martin Schwenke <[hidden email]>

However, I'll hold off pushing so that we can see what others say about
best practice.  :-)

peace & happiness,
martin


Re: [PATCH] Fix an abort in transaction_loop under transaction_loop_recovery test (bug 12580)

Martin Schwenke
On Wed, 15 Feb 2017 11:39:32 +1100, Martin Schwenke <[hidden email]>
wrote:

> On Wed, 15 Feb 2017 11:26:55 +1100, Amitay Isaacs <[hidden email]>
> wrote:
>
> > What's the best practice when using tevent_queue with tevent_req?
> >
> > comm_write_send/recv computation is used to send data to an fd.  There can
> > be multiple write requests occurring simultaneously, so comm_write_send
> > uses a tevent_queue to serialize all those requests.  However, what happens
> > if a tevent_req that is in the queue is freed?  What's the best way to track
> > the queue entry corresponding to that tevent_req?
> >
> > Here's an attempt to track tevent_req and corresponding tevent_queue_entry,
> > so when the tevent_req is freed, the corresponding tevent_queue_entry also
> > gets freed and the entry is removed from the queue.
> >
> > Please review and push.  
>
> I've run the relevant CTDB test in a loop for 4 days and it is now rock
> solid. This clearly fixes the problem, so...
>
> Reviewed-by: Martin Schwenke <[hidden email]>
>
> However, I'll hold off pushing so that we can see what others say about
> best practice.  :-)

Pushed...

peace & happiness,
martin


Re: [PATCH] Fix an abort in transaction_loop under transaction_loop_recovery test (bug 12580)

Volker Lendecke
On Wed, Feb 15, 2017 at 11:26:55AM +1100, Amitay Isaacs wrote:

> What's the best practice when using tevent_queue with tevent_req?
>
> comm_write_send/recv computation is used to send data to an fd.  There can
> be multiple write requests occurring simultaneously, so comm_write_send
> uses a tevent_queue to serialize all those requests.  However, what happens
> if tevent_req which is in the queue is freed?  What's the best way to track
> the queue entry corresponding to that tevent_req?
>
> Here's an attempt to track tevent_req and corresponding tevent_queue_entry,
> so when tevent_req is free'd, the corresponding tevent_queue_entry also
> gets freed and the entry is removed from the queue.

Not answering directly:

There's code in lib/async_req that performs a very similar task:
writev_send. I would have to write a specific test case to see whether
writev_send is safe from the error you found in ctdb. One of the main
differences in lib/async_req is that we create a fresh tevent_fd when
a new write request comes in, whereas comm_write_send reuses the
existing one.

Question: Would it be worthwhile to explore merging those two
functions, basing one on top of the other? If I get it right, the
outgoing fd event is only really necessary if the kernel buffer is
full. Is the overhead of tevent_add_fd really measurable? In theory I
would guess 99% of the time the kernel buffer should be empty, and if
we do a nonblocking write no tevent_fd should be necessary at all.

Volker


Re: [PATCH] Fix an abort in transaction_loop under transaction_loop_recovery test (bug 12580)

Amitay Isaacs
On Fri, Feb 17, 2017 at 3:16 AM, Volker Lendecke <[hidden email]>
wrote:

> On Wed, Feb 15, 2017 at 11:26:55AM +1100, Amitay Isaacs wrote:
> > What's the best practice when using tevent_queue with tevent_req?
> >
> > comm_write_send/recv computation is used to send data to an fd.  There
> > can be multiple write requests occurring simultaneously, so
> > comm_write_send uses a tevent_queue to serialize all those requests.
> > However, what happens if a tevent_req that is in the queue is freed?
> > What's the best way to track the queue entry corresponding to that
> > tevent_req?
> >
> > Here's an attempt to track tevent_req and corresponding
> > tevent_queue_entry, so when the tevent_req is freed, the corresponding
> > tevent_queue_entry also gets freed and the entry is removed from the
> > queue.
>
> Not answering directly:
>
> There's code in lib/async_req that performs a very similar task:
> writev_send. I would have to write a specific test case to see whether
> writev_send is safe from the error you found in ctdb. One of the main
> differences in lib/async_req is that we create a fresh tevent_fd when
> a new write request comes in, whereas comm_write_send reuses the
> existing one.
>

When I was designing the comm abstraction, I was thinking of the ctdb
daemon with thousands of fds in the epoll set, and trying to avoid
adding/removing an fd for every read/write.  So far this code has only
been used in the client context, where there is usually only a single fd
to worry about.

> Question: Would it be worthwhile to explore merging those two
> functions, basing one on top of the other? If I get it right, the
> outgoing fd event is only really necessary if the kernel buffer is
> full. Is the overhead of tevent_add_fd really measurable? In theory I
> would guess 99% of the time the kernel buffer should be empty, and if
> we do a nonblocking write no tevent_fd should be necessary at all.
>

In the client context the overhead is of course going to be very small,
but in the server context this may become an issue.  I don't have any hard
evidence to support this, though.

If there is a large chunk of data to be written to a non-blocking fd with
writev(), will it not block to write the whole data, since the operation
needs to be atomic?  (I really need to test this.)

Amitay.