|
Metze,

In my repl-devel branch I have a series of patches to better test our
replication and conflict resolution handling.

https://git.samba.org/?p=abartlet/samba.git/.git;a=shortlog;h=refs/heads/repl-devel

Currently we have a number of issues in this area. The test I added
there shows that we do not handle conflict resolution consistently.
This is particularly the case with conflicting renames.

The modifications to the replication code that I've included try to
handle some of this, but it still doesn't work.

However, this code remains dizzyingly complex, and I wondered if,
particularly as I now have a reasonable testsuite, you might be able to
assist me in making this more reliable?

Thanks,

Andrew Bartlett

--
Andrew Bartlett                                http://samba.org/~abartlet/
Authentication Developer, Samba Team           http://samba.org
|
On Mon, 2012-07-30 at 23:47 +1000, Andrew Bartlett wrote:
> Currently we have a number of issues in this area. The test I added
> there shows that we do not consistently handle the conflict resolution.
> This is particularly the case with conflicting renames.
>
> [...]
>
> However, this code remains dizzyingly complex, and I wondered if,
> particularly as I now have a reasonable testsuite, you might be able to
> assist me in making this more reliable?

I've found some of the issues here, but I still can't make the conflict
handling reliable. For now I've made the test simply assert that one or
the other record becomes a conflict, until we can get back to this. It
would be very helpful to me if you could look at this area, as this
should be deterministic :-(.

Still, at least it no longer stops or crashes.

Andrew Bartlett
|
Hi Andrew,
> I've found some of the issues here, but I still can't make the conflict
> handling reliable. I've put in the test simply asserting that one or
> other record becomes a conflict, until we can get back to this. It
> would be very helpful to me if you could look at this area, as this
> should be deterministic :-(.
>
> Still, at least it no longer stops or crashes.

Does it randomly fail in make test (if so, what's the test name?),
or do you see the strange behavior in normal operation?

I was also debugging a replication problem with the servicePrincipalName
attribute on a RODC, maybe this is related.

metze
|
On Tue, 2012-07-31 at 08:04 +0200, Stefan (metze) Metzmacher wrote:
> Does it randomly fail make test (if so what's the test name?)
> or do you see the strange behavior in normal operation?

What happens is that the additional tests I added in
samba4.drs.replica_sync.python fail randomly.

To get the rest of the patch into master (and to ensure we have any
coverage of this codepath at all), I've modified the tests to accept
that one DN or the other is made into a conflict, but not to assert
which one in particular is the conflict. This is in autobuild now.

On that branch, it is clear that the failure is random: if you run the
test twice, the line numbers of the failing assertions (each
corresponding to a unit test) change.

Once these are in master, I'll update that branch with just the stricter
test.

Andrew Bartlett
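
The relaxed check described in this message (accept that either record became the conflict, without asserting which) might look roughly like the sketch below. It is not the code from the branch. It assumes that a record that lost the conflict is renamed so that its name contains "CNF:" followed by its objectGUID; the helper names and the (name, objectGUID) pair layout are hypothetical.

```python
# Assumption: a record that lost the conflict is renamed so that its name
# contains "CNF:<objectGUID>". Both helpers are hypothetical, not Samba API.

def is_conflict_name(name, guid):
    """Return True if `name` carries the conflict marker for `guid`."""
    return ("CNF:%s" % guid) in name


def one_is_conflict(rec1, rec2):
    """rec1 and rec2 are (name, objectGUID) pairs read back after replication.

    Relaxed check: at least one of the two records was turned into a
    conflict, without asserting which one.
    """
    return is_conflict_name(*rec1) or is_conflict_name(*rec2)
```

Inside a unittest test case this would be used as self.assertTrue(one_is_conflict(rec1, rec2)). The stricter test would instead assert is_conflict_name() on one specific record.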
|
On Tue, 2012-07-31 at 16:09 +1000, Andrew Bartlett wrote:
> Once these are in master, I'll update that branch with just the stricter
> test.

I've updated the branch. To reproduce, just run:

make test TESTS=samba4.drs.replica_sync.python

Andrew Bartlett
|
Am 31.07.2012 08:37, schrieb Andrew Bartlett:
> I've updated the branch. To reproduce, just run:
>
> make test TESTS=samba4.drs.replica_sync.python

I guess it's related to the fact that the conflict resolution also depends
on the invocationId. The timestamps are in 1 sec intervals in the protocol!

I think you should find out the invocationId of each DC and define the DC
with the lower invocationId as dc1 and the other as dc2.

metze
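
To illustrate why one-second timestamps make the invocationId matter, here is a minimal sketch of a tie-break between two change stamps. The Stamp tuple, the ordering (version, then time in whole seconds, then invocationId), the rule that the larger key wins, and the comparison of GUIDs as integers are all assumptions made for illustration. The thread only states that the resolution depends on the invocationId and that timestamps have one-second granularity.

```python
import uuid
from collections import namedtuple

# Hypothetical stand-in for the replication metadata of one change.
Stamp = namedtuple("Stamp", ["version", "change_time_secs", "invocation_id"])


def stamp_key(stamp):
    """Assumed ordering: version, then whole seconds, then invocationId."""
    return (stamp.version, stamp.change_time_secs, stamp.invocation_id.int)


def winner(a, b):
    """Return the stamp with the larger key (assumed to win the conflict)."""
    return a if stamp_key(a) >= stamp_key(b) else b
```

When two changes are made within the same second on different DCs, the first two fields are equal and only the invocationId decides the result. Under these assumptions, a test that does not control for the invocationId would see either DC win, depending on which GUIDs the test environment happened to generate.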
|
On Tue, 2012-07-31 at 10:37 +0200, Stefan (metze) Metzmacher wrote:
> I guess it's related to the fact that the conflict resolution also depends
> on the invocationId. The timestamps are in 1 sec intervals, in the protocol!

Ouch! Does that mean I would cause damage with this patch:
https://git.samba.org/?p=abartlet/samba.git/.git;a=commitdiff;h=862b26518a0629f6112fb7e6270c0b98ef71a855

(or would the NDR layer just remove the partial seconds anyway?)

It seems better to always work with NTTIME - if it's not harmful I'll
just change the commit message to clarify.

> I think you should find out the invocationId and define the dc with the
> lower invocationId as dc1 and the other as dc2.

I can just put some sleep into the tests to get different times, if
that's what is going on.

(I've stopped my autobuild, which includes the next beta because it was
due today, pending resolving this)

Andrew Bartlett
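
The question about partial seconds can be made concrete with a small sketch. NTTIME counts 100-nanosecond ticks, so converting to whole seconds discards anything below 10,000,000 ticks. The function names are hypothetical, and whether the NDR layer performs exactly this truncation is the open question in the message above, not something this sketch confirms.

```python
NTTIME_TICKS_PER_SECOND = 10_000_000  # one NTTIME tick is 100 ns


def nttime_to_wire_seconds(nttime):
    """Drop the sub-second part, as a one-second wire format would."""
    return nttime // NTTIME_TICKS_PER_SECOND


def indistinguishable_on_wire(t1, t2):
    """True if two NTTIME values fall into the same whole second."""
    return nttime_to_wire_seconds(t1) == nttime_to_wire_seconds(t2)
```

Two changes made 0.3 seconds apart within the same second would therefore carry the same timestamp after replication, which is also why a sleep of at least one second between the conflicting operations would force the times to differ.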
|
Hi Andrew,
> Ouch! Does that mean I would cause damage with this patch:
> https://git.samba.org/?p=abartlet/samba.git/.git;a=commitdiff;h=862b26518a0629f6112fb7e6270c0b98ef71a855
>
> (or would the NDR layer just remove the partial seconds anyway?)

I guess so

> It seems better to always work with NTTIME - if it's not harmful I'll
> just change the commit message to clarify.

I'd prefer to just skip that patch.

>> I think you should find out the invocationId and define the dc with the
>> lower invocationId as dc1 and the other as dc2.
>
> I can just put some sleep into the tests to get times different if
> that's what is going on.

Maybe for some parts, but you should also test the resolution based on the
invocationId and assign the dc1 and dc2 variables based on the invocationId.

> (I've stopped my autobuild, which includes the next beta because it was
> due today, pending resolving this)

Didn't it fail on a dbcheck test (something with lastKnownParent)?

metze
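
Assigning dc1 and dc2 from the invocationId, as suggested here, could be sketched as follows. The function, its input format (a dict of DC name to invocationId string) and the DC names in the usage note are hypothetical. Comparing the GUIDs as integers is also an assumption; the ordering the replication code actually uses should be checked before relying on it.

```python
import uuid


def assign_dcs(dc_invocation_ids):
    """Return (dc1, dc2) so that dc1 has the lower invocationId.

    dc_invocation_ids maps a DC name to its invocationId string; how the
    test obtains these values is left out of this sketch.
    """
    if len(dc_invocation_ids) != 2:
        raise ValueError("expected exactly two DCs")
    ordered = sorted(
        dc_invocation_ids,
        key=lambda name: uuid.UUID(dc_invocation_ids[name]).int,
    )
    return ordered[0], ordered[1]
```

For example, dc1, dc2 = assign_dcs({"dcA": id_a, "dcB": id_b}) would give the test a stable notion of which DC is expected to win or lose, independent of the GUIDs generated for a particular test environment.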
|
On Tue, 2012-07-31 at 10:50 +0200, Stefan (metze) Metzmacher wrote:
> > (or would the NDR layer just remove the partial seconds anyway?)
>
> I guess so
>
> > It seems better to always work with NTTIME - if it's not harmful I'll
> > just change the commit message to clarify.
>
> I'd prefer to just skip that patch.

I'll do that. Thanks for the feedback.

> maybe for some parts, but you should also test the resolution based on the
> invocationId and assign the dc1 and dc2 variable based on the invocationId.

That certainly sounds like a reasonable extension.

> Didn't it fail on a dbcheck test (something with lastKnownParent)?

It did, and then I fixed that up, then had this discussion. I'll update
the branch.

Andrew Bartlett
