[WIP][PATCH] GUID index for LDB

classic Classic list List threaded Threaded
32 messages Options
12
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] GUID index for LDB

Samba - samba-technical mailing list
On Fri, 2017-09-08 at 11:31 +0200, Stefan Metzmacher wrote:
> Am 08.09.2017 um 11:07 schrieb Andrew Bartlett:
> >
> > I understand your fears, but to be clear, a database of 100,000 users
> > was only 860MB with the new code.  
>
> My concern is about large database, which gets expanded by the
> transaction, which means it will be more fragmented and all
> of the new records are appended at the end and the old space
> moves to the freelist.

A very reasonable concern.  Is there a normal strategy to avoid this
issue?

> How does the reindexing code operates in detail?
> Were can I find the related code?

ldb_index.c:ltdb_reindex()

First, remember that we keep an in-memory internal TDB for the duration
of the transaction holding the in-memory copy of the index.

Unchanged from master (except in the selection of the key) we:

 - Delete the entries from the internal TDB
 - Add empty-record values into the internal TDB for each index found
in the LDB
 - Traverse and re-key the database, 
   - for each entry if the key has changed:
     - remove record
     - add record under new key
 - Traverse and add/update index entries to/in the internal TDB
 - prepare_commit the transaction (the caller does this)
   - ltdb_index_transaction_commit() then writes out the index values
(added, deleted, modified) from the internal TDB into the LDB
   - for each entry in the internal tdb:
     - tdb_store() with TDB_REPLACE is used for a new or updated index
into LDB
     - tdb_delete is used to delete a now empty index into the LDB

> Do we delete all index related keys first?

See above.  The index keys in LDB are written out during the
transaction prepare_commit.

In the general case most index values (and so data size) won't change,
but they will for this upgrade (shrinking).  Given that, should we wipe
all the index keys from the LDB at the start of the reindex?  

> Do we delete an old DN= based record before storing it under
> the GUID= key?

Yes, they go one-in-on-out

> And then rebuild the indexes?

Yes.

> > In terms of existing deployment scale, the indeed network that Kevin
> > presented at SambaXP is the largest production deployment I know of and
> > before Samba 4.7 significantly larger deployments are un-feasible due
> > to the O(n^2) handling on links at join time.  (They take him ~ 30-45
> > minutes).
> >
> > * 45 domain controllers on 5 continents
> > * 6,252 User/Group objects ,
> > * 37,625 group memberships
>
> Is this in total or 45 mins per dc?

I understand that is per DC.  That is why we did all the linked
attribute sorting work during Dec 2016/Jan 2017.

> > > I think I'd prefer making the switch for existing databases an
> > > explicit task for the admin.
> >
> > I understand the concern, and I changed the index re-write code not to
> > force the set of DSDB_FEATURES_SUPPORTED flag for 4.7 for this reason,
> > meaning only new databases support 'features'.  A similar arrangement
> > could be made if needed.
> >
> > The reverse concern I have is that if we do that, we have to maintain
> > and test Samba in both modes in perpetuity, particularly as we start to
> > structure our code to try to be GUID rather than string DN based.
> >
> > Either way, it is Samba (not LDB) that controls when this is enabled,
> > and this and the auto-upgrade is only proposed to be enabled on a major
> > version upgrade, for Samba 4.8.  I will naturally make any change we
> > make here very clear in the WHATSNEW.
>
> Ok, I think it would include some numbers.
>
> We may also need to document a procedure that avoids doing
> the upgrade by starting samba. I guess a good strategy might
> be disabling samba and use "samba-tool dbcheck --cross-ncs"
> to trigger an offline upgrade. That would hopefully avoid
> problems for clients which may talk to an unusable dc.

That is a good idea.

> > Most installations (from the discussions on the list) upgrade by
> > joining a new DC to the domain.  The few that don't just upgrade one DC
> > in a pair, then the other.  We hear about these because folks regularly
> > ask if the versions need to be upgraded in sync.
> >
> > Garming keeps a 40,000 user test DB around, so I'll get those numbers
> > from that and hope the above helps address your concerns.  
>
> Please backup the files before the upgrade, so we can redo it if needed.
>
> What are the sizes before the upgrade?
> Please collect "tdbtool $file info" for all files,
> before and after the upgrade.

Sure.

> Please also watch top for the memory consumption.

OK.  

Thanks for all the feedback!

Andrew Bartlett
--
Andrew Bartlett                       http://samba.org/~abartlet/
Authentication Developer, Samba Team  http://samba.org
Samba Developer, Catalyst IT          http://catalyst.net.nz/services/samba


Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] GUID index for LDB

Samba - samba-technical mailing list
In reply to this post by Samba - samba-technical mailing list
On Wed, 2017-09-06 at 12:44 +0200, Stefan Metzmacher wrote:
>
> Can you also run an autobuild without the activation in Samba,
> so that we're sure we don't insert regressions for possible backports?

I have 7 successful runs and two failures:

[675(3927)/2196 at 1h21m53s] samba3.unix.whoami machine
account(nt4_member:local)

UNEXPECTED(failure): idmap.rfc2307.Testing for expected group
memberships(ad_member_rfc2307)

I've seen both of those fail many times previously, therefore I'm happy
 that we don't introduce regresions for a backport.

The only masked test is dnscmd for wildcards, which is not in 4.7 and
needs fixing regardless.  There is also a fix for a record return order
assumption in the wildcard DNS early in the patch series.

Andrew Bartlett

--
Andrew Bartlett                       http://samba.org/~abartlet/
Authentication Developer, Samba Team  http://samba.org
Samba Developer, Catalyst IT          http://catalyst.net.nz/services/samba


Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] GUID index for LDB

Samba - samba-technical mailing list
In reply to this post by Samba - samba-technical mailing list
On Fri, 2017-09-08 at 10:36 +0200, Stefan Metzmacher wrote:

> Am 08.09.2017 um 05:56 schrieb Andrew Bartlett:
> > On Thu, 2017-09-07 at 12:03 +1200, Andrew Bartlett via samba-technical
> > wrote:
> > >
> > > I'll put that in the release commit message and in the top of the
> > > ldb_index.c file.
> >
> > The attached, updated patch set includes this slab (see the patch for
> > the full text):
> >
> > The new 'GUID index' format is:
> > -------------------------------
> >
> > dn: @INDEX:NAME:DNSUPDATEPROXY
> > @IDXVERSION: 3
> > @IDX: <binary GUID>[<binary GUID>[...]]
> >
> > The binary guid is 16 bytes, as bytes and not expanded as hexidecimal
> > or pretty-printed.  The GUID is chosen from the message to be stored
> > by the @IDXGUID attribute on @INDEXLIST.
> >
> > If there are multiple values the @IDX value simply becomes longer,
> > in multiples of 16.
> >
> > The corrosponding entry is stored in a TDB record with key:
> >
> > GUID=<binary GUID>
> >
> > This allows a very quick translation between the fixed-length index
> > values and the TDB key, while seperating entries from other data
> > in the TDB, should they be unlucky enough to start with the bytes of
> > the 'DN=' prefix.  
> >
> > Additionally, this allows a scope BASE search to directly find the
> > record via a simple match on a GUID= extended DN, controlled via
> > @IDX_DN_GUID on @INDEXLIST
> >
> > Exception for special @ DNs:
> >
> > @BASEINFO, @INDEXLIST and all other special DNs are stored as per the
> > original format, as they are never referenced in an index and are used
> > to bootstrap the database.
> >
> >
> > Control points for choice of index mode
> > ---------------------------------------
> >
> > The choice of index and TDB key mode is made based (for example, from
> > Samba) on entries in the @INDEXLIST DN:
> >
> > dn: @INDEXLIST
> > @IDXGUID: objectGUID
> > @IDX_DN_GUID: GUID
> >
> > By default, the original DN format is used.
>
> So we're upgrading the database on first use with the new code?
>
> My fear with this is that a simple package upgrade will make
> a dc with a large database unusable for quite some time.
>
> Can you please check the cost of an upgrade for databases with
> 1.) 5000 users, 5000 computers and 5000 groups
> 2.) 20000 users, 20000 computers and 20000 groups
> 3.) with the numbers of the largest known customer size
With 100k users in 1-4 groups each, it takes 3min30sec on my desktop.

Oddly, the undo operation (just wrote a downgrade script in case it is
needed) takes 11mins.

The patches I've got to only re-index once per change to the index
helps a lot, naturally :-).

> I guess rewriting the whole database consumes quite some cpu
> and also memory. A server may run out of memory while doing this
> as we need more than twice the size of all sam.ldb* databases together.

We only hold the index in memory, the rest could get paged out, but I
agree that running on a server with 2x DB size in free memory would
avoid thrashing.

I agree a DB without a big transaction area already and perfectly
packed seems the worst possible case.  I gave up measuring the undo
run, but the upgrade case only took 4mins, probably because the index
records get to re-use their place in the DB.  (The new GUID index is
smaller, indeed this seems to be the primary advantage).

The (100k user) database does grow from 800MB to 2700MB.

Process resident size was around 2.3GB (from memory).

tdbtool output from all the DBs in sam.ldb.d attached.

Given all this, and the new output being printed, I think it is
reasonable to do this on first use for Samba 4.8, with adequate warning
in the WHATSNEW.txt

Thanks,

Andrew Bartlett
--
Andrew Bartlett                       http://samba.org/~abartlet/
Authentication Developer, Samba Team  http://samba.org
Samba Developer, Catalyst IT          http://catalyst.net.nz/services/samba

after.txt (5K) Download Attachment
before.txt (5K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] GUID index for LDB

Samba - samba-technical mailing list
Hi Andrew,

>> So we're upgrading the database on first use with the new code?
>>
>> My fear with this is that a simple package upgrade will make
>> a dc with a large database unusable for quite some time.
>>
>> Can you please check the cost of an upgrade for databases with
>> 1.) 5000 users, 5000 computers and 5000 groups
>> 2.) 20000 users, 20000 computers and 20000 groups
>> 3.) with the numbers of the largest known customer size
>
> With 100k users in 1-4 groups each, it takes 3min30sec on my desktop.
>
> Oddly, the undo operation (just wrote a downgrade script in case it is
> needed) takes 11mins.
>
> The patches I've got to only re-index once per change to the index
> helps a lot, naturally :-).
Ok, that's relatively fast and I guess for most current installations
it would under 30 seconds, which would be fine I guess.

Can you also test this with factor 10 and 100 smaller, so 10000 and 1000
users, to be sure.

>> I guess rewriting the whole database consumes quite some cpu
>> and also memory. A server may run out of memory while doing this
>> as we need more than twice the size of all sam.ldb* databases together.
>
> We only hold the index in memory, the rest could get paged out, but I
> agree that running on a server with 2x DB size in free memory would
> avoid thrashing.

Don't we also hold the transaction preparation area in memory before
we write the recovery area and write the changes to the file?

> I agree a DB without a big transaction area already and perfectly
> packed seems the worst possible case.  I gave up measuring the undo
> run, but the upgrade case only took 4mins, probably because the index
> records get to re-use their place in the DB.  (The new GUID index is
> smaller, indeed this seems to be the primary advantage).
>
> The (100k user) database does grow from 800MB to 2700MB.

That's more than factor 3 and may mean /var gets full during this :-(

It would be really good to have this optimized somehow,
maybe using more than one transaction.

But that's really tricky, it would be ideal if we could
somehow truncate the file size with tdb_repack().

I would be good to make the worse case to use 2 of the original size
(because of the recovery area) and also truncate the recovery area
on a successful commit, if it's at the end of the file (in case the
file got just expanded to hold the recovery area.

If we notice that the recovery area will be more than 25% of the
database size we should do an implicit repack and make sure the
recovery area is at the end and can be truncated after the
commit finished. Which the transaction lock held we should
be able to at least truncate back to the original file size,
as no other caller had a chance to remap the file while the
transaction lock is held.

> Process resident size was around 2.3GB (from memory).
>
> tdbtool output from all the DBs in sam.ldb.d attached.
>
> Given all this, and the new output being printed, I think it is
> reasonable to do this on first use for Samba 4.8, with adequate warning
> in the WHATSNEW.txt

What do you mean with "the new output being printed"?

metze


signature.asc (853 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] GUID index for LDB

Samba - samba-technical mailing list
On Tue, 2017-09-12 at 02:57 +0200, Stefan Metzmacher wrote:

> Hi Andrew,
>
> > > So we're upgrading the database on first use with the new code?
> > >
> > > My fear with this is that a simple package upgrade will make
> > > a dc with a large database unusable for quite some time.
> > >
> > > Can you please check the cost of an upgrade for databases with
> > > 1.) 5000 users, 5000 computers and 5000 groups
> > > 2.) 20000 users, 20000 computers and 20000 groups
> > > 3.) with the numbers of the largest known customer size
> >
> > With 100k users in 1-4 groups each, it takes 3min30sec on my
> > desktop.
> >
> > Oddly, the undo operation (just wrote a downgrade script in case it
> > is
> > needed) takes 11mins. 
> >
> > The patches I've got to only re-index once per change to the index
> > helps a lot, naturally :-). 
>
> Ok, that's relatively fast and I guess for most current installations
> it would under 30 seconds, which would be fine I guess.
I agree.

> Can you also test this with factor 10 and 100 smaller, so 10000 and
> 1000
> users, to be sure.

1000 users is practically instant, the schema DB is larger.

I had a 27000 user DB handy (no groups), and it re-indexed in 38 second
including the time to dump the DB to the console (oops :-).

This isn't going to be a big deal for our current use cases, but will
open us up to much larger opportunities.

> > > I guess rewriting the whole database consumes quite some cpu
> > > and also memory. A server may run out of memory while doing this
> > > as we need more than twice the size of all sam.ldb* databases
> > > together.
> >
> > We only hold the index in memory, the rest could get paged out, but
> > I
> > agree that running on a server with 2x DB size in free memory would
> > avoid thrashing. 
>
> Don't we also hold the transaction preparation area in memory before
> we write the recovery area and write the changes to the file?
I'm not sure, I assumed it was in that record, but I'm not going to
open up that can of worms anyway.  Samba 4.7 and above needs memory, as
noted in the release notes.

> > I agree a DB without a big transaction area already and perfectly
> > packed seems the worst possible case.  I gave up measuring the undo
> > run, but the upgrade case only took 4mins, probably because the
> > index
> > records get to re-use their place in the DB.  (The new GUID index
> > is
> > smaller, indeed this seems to be the primary advantage). 
> >
> > The (100k user) database does grow from 800MB to 2700MB. 
>
> That's more than factor 3 and may mean /var gets full during this :-(
>
> It would be really good to have this optimized somehow,
> maybe using more than one transaction.
>
> But that's really tricky, it would be ideal if we could
> somehow truncate the file size with tdb_repack().
>
> I would be good to make the worse case to use 2 of the original size
> (because of the recovery area) and also truncate the recovery area
> on a successful commit, if it's at the end of the file (in case the
> file got just expanded to hold the recovery area.
>
> If we notice that the recovery area will be more than 25% of the
> database size we should do an implicit repack and make sure the
> recovery area is at the end and can be truncated after the
> commit finished. Which the transaction lock held we should
> be able to at least truncate back to the original file size,
> as no other caller had a chance to remap the file while the
> transaction lock is held.
I'm very sorry, but I don't have the time to rework tdb for an extreme
case well beyond any known or foreseeable installation.  It is all in a
transaction, if it fails it will just fail.

I have a downgrade script in any case.  

In the long term is is quite likely the main DB will be moved to LMDB,
as the folks wanting 100k users don't like being anywhere near that 4GB
limit, so re-working TDB for this is pointless.

Sadly we didn't accept TDB2 for Samba.

> > Process resident size was around 2.3GB (from memory). 
> >
> > tdbtool output from all the DBs in sam.ldb.d attached. 
> >
> > Given all this, and the new output being printed, I think it is
> > reasonable to do this on first use for Samba 4.8, with adequate
> > warning
> > in the WHATSNEW.txt
>
> What do you mean with "the new output being printed"?
In my current branch I've added some incremental output during the re-
index.

Thanks,

Andrew Bartlett

--
Andrew Bartlett
https://samba.org/~abartlet/
Authentication Developer, Samba Team         https://samba.org
Samba Development and Support, Catalyst IT  
https://catalyst.net.nz/services/samba




signature.asc (879 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] GUID index for LDB

Samba - samba-technical mailing list
On Tue, 2017-09-12 at 13:21 +1200, Andrew Bartlett via samba-technical
wrote:
>
> In my current branch I've added some incremental output during the re-
> index.

Attached is my current branch.  A recent version of branch passed a
full autobuild (multiple times), and I trust this will also.

Please review, as the only outstanding things as far as I know are:
 - any required tests for the downgrade script
 - the dns wildcard issue

Thanks,

Andrew Bartlett

--
Andrew Bartlett                       http://samba.org/~abartlet/
Authentication Developer, Samba Team  http://samba.org
Samba Developer, Catalyst IT          http://catalyst.net.nz/services/samba

GUID-index-6.patch.txt (349K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] GUID index for LDB

Samba - samba-technical mailing list
In reply to this post by Samba - samba-technical mailing list
On Tue, Sep 12, 2017 at 01:21:17PM +1200, Andrew Bartlett via samba-technical wrote:
> Sadly we didn't accept TDB2 for Samba.

What would have been the techniques in tdb2 that would have made
transactions cheaper? My impression from (admittedly lossy) memory was
that the transaction code would have been the same.

Volker

--
SerNet GmbH, Bahnhofsallee 1b, 37081 Göttingen
phone: +49-551-370000-0, fax: +49-551-370000-9
AG Göttingen, HRB 2816, GF: Dr. Johannes Loxen
http://www.sernet.de, mailto:[hidden email]

Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] GUID index for LDB

Samba - samba-technical mailing list
On Tue, 2017-09-12 at 13:45 +0200, Volker Lendecke wrote:
> On Tue, Sep 12, 2017 at 01:21:17PM +1200, Andrew Bartlett via samba-technical wrote:
> > Sadly we didn't accept TDB2 for Samba.
>
> What would have been the techniques in tdb2 that would have made
> transactions cheaper? My impression from (admittedly lossy) memory was
> that the transaction code would have been the same.

Sorry, I should have given some context to that.  I'm being asked by
clients to evaluate LMDB as a backend choice because the apparent DB
size (2.4GB) of a large AD domain (100k user, or for example less users
and more jpegPhoto) is too close to comfort to the 4GB limit of TDB.  

There are probably very good reasons to seriously evaluate it anyway,
but it just seems a pity we didn't get over the 64bit hump with our own
technology.

Andrew Bartlett

--
Andrew Bartlett                       http://samba.org/~abartlet/
Authentication Developer, Samba Team  http://samba.org
Samba Developer, Catalyst IT          http://catalyst.net.nz/services/samba


Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] GUID index for LDB

Samba - samba-technical mailing list
On Wed, Sep 13, 2017 at 06:46:28AM +1200, Andrew Bartlett wrote:
> Sorry, I should have given some context to that.  I'm being asked by
> clients to evaluate LMDB as a backend choice because the apparent DB
> size (2.4GB) of a large AD domain (100k user, or for example less users
> and more jpegPhoto) is too close to comfort to the 4GB limit of TDB.  

What's the real content of the db? "tdbtool <db> info" would be great.
2.4GB is the indicated file size, how much data is actually in there?
But probably we could optimize transactions to waste less disk space.
Also, if ldb adopted dbwrap, it might be simpler to shard across more
than one tdb.

Also, with tdb now having solved many of the problems that tdb2 was
set out to solve with a completely new implementation, just doing
8-byte offsets should be a lot easier than a complete rewrite.

Volker

--
SerNet GmbH, Bahnhofsallee 1b, 37081 Göttingen
phone: +49-551-370000-0, fax: +49-551-370000-9
AG Göttingen, HRB 2816, GF: Dr. Johannes Loxen
http://www.sernet.de, mailto:[hidden email]

Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] GUID index for LDB

Samba - samba-technical mailing list
On Tue, 2017-09-12 at 22:47 +0200, Volker Lendecke wrote:

> On Wed, Sep 13, 2017 at 06:46:28AM +1200, Andrew Bartlett wrote:
> > Sorry, I should have given some context to that.  I'm being asked
> > by
> > clients to evaluate LMDB as a backend choice because the apparent
> > DB
> > size (2.4GB) of a large AD domain (100k user, or for example less
> > users
> > and more jpegPhoto) is too close to comfort to the 4GB limit of
> > TDB.  
>
> What's the real content of the db? "tdbtool <db> info" would be
> great.

I attached that earlier in the thread at metze's request.  Feel free to
take a look.

> 2.4GB is the indicated file size, how much data is actually in there?
> But probably we could optimize transactions to waste less disk space.

We could, but as I say, as soon as you start loading in staff photos (a
reasonable use case), even at 10,000 users 4k per photo would fill it.

> Also, if ldb adopted dbwrap, it might be simpler to shard across more
> than one tdb.
>
> Also, with tdb now having solved many of the problems that tdb2 was
> set out to solve with a completely new implementation, just doing
> 8-byte offsets should be a lot easier than a complete rewrite.

OK.

Anyway, we are off on a tangent.  For now I'm just trying to get the
GUID index in :-)

Thanks,

Andrew Bartlett

--
Andrew Bartlett
https://samba.org/~abartlet/
Authentication Developer, Samba Team         https://samba.org
Samba Development and Support, Catalyst IT  
https://catalyst.net.nz/services/samba





Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] GUID index for LDB

Samba - samba-technical mailing list
In reply to this post by Samba - samba-technical mailing list
On Tue, 2017-09-12 at 23:10 +1200, Andrew Bartlett via samba-technical
wrote:

> On Tue, 2017-09-12 at 13:21 +1200, Andrew Bartlett via samba-
> technical
> wrote:
> >
> > In my current branch I've added some incremental output during the
> > re-
> > index.
>
> Attached is my current branch.  A recent version of branch passed a
> full autobuild (multiple times), and I trust this will also.
>
> Please review, as the only outstanding things as far as I know are:
>  - any required tests for the downgrade script
>  - the dns wildcard issue
Attached is the current patch set.  Please review so I can get this in
to master soon.

Thanks,

Andrew Bartlett
--
Andrew Bartlett
https://samba.org/~abartlet/
Authentication Developer, Samba Team         https://samba.org
Samba Development and Support, Catalyst IT  
https://catalyst.net.nz/services/samba




GUID-index-6.patch.txt (372K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] GUID index for LDB

Samba - samba-technical mailing list
On Wed, 2017-09-13 at 17:16 +1200, Andrew Bartlett via samba-technical
wrote:

> On Tue, 2017-09-12 at 23:10 +1200, Andrew Bartlett via samba-
> technical
> wrote:
> > On Tue, 2017-09-12 at 13:21 +1200, Andrew Bartlett via samba-
> > technical
> > wrote:
> > >
> > > In my current branch I've added some incremental output during
> > > the
> > > re-
> > > index.
> >
> > Attached is my current branch.  A recent version of branch passed a
> > full autobuild (multiple times), and I trust this will also.
> >
> > Please review, as the only outstanding things as far as I know are:
> >  - any required tests for the downgrade script
> >  - the dns wildcard issue
>
> Attached is the current patch set.  Please review so I can get this
> in
> to master soon.
Garming has given me a review on this patch set.  To ease backporting
I'll try and land the ldb read-only set Gary sent first, with an
earlier LDB version number.

Thanks,

Andrew Bartlett

--
Andrew Bartlett
https://samba.org/~abartlet/
Authentication Developer, Samba Team         https://samba.org
Samba Development and Support, Catalyst IT  
https://catalyst.net.nz/services/samba




GUID-index-6.patch.txt (363K) Download Attachment
12