Re: Trying to understand entryrdn.db
by Mark Reynolds
On 08/02/2017 02:19 PM, Ilias Stamatis wrote:
> I see now, thank you both very much!
>
> Follow-up:
>
> [1] Get entry from id2entry and use its ID
> [2] Look in entryrdn for the parent of the ID
> [3] Keep looking for parents, building the DN as you go along
>
>
> Example:
>
> [1] Get entry from id2entry: ID 6 --> "cn=Accounting Managers"
> [2] Check entryrdn for "P<ID>". In this case it's "P6" which is
> "ou=Groups" with ID 3
> [3] So find "P3", which is "dc=example,dc=com" with ID 1, and
> look for "P1". But there is no P1, so we stop the process/loop.
>
>
> Why do we need to look at entryrdn for parent's id? Is it faster?
I have not looked closely into it - so it might not be necessary to use
entryrdn. I thought it might be more efficient to use it. If you just
use id2entry, you have to keep scanning it over and over, and starting
over every time you need to read the next entry. Maybe not though,
maybe you can just "search" it and not have to scan it sequentially when
trying to find parents and entries. I'll leave that up to you to find
out ;-)
>
> I mean the same information can be found in id2entry (?). Or this is
> not the case and dbscan does the exact same process you just described
> in order to print "parentid: X" for each entry when you do "dbscan -f
> id2entry.db"?
>
> Thanks again,
>
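
The parent walk described in steps [1] to [3] above can be sketched as follows. This is only an illustration, not how dbscan or the server does it: the dictionary `ENTRYRDN` and the function `build_dn` are hypothetical stand-ins, with keys mirroring the "P<ID>" records in the example rather than the real on-disk key layout.

```python
# Hypothetical in-memory stand-in for entryrdn (assumption, not the real
# on-disk format): "<id>" maps an entry ID to its own (id, rdn) and
# "P<id>" maps a child ID to its parent's (id, rdn).
ENTRYRDN = {
    "1": (1, "dc=example,dc=com"),
    "3": (3, "ou=Groups"),
    "6": (6, "cn=Accounting Managers"),
    "P3": (1, "dc=example,dc=com"),
    "P6": (3, "ou=Groups"),
}


def build_dn(entry_id, index):
    """Build a DN by following P<id> records until no parent record exists."""
    _, rdn = index[str(entry_id)]
    parts = [rdn]
    current = entry_id
    seen = {current}
    while f"P{current}" in index:
        parent_id, parent_rdn = index[f"P{current}"]
        if parent_id in seen:
            # Guard against a corrupted index that loops back on itself.
            raise ValueError(f"cycle detected at entry {parent_id}")
        seen.add(parent_id)
        parts.append(parent_rdn)
        current = parent_id
    return ",".join(parts)


print(build_dn(6, ENTRYRDN))
```

Each loop iteration is one keyed lookup, so the cost of building a DN is proportional to the depth of the entry in the tree.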
6 years, 8 months
Re: Trying to understand entryrdn.db
by Nishan Boroian
Ok, thanks for the update.
>
> On Aug 4, 2017 at 08:08, <Ilias Stamatis (mailto:stamatis.iliass@gmail.com)> wrote:
>
>
>
> Okay, now that I have read and understood dbscan's code, I have a few more questions.
>
>
>
>
> 2017-08-03 10:10 GMT+03:00 Ludwig Krispenz <lkrispen(a)redhat.com (mailto:lkrispen@redhat.com)>:
>
> > Hi, now that I know the context here are some more comments.
> >
> > If the purpose is to create a useful LDIF file, which could eventually be used for import, then formatting an entry correctly is not enough. The order of entries matters: parents need to come before children. We already handle this in db2ldif and in replication total update.
> > That said, whenever you write an entry you have always seen its parent already, so you could stack the DN with the parentid and create the DN without using the entryrdn index.
> > You do not even need to keep track of all the entry RDNs/DNs - only the ones with children will be needed later; the presence of "numsubordinates"
> > identifies a parent.
> >
>
>
> Is it guaranteed that parents are going to appear before children in id2entry.db?
>
>
> If so, here's what could probably work:
>
>
> - Start reading entries from id2entry sequentially.
>
> - For each entry, if it has a numSubordinates attribute, it is a parent of other entries. So we can store its ID - DN pair in a hash map.
>
> - For entries that have a parentid, where we need to figure out the parent's DN, we just look up hashmap[parentid].
>
>
> To make it even more efficient (only if really needed, because it will make things more complicated), we can somehow store the value of numSubordinates with each parent in the map as well. Every time a parentid is looked up in the map, we decrease the stored numSubordinates value by 1. When it reaches 0, there are no more children of this ID, so we can safely remove it from the map.
>
>
> However, I don't know if we would really need this last thing. In a 100 million entry db how many parents would we expect to have approximately?
>
>
> Also, do we have a hash map implemented somewhere?
>
>
> If parents are not guaranteed to appear before children in id2entry.db, then we would have to alter the above strategy.
>
>
>
> Thanks!
>
>
> _______________________________________________ 389-devel mailing list -- 389-devel(a)lists.fedoraproject.org To unsubscribe send an email to 389-devel-leave(a)lists.fedoraproject.org
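
A minimal sketch of the hash-map strategy proposed above, including the optional numSubordinates countdown. It assumes that parents appear before their children in the scan and that numSubordinates equals the number of direct children that will be seen; both are exactly the open questions in this message. The entry dictionaries and the function name `assign_dns` are hypothetical and do not correspond to any id2entry API.

```python
def assign_dns(entries, cache=None):
    """Return [(id, dn), ...] for entries given in scan order.

    `cache` maps a parent ID to [dn, remaining_children]; a parent is
    dropped from it once all of its children have been seen.
    """
    cache = {} if cache is None else cache
    result = []
    for entry in entries:
        parent_id = entry.get("parentid")
        if parent_id is None:
            # No parentid: treat the RDN as the full DN (a suffix entry).
            dn = entry["rdn"]
        else:
            slot = cache[parent_id]  # KeyError if the parent was not seen yet
            dn = f'{entry["rdn"]},{slot[0]}'
            slot[1] -= 1
            if slot[1] == 0:
                del cache[parent_id]
        children = entry.get("numsubordinates", 0)
        if children > 0:
            # Only entries that have children need to be remembered.
            cache[entry["id"]] = [dn, children]
        result.append((entry["id"], dn))
    return result


SAMPLE = [
    {"id": 1, "rdn": "dc=example,dc=com", "numsubordinates": 2},
    {"id": 2, "rdn": "ou=People", "parentid": 1},
    {"id": 3, "rdn": "ou=Groups", "parentid": 1, "numsubordinates": 2},
    {"id": 6, "rdn": "cn=Accounting Managers", "parentid": 3},
    {"id": 7, "rdn": "cn=HR Managers", "parentid": 3},
]

for entry_id, dn in assign_dns(SAMPLE):
    print(entry_id, dn)
```

With the countdown in place, the map only holds parents that still have unseen children, so its size depends on the shape of the tree rather than on the total number of entries.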
6 years, 8 months
Re: Trying to understand entryrdn.db
by Nishan Boroian
Let's discuss more on it.
>
> On Aug 3, 2017 at 07:33, <Ludwig Krispenz (mailto:lkrispen@redhat.com)> wrote:
>
>
>
>
> On 08/03/2017 12:24 PM, Ilias Stamatis wrote:
>
> >
> >
> >
> > > That said, whenever you write an entry you have always seen its parent already, so you could stack the DN with the parentid and create the DN without using the entryrdn index.
> > > You do not even need to keep track of all the entry RDNs/DNs - only the ones with children will be needed later; the presence of "numsubordinates"
> > > identifies a parent.
> >
> >
> >
> > Interesting. I think I now understand better how to approach this problem.
> > Great. Just one more hint: if you iterate over the entries in id2entry you have the entryid and the parentid of each entry. If parentid > entryid you need to get and export the parent first (and track that you did it already).
>
> >
> >
> >
> > I'll get back to it soon.
> >
> > Thanks so much!
> >
> >
> > > Last but not least, since I think dbscan is broken for entryrdn, investigating and fixing this would also be nice
> >
> > Sure. I'll open a ticket so it gets tracked.
> >
> >
>
>
>
> -- Red Hat GmbH, http://www.de.redhat.com/, Registered seat: Grasbrunn, Commercial register: Amtsgericht Muenchen, HRB 153243, Managing Directors: Charles Cachera, Michael Cunningham, Michael O'Neill, Eric Shander
>
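
The hint about parentid > entryid can be sketched like this: iterate in ID order, and before emitting an entry, emit any ancestor that has not been exported yet, remembering what was already written. The data layout (a dictionary keyed by entry ID) and the name `export_in_parent_order` are assumptions for illustration only. Unlike the numSubordinates approach, this simple version remembers the DN of every exported entry.

```python
def export_in_parent_order(entries_by_id):
    """Return [(id, dn), ...] such that every parent precedes its children.

    entries_by_id maps an entry ID to {"rdn": ..., "parentid": ...};
    "parentid" is absent for a suffix entry. A well-formed tree is assumed
    (a parentid loop would recurse without end).
    """
    dns = {}    # id -> DN for everything exported so far
    order = []  # export order

    def export(entry_id):
        if entry_id in dns:
            return dns[entry_id]  # already exported, do not emit twice
        entry = entries_by_id[entry_id]
        parent_id = entry.get("parentid")
        if parent_id is None:
            dn = entry["rdn"]
        else:
            # Export the parent first, even if its ID is larger than ours.
            dn = f'{entry["rdn"]},{export(parent_id)}'
        dns[entry_id] = dn
        order.append((entry_id, dn))
        return dn

    for entry_id in sorted(entries_by_id):
        export(entry_id)
    return order


# Entry 4 has a parent with a larger ID (9), e.g. after a move.
SAMPLE = {
    1: {"rdn": "dc=example,dc=com"},
    4: {"rdn": "cn=child", "parentid": 9},
    9: {"rdn": "ou=Moved", "parentid": 1},
}

for entry_id, dn in export_in_parent_order(SAMPLE):
    print(entry_id, dn)
```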
6 years, 8 months
Trying to understand entryrdn.db
by Ilias Stamatis
Hello,
I would like some help in order to understand entryrdn.db. When I do
"dbscan -f entryrdn.db" I get something like:
3
ID: 3; RDN: "ou=Groups"; NRDN: "ou=groups"
C3
ID: 6; RDN: "cn=Accounting Managers"; NRDN: "cn=accounting managers"
P6
ID: 3; RDN: "ou=Groups"; NRDN: "ou=groups"
I understand that 3 is this entry's ID, C3 means child of entry 3 and P6
means parent of entry 6.
What I don't understand, however, is why those entries are repeated again and
again. For example, ID: 7; RDN: "cn=HR Managers"; NRDN: "cn=hr managers"
is repeated about a dozen times in my entryrdn. And I don't mean as a
parent, child, or whatever. It is repeated lots of times as ID 7, for example
(but also many times as C3, etc.).
I attach the complete output of what I get when I run "dbscan -f
entryrdn.db", in order to demonstrate what I mean (my db contains almost
default entries only).
So my question is: how is this database filled?
Thank you very much,
Ilias
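
For anyone who wants to inspect such a dump programmatically, here is a small sketch that groups the records by key. It assumes the output has exactly the shape quoted above (a key line followed by one or more `ID: ...; RDN: "..."; NRDN: "..."` lines) and that RDN values contain no embedded `"; NRDN:` sequence; the names `parse_entryrdn_dump` and `describe_key` are made up for this example.

```python
import re

# One record line, as printed in the dump quoted above.
RECORD = re.compile(r'ID:\s*(\d+);\s*RDN:\s*"(.*?)";\s*NRDN:\s*"(.*)"')
# "C<id>" and "P<id>" keys; anything else is treated as an entry's own key.
KEY = re.compile(r"^([CP])(\d+)$")


def parse_entryrdn_dump(text):
    """Group record lines under the key line that precedes them."""
    records = {}
    key = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = RECORD.match(line)
        if match and key is not None:
            records[key].append(
                (int(match.group(1)), match.group(2), match.group(3))
            )
        else:
            key = line
            records.setdefault(key, [])
    return records


def describe_key(key):
    """Interpret a key the way the message above reads it."""
    match = KEY.match(key)
    if not match:
        return f"entry {key} itself"
    kind = "children" if match.group(1) == "C" else "parent"
    return f"{kind} of entry {match.group(2)}"


SAMPLE = '''3
ID: 3; RDN: "ou=Groups"; NRDN: "ou=groups"
C3
ID: 6; RDN: "cn=Accounting Managers"; NRDN: "cn=accounting managers"
P6
ID: 3; RDN: "ou=Groups"; NRDN: "ou=groups"
'''

for key, recs in parse_entryrdn_dump(SAMPLE).items():
    print(key, "->", describe_key(key), recs)
```

Counting how many records end up under each key is a quick way to tell real duplicates apart from the same entry legitimately showing up under its own key, a C key, and a P key.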
6 years, 8 months
Where is wibrown?
by William Brown
Hi all,
I'll be in Melbourne this week, so please excuse me if my responses are
delayed,
--
Sincerely,
William Brown
Software Engineer
Red Hat, Australia/Brisbane
6 years, 9 months