Re: hierarchical data structure



Eric--

> Does anybody know of a Perl module allowing me to maintain a hierarchical
> data structure?

As others have pointed out, Perl has some native support for
trees:

1. A good literal notation for nested data
2. Adequate nested records/arrays (built from references)
3. Hashes, which are enough for most graph algorithms
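
To illustrate the first two points, here is a minimal sketch of a small tree written as a nested literal and walked recursively. The terms and the outline() helper are made up for the example; nothing here comes from a particular module.

```perl
use strict;
use warnings;

# A small taxonomy as a nested literal: each node is a hash holding
# a term and an array of child nodes. The terms are illustrative only.
my $tree = {
    term     => 'science',
    children => [
        {
            term     => 'physics',
            children => [ { term => 'optics', children => [] } ],
        },
        { term => 'chemistry', children => [] },
    ],
};

# Depth-first walk returning one indented line per term.
# outline() is a hypothetical helper, not part of any library.
sub outline {
    my ($node, $depth) = @_;
    $depth = 0 unless defined $depth;
    my @lines = ('  ' x $depth) . $node->{term};
    push @lines, outline($_, $depth + 1) for @{ $node->{children} };
    return @lines;
}

print "$_\n" for outline($tree);
```

This prints the four terms as an indented outline, parents before children.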

> I wish I had a Perl module allowing me to maintain hierarchical data in
> a relational database.

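Relational databases often have problems with hierarchical data.
Consider the classic bill-of-materials problem: listing every
component of an assembly, at every depth, takes one self-join per
level in plain SQL.
Non-standard extensions (Oracle has one, CONNECT BY) can help.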
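
Without such an extension, the usual workaround is to fetch the parent-pointer rows and do the recursion in the application. A rough sketch, assuming a hypothetical parts table with id, parent_id and name columns; the rows are hard-coded here instead of coming from DBI so the example stands alone.

```perl
use strict;
use warnings;

# Rows as they might come back from a query such as
# "SELECT id, parent_id, name FROM parts" (table and columns are
# hypothetical; the data is hard-coded for the example).
my @rows = (
    { id => 1, parent_id => undef, name => 'bicycle' },
    { id => 2, parent_id => 1,     name => 'wheel'   },
    { id => 3, parent_id => 2,     name => 'spoke'   },
    { id => 4, parent_id => 2,     name => 'rim'     },
    { id => 5, parent_id => 1,     name => 'frame'   },
);

# Index the rows: names by id, and child ids by parent id.
my (%name_of, %children_of);
for my $row (@rows) {
    $name_of{ $row->{id} } = $row->{name};
    push @{ $children_of{ $row->{parent_id} } }, $row->{id}
        if defined $row->{parent_id};
}

# Return the names of all parts below $id, at any depth.
# explode() is a hypothetical helper name.
sub explode {
    my ($id) = @_;
    my @parts;
    for my $child (@{ $children_of{$id} || [] }) {
        push @parts, $name_of{$child}, explode($child);
    }
    return @parts;
}

print join(', ', explode(1)), "\n";   # wheel, spoke, rim, frame
```

The database only stores the rows here; the hierarchy itself lives in the Perl code.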

> If I had such a thing, then I would be better
> able to create various controlled vocabularies. It is important for this
> thing to be a module so I could use it over and over again in different
> applications as well as be able to share the same vocabularies between
> applications. I have written the beginnings of a program allowing me to
> do this, but I have not gotten my mind around the problem of how to edit
> the data. How can I do this?

This is a domain-specific problem. Look at
Lingua::Wordnet and the TPJ (The Perl Journal) article:
http://www.samag.com/documents/s=1273/sam05020006/

> I have started to create such functionality in a program I call
> Thesaurus Builder:
>
>    http://dewey.library.nd.edu/thesauri/?cmd=about
>
> Using Thesaurus Builder I am able to create new terms, provide them with
> scope notes, and create relationships between the new term and other
> terms. The relationships included the typical relationships found in
> thesauri and taxonomies:

Classical thesauri do not seem to have direct support in Perl yet.
Lingua::Wordnet is the closest I have been able to find.

You will quickly find yourself in graph theory, since these
relations form graphs. Read up on graph-oriented algorithms.
Relational models do not necessarily help.
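
As a starting point, a hash of hashes is enough to hold broader/narrower relations as a graph, and it allows a term to have several broader terms. A minimal sketch; the terms and the add_relation() and all_broader() helpers are invented for the example.

```perl
use strict;
use warnings;

# %broader maps a term to the set of its broader terms;
# %narrower is the inverse index. Both are hashes of hashes.
my (%broader, %narrower);

# Record that $narrow has $broad as a broader term (hypothetical helper).
sub add_relation {
    my ($narrow, $broad) = @_;
    $broader{$narrow}{$broad}  = 1;
    $narrower{$broad}{$narrow} = 1;
}

add_relation('laser',       'optics');
add_relation('laser',       'electronics');   # a second parent
add_relation('optics',      'physics');
add_relation('electronics', 'engineering');

# Collect every broader term reachable from $term, breadth-first.
# The %seen hash keeps the walk from looping if the data has a cycle.
sub all_broader {
    my ($term) = @_;
    my %seen;
    my @queue = keys %{ $broader{$term} || {} };
    while (defined(my $t = shift @queue)) {
        next if $seen{$t}++;
        push @queue, keys %{ $broader{$t} || {} };
    }
    my @result = sort keys %seen;
    return @result;
}

print join(', ', all_broader('laser')), "\n";
# electronics, engineering, optics, physics
```

Keeping both %broader and %narrower costs some memory but makes parent and child lookups equally cheap.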

>    1. broader term (parent)
>    2. narrower term (child)
>    3. used for (the inversion of a See reference)
>    4. synonymous term (See Also)
>
> To implement these functions I have created a relational (MySQL)
> database consisting of three tables:
>
>    thesari     terms         see_also
>    =======     =======       ========
>    id          id            id
>    thesarus    term          one_term_id
>    note        note          another_term_id
>                term_id       thesarus_id
>                see_id
>                thesaurus_id
>
> The thesari table is simply a list of taxonomies. The terms table lists
> each term where term_id represents the key value pointing to the term's
> parent term. See_id is the key value pointing to a see also term, if one
> exists. The see_also table simply relates two term ids, denoting equality.
>
> My question now is, how do I modify my tables once I create them? It is
> easy for me to change things like spelling errrrors. :-) On the other
> hand, it is difficult for me to change the relationships. If I add or
> delete a term in the middle of the structure I have to find all the
> parents and children of that term and modify them accordingly.
> Additionally, my existing structure only allows terms to have a single
> parent. How can I change my data structure to allow terms to have
> multiple parents? Finally, how can I write this whole thing as a module?

I would look at Lingua::Wordnet and particularly the article,
ask Dan Brian what he thinks, and if that does not get you far
enough, start building on graph algorithms. (A relational implementation
can be helpful for persistence and some support, but it does not directly
address any of your consistency issues. Relational databases have a
notion of referential integrity that can sometimes be helpful
(e.g. cascade delete), but you are likely to discover
that you want more than they can provide.)
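
For the "delete a term in the middle" case, one possible policy is to reattach the deleted term's children to each of its parents. The sketch below does that on an in-memory graph; the policy itself, the terms, and the link_terms(), unlink_terms() and remove_term() helpers are all assumptions made for the example, not something a database or module gives you.

```perl
use strict;
use warnings;

# Two indexes kept in step: parents of each term, children of each term.
my (%parents_of, %children_of);

sub link_terms {
    my ($child, $parent) = @_;
    $parents_of{$child}{$parent}  = 1;
    $children_of{$parent}{$child} = 1;
}

sub unlink_terms {
    my ($child, $parent) = @_;
    delete $parents_of{$child}{$parent};
    delete $children_of{$parent}{$child};
}

# Remove $term and reattach each of its children to each of its parents,
# so no child is left pointing at a term that no longer exists.
sub remove_term {
    my ($term) = @_;
    my @parents  = keys %{ $parents_of{$term}  || {} };
    my @children = keys %{ $children_of{$term} || {} };
    for my $child (@children) {
        unlink_terms($child, $term);
        link_terms($child, $_) for @parents;
    }
    unlink_terms($term, $_) for @parents;
    delete $parents_of{$term};
    delete $children_of{$term};
}

link_terms('optics', 'physics');
link_terms('laser',  'optics');
link_terms('lens',   'optics');
remove_term('optics');

print join(', ', sort keys %{ $children_of{physics} }), "\n";   # laser, lens
```

A cascade delete would instead drop 'laser' and 'lens' along with 'optics', which is rarely what a thesaurus editor wants.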

> Whew!

Agreed!
