Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > Perl > Perl Misc > Converting indented data to a tree

Reply
Thread Tools

Converting indented data to a tree

 
 
Tore Aursand
Guest
Posts: n/a
 
      10-15-2003
Hi!

I need to convert a large text file with indented data into a tree (ie.
into a child-parent relationship).

The text file looks like this:

Page 1
Page 1.1
Page 1.1.1
Page 1.1.2
Page 1.2
Page 1.3
Page 2
Page 2.1

I know that for each "level", there are a known number of spaces (in this
case 4), or 0 spaces if we're on a root level.

For each line I want to assign an incremental counter, so that I should
end up with an array of arrays representing the tree. For the example
above the array would look like this:

@array = (
[1,0], # Page 1
[2,1], # Page 1.1
[3,2], # Page 1.1.1
[4,2], # Page 1.1.2
[5,1], # Page 1.2
[6,1], # Page 1.3
[7,0], # Page 2
[8,7], # Page 2.1
);

I guess you'll get the idea. I'm totally stuck on this one, and I would
like some help from you guys (and girls).

Thanks in advance!


--
Tore Aursand <(E-Mail Removed)>
 
Reply With Quote
 
 
 
 
Anno Siegel
Guest
Posts: n/a
 
      10-15-2003
Tore Aursand <(E-Mail Removed)> wrote in comp.lang.perl.misc:
> Hi!
>
> I need to convert a large text file with indented data into a tree (ie.
> into a child-parent relationship).
>
> The text file looks like this:
>
> Page 1
> Page 1.1
> Page 1.1.1
> Page 1.1.2
> Page 1.2
> Page 1.3
> Page 2
> Page 2.1
>
> I know that for each "level", there are a known number of spaces (in this
> case 4), or 0 spaces if we're on a root level.
>
> For each line I want to assign an incremental counter, so that I should
> end up with an array of arrays representing the tree. For the example
> above the array would look like this:
>
> @array = (
> [1,0], # Page 1
> [2,1], # Page 1.1
> [3,2], # Page 1.1.1
> [4,2], # Page 1.1.2
> [5,1], # Page 1.2
> [6,1], # Page 1.3
> [7,0], # Page 2
> [8,7], # Page 2.1
> );
>
> I guess you'll get the idea.


I thought I did, until I sw the last item. Should "[8,7]" be "[8,1]"?
Also, the first component seems to be nothing but a counter, one more
than the index. What is it for?

> I'm totally stuck on this one, and I would
> like some help from you guys (and girls).


The array is almost trivial to build:

my @array;
my $count = 1;
while ( <DATA> ) {
my ( $leading) = /^( *)/;
push @array, [ $count++, length( $leading)/4];
}

I don't see how the array represents the tree. It only holds the
level of each node, but not its relation to other nodes.

Anno
 
Reply With Quote
 
 
 
 
Tore Aursand
Guest
Posts: n/a
 
      10-15-2003
On Wed, 15 Oct 2003 07:45:19 +0000, Anno Siegel wrote:
>> @array = (
>> [1,0], # Page 1
>> [2,1], # Page 1.1
>> [3,2], # Page 1.1.1
>> [4,2], # Page 1.1.2
>> [5,1], # Page 1.2
>> [6,1], # Page 1.3
>> [7,0], # Page 2
>> [8,7], # Page 2.1
>> );


> I thought I did, until I sw the last item. Should "[8,7]" be "[8,1]"?
> Also, the first component seems to be nothing but a counter, one more
> than the index. What is it for?


*trying to figure out what's Anno is thinking* Ah! No, no. Guess I
should have explained it a little better.

The "innermost" array is a [id, parent_id] thing. While the 'id' can be a
simple counter (incremental), I need to keep track of the relationship to
the parent also.

While writing this message I got the idea of push'ing and pop'ing the
current parent '$count' onto/from an array, and it _seems to_ work;

my @array = ();
my $this_level = 0;
my $prev_level = 0;
my @parents = (0);
my $count = 0;

while ( <DATA> ) {
chomp;
next unless ( length );
$count++;

my ( $spaces ) = m,^( *),;
$this_level = length( $spaces ) / 4;

if ( $this_level > $prev_level ) {
push( @parents, $count );
}
elsif ( $this_level < $prev_level ) {
pop( @parents );
}

push( @array, [$count, $parents[-1]] );
$prev_level = $this_level;
}

Anyone for a better solution?


--
Tore Aursand <(E-Mail Removed)>
 
Reply With Quote
 
Steven Kuo
Guest
Posts: n/a
 
      10-15-2003
On Wed, 15 Oct 2003, Tore Aursand wrote:

> On Wed, 15 Oct 2003 07:45:19 +0000, Anno Siegel wrote:
> >> @array = (
> >> [1,0], # Page 1
> >> [2,1], # Page 1.1
> >> [3,2], # Page 1.1.1
> >> [4,2], # Page 1.1.2
> >> [5,1], # Page 1.2
> >> [6,1], # Page 1.3
> >> [7,0], # Page 2
> >> [8,7], # Page 2.1
> >> );

>
> > I thought I did, until I sw the last item. Should "[8,7]" be "[8,1]"?
> > Also, the first component seems to be nothing but a counter, one more
> > than the index. What is it for?

>
> *trying to figure out what's Anno is thinking* Ah! No, no. Guess I
> should have explained it a little better.
>
> The "innermost" array is a [id, parent_id] thing. While the 'id' can be a
> simple counter (incremental), I need to keep track of the relationship to
> the parent also.


(snipped)



Perhaps something like:

use Data:umper;

my %parent;
my @array;

while (<DATA>) {
chomp;
my $leading_spaces = 0;
++$leading_spaces while (/\G /g);
$leading_spaces /= 4;
if ($leading_spaces) {
push @array, [ $., $parent{$leading_spaces-1}];
} else {
push @array, [$., 0];
}

$parent{$leading_spaces} = $.;

}

print Dumper(\@array);

__DATA__
Page 1
Page 1.1
Page 1.1.1
Page 1.1.2
Page 1.2
Page 1.3
Page 2
Page 2.1



--
Hope this helps,
Steven

 
Reply With Quote
 
Jay Tilton
Guest
Posts: n/a
 
      10-15-2003
Tore Aursand <(E-Mail Removed)> wrote:

: The "innermost" array is a [id, parent_id] thing. While the 'id' can be a
: simple counter (incremental), I need to keep track of the relationship to
: the parent also.
:
: While writing this message I got the idea of push'ing and pop'ing the
: current parent '$count' onto/from an array, and it _seems to_ work;
:
: my @array = ();
: my $this_level = 0;
: my $prev_level = 0;
: my @parents = (0);
: my $count = 0;
:
: while ( <DATA> ) {
: chomp;
: next unless ( length );
: $count++;
:
: my ( $spaces ) = m,^( *),;
: $this_level = length( $spaces ) / 4;
:
: if ( $this_level > $prev_level ) {
: push( @parents, $count );
^^^^^^
Not really what you're after. The resulting data structure will end up
pointing to the parent node's first child instead of pointing to the
parent node itself.

push( @parents, $count-1 );

: }
: elsif ( $this_level < $prev_level ) {
: pop( @parents );

It's possible for $this_level to drop more than one from $prev_level.
If your data went like this:

Page 1
Page 1.1
Page 1.1.1
Page 1.1.2
Page 2

then the tree outline would be thrown out of alignment.
splice() will work better than pop() :

splice @parents, $this_level - $prev_level;

: }
:
: push( @array, [$count, $parents[-1]] );
: $prev_level = $this_level;
: }
:
: Anyone for a better solution?

Use the level to index the @parents array directly instead of
adding/removing elements at its end.

my @parents = 0;
my @array;
while(<DATA>) {
my $depth = length( (/^( *)/)[0] ) / 4;
$parents[$depth+1] = $. ;
push @array, [ $. , $parents[$depth] ];
}

 
Reply With Quote
 
Tore Aursand
Guest
Posts: n/a
 
      10-16-2003
On Wed, 15 Oct 2003 20:32:26 +0200, Tore Aursand wrote:
> if ( $this_level > $prev_level ) {
> push( @parents, $count );
> }
> elsif ( $this_level < $prev_level ) {
> pop( @parents );
> }


Found out that the last elsif won't work, 'cause there will be times when
I need to "jump" from level 4 to level 1.

Solved it by doing this instead:

elsif ( $this_level < $prev_level ) {
pop( @parents ) for ( $this_level .. ($prev_level - 1) );
}

Tata!


--
Tore Aursand <(E-Mail Removed)>
 
Reply With Quote
 
 
 
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
Hi, I want to implement a General Tree Data structure (Not Binary Tree ) which have more than 2 sub nodes? sharan C Programming 2 10-31-2007 02:58 AM
Hi, I want to implement a General Tree Data structure (Not Binary Tree ) which have more than 2 sub nodes? sharan C Programming 1 10-30-2007 11:01 PM
Hi, I want to implement a General Tree Data structure (Not Binary Tree ) which have more than 2 sub nodes? sharan C Programming 4 10-30-2007 08:21 PM
making indented XML Steve Maring Java 0 01-10-2004 03:13 AM
B tree, B+ tree and B* tree Stub C Programming 3 11-12-2003 01:51 PM



Advertisments