Velocity Reviews


weston 03-03-2006 11:53 PM

Confused about Schwartz idiom utilizing map & split
 
In an article on Stonehenge.com on using libxml2 to strip html from a
document, I came across a part of the listing that I'm having trouble
understanding. Randall apparently creates a hash of approved tags and
their attributes with these lines:

=9= my %PERMITTED =
=10= map { my($k, @v) = split; ($k, {map {$_, 1} @v}) }
=11= split /\n/, <<'END';
=12= a href name target class title
=13= b
=14= big
=15= blockquote class
....
=49= END

(See http://www.stonehenge.com/merlyn/PerlJournal/col02.html )

I keep trying to parse line 10 in my head and am not getting a lot of
mental traction in really understanding how this works. Anybody want to
help?


Dr.Ruud 03-04-2006 12:21 AM

Re: Confused about Schwartz idiom utilizing map & split
 
weston wrote:
> In an article on Stonehenge.com on using libxml2 to strip html from a
> document, I came across a part of the listing that I'm having trouble
> understanding. Randall apparently creates a hash of approved tags and
> their attributes with these lines:
>
> =9= my %PERMITTED =
> =10= map { my($k, @v) = split; ($k, {map {$_, 1} @v}) }
> =11= split /\n/, <<'END';
> =12= a href name target class title
> =13= b
> =14= big
> =15= blockquote class
> ....
> =49= END
>
> (See http://www.stonehenge.com/merlyn/PerlJournal/col02.html )
>
> I keep trying to parse line 10 in my head and am not getting a lot of
> mental traction in really understanding how this works. Anybody want
> to help?


Maybe this helps:

#!/usr/bin/perl
use strict; use warnings;
use Data::Dumper;

my %PERMITTED =
map { my($k, @v) = split; ($k, {map {$_, 1} @v}) }
split /\n/, <<'END';
a href name target class title
b
big
blockquote class
....
END

print Data::Dumper->Dump( [\%PERMITTED]
, [qw(%PERMITTED)]
), "\n";

--
Affijn, Ruud

"Gewoon is een tijger."

Randal L. Schwartz 03-04-2006 12:36 AM

Re: Confused about Schwartz idiom utilizing map & split
 
>>>>> "weston" == weston <weston@canncentral.org> writes:

weston> In an article on Stonehenge.com on using libxml2 to strip html from a
weston> document, I came across a part of the listing that I'm having trouble
weston> understanding. Randall apparently creates a hash of approved tags and
weston> their attributes with these lines:

weston> =9= my %PERMITTED =
weston> =10= map { my($k, @v) = split; ($k, {map {$_, 1} @v}) }
weston> =11= split /\n/, <<'END';
weston> =12= a href name target class title
weston> =13= b
weston> =14= big
weston> =15= blockquote class
weston> ....
weston> =49= END

weston> (See http://www.stonehenge.com/merlyn/PerlJournal/col02.html )

weston> I keep trying to parse line 10 in my head and am not getting a lot of
weston> mental traction in really understanding how this works. Anybody want to
weston> help?

Heh.

The split on line 11 creates elements like:

"a href name target class title",
"b",
"big",
"blockquote class",

etc. The map at the beginning of line 10 sets $_ equal to each of those,
and looks for a list-valued return from the block.

The split in the middle of line 10 breaks each of those elements listed above
into a list, and assigns the first to $k, and any remaining ones to @v.
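
A minimal sketch of that split step on its own, assuming just two of the
sample lines (the @lines and @seen arrays are illustrative names, not from
the article). With no arguments, split works on $_ and splits on whitespace:

```perl
use strict;
use warnings;

# Two sample lines taken from the here-document in the thread.
my @lines = ("a href name target class title", "b");

my @seen;    # illustrative: collects a description of each result
for (@lines) {
    # split with no arguments is split ' ', $_
    my ($k, @v) = split;
    push @seen, "$k => [@v]";
}

print "$_\n" for @seen;
# a => [href name target class title]
# b => []
```

For a line holding a single word, @v simply ends up empty.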

The second map on line 10 converts @v to a list of the elements of @v
alternating with the value "1", and the braces around it turn that list into
a hashref, so that the elements of @v become keys, each with the value 1.
That hashref is then returned along with $k as the two values that each line
contributes to %PERMITTED.
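
The inner map can be tried in isolation. This is only a sketch with a
made-up @v; @pairs and $attrs are illustrative names, not from the article:

```perl
use strict;
use warnings;

my @v = qw(href name target);

# The map block returns two values per element: the element, then 1.
my @pairs = map { $_, 1 } @v;        # ('href', 1, 'name', 1, 'target', 1)

# Wrapping the same list in { ... } builds an anonymous hash and
# yields a reference to it.
my $attrs = { map { $_, 1 } @v };

print scalar(@pairs), "\n";                  # 6
print join(",", sort keys %$attrs), "\n";    # href,name,target
```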

But didn't I say all this in the article? :-)

print "Just another Perl hacker,"; # the original
--
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<merlyn@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/>
Perl/Unix/security consulting, Technical writing, Comedy, etc. etc.
See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!

Tad McClellan 03-04-2006 12:37 AM

Re: Confused about Schwartz idiom utilizing map & split
 
weston <notsew-reversePreceedingAndRemoveThis@canncentral.org> wrote:

> In an article on Stonehenge.com on using libxml2 to strip html from a
> document, I came across a part of the listing that I'm having trouble
> understanding. Randall apparently creates a hash of approved tags and
> their attributes with these lines:


> =10= map { my($k, @v) = split; ($k, {map {$_, 1} @v}) }


> I keep trying to parse line 10 in my head and am not getting a lot of
> mental traction in really understanding how this works. Anybody want to
> help?



Does this help?

------------------------------
#!/usr/bin/perl
use warnings;
use strict;
use Data::Dumper;

my %PERMITTED =
    map { my($k, @v) = split;    # 1st space-sep'd field is tag, rest are its attrs
          ($k, {map {$_, 1} @v}) # a 2-element list. 1st is tag,
                                 # 2nd is a hash-ref with keys as attr names,
                                 # and values set to one
        }
    split /\n/, <<'END';
a href name target class title
b
big
blockquote class
END

print Dumper \%PERMITTED;
------------------------------


Or maybe it would help to "unroll" the maps into foreachs:

------------------------------
#!/usr/bin/perl
use warnings;
use strict;
use Data::Dumper;

my %PERMITTED;

foreach (split /\n/, <<'END')
a href name target class title
b
big
blockquote class
END
{
    my($k, @v) = split;
    my %h;
    foreach ( @v ) {           # "unroll" {map {$_, 1} @v}
        $h{$_} = 1;
    }
    $PERMITTED{$k} = \%h;
}

print Dumper \%PERMITTED;
------------------------------


--
Tad McClellan SGML consulting
tadmc@augustmail.com Perl programming
Fort Worth, Texas

Anno Siegel 03-04-2006 01:01 AM

Re: Confused about Schwartz idiom utilizing map & split
 
weston <notsew-reversePreceedingAndRemoveThis@canncentral.org> wrote in comp.lang.perl.misc:
> In an article on Stonehenge.com on using libxml2 to strip html from a
> document, I came across a part of the listing that I'm having trouble
> understanding. Randall apparently creates a hash of approved tags and


Who is this Randall you speak of?

> their attributes with these lines:


Randal's code constructs a hash of hashes. The first word in each data
line is a primary key. The rest of the words in each line (if any)
become the keys of an inner hash, all with the value 1. Presumably
the inner hash represents a set of whatever, associated with the primary
key.
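
One way such a structure might be queried, as a sketch only: the hash is
written out literally for two tags, and attr_ok is a hypothetical helper,
not something taken from the article.

```perl
use strict;
use warnings;

# Same shape as %PERMITTED, written out by hand for two of the tags.
my %PERMITTED = (
    a => { href => 1, name => 1, target => 1, class => 1, title => 1 },
    b => {},
);

# Hypothetical helper: is this attribute allowed on this tag?
# Checking the outer key first avoids autovivifying unknown tags.
sub attr_ok {
    my ($tag, $attr) = @_;
    return exists($PERMITTED{$tag}) && exists($PERMITTED{$tag}{$attr});
}

print attr_ok('a', 'href') ? "yes" : "no", "\n";    # yes
print attr_ok('b', 'href') ? "yes" : "no", "\n";    # no
```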

> =9= my %PERMITTED =
> =10= map { my($k, @v) = split; ($k, {map {$_, 1} @v}) }
> =11= split /\n/, <<'END';
> =12= a href name target class title
> =13= b
> =14= big
> =15= blockquote class
> ....
> =49= END


How does it do that? Rewriting the code with fewer map's and more
variable names may help. (untested)

my @lines = split /\n/, <<'END';
a href name target class title
b
big
blockquote class
END

my %PERMITTED;

for my $line ( @lines ) {
    # ($k, @v) in the original code
    my ($primary_key, @words) = split ' ', $line;
    # build wordlist
    my @wordlist; # alternating one word and one 1 (for hash initialization)
    for my $word ( @words ) {
        push @wordlist, ( $word => 1 );
    }
    # build a hash out of @wordlist and assign it to its place
    $PERMITTED{$primary_key} = { @wordlist };
}

> I keep trying to parse line 10 in my head and am not getting a lot of
> mental traction in really understanding how this works. Anybody want to
> help?


Line 10 does basically what the (outer) for-loop does in my code. The
inner for-loop does the job of the nested map.

Randal's code is that of a fluent speaker of Perl. Its parts (the two map's)
are two well-known idioms for hash-building. Applied together, they may
look like a mess, but once you recognize the pattern of each, their
interaction becomes clear too.
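
Seen separately, the two idioms might look like the sketch below. The data
and the names %is_attr and %first_to_rest are made up for illustration:

```perl
use strict;
use warnings;

# Idiom 1: turn a list into a set, i.e. a hash keyed by the list elements.
my %is_attr = map { $_ => 1 } qw(href name target);

# Idiom 2: turn lines of text into key/value pairs, one pair per line.
# Here the value is just the rest of the line joined with spaces.
my %first_to_rest = map {
    my ($k, @v) = split;
    ($k, "@v");
} ("a href name", "blockquote class");

print "href is in the set\n" if $is_attr{href};
print "$_: $first_to_rest{$_}\n" for sort keys %first_to_rest;
```

Randal's line 10 is the second idiom with the first one used to build
the value.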

Anno
--
If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers.

Dr.Ruud 03-04-2006 11:27 AM

Re: Confused about Schwartz idiom utilizing map & split
 
Tad McClellan wrote:

> print Dumper \%PERMITTED;



Alternative:

print Data::Dumper->Dump( [\%var], ['%var'] );
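
For comparison, a sketch of two naming forms described in the Data::Dumper
documentation: a plain name, and a name with a leading *. The small hash
below is made up, and Sortkeys is set only to keep the output stable:

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Sortkeys = 1;

my %PERMITTED = ( b => {}, big => {} );

# A plain name labels the dump as a scalar holding a hash reference.
my $as_ref = Data::Dumper->Dump( [ \%PERMITTED ], ['PERMITTED'] );

# A name starting with * asks for the dereferenced type, here a hash.
my $as_hash = Data::Dumper->Dump( [ \%PERMITTED ], ['*PERMITTED'] );

print $as_ref, $as_hash;
```

The first dump starts with "$PERMITTED = {" and the second with
"%PERMITTED = (".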

--
Affijn, Ruud

"Gewoon is een tijger."

