Velocity Reviews

Marc Girod 05-22-2012 07:47 AM

get local packages symbols with require
 
Hello,

I am still trying to improve the CPAN ClearCase::Wrapper package:
http://cpansearch.perl.org/src/DSB/C....16/Wrapper.pm

In particular, somebody brought to my attention the fact that errors
in sub-modules which Wrapper loads, are ignored.
I decided that the point to fix was the snippet:

    local $^W = 0;    # in case a function is redefined
    eval {
        local @INC = ($dir);  # make %INC come out right
        eval "require $pkg";
    };
    warn $@, next if $@;

I restored the warnings, excepting 'redefine'.
I moved the 'warn $@' inside the eval block, right after the require.
This gave me errors for all the modules used.
I removed the @INC restriction, and got to the next problem.

Now I got into %{"${pkg}::"} not only the local symbols, but also all
imported functions (e.g. 'find' from File::Find).
The problem is that I need this information (the local symbols) for
the remaining part.
Most of the symbols are indeed found in the autosplit.ix file which is
also accessed, but not all:
(first) some functions might have been excluded from the
autosplitting, and (second) functions may have been aliased to shorter
names, not matching any new *.al file (listed in autosplit.ix).

My next move was to try to (optionally) 'require' twice, first in an
open verbose mode, then as previously.
In the meanwhile, I restored %{"${pkg}::"} to its initial state.

This led to the next error: 'use vars', in the sub-modules, was
evaluated only once, the first time.

Deep in what looks as a dead end, I ask for critique and help
Thanks,
Marc

Marc Girod 05-22-2012 07:05 PM

Re: get local packages symbols with require
 
Hi Ben,

I start from the bottom.

On May 22, 4:53 pm, Ben Morrow <b...@morrow.me.uk> wrote:

> What is your actual problem? You appear to be thrashing about, trying a
> whole lot of things you don't really understand and making a horrible
> mess, but you haven't explained what you're trying to do and how it
> isn't working.


Sorry. I find this a bit unfair, but I asked for it.
It is absolutely correct that I touch things I don't fully understand.
On the other hand, I believe that the guy who wrote this package did
understand what he was doing.
This doesn't make the result perfect.
What I try to do, I explained in my introduction: the context is to
improve an existing module, which has users, hence respecting its
backwards compatibility.

> > I am still trying to improve the CPAN ClearCase::Wrapper package:
> >http://cpansearch.perl.org/src/DSB/C....16/Wrapper.pm


Now there are precise issues:

> > In particular, somebody brought to my attention the fact that errors
> > in sub-modules which Wrapper loads, are ignored.


Apart from that, this module works satisfactorily and provides a useful
service.
What service? It doesn't really make sense for me to copy the doc which I
referred to, on CPAN.
This sets up an infrastructure for developing wrappers around a
proprietary tool, extending or modifying this one's behavior.

I develop one such wrapper myself.

> >    local $^W = 0;    # in case a function is redefined
> >    eval {
> >        local @INC = ($dir);  # make %INC come out right


> What is $dir?


This is enclosed in a:

for my $dir (@INC) { ... }

loop. Wrapper modules will be searched under ClearCase/Wrapper
hierarchies below each such $dir.
Setting @INC to ($dir) restricts the number of visible packages.
Most commonly, the wrapper modules will be found under site_perl,
whereas strict or File::Find are located under lib.
Hence this restriction will prevent accessing many useful modules
specified with 'use'.

> >        eval "require $pkg";
> >    };
> >    warn $@, next if $@;

>
> > I restored the warnings, excepting 'redefine'.


> How did you do that? If you used 'warnings', it won't do what you
> expect. The reason for using $^W is that it propagates into required
> modules; 'warnings' does not.


Right. Thanks. I indeed used 'no warnings q(redefine)'.
Redefining functions is not something I would generally recommend.
It has only become necessary for backwards compatibility reasons, while
adding a top level loop to the wrapper infrastructure, which requires
catching existing 'exit' and 'exec' calls.
It may well stay limited to one level.

> OTOH, $^W will only silence warnings in
> code not under 'warnings', so that's not terribly useful either.


> If you want to do this right, you either need to find a way to avoid
> redefining functions, or you need to set up a local __WARN__ handler
> which filters out the 'redefined' warnings.


I had a look at that and couldn't quite figure out how to do it.
But let's leave it for now.
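
For reference, a local __WARN__ handler of the kind suggested above could look roughly like the sketch below. It is not code from the thread: the helper name without_redefine_warnings and the demo sub are made up for illustration, and the regex assumes the usual wording of perl's "redefined" warnings.

```perl
use strict;
use warnings;

# Hypothetical helper: run $code with "... redefined" warnings filtered out.
# Every other warning is handed to the previously installed handler, or to
# warn() if there was none.
sub without_redefine_warnings {
    my ($code) = @_;
    my $prev = $SIG{__WARN__};
    local $SIG{__WARN__} = sub {
        my ($msg) = @_;
        return if $msg =~ /^(?:Constant subroutine|Subroutine) \S+ redefined/;
        if (ref $prev eq 'CODE') { $prev->($msg) } else { warn $msg }
    };
    return $code->();
}

# Demo: collect whatever gets through the filter.
my @seen;
local $SIG{__WARN__} = sub { push @seen, $_[0] };

without_redefine_warnings(sub {
    # Redefining a sub under 'use warnings' normally warns at compile time.
    eval q{ sub demo_sub { 1 } sub demo_sub { 2 } 1 } or die $@;
    warn "something else\n";    # unrelated warning, must still be reported
});

print "warnings that got through: ", scalar(@seen), "\n";
```

Because the handler is installed with local, it also covers warnings raised while a 'require' inside the callback compiles other files, which lexical 'no warnings' in the caller does not.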

> > I moved the 'warn $@' inside the eval block, right after the require.
> > This gave me errors for all the modules used.


> What errors?


Cannot find File::Find was one.
Sorry, I am home now, without access to my logs or the environment.

> > I removed the @INC restriction, and got to the next problem.


> ...so, @INC = $dir was completely wrong, and nothing was getting loaded
> at all, but you weren't seeing errors because you threw them away? OK.


As far as I can tell, no.
It must have done something useful despite the hidden errors.
Or I understand even less than you think.
What does 'require' do in the presence of errors?
It must have skipped them somehow...

> > Now I got into %{"${pkg}::"} not only the local symbols, but also all
> > imported functions (e.g. 'find' from File::Find).
> > The problem is that I need this information (the local symbols) for
> > the remaining part.


> Sub::Identify will tell you where a sub comes from.


Thanks... Interesting.

> Be careful about grubbing around in the symbol tables: there can be
> things in there you may not expect. For instance, a constant created
> with 'constant' is represented as a scalar ref in the symbol table,
> where you would expect a glob to be. If you look up the glob by name
> perl will transform it into a real glob for you, but if you try to
> follow the ref from the stash you will find it isn't what you expect.


OK.

> > Most of the symbols are indeed found in the autosplit.ix file which is
> > also accessed, but not all:
> > (first) some functions might have been excluded from the
> > autosplitting, and (second) functions may have been aliased to shorter
> > names, not matching any new *.al file (listed in autosplit.ix).

>
> > My next move was to try to (optionally) 'require' twice, first in an
> > open verbose mode, then as previously.

>
> What did you think that would do?


I thought the first pass would populate the symbol table with
everything useful plus extra symbols I didn't want. It would give me
the useful errors and warnings I was lacking, without spurious extra
ones which obviously didn't affect the behavior.
I would throw away the whole result and the second pass would do as it
had always done, whatever that is.
Now, as I said, I noticed that I failed to 'throw away the whole
result', and that the first pass affected the second.

> > In the meanwhile, I restored %{"${pkg}::"} to its initial state.


> Be *even* *more* careful about changing the stashes directly. This is an
> area where there are still (probably) bugs in perl: here be segfaults.


OK. This was my attempt to clean up the result of the first phase.
Naive maybe.

> > This led to the next error: 'use vars', in the sub-modules, was
> > evaluated only once, the first time.


> If you require the same file twice, without clearing the %INC entry, the
> second time will do nothing. This is (pretty-much) the whole point of
> require.


OK. I could clean up %INC. Thanks.
Er... Was that a recommendation?
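
The %INC behaviour described in the quote can be checked with a small self-contained sketch. The module name ReloadDemo and the temporary directory are made up for the demonstration; nothing here comes from ClearCase::Wrapper.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Write a throwaway module that counts how many times its file body runs.
my $dir = tempdir(CLEANUP => 1);
open my $fh, '>', "$dir/ReloadDemo.pm" or die "open: $!";
print {$fh} "package ReloadDemo;\nour \$loads;\n\$loads++;\n1;\n";
close $fh or die "close: $!";

unshift @INC, $dir;

require ReloadDemo;                  # first load: the file is compiled and run
require ReloadDemo;                  # no-op: $INC{'ReloadDemo.pm'} is already set
my $after_two = $ReloadDemo::loads;

delete $INC{'ReloadDemo.pm'};        # forget that the file was loaded
require ReloadDemo;                  # now the file body runs again
my $after_reload = $ReloadDemo::loads;

print "after two requires: $after_two, after reload: $after_reload\n";
# prints "after two requires: 1, after reload: 2"
```

Deleting the %INC entry only makes 'require' run the file again; it does not undo whatever the first load already did to the symbol table.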

> > Deep in what looks as a dead end, I ask for critique and help


I got both.
In summary, you offer me two avenues. Either:
- use Sub::Identify to skip some entries in %{"${pkg}::"}, without
needing to touch it after the single pass needed, or
- clean up %INC in addition to what I already did.

Clearly the first one looks preferable, especially as I still don't
understand how the 'use vars' were affected (and probably the 'use
constant' as well).
There obviously remains something unsettled about what actually
happens in the original code, with the require under a restricted
@INC, and throwing errors away.
I would be *very* surprised if nothing at all happened.
In fact, I don't know where else the modules could be loaded...
And, they are.

Thanks,
Marc

Marc Girod 05-22-2012 07:55 PM

Re: get local packages symbols with require
 
Hi again Ben, and of course others,

On May 22, 8:05 pm, Marc Girod <marc.gi...@gmail.com> wrote:

> - use Sub::Identify to skip some entries in %{"${pkg}::"}, without
> needing to touch it after the single pass needed.


I decided to give it a try from home, and indeed, this looks
promising:

tmp> perl -MFile::Find -M'Sub::Identify q(:all)' -le \
'for (keys %{"File::Find::"}){ \
my $sn=sub_name(\&$_); \
my $p = stash_name(\&$_); \
defined $sn and $p eq q(File::Find) and print "$sn in $p"}'
find in File::Find
finddepth in File::Find

(skipped: 44)

Using B instead trades off only functions which would be loaded from C
libraries, right?
I think this is excluded in my case.

tmp> perl -MFile::Find -MB -le \
'for (keys %{"File::Find::"}){ \
my $coderef=\&$_; \
ref $coderef or next; \
my $cv = B::svref_2object($coderef); \
$cv->isa('B::CV') or next; \
$cv->GV->isa('B::SPECIAL') and next; \
$p=$cv->GV->STASH->NAME; \
if($p eq q(File::Find)){ \
print "$p::",$cv->GV->NAME} \
else{$skip++}} \
print"skipped: $skip"'
find
finddepth
skipped: 44

Thanks,
Marc

Marc Girod 05-23-2012 08:09 AM

Re: get local packages symbols with require
 
Hi Ben,

On May 22, 10:36 pm, Ben Morrow <b...@morrow.me.uk> wrote:

> ...*why*? That's exactly what perl does when you call 'require' without
> messing about with @INC.


OK. I must be able to explain. Even for myself.
This is essential to this infrastructure, and the key word to describe
it is probably 'overlay'.

The point is not to run known code, but to allow unknown modules,
mostly auto-split/loaded, to complete and redefine some functions.
These functions are made available through a shared driver (the
cleartool.plx wrapper, aliased to ct) from the command line, for
example:

ct describe
ct checkin
ct find

...so that the different functions may be provided by different
mixins, and some may override others. E.g. checkin can be offered by
ClearCase::Wrapper::DSB and ClearCase::Wrapper::MGi.
If you install only DSB, you get its definition. If you install both,
MGi's takes over (because of alphabetic order for the time being).

The ct wrapper should work whatever the mix of modules installed.
The modules are written independently by different authors.

So, the main wrapper implements a dispatch table, and forces the
functions to autoload as needed.

> The only effect this will have is that if the
> module you are loading loads more modules in its turn, they will have to
> be in the same directory. Is that the desired effect here? ISTM it will
> just cause problems, unless there's something funny going on you haven't
> posted yet.


In fact, there is. This part is just setting up the table. The real
meat (code) is in the *.al files, and its loading takes place from the
cleartool.plx driver.

There is in fact little code that would not be autosplit/loaded, and
it may well be that until now, it could not 'use' other modules. There
has however always been a 'use strict', which I have never seen have
any effect...
There is something, though: some data, and function aliases.
And I have been able to place some small routines there, shared by
different autoload files.

I would like to be able to put more there.

> (It's not quite clear to me yet how much of this you wrote, and how much
> is someone else's code you're trying to improve.


2% and 98%. My part is growing, but started from 0% 5 years ago.

> If you didn't, can you see anything elsewhere in
> the code to suggest why this was done? Do you have any VCS logs you can
> look through?)


Sure.

> The second is that overriding builtins like exit and exec is quite
> different from overriding functions. Which are you trying to do, or are
> you trying to do both?


Both. As explained above, allowing to redefine functions is important
too, although the ordering makes it so that the first holds. The
others can be ignored.

> OK. Basically, you would want something like


Thanks. Will be useful.

Marc

Marc Girod 05-23-2012 09:30 AM

Re: get local packages symbols with require
 
Hi,

On May 22, 8:55 pm, Marc Girod <marc.gi...@gmail.com> wrote:

> Using B instead trades off only functions which would be loaded from C
> libraries, right?


I tried this in my code, after a clean require:

for my $subdir (qw(ClearCase/Wrapper)) {
    for my $dir (@INC) {
        my @pms = sort glob("$dir/$subdir/*.pm");
        for my $pm (@pms) {
            $pm =~ s%^$dir/(.*)\.pm$%$1%;
            (my $pkg = $pm) =~ s%[/\\]+%::%g;
            {
                eval {
                    eval "require $pkg";
                    warn $@ if $@;
                };
                next if $@;
                my $ix = "auto/$pm/autosplit.ix";
                if (-e "$dir/$ix") {
                    eval { require $ix };
                    warn $@ if $@;
                }
            }
            my %names = %{"${pkg}::"};
            for (keys %names) {
                my $coderef = \&$_;
                ref $coderef or next;
                my $cv = B::svref_2object($coderef);
                $cv->isa('B::CV') or next;
                $cv->GV->isa('B::SPECIAL') and next;
                my $p=$cv->GV->STASH->NAME;
                next unless $p eq $pkg;
                ...

Now, under the debugger, this gives, positioned on the last line:

DB<4> x $pkg, $dir, $subdir, scalar keys %names
0 'ClearCase::Wrapper::MGi'
1 '/home/emagiro/perl/lib/perl5'
2 'ClearCase/Wrapper'
3 105
DB<8> p $_
annotate
DB<9> x $p, $pkg
0 'ClearCase::Wrapper'
1 'ClearCase::Wrapper::MGi'
DB<10> x @pms
0 'ClearCase/Wrapper/MGi'

Yet, the function 'annotate' is only defined in
ClearCase::Wrapper::MGi.
Now, it gets assigned (wrongly) to ClearCase::Wrapper.

So... I hope I am progressing, but I am not home yet.

Thanks,
Marc

Marc Girod 05-23-2012 09:58 AM

Re: get local packages symbols with require
 
On May 23, 10:30 am, Marc Girod <marc.gi...@gmail.com> wrote:

> So... I hope I am progressing, but I am not home yet.


I am making progress reading Dave's original code.
Ten years ago he was already ahead of where I am now.
He had the following:

my $tglob = "${pkg}::$_";
...
if ($] >= 5.006) {
    next unless eval { exists &{$tglob} };
}

which shows me my error.
I keep the B code:

my $coderef = \&{$tglob};
ref $coderef or next;
my $cv = B::svref_2object($coderef);
$cv->isa('B::CV') or next;
$cv->GV->isa('B::SPECIAL') and next;
my $p=$cv->GV->STASH->NAME;
next unless $p eq $pkg;

And this works fine on various symbols:

DB<16> x $tglob
0 'ClearCase::Wrapper::MGi::find'
...
DB<19> x $cv->GV->STASH->NAME
0 'File::Find'

DB<22> $tglob = 'ClearCase::Wrapper::MGi::annotate'
...
DB<27> x $cv->GV->STASH->NAME
0 'ClearCase::Wrapper::MGi'

At least the smoke test is through... even on the command line:

ClearCase-Wrapper-MGi> wrapdebug lsgen -d 0 -a MGi.pm
MGi.pm@@/main/lx/18 (MG, MG_4.106)
ClearCase-Wrapper-MGi> ct lsgen -d 0 -a MGi.pm
MGi.pm@@/main/lx/18 (MG, MG_4.106)
ClearCase-Wrapper-MGi> wrapdebug find . -name MGi.pm -print
./MGi.pm@@

Marc
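
Put together, the lookup that ended up working can be sketched as a standalone function. The name local_subs and the Demo::Local package are made up for illustration; the B calls are the ones used in the post above, applied to the fully qualified name.

```perl
use strict;
use warnings;
use B ();

# Hypothetical helper: names of the subs in $pkg that were compiled in $pkg
# itself, skipping anything imported from another package.
sub local_subs {
    my ($pkg) = @_;
    no strict 'refs';
    my @names;
    for my $name (sort keys %{"${pkg}::"}) {
        my $full = "${pkg}::$name";          # qualify the name: the key point
        next unless exists &{$full};         # skip stash entries with no sub
        my $cv = B::svref_2object(\&{$full});
        next unless $cv->isa('B::CV');
        my $gv = $cv->GV;
        next if $gv->isa('B::SPECIAL');
        push @names, $name if $gv->STASH->NAME eq $pkg;
    }
    return @names;
}

# Demo package with one imported sub and one sub of its own.
{
    package Demo::Local;
    use File::Basename qw(basename);         # compiled in File::Basename
    sub own_sub { return basename($_[0]) }   # compiled here
}

my @local = local_subs('Demo::Local');
print "local subs: @local\n";
```

Taking the reference through the qualified name matters: an unqualified \&$_ is resolved in the current package, not in the package whose stash is being walked.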

Peter J. Holzer 05-24-2012 11:52 AM

Re: get local packages symbols with require
 
On 2012-05-23 08:09, Marc Girod <marc.girod@gmail.com> wrote:
> On May 22, 10:36 pm, Ben Morrow <b...@morrow.me.uk> wrote:
>> The only effect this will have is that if the
>> module you are loading loads more modules in its turn, they will have to
>> be in the same directory. Is that the desired effect here? ISTM it will
>> just cause problems, unless there's something funny going on you haven't
>> posted yet.

>
> In fact, there is. This part is just setting up the table. The real
> meat (code) is in the *.al files, and its loading takes place from the
> cleartool.plx driver.
>
> There is in fact little code that would not be autosplit/loaded, and
> it may well be that until now, it could not 'use' other modules. There
> has however always been a 'use strict', which I have never seen to
> affect in any way...


Use loads each module only once. Presumably your scripts all start with
"use strict", so strict is loaded long before you mess with @INC. All
modules loaded after that see that strict is already loaded and don't
care where it came from.

hp

--
_ | Peter J. Holzer | Deprecating human carelessness and
|_|_) | Sysadmin WSR | ignorance has no successful track record.
| | | hjp@hjp.at |
__/ | http://www.hjp.at/ | -- Bill Code on asrg@irtf.org

Marc Girod 05-24-2012 01:11 PM

Re: get local packages symbols with require
 
On May 24, 12:52 pm, "Peter J. Holzer" <hjp-usen...@hjp.at> wrote:

> Use loads each module only once. Presumably your scripts all start with
> "use strict", so strict is loaded long before you mess with @INC. All
> modules loaded after that see that strict is already loaded and don't
> care where it came from.


Good point.
What this shows (confirms) is that backwards compatibility is probably
easier to achieve than I thought: no module has ever been able to load
anything, apart from the autosplit functions.

Marc

Dr.Ruud 05-25-2012 04:18 PM

Re: get local packages symbols with require
 
On 2012-05-22 09:47, Marc Girod wrote:

> eval {
> local @INC = ($dir); # make %INC come out right
> eval "require $pkg";
> };
> warn $@, next if $@;


Don't test $@; test what the eval returns.

eval {
local @INC = ($dir); # make %INC come out right
eval "require $pkg";
1; # success
}
or do {
my $eval_error = $@ || 'Zombie Error';
warn $eval_error;
};

--
Ruud
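
One detail worth noting about the snippet as quoted: the inner string eval traps a failing 'require' itself, so the outer block still runs to its end, and $@ is reset to the empty string when the outer eval completes. A minimal sketch of the return-value idiom with a single eval follows; the helper name try_require and the module name No::Such::Module::Here are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: try to load $pkg.
# Returns (1, '') on success or (0, $error_text) on failure.
sub try_require {
    my ($pkg) = @_;
    (my $file = "$pkg.pm") =~ s{::}{/}g;   # Foo::Bar -> Foo/Bar.pm
    my $ok = eval { require $file; 1 };    # a single eval, so failure is visible
    return $ok ? (1, '') : (0, $@ || 'unknown error');
}

my ($ok, $err) = try_require('No::Such::Module::Here');
print $ok ? "loaded\n" : "failed: $err";
```

Converting the package name to a file name avoids the string eval altogether, since 'require' with a variable expects a path.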

Rainer Weikusat 05-25-2012 04:54 PM

Re: get local packages symbols with require
 
"Dr.Ruud" <rvtol+usenet@xs4all.nl> writes:
> On 2012-05-22 09:47, Marc Girod wrote:
>
>> eval {
>> local @INC = ($dir); # make %INC come out right
>> eval "require $pkg";
>> };
>> warn $@, next if $@;

>
> Don't test $@; test what the eval returns.


This is a hack intended to work around destructors clearing or
changing $@ which happen to be executed automatically before the 'eval
scope' is left using a version of Perl older than 5.14, based on the
'ideological standpoint' that fixing broken code is Just Completely
Wrong[tm] and that one should always prefer adding bizarre
workarounds, ideally, even in places where they aren't needed, instead.

Anything else just makes the code too easy to understand and since "it
was hard to write, it should be hard to read" ...

