Re: utilities in perl

 
 
Peter J. Holzer
 
      09-21-2013
On 2013-09-21 14:49, Henry Law <(E-Mail Removed)> wrote:
> On 21/09/13 01:04, Cal Dershowitz wrote:
>> if ($#ARGV < 1) {
>> print "Needs directory and filetype\n";
>> exit;
>> }
>> my $dir = $ARGV[0];
>> my $filetype = $ARGV[1];

[...]
>> Q1) Do the ultimate 2 statements effectively pipe the input from stdin
>> to stdout?

>
> No. STDIN contained the two values in one line;


No, STDIN didn't contain those values at all. @ARGV is not STDIN!

hp


--
_ | Peter J. Holzer | Curse of electronic word processing:
|_|_) | | You keep filing away at your text until
| | | (E-Mail Removed) | the parts of the sentence no longer
__/ | http://www.hjp.at/ | fits together. -- Ralph Babel
 
Peter J. Holzer
 
      09-21-2013
On 2013-09-21 17:26, Cal Dershowitz <(E-Mail Removed)> wrote:
> On 9/21/2013 8:33 AM, Peter J. Holzer wrote:
>> On 2013-09-21 14:49, Henry Law <(E-Mail Removed)> wrote:
>>> On 21/09/13 01:04, Cal Dershowitz wrote:
>>>> if ($#ARGV < 1) {
>>>> print "Needs directory and filetype\n";
>>>> exit;
>>>> }
>>>> my $dir = $ARGV[0];
>>>> my $filetype = $ARGV[1];

>> [...]
>>>> Q1) Do the ultimate 2 statements effectively pipe the input from stdin
>>>> to stdout?
>>>
>>> No. STDIN contained the two values in one line;

>>
>> No, STDIN didn't contain those values at all. @ARGV is not STDIN!

>
> Hmmmm. I'm looking at Stevens and Rago, p. 774
>
> #include <fcntl.h>
>
> int getopt(int argc, const * const argv[], const char *options);
>
> It certainly reminds a person of the straight C version of it.


The C getopt function (and the arguments argc and argv to the main
function) doesn't have anything to do with stdin in C either.

> Maybe you can say a few words why this is not STDIN.


Because it isn't. They are completely separate. There is no "why" except
that Ken and Dennis thought it was a good idea to have both a standard
input and program arguments.

hp

Jürgen Exner
 
      09-21-2013
"Peter J. Holzer" <(E-Mail Removed)> wrote:
>On 2013-09-21 17:26, Cal Dershowitz <(E-Mail Removed)> wrote:

[...]
>>>>> my $dir = $ARGV[0];
>>>>> my $filetype = $ARGV[1];
>>> [...]
>>>>> Q1) Do the ultimate 2 statements effectively pipe the input from stdin
>>>>> to stdout?
>>>>
>>>> No. STDIN contained the two values in one line;
>>>
>>> No, STDIN didn't contain those values at all. @ARGV is not STDIN!

>>
>> Maybe you can say a few words why this is not STDIN.


Try a trivial experiment and redirect STDIN, e.g. feed it from a pipe:

cat whateverfile | myprog.pl foo bar
(yes, this is a useless use of cat, just to make the example
super-explicit)

Now, what value do you expect in $dir and $filetype? Based on your
reasoning it must be (part of) the content of whateverfile because that
is where the content of STDIN is coming from.
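
For anyone who wants to run the experiment, here is a minimal sketch of such a myprog.pl. The helper name describe_inputs is made up for illustration; the sketch only assumes that the arguments arrive in @ARGV and that piped data is read through the STDIN filehandle.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: report the arguments and the input separately.
# $args is an array reference (normally \@ARGV), $in is a filehandle
# (normally \*STDIN).
sub describe_inputs {
    my ($args, $in) = @_;
    my @report = map { "arg: $_" } @$args;
    while (my $line = <$in>) {
        chomp $line;
        push @report, "stdin: $line";
    }
    return @report;
}

# Arguments come from @ARGV, piped data comes from STDIN;
# neither leaks into the other.
print "$_\n" for describe_inputs(\@ARGV, \*STDIN);
```

Run as "cat whateverfile | ./myprog.pl foo bar", the "arg:" lines show foo and bar, and the "stdin:" lines show the content of whateverfile.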

jue
 
 
hymie!
 
      09-21-2013
In our last episode, the evil Dr. Lacto had captured our hero,
Cal Dershowitz <(E-Mail Removed)>, who said:

>I don't want to spend too long talking about something where I clearly
>don't get it, but everyone else here does. I know this is a perl group,
>so C talk is OT.
>
>int main(int argc, char *argv[])
>
>Do people still think these values don't come from STDIN in this context?


STDIN means that a program that is already running has asked you a
question and is waiting for you to type in an answer.

In your case, on the other hand, you are starting the program with a set
of arguments already provided when the program starts. That's ARGV.

It is possible, however, that one of the arguments you provide to
the program is - . That is a clue to the operating system that
"this argument should not read data from a pre-existing file, it should
read from STDIN."

--hymie! http://lactose.homelinux.net/~hymie (E-Mail Removed)
-------------------------------------------------------------------------------
 
 
Jürgen Exner
 
      09-21-2013
(E-Mail Removed) (hymie!) wrote:
>In our last episode, the evil Dr. Lacto had captured our hero,
> Cal Dershowitz <(E-Mail Removed)>, who said:
>
>>I don't want to spend too long talking about something where I clearly
>>don't get it, but everyone else here does. I know this is a perl group,
>>so C talk is OT.
>>
>>int main(int argc, char *argv[])
>>
>>Do people still think these values don't come from STDIN in this context?


Of course they don't come from STDIN. They are command line parameters
and have absolutely nothing to do with STDIN.

jue
 
 
Ben Bacarisse
 
      09-21-2013
Cal Dershowitz <(E-Mail Removed)> writes:
<snip>
> Ok. Does one say "data from the command line" for whatever populates
> ARGV? Something specified by Unix?


Yes, but it would be better just to say "the arguments for the program".
After all, that's all @ARGV stands for -- the "argument vector" so named
because C programmers use argv as the canonical name for the C
equivalent.

They do often come from a command line, but some environments don't have a
command line (for example Perl scripts running on a web server may not
have such a thing), and programs can be started in other ways (see, for
example, perldoc -f exec).
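
To illustrate that last point: a program can hand an argument vector to another program directly, with no command line or shell in between. The sketch below uses the list form of a piped open to start a second perl (a Unix-like system is assumed); run_with_args is a hypothetical helper name.

```perl
use strict;
use warnings;

# Hypothetical helper: start a child perl with the given arguments and
# return what the child found in its own @ARGV, one entry per element.
sub run_with_args {
    my @args = @_;
    # List form of open: no shell is involved, so each list element
    # becomes exactly one argument of the child process. The "--" stops
    # perl from treating later arguments as its own switches.
    open(my $child, '-|', $^X, '-e', 'print "$_\n" for @ARGV', '--', @args)
        or die "cannot start $^X: $!";
    chomp(my @seen = <$child>);
    close($child) or die "child failed: status $?";
    return @seen;
}

# The space inside 'foo bar' survives because nothing splits it into words.
print "child saw: [$_]\n" for run_with_args('foo bar', 'baz');
```

The child receives two arguments, "foo bar" and "baz", although nobody ever typed a command line.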

<snip>
--
Ben.
 
 
Peter J. Holzer
 
      09-22-2013
On 2013-09-21 19:59, hymie! <(E-Mail Removed)> wrote:
> In our last episode, the evil Dr. Lacto had captured our hero,
> Cal Dershowitz <(E-Mail Removed)>, who said:
>>I don't want to spend too long talking about something where I clearly
>>don't get it, but everyone else here does. I know this is a perl group,
>>so C talk is OT.
>>
>>int main(int argc, char *argv[])
>>
>>Do people still think these values don't come from STDIN in this context?

>
> STDIN means that a program that is already running has asked you a
> question and is waiting for you to type in an answer.


No, it doesn't mean that. Many programs reading from stdin never ask you
any questions. For example all the typical Unix filters: cat, grep, cut,
sort, ...


> In your case, on the other hand, you are starting the program with a set
> of arguments already provided when the program starts. That's ARGV.


Yes, he has provided the program with arguments and those can be
accessed through @ARGV. However, that hasn't anything to do with stdin.
A program can choose to read or not to read from stdin whether it was
passed any command line arguments or not.


> It is possible, however, that one of the arguments you provide to
> the program is - . That is a clue to the operating system that
> "this argument should not read data from a pre-existing file, it should
> read from STDIN."


Also wrong. It's not a clue to the operating system, it is a clue to the
program. Many programs accept a "-" instead of a filename to mean either
"read from stdin" or "write to stdout". This is something the program
has to handle. The OS just passes the "-" to the program.
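
A sketch of what "the program has to handle it" can look like in Perl. open_input is a hypothetical helper; the only convention assumed is that a lone "-" stands for standard input. (Perl's own <> operator applies the same convention to the names in @ARGV.)

```perl
use strict;
use warnings;

# Hypothetical helper: turn a file name from the argument list into a
# filehandle. The "-" case is decided here, inside the program; the OS
# passed "-" along like any other string.
sub open_input {
    my ($name) = @_;
    return \*STDIN if $name eq '-';
    open(my $fh, '<', $name) or die "cannot open $name: $!";
    return $fh;
}

# Count the lines of every input named on the command line.
for my $name (@ARGV) {
    my $fh = open_input($name);
    my $count = 0;
    $count++ while <$fh>;
    print "$name: $count lines\n";
}
```

Saved as, say, count.pl (an invented name), "sort somefile | ./count.pl -" would count the lines arriving on stdin, while "./count.pl somefile" would open the file itself.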


To summarize:

On startup, a program is provided with three sets of information:

1) The argument vector: This is an array of strings containing the
command name and any "command line" arguments, i.e. the arguments
you type on the command line after the command (interactively), or
the arguments to exec (in a program). (Perl is a bit unusual in that
it shoves the first argument (the command name) into $0 and only the
rest of the arguments into @ARGV).

2) The environment: Another array of strings. By convention each program
passes this through to any programs it invokes and the strings are in
"key=value" format. This contains the PATH, locale information,
information about the terminal (if applicable) and other
configuration information.

3) A set of three file descriptors numbered 0, 1, and 2, and typically
called stdin, stdout, and stderr respectively in most programming
languages. These are *file descriptors*, not strings. You can read
from them (well, you should read only from stdin) with the read
system call (or higher level functions like getc() in C or <> in
Perl) and write to them (stdout and stderr, at least) with write (or
print or printf, etc.)
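
A small Perl sketch that prints all three sets of information as the perl process sees them. standard_fds is a hypothetical helper, and the environment keys PATH, HOME and LANG are only examples that may or may not be set on a given system.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: map each standard handle to its descriptor
# number (undef if the caller started us with that descriptor closed).
sub standard_fds {
    my %handle = (STDIN => \*STDIN, STDOUT => \*STDOUT, STDERR => \*STDERR);
    return map { ($_ => fileno($handle{$_})) } sort keys %handle;
}

# 1) Argument vector: script name in $0, remaining arguments in @ARGV.
print "script: $0\n";
print "argument: $_\n" for @ARGV;

# 2) Environment: "key=value" strings, exposed by perl as the %ENV hash.
for my $key (grep { defined $ENV{$_} } qw(PATH HOME LANG)) {
    print "environment: $key=$ENV{$key}\n";
}

# 3) Standard file descriptors: handles to read from or write to.
my %fd = standard_fds();
for my $name (qw(STDIN STDOUT STDERR)) {
    my $number = defined $fd{$name} ? $fd{$name} : 'closed';
    print "$name: file descriptor $number\n";
}
```

Redirecting input or output (for example "./info.pl foo < /dev/null", with info.pl being an invented file name) changes where descriptors 0, 1 and 2 point, but not the arguments or the environment.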

hp


Tim McDaniel
 
      09-23-2013
In article <(E-Mail Removed)>,
Ben Morrow <(E-Mail Removed)> wrote:
>
>Quoth "Peter J. Holzer" <(E-Mail Removed)>:
>>
>> To summarize:
>>
>> On startup, a program is provided with three sets of information:
>>
>> 1) The argument vector: This is an array of strings containing the
>> command name and any "command line" arguments, i.e. the arguments
>> you type on the command line after the command (interactively), or
>> the arguments to exec (in a program). (Perl is a bit unusual in that
>> it shoves the first argument (the command name) into $0 and only the
>> rest of the arguments into @ARGV).

>
>No, the first argument (the command name) goes into $^X, the first
>non-option argument goes into $0, and the rest of the arguments go into
>@ARGV.


I found your answer confusing. When I type a command line, like just
now with
$ chmod u+x local/test/106.pl
$ local/test/106.pl hello world
$0 was 'local/test/106.pl', as I expected, which was what I was
thinking of as the "command name", and I was thinking of "hello" as
the "first non-option argument".

However, the first line of the script was
#! /usr/bin/perl
and $^X was output as '/usr/bin/perl'.

So I think the explanation should be expanded. In UNIXy systems, for
a script that starts with #! and run from the command line, the
program on the #! line is put into $^X, and in particular, if it's a
Perl script, $^X is the perl program being run. $0 is set using the
first word on the command line (identifying the script itself), and
the rest of the arguments are put into @ARGV.

--
Tim McDaniel, (E-Mail Removed)
 
 
Peter J. Holzer
 
      09-23-2013
On 2013-09-22 18:43, Ben Morrow <(E-Mail Removed)> wrote:
>
> Quoth "Peter J. Holzer" <(E-Mail Removed)>:
>>
>> To summarize:
>>
>> On startup, a program is provided with three sets of information:
>>
>> 1) The argument vector: This is an array of strings containing the
>> command name and any "command line" arguments, i.e. the arguments
>> you type on the command line after the command (interactively), or
>> the arguments to exec (in a program). (Perl is a bit unusual in that
>> it shoves the first argument (the command name) into $0 and only the
>> rest of the arguments into @ARGV).

>
> No, the first argument (the command name) goes into $^X, the first
> non-option argument goes into $0, and the rest of the arguments go into
> @ARGV.


That's what I get for adding parenthetical remarks just before posting.
You are right of course, from the POV of the perl process. $0 and @ARGV
are handled as I described from the POV of the caller, but I didn't
write that.


> (Unless perl gets $^X from somewhere else, in which case the
> first argument is thrown away, or you pass an -e option, in which case
> $0 is "-e".)
>
> This is further confused by the kernel's (and perl's) #! processing, but
> by the time perl gets its final argument list to process the first
> argument is a path to perl itself.


Why "further confused"? The mechanism you describe is perl's attempt to
undo the effects of kernel's #! processing.

1) The caller invokes execl("/usr/local/bin/script", "script", "foo",
NULL).

2) The kernel finds "#!/usr/bin/perl" in "/usr/local/bin/script" and
invokes /usr/bin/perl with the argv ["/usr/bin/perl",
"/usr/local/bin/script", "foo"] instead (note that the original argv[0]
is lost in the process).

3) The perl interpreter then "hides" itself by putting what it thinks was
the original argv[0] into $0 and the original argv[1] .. argv[argc-1]
into @ARGV.
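
The same sequence can be written out as a toy model in Perl. Nothing here calls the kernel; kernel_shebang_argv and perl_view are made-up names that only mimic the rewriting described above, and the model ignores optional interpreter arguments on the #! line.

```perl
use strict;
use warnings;

# Toy model of the kernel step: the caller's argv[0] is dropped and
# replaced by the interpreter path followed by the script path.
sub kernel_shebang_argv {
    my ($interpreter, $script_path, @caller_argv) = @_;
    my (undef, @rest) = @caller_argv;    # original argv[0] is lost
    return ($interpreter, $script_path, @rest);
}

# Toy model of the perl step: the interpreter hides itself again, so the
# script sees its own path as the name and only the remaining arguments.
sub perl_view {
    my ($interpreter, $script_name, @args) = @_;
    return {
        interpreter => $interpreter,    # roughly what ends up in $^X
        script_name => $script_name,    # what ends up in $0
        args        => [@args],         # what ends up in @ARGV
    };
}

my @argv = kernel_shebang_argv('/usr/bin/perl', '/usr/local/bin/script',
                               'script', 'foo');
my $view = perl_view(@argv);
printf "interpreter=%s script=%s args=(%s)\n",
    $view->{interpreter}, $view->{script_name}, join(', ', @{ $view->{args} });
```

In this model the caller's "script" never reaches the Perl code; only "/usr/local/bin/script" and "foo" do.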


> This is not really unusual: it's what all the shells do, and I'd wager
> also any other language which has some equivalent to $0.
>
>> 2) The environment: Another array of strings. By convention each program
>> passes this through to any programs it invokes and the strings are in
>> "key=value" format. This contains the PATH, locale information,
>> information about the terminal (if applicable) and other
>> configuration information.
>>
>> 3) A set of three file descriptors numbered 0, 1, and 2, and typically
>> called stdin, stdout, and stderr respectively in most programming
>> languages. These are *file descriptors*, not strings. You can read
>> from them (well, you should read only from stdin) with the read
>> system call (or higher level functions like getc() in C or <> in
>> Perl) and write to them (stdout and stderr, at least) with write (or
>> print or printf, etc.)

>
> In fact, a completely arbitrary set of file descriptors, which may or
> may not be contiguously numbered. It's entirely possible to invoke a
> program with one of the standard fds closed, though it's not a good idea
> since many programs misbehave.


Linux ensures that these three file descriptors are open, at least for
setuid programs, but I don't know offhand whether that's done by the
kernel or the startup code. And I am aware that this isn't true for
other Unixes.


> It's also not uncommon to pass additional open file descriptors.


Yes, I should have written "at least three". You can always pass more,
and indeed some Unixes did pass a fd to the controlling terminal as file
descriptor 3 ("stdtty") by default.

hp


Rainer Weikusat
 
      09-23-2013
(E-Mail Removed) (Tim McDaniel) writes:
> In article <(E-Mail Removed)>,
> Ben Morrow <(E-Mail Removed)> wrote:
>>
>>Quoth "Peter J. Holzer" <(E-Mail Removed)>:
>>>
>>> To summarize:
>>>
>>> On startup, a program is provided with three sets of information:
>>>
>>> 1) The argument vector: This is an array of strings containing the
>>> command name and any "command line" arguments, i.e. the arguments
>>> you type on the command line after the command (interactively), or
>>> the arguments to exec (in a program). (Perl is a bit unusual in that
>>> it shoves the first argument (the command name) into $0 and only the
>>> rest of the arguments into @ARGV).

>>
>>No, the first argument (the command name) goes into $^X, the first
>>non-option argument goes into $0, and the rest of the arguments go into
>>@ARGV.

>
> I found your answer confusing. When I type a command line, like just
> now with
> $ chmod u+x local/test/106.pl
> $ local/test/106.pl hello world
> $0 was 'local/test/106.pl', as I expected, which was what I was
> thinking of as the "command name", and I was thinking of "hello" as
> the "first non-option argument".


That's probably how the shell invoked it but it need not be done in this
way. Assuming execl as an example, the general format of that is

execl("/path/to/file", "argument #0", ...);

the first argument to execl being the pathname of the file which is
supposed to be executed and the next being what ends up in argv[0]. By
convention, this should be 'the program name' and IIRC, POSIX even says
somewhere that it should really just be the name and not the
path. Assuming that /tmp/a.pl is the following perl script,

-----
#!/usr/bin/perl
print($^X, "\t", $0, "\t", $ARGV[0], "\n");
-----

this could be invoked via

-----
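#include <stdio.h>
#include <unistd.h>

int main(void)
{
    /* the second argument is what the caller passes as argv[0] */
    execl("/tmp/a.pl", "Blafasel", "Are we having an argument?", (void *)0);
    perror("execl"); /* reached only if execl failed */
    return 1;
}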
-----

and the output would be

-----
/usr/bin/perl /tmp/a.pl Are we having an argument?
-----

with the original 'program name' ("Blafasel") vanishing in the
process. It could also be called with

-----
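#include <stdio.h>
#include <unistd.h>

int main(void)
{
    /* perl is started directly here; "Now what?" is passed as argv[0] */
    execl("/usr/bin/perl", "Now what?", "/tmp/a.pl", "Are we having an argument?", (void *)0);
    perror("execl"); /* reached only if execl failed */
    return 1;
}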
-----

This will result in the same output on a system which supports
/proc/self/exe aka 'Linux' but in case perl has to resort to the real
'program name' argument, $^X should become "Now what?" (according to the
documentation).

 