Velocity Reviews


Dan Bishop 06-23-2003 11:29 PM

Re: shebang strange thing...
 
mwilson@the-wire.com (Mel Wilson) wrote in message news:<yKH9+ks/KvvV089yn@the-wire.com>...
> In article <uZSIa.534$sF6.54902840@newssvr21.news.prodigy.com >,
> Van Gale <news@exultants.org> wrote:
> >Erik Max Francis wrote:
> >> Van Gale wrote:
> >>>There's a very subtle bug (feature?) in bash (and maybe other shells?)
> >>>that will generate this error if the line is terminated with a CR/LF
> >>>pair instead of just a linefeed.
> >> Yes, it's common to other shells. It's not a feature, but it's not a
> >> "bug" per se -- using CR LF terminated text files on Unix is operator
> >> error.

>
> >Well guess what, it happens. It even happens when the "operator" is
> >aware of the problem. So let's not call it a bug, and instead call it
> >poor programming because the error message is not only incorrect, but
> >will also waste a fair amount of time of someone trying to debug the
> >problem because it points them in the wrong direction.

>
> It's understandable once you realize that the shell
> thinks the '\r' is part of the filename. Just like
>
> os.execv ('/usr/bin/python\r', ('myfile.py',))


But how many people use \r at the end of filenames? Or are even aware
that they can?

Even if it isn't a bug, it's a feature that causes more harm than
good.

Ben Finney 06-24-2003 12:25 AM

Re: shebang strange thing...
 
On 23 Jun 2003 16:29:44 -0700, Dan Bishop wrote:
> mwilson@the-wire.com (Mel Wilson) wrote:
>> It's understandable once you realize that the shell thinks the '\r'
>> is part of the filename. Just like
>>
>> os.execv ('/usr/bin/python\r', ('myfile.py',))

>
> But how many people use \r at the end of filenames? Or are even aware
> that they can?
>
> Even if it isn't a bug, it's a feature that causes more harm than
> good.


You seem to be under the impression that this is some "process the \r at
the end of the filename" feature. It isn't. The kernel will treat
everything from the shebang to the linefeed as the command-line to be
used; there's no special "feature" specifically spotting a rogue
character and tripping the foolish user up.
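
The behaviour described above can be sketched in a few lines of Python.
This is only an illustration under a simplifying assumption: the
interpreter path is taken to be everything between "#!" and the first
linefeed, and optional interpreter arguments and length limits that real
kernels apply are ignored. The function name interpreter_from_shebang is
hypothetical.

```python
def interpreter_from_shebang(script_bytes):
    """Return the interpreter path as a simplified kernel would read it.

    Assumption: everything between '#!' and the first linefeed is taken
    literally; interpreter arguments are not handled in this sketch.
    """
    if not script_bytes.startswith(b"#!"):
        return None
    first_line = script_bytes[2:].split(b"\n", 1)[0]
    # Leading blanks are skipped, but a trailing carriage return is just
    # another byte, so it stays in the path.
    return first_line.lstrip(b" \t")


unix_script = b"#!/usr/bin/python\nprint('hi')\n"
dos_script = b"#!/usr/bin/python\r\nprint('hi')\r\n"

print(interpreter_from_shebang(unix_script))  # b'/usr/bin/python'
print(interpreter_from_shebang(dos_script))   # b'/usr/bin/python\r'
```

With the CR LF file, the path that gets looked up ends in '\r', which is
why the resulting "not found" error looks so puzzling.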

This is simple, known, documented behaviour. If other systems place
foreign characters in the shebang line, it's up to the *user* to know
that; the kernel does what it's told. I certainly don't want the kernel
having special-case, workaround code for line-ending confusions that are
nothing to do with it.

When writing shell scripts, there are many things to learn; line endings
are but one of them. When moving files between operating systems, there
are many things to learn; the differences in line endings are but one of
them.

It's not the job of the kernel to protect the user from herself. That's
the job of userspace programs -- or meatspace processes :-)

--
\ "A man may be a fool and not know it -- but not if he is |
`\ married." -- Henry L. Mencken |
_o__) |
http://bignose.squidly.org/ 9CFE12B0 791A4267 887F520C B7AC2E51 BD41714B

Erik Max Francis 06-24-2003 12:53 AM

Re: shebang strange thing...
 
Dan Bishop wrote:

> But how many people use \r at the end of filenames? Or are even aware
> that they can?
>
> Even if it isn't a bug, it's a feature that causes more harm than
> good.


It's simply an end-of-line issue. The "bug" here is that DOS chose to
use CR LF as the end-of-line terminator. (Mac gets even fewer points,
since it chose to deliberately do something even more different.)
This has nothing to do with Unix, it's an inherent difference between
platforms. The platforms are not the same; if you pretend like they are
then you'll continually run into problems.
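
Since the difference is purely one of end-of-line bytes, a script that
has crossed platforms can be checked and converted before it is run. A
minimal sketch, assuming the whole file fits in memory; the helper names
has_crlf_shebang and to_unix_line_endings are made up for this example.

```python
def has_crlf_shebang(data):
    """True if the file starts with '#!' and its first line ends in CR LF."""
    first_line = data.split(b"\n", 1)[0]
    return data.startswith(b"#!") and first_line.endswith(b"\r")


def to_unix_line_endings(data):
    """Convert CR LF pairs first, then any remaining lone CR, to LF."""
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


dos_script = b"#!/usr/bin/python\r\nprint('hi')\r\n"

print(has_crlf_shebang(dos_script))                        # True
print(has_crlf_shebang(to_unix_line_endings(dos_script)))  # False
```

The order of the two replacements matters: handling the lone CR first
would turn every CR LF pair into two linefeeds.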

--
Erik Max Francis && max@alcyone.com && http://www.alcyone.com/max/
__ San Jose, CA, USA && 37 20 N 121 53 W && &tSftDotIotE
/ \ I go out with actresses because I'm not apt to marry one.
\__/ Henry Kissinger

Greg Ewing (using news.cis.dfn.de) 06-25-2003 03:05 AM

Re: shebang strange thing...
 
Erik Max Francis wrote:
> (Mac gets even fewer points,
> since it chose to deliberately do something even more different.)


That's debatable. At least the Mac still only uses *one*
character for each end-of-line...

--
Greg Ewing, Computer Science Dept,
University of Canterbury,
Christchurch, New Zealand
http://www.cosc.canterbury.ac.nz/~greg


Erik Max Francis 06-25-2003 05:31 AM

Re: shebang strange thing...
 
"Greg Ewing (using news.cis.dfn.de)" wrote:

> Erik Max Francis wrote:
>
> > (Mac gets even fewer points,
> > since it chose to deliberately do something even more different.)

>
> That's debatable. At least the Mac still only uses *one*
> character for each end-of-line...


At the time the Mac was created, line endings of LF (Unix) and CR LF
(CP/M, DOS) were common. The only reason you'd choose CR is to do
something different. Their implementation of data forks vs. resource
forks is another example of Apple doing something different simply for
its own sake, which introduces massive interplatform compatibility
problems.

--
Erik Max Francis && max@alcyone.com && http://www.alcyone.com/max/
__ San Jose, CA, USA && 37 20 N 121 53 W && &tSftDotIotE
/ \ The doors of Heaven and Hell are adjacent and identical.
\__/ Nikos Kazantzakis

Bengt Richter 06-25-2003 05:10 PM

Re: shebang strange thing...
 
On Tue, 24 Jun 2003 22:31:22 -0700, Erik Max Francis <max@alcyone.com> wrote:

>"Greg Ewing (using news.cis.dfn.de)" wrote:
>
>> Erik Max Francis wrote:
>>
>> > (Mac gets even fewer points,
>> > since it chose to deliberately do something even more different.)

>>
>> That's debatable. At least the Mac still only uses *one*
>> character for each end-of-line...

>
>At the time the Mac was created, line endings of LF (Unix) and CR LF
>(CP/M, DOS) were common. The only reason you'd choose CR is to do
>something different. Their implementation of data forks vs. resource
>forks is another example of Apple doing something different simply for
>its own sake, which introduces massive interplatform compatibility
>problems.

I'm not sure it's entirely fair to jump to the conclusion that Apple chose CR
as line end only to be different.

My interpretation is that going from CRLF to a single character signalled
a switch from hardware-control semantics to symbolic semantics. I.e., CR and LF
originally literally referred to printing hardware with a physical "carriage" like
a typewriter or -- keeping paper handling stationary -- a moving print head,
and line feeding actually fed paper a line at a time. They're still used when
dealing with devices and emulated/simulated devices, of course.

The CR is what you get when you hit the Enter key, so Apple did the most direct thing
in using that key code as an EOL symbol. Perhaps they thought that was "cleaner" and
that they would lead the way to a cleaner standard way of doing things when they
achieved market dominance ;-)

Of course, LF has (IMO) a better semantic relationship to the EOL meaning, so translating
the Enter key to LF seems better than CR. Either way, ISTM yet another symptom of what happens
when the hardware-oriented evolves towards the abstract. You wind up with vestigial hardware
semantics in abstract contexts where they don't really belong, e.g., as in one of my pet peeves:
drive letters in file paths.

Regards,
Bengt Richter

Michael Coleman 06-25-2003 10:34 PM

Re: shebang strange thing...
 
brian_l@yahoo.com (Brian Lenihan) writes:
> If you have two or more Python installations, the first one in
> your path gets invoked no matter what the shebang line says.
>
> If the first line of a script is #!/usr/local/bin/python, I expect the
> interpreter located in /usr/local/bin to execute the script, not the
> one in /usr/bin, or the one in /sw/bin, but that is what you get if
> you run the script as an executable.
>
> The process list shows why - python is called without a path, e.g. as
> "python". The same behavior occurs if the shell is bash or tcsh.
> As far as I know, OS X is the only "modern" Unix to behave this way.


Tru64 (5.1) also shows this behavior (which recently bit me too), but
it's arguably a bug in Python rather than in the OS. If you look
carefully, I think you'll find that the correct binary (e.g.,
/usr/local/bin/python) is in fact being invoked, but that that binary
then uses the libraries associated with the first python in your PATH.
The reason this is happening is that python determines where all of
its libraries live by examining argv[0], if a more suitable method is
not available. If this gives the full path, everything is fine, but
if only the basename is given ("python"), then the startup code walks
the PATH to guess. As you've noticed, in some cases, this guess is
wrong.
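
The guess described above can be imitated in Python. This is a rough
simulation, not the interpreter's actual startup code: the function
guess_executable and the directory names are invented for the example,
and it only checks that a candidate file exists, where a real search
would also check that it is executable.

```python
import os
import tempfile


def guess_executable(argv0, path_dirs):
    """Guess the full path of the running program from argv[0]."""
    if os.sep in argv0:
        # A path was given, so no guessing is needed.
        return argv0
    for directory in path_dirs:
        candidate = os.path.join(directory, argv0)
        if os.path.isfile(candidate):
            # First match wins, whichever binary is really running.
            return candidate
    return None


with tempfile.TemporaryDirectory() as root:
    usr_bin = os.path.join(root, "usr_bin")
    usr_local_bin = os.path.join(root, "usr_local_bin")
    for directory in (usr_bin, usr_local_bin):
        os.mkdir(directory)
        open(os.path.join(directory, "python"), "w").close()

    search_path = [usr_bin, usr_local_bin]
    wanted = os.path.join(usr_local_bin, "python")

    # Full path in argv[0]: the answer is right.
    print(guess_executable(wanted, search_path) == wanted)    # True
    # Bare basename: the first directory on the PATH wins instead.
    print(guess_executable("python", search_path) == wanted)  # False
```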

Mike

--
Mike Coleman, Scientific Programmer, +1 816 926 4419
Stowers Institute for Biomedical Research
1000 E. 50th St., Kansas City, MO 64110

Brian Lenihan 06-26-2003 09:34 AM

Re: shebang strange thing...
 
Michael Coleman <mkc@stowers-institute.org> wrote in message news:<85el1hpz0c.fsf@stowers-institute.org>...
> brian_l@yahoo.com (Brian Lenihan) writes:
> > If you have two or more Python installations, the first one in
> > your path gets invoked no matter what the shebang line says.
> >
> > If the first line of a script is #!/usr/local/bin/python, I expect the
> > interpreter located in /usr/local/bin to execute the script, not the
> > one in /usr/bin, or the one in /sw/bin, but that is what you get if
> > you run the script as an executable.
> >
> > The process list shows why - python is called without a path, e.g. as
> > "python". The same behavior occurs if the shell is bash or tcsh.
> > As far as I know, OS X is the only "modern" Unix to behave this way.

>
> Tru64 (5.1) also shows this behavior (which recently bit me too), but
> it's arguably a bug in Python rather than in the OS. If you look
> carefully, I think you'll find that the correct binary (e.g.,
> /usr/local/bin/python) is in fact being invoked, but that that binary
> then uses the libraries associated with the first python in your PATH.
> The reason this is happening is that python determines where all of
> its libraries live by examining argv[0], if a more suitable method is
> not available. If this gives the full path, everything is fine, but
> if only the basename is given ("python"), then the startup code walks
> the PATH to guess. As you've noticed, in some cases, this guess is
> wrong.


I never use Tru64, but my company does, so I'm glad you identified
the same problem on that platform. argv[0] should contain the full
path to the interpreter and it does not, which makes me believe
this is an OS error, not a Python error, except you could argue that
relying on argv is not a platform independent way to find the
correct path.

If I could get the Panther install CD to boot on my PowerBook, I
could see if this is still going to be a problem in the future.

- 06-26-2003 07:25 PM

Re: shebang strange thing...
 
bokr@oz.net (Bengt Richter) wrote in message news:<bdcl1d$v43$0@216.39.172.122>...
> The CR is what you get when you hit the Enter key, so Apple did the most direct thing
> in using that key code as an EOL symbol. Perhaps they thought that was "cleaner" and
> that they would lead the way to a cleaner standard way of doing things when they
> achieved market dominance ;-)
>


I think the terminology is not taken from typewriters, but from some
old printers where you needed both characters to start a new line.

CR moved the print head to the beginning of the line and LF moved the
paper one line. It can't be compared with a typewriter, where the
[Enter] key did both operations. The Microsoft way (other operating
systems also had a similar EOL) is actually the "correct" way, since
the "cursor" needs to move down one line and start at the beginning.
The Unix way is of course more elegant, because you have a digital
computer and not some mechanical device. It doesn't matter if it's CR
or LF, because each character does only half of the operation. Apple
should have chosen LF to preserve compatibility.
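
To make the "half of the operation" point concrete, here is a toy model
of such a printer in Python. It is a sketch under obvious
simplifications: the name render_teletype and the fixed line width are
invented, and printing over an existing character simply replaces it,
where real paper would show both.

```python
def render_teletype(stream, width=20):
    """Print characters the way a device with separate CR and LF would."""
    rows = [[" "] * width]
    row, col = 0, 0
    for ch in stream:
        if ch == "\r":
            # Carriage return: back to column 0, same line.
            col = 0
        elif ch == "\n":
            # Line feed: down one line, same column.
            row += 1
            if row == len(rows):
                rows.append([" "] * width)
        elif col < width:
            rows[row][col] = ch
            col += 1
    return ["".join(r).rstrip() for r in rows]


print(render_teletype("ab\r\ncd"))  # ['ab', 'cd']
print(render_teletype("ab\ncd"))    # ['ab', '  cd']
print(render_teletype("ab\rcd"))    # ['cd']
```

Only the CR LF pair gives a clean new line on this model; LF alone
produces a staircase and CR alone prints over the same line.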

Ben Finney 06-26-2003 11:43 PM

Re: shebang strange thing...
 
On 26 Jun 2003 12:25:55 -0700, - wrote:
> I think the terminology is not taken from typewriters, but from some
> old printers where you needed both characters to start a new line.
>
> CR moved the print head to the beginning of the line


CR stands for "carriage return". If you're talking about a print head
moving across the paper, you're no longer talking about a carriage
"returning", so the terminology obviously didn't come from electric
printers.

Carriage Return is a direct reference to the paper carriage on a manual
typewriter. These predate electric printing machines, and thus the
terminology was borrowed when teletypes needed control codes to control
their print head.

On such typewriters, the "line feed" function was also separate; once
the carriage was returned to the start of the line, one could cause
the paper to feed up a line at a time to introduce more vertical space;
this didn't affect the position of the paper carriage, so was
conceptually a separate operation.

So, it was teletypes that needlessly preserved the CR and LF as separate
control operations, due to the typewriter-based thinking of their
designers. If they'd been combined into the one operation, we would
have all the same functionality but none of the confusion over line
ending controls.

--
\ "I installed a skylight in my apartment. The people who live |
`\ above me are furious!" -- Steven Wright |
_o__) |
http://bignose.squidly.org/ 9CFE12B0 791A4267 887F520C B7AC2E51 BD41714B


All times are GMT.