Velocity Reviews

Velocity Reviews (http://www.velocityreviews.com/forums/index.php)
-   Perl Misc (http://www.velocityreviews.com/forums/f67-perl-misc.html)
-   -   Tabs versus Spaces in Source Code (http://www.velocityreviews.com/forums/t897976-tabs-versus-spaces-in-source-code.html)

Xah Lee 05-15-2006 02:04 AM

Tabs versus Spaces in Source Code
 
Tabs versus Spaces in Source Code

Xah Lee, 2006-05-13

In coding a computer program, there's often the choices of tabs or
spaces for code indentation. There is a large amount of confusion about
which is better. It has become what's known as “religious war” —
a heated fight over trivia. In this essay, i like to explain what is
the situation behind it, and which is proper.

Simply put, tabs is proper, and spaces are improper. Why? This may seem
ridiculously simple given the de facto ball of confusion: the semantics
of tabs is what indenting is about, while, using spaces to align code
is a hack.

Now, tech geekers may object this simple conclusion because they itch
to drivel about different editors and so on. The alleged problem
created by tabs as seen by the industry coders are caused by two
things: (1) tech geeker's sloppiness and lack of critical thinking
which lead them to not understanding the semantic purposes of tab and
space characters. (2) Due to the first reason, they have created and
propagated a massive none-understanding and mis-use, to the degree that
many tools (e.g. vi) does not deal with tabs well and using spaces to
align code has become widely practiced, so that in the end spaces seem
to be actually better by popularity and seeming simplicity.

In short, this is a phenomenon of misunderstanding begetting a snowball
of misunderstanding, such that it created a cultural milieu to embrace
this malpractice and kick what is true or proper. Situations like this
happens a lot in unix. For one non-unix example, is the file name's
suffix known as “extension”, where the code of file's type became
part of the file name. (e.g. “.txt”, “.html”, “.jpg”).
Another well-known example is HTML practices in the industry, where
badly designed tags from corporation's competitive greed, and stupid
coding and misunderstanding by coders and their tools are so
wide-spread such that they force the correct way to the side by the
eventual standardization caused by sheer quantity of inproper but set
practice.

Now, tech geekers may still object, that using tabs requires the
editors to set their positions, and plain files don't carry that
information. This is a good question, and the solution is to advance
the sciences such that your source code in some way embed such
information. This would be progress. However, this is never thought of
because the “unix philosophies” already conditioned people to hack
and be shallow. In this case, many will simply use the character
intended to separate words for the purpose of indentation or alignment,
and spread the practice with militant drivels.

Now, given the already messed up situation of the tabs vs spaces by the
unixers and unix brain-washing of the coders in the industry... Which
should we use today? I do not have a good proposition, other than just
use whichever that works for you but put more critical thinking into
things to prevent mishaps like this.

Tabs vs Spaces can be thought of as parameters vs hard-coded values, or
HTML vs ascii format, or XML/CSS vs HTML 4, or structural vs visual, or
semantic vs format. In these, it is always easy to convert from the
former to the latter, but near impossible from the latter to the
former. And, that is because the former encodes information that is
lost in the latter. If we look at the issue of tabs vs spaces, indeed,
it is easy to convert tabs to spaces in a source code, but more
difficult to convert from spaces to tabs. Because, tabs as indentation
actually contains the semantic information about indentation. With
spaces, this critical information is lost in space.

This issue is intimately related to another issue in source code:
soft-wrapped lines versus physical, hard-wrapped lines by EOL (end of
line character). This issue has far more consequences than tabs vs
spaces, and the unixer's unthinking has made far-reaching damages in
the computing industry. Due to unix's EOL ways of thinking, it has
created languages based on EOL (just about ALL languages except the
Lisp family and Mathematica) and tools based on EOL (cvs, diff, grep,
and basically every tool in unix), thoughts based on EOL (software
value estimation by counting EOL, hard-coded email quoting system by
“>” prefix, and silent line-truncations in many unix tools), such
that any progress or development towards a “algorithmic code unit”
concept or language syntaxes are suppressed. I have not written a full
account on this issue, but i've touched it in this essay: “The Harm
of hard-wrapping Lines”, at
http://xahlee.org/UnixResource_dir/writ/hard-wrap.html
----
This post is archived at:
http://xahlee.org/UnixResource_dir/w...vs_spaces.html

Xah
xah@xahlee.org
http://xahlee.org/


Eli Gottlieb 05-15-2006 02:44 AM

Re: Tabs versus Spaces in Source Code
 
Actually, spaces are better for indenting code. The exact amount of
space taken up by one space character will always (or at least tend to
be) the same, while every combination of keyboard driver, operating
system, text editor, content/file format, and character encoding all
change precisely what the tab key does.

There's no use in typing "tab" for indentation when my text editor will
simply convert it to three spaces, or worse, autoindent and mix tabs
with spaces so that I have no idea how many actual whitespace characters
of what kinds are really taking up all that whitespace. I admit it
doesn't usually matter, but then you go back to try and make your code
prettier and find yourself asking "WTF?"

Undoubtedly adding the second spark to the holy war,
Eli

--
The science of economics is the cleverest proof of free will yet
constructed.

Edward Elliott 05-15-2006 03:28 AM

Re: Tabs versus Spaces in Source Code
 
Eli Gottlieb wrote:

> Actually, spaces are better for indenting code. The exact amount of
> space taken up by one space character will always (or at least tend to
> be) the same, while every combination of keyboard driver, operating
> system, text editor, content/file format, and character encoding all
> change precisely what the tab key does.


What you see as tabs' weakness is their strength. They encode '1 level of
indentation', not a fixed width. Of course tabs are rendered differently
by different editors -- that's the point. If you like indentation to be 2
or 3 or 7 chars wide, you can view your preference without forcing it on
the rest of the world. It's a logical rather than a fixed encoding.


> There's no use in typing "tab" for indentation when my text editor will
> simply convert it to three spaces, or worse, autoindent and mix tabs
> with spaces so that I have no idea how many actual whitespace characters
> of what kinds are really taking up all that whitespace. I admit it
> doesn't usually matter, but then you go back to try and make your code
> prettier and find yourself asking "WTF?"


Sounds like the problem is your editor, not tabs. But I wouldn't rule out
PEBCAK either. ;)


> Undoubtedly adding the second spark to the holy war,


Undoubtedly. Let's keep it civil, shall we? And please limit the
cross-posting to a minimum. (directed at the group, not you personally
Eli).

--
Edward Elliott
UC Berkeley School of Law (Boalt Hall)
complangpython at eddeye dot net

jmcgill 05-15-2006 04:43 AM

Re: Tabs versus Spaces in Source Code
 

If I work on your project, I follow the coding and style standards you
specify.

Likewise if you work on my project you follow the established standards.

Fortunately for you, I am fairly liberal on such matters.

I like to see 4 spaces for indentation. If you use tabs, that's what I
will see, and you're very likely to have your code reformatted by the
automated build process, when the standard copyright header is pasted
and missing javadoc tags are generated as warnings.

I like the open brace to start on the line of the control keyword. I
can deal with the open brace being on the next line, at the same level
of indentation as the control keyword. I don't quite understand the
motivation behind the GNU style, where the brace itself is treated as a
half-indent, but I can live with it on *your* project.

Any whitespace or other style that isn't happy to be reformatted
automatically is an error anyway.

I'd be very laissez-faire about it except for the fact that code
repositories are much easier to manage if everything is formatted before
it goes in, or as a compromise, as a step at release tags.

mystilleef 05-15-2006 07:56 AM

Re: Tabs versus Spaces in Source Code
 
I agree, use tabs.


Mumia W. 05-15-2006 08:00 AM

Re: Tabs versus Spaces in Source Code
 
Xah Lee wrote:
> Tabs versus Spaces in Source Code
>
> Xah Lee, 2006-05-13
>
> In coding a computer program, there's often the choices of tabs or
> spaces for code indentation. There is a large amount of confusion about
> which is better. It has become what's known as “religious war” —
> a heated fight over trivia. In this essay, i like to explain what is
> the situation behind it, and which is proper.
>


Thanks Xah. I value your posts. Keep posting. And since your posts
usually cover broad areas of CS, keep crossposting. Don't go anywhere
Xah :-)


> Simply put, tabs is proper, and spaces are improper. Why? This may seem
> ridiculously simple given the de facto ball of confusion: the semantics
> of tabs is what indenting is about, while, using spaces to align code
> is a hack.
>


I wouldn't say that spaces are a hack, but tabs are superior.

> Now, tech geekers may object this simple conclusion because they itch
> to drivel about different editors and so on. The alleged problem
> created by tabs as seen by the industry coders are caused by two
> things: (1) tech geeker's sloppiness and lack of critical thinking
> which lead them to not understanding the semantic purposes of tab and
> space characters. (2) Due to the first reason, they have created and
> propagated a massive none-understanding and mis-use, to the degree that
> many tools (e.g. vi) does not deal with tabs well and using spaces to
> align code has become widely practiced, so that in the end spaces seem
> to be actually better by popularity and seeming simplicity.
>


Don't forget the laziness of programmers like me who don't put the
tabbing information in the source file. Vim deals with tabs well IMO,
but I almost never used to put the right auto-commands in the file to
get it set up right for other users.

> In short, this is a phenomenon of misunderstanding begetting a snowball
> of misunderstanding, such that it created a cultural milieu to embrace
> this malpractice and kick what is true or proper. Situations like this
> happens a lot in unix. For one non-unix example, is the file name's
> suffix known as “extension”, where the code of file's type became
> part of the file name. (e.g. “.txt”, “.html”, “.jpg”).
> Another well-known example is HTML practices in the industry, where
> badly designed tags from corporation's competitive greed, and stupid
> coding and misunderstanding by coders and their tools are so
> wide-spread such that they force the correct way to the side by the
> eventual standardization caused by sheer quantity of inproper but set
> practice.
>
> Now, tech geekers may still object, that using tabs requires the
> editors to set their positions, and plain files don't carry that
> information. This is a good question, and the solution is to advance
> the sciences such that your source code in some way embed such
> information.


Vim does this. We just have to use it.

> This would be progress. However, this is never thought of
> because the “unix philosophies” already conditioned people to hack
> and be shallow. In this case, many will simply use the character
> intended to separate words for the purpose of indentation or alignment,
> and spread the practice with militant drivels.
>
> Now, given the already messed up situation of the tabs vs spaces by the
> unixers and unix brain-washing of the coders in the industry... Which
> should we use today? I do not have a good proposition, other than just
> use whichever that works for you but put more critical thinking into
> things to prevent mishaps like this.
>
> Tabs vs Spaces can be thought of as parameters vs hard-coded values, or
> HTML vs ascii format, or XML/CSS vs HTML 4, or structural vs visual, or
> semantic vs format. In these, it is always easy to convert from the
> former to the latter, but near impossible from the latter to the
> former. And, that is because the former encodes information that is
> lost in the latter.


Nope. Conversion is relatively easy. I've written programs to do this
myself, and everyone and his brother has also done this. Virtually every
programmer's editor that I've ever used can do this, and a great, great
many independent programs convert tabs to spaces. It's like saying,
"it's near impossible to write a calculator program." :-)

I bet that someone has a Perl one-liner to do it.

On any Debian system, try a "man expand" and see what you find. Also,
emacs and vim do it. Perl has a Text::Tabs module. TCL's
::textutil::(un)?tabify routines do it. The birds do it, and the bees do
it. Oh wait, that's something else :-)

> If we look at the issue of tabs vs spaces, indeed,
> it is easy to convert tabs to spaces in a source code, but more
> difficult to convert from spaces to tabs.


Nope again. It's easy, you just keep track of the virtual character
position as you decide whether to write a space or a tab. Computers do
the "counting" thing fairly well.

> Because, tabs as indentation
> actually contains the semantic information about indentation. With
> spaces, this critical information is lost in space.
>
> This issue is intimately related to another issue in source code:
> soft-wrapped lines versus physical, hard-wrapped lines by EOL (end of
> line character). This issue has far more consequences than tabs vs
> spaces, and the unixer's unthinking has made far-reaching damages in
> the computing industry. Due to unix's EOL ways of thinking, it has
> created languages based on EOL (just about ALL languages except the
> Lisp family and Mathematica) and tools based on EOL (cvs, diff, grep,
> and basically every tool in unix), thoughts based on EOL (software
> value estimation by counting EOL, hard-coded email quoting system by
> “>” prefix, and silent line-truncations in many unix tools), such
> that any progress or development towards a “algorithmic code unit”
> concept or language syntaxes are suppressed. I have not written a full
> account on this issue, but i've touched it in this essay: “The Harm
> of hard-wrapping Lines”, at
> http://xahlee.org/UnixResource_dir/writ/hard-wrap.html
> ----
> This post is archived at:
> http://xahlee.org/UnixResource_dir/w...vs_spaces.html
>
> Xah
> xah@xahlee.org
> ∑ http://xahlee.org/
>


I've never thought of tabs-vs-spaces as a religious war. Anyway, the
authority of the programming environment will determine which one is
used. Have a good week Xah.

Big and Blue 05-15-2006 07:03 PM

Re: Tabs versus Spaces in Source Code
 
Edward Elliott wrote:

> What you see as tabs' weakness is their strength. They encode '1 level of
> indentation', not a fixed width. Of course tabs are rendered differently
> by different editors -- that's the point.


Yes - it is the point. It is the point why you should never use them.
The meaning of the tab is writer-dependent while the
representation/interpretation of a tab is reader-dependent. What you think
is lined up nicely will be mis-aligned for someone else.

If you wish to allow your meaning to be changed by any and every reader
then reconsider what you are writing.


--
Just because I've written it doesn't mean that
either you or I have to believe it.

Iain King 05-16-2006 09:39 AM

Re: Tabs versus Spaces in Source Code
 
Oh God, I agree with Xah Lee. Someone take me out behind the chemical
sheds...

Iain


Xah Lee wrote:
> Tabs versus Spaces in Source Code
>
> Xah Lee, 2006-05-13
>
> In coding a computer program, there's often the choices of tabs or
> spaces for code indentation. There is a large amount of confusion about
> which is better. It has become what's known as “religious war” —
> a heated fight over trivia. In this essay, i like to explain what is
> the situation behind it, and which is proper.
>
> Simply put, tabs is proper, and spaces are improper. Why? This may seem
> ridiculously simple given the de facto ball of confusion: the semantics
> of tabs is what indenting is about, while, using spaces to align code
> is a hack.
>
> Now, tech geekers may object this simple conclusion because they itch
> to drivel about different editors and so on. The alleged problem
> created by tabs as seen by the industry coders are caused by two
> things: (1) tech geeker's sloppiness and lack of critical thinking
> which lead them to not understanding the semantic purposes of tab and
> space characters. (2) Due to the first reason, they have created and
> propagated a massive none-understanding and mis-use, to the degree that
> many tools (e.g. vi) does not deal with tabs well and using spaces to
> align code has become widely practiced, so that in the end spaces seem
> to be actually better by popularity and seeming simplicity.
>
> In short, this is a phenomenon of misunderstanding begetting a snowball
> of misunderstanding, such that it created a cultural milieu to embrace
> this malpractice and kick what is true or proper. Situations like this
> happens a lot in unix. For one non-unix example, is the file name's
> suffix known as “extension”, where the code of file's type became
> part of the file name. (e.g. “.txt”, “.html”, “.jpg”).
> Another well-known example is HTML practices in the industry, where
> badly designed tags from corporation's competitive greed, and stupid
> coding and misunderstanding by coders and their tools are so
> wide-spread such that they force the correct way to the side by the
> eventual standardization caused by sheer quantity of inproper but set
> practice.
>
> Now, tech geekers may still object, that using tabs requires the
> editors to set their positions, and plain files don't carry that
> information. This is a good question, and the solution is to advance
> the sciences such that your source code in some way embed such
> information. This would be progress. However, this is never thought of
> because the “unix philosophies” already conditioned people to hack
> and be shallow. In this case, many will simply use the character
> intended to separate words for the purpose of indentation or alignment,
> and spread the practice with militant drivels.
>
> Now, given the already messed up situation of the tabs vs spaces by the
> unixers and unix brain-washing of the coders in the industry... Which
> should we use today? I do not have a good proposition, other than just
> use whichever that works for you but put more critical thinking into
> things to prevent mishaps like this.
>
> Tabs vs Spaces can be thought of as parameters vs hard-coded values, or
> HTML vs ascii format, or XML/CSS vs HTML 4, or structural vs visual, or
> semantic vs format. In these, it is always easy to convert from the
> former to the latter, but near impossible from the latter to the
> former. And, that is because the former encodes information that is
> lost in the latter. If we look at the issue of tabs vs spaces, indeed,
> it is easy to convert tabs to spaces in a source code, but more
> difficult to convert from spaces to tabs. Because, tabs as indentation
> actually contains the semantic information about indentation. With
> spaces, this critical information is lost in space.
>
> This issue is intimately related to another issue in source code:
> soft-wrapped lines versus physical, hard-wrapped lines by EOL (end of
> line character). This issue has far more consequences than tabs vs
> spaces, and the unixer's unthinking has made far-reaching damages in
> the computing industry. Due to unix's EOL ways of thinking, it has
> created languages based on EOL (just about ALL languages except the
> Lisp family and Mathematica) and tools based on EOL (cvs, diff, grep,
> and basically every tool in unix), thoughts based on EOL (software
> value estimation by counting EOL, hard-coded email quoting system by
> “>” prefix, and silent line-truncations in many unix tools), such
> that any progress or development towards a “algorithmic code unit”
> concept or language syntaxes are suppressed. I have not written a full
> account on this issue, but i've touched it in this essay: “The Harm
> of hard-wrapping Lines”, at
> http://xahlee.org/UnixResource_dir/writ/hard-wrap.html
> ----
> This post is archived at:
> http://xahlee.org/UnixResource_dir/w...vs_spaces.html
>
> Xah
> xah@xahlee.org
> ∑ http://xahlee.org/



numeromancer 05-16-2006 01:48 PM

Re: Tabs versus Spaces in Source Code
 
An old debate. My $0.02 :

http://numeromancer.dyndns.org/~timo...scription.html

The idea can be extended to other programming languages.

TS


opalpa@gmail.com opalinski from opalpaweb 05-16-2006 03:31 PM

Re: Tabs versus Spaces in Source Code
 
> Simply put, tabs is proper, and spaces are improper.
> Why? This may seem
> ridiculously simple given the de facto ball of confusion: the semantics
> of tabs is what indenting is about, while, using spaces to align code
> is a hack.


The reality of programming practice trumps original intent of tab
characters. The tab character and space character are pliable in that
if their use changes their semantics change.

> ... and the solution is to advance
> the sciences such that your source code in some way
> embed such information.


If/when time comes where such info is embeded perhaps then tabs will be
OK.

---------------------------------------------------------------

I use spaces because of the many sources I've opened I have many times
sighed on opening tabed ones and never done so opening spaced ones.

I don't get mad, but sighing is a clear indicator of negativity.
Anyway, the more code I write and read the less indentation matters to
me. My brain can now parse akward source correctly far bettter than it
did a few years ago.


All the best,
Opalinski
opalpa@gmail.com
http://www.geocities.com/opalpaweb/



All times are GMT. The time now is 05:42 AM.

Powered by vBulletin®. Copyright ©2000 - 2014, vBulletin Solutions, Inc.
SEO by vBSEO ©2010, Crawlability, Inc.