You're appointed as Portability Advisor

 
 
Tomás Ó hÉilidhe
 
      02-15-2008

We frequently discuss portability here. I think most of us would agree
that for a program to be portable, the following two criteria must be
met:
1) None of its behaviour is undefined by the Standard.
2) Any behaviour which is unspecified or implementation-defined does not
interfere with the intended behaviour of the program.

When laid out in black and white like that, these rules are quite clear.

However, let's consider this: Let's say you're appointed as the
Portability Advisor for a multi-national company that makes billions of
dollars each year. They pay you $500,000 a year, they have you working 30
hours a week and they give you 60 days paid holiday leave per year. They
don't block newsgroups, and their firewall only blocks the most
offensive of sites. They even get a Santa for the kids at the Christmas
party.

Your job is to screen the code that other programmers in the company
write. Every couple of days there's a fresh upload of code to the
network drive, and your job is to scan thru the code and point out and
alter anything that's not portable. Of course tho, you're given a
context in which to judge the code, for instance:
a) This code must run on everything from a hedge-trimmer to an iPod, to
a Playstation 3,
b) This code must run on all the well-known desktop PCs.

Depending on the context, you judge some code more harshly than others. For
instance, in context B, you might allow assumptions that there's an even
number of bits in a byte, also that integer types don't contain padding.
While in context A, you might fire that code right back if you see such
assumptions.

So... it's Thursday morning, you sit down to your desk with a hot cup of
tea and a fig-roll bar, you check your mail. You surf the web for a
couple of minutes, perhaps check the latest scores in the election, or
look up where you can get a new electric-window switch for your car
since it mysteriously stopped working this morning.

You get down to it. You open up the network drive and navigate to James
Weir's source file. Its context is "run on anything". You're looking
thru it and you come to the following section of code:

typedef union ConfigFlags {
unsigned entire; /* Write to all bytes
char unsigned bytes[sizeof(unsigned)]; at once or access
} ConfigFlags; them individually
*/

int IsRemoteAdminEnabled(ConfigFlags const cf)
{
    return cf.bytes[3] & 0x3u;
}

You look at this code and you think, "Hmm, this chap plans to write to
'entire' and then subsequently read individual bytes by using the
'bytes' member of the structure". You have a second suspicion that
perhaps James might have made assumptions about the size of "unsigned",
but inspecting the code you find that he hasn't.

Now, the question is, in the real world, at 10:13am on a sunny Thursday
morning, sitting at your desk with a hot cup of tea, munching away on a
fig-roll bar getting small crumbs between the keys on the keyboard, are
you really going to reject this code?

You're sitting there 100% aware that the Standard explicitly forbids you
to write to member A of a union and then read from member B, but how
much do you care?

Later on in the code, you come to:

double tangents[5];
...
double *p = tangents;
double const *const pend = *(&tangents + 1);

Again, you look at this code and you think to yourself this really is
quite a neat way of achieving what he wants. Again, you know that the
Standard in all its marvelous rigidity doesn't want you to dereference
that pointer, but are you bothered? Are you, as the Portability Advisor,
going to reject this code?

What I'm trying to get across is, that, while we may discuss in black
and white what the Standard permits and what it forbids... are we really
going to be so obtuse as to reject this code in the real world? Are we
really going to reject some code for a reason that we see as stupidly
restrictive in the language's definition?

Perhaps it might be useful to point out what exactly can go wrong when
we're treading on a particular rule. In both these cases I've mentioned,
I don't think anything can go wrong, not naturally anyway. What _can_
cause problems tho is aspects of the compiler:
1) Over-zealous with its optimisation
2) Deliberately putting in checks (such as terminating the program when
it thinks you're going to access memory out-of-bounds).

The first thought I think comes to everyone's mind when we're talking
about these unnecessarily rigid rules, is that the Standard just needs
to be neatly amended. But, of course, it's C we're talking about, where
the current standard is from 1989 and where still not too many people
are paying attention to the 1999 standard that came out nine years ago.

So, I wonder, what can we do? If there was a consensus among many of
the world's most skilled and experienced C programmers that a certain
rule in the Standard were unnecessarily rigid, would it not be worth the
compiler vendors' while to listen? Here at comp.lang.c, there are,
without exaggeration, some of the world's best C programmers. Instead of
contacting each and every compiler vendor to let them know that we'd
prefer to optimise-away assignments to union members, would it be
convenient, both for the programmers and the compiler vendors, to have a
single place to go to to read what the world's best programmers think?
Should we have a webpage that lists the common coding techniques that
skilled programmers use, but which are officially forbidden or "a grey
area" in the Standard?

Two such rules I myself would put on the list are:
1) Accessing different union members
2) De-referencing a pointer to an array

--
Tomás Ó hÉilidhe
 
 
 
 
 
Tomás Ó hÉilidhe
 
      02-15-2008
Tomás Ó hÉilidhe:

> typedef union ConfigFlags {
> unsigned entire; /* Write to all bytes
> char unsigned bytes[sizeof(unsigned)]; at once or access
> } ConfigFlags; them individually



I meant for that to come out as:

typedef union ConfigFlags {
unsigned entire;
char unsigned bytes[sizeof(unsigned)];
} ConfigFlags;


--
Tomás Ó hÉilidhe
 
 
 
 
 
Malcolm McLean
 
      02-15-2008
"Tomás Ó hÉilidhe" <(E-Mail Removed)> wrote in message
> We frequently discuss portability here. I think most of us would agree
> that for a program to be portable, the following two criteria must be
> met:
> 1) None of its behaviour is undefined by the Standard.
> 2) Any behaviour which is unspecified or implementation-defined does not
> interfere with the intended behaviour of the program.
>

There's such a thing as code which is "reasonably portable". The standard
has to guess about what sort of hardware will be available in the future, as
well as what relics of the past will still be around in a few years' time
and which can be ignored.
Then C cannot go the Java route of mandating portability at the cost of
runtime. Even Java broke its own rules because floating point arithmetic was
too slow to standardise.

However, as portability expert, your job is to be strict. For instance I
thought that, surely, slash slash comments were standard by now. No, my MPI
compiler won't accept them.

--
Free games and programming goodies.
http://www.personal.leeds.ac.uk/~bgy1mm

 
 
Keith Thompson
 
      02-15-2008
"Tomás Ó hÉilidhe" <(E-Mail Removed)> writes:
[...]
> Later on in the code, you come to:
>
> double tangents[5];
> ...
> double *p = tangents;
> double const *const pend = *(&tangents + 1);
>
> Again, you look at this code and you think to yourself this really is
> quite a neat way of achieving what he wants. Again, you know that the
> Standard in all its marvelous rigidity doesn't want you to dereference
> that pointer, but are you bothered? Are you, as the Portability Advisor,
> going to reject this code?

[...]

If I haven't been reading comp.lang.c in the last few days, I spend a
few moments wondering what the heck that code is trying to do. Then I
step through it, and once I figure out what it does, I wonder why the
author wrote it that way, especially when there's a clearer and
unambiguously legal way to do the same thing:

double const *const pend = tangents + 5;

Of course 5 is a magic number, so either a named constant should be
used both for the array length and for the offset, or a macro should
be used to compute the length. For example:

#include <stdio.h>

/* Number of elements in a true array (not a pointer). */
#define ARRAY_LENGTH(a) (sizeof (a) / sizeof *(a))

int main(void)
{
    double tangents[] = { 1.2, 2.3, 3.4, 4.5, 5.6 };
    double const *const pbegin = tangents; /* just for symmetry */
    double const *const pend = tangents + ARRAY_LENGTH(tangents);
    double const *iter;

    for (iter = pbegin; iter < pend; iter++) {
        printf("%g\n", *iter);
    }
    return 0;
}

Note that a very similar approach can be used when we don't have a
declared array object, but just a pointer to its first element and its
length. There's no good way to apply the ``*(&tangents + 1)''
approach if you don't have the array object itself.

#include <stdio.h>

/* Number of elements in a true array (not a pointer). */
#define ARRAY_LENGTH(a) (sizeof (a) / sizeof *(a))

void show_elements(double *array, size_t count)
{
    double const *const pbegin = array;
    double const *const pend = array + count;
    double const *iter;

    for (iter = pbegin; iter < pend; iter++) {
        printf("%g\n", *iter);
    }
}

int main(void)
{
    double tangents[] = { 1.2, 2.3, 3.4, 4.5, 5.6 };
    show_elements(tangents, ARRAY_LENGTH(tangents));
    return 0;
}

Even leaving portability concerns aside, I find the latter approach
easier to read, easier to use, and easier to think about.

--
Keith Thompson (The_Other_Keith) <(E-Mail Removed)>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
 
 
Richard Heathfield
 
      02-15-2008
Tomás Ó hÉilidhe said:

<snip>

> You get down to it. You open up the network drive and navigate to James
> Weir's source file. Its context is "run on anything". You're looking
> thru it and you come to the following section of code:
>
> typedef union ConfigFlags {
> unsigned entire; /* Write to all bytes
> char unsigned bytes[sizeof(unsigned)]; at once or access
> } ConfigFlags; them individually
> */
>
> int IsRemoteAdminEnabled(ConfigFlags const cf)
> {
> return cf.bytes[3] & 0x3u;
> }


This won't even run on MS-DOS, let alone "anything".

> You look at this code and you think, "Hmm, this chap plans to write to
> 'entire' and then subsequently read individual bytes by using the
> 'bytes' member of the structure".


No, I look at this code and I think, "someone has assumed that unsigned
ints are at least four bytes wide", which isn't true on typical MS-DOS
systems (yes, they're still used, believe it or not), isn't true on
various DSPs, and came within a gnat's whisker of being true on at least
one Cray.

> You have a second suspicion that
> perhaps James might have made assumptions about the size of "unsigned",
> but inspecting the code you find that he hasn't.


Yes, he has.

>
> Now, the question is, in the real world, at 10:13am on a sunny Thursday
> morning, sitting at your desk with a hot cup of tea, munching away on a
> fig-roll bar getting small crumbs between the keys on the keyboard, are
> you really going to reject this code?


Absolutely, yes.

> You're sitting there 100% aware that the Standard explicitly forbids you
> to write to member A of a union and then read from member B, but how
> much do you care?


No, I'm sitting there 100% aware that unsigned ints need not be four bytes
wide.
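
To illustrate the point, the size assumption behind `bytes[3]` can at least be made explicit instead of silent. What follows is only a minimal sketch: the names `assert_unsigned_has_4_bytes` and `config_get_byte` are hypothetical and do not come from the code under review. The typedef fails to compile on any implementation where `unsigned` has fewer than four bytes, and the accessor refuses an out-of-range index at run time.

```c
#include <assert.h>
#include <stddef.h>

/* Union mirroring the one under review. */
typedef union ConfigFlags {
    unsigned entire;
    unsigned char bytes[sizeof(unsigned)];
} ConfigFlags;

/* C89-style compile-time check (hypothetical name): the array size
   becomes -1, a constraint violation, if unsigned has fewer than
   four bytes. */
typedef char assert_unsigned_has_4_bytes[sizeof(unsigned) >= 4 ? 1 : -1];

/* Hypothetical accessor: stores the requested byte in *out and
   returns 1, or returns 0 if the index is out of range. */
int config_get_byte(const ConfigFlags *cf, size_t index, unsigned char *out)
{
    assert(cf != NULL && out != NULL);
    if (index >= sizeof cf->bytes)
        return 0;
    *out = cf->bytes[index];
    return 1;
}
```

A check like this only documents the assumption; it does not make the code run on a 16-bit `unsigned`, which is why the code would still be sent back in the "run on anything" context.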

> Later on in the code, you come to:
>
> double tangents[5];
> ...
> double *p = tangents;
> double const *const pend = *(&tangents + 1);
>
> Again, you look at this code and you think to yourself this really is
> quite a neat way of achieving what he wants.


No, I sit here thinking "why is he being so dumb as to dereference an
object that does not exist?".

> Again, you know that the
> Standard in all its marvelous rigidity doesn't want you to dereference
> that pointer, but are you bothered? Are you, as the Portability Advisor,
> going to reject this code?


Yes, of course. That's what they pay me for, right? "your job is to scan
thru the code and point out and alter anything that's not portable" - and
my yardstick for portability is the C Standard.

> What I'm trying to get across is, that, while we may discuss in black
> and white what the Standard permits and what it forbids... are we really
> going to be so obtuse as to reject this code in the real world?


No, we're really going to be so acute as to reject this code in the real
world.

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
 
 
Richard Heathfield
 
      02-15-2008
Richard Heathfield said:

<snip>

> No, I look at this code and I think, "someone has assumed that unsigned
> ints are at least four bytes wide", which isn't true on typical MS-DOS
> systems (yes, they're still used, believe it or not), isn't true on
> various DSPs, and came within a gnat's whisker of


not

> being true on at least one Cray.



(The Cray implementation in question very nearly defined CHAR_BIT as 64,
but they changed their mind, apparently quite late in the game. Had they
not changed their mind, sizeof(unsigned) would have been 1 on that
implementation.)

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
 
 
Kaz Kylheku
 
      02-15-2008
On Feb 14, 4:05 pm, "Tomás Ó hÉilidhe" <(E-Mail Removed)> wrote:
> We frequently discuss portability here. I think most of us would agree
> that for a program to be portable, the following two criteria must be
> met:
> 1) None of its behaviour is undefined by the Standard.


Depends on which standard you choose. According to the C one,

#include <fcntl.h>

is undefined behavior. As a portability advisor to real people working
in real trenches, you have to allow for extensions.

What we want to aim for is not avoiding extensions, but rather
/knowing/ what the documented extensions are and using them
deliberately.

> You're sitting there 100% aware that the Standard explicitly forbids you
> to write to member A of a union and then read from member B, but how
> much do you care?


You can write to member A and read through member B, if member B is a
character type that treats the thing as an array of bytes.
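
As a sketch of what character-type access looks like without any union at all, the object representation of an `unsigned` can be copied out with `memcpy` and later copied back. The function names `unsigned_to_bytes` and `unsigned_from_bytes` are hypothetical, chosen only for this illustration. The round trip is only meaningful on the same implementation, because the byte order and any padding bits are not specified by the language.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copies the object representation of value into out.
   Returns the number of bytes written, or 0 if out is too small. */
size_t unsigned_to_bytes(unsigned value, unsigned char *out, size_t out_size)
{
    assert(out != NULL);
    if (out_size < sizeof value)
        return 0;
    memcpy(out, &value, sizeof value);
    return sizeof value;
}

/* Rebuilds an unsigned from bytes produced by unsigned_to_bytes
   on the same implementation. */
unsigned unsigned_from_bytes(const unsigned char *in)
{
    unsigned value;

    assert(in != NULL);
    memcpy(&value, in, sizeof value);
    return value;
}
```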

I would simply remove the member "entire" from the union, and change
it to a struct. Then see what breaks when you recompile the program,
and fix those occurrences. The struct itself gives you the entire
thing. You can define objects of that struct type, pass them to
functions, return them, assign them, etc.

> Later on in the code, you come to:
>
>     double tangents[5];
>     ...
>     double *p = tangents;
>     double const *const pend = *(&tangents + 1);
>
> Again, you look at this code and you think to yourself this really is
> quite a neat way of achieving what he wants.


This can simply be edited to:

double *pend = tangents + sizeof tangents / sizeof tangents[0];

Or maybe the size should be a manifest constant somewhere:

double tangents[vector_size];
const double *const pend = &tangents[vector_size];

*(&tangents + 1) stops being neat once you get out of programming
puberty. Stuff like that looked neat when I was learning C for the
first time. It's not neat; it's just cryptic B. S.

99% of the C programmers out there have probably never seen an array
type manipulated as an array type---addresses being taken to make
pointer-to-array types, etc. Their heads will do a double, triple or
even quadruple ``take'' when they see that expression.

> Are you, as the Portability Advisor, going to reject this code?


Of course! There is no need for it to be doing what it's trying to do
by that means. There is no need to rely on an undocumented extension
of behavior to get the desired effect.

> What I'm trying to get across is, that, while we may discuss in black
> and white what the Standard permits and what it forbids... are we really
> going to be so obtuse as to reject this code in the real world?


Absolutely. At /least/ that obtuse, if not way more.

And don't forget that we have a $500K salary as portability advisor,
and so we must scramble for every little thing that can add a line or
two to our weekly status report, and that can make us look very sharp
and justify our job position in the eyes of senior management.

There is always that!

> Are we
> really going to reject some code for a reason that we see as stupidly
> restrictive in the language's definition?


It's not stupidly restrictive when you can rewrite the expression in a
way that will not surprise most of the programmers out there, and that
doesn't break any rules.

A restriction is something that actually gets in your way; it makes
something impossible to do at all, or maybe only with an
unsatisfactory workaround.

> Perhaps it might be useful to point out what exactly can go wrong when
> we're treading on a particular rule. In both these cases I've mentioned,
> I don't think anything can go wrong, not naturally anyway. What _can_
> cause problems tho is aspects of the compiler:
> 1) Over-zealous with its optimisation
> 2) Deliberately putting in checks (such as terminating the program when
> it thinks you're going to access memory out-of-bounds).


There is

3) Wasting people's time when they have to scratch their heads about
what *(&array + 1) actually means and whether or not it's right.

> Should we have a webpage that lists the common coding techniques that
> skilled programmers use, but which are officially forbidden or "a grey
> area" in the Standard?
>
> Two such rules I myself would put on the list are:
> 1) Accessing different union members


Definitely not; count me out from your webpage.

The purpose of a union is to save space in implementing polymorphism.
If you store member X, you read member X.

Type punning is inherently nonportable. It's not enough to say that
type punning is allowed through members of a union, but undefined
elsewhere. To define its behavior, it's not enough to simply permit
some action. The outcome of the action must be specified. And you
cannot do that because it's totally nonportable.

At best you could say that if a member Y is accessed after member X is
stored, then there shall be no aliasing problem: Y will be
reconstituted out of the bits that were actually stored through X.
However, that's not anywhere nearly complete a definition of behavior
to be practically useful. That kind of thing belongs in the
architecture-specific pages of a compiler reference manual, not in the
language.

> 2) De-referencing a pointer to an array


But that is allowed. I think you mean, dereferencing a pointer to one
element past the end of an array-of-array object, where it's not
pointing to any array.

I tend to agree with this.

That is to say, the address-of operator could have some additional
semantic rules, along these lines:

When the operand of the address-of operator is a pointer-dereferencing
expression based on the unary * operator, the two operators effectively
cancel each other out, so that &*(E) is equivalent to (E), provided
that (E) is a valid pointer: either a pointer to an object, a null
pointer, or a pointer one element past the end of an array object.

I can't think of a way of allowing (E)->member or (*(E)).member to be
defined when E is null, or otherwise valid but not pointing to an
object. Is there a use for this other than implementing offsetof,
which is already done for you?
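
For reference, C99 (6.5.3.2p3) already says something close to this: when the operand of & is the result of unary * or of [], neither operator is evaluated, so `&array[count]` behaves like `array + count`. A minimal sketch under that C99 reading follows; the helper name `sum_elements` is hypothetical. Note that this does not cover the `*(&tangents + 1)` case, where no & is applied to the result of the dereference.

```c
#include <assert.h>
#include <stddef.h>

/* Sums count elements, forming the end pointer with &array[count].
   Under C99 6.5.3.2p3 this is the same as array + count, so the
   one-past-the-end element is never actually accessed. */
double sum_elements(const double *array, size_t count)
{
    const double *const pend = &array[count];
    const double *iter;
    double total = 0.0;

    assert(array != NULL);
    for (iter = array; iter != pend; ++iter)
        total += *iter;
    return total;
}
```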
 
 
Ark Khasin
 
      02-15-2008
Tomás Ó hÉilidhe wrote:
> We frequently discuss portability here. I think most of us would agree
> that for a program to be portable, the following two criteria must be
> met:
> 1) None of its behaviour is undefined by the Standard.
> 2) Any behaviour which is unspecified or implementation-defined does not
> interfere with the intended behaviour of the program.
>
> When laid out in black and white like that, these rules are quite clear.
>
> However, let's consider this: Let's say you're appointed as the
> Portability Advisor for a multi-national company that makes billions of
> dollars each year. They pay you $500,000 a year, they have you working 30
> hours a week and they give you 60 days paid holiday leave per year. They
> don't block newsgroups, and their firewall only blocks the most
> offensive of sites. They even get a Santa for the kids at the Christmas
> party.
>
> Your job is to screen the code that other programmers in the company
> write. Every couple of days there's a fresh upload of code to the
> network drive, and your job is to scan thru the code and point out and
> alter anything that's not portable. Of course tho, you're given a
> context in which to judge the code, for instance:
> a) This code must run on everything from a hedge-trimmer to an iPod, to
> a Playstation 3,
> b) This code must run on all the well-known desktop PCs.
>
> Depending on the context, you judge some code more harshly than others. For
> instance, in context B, you might allow assumptions that there's an even
> number of bits in a byte, also that integer types don't contain padding.
> While in context A, you might fire that code right back if you see such
> assumptions.
>
> So... it's Thursday morning, you sit down to your desk with a hot cup of
> tea and a fig-roll bar, you check your mail. You surf the web for a
> couple of minutes, perhaps check the latest scores in the election, or
> look up where you can get a new electric-window switch for your car
> since it mysteriously stopped working this morning.
>
> You get down to it. You open up the network drive and navigate to James
> Weir's source file. Its context is "run on anything". You're looking
> thru it and you come to the following section of code:
>
> typedef union ConfigFlags {
> unsigned entire; /* Write to all bytes
> char unsigned bytes[sizeof(unsigned)]; at once or access
> } ConfigFlags; them individually
> */
>
> int IsRemoteAdminEnabled(ConfigFlags const cf)
> {
> return cf.bytes[3] & 0x3u;
> }
>
> You look at this code and you think, "Hmm, this chap plans to write to
> 'entire' and then subsequently read individual bytes by using the
> 'bytes' member of the structure". You have a second suspicion that
> perhaps James might have made assumptions about the size of "unsigned",
> but inspecting the code you find that he hasn't.
>
> Now, the question is, in the real world, at 10:13am on a sunny Thursday
> morning, sitting at your desk with a hot cup of tea, munching away on a
> fig-roll bar getting small crumbs between the keys on the keyboard, are
> you really going to reject this code?
>
> You're sitting there 100% aware that the Standard explicitly forbids you
> to write to member A of a union and then read from member B, but how
> much do you care?
>

IMHO, the only reason to write portable code is to count on it being
used on various platforms and maintained either centrally or
collaboratively. Such code must be optimized for clarity and avoid any
discernible window of misunderstanding.

I've seen passages e.g. like this:
typedef struct foo {.......} foo;
foo *foo; ........; sizeof(foo);
I don't even bother to learn whether or how the (a?) standard (or a
dialect) resolves foo for the purpose of sizeof. Like in a spoken
language, there are many more correct constructs that make no or
ambiguous sense to us mortals.

To that end, MISRA is a great effort to define what intelligent C coders
may say in a polite society. IMHO, it's an overkill in many respects but
it's a great starting point.

In your union example, the result obviously depends on machine
endianness. Depending on the exact access patterns this dependency may
cancel itself out, but it's an unreasonable burden on the maintainer to
verify it throughout. Thus code like this should be banned from the
portable club.
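
One way to take endianness out of the picture is to test the bits by value rather than by byte position. This is a sketch only, and it assumes (hypothetically, since the thread does not say) that the intended remote-admin bits are bits 24 and 25 of the flag word, which is what `bytes[3] & 0x3u` would select on a little-endian machine with 8-bit bytes. It uses `unsigned long` because that type is guaranteed to have at least 32 bits. The names are illustrative.

```c
/* Hypothetical layout: the remote-admin field occupies bits 24..25. */
#define REMOTE_ADMIN_SHIFT 24
#define REMOTE_ADMIN_MASK  0x3UL

/* Returns 1 if either remote-admin bit is set, else 0.
   Shifts operate on the value, so the result does not depend on
   how the bytes happen to be laid out in memory. */
int is_remote_admin_enabled(unsigned long flags)
{
    return ((flags >> REMOTE_ADMIN_SHIFT) & REMOTE_ADMIN_MASK) != 0;
}
```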

> Later on in the code, you come to:
>
> double tangents[5];
> ...
> double *p = tangents;
> double const *const pend = *(&tangents + 1);
>
> Again, you look at this code and you think to yourself this really is
> quite a neat way of achieving what he wants. Again, you know that the
> Standard in all its marvelous rigidity doesn't want you to dereference
> that pointer, but are you bothered? Are you, as the Portability Advisor,
> going to reject this code?
>
> What I'm trying to get across is, that, while we may discuss in black
> and white what the Standard permits and what it forbids... are we really
> going to be so obtuse as to reject this code in the real world? Are we
> really going to reject some code for a reason that we see as stupidly
> restrictive in the language's definition?

Yes! Respectable people here in yesterday's thread read the language
rules on this very subject differently. It means that the behavior is
not crystal clear. [And it might not be crystal clear to the compiler
writers either.]

>
> Perhaps it might be useful to point out what exactly can go wrong when
> we're treading on a particular rule. In both these cases I've mentioned,
> I don't think anything can go wrong, not naturally anyway. What _can_
> cause problems tho is aspects of the compiler:
> 1) Over-zealous with its optimisation
> 2) Deliberately putting in checks (such as terminating the program when
> it thinks you're going to access memory out-of-bounds).
>
> The first thought I think comes to everyone's mind when we're talking
> about these unnecessarily rigid rules, is that the Standard just needs
> to be neatly amended. But, of course, it's C we're talking about, where
> the current standard is from 1989 and where still not too many people
> are paying attention to the 1999 standard that came out nine years ago.
>
> So, I wonder, what can we do? If there was a consensus among many of
> the world's most skilled and experienced C programmers that a certain
> rule in the Standard were unnecessarily rigid, would it not be worth the
> compiler vendors' while to listen? Here at comp.lang.c, there are,
> without exaggeration, some of the world's best C programmers. Instead of
> contacting each and every compiler vendor to let them know that we'd
> prefer to optimise-away assignments to union members, would it be
> convenient, both for the programmers and the compiler vendors, to have a
> single place to go to to read what the world's best programmers think?
> Should we have a webpage that lists the common coding techniques that
> skilled programmers use, but which are officially forbidden or "a grey
> area" in the Standard?
>
> Two such rules I myself would put on the list are:
> 1) Accessing different union members
> 2) De-referencing a pointer to an array
>

Some people cobble together a compiler for, I'd say, a C-inspired
language just to sell their chip. If your portability requirements
include trimmers and such, you may step into this swamp. The standard
may not apply very well there.

If we favor simple constructs over "look what I can do!" we may get not
only more portable but also more efficient code, because the compiler's
optimizer may recognize more idioms. E.g. a dumb rotation of an unsigned
32-bit `a' left by `n' bits, (a<<n)|(a>>(32-n)) is translated by ARM ADS
into one rotate instruction; any attempt to get clever produces worse code.
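
One caveat about the expression as written: for n equal to 0 it shifts right by 32, which is undefined for a 32-bit operand. A minimal sketch that keeps the same simple shape while masking the counts is given below; the name `rotl32` is hypothetical, `uint32_t` assumes a C99 `<stdint.h>`, and whether a particular compiler still turns this into a single rotate instruction is not something this sketch can promise.

```c
#include <stdint.h>

/* Rotates a left by n bits. Both shift counts are reduced modulo 32,
   so n == 0 and n >= 32 never produce an out-of-range shift. */
uint32_t rotl32(uint32_t a, unsigned n)
{
    n &= 31u;
    return (uint32_t)((a << n) | (a >> ((32u - n) & 31u)));
}
```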

OTOH, if you have a choice, you shop for a good compiler first; that
depends on how many "look what I can do!" you need to keep in the code.

Just my $0.02F

--
Ark
 
 
Ben Bacarisse
 
      02-15-2008
"Tomás Ó hÉilidhe" <(E-Mail Removed)> writes:

> Your job is to screen the code that other programmers in the company
> write.

<snip>
> You're looking
> thru it and you come to the following section of code:
>
> typedef union ConfigFlags {
> unsigned entire; /* Write to all bytes
> char unsigned bytes[sizeof(unsigned)]; at once or access
> } ConfigFlags; them individually
> */
>
> int IsRemoteAdminEnabled(ConfigFlags const cf)
> {
> return cf.bytes[3] & 0x3u;
> }
>
> You look at this code and you think, "Hmm, this chap plans to write to
> 'entire' and then subsequently read individual bytes by using the
> 'bytes' member of the structure". You have a second suspicion that
> perhaps James might have made assumptions about the size of "unsigned",
> but inspecting the code you find that he hasn't.


Already pointed out that the code does assume that sizeof(unsigned) >= 4.

> Now, the question is, in the real world, at 10:13am on a sunny Thursday
> morning, sitting at your desk with a hot cup of tea, munching away on a
> fig-roll bar getting small crumbs between the keys on the keyboard, are
> you really going to reject this code?
>
> You're sitting there 100% aware that the Standard explicitly forbids you
> to write to member A of a union and then read from member B, but how
> much do you care?


It does not. Not as far as I can see, anyway. It says the result is
"unspecified" with an informative footnote to tell us what range of
unspecified behaviour to expect. Basically we are referred to the
Representation of Types section so, as Portability Tsar, I am quite
happy with the type punning aspect of the code...

*But* I'd reject it, even if we could assume that sizeof(unsigned) >=
4 because one part of that unspecified behaviour is that the
resulting configuration file won't move between systems. If the
config file is on an NFS server, the bits in cf.bytes[3] will depend
on the target architecture the program was compiled for.

This is a penalty that /might/ be worth paying, but not if the
alternative is as simple as writing a 4-byte array.
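
A sketch of what that alternative might look like: the flags are stored in the file as four bytes in a fixed order, so the file reads the same regardless of the architecture that wrote it. The function names and the choice of least-significant-byte-first order are assumptions made for illustration only, as is the use of 8-bit fields per byte.

```c
#include <assert.h>
#include <stddef.h>

/* Stores the low 32 bits of flags in out, least significant byte
   first, independent of the host's byte order. */
void flags_to_bytes(unsigned long flags, unsigned char out[4])
{
    assert(out != NULL);
    out[0] = (unsigned char)(flags & 0xFFUL);
    out[1] = (unsigned char)((flags >> 8) & 0xFFUL);
    out[2] = (unsigned char)((flags >> 16) & 0xFFUL);
    out[3] = (unsigned char)((flags >> 24) & 0xFFUL);
}

/* Rebuilds the flag word from the four stored bytes. */
unsigned long flags_from_bytes(const unsigned char in[4])
{
    assert(in != NULL);
    return (unsigned long)in[0]
         | ((unsigned long)in[1] << 8)
         | ((unsigned long)in[2] << 16)
         | ((unsigned long)in[3] << 24);
}
```

With this layout, the fourth stored byte always holds bits 24 to 31 of the flag word, whichever machine produced the file.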

> Later on in the code, you come to:
>
> double tangents[5];
> ...
> double *p = tangents;
> double const *const pend = *(&tangents + 1);
>
> Again, you look at this code and you think to yourself this really is
> quite a neat way of achieving what he wants. Again, you know that the
> Standard in all its marvelous rigidity doesn't want you to dereference
> that pointer, but are you bothered? Are you, as the Portability Advisor,
> going to reject this code?


Yes. This is a clear-cut case. The behaviour is undefined by the
standard. I have posted opinions that suggest I'd like it not to be,
and I am not 100% persuaded that there was any practical reason for
making it so -- but it is. As we speak, compiler writers are tuning
their optimisers, safe in the knowledge that they can do anything they
like with this code. I would not want my product to be in their
hands.

Again, if the payoff is huge, and the alternatives costly, then a case
could be made, but there are too many alternatives here. At the very
least, (void *)(&tangents + 1) has a clearer meaning than above and is
well-defined.
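
A small sketch of the cast-based form next to the plain `tangents + N` form, for comparison. The function name `end_pointers_agree` and the constant `TANGENT_COUNT` are hypothetical. The sketch assumes, as stated above, that converting the one-past-the-end pointer-to-array to `void *` and then to `const double *` is well-defined; neither pointer is ever dereferenced.

```c
#define TANGENT_COUNT 5

/* Forms the end pointer two ways and reports whether they compare
   equal. No one-past-the-end object is accessed. */
int end_pointers_agree(void)
{
    double tangents[TANGENT_COUNT] = { 0 };
    const double *const pend_cast = (void *)(&tangents + 1);
    const double *const pend_plain = tangents + TANGENT_COUNT;

    return pend_cast == pend_plain;
}
```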

> What I'm trying to get across is, that, while we may discuss in black
> and white what the Standard permits and what it forbids... are we really
> going to be so obtuse as to reject this code in the real world?


How is it obtuse to stick to the standard where practical? Both your
examples have potential risks attached and few benefits. There is
nothing obtuse about avoiding these risks.

> Are we
> really going to reject some code for a reason that we see as stupidly
> restrictive in the language's definition?
>
> Perhaps it might be useful to point out what exactly can go wrong when
> we're treading on a particular rule. In both these cases I've mentioned,
> I don't think anything can go wrong, not naturally anyway. What _can_
> cause problems tho is aspects of the compiler:
> 1) Over-zealous with its optimisation


I don't think it can be over-zealous if it does not break a correct
program. This is the whole point. If you stick to the letter of the
law you can't be banged up!

> 2) Deliberately putting in checks (such as terminating the program when
> it thinks you're going to access memory out-of-bounds).
>
> The first thought I think comes to everyone's mind when we're talking
> about these unnecessarily rigid rules, is that the Standard just needs
> to be neatly amended. But, of course, it's C we're talking about, where
> the current standard is from 1989 and where still not too many people
> are paying attention to the 1999 standard that came out nine years ago.
>
> So, I wonder, what can we do? If there was a consensus between many of
> the world's most skilled and experienced C programmers that a certain
> rule in the Standard were unnecessarily rigid, would it not be worth the
> compiler vendors' while to listen? Here at comp.lang.c, there are,
> without exaggeration, some of the world's best C programmers. Instead of
> contacting each and every compiler vendor to let them know that we'd
> prefer to optimise-away assignments to union members, would it be
> convenient, both for the programmers and the compiler vendors, to have a
> single place to go to to read what the world's best programmers think?
> Should we have a webpage that lists the common coding techniques that
> skilled programmers use, but which are officially forbidden or "a grey
> area" in the Standard?
>
> Two such rules I myself would put on the list are:
> 1) Accessing different union members


OK as it stands, I think, but often a portability nightmare for
practical reasons due to differing representations.
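
A small sketch of what "differing representations" means in practice. The union and function names are hypothetical, and the sketch assumes an implementation that provides uint32_t and has 8-bit bytes. Reading the byte member after storing the word gives an answer that depends on byte order, while extracting the byte arithmetically does not.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical union overlaying a 32-bit word with its bytes. */
union word_bytes {
    uint32_t      word;
    unsigned char bytes[4];
};

/* Returns the first byte in memory of 0x12345678: 0x78 on a
   little-endian machine, 0x12 on a big-endian one. */
unsigned char first_byte_via_union(void)
{
    union word_bytes u;
    /* Assumption of this sketch: 8-bit bytes, so the members overlay exactly. */
    assert(sizeof u.word == sizeof u.bytes);
    u.word = UINT32_C(0x12345678);
    return u.bytes[0];
}

/* Extracts the least significant 8 bits by arithmetic; the result does
   not depend on how the word is laid out in memory. */
unsigned char low_byte_portable(uint32_t word)
{
    return (unsigned char)(word & 0xFFu);
}
```

The union version may well be acceptable on a single known platform. The arithmetic version is the one that survives a move to a machine with the opposite byte order.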

> 2) De-referencing a pointer to an array


You mean de-referencing a "one past the end" array pointer (for want
of more felicitous wording). I'd be happy if this was allowed in C0x,
but I'd live with any of the alternatives if it were not.

The best example of how this has happened in the past is the
now-sanctioned "struct hack", which C99 legitimised as the flexible
array member.
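
For readers who have not met it, here is a minimal sketch of the C99 form. The type and function names are invented for illustration, and the size calculation ignores overflow for very large counts.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical record with a C99 flexible array member. Pre-C99 code
   typically declared values[1] and over-allocated, which was the hack. */
struct samples {
    size_t count;
    double values[];
};

/* Allocates room for the header plus count trailing doubles.
   Returns NULL on allocation failure; the caller frees with free(). */
struct samples *samples_create(size_t count)
{
    struct samples *s = malloc(sizeof *s + count * sizeof s->values[0]);
    if (s == NULL)
        return NULL;
    s->count = count;
    for (size_t i = 0; i < count; i++)
        s->values[i] = 0.0;
    return s;
}

/* Stores v at index i; the index must be within the allocated count. */
void samples_set(struct samples *s, size_t i, double v)
{
    assert(i < s->count);
    s->values[i] = v;
}
```

A caller would write something like samples_create(3), check the result for NULL, use samples_set, and release the block with free().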

--
Ben.
 
 
Keith Thompson
Guest
Posts: n/a
 
      02-15-2008
Ark Khasin <(E-Mail Removed)> writes:
[...]
> IMHO, the only reason to write portable code is to count on it being
> used on various platforms and maintained either centrally or
> collaboratively. Such code must be optimized for clarity and avoid any
> discernible window of misunderstanding.


IMHO, that's hardly the *only* reason to write portable code. There
are benefits even if the code will never be run on more than one
platform. Most of the time, portable code is simpler and clearer than
code that depends on implementation-specific features. (Not all the
time, just most of the time.)

> I've seen passages e.g. like this:
> typedef struct foo {.......} foo;
> foo *foo; ........; sizeof(foo);
> I don't even bother to learn whether or how the (a?) standard (or a
> dialect) resolves foo for the purpose of sizeof. Like in a spoken
> language, there are many more correct constructs that make no or
> ambiguous sense to us mortals.


That particular construct is illegal, unless the typedef is declared
in an outer scope and the object in an inner one. In that case, the
declaration is legal, and the "foo" in sizeof(foo) refers to the
innermost declaration (the object) -- but the object declaration hides
the typedef, which is a lousy idea. For example:

    typedef struct foo { struct foo *foo; } foo;
    {
        foo *foo; /* legal, alas */
        foo *bar; /* illegal, since the typedef name is hidden */
    }

That's just a minor quibble, though; I agree that there are plenty of
things you can legally do that you nevertheless shouldn't. The
correct answer to "What does this code do?" is often "It gets rejected
at the code review.".
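
To make the sizeof question concrete, here is a compilable sketch of the scoping rule described above. The function names and the payload member are additions for illustration only; the payload is there so that the two sizes visibly differ.

```c
#include <assert.h>
#include <stddef.h>

/* The payload member is not in the original example; it makes the
   struct larger than a pointer so the two results can be told apart. */
typedef struct foo { struct foo *foo; int payload[16]; } foo;

/* Inside this function an object named foo hides the typedef name. */
size_t inner_sizeof_foo(void)
{
    foo *foo = NULL;                      /* legal, alas */
    (void)foo;
    /* From here on, foo names the pointer object, not the type. */
    assert(sizeof(foo) == sizeof(struct foo *));
    return sizeof(foo);
}

/* No object hides the name here, so foo is still the typedef. */
size_t outer_sizeof_foo(void)
{
    return sizeof(foo);
}
```

The two functions return different values for what looks like the same expression, which is a fair illustration of why such code tends not to survive review.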

[snip]

--
Keith Thompson (The_Other_Keith) <(E-Mail Removed)>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
 