[half OT] About the not-in-common range of signed and unsigned char

 
 
Francesco S. Carta
Guest
Posts: n/a
 
      07-13-2010
Hi there,
when I create some less-than-trivial console program that involves some
kind of pseudo-graphic interface I resort to using the glyphs that lie
in the range [-128, -1] - the simple "char" type is signed in my
implementation.

You know, all those single/double borders, corners, crosses,
pseudo-shadow (dithered) boxes and so on.

Since those characters mess up the encoding of my files, I cannot put
them straight into the source code as char literals, so I have to
hard-code their numeric values.

I noticed that, at least on my implementation, it doesn't make any
difference if I assign a negative value to an unsigned char - the
expected glyph shows up correctly - hence I think I wouldn't have to
worry if the same code is run on an implementation where char is unsigned.
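
To make this concrete, here's a minimal sketch of what I currently do
(the glyph values assume my console's CP437-style code page - nothing
about them is guaranteed by the language):

    // Sketch: hard-coded glyph values for a double-line box corner/bar.
    // 201 (0xC9) and 205 (0xCD) are CP437 codes - an assumption, not a
    // guarantee; on an 8-bit char, -55 and -51 wrap to the same values.
    #include <iostream>

    int main()
    {
        unsigned char top_left = -55;  // converts to 201 (0xC9)
        unsigned char h_bar    = -51;  // converts to 205 (0xCD)
        std::cout << top_left << h_bar << h_bar << '\n';
    }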

My questions:

- what assumptions (if any) can I make about the presence of those
out-of-common-range characters and their (correct) correspondence with
the codes I use to hard-code?

- assuming it is possible to, how can I ensure that my program displays
the correct "graphics" regardless of the platform / implementation it is
compiled onto?

Note: resorting to an external library that "does the stuff for me" is
not an option here, I'm asking in order to learn, not just to solve an
issue.

Thank you for your attention.

--
FSC - http://userscripts.org/scripts/show/59948
http://fscode.altervista.org - http://sardinias.com
 
 
 
 
 
Francesco S. Carta
Guest
Posts: n/a
 
      07-13-2010
Victor Bazarov <(E-Mail Removed)>, on 13/07/2010 19:13:13, wrote:

> On 7/13/2010 7:01 PM, Francesco S. Carta wrote:
>> Hi there,
>> when I create some less-than-trivial console program that involves some
>> kind of pseudo-graphic interface I resort to using the glyphs that lie
>> in the range [-128, -1] - the simple "char" type is signed in my
>> implementation.
>>
>> You know, all those single/double borders, corners, crosses,
>> pseudo-shadow (dithered) boxes and so on.
>>
>> Since those characters mess up the encoding of my files, I cannot put
>> them straight into the source code as char literals, so I have to
>> hard-code their numeric values.
>>
>> I noticed that, at least on my implementation, it doesn't make any
>> difference if I assign a negative value to an unsigned char - the
>> expected glyph shows up correctly - hence I think I wouldn't have to
>> worry if the same code is run on an implementation where char is
>> unsigned.
>>
>> My questions:
>>
>> - what assumptions (if any) can I make about the presence of those
>> out-of-common-range characters and their (correct) correspondence with
>> the codes I use to hard-code?

>
> You need to ask this in the newsgroup for your OS and/or your terminal
> because those things are hardware- and platform-specific. Those
> characters are not part of the basic character set, C++ knows nothing
> about them.
>
>> - assuming it is possible to, how can I ensure that my program displays
>> the correct "graphics" regardless of the platform / implementation it is
>> compiled onto?

>
> There is no way.
>
>> Note: resorting to an external library that "does the stuff for me" is
>> not an option here, I'm asking in order to learn, not just to solve an
>> issue.

>
> <shrug> Whatever.


I'm sorry if my post disturbed you: I explicitly marked it as "[half
OT]" and I posted it here for a reason, which should be evident.

Nonetheless, thank you for your reply, Victor - that's just what I was
looking for: the confirmation that I cannot portably resort to those
graphics, so that I'll avoid struggling for something that isn't
achievable - this is "learning", for me.

--
FSC - http://userscripts.org/scripts/show/59948
http://fscode.altervista.org - http://sardinias.com
 
 
 
 
 
Francesco S. Carta
Guest
Posts: n/a
 
      07-14-2010
Victor Bazarov <(E-Mail Removed)>, on 13/07/2010 19:48:07, wrote:

> On 7/13/2010 7:22 PM, Francesco S. Carta wrote:
>> Victor Bazarov <(E-Mail Removed)>, on 13/07/2010 19:13:13, wrote:
>>
>>> On 7/13/2010 7:01 PM, Francesco S. Carta wrote:
>>>> <snip>
>>>>
>>>> My questions:
>>>>
>>>> - what assumptions (if any) can I make about the presence of those
>>>> out-of-common-range characters and their (correct) correspondence with
>>>> the codes I use to hard-code?
>>>
>>> You need to ask this in the newsgroup for your OS and/or your terminal
>>> because those things are hardware- and platform-specific. Those
>>> characters are not part of the basic character set, C++ knows nothing
>>> about them.
>>>
>>>> - assuming it is possible to, how can I ensure that my program displays
>>>> the correct "graphics" regardless of the platform / implementation
>>>> it is
>>>> compiled onto?
>>>
>>> There is no way.
>>>
>>>> Note: resorting to an external library that "does the stuff for me" is
>>>> not an option here, I'm asking in order to learn, not just to solve an
>>>> issue.
>>>
>>> <shrug> Whatever.

>>
>> I'm sorry if my post disturbed you: I explicitly marked it as "[half
>> OT]" and I posted it here for a reason, which should be evident.

>
> It didn't disturb me. I am sorry you thought I did (why did you think
> that?).


Your last line above ("<shrug> Whatever.") made me think that the whole
post disturbed or at least annoyed you. I'm glad to discover that I
misinterpreted your post.

> And the only reason evident to me is that you asked a valid
> question on C++. What other reason would one need?


That was a "combined" reply, addressing both my misinterpretation of
your post /and/ the fact that you pointed me to another group. The
reason for posting it here is exactly the one you noted: it's about C++
- even though it was likely to be a platform-specific issue - "half
OT", as I said.

>> Nonetheless, thank you for your reply, Victor - that's just what I was
>> looking for: the confirmation that I cannot portably resort to those
>> graphics, so that I'll avoid struggling for something that isn't
>> achievable - this is "learning", for me.

>
> Well, you seemed to post when you already knew the answer (although I
> can still be mistaken). You either need to use somebody else's library
> (which will represent an abstraction layer for you, and behind the
> scenes its code is platform-specific, regardless of what language it is
> implemented in) or implement that functionality yourself, essentially
> reproducing the same library.


Technically no, I didn't "know" the answer, I just suspected it, hence I
asked for confirmation (although I didn't express my question as such).

Although it is true that I could have just relied on my understanding
of the standard, I was also hoping to get a "real life" reply along the
lines of "on Windows and Linux you're pretty much safe assuming those
characters [are|aren't] available and [have|haven't] the same values,
I've tried [this] and [that], and [that other] gave me problems, YMMV,
do some tests".

[ besides: the threads here happen to see people dropping in with
not-strictly-related comments which are precious, at times, because they
lead me to investigate new things - posting stuff like this is (also)
another chance to see that kind of "lateral" follow-up ]

Thank you for your clarification and for the further details.

--
FSC - http://userscripts.org/scripts/show/59948
http://fscode.altervista.org - http://sardinias.com
 
 
Jonathan Lee
Guest
Posts: n/a
 
      07-14-2010
On Jul 13, 7:01 pm, "Francesco S. Carta" <(E-Mail Removed)> wrote:
> - what assumptions (if any) can I make about the presence of those
> out-of-common-range characters and their (correct) correspondence with
> the codes I use to hard-code?


Signed-to-unsigned conversion is well-defined in [conv.integral]. If
you're storing these numbers in (signed) chars as negatives, they'll
predictably be changed to unsigned char. You should be okay so long
as CHAR_BIT is appropriate.

For example, suppose you have signed char c = -41, and are going to
cast this to char. If char is signed, no problem. If char is unsigned
then the result is (1 << CHAR_BIT) - 41. Suppose CHAR_BIT is 8: then
the result is 215. If CHAR_BIT is 9, you'll get 471. The former will
probably be the same character as -41 in whatever extended ASCII you're
using. The latter, probably not. So I guess you could have an #if to
watch this.
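
Something like this sketch shows what I mean (the #if is the guard I
mentioned; <climits> provides CHAR_BIT):

    // Sketch: the signed -> unsigned conversion is modular per
    // [conv.integral], so -41 lands on a predictable value.
    #include <climits>
    #include <iostream>

    #if CHAR_BIT != 8
    #error "glyph codes below assume 8-bit chars"
    #endif

    int main()
    {
        signed char c = -41;
        unsigned char u = static_cast<unsigned char>(c);  // -41 + 256
        std::cout << static_cast<int>(u) << '\n';         // prints 215
    }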

Of course, there are different versions of extended ASCII, and even
non-ASCII sets, so -41 isn't really guaranteed to be anything in
particular. But you can know the result of converting to unsigned,
whereas conversion from unsigned to signed is implementation-defined.
I guess that's my point.

> - assuming it is possible to, how can I ensure that my program displays
> the correct "graphics" regardless of the platform / implementation it is
> compiled onto?


If those characters were guaranteed to be present in _some_ order,
it might be conceivable. But they're not. How could you display
"filled in square" on a platform that doesn't have such a character?

--Jonathan
 
 
Francesco S. Carta
Guest
Posts: n/a
 
      07-14-2010
Jonathan Lee <(E-Mail Removed)>, on 13/07/2010 18:33:22, wrote:

> On Jul 13, 7:01 pm, "Francesco S. Carta" <(E-Mail Removed)> wrote:
>> - what assumptions (if any) can I make about the presence of those
>> out-of-common-range characters and their (correct) correspondence with
>> the codes I use to hard-code?

>
> Signed-to-unsigned conversion is well-defined in [conv.integral]. If
> you're storing these numbers in (signed) chars as negatives, they'll
> predictably be changed to unsigned char. You should be okay so long
> as CHAR_BIT is appropriate.
>
> For example, suppose you have signed char c = -41, and are going to
> cast this to char. If char is signed, no problem. If char is unsigned
> then the result is (1 << CHAR_BIT) - 41. Suppose CHAR_BIT is 8: then
> the result is 215. If CHAR_BIT is 9, you'll get 471. The former will
> probably be the same character as -41 in whatever extended ASCII
> you're using. The latter, probably not. So I guess you could have an
> #if to watch this.
>
> Of course, there are different versions of extended ASCII, and even
> non-ASCII sets, so -41 isn't really guaranteed to be anything in
> particular. But you can know the result of converting to unsigned,
> whereas conversion from unsigned to signed is implementation-defined.
> I guess that's my point.


I didn't consider that CHAR_BIT problem at all, thank you for pointing
it out, Jonathan.

I think I'd work around this by checking whether plain char is signed
or not, and filling the table with the appropriate values accordingly -
so that I avoid signed/unsigned conversions completely.
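
Something along these lines - just a sketch, with CP437 codes as
placeholder values:

    // Sketch: build each glyph value according to plain char's
    // signedness, so no signed/unsigned conversion ever takes place.
    #include <iostream>
    #include <limits>

    char glyph(int code)  // code in [128, 255], e.g. a CP437 value
    {
        return std::numeric_limits<char>::is_signed
                   ? static_cast<char>(code - 256)  // e.g. 201 -> -55
                   : static_cast<char>(code);
    }

    int main()
    {
        std::cout << glyph(201) << '\n';  // 0xC9, "double" corner in CP437
    }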

>> - assuming it is possible to, how can I ensure that my program displays
>> the correct "graphics" regardless of the platform / implementation it is
>> compiled onto?

>
> If those characters were guaranteed to be present in _some_ order,
> it might be conceivable. But they're not. How could you display
> "filled in square" on a platform that doesn't have such a character?


I think I've discovered my true point: I'm interested in a subset of:

http://en.wikipedia.org/wiki/Code_page_437

which, as it seems, "is still the primary font in the core of any EGA
and VGA compatible graphic card".

If I decide to spend some effort on making a portable program that
uses them, I'd have to find a way to activate that code page or
something comparable, as explained in:

http://en.wikipedia.org/wiki/Box_drawing_characters

and resort to acceptable replacements (such as \, /, |, - and +) in case
none of the above is available.

In this way the program could be considered "portable" enough - at
least for me.
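
A sketch of the fallback idea (the structure, the names and the CP437
codes are mine, just to illustrate - the detection itself would be
platform-specific):

    // Sketch: one glyph set per capability level, chosen at startup.
    #include <iostream>

    struct BoxGlyphs {
        unsigned char top_left, horizontal, vertical;
    };

    const BoxGlyphs cp437_set = { 0xC9, 0xCD, 0xBA };  // double-line glyphs
    const BoxGlyphs ascii_set = { '+', '-', '|' };     // always available

    BoxGlyphs pick_glyphs(bool fancy_glyphs_available)
    {
        return fancy_glyphs_available ? cp437_set : ascii_set;
    }

    int main()
    {
        BoxGlyphs g = pick_glyphs(false);  // ASCII fallback
        std::cout << g.top_left << g.horizontal << '\n';
    }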

Thanks a lot for your attention.

--
FSC - http://userscripts.org/scripts/show/59948
http://fscode.altervista.org - http://sardinias.com
 
 
James Kanze
Guest
Posts: n/a
 
      07-14-2010
On Jul 14, 10:27 am, "Francesco S. Carta" <(E-Mail Removed)> wrote:
> Jonathan Lee <(E-Mail Removed)>, on 13/07/2010 18:33:22, wrote:
> > On Jul 13, 7:01 pm, "Francesco S. Carta" <(E-Mail Removed)> wrote:
> >> - what assumptions (if any) can I make about the presence
> >> of those out-of-common-range characters and their (correct)
> >> correspondence with the codes I use to hard-code?


> > Signed-to-unsigned conversion is well-defined in [conv.integral]. If
> > you're storing these numbers in (signed) chars as negatives, they'll
> > predictably be changed to unsigned char. You should be okay so long
> > as CHAR_BIT is appropriate.


He needs CHAR_BIT to be at least 8, which is guaranteed.

In practice, I'd use the positive (actually defined) values, and
not some negative mapping, even if char is signed.

> > For example, suppose you have signed char c = -41, and are
> > going to cast this to char. If char is signed, no problem.
> > If char is unsigned then the result is (1 << CHAR_BIT) - 41.
> > Suppose CHAR_BIT is 8: then the result is 215. If CHAR_BIT
> > is 9, you'll get 471. The former will probably be the same
> > character as -41 in whatever extended ASCII you're using.
> > The latter, probably not. So I guess you could have an #if
> > to watch this.


I'd use 0xD7, rather than -41. Formally, the conversion of this
value to char, if chars are signed, is implementation-defined, but
practically, doing anything but preserving the bit pattern would break
so much code that it isn't going to happen.
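
A sketch of what that looks like (0xD7 being just the example value
from above):

    // Sketch: hard-code the positive, actually defined code and
    // convert once; in practice the bit pattern is preserved even
    // where plain char is signed (formally implementation-defined).
    #include <iostream>

    int main()
    {
        char c = static_cast<char>(0xD7);  // instead of char c = -41;
        std::cout << c << '\n';
    }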

> > Of course, there are different versions of extended ASCII,
> > and even non-ASCII sets, so -41 isn't really guaranteed to be
> > anything in particular. But you can know the result of
> > converting to unsigned, whereas conversion from unsigned to
> > signed is implementation-defined. I guess that's my point.


Formally, of course, there's no such thing as "extended
ASCII". There are just other code sets, which happen to
correspond exactly to ASCII for the range 0-127.

> I didn't consider that CHAR_BIT problem at all, thank you for pointing
> it out, Jonathan.


> I think I'd work around this by checking whether plain char is signed
> or not, and filling the table with the appropriate values accordingly -
> so that I avoid signed/unsigned conversions completely.


> >> - assuming it is possible to, how can I ensure that my program displays
> >> the correct "graphics" regardless of the platform / implementation it is
> >> compiled onto?


> > If those characters were guaranteed to be present in _some_
> > order, it might be conceivable. But they're not. How could
> > you display "filled in square" on a platform that doesn't
> > have such a character?


> I think I've discovered my true point: I'm interested in a subset of:


> http://en.wikipedia.org/wiki/Code_page_437


> which, as it seems, "is still the primary font in the core of any EGA
> and VGA compatible graphic card".


I don't think so, but I've not actually programmed anything at
that low a level for many, many years.

Not that it matters, since you probably can't access the graphic
card directly.

> If I decide to spend some effort on making a portable program that
> uses them, I'd have to find a way to activate that code page or
> something comparable, as explained in:


> http://en.wikipedia.org/wiki/Box_drawing_characters


> and resort to acceptable replacements (such as \, /, |, - and +) in case
> none of the above is available.


Most machines don't have "code pages"; they're an MS-DOS
invention. Most modern systems *do* support Unicode, however
(under Windows, it's code page 65001 if you're using UTF-8
encoding). You might have more luck with portability if you
used Unicode characters in the range 0x2500-0x257F.
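
A sketch of that approach (it assumes the terminal accepts UTF-8
output - e.g. a UTF-8 locale under Unix, or "chcp 65001" under
Windows - and the escape sequences are the UTF-8 encodings of
characters from the U+2500 block):

    // Sketch: a small box drawn with Unicode box-drawing characters,
    // written out as raw UTF-8 byte sequences.
    #include <iostream>

    int main()
    {
        std::cout << "\xE2\x94\x8C\xE2\x94\x80\xE2\x94\x90\n"  // U+250C U+2500 U+2510
                  << "\xE2\x94\x82 \xE2\x94\x82\n"             // U+2502, space, U+2502
                  << "\xE2\x94\x94\xE2\x94\x80\xE2\x94\x98\n"; // U+2514 U+2500 U+2518
    }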

> In this way the program could be considered "portable" enough - at
> least for me.


It's only portable to Windows.

--
James Kanze
 
 
Francesco S. Carta
Guest
Posts: n/a
 
      07-14-2010
James Kanze <(E-Mail Removed)>, on 14/07/2010 07:22:01, wrote:

> On Jul 14, 10:27 am, "Francesco S. Carta" <(E-Mail Removed)> wrote:


<snip>

>> I think I've discovered my true point: I'm interested in a subset of:
>>
>> http://en.wikipedia.org/wiki/Code_page_437


<snip>

> Most machines don't have "code pages"; they're an MS-DOS
> invention. Most modern systems *do* support Unicode, however
> (under Windows, it's code page 65001 if you're using UTF-8
> encoding). You might have more luck with portability if you
> used Unicode characters in the range 0x2500-0x257F.


Heck, that's one of those (in)famous Columbus' eggs... thanks for the
further details, James. I will resort to using Unicode characters -
that's a way better bet.

--
FSC - http://userscripts.org/scripts/show/59948
http://fscode.altervista.org - http://sardinias.com
 