printf in glorious colour

 
 
Keith Thompson
      03-12-2013
glen herrmannsfeldt <(E-Mail Removed)> writes:
> Keith Thompson <(E-Mail Removed)> wrote:
> (snip, I wrote)

[...]
>> Interesting. Do you have a reference for this proposed ASCII-8?
>> It turns out to be difficult to Google; most of the references I've
>> found incorrectly refer to things like Latin-1 or Windows-1252 as
>> "8-bit ASCII".

>
>> If ASCII-8 had caught on, with some common characters requiring 8
>> bits, I wonder if UTF-8 would have been possible.

>
> Look in the Appendix of the S/360 Principles of Operation. Later
> versions have a better description of it, such as the -7 (Dec 1967)
> version from bitsavers.
>
> There are still plenty of code points, they just moved them around.


Yes, and I wonder why.

For those who don't want to download the PDFs:

http://bitsavers.trailing-edge.com/p...60PrincOps.pdf
http://bitsavers.trailing-edge.com/p...ncOpsDec67.pdf

ASCII-8 had the same defined characters as ASCII-7, but remapped the
ranges relative to ASCII-7 (what we know as ASCII):

0..31 -> 0..31
32..63 -> 64..95
64..95 -> 160..191
96..127 -> 224..255

leaving gaps in between. This makes it incompatible with standard
ASCII. There doesn't seem to be any stated rationale for this
rather odd mapping. It doesn't represent more than 128 characters,
so I frankly don't see the point.

Such an encoding would be suitable for a conforming C implementation,
assuming you work around the changes for '^' and '!'. Like EBCDIC
and ASCII, it keeps the decimal digits contiguous. Like ASCII,
but unlike EBCDIC, the lowercase letters are contiguous, as are
the uppercase letters (C doesn't require this). And like EBCDIC,
it would force a C compiler to make plain char unsigned.
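
The properties claimed above can be checked mechanically. The following is a minimal sketch, not anything from the IBM manual: it encodes the range table quoted earlier as per-block offsets and tests contiguity. The function names are made up for illustration, and ASCII codes are written as numbers so the check does not depend on the host character set. The unsigned-char conclusion rests on the C rule that basic execution characters must be nonnegative when stored in a char.

```c
#include <limits.h>
#include <stdbool.h>

/* The range table as offsets: each block of 32 ASCII codes
   moves up by a fixed amount (0, 32, 96 or 128). */
unsigned ascii8_from_table(unsigned c7)
{
    static const unsigned offset[4] = { 0, 32, 96, 128 };
    return c7 + offset[(c7 >> 5) & 3u];
}

/* True if the ASCII codes first..last remain consecutive in ASCII-8. */
bool contiguous_in_ascii8(unsigned first, unsigned last)
{
    for (unsigned c = first; c < last; c++)
        if (ascii8_from_table(c + 1) != ascii8_from_table(c) + 1)
            return false;
    return true;
}

/* True if lowercase 'a' (ASCII 0x61) would not fit in a signed char,
   which is what forces plain char to be unsigned. */
bool needs_unsigned_char(void)
{
    return ascii8_from_table(0x61) > (unsigned)SCHAR_MAX;
}
```

Calling contiguous_in_ascii8(0x30, 0x39) covers the digits, (0x41, 0x5A) the uppercase letters and (0x61, 0x7A) the lowercase letters; a range that straddles a block boundary, such as (0x3F, 0x40), is not contiguous.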

UTF-8 is compatible with ASCII. A UTF-8-like encoding could be made
compatible with ASCII-8, but it would have to use a less elegant
encoding, and it would probably lose some of UTF-8's nice properties.
And it would be incompatible with ASCII-7.

--
Keith Thompson (The_Other_Keith) (E-Mail Removed) <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
 
glen herrmannsfeldt
      03-12-2013
Keith Thompson <(E-Mail Removed)> wrote:

(snip, I wrote)
>> Look in the Appendix of the S/360 Principles of Operation. Later
>> versions have a better description of it, such as the -7 (Dec 1967)
>> version from bitsavers.


>> There are still plenty of code points, they just moved them around.


> Yes, and I wonder why.


> For those who don't want to download the PDFs:


> http://bitsavers.trailing-edge.com/p...60PrincOps.pdf
> http://bitsavers.trailing-edge.com/p...ncOpsDec67.pdf


> ASCII-8 had the same defined characters as ASCII-7, but remapped the
> ranges relative to ASCII-7 (what we know as ASCII):


> 0..31 -> 0..31
> 32..63 -> 64..95
> 64..95 -> 160..191
> 96..127 -> 224..255


> leaving gaps in between. This makes it incompatible with standard
> ASCII. There doesn't seem to be any stated rationale for this
> rather odd mapping. It doesn't represent more than 128 characters,
> so I frankly don't see the point.


I mostly don't see the point either. One thing, though. It is required
that 256 different code points map to 256 different card punch
combinations. It does seem like that could have been done
with ASCII-7, though.

> Such an encoding would be suitable for a conforming C implementation,
> assuming you work around the changes for '^' and '!'. Like EBCDIC
> and ASCII, it keeps the decimal digits contiguous. Like ASCII,
> but unlike EBCDIC, the lowercase letters are contiguous, as are
> the uppercase letters (C doesn't require this). And like EBCDIC,
> it would force a C compiler to make plain char unsigned.


> UTF-8 is compatible with ASCII. A UTF-8-like encoding could be made
> compatible with ASCII-8, but it would have to use a less elegant
> encoding, and it would probably lose some of UTF-8's nice properties.
> And it would be incompatible with ASCII-7.


It might have been that, if IBM had gotten ASCII-8 standardized,
other byte-oriented machines would have followed it. At least IBM
might have believed that.

But okay, the properties of EBCDIC that S/360 was designed around:

From pretty early in the punched card days, the top rows were called
zones and the bottom rows digits. (The top two rows are commonly called
the 12 and 11 rows, though they don't have markings on them like rows
zero through nine.) In BCDIC, the top row was the '+' character and
the next row '-' (but there was also another code for '-'). In the
pre-computer punched card days one could "overpunch" the sign by
punching it over one of the digit columns. Using the electromechanical
card sorter, it would take one additional pass to separate plus from
minus cards.

For EBCDIC characters in memory, the top (MSB) half of each byte is
called the zone, and the bottom (LSB) the digit. Note that in ASCII-7,
ASCII-8, and EBCDIC the low hex digit of characters '0' through '9'
corresponds to the digit value.

The S/360 (and successor) PACK instruction will take from 1 to 16
bytes of zoned decimal (one digit per byte) and convert to packed
decimal (two BCD digits per byte, with the sign in the least
significant half of the rightmost (least significant) byte).
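
As a rough illustration of that conversion, here is a sketch in C. It is an assumption-laden model rather than a faithful emulation: the name pack_zoned and its argument order are invented, and it ignores the 16-byte length limit and overlapping operands. It works right to left, swapping the halves of the rightmost byte and keeping only the digit half of every other byte.

```c
#include <stddef.h>
#include <stdint.h>

/* Sketch of PACK: zoned decimal (one digit per byte) to packed decimal. */
void pack_zoned(uint8_t *dst, size_t dst_len,
                const uint8_t *src, size_t src_len)
{
    if (dst_len == 0 || src_len == 0)
        return;

    size_t s = src_len - 1;   /* rightmost source byte */
    size_t d = dst_len - 1;   /* rightmost destination byte */

    /* Rightmost byte: swap halves, so the zone lands in the sign position. */
    dst[d] = (uint8_t)((src[s] << 4) | (src[s] >> 4));

    /* Remaining bytes, right to left: two digit halves per result byte,
       zero-filled once the source runs out. */
    while (d > 0) {
        uint8_t lo = 0, hi = 0;
        d--;
        if (s > 0) lo = src[--s] & 0x0Fu;
        if (s > 0) hi = src[--s] & 0x0Fu;
        dst[d] = (uint8_t)((hi << 4) | lo);
    }
}
```

Packing the EBCDIC digits X'F1F2F3' into two bytes gives X'123F' with this model: the zone of the last digit ends up as the sign half.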

For a series of EBCDIC digits, the result is BCD digits with a sign
field of X'F'. Conveniently, X'F' counts as positive for the packed
decimal (BCD) instructions. When the ASCII mode bit is not set in
the PSW, decimal instructions generate X'C' for plus and X'D'
for minus. When unpacked with the UNPK instruction, positive values
with the rightmost digit 1 through 9 will convert to the EBCDIC
codes for 'A' through 'I' (that is, X'C1' through X'C9') and punch
as 12 punch plus digit 1 through 9. X'C0' is not a printable
EBCDIC character, but will punch as 12 and 0. Similarly, for
negative values, the low byte will be between X'D0' and X'D9',
and punch as 11 row plus digit 0 through 9, again with a non-printing
character for 0, and C'J' through C'R' for 1 through 9.

When the ASCII bit is set in the PSW, decimal instructions generate
X'A' for plus and X'B' for minus as the sign. UNPK will then
convert the low digit of positive numbers to bytes from
X'A0' through X'A9' and for negative values X'B0' through X'B9'.
Positive values convert to C'@' and C'A' through C'I' in ASCII-8,
and negative values to C'P' through C'Y'. With the appropriate punch
code for C'@' that works for positive numbers, but not negative
numbers. But maybe they could convince people to punch negative
numbers using C'P' through C'Y'.
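
The way the sign ends up in the zone of the low byte can be sketched in C as follows. This is a simplified model, not a faithful emulation: unpack_packed and its zone parameter are hypothetical names, and the zone X'5' used for ASCII mode is an assumption that follows from the mapping quoted earlier, where ASCII '0' lands at X'50'.

```c
#include <stddef.h>
#include <stdint.h>

/* Sketch of UNPK: packed decimal to zoned decimal.
   zone is the high half given to every byte except the rightmost one:
   0xF for EBCDIC, 0x5 assumed for ASCII-8. */
void unpack_packed(uint8_t *dst, size_t dst_len,
                   const uint8_t *src, size_t src_len, uint8_t zone)
{
    if (dst_len == 0 || src_len == 0)
        return;

    size_t s = src_len - 1;   /* byte holding the last digit and the sign */
    size_t d = dst_len - 1;

    /* Rightmost byte: swap halves, so the sign becomes the zone. */
    dst[d] = (uint8_t)((src[s] << 4) | (src[s] >> 4));

    /* Remaining digits, right to left; k counts digit halves consumed. */
    for (size_t k = 0; d > 0; k++) {
        size_t back = 1 + k / 2;   /* bytes to the left of the sign byte */
        uint8_t digit = 0;         /* zero fill when the source runs out */
        d--;
        if (back <= s) {
            uint8_t b = src[s - back];
            digit = (k % 2 == 0) ? (uint8_t)(b & 0x0Fu) : (uint8_t)(b >> 4);
        }
        dst[d] = (uint8_t)((zone << 4) | digit);
    }
}
```

Under this model, +123 stored as X'123C' unpacks with zone 0xF to X'F1F2C3', whose last byte is EBCDIC 'C'. The same value with an ASCII-mode sign, X'123A', unpacks with zone 0x5 to X'5152A3', and X'A3' is 'C' in ASCII-8; a negative X'123B' ends in X'B3', which is 'S'.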

In any case, it is not hard to fix up the low byte using instructions
such as OI, NI, or XI (or immediate, and immediate, xor immediate)
which OR, AND, or XOR one byte with an immediate value. Also,
one can use the TR (translate) instruction to convert between 1
and 256 characters using a 256 byte (or less) translate table.
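
In C terms the two fix-ups look roughly like this. Both helpers are illustrative names, and the choice of X'F0' as the OR mask assumes the goal is a plain EBCDIC digit in the low byte.

```c
#include <stddef.h>
#include <stdint.h>

/* Like OI with mask X'F0': force the zone half to X'F' so that an
   overpunched low byte reads as an ordinary EBCDIC digit. */
void force_digit_zone(uint8_t *byte)
{
    *byte |= 0xF0u;
}

/* Like TR: replace every byte of buf with its entry in a 256-byte table. */
void translate(uint8_t *buf, size_t len, const uint8_t table[256])
{
    for (size_t i = 0; i < len; i++)
        buf[i] = table[buf[i]];
}
```

For example, force_digit_zone turns X'C3' into X'F3', and a table that is the identity except for the overpunch codes does the same job for a whole field in one pass.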

Independent of the ASCII bit, decimal instructions accept X'B'
and X'D' as negative, X'A', X'C', X'E', and X'F' as positive.
Other sign values will generate an interrupt, as will digits
other than X'0' through X'9' in digit positions.
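
Those acceptance rules are easy to state as code. The sketch below uses invented names (sign_kind, classify_sign, packed_is_valid) and simply returns false where the hardware would raise an interrupt.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { SIGN_INVALID, SIGN_PLUS, SIGN_MINUS } sign_kind;

/* Classify the sign half of a packed decimal field. */
sign_kind classify_sign(unsigned nibble)
{
    switch (nibble & 0xFu) {
    case 0xB: case 0xD:
        return SIGN_MINUS;
    case 0xA: case 0xC: case 0xE: case 0xF:
        return SIGN_PLUS;
    default:                 /* 0 through 9 are not sign codes */
        return SIGN_INVALID;
    }
}

/* True if every digit position holds 0..9 and the sign is acceptable. */
bool packed_is_valid(const uint8_t *p, size_t len)
{
    if (len == 0)
        return false;
    for (size_t i = 0; i < len; i++) {
        if ((p[i] >> 4) > 9)
            return false;    /* the high half is always a digit */
        if (i + 1 < len && (p[i] & 0x0Fu) > 9)
            return false;    /* the low half is a digit except in the last byte */
    }
    return classify_sign(p[len - 1] & 0x0Fu) != SIGN_INVALID;
}
```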

I don't know if that helps much. Presumably IBM could have built
card readers and card punches with either code.

-- glen

 
Keith Thompson
      03-12-2013
glen herrmannsfeldt <(E-Mail Removed)> writes:
> Keith Thompson <(E-Mail Removed)> wrote:

[...]
>> UTF-8 is compatible with ASCII. A UTF-8-like encoding could be made
>> compatible with ASCII-8, but it would have to use a less elegant
>> encoding, and it would probably lose some of UTF-8's nice properties.
>> And it would be incompatible with ASCII-7.

>
> It might have been that, if IBM had gotten ASCII-8 standardized,
> other byte-oriented machines would have followed it. At least IBM
> might have believed that.


I boldly predict that ASCII-8 won't catch on.

[big snip]

--
Keith Thompson (The_Other_Keith) (E-Mail Removed) <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
 
 
glen herrmannsfeldt
      03-12-2013
Keith Thompson <(E-Mail Removed)> wrote:
> glen herrmannsfeldt <(E-Mail Removed)> writes:
>> Keith Thompson <(E-Mail Removed)> wrote:

> [...]
>>> UTF-8 is compatible with ASCII. A UTF-8-like encoding could be made
>>> compatible with ASCII-8, but it would have to use a less elegant
>>> encoding, and it would probably lose some of UTF-8's nice properties.
>>> And it would be incompatible with ASCII-7.


>> It might have been that, if IBM had gotten ASCII-8 standardized,
>> other byte-oriented machines would have followed it. At least IBM
>> might have believed that.


> I boldly predict that ASCII-8 won't catch on.


But would you have predicted that 50 years ago?

We are now, more or less, 50 years from S/360.

The announcement was, according to Wikipedia, April 1964, but they
had to have been working on them earlier, including decisions like what
code to use.

-- glen
 
 
glen herrmannsfeldt
      03-12-2013

(snip, I wrote)
>>> There are still plenty of code points, they just moved them around.


>> Yes, and I wonder why.


>> For those who don't want to download the PDFs:


>> http://bitsavers.trailing-edge.com/p...60PrincOps.pdf
>> http://bitsavers.trailing-edge.com/p...ncOpsDec67.pdf


>> ASCII-8 had the same defined characters as ASCII-7, but remapped the
>> ranges relative to ASCII-7 (what we know as ASCII):


>> 0..31 -> 0..31
>> 32..63 -> 64..95
>> 64..95 -> 160..191
>> 96..127 -> 224..255


The IBM description takes the bits of ASCII-7, numbered 7654321 and
places them in the eight bit byte as 76754321. In contrast, the bits
of EBCDIC are described as 01234567. (That is, big endian order.)
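
That bit placement can be written directly in C. This is a small sketch with a made-up function name; it takes a 7-bit ASCII code, duplicates bit 7 into the third position from the top, and reproduces the range table quoted above.

```c
#include <stdint.h>

/* Map a 7-bit ASCII code to ASCII-8.
   ASCII bits are numbered 7..1, with bit 7 the most significant;
   the 8-bit result holds them in the order 7 6 7 5 4 3 2 1. */
uint8_t ascii7_to_ascii8(uint8_t c7)
{
    uint8_t b7   = (c7 >> 6) & 1u;   /* ASCII bit 7 */
    uint8_t b6   = (c7 >> 5) & 1u;   /* ASCII bit 6 */
    uint8_t low5 = c7 & 0x1Fu;       /* ASCII bits 5..1 */

    return (uint8_t)((b7 << 7) | (b6 << 6) | (b7 << 5) | low5);
}
```

ASCII '0' (48) maps to X'50' and ASCII 'A' (65) to X'A1', so the low hex digit of each decimal digit is preserved.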

>> leaving gaps in between. This makes it incompatible with standard
>> ASCII. There doesn't seem to be any stated rationale for this
>> rather odd mapping. It doesn't represent more than 128 characters,
>> so I frankly don't see the point.


> I mostly don't see the point either. One thing, though. It is required
> that 256 different code points map to 256 different card punch
> combinations. It does seem like that could have been done
> with ASCII-7, though.


There is a good description of what went into the design of EBCDIC
in Blaauw & Brooks "Computer Architecture, Concepts and Evolution."
While many of the examples use IBM machines and decisions, the authors
are not afraid to note when a decision was a mistake.

For one, the designers of PL/I wanted to add &|~<>[] to the character
set. (As written in the book, it is ~, but likely supposed to be the
logical NOT character.) With the restrictions on printers, typewriters
(maybe including the 2741) and keypunches, two characters had to be
removed. The decision was to remove [], which Blaauw and Brooks say
was a mistake. PL/I, like Fortran, uses () for function references
and array subscripting. ASCII left out the NOT character, so that
in C we have [] for subscripts, but != and ! for relational and
logical operators.

>> Such an encoding would be suitable for a conforming C implementation,
>> assuming you work around the changes for '^' and '!'. Like EBCDIC
>> and ASCII, it keeps the decimal digits contiguous. Like ASCII,
>> but unlike EBCDIC, the lowercase letters are contiguous, as are
>> the uppercase letters (C doesn't require this). And like EBCDIC,
>> it would force a C compiler to make plain char unsigned.


>> UTF-8 is compatible with ASCII. A UTF-8-like encoding could be made
>> compatible with ASCII-8, but it would have to use a less elegant
>> encoding, and it would probably lose some of UTF-8's nice properties.
>> And it would be incompatible with ASCII-7.


> It might have been that, if IBM had gotten ASCII-8 standardized,
> other byte-oriented machines would have followed it. At least IBM
> might have believed that.


As described, for ASCII "(For reasons of commercial rivalry, they were
determined NOT to be compatible with BCD.)" (BCD was the name used to
describe what is now usually called BCDIC, the predecessor to EBCDIC.)

When Fortran was developed, the printers of the 704 could only
print 48 different characters, plus blank. The ='(+) characters
replaced five characters used for commercial computing, #@%&
(the fifth character doesn't exist on any system that I know of).

Anyway, I recommend the Blaauw and Brooks book for anyone at all
interested in the developments of computer architecture.

-- glen
 
 
Chris F.A. Johnson
      06-30-2013
On 2013-03-01, Nobody wrote:
> On Fri, 01 Mar 2013 02:04:24 -0800, Malcolm McLean wrote:
>
>> How widely supported are the escape ... m colour codes for text?

>
> There are enough exceptions that hard-coding those sequences can't be
> explained as anything other than the programmer not knowing about termcap
> or terminfo.


There are few enough exceptions that hard-coding is not a problem,
especially in shell scripts, where the interface to termcap or
terminfo is limited.
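
For reference, the hard-coded approach under discussion looks something like this in C. It is only a sketch: format_coloured is a made-up helper, and it assumes a terminal that honours the "escape ... m" (SGR) sequences, where 31 selects a red foreground and 0 resets the attributes.

```c
#include <stddef.h>
#include <stdio.h>

/* Write text wrapped in a hard-coded colour sequence into buf.
   Returns what snprintf returns: the length the full string needs. */
int format_coloured(char *buf, size_t size, int sgr_code, const char *text)
{
    return snprintf(buf, size, "\033[%dm%s\033[0m", sgr_code, text);
}
```

A caller would fill a buffer with format_coloured(buf, sizeof buf, 31, "error") and then hand it to printf("%s\n", buf). The termcap/terminfo route would look the sequences up at run time instead of embedding them.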

--
Chris F.A. Johnson, <http://cfajohnson.com>
Author:
Pro Bash Programming: Scripting the GNU/Linux Shell (2009, Apress)
Shell Scripting Recipes: A Problem-Solution Approach (2005, Apress)
 
 
Eric Sosman
      07-01-2013
On 6/30/2013 5:37 PM, Chris F.A. Johnson wrote:
^^^^^^^^^

> On 2013-03-01, Nobody wrote:

^^^^^^^^^^

>> On Fri, 01 Mar 2013 02:04:24 -0800, Malcolm McLean wrote:

^^^^^^^^^^^

Film at eleven.

--
Eric Sosman
(E-Mail Removed)
 