
# size of a pointer on 4-bit system

Keith Thompson

 02-04-2013
Tim Rentsch <(E-Mail Removed)> writes:
> Richard Damon <(E-Mail Removed)> writes:
>> On 2/1/13 5:41 AM, BartC wrote:
>>> "Tim Rentsch" <(E-Mail Removed)> wrote in message
>>> news:(E-Mail Removed)...
>>>> "BartC" <(E-Mail Removed)> writes:
>>>>
>>>>> "Tim Rentsch" <(E-Mail Removed)> wrote in message
>>>
>>>>>> The sizeof operator can be defined in a way that satisfies all
>>>>>> the Standard's requirements but still allows this example to
>>>>>> allocate only 100 nibbles.
>>>>>
>>>>> So what would be the value of sizeof(*x)? It can only really
>>>>> be 1 or 0. [snip elaboration]
>>>>
>>>> Actually that isn't right. Using 'sizeof' on a non-standard
>>>> datatype may behave in unexpected ways. Here's a hint: sizeof
>>>> yields a result of type size_t; size_t is an (unsigned) integer
>>>> type; integer types larger than character types may have padding
>>>> bits. Can you fill in the rest?
>>>
>>> What, that extra information is stored in those padding bits? How is
>>> that going to help?
>>>
>>> What bit-pattern could be returned by sizeof(*x) that would make
>>> 100*sizeof(*x) yield 50 instead of 100?

>>
>> one simple solution is to have the sizeof operator not return a
>> size_t for the nybble type, but some special type that does math
>> "funny" to get the right value, for example a fixed point type
>> with 1 fractional bit. When doing arithmetic on this type, the
>> compiler can get the "right" answer, and also do the right thing
>> when converting it to a standard type.

>
> This is basically the idea, except the result isn't a new type
> but is always a size_t. The key insight is that size_t can be
> what is in effect a fixed-point type (with three fraction bits,
> for example), but still satisfy the requirements for being an
> integer type by designating the fraction bits as "padding bits".
> Any combination of fraction bits other than all zeroes would be
> a trap representation, allowing both standard behavior and
> extended behavior in the same data type (ie, size_t).

[...]

It's an interesting idea, if your goal is to have a completely
conforming C (C11, I suppose) implementation for a 4-bit system
in which you can write programs that can deal naturally with 4-bit
objects, and arrays of 4-bit objects. I think that making size_t a
kind of hybrid type, with extra bits that are padding bits as far
as any strictly conforming code is concerned, but fraction bits
for implementation-specific code, *could* achieve that goal.
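
A minimal sketch of that hybrid idea, under stated assumptions: the type `xsize_t`, the constant `FRAC_BITS`, and every helper name below are hypothetical and do not come from any real implementation. An ordinary unsigned integer stands in for a size_t whose three low-order bits are the "fraction" bits, so a stored value counts eighths of an 8-bit byte.

```c
#include <stdint.h>

/* Hypothetical stand-in for a size_t with 3 low-order "fraction" bits.
   The stored value counts eighths of an 8-bit byte (that is, bits). */
typedef uint32_t xsize_t;
enum { FRAC_BITS = 3 };

/* A size of a whole number of bytes: fraction bits are all zero. */
xsize_t xsize_from_bytes(uint32_t bytes) { return bytes << FRAC_BITS; }

/* Size of one 4-bit nybble: half a byte, fraction bits 100 in binary. */
xsize_t xsize_of_nybble(void) { return 4u; }

/* count * size keeps the fraction, so no space is lost to rounding. */
xsize_t xsize_scale(uint32_t count, xsize_t size) { return count * size; }

/* Whole bytes needed to hold a size, rounding up. */
uint32_t xsize_bytes_needed(xsize_t s) { return (s + 7u) >> FRAC_BITS; }

/* Nonzero when fraction bits are set; strictly conforming code would
   see such a value as a trap representation. */
int xsize_has_fraction(xsize_t s) { return (s & 7u) != 0; }
```

With these helpers, `xsize_bytes_needed(xsize_scale(100, xsize_of_nybble()))` comes to 50 bytes, which is the "100 nibbles" allocation discussed earlier in the thread.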

But in real life, I think the *real* goal would be to create a
useful software development environment for the target 4-bit system,
and I don't think a conforming C implementation would be the best
solution. It's quite possible that such a system would be so tiny
that nothing above the level of assembly language would be worth
the effort. Or perhaps it would make sense to implement a C-like
language that violates the standard by setting CHAR_BIT==4, and/or
by decoupling the idea of the smallest addressable storage unit from
character types. Perhaps it wouldn't support floating-point either.
I think the result would be easier to implement, and less confusing
to programmers, than the conforming C with 4-bit extensions that
you propose.

One of C's great strengths is the ability to write code that's
portable across all conforming C implementations. Any code written
for a 4-bit system would likely be intended *only* for that system.

we could have an implementation with genuine 4-bit types, and
sizeof(char) could consistently be 2. But it's too late for that.)

--
Keith Thompson (The_Other_Keith) (E-Mail Removed) <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

glen herrmannsfeldt

 02-04-2013
David Thompson <(E-Mail Removed)> wrote:
> On Wed, 30 Jan 2013 15:47:35 -0500, Roberto Waltman
> <(E-Mail Removed)> wrote:

>> glen herrmannsfeldt wrote:

>> >Are there C implementations for any Harvard (separate program and
>> >data space) machines? Maybe the 8048 series?

>> Many, starting with the PDP-11's, today's Atmel AVR's, etc.

> Nit: *some* PDP-11s. Higher-end models had separate "instruction" and
> "data" spaces, and 3 privilege levels; middle had combined-I&D and 2
> levels; original and later cheapest models had no memory management
> and only 1 (implicit) privilege level.

As well as I understand it, for the PDP-11 that is separate virtual

At the time, I was thinking about processors like the 8048 that
have separate physical (usually ROM) address space for code,
and RAM for data. There is a way to load data from code
(ROM) address space, but no way to store, and no way to
execute instructions from data (RAM) address space.

But yes, for a user space program, C compiler or compiled
code, running on a PDP-11 under RSX-11, the separate address
spaces would matter.

Most of my PDP-11 work has been under RT11SJ where that doesn't
come up. Does the PDP-11 in separate I/D mode have a way to
load data (look-up tables) from I space?

-- glen

Roberto Waltman

 02-04-2013
glen herrmannsfeldt wrote:
>David Thompson <(E-Mail Removed)> wrote:
>>> >Are there C implementations for any Harvard (separate program and
>>> >data space) machines? Maybe the 8048 series?

>
>>> Many, starting with the PDP-11's, today's Atmel AVR's, etc.

>
>> Nit: *some* PDP-11s. Higher-end models had separate "instruction" and
>> "data" spaces, and 3 privilege levels; middle had combined-I&D and 2
>> levels; original and later cheapest models had no memory management
>> and only 1 (implicit) privilege level.

>
>As well as I understand it, for the PDP-11 that is separate virtual

Correct, and maybe it should not be called a "true" Harvard
architecture, since with a single physical bus there is no parallelism
between code and data access.

>Most of my PDP-11 work has been under RT11SJ where that doesn't
>come up. Does the PDP-11 in separate I/D mode have a way to
>load data (look-up tables) from I space?

Yes - "Move to/from previous data/instruction space"
I don't recall if these were "privileged" (system mode only)
instructions.
--
Roberto Waltman

Tim Rentsch

 02-04-2013
Keith Thompson <(E-Mail Removed)> writes:

> Richard Damon <(E-Mail Removed)> writes:
>> On 2/3/13 8:12 PM, Tim Rentsch wrote:
>>> Richard Damon <(E-Mail Removed)> writes:
>>>> one simple solution is to have the sizeof operator not return
>>>> a size_t for the nybble type, but some special type that does
>>>> math "funny" to get the right value, for example a fixed
>>>> point type with 1 fractional bit. When doing arithmetic on
>>>> this type, the compiler can get the "right" answer, and also
>>>> do the right thing when converting it to a standard type.
>>>
>>> This is basically the idea, except the result isn't a new type
>>> but is always a size_t. The key insight is that size_t can be
>>> what is in effect a fixed-point type (with three fraction bits,
>>> for example), but still satisfy the requirements for being an
>>> integer type by designating the fraction bits as "padding bits".
>>> Any combination of fraction bits other than all zeroes would be
>>> a trap representation, allowing both standard behavior and
>>> extended behavior in the same data type (ie, size_t).

>>
>> I am not sure that a fractional type meets the requirements for
>> size_t. The problem is that size_t is defined as an "unsigned
>> integral type", and math on such types is well defined and does
>> not allow for fractional bits. This is why I was proposing that
>> sizeof needs to return something besides size_t for the _Nybble
>> type.

>
> The extra bits would be padding bits as far as any language-defined
> operations are concerned. Arithmetic operations would happen to
> set those bits consistently, but you wouldn't be able to access
> them other than by performing undefined, or at least unspecified,
> operations.
>
> For example, `(size_t)1 / 2` would yield a result that would
> compare equal to 0, but if you look at the padding bits of the
> result and interpret them as fractional bits, you can interpret
> it as 1/2 (a value midway between 0 and 1).

expressions like (size_t)1 / 2 must not set any "fraction" bits.
This expression is well-defined by the Standard - it must behave
exactly like 0 in all respects for all subsequent operations (ie,
operations whose behavior is defined by the Standard). The only
value operations that produce size_t representations with non-zero
fractions must have an element of undefined behavior, or possibly
implementation-defined behavior. This expression is completely
defined so it mustn't do that.

Tim Rentsch

 02-04-2013
Keith Thompson <(E-Mail Removed)> writes:

> Tim Rentsch <(E-Mail Removed)> writes:
>> Richard Damon <(E-Mail Removed)> writes:
>>>>
>>>> [re implementing 4-bit wide data types, sizeof, size_t, etc]
>>>
>>> one simple solution is to have the sizeof operator not return a
>>> size_t for the nybble type, but some special type that does math
>>> "funny" to get the right value, for example a fixed point type
>>> with 1 fractional bit. When doing arithmetic on this type, the
>>> compiler can get the "right" answer, and also do the right thing
>>> when converting it to a standard type.

>>
>> This is basically the idea, except the result isn't a new type
>> but is always a size_t. The key insight is that size_t can be
>> what is in effect a fixed-point type (with three fraction bits,
>> for example), but still satisfy the requirements for being an
>> integer type by designating the fraction bits as "padding bits".
>> Any combination of fraction bits other than all zeroes would be
>> a trap representation, allowing both standard behavior and
>> extended behavior in the same data type (ie, size_t).

> [...]
>
> It's an interesting idea, if your goal is to have a completely
> conforming C (C11, I suppose) implementation for a 4-bit system
> in which you can write programs that can deal naturally with 4-bit
> objects, and arrays of 4-bit objects. I think that making size_t a
> kind of hybrid type, with extra bits that are padding bits as far
> as any strictly conforming code is concerned, but fraction bits
> for implementation-specific code, *could* achieve that goal.
>
> But in real life, I think the *real* goal would be to create a
> useful software development environment for the target 4-bit system,
> and I don't think a conforming C implementation would be the best
> solution. [snip elaboration]

One, the point of this scheme is to support addressing, and
especially arrays, for data types narrower than 8 bits. That
might be a machine with a 4-bit wide data path and/or ALU's, but
it doesn't have to be. It could be a typical large cpu of today,
and even one that doesn't natively support sub-byte addressing.
For example, it could be provided on x86 cpus. What I think is
interesting is that this capability could be provided in a way
that fits in quite neatly with Standard C.

Two, as far as small 4-bit systems go, I think the main holdup
for using standard C is the minimum width of certain data types.
An implementation with char, short, and int being 8 bits, and
long/size_t being 15 bits (plus 1 padding/fraction bit), and
long long being 16 bits, but otherwise conforming (and presumably
extended with some 4-bit data types) is probably small enough to
be usable at the same time as being close enough to regular C
so that switching between the two languages could be done fairly
easily. A lot of small programs would just work with no changes
needed.

Tim Rentsch

 02-04-2013
Richard Damon <(E-Mail Removed)> writes:

> On 2/3/13 8:12 PM, Tim Rentsch wrote:
>> Richard Damon <(E-Mail Removed)> writes:
>>> one simple solution is to have the sizeof operator not return a
>>> size_t for the nybble type, but some special type that does math
>>> "funny" to get the right value, for example a fixed point type
>>> with 1 fractional bit. When doing arithmetic on this type, the
>>> compiler can get the "right" answer, and also do the right thing
>>> when converting it to a standard type.

>>
>> This is basically the idea, except the result isn't a new type
>> but is always a size_t. The key insight is that size_t can be
>> what is in effect a fixed-point type (with three fraction bits,
>> for example), but still satisfy the requirements for being an
>> integer type by designating the fraction bits as "padding bits".
>> Any combination of fraction bits other than all zeroes would be
>> a trap representation, allowing both standard behavior and
>> extended behavior in the same data type (ie, size_t).

>
> I am not sure that a fractional type meets the requirements for
> size_t. The problem is that size_t is defined as an "unsigned
> integral type", and math on such types is well defined and does
> not allow for fractional bits. [snip]

What I am suggesting is not a pure fixed-point type but a kind of
combination. Any operation on operand values whose fraction bits
are zero produces the same result as the corresponding integer
operation would (and specifically the fraction bits of the result
would be zero). This approach ensures the size_t type will meet
the Standard's requirements for an integer type.
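
One way to picture this rule, as a sketch under assumptions rather than anything a real compiler does: the names `xsize_t`, `FRAC_BITS`, `FRAC_MASK`, and `xsize_div` are hypothetical, and the stored value counts eighths of a byte. When both operands have zero fraction bits, the division is plain integer division on whole bytes; only operands that already carry a fraction take the fixed-point path.

```c
#include <stdint.h>

typedef uint32_t xsize_t;  /* hypothetical: value in eighths of a byte */
enum { FRAC_BITS = 3, FRAC_MASK = (1 << FRAC_BITS) - 1 };

/* Division as the hybrid scheme might define it. b must be nonzero. */
xsize_t xsize_div(xsize_t a, xsize_t b)
{
    if (((a | b) & FRAC_MASK) == 0) {
        /* Both operands are whole bytes: ordinary integer division,
           and the result again has all fraction bits zero. */
        return ((a >> FRAC_BITS) / (b >> FRAC_BITS)) << FRAC_BITS;
    }
    /* Extended case (outside what the Standard defines):
       fixed-point division that keeps the fraction. */
    return (xsize_t)(((uint64_t)a << FRAC_BITS) / b);
}
```

Here `xsize_div(8, 16)`, the analogue of `(size_t)1 / 2`, yields 0 with no fraction bits set, while `xsize_div(4, 16)` (half a byte divided by 2) yields 2, meaning a quarter of a byte.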

>>> It would say that
>>>
>>> malloc(100 * sizeof(*x));
>>>
>>> and
>>>
>>> size_t size = sizeof(*x);
>>> malloc(100 * size);
>>>
>>> might return different sized allocations, but as long as the
>>> "nybble" type is given a name in the implementation reserved
>>> name space (like _Nybble), the use of that name leads to
>>> "undefined behavior" by the standard which is then defined by
>>> the implementation to be useful.

>>
>> I would find discrepancies like this disquieting. And there are
>> other cases, eg, calloc(), where preserving the fractional
>> information in size_t could be important. It seems better to
>> have size_t be able to carry around the extra information,
>> since generally that should yield higher fidelity overall.

>
> Making the conversion of the _Size_t type to size_t, when the
> conversion isn't exact, generate a warning could at least help
> locate problematical cases, and perhaps let you change the
> declaration to _Size_t. Note that your second case, while it
> allocates too much space, will at least run properly, even if it
> will be wasteful.

I don't like introducing another type because it increases the
distance between the extended language and Standard C. And I
really don't like the idea of having to use a non-standard type
when a standard type could do the job.

Re: calloc() and space allocation - the problem is sometimes the
space allocated would be too large by a factor of two!
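
The factor of two can be seen with a little arithmetic. This is only a sketch: both function names are hypothetical, and it assumes 4-bit elements and 8-bit bytes.

```c
#include <stddef.h>

/* Bytes requested when each 4-bit element's size is first rounded up
   to one whole byte, as happens if the fraction is lost before the
   multiplication inside something like calloc(count, size). */
size_t bytes_if_rounded_per_element(size_t count)
{
    return count * 1u;
}

/* Bytes needed when the half-byte size survives the multiplication
   and rounding happens only once, at the end. */
size_t bytes_if_fraction_kept(size_t count)
{
    return (count * 4u + 7u) / 8u;
}
```

For 100 elements the first function gives 100 bytes and the second gives 50.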

> I think that it is better that the way this is implemented keeps
> conforming code correct. Since sizeof(char)/2 MUST be 0 by the
> rules of Standard C, size_t can not hold fractional bits.

I think you're assuming that all operations on size_t values
would behave as the corresponding fixed-point operations would,
eg, sizeof(char)/2 would be 0.5. That isn't what I'm suggesting.
All operations on size_t values whose fraction bits are zero
would produce the same results as the corresponding operations
on integer values.

Keith Thompson

 02-04-2013
Tim Rentsch <(E-Mail Removed)> writes:
> Keith Thompson <(E-Mail Removed)> writes:

[...]
>> For example, `(size_t)1 / 2` would yield a result that would
>> compare equal to 0, but if you look at the padding bits of the
>> result and interpret them as fractional bits, you can interpret
>> it as 1/2 (a value midway between 0 and 1).

>
> expressions like (size_t)1 / 2 must not set any "fraction" bits.
> This expression is well-defined by the Standard - it must behave
> exactly like 0 in all respects for all subsequent operations (ie,
> operations whose behavior is defined by the Standard). The only
> value operations that produce size_t representations with non-zero
> fractions must have an element of undefined behavior, or possibly
> implementation-defined behavior. This expression is completely
> defined so it mustn't do that.

Hmm.

There can be representations for an integer 0 other than all-bits-zero,
so I'm not sure that having `(size_t)1 / 2` set some of the
padding/fraction bits to 1 would be forbidden. But certainly
`(size_t)1 / 2 * 2` must be 0.

On the other hand, we'd want `sizeof (_Nybble[2])` to denote 1
8-bit byte, and `sizeof (_Nybble[2]) / 2` to denote 1 4-bit nybble.

The only way I can think of to make this work consistently would
be to add another padding bit to size_t, a flag that indicates
whether it's an ordinary C size value or something that takes
nybbles into account.
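
A rough model of the flag idea, purely as an illustration: the struct `flagged_size`, its fields, and the helper functions are hypothetical, with an explicit `bool` playing the role of the extra padding bit and `eighths` counting eighths of a byte.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a size value carrying an extra flag bit. */
typedef struct {
    uint32_t eighths;      /* size in eighths of an 8-bit byte */
    bool     nybble_aware; /* stands in for the extra padding bit */
} flagged_size;

/* An ordinary C size, such as (size_t)1. */
flagged_size plain_size(uint32_t bytes)
{
    flagged_size s = { bytes * 8u, false };
    return s;
}

/* The size of an array of n 4-bit elements, such as sizeof (_Nybble[2]). */
flagged_size nybble_array_size(uint32_t n)
{
    flagged_size s = { n * 4u, true };
    return s;
}

/* Divide a size by a nonzero integer. */
flagged_size flagged_div(flagged_size s, uint32_t d)
{
    flagged_size r = s;
    if (s.nybble_aware)
        r.eighths = s.eighths / d;              /* keep the fraction */
    else
        r.eighths = (s.eighths / 8u / d) * 8u;  /* plain integer division */
    return r;
}
```

In this model `flagged_div(plain_size(1), 2)` is 0, while `flagged_div(nybble_array_size(2), 2)` is 4 eighths of a byte, that is, one nybble.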

I'm even more convinced that it wouldn't be worth the effort. The C
language is not conveniently portable to 4-bit addressable systems
(or trinary systems, or analog systems, or ...).

--
Keith Thompson (The_Other_Keith) (E-Mail Removed) <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

glen herrmannsfeldt

 02-04-2013
Keith Thompson <(E-Mail Removed)> wrote:

(snip regarding nybble addressing in C)

> I'm even more convinced that it wouldn't be worth the effort. The C
> language is not conveniently portable to 4-bit addressable systems
> (or trinary systems, or analog systems, or ...).

It seems that program memory is byte addressed with a 12 bit
nybbles in program space (the latter, presumably, only if
it is in RAM).

Index registers (register pairs) are 8 bits, so much of the
pointer discussion won't be very interesting.

You can do indirect branching within a (256 byte) ROM.
Cross ROM requires either a direct branch, subroutine
(three level stack built into the 4004) call, or return
from subroutine.

Data memory is specified as 5120 bits, which I believe
is 1280 nybbles. I believe that is five address spaces
of 256 nybbles each, addressed with separate instructions.

(The 4002 RAM holds 80 nybbles. For the full data space
you need some other type of RAM.)
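
The figures above can be checked mechanically. This is just the arithmetic from this post written out; the function names are made up for the sketch.

```c
/* Number of 4-bit nybbles in a given number of bits. */
unsigned nybbles_from_bits(unsigned bits)
{
    return bits / 4u;
}

/* How many blocks of block_nybbles fit in total_nybbles. */
unsigned blocks_of(unsigned total_nybbles, unsigned block_nybbles)
{
    return total_nybbles / block_nybbles;
}
```

`nybbles_from_bits(5120)` is 1280, `blocks_of(1280, 256)` is 5, and `blocks_of(1280, 80)` is 16, so the full data space also equals sixteen times the 80-nybble capacity quoted for the 4002.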

-- glen

BartC

 02-05-2013
"glen herrmannsfeldt" <(E-Mail Removed)> wrote in message
news:kepe53$u0m$(E-Mail Removed)...
> Keith Thompson <(E-Mail Removed)> wrote:
>
> (snip regarding nybble addressing in C)
>
>> I'm even more convinced that it wouldn't be worth the effort. The C
>> language is not conveniently portable to 4-bit addressable systems
>> (or trinary systems, or analog systems, or ...).

>

Adapting C to work with such a processor is a waste of time I think.

But sub-byte data types are still useful whatever the processor or whether
they are individually addressable by hardware.

(These would be 1, 4 and 2-bits in order of usefulness. They fill in the
missing sizes at the start of this sequence: 8, 16, 32, 64, ..., so a 5-bit
type probably isn't necessary.)

This can be done now, with a crude collection of functions, but it would be
nice to have it as part of the language:

bit s[256]; //32 bytes
bit* p = &s[123];

sizeof() would still need to be byte-based; you don't want to mess about
with that part of the language. Something new is needed to work with
bit-types.
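
For comparison, the "collection of functions" approach in today's C might look roughly like this. It is a minimal sketch: `BIT_ARRAY_BYTES`, `bit_get`, and `bit_set` are made-up names, and the bits are packed into an array of unsigned char.

```c
#include <limits.h>
#include <stddef.h>

/* Bytes needed to store nbits one-bit elements. */
#define BIT_ARRAY_BYTES(nbits) (((nbits) + CHAR_BIT - 1) / CHAR_BIT)

/* Read bit i of the packed array a; returns 0 or 1. */
int bit_get(const unsigned char *a, size_t i)
{
    return (a[i / CHAR_BIT] >> (i % CHAR_BIT)) & 1;
}

/* Set bit i of the packed array a to 1 if v is nonzero, else to 0. */
void bit_set(unsigned char *a, size_t i, int v)
{
    unsigned char mask = (unsigned char)(1u << (i % CHAR_BIT));
    if (v)
        a[i / CHAR_BIT] |= mask;
    else
        a[i / CHAR_BIT] &= (unsigned char)~mask;
}
```

With 8-bit bytes, `unsigned char s[BIT_ARRAY_BYTES(256)];` occupies 32 bytes, matching the comment in the example above. A stand-in for `bit* p = &s[123]` would have to be a pair: a byte pointer plus a bit offset.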

--
Bartc

Shao Miller

 02-05-2013
On 2/4/2013 20:17, BartC wrote:
> "glen herrmannsfeldt" <(E-Mail Removed)> wrote in message
> news:kepe53$u0m$(E-Mail Removed)...
>> Keith Thompson <(E-Mail Removed)> wrote:
>>
>> (snip regarding nybble addressing in C)
>>
>>> I'm even more convinced that it wouldn't be worth the effort. The C
>>> language is not conveniently portable to 4-bit addressable systems
>>> (or trinary systems, or analog systems, or ...).

>>

>
> Adapting C to work with such a processor is a waste of time I think.
>
> But sub-byte data types are still useful whatever the processor or
> whether they are individually addressable by hardware.
>
> (These would be 1, 4 and 2-bits in order of usefulness. They fill in the
> missing sizes at the start of this sequence: 8, 16, 32, 64, ..., so a
> 5-bit type probably isn't necessary.)
>
> This can be done now, with a crude collection of functions, but it would
> be nice to have it as part of the language:
>

Why do they have to be crude? Functions that the implementation
provides as extensions needn't be crude at all, and could translate with
whatever efficiency is possible.

> bit s[256]; //32 bytes
> bit* p = &s[123];
>

/* Is this allowed? */
struct {
    int i;
    bit ba[5];
    double d;
} foo;
size_t sz = sizeof foo;

> sizeof() would still need to be byte-based; you don't want to mess about
> with that part of the language. Something new is needed to work with
> bit-types.
>

It's 'sizeof', not 'sizeof()', if you please. Else we could discuss the
'()+()' operator, the '*()' and '()*()' operators, etc.

--
- Shao Miller
--
"Thank you for the kind words; those are the kind of words I like to hear.

Cheerily," -- Richard Harter