Questions about memory structure of value types and reference types

 
 
Sam Kong
      10-12-2006
Hi,

JavaScript hides its memory structure.
I know that numbers, booleans, null and undefined are value types
(the value is saved directly in a variable).

I want to know:

- How does JavaScript distinguish value types from reference types by
looking at the variable's content?

- How many bytes are allocated in memory when I declare the following?
var i;


I can program in JavaScript without understanding them.
But I'm curious.

Thanks.

Sam

 
John G Harris
      10-14-2006
Sam Kong writes:
>Hi,
>
>JavaScript hides its memory structure.
>I know that numbers, booleans, null and undefined are value types
>(the value is saved directly in a variable).
>
>I want to know:
>
>- How does JavaScript distinguish value types from reference types by
>looking at the variable's content?
>
>- How many bytes are allocated in memory when I declare the following?
>var i;
>
>
>I can program in JavaScript without understanding them.
>But I'm curious.


A javascript engine can be programmed any way that works.

Your guesses are probably as good as ours.

John
--
John Harris
 
VK
      10-14-2006
Sam Kong wrote:
> JavaScript hides its memory structure.


More precisely, it doesn't officially expose it to the language tools -
but yes, it hides it.

> I know that numbers, booleans, null and undefined are value types
> (the value is saved directly in a variable).


That is hardly possible in a loosely typed language.
When you declare
var myVar;
the engine has no idea what data will be stored in myVar. Even if you
initialize it right away:
var myVar = true;
that improves readability but does not help the engine much, because
the very next statement can change the data type:
var myVar = true;
myVar = new Object; // or something else
(This is of course not a good way to benefit from loose typing, yet the
engine has to be ready for such a flip at any moment for any
variable.)
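
A minimal sketch of this point, using only the standard typeof operator.
The name myVar follows the example above; seenTypes is a made-up name
used only for illustration:

```javascript
// The same variable holds values of different types over its lifetime,
// so its type cannot be fixed at the point of declaration.
var seenTypes = [];
var myVar;                          // declared, no value assigned yet
seenTypes.push(typeof myVar);       // "undefined"
myVar = true;
seenTypes.push(typeof myVar);       // "boolean"
myVar = new Object();
seenTypes.push(typeof myVar);       // "object"
myVar = function () {};
seenTypes.push(typeof myVar);       // "function"
console.log(seenTypes.join(", ")); // undefined, boolean, object, function
```

Whatever storage an engine uses for myVar, it has to be able to answer
typeof correctly after each of these assignments.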

That is the end of the generic considerations for *a* script engine:
beyond that, it is a question of how *this* engine or *that* engine
deals with the challenge.

> - How does JavaScript distinguish value types from reference types by
> looking at the variable's content?
> - How many bytes are allocated in memory when I declare the following?
> var i;


I'm not ready to comment on the Gecko engine. In the old Netscape
(described only fragmentarily in this respect in the ECMAScript specs)
and in all versions of Internet Explorer it is solved by creating a
structure like a database record with a MEMO field. So it's a
fixed-size structure, yet able to hold arbitrarily long data by keeping
it outside the record and only referencing it in the record itself.

So, say, on the statement
var i;
Internet Explorer internally creates a new VARIANT object (not the
Variant found in VBA, but the VARIANT which is VT in C++).
This VARIANT object has a data member for the value (an anonymous C++
union) and a data member indicating the type of information stored in
the union. That is why I used the analogy with a database record / MEMO
field (and surely upset all professional C++ programmers).
This VARIANT object also has a data member with a pointer to the
IDispatchEx interface, which extends the base IDispatch interface: that
lets your object be a reference to something else or be a member of
some other object.
It also has a data member with a flag set by the Garbage Collector. Once
in a while the Garbage Collector comes, studies the object and, if no
in-scope references are found, raises this flag ("that is garbage to
remove").

There are a few other fields (like a pointer to IUnknown), but the above
members are the most essential for the core functionality.
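
The "record with a MEMO field" idea can be modelled in JavaScript itself.
This is only a toy sketch of a tagged slot, not the real VARIANT layout
or any engine's actual code; makeSlot, store, load and heap are all
hypothetical names:

```javascript
// Toy model: a fixed-shape slot holds a type tag plus a payload.
// Small values live in the slot; anything else lives in separate
// storage and the slot keeps only an index to it (a stand-in pointer).
var heap = [];

function makeSlot() {
  return { tag: "undefined", payload: undefined };
}

function store(slot, value) {
  var t = value === null ? "null" : typeof value;
  if (t === "undefined" || t === "null" || t === "number" || t === "boolean") {
    slot.tag = t;                    // the value fits in the slot itself
    slot.payload = value;
  } else {
    heap.push(value);                // the value is kept outside the slot
    slot.tag = "ref";
    slot.payload = heap.length - 1;  // the slot only records where it is
  }
}

function load(slot) {
  return slot.tag === "ref" ? heap[slot.payload] : slot.payload;
}

var i = makeSlot();
store(i, 42);
console.log(i.tag, load(i));         // number 42
store(i, { name: "Sam" });
console.log(i.tag, load(i).name);    // ref Sam
```

The tag is what lets the reader of the slot tell an in-place value from
a reference, which is one possible answer to the original question.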

In the context of the script engine this whole structure is called a
scavenger, and it should not be confused with a heap, because the value
itself is not stored here but only referenced.

(The mythological "native ECMAScript object with internal [[values]]"
is an allegorical description of the scavenger, the way ECMA's
free-lancers saw it in 1999 on Netscape 4.x).

From this point forward your variable is ready to be whatever the
programmer's unlimited fantasy makes it at any moment: a primitive
value, an object reference, a function, an object member - it will not
catch the engine off guard.

The byte-exact size of the structure depends on the current OS and on
the bit-base of the OS (16bit, 32bit, 64bit). You may look at C++ specs
for VARIANT to find out.

If you are really interested in the "night life" of the JScript engine
and if you have some knowledge of C++ programming, search MSDN for
IDispatch and IDispatchEx methods. Then you can write a utility to
connect to the engine and watch scavengers appearing, the GC roaming
from scavenger to scavenger, references changing, the garbage being
removed.

P.S. JScript description above is not directly applicable to
JScript.NET
The latter is a rather different beast.

P.P.S. From the optimization point of view it is tempting to create at
least short-lived primitive values as just primitive values, trusting
the programmer to behave reasonably, and in case of a type change to
discard the primitive and create a VARIANT with the same name. This is
pure speculation on my part, but maybe such a mechanism is implemented
in some engines.

 
Sam Kong
      10-14-2006
Hi VK,

VK wrote:
<snip>


Your explanation is amazing.
I almost gave up on this question.
And you rescued me.

Thank you very much.
I'll study some more on the materials you recommended.

Sam

 
VK
      10-14-2006
> VK wrote:
> search MSDN for IDispatch and IDispatchEx methods.


Just noticed an ambiguity here: better to say "IDispatch and
IDispatchEx *and their* methods" - otherwise it can be read as if
IDispatch and IDispatchEx were methods, while they are interfaces.
A correction just for the hell of it...

> Your explanation is amazing.
> I almost gave up on this question.
> And you rescued me.
>
> Thank you very much.
> I'll study some more on the materials you recommended.


You are welcome - and please share if you find something interesting.
In particular, I did not find a clear reference on MSDN for how they
handle short-lived scavengers with primitive values (say, var i in a
for(var i=0;...) loop inside a function).

Overall, collecting such info is a pain in one place (my previous post
summarized info collected from a good dozen sources on MSDN, MSDN
forums and private blogs of IE developers). No vendor is rushing to
disclose all the internal mechanics of their engine, for obvious
security considerations. In this respect the JScript engine is the most
open one; the Gecko engine is the most closed one, however funny that
may sound. Of course Gecko's entire source is open, but where to look
without a single hint? That is way too much of a challenge if you are
not a real C++ professional - and I am not.

 
John G Harris
      10-15-2006
VK writes:

<snip>
>It also has a data member with a flag set by the Garbage Collector. Once
>in a while the Garbage Collector comes, studies the object and, if no
>in-scope references are found, raises this flag ("that is garbage to
>remove").


VK has somehow switched from talking about the data structure that
implements a property value to the data structure that implements an
object.


>There are a few other fields (like a pointer to IUnknown), but the above
>members are the most essential for the core functionality.
>
>All this structure in application to the script engine is called
>scavenger, and it is not to mix with a heap, because the value itself
>is not here but only referenced to.
>
>(The mythological "native ECMAScript object with internal [[values]]"
>is an allegorical description of the scavenger in the way of how ECMA's
>free-lancers saw it in 1999 on Netscape 4.x).

<snip>

For some reason VK won't admit that an object's data structure includes
a field that points to the front of its prototype chain. Either that or
he doesn't like it to be given a name, [[Prototype]], so we can talk
about it even though we can't program it directly.


>You may look at C++ specs
>for VARIANT to find out.

<snip>

You won't find it in the C++ Standard. The next issue will have
something based on boost::variant but its name will certainly not be all
uppercase.


John
--
John Harris
 
VK
      10-15-2006
> >It also has a data member with a flag set by the Garbage Collector. Once
> >in a while the Garbage Collector comes, studies the object and, if no
> >in-scope references are found, raises this flag ("that is garbage to
> >remove").


> VK has somehow switched from talking about the data structure that
> implements a property value to the data structure that implements an
> object.


I didn't switch, because there is no "data structure that implements a
property value" as such - unless we are talking about some abstract
model, while the OP's question was about real implementations. There is
a single data structure, with special fields in it that define what
kind of structure it is and what higher-level relations it has or may
have with other structures. You may take a closer look at these methods
(listed in the order of the actual call sequence):

IDispatchEx::GetDispID
IDispatchEx::GetMemberProperties
IDispatchEx::InvokeEx

> For some reason VK won't admit that an object's data structure includes
> a field that points to the front of its prototype chain. Either that or
> he doesn't like it to be given a name, [[Prototype]], so we can talk
> about it even though we can't program it directly.


If we are talking about ECMA terms then I definitely don't like the
[[Class]] name as applied to Object vs. Function; it is rather
misleading. Even if the relevant field in Netscape's scavenger was
called this way, ECMA could have found some better abstraction for
public reading, say boolean [[MayCall]] or [[MayExecute]]. At the same
time I have nothing against the name [[Prototype]], but there is one
problem with it: if you take a look at the above-mentioned methods, you
don't see any prototypes or prototype chain in there. This is what I was
saying before: either you stay on the language level and talk about
what this given language has - or you go behind the scenes to see how
these higher-level human abstractions are actually implemented. And you
have to be very cautious in mixing both levels, because you may easily
end up in the position of a man calculating electron orbits with
Newton's formulas.
On the level where you so insistently want to go with your [[Prototype]]
there are no prototypes and there is no inheritance at all: neither
prototype-based nor class-based. There are objects connected to each
other in the most varied ways over interfaces. Just one step further
(jumping over assembly) and it will be some raw
00000010 10000000 00000000 00000110...
processor instructions, and from this point it is not really important
anymore what script that is and what other language is possibly running
this script engine.

At the same time I'd like to state publicly that I do believe in
prototype as a vital property of javascript *language* objects; I do
believe that javascript natively supports prototype-based inheritance;
that I love prototype and I never tried to abuse it, either by words
or by disrespectful actions.


<http://www.microsoft.com/mind/1099/dynamicobject/dynamicobject.asp>
can be also interesting

> >You may look at C++ specs
> >for VARIANT to find out.

> <snip>


> You won't find it in the C++ Standard. The next issue will have
> something based on boost::variant but its name will certainly not be all
> uppercase.


Right, it's the Microsoft Automation stuff, so I had better say
"look at Microsoft C++ and C# specs".

 
Michael Winter
      10-16-2006
VK wrote:

Please attribute quoted text properly.

[John G Harris wrote:]
>> VK has somehow switched from talking about the data structure that
>> implements a property value to the data structure that implements
>> an object.

>
> I didn't switch because there is no "data structure that implements a
> property value" as such - unless we are talking about some abstract
> model, while the OP's question was about real implementations.


So far, you have only mentioned details about JScript. Unless you have
examined the implementation of every language (and I /know/ you
haven't), you cannot eliminate possible approaches.

[snip]

> If we are talking about ECMA terms then I definitely don't like
> [[Class]] name in application to Object vs. Function as rather
> misleading.


What? Are you just making up the meaning of things again? It's hard to tell.

> Even if the relevant field in Netscape's scavenger was called this
> way, ECMA could find some better abstraction for the public reading,
> say boolean [[MayCall]] or [[MayExecute]].


Why? That isn't what the [[Class]] property represents.
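
For reference, in ECMAScript 3 the [[Class]] value is a string naming
the kind of object, and Object.prototype.toString returns it wrapped as
"[object " + [[Class]] + "]". A small sketch of reading it that way;
classOf is a hypothetical helper name, not a built-in, and later
editions of the specification define toString differently while giving
the same results for these built-in objects:

```javascript
// Extract the name between "[object " and "]" from the toString result.
function classOf(value) {
  return Object.prototype.toString.call(value).slice(8, -1);
}

console.log(classOf({}));              // Object
console.log(classOf([]));              // Array
console.log(classOf(function () {}));  // Function
console.log(classOf(new Date()));      // Date
```

So the value distinguishes many kinds of object, not merely callable
from non-callable ones.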

> At the same time I have nothing against the name [[Prototype]], but
> there is one problem with it: if you take a look at the
> above-mentioned methods, you don't see any prototypes or prototype
> chain in there.


Whether that is true or not, it doesn't mean much. Again, one
implementation is not representative of all implementations. Moreover,
the prototype chain could be handled internally and wouldn't need to be
exposed directly. If the IDispatchEx interface is merely a COM mechanism
for exposing the scripting engine to a host (it seems to be, but I don't
know, nor do I really care), then why should that host be given direct
access to the prototype chain?

How do you propose that the Object.prototype.isPrototypeOf method
asserts that an object is present within the prototype chain if there is
no prototype chain? How about the instanceof operator? It calls the
internal [[HasInstance]] method, which in turn examines the prototype chain.
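
A short sketch of the behaviour in question. Animal, Dog, rex and
inChain are made-up names for illustration, and Object.getPrototypeOf is
a later (ECMAScript 5) addition used here only to walk the chain by
hand; it says nothing about how any particular engine stores the chain:

```javascript
function Animal() {}
function Dog() {}
Dog.prototype = new Animal();   // ES3-style inheritance setup
var rex = new Dog();

// Hypothetical helper: follow the chain link by link until it ends.
function inChain(proto, obj) {
  var p = Object.getPrototypeOf(obj);
  while (p !== null) {
    if (p === proto) return true;
    p = Object.getPrototypeOf(p);
  }
  return false;
}

console.log(Animal.prototype.isPrototypeOf(rex));   // true
console.log(rex instanceof Animal);                 // true
console.log(inChain(Animal.prototype, rex));        // true
console.log(inChain(Dog.prototype, new Animal()));  // false
```

All three checks have to agree, whatever the engine does internally.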

In case you need reminding (and you often do), the ECMAScript
specification defines behaviour not implementation. No, there doesn't
necessarily need to be a prototype chain, but an implementation must
behave as though there were where that feature has significance. It
seems rather perverse to me to store meta data about properties that
signifies whether those properties are or aren't inherited, or whether a
particular object inherits from another given object, rather than
realising that inheritance.

[snip]

>> You won't find [VARIANT] in the C++ Standard. ...

>
> Right, it's the Microsoft Automation stuff, so I had better say
> "look at Microsoft C++ and C# specs".


Why? It's still not part of the language. A more sensible place to look
would be the COM reference.

Mike
 
Sam Kong
      10-16-2006

Michael Winter wrote:
<snip>


Do you personally have a problem with VK?
I would appreciate it if you were not so aggressive in this public
place.
(If I misunderstood, I apologize.)

Thank you for your complementary explanation.

Sam

 
John G Harris
      10-16-2006
Sam Kong writes:

<snip>
>Do you personally have a problem with VK?


Many of us have a problem with VK. He gives out misleading information
far too often.


>I would appreciate it if you were not so aggressive in this public
>place.

<snip>

I'm afraid you are going to be disappointed. And anyway, 'that' was not
at all aggressive, only firmly stated.

John
--
John Harris
 