Velocity Reviews - Computer Hardware Reviews

JSON.parse

 
 
Douglas Crockford
 
      12-29-2005
There is a new version of JSON.parse in JavaScript. It is vastly
faster and smaller than the previous version. It uses a single call to
eval to do the conversion, guarded by a single regexp test to assure
that it is safe.

JSON.parse = function (text) {
    return (/^(\s|[,:{}\[\]]|"(\\["\\bfnrtu]|[^\x00-\x1f"\\])*"|-?\d+(\.\d*)?([eE][+-]?\d+)?|true|false|null)+$/.test(text))
        && eval('(' + text + ')');
};

It is ugly, but it is really efficient. See
http://www.crockford.com/JSON/js.html
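
As a rough illustration of the guard-then-eval approach described above, the sketch below wraps the same regexp in a function with the hypothetical name `guardedParse` (so that any built-in `JSON` object is left alone) and feeds it one well-formed text and one that the regexp rejects. The sample inputs are assumptions chosen for illustration; they do not come from the post.

```javascript
// Hypothetical name; the post assigns the function to JSON.parse instead.
function guardedParse(text) {
    // The regexp from the post: the whole text must consist of whitespace,
    // punctuation, string literals, numbers, or the words true, false, null.
    var safe = /^(\s|[,:{}\[\]]|"(\\["\\bfnrtu]|[^\x00-\x1f"\\])*"|-?\d+(\.\d*)?([eE][+-]?\d+)?|true|false|null)+$/;
    // Only text that passes the test ever reaches eval.
    return safe.test(text) && eval('(' + text + ')');
}

var parsed = guardedParse('{"a": [1, 2.5e3, true, null]}');
console.log(parsed.a[1]);              // 2500
console.log(guardedParse('alert(1)')); // false: rejected before eval runs
```

Note that `return` and the expression it returns have to start on the same line; otherwise automatic semicolon insertion makes the function return undefined.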
 
VK
 
      12-29-2005

Douglas Crockford wrote:
> (/^(\s|[,:{}\[\]]|"(\\["\\bfnrtu]|[^\x00-\x1f"\\])*"|-?\d+(\.\d*)?([eE][+-]?\d+)?|true|false|null)+$/.test(text))
> && eval('(' + text + ')');
> };


Far from being a RegExp guru - truly, sincerely not :-)

For a static RegExp, isn't it more efficient at runtime to
precompile it?

var re = /regexp/;
....
re.test(string);
....

 
Lasse Reichstein Nielsen
 
      12-29-2005
Douglas Crockford writes:

....
> guarded by a single regexp test to assure that it is safe.
>
> JSON.parse = function (text) {
> return
> (/^(\s|[,:{}\[\]]|"(\\["\\bfnrtu]|[^\x00-\x1f"\\])*"|-?\d+(\.\d*)?([eE][+-]?\d+)?|true|false|null)+$/.test(text))


Looks reasonable (but a comment stating what it is supposed to match
would make it much more readable).

For efficiency, I'd change \s to \s+.

If the regexp doesn't match, then false is returned. This can also
be the value of the JSON expression. Perhaps it would be safer to
return undefined if the test fails, i.e.,
re.test(text) ? eval("("+text+")") : undefined;
or
if(re.test(text)) { return eval("("+text+")"); }
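
To make the ambiguity concrete, here is a minimal sketch comparing the two return styles. The function names `parseWithAnd` and `parseOrUndefined` and the sample inputs are assumptions for illustration only; the regexp is the one from the original post.

```javascript
// The guard regexp from the original post, created once and reused.
var guardRx = /^(\s|[,:{}\[\]]|"(\\["\\bfnrtu]|[^\x00-\x1f"\\])*"|-?\d+(\.\d*)?([eE][+-]?\d+)?|true|false|null)+$/;

// Original style: a failed test and the JSON text "false" both yield false.
function parseWithAnd(text) {
    return guardRx.test(text) && eval('(' + text + ')');
}

// Suggested style: a failed test yields undefined instead.
function parseOrUndefined(text) {
    return guardRx.test(text) ? eval('(' + text + ')') : undefined;
}

console.log(parseWithAnd('false'));        // false (a valid JSON value)
console.log(parseWithAnd('not json'));     // false (rejected by the guard)
console.log(parseOrUndefined('false'));    // false
console.log(parseOrUndefined('not json')); // undefined
```

With the second style a caller can tell a rejected text from a parsed `false` by comparing the result against undefined, which is not a value any JSON text produces.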

Also, you could move the creation of the RegExp object out of the
function, and reuse it for each call, instead of creating a new,
lengthy, RegExp for each call. However, that is only important if
calls are frequent, which they probably shouldn't be anyway.

/L
--
Lasse Reichstein Nielsen
DHTML Death Colors: <URL:http://www.infimum.dk/HTML/rasterTriangleDOM.html>
'Faith without judgement merely degrades the spirit divine.'
 
Rob Williscroft
 
      12-29-2005
Lasse Reichstein Nielsen wrote in comp.lang.javascript:

> Also, you could move the creation of the RegExp object out of the
> function, and reuse it for each call, instead of creating a new,
> lengthy, RegExp for each call. However, that is only important if
> calls are frequent, which they probably shouldn't be anyway.
>


VK stated something similar to this too.

But AIUI a RegExp that comes from a /.../ expression is (supposed
to be) compiled when the function body is compiled, IOW it only
happens once, the /.../ expression having been replaced with a
compiled RegExp object.

Rob.
--
http://www.victim-prime.dsl.pipex.com/
 
bwucke@gmail.com
 
      12-29-2005

Lasse Reichstein Nielsen wrote:
> Looks reasonable (but a comment stating what it is supposed to match
> would make it much more readable).
>
> For efficiency, I'd change \s to \s+.


Please, oh please, don't sacrifice unambiguity for grammatical correctness.
"." matches any single character. The above would mean a sequence of one
or more whitespace characters followed by a single arbitrary character (...and
then the rest of the regexp), which not only slows the regexp down instead of
speeding it up, but also changes its meaning.
It took me a while to understand that the "." means the end of the sentence
here.

For efficiency, I'd change "\s" to "\s+".

Sure the rules of English state the final dot should go INSIDE the
quotation marks, but that would be even worse.

 
VK
 
      12-29-2005

Douglas Crockford wrote:
> There is a new version of JSON.parse in JavaScript. It is vastly
> faster and smaller than the previous version. It uses a single call to
> eval to do the conversion, guarded by a single regexp test to assure
> that it is safe.
>
> JSON.parse = function (text) {
> return
> (/^(\s|[,:{}\[\]]|"(\\["\\bfnrtu]|[^\x00-\x1f"\\])*"|-?\d+(\.\d*)?([eE][+-]?\d+)?|true|false|null)+$/.test(text))
> && eval('(' + text + ')');
> };
>
> It is ugly, but it is really efficient. See
> http://www.crockford.com/JSON/js.html


Semi-irrelevant to this post but important to know:

1. JSON engine versioning
Should the above be considered JSON 1.01, JSON 1.1, JSON 2.0, or
something else?
This is crucial for benchmark references and proper download references.

2. In light of recent events (such as JSON becoming one of the official
data interfaces of Yahoo!), does the author plan to change the licensing
in any way? (I hope not.)

3. While leaving the JSON engine in the public domain, would it be possible
to narrow the covering license? So far JSON is distributed under the
proprietary clause "The Software shall be used for Good, not Evil." As good
as it is, would it be possible to move the software to one of the more
legally specific free software licenses, such as the GNU General Public
License or another well-defined copyleft license? If that is not desirable,
could the author collaborate on a definition of Evil as applied to JSON?
Say, non-ECMA-compliant code or no Firefox support: would that be evil? Or
does the license mean Evil in the social and religious senses only?

I'm not trying to be nasty - but sometimes a single dot can cause big
trouble.

 
Rob Williscroft
 
      12-29-2005
Thomas 'PointedEars' Lahn wrote in comp.lang.javascript:

> Rob Williscroft wrote:
>
>> Lasse Reichstein Nielsen wrote [...]:
>>> Also, you could move the creation of the RegExp object out of the
>>> function, and reuse it for each call, instead of creating a new,
>>> lengthy, RegExp for each call. However, that is only important if
>>> calls are frequent, which they probably shouldn't be anyway.
>>
>> VK stated something similar to this too.
>>
>> But AIUI a RegExp that comes from a /.../ expression is (supposed
>> to be) compiled when the function body is compiled, IOW it only
>> happens once, the /.../ expression having been replaced with a
>> compiled RegExp object.
>
> `/.../' is equivalent to `new RegExp(...)', see ECMAScript (ES) 3,
> 7.8.5. There is a RegExp object created on each call and GC'd shortly
> after, so it is more efficient to create that object once and make it
> globally available. To avoid spoiling the global namespace and attach
> the object reference to the method that uses it, I wrote
>
> JSON.parse = function(...) { ... JSON.parse.rx ... };
> JSON.parse.rx = /.../


Thanks for the reference,

Standard ECMA-262 3rd Edition - December 1999

7.8.5 Regular Expression Literals

A regular expression literal is an input element that is converted
to a RegExp object (section 15.10) when it is scanned. The object
is created before evaluation of the containing program or function
begins. Evaluation of the literal produces a reference to that object;
it does not create a new object. ...

The above confirms my "AIUI" above, and confirms that there *isn't*
a "new RegExp object created on each call".
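
The passage quoted above is from the 3rd Edition. To the best of my knowledge, the 5th Edition and later changed 7.8.5 so that a literal is converted to a RegExp object each time it is evaluated, so the observable behaviour depends on which edition an engine follows. A small sketch for checking a given engine follows; the function names and the placeholder pattern are assumptions for illustration.

```javascript
// Returns the object produced by evaluating one and the same literal.
function makeLiteral() {
    return /ab+c/; // placeholder pattern, not the JSON guard regexp
}

var first = makeLiteral();
var second = makeLiteral();
// Per the ES3 text quoted above this would be true;
// engines following ES5 or later print false.
console.log(first === second);

// Attaching the RegExp to the function, in the style suggested in the
// thread, gives a single shared object regardless of edition.
function matches(text) {
    return matches.rx.test(text);
}
matches.rx = /ab+c/;
console.log(matches('xabbcx')); // true
```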

Has this version (ECMA-262) been superseded?
>
> However, it should be taken into account that RegExp.prototype.test()
> is doing very much the same as RegExp.prototype.exec() does (ES3,
> 15.10.6.3) and so it may not be wise to use a globally available
> RegExp object that retains the status of the last match.
>


This shouldn't be a problem for a RegExp that only ever has test()
called on it (as with the OP's code) as AFAICT exec() will only
ever reset the lastIndex property to 0 (which is the default anyway).

Rob.
--
http://www.victim-prime.dsl.pipex.com/
 
Thomas 'PointedEars' Lahn
 
      12-29-2005
Rob Williscroft wrote:

> Thomas 'PointedEars' Lahn wrote [...]:
>> Rob Williscroft wrote:
>>> But AIUI a RegExp that comes from a /.../ expression is (supposed
>>> to be) compiled when the function body is compiled, IOW it only
>>> happens once, the /.../ expression having been replaced with a
>>> compiled RegExp object.
>>
>> `/.../' is equivalent to `new RegExp(...)', see ECMAScript (ES) 3,
>> 7.8.5. There is a RegExp object created on each call and GC'd shortly
>> after, so it is more efficient to create that object once and make it
>> globally available. To avoid spoiling the global namespace and attach
>> the object reference to the method that uses it, I wrote
>>
>> JSON.parse = function(...) { ... JSON.parse.rx ... };
>> JSON.parse.rx = /.../

>
> Thanks for the reference,
>
> Standard ECMA-262 3rd Edition - December 1999
>
> 7.8.5 Regular Expression Literals
>
> A regular expression literal is an input element that is converted
> to a RegExp object (section 15.10) when it is scanned. The object
> is created before evaluation of the containing program or function
> begins. Evaluation of the literal produces a reference to that object;
> it does not create a new object. ...
>
> The above confirms my "AIUI" above, and confirms that there *isn't*
> a "new RegExp object created on each call".


Yes, indeed. Somehow I overlooked the following sentences all the time,
and it appears I was not the only one here. Thank /you/ for pointing that
out.

> Has this version (ECMA-262)


It is ECMA-262 (ECMAScript) _Edition_ 3, actually.

> been superseeded ?


There are PDF and Microsoft Word versions of the ECMAScript Language
Specification that have 3 more pages (ref. PDF versions), are titled
"Edition 3 Final", and are dated March 24, 2000 inside. (They state that
they can be downloaded from ftp.ecma.ch. However, [ftp.]ecma.ch no longer
exists and ftp.ecma-international.org does not appear to allow anonymous
login.)

These can be downloaded from

<URL:http://www.mozilla.org/js/language/>

Although it does not appear to include the required corrections mentioned
in the errata, the "Final" addition and the date indicate that this is the
latest revision published by the ECMA; it is unclear why only the December
1999 revision is linked on ecma-international.org. (Maybe the mozilla.org
folks have access to more recent information on ECMA's FTP server because
the Mozilla Foundation is an ECMA member.) A text comparison between the
two revisions I did today is inconclusive as yet.

However, whether it should be considered normative or not, that latest
revision says the same as its predecessor; you are correct.

>> However, it should be taken into account that RegExp.prototype.test()
>> is doing very much the same as RegExp.prototype.exec() does (ES3,
>> 15.10.6.3) and so it may not be wise to use a globally available
>> RegExp object that retains the status of the last match.

>
> This shouldn't be a problem for a RegExp that only ever has test()
> called on it (as with the OP's code) as AFAICT exec() will only
> ever reset the lastIndex property to 0 (which is the default anyway).


No, it could pose a problem since the next match will start from the
position the `lastIndex' property indicates. The value of that property is
reset to 0 iff "I < 0 or I > length" (15.10.6.2, step 6), where according to
step 2 `length' refers to the length of the string the method is passed.
It is unclear what `I' refers to; known implementations suggest that this
is a typo not covered in the errata and actually `i' is meant. If we
assume this, `i' would be the value of ToInteger(lastIndex), according to
step 4, which is in fact the behavior of those implementations. That means
previous calls of RegExp.prototype.exec() on the same RegExp object do
affect the current call on the same object, unless

| 5. If the global property is false, let i = 0.

According to 15.10.4.1,

| The global property of the newly constructed object is set to a Boolean
| value that is true if F contains the character "g" and false otherwise.

So it does not pose a problem _here_, as Douglas is not using a global
expression (and the expression is anchored on both sides anyway.)
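
The difference the global flag makes can be sketched as follows. Both regexps and the input string are hypothetical examples chosen for illustration, not the expression from the original post.

```javascript
var plainRx = /^\d+$/;   // no "g" flag, anchored like the regexp in the thread
var globalRx = /\d+/g;   // hypothetical global regexp for comparison

// Without "g", every call starts matching at index 0.
var plainFirst = plainRx.test('123');
var plainSecond = plainRx.test('123');
console.log(plainFirst, plainSecond); // true true

// With "g", a successful match leaves lastIndex behind ...
var globalFirst = globalRx.test('123');
var indexAfterFirst = globalRx.lastIndex;
// ... so the next call starts at index 3, fails, and resets lastIndex.
var globalSecond = globalRx.test('123');
var indexAfterSecond = globalRx.lastIndex;
console.log(globalFirst, indexAfterFirst);   // true 3
console.log(globalSecond, indexAfterSecond); // false 0
```

So a shared RegExp object is only a concern when it carries the "g" flag; without it, repeated test() calls on the same object give the same answer for the same input.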


PointedEars
 