Ruby/Java strings solved the Ruby way

 
 
Charles Oliver Nutter
      07-20-2008
Hello all!

I'm posting this here because it seems like a topic better discussed in
the general Ruby community.

JRuby allows you pretty seamless access to Java libraries through its
Java integration layer. You can pull in just about any class,
instantiate objects, call methods, and so on. In order to make this a
bit easier, in many cases we automatically coerce particular Ruby types
to their equivalent Java types. For example, Fixnum becomes either boxed
integral or primitive integral values, Floats become boxed
floating-point or primitive floating point values, and Strings are
decoded from byte[] into Java String as UTF-8.

But there's a problem with this...it adds a bit of overhead in the
numeric cases, and a *lot* of overhead in the String case.

Here's a comparison between calling a method that takes an int and a
method that takes a String (best times out of five):

with string 'hello': 1.068688
with fixnum 1: 0.563014

And this is a short string. The coercion cost for strings is at least O(n).
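
To make the O(n) claim concrete, here is a rough sketch that runs on any Ruby. It does not touch JRuby's Java integration at all; the `simulated_decode` helper is a made-up stand-in that transcodes UTF-8 bytes to UTF-16, which is roughly the kind of work a decode into a Java String implies. The sizes and iteration count are arbitrary.

```ruby
require 'benchmark'

# Hypothetical stand-in for the per-call decode. This is NOT JRuby's
# actual code path; it only models "walk every byte and build a
# UTF-16 copy", which is what makes the cost grow with string length.
def simulated_decode(str)
  str.encode(Encoding::UTF_16BE)
end

[5, 500, 50_000].each do |n|
  s = 'h' * n
  # Time 100 decodes of the same string, as if it were passed to a
  # Java method 100 times without being coerced ahead of time.
  t = Benchmark.realtime { 100.times { simulated_decode(s) } }
  puts format('%7d bytes: %.6f s', n, t)
end
```

The absolute numbers will vary by machine and Ruby implementation; the point is only that the time grows with the length of the string.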

String coercion is what I'm writing about.

We'll never be able to eliminate the coercion cost entirely. Ruby
Strings are byte[], and implementing our own String and related classes
on top of byte[] has been a great move for us. So there's never
going to be a straight-through path from a Ruby String to a Java String.
But I think we can reduce the impact for JRuby users by doing things the
Ruby way.

Ruby already has a protocol for coercion, via methods like to_str,
to_ary and so on. This allows you to pass e.g. non-Strings to methods
that act on Strings, and frequently (usually) they'll coerce and work
fine. Often, if you want to avoid a coercion hit, you'll create the
String ahead of time. And that's where we can learn from Ruby for Java
String handling.
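
For anyone who hasn't used the coercion protocol directly, a minimal plain-Ruby sketch follows. The `Label` class is invented for illustration; `to_str` and the way `String#+` uses it are standard Ruby.

```ruby
# A non-String that opts in to implicit String coercion via to_str.
class Label
  def initialize(text)
    @text = text
  end

  # Core String methods call to_str on arguments that are not Strings.
  def to_str
    @text
  end
end

label = Label.new('world')

# String#+ coerces its argument through to_str, so this just works.
greeting = 'hello ' + label
puts greeting

# To avoid paying for the coercion on every call, create the String
# once ahead of time and reuse it.
text = label.to_str
3.times { puts 'hello ' + text }
```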

So I propose that instead of always decoding incoming Ruby String into a
Java String when calling a Java method, we introduce a new type--call it
JString for now--that represents a Java string. When you require in the
Java integration support, it would add to Ruby String a method
to_jstring (or to_String or hey, toString?). So for calls from Ruby to
Java, we'd follow Ruby coercion protocols and only accept either JString
or objects that coerce to JString.
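
A rough sketch of how the Ruby-to-Java direction might look, written in plain Ruby so it runs anywhere. Everything here is hypothetical: `JString` is a mock, `call_java_method` stands in for the Java call boundary, and the UTF-16 transcode only simulates the decode. It also assumes one reading of the proposal, namely that a plain Ruby String is still accepted because it responds to `to_jstring`.

```ruby
# Hypothetical stand-in for the proposed JString; not a real JRuby class.
class JString
  attr_reader :utf16

  def initialize(ruby_string)
    # The O(n) decode is paid once, here (simulated as a transcode).
    @utf16 = ruby_string.encode(Encoding::UTF_16BE).freeze
  end

  # A JString is already coerced, so it returns itself.
  def to_jstring
    self
  end
end

# What requiring Java integration would add to Ruby's String.
class String
  def to_jstring
    JString.new(self)
  end
end

# Mock call boundary: accept a JString or anything that coerces to one.
def call_java_method(arg)
  unless arg.respond_to?(:to_jstring)
    raise TypeError, "can't convert #{arg.class} into JString"
  end
  jstr = arg.to_jstring
  jstr.utf16.bytesize / 2 # pretend Java method: length in UTF-16 units
end

puts call_java_method('hello')     # coerced at the boundary on each call

pre = 'hello'.to_jstring           # coerced once, ahead of time
3.times { puts call_java_method(pre) }
```

Passing something with no `to_jstring`, such as an Integer, raises a TypeError in this mock, mirroring how core String methods reject arguments without `to_str`.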

Likewise, coming from Java to Ruby, we wouldn't automatically coerce;
we'd return a JString object that implements to_str. You can then
usually pass that to String APIs, or just coerce it immediately and go
on with your business. Since this latter change would break some apps
that expect Java strings to always be coerced, it would be saved for the
next major release of JRuby and thoroughly discussed.
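
The Java-to-Ruby direction could be sketched the same way. Again this is a plain-Ruby mock under assumptions: `JString` and `java_method_returning_string` are invented names, and the stored UTF-16 bytes merely stand in for a real Java String.

```ruby
# Hypothetical JString as it might come back from a Java call;
# not a real JRuby class.
class JString
  def initialize(utf16_bytes)
    @utf16 = utf16_bytes
  end

  # Implicit-conversion hook: decode to a Ruby String only on demand.
  def to_str
    @utf16.encode(Encoding::UTF_8)
  end
end

# Stand-in for a Java method that returns a string.
def java_method_returning_string
  JString.new('from java'.encode(Encoding::UTF_16BE))
end

result = java_method_returning_string

# Passing it straight to a String API: String#+ coerces through to_str.
puts 'got: ' + result

# Or coerce it immediately and carry on with a normal Ruby String.
ruby_string = result.to_str
puts ruby_string.upcase
```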

I think this model provides the best possible experience when calling
Java from Ruby but also allows JRuby users to take control of the
coercion process, either by defining their own to_jstring methods on
other types, or by pre-coercing strings they intend to use a lot.
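
As an illustration of defining to_jstring on another type, here is one possible shape, in plain Ruby with a mock `JString`. The `Customer` class and the caching strategy are assumptions made for the example, not part of the proposal.

```ruby
# Hypothetical stand-in for the proposed JString; not a real JRuby class.
class JString
  def initialize(ruby_string)
    @utf16 = ruby_string.encode(Encoding::UTF_16BE) # simulated decode
  end

  def to_jstring
    self
  end
end

# A user-defined type that controls its own coercion.
class Customer
  def initialize(name)
    @name = name
  end

  # Decode once and cache, so repeated Java calls reuse the same
  # JString. Assumes @name never changes after construction.
  def to_jstring
    @jname ||= JString.new(@name)
  end
end

customer = Customer.new('Ada')
first = customer.to_jstring
second = customer.to_jstring
puts first.equal?(second) # same object both times, so only one decode
```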

Thoughts?

- Charlie

 
Jim Menard
      07-21-2008
On Sun, Jul 20, 2008 at 1:36 PM, Charles Oliver Nutter
<(E-Mail Removed)> wrote:
[snip]
> So I propose that instead of always decoding incoming Ruby String into a
> Java String when calling a Java method, we introduce a new type--call it
> JString for now--that represents a Java string. When you require in the Java
> integration support, it would add to Ruby String a method to_jstring (or
> to_String or hey, toString?). So for calls from Ruby to Java, we'd follow
> Ruby coercion protocols and only accept either JString or objects that
> coerce to JString.
>
> Likewise, coming from Java to Ruby, we wouldn't automatically coerce; we'd
> return a JString object that implements to_str. You can then usually pass
> that to String APIs, or just coerce it immediately and go on with your
> business. Since this latter change would break some apps that expect Java
> strings to always be coerced, it would be saved for the next major release
> of JRuby and thoroughly discussed.


This sounds like an excellent compromise. I vote for to_jstring
because it looks most Ruby-esque.

Jim
--
Jim Menard, (E-Mail Removed)
http://www.io.com/~jimm/

 
Charles Oliver Nutter
      07-21-2008
Jim Menard wrote:
> On Sun, Jul 20, 2008 at 1:36 PM, Charles Oliver Nutter
> <(E-Mail Removed)> wrote:
> [snip]
>> So I propose that instead of always decoding incoming Ruby String into a
>> Java String when calling a Java method, we introduce a new type--call it
>> JString for now--that represents a Java string. When you require in the Java
>> integration support, it would add to Ruby String a method to_jstring (or
>> to_String or hey, toString?). So for calls from Ruby to Java, we'd follow
>> Ruby coercion protocols and only accept either JString or objects that
>> coerce to JString.
>>
>> Likewise, coming from Java to Ruby, we wouldn't automatically coerce; we'd
>> return a JString object that implements to_str. You can then usually pass
>> that to String APIs, or just coerce it immediately and go on with your
>> business. Since this latter change would break some apps that expect Java
>> strings to always be coerced, it would be saved for the next major release
>> of JRuby and thoroughly discussed.

>
> This sounds like an excellent compromise. I vote for to_jstring
> because it looks most Ruby-esque.


Also up for debate is whether boxed primitives from Java should behave
the same way, with a JInteger, JFloat, and so on that can coerce to
Fixnum or Float. But boxed primitives are considerably cheaper to coerce
than Strings, so it may not be worth it.

- Charlie

 