pack 'C3U*' not same as pack 'C3(xC)*'

 
 
Alexander Farber
      06-23-2005
Hi,

I have a small card game. The clients are Java-applets and the
server is written in C, mostly forwarding data from applet to applet.

The message format is:

Byte 1: number of Unicode chars in the string (see below)
Byte 2: player number
Byte 3: event id
Up to 510 bytes: a Java Unicode string

Now I'm trying to rewrite my C server in Perl, because that makes
it easier to add features (syslog, auth against an SQL database, etc.).

I'm having trouble working out the best pack format for
my messages. I have read "perldoc -f pack" numerous times, along with
the many O'Reilly books I have, but the best I've come up with is

pack "C3(xC)*", length $ascii_str, $num, $id, unpack "C*", $ascii_str;

for the cases where I need to send an ASCII string (like an IP address
string) from the server to the Java applet and so have to pad each
ASCII character with a zero high byte (that's why the "x" is there).

I wonder, why doesn't pack "C3U*" do the same? Here is a demo:

# perl -e '$str = pack "C3(xC)*", 4, 0, 14, unpack "C*", "test";
print join " ", unpack "C*", $str'

4 0 14 0 116 0 101 0 115 0 116

# perl -e '$str = pack "C3U*", 4, 0, 14, unpack "C*", "test";
print join " ", unpack "C*", $str'

4 0 14 116 101 115 116

As you can see, the padding zeros are missing in the second output.
But why? Doesn't "perldoc -f pack" say:

If you don't want this [UTF8] to happen, you can
begin your pattern with "C0" (or anything else) to force
Perl not to UTF8 encode your string, and then follow
this with a "U*" somewhere in your pattern.

Regards
Alex

PS: I also wonder whether there are any nicer ways to pass
Java strings to Perl. "perldoc -f pack" mentions "n/..."
for Java strings, but doesn't elaborate. Is it "n/U*"?

 
Mark
      06-23-2005
Alexander Farber wrote:
> Hi,
>
> I have a small card game. The clients are Java-applets and the
> server is written in C, mostly forwarding data from applet to applet.
>
> The message format is:
>
> Byte 1: number of Unicode chars in the string (see below)
> Byte 2: player number
> Byte 3: event id
> Up to 510 bytes: a Java Unicode string
>
> Now I'm trying to rewrite my C server in Perl, because that makes
> it easier to add features (syslog, auth against an SQL database, etc.).

<snip>

> PS: I also wonder whether there are any nicer ways to pass
> Java strings to Perl. "perldoc -f pack" mentions "n/..."
> for Java strings, but doesn't elaborate. Is it "n/U*"?


I'd be tempted to use XML as the data format, in fact, I'd probably use
SOAP.

Mark
 
Ilmari Karonen
      06-23-2005
Alexander Farber wrote on 23.06.2005:
>
> The message format is:
>
> Byte 1: number of Unicode chars in the string (see below)
> Byte 2: player number
> Byte 3: event id
> Up to 510 bytes: a Java Unicode string


Your "Java unicode string" is presumably in (big-endian) UCS-2, which
is the representation used internally by Java. This is not how perl
normally encodes Unicode strings.

> I'm having trouble working out the best pack format for
> my messages. I have read "perldoc -f pack" numerous times, along with
> the many O'Reilly books I have, but the best I've come up with is
>
> pack "C3(xC)*", length $ascii_str, $num, $id, unpack "C*", $ascii_str;


This is indeed a perfectly good way to convert ASCII (or ISO Latin 1)
text to UCS-2. If you want to handle characters above 255 as well,
may I suggest something like:

pack "C3n*", length($string), $num, $id, unpack "U*", $string;
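
Here is a minimal sketch of that suggestion wrapped in two helpers. The names encode_msg and decode_msg are made up for illustration. It assumes every character lies in the Basic Multilingual Plane (code point at most 0xFFFF), since "n" only stores 16 bits, and that the string has at most 255 characters so that its length fits in the first byte (which matches the 510-byte limit above):

```perl
use strict;
use warnings;

# Hypothetical helper: build one message laid out as
# [char count][player number][event id][big-endian 16-bit chars].
sub encode_msg {
    my ($num, $id, $string) = @_;
    my @chars = map { ord($_) } split //, $string;    # code points
    die "string too long\n" if @chars > 255;
    die "character outside the BMP\n" if grep { $_ > 0xFFFF } @chars;
    return pack "C3n*", scalar(@chars), $num, $id, @chars;
}

# Hypothetical helper: the reverse of encode_msg.
sub decode_msg {
    my ($msg) = @_;
    my ($len, $num, $id, @chars) = unpack "C3n*", $msg;
    die "length byte does not match payload\n" if $len != @chars;
    return ($num, $id, join "", map { chr($_) } @chars);
}

my $msg = encode_msg(0, 14, "test");
print join(" ", unpack "C*", $msg), "\n";    # 4 0 14 0 116 0 101 0 115 0 116
```

For plain ASCII input this produces the same bytes as the "C3(xC)*" version, but it also keeps the high byte of characters above 255 instead of dropping it.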

> I wonder, why doesn't pack "C3U*" do the same? Here is a demo:


Because pack("U*") encodes the characters in UTF-8, not in UCS-2.
UTF-8 is a variable-length format which encodes ASCII characters in a
single byte and other characters in 2 or more bytes. So if your
original string only contains ASCII characters, it makes no difference
whether you use "U*" or "C*".
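
To make the difference concrete, here is a small sketch that uses the core Encode module (rather than pack) to produce both encodings of the same strings and list the resulting bytes. The helper name bytes_of is made up for this illustration:

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: encode $string in $encoding and list its bytes.
sub bytes_of {
    my ($encoding, $string) = @_;
    return join " ", unpack "C*", encode($encoding, $string);
}

# Pure ASCII: UTF-8 needs one byte per character, UCS-2 always two.
print bytes_of("UTF-8",   "test"), "\n";        # 116 101 115 116
print bytes_of("UCS-2BE", "test"), "\n";        # 0 116 0 101 0 115 0 116

# U+20AC (euro sign): three bytes in UTF-8, still two in UCS-2.
print bytes_of("UTF-8",   "\x{20AC}"), "\n";    # 226 130 172
print bytes_of("UCS-2BE", "\x{20AC}"), "\n";    # 32 172
```

The ASCII case shows why the second demo in the question printed no zero bytes: for those characters the UTF-8 form is byte-for-byte the same as the ASCII form.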

UTF-8 is also the format used by perl to store Unicode strings
internally, although perl hides this fact reasonably well -- in
theory, at least. As perl's Unicode support matures, practice is
gradually starting to approach theory here.

For more information, try googling for UTF-8 and UCS-2.

--
Ilmari Karonen
To reply by e-mail, please replace ".invalid" with ".net" in address.
 