HTML and right-to-left writing...

 
 
Daniel Bleisteiner
 
      05-26-2004
I'm having some trouble understanding HTML's dir="rtl" attribute. Maybe
you can clear things up for me...

I have to evaluate the current possibilities for using HTML forms for the
Arabic language and found two different things related to that topic. The
first is the dir="rtl" attribute, which can be used on many HTML elements
such as TEXTAREA. As I understand it, the attribute should cause the text
to be written from right to left... possibly right-aligned at the same
time. But it only does the alignment - NOT the text order. When I type
something into the textarea, the characters get appended to the right end
of the text.

Have a look at the following example: http://www.da3x.de/RTL.html

As far as I understand it, the characters should also be added to the left
end of the text. But all browsers behave the same way, with small
differences concerning the exclamation mark of the last sentence (IE and
Mozilla put this and ONLY this last mark at the left end - Opera doesn't!).

Do all browsers make the same error, or is my understanding wrong? I'd like
to know what Arabic speakers think about that! Shouldn't all newly typed
characters be appended to the left end of the string?

Another element in HTML is BDO, which can be used as <bdo
dir="rtl">test</bdo>. This REVERSES the text, as I'd also expect from the
"dir" attribute - but it doesn't affect the textarea, no matter how my HTML
is constructed. I'd really like to clear this up because I need to
implement some routines in my server system and I need clarity on these
GUI topics.

Thanks for all your help!

--
Daniel Bleisteiner
 
rf
 
      05-26-2004

"Daniel Bleisteiner" <(E-Mail Removed)> wrote in message
news(E-Mail Removed)-online.de...
> I have several understanding problems with HTML and the dir="rtl"
> attribute. Maybe you can clear things up for me...
>
> I have to evaluate the current possibilities for using HTML forms for the
> arabic language and found two different things related to that topic. The
> first is the dir="rtl" attribute which can be used for many HTML tags like
> TEXTAREA and others. From my understanding the attribute should cause the
> text to be written from right to left... possibly right-aligned at the
> same time. It only does the alignment - NOT the textorder. When I type
> something in the textarea the characters get appended to the right end of
> the text.
>
> Have a look at the following example: http://www.da3x.de/RTL.html


You are missing something.

The order of characters typed into an element is at a level below dir="rtl",
indeed it is at a level below the browser.

Basically, all modern multilingual applications use the standard built-in
multilingual capabilities of the operating system, as do all of the common
Windows controls (e.g. an Edit control for a textarea). There is an entire
multilingual subsystem in there, with its own API and its own quirks.

You will never get English characters to type in right to left because the
OS knows that English characters are left to right.
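
The "knows" part comes from the Unicode character database, where every
character carries a bidirectional class. A minimal sketch, assuming Python 3
and its standard unicodedata module (the helper name bidi_classes is made up
for illustration), prints the class of a Latin letter, an Arabic letter and
the exclamation mark:

```python
import unicodedata

def bidi_classes(text):
    """Return (character name, bidirectional class) pairs for each character."""
    return [(unicodedata.name(ch), unicodedata.bidirectional(ch)) for ch in text]

# 'L'  = strong left-to-right
# 'AL' = Arabic letter (strong right-to-left)
# 'ON' = other neutral, which takes its direction from the surrounding text
for name, cls in bidi_classes("a\u0628!"):
    print(f"{cls:3} {name}")
```

The neutral class of "!" is most likely why a trailing exclamation mark can
jump to the left end inside a dir="rtl" element while the Latin letters stay
in left-to-right order.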

If you install, for example, Arabic language support (*), then when you
switch to an Arabic keyboard layout you will find that your input runs
right to left.

(*) XP supports all languages, though right-to-left and complex-script
support may first have to be switched on in the regional and language
settings. 2000 supports them once you have installed the relevant language
support. 98 requires you to install a specifically localised version of the
OS. Once again, this is not a browser function, it is part of the OS. The
browser uses the OS's functionality.


 
Daniel Bleisteiner
 
      05-26-2004
On Wed, 26 May 2004 11:03:16 GMT, rf <(E-Mail Removed)> wrote:

> You are missing something.


Okay, this makes sense... and it means that I have no reliable way to test
my implementation without an Arabic-configured computer system. I've tried
to switch to Arabic in my WinXP settings but the language was not
available... it seems I need some special update to get this done.

One elementary question for me is how a string entered into an Arabic
text field is sent to the server.

Example:
An Arabic user types "test" into a text field, which displays the text as
"tset" to him.
When retrieving the form fields with a CGI script... will the text be
sent as "test" or "tset"? I suppose it's "test"... but I want to be sure...

--
Daniel Bleisteiner
 
 
rf
 
      05-26-2004

"Daniel Bleisteiner" <(E-Mail Removed)> wrote in message
news(E-Mail Removed)-online.de...
> On Wed, 26 May 2004 11:03:16 GMT, rf <(E-Mail Removed)> wrote:
>
> > You are missing something.

>
> Okay, this makes sense... and it means that I have no reliable way to test
> my implementation when not using an arabic configured computer system.
> I've tried to change to arabic using my WinXP settings but the language
> was not available... seems as I need some special update to get this done.


No, you just need to enable support for the language. Look up the help
files/install instructions, it's in there. I can't tell you exactly, it's
been a couple of months since I last installed an XP system.

> One elementary question for me is how a string entered into an arabic
> textfield is send to the server.


It would be sent in the order it was typed in. That is all you need to know
and that is the way you store it.

This is the way all the common controls (Edit for a textarea) store it.

The presentational ordering of the characters happens at display time, that
is, when the characters are actually drawn to the display surface. You, as
somebody who stores what the user has typed in, do not need to know how the
characters are displayed. You store them as they come. The operating system
(yes, Windows, not the browser) determines the order they are displayed on
the canvas.
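
To make the "stored as typed" point concrete, here is a small sketch in
Python 3. It assumes the form is submitted as UTF-8 (a page in another
encoding such as windows-1256 would produce different bytes, but the same
character order), and the field name "comment" and the sample word are
invented for the example:

```python
from urllib.parse import urlencode, parse_qs

# Hypothetical sample input: an Arabic word typed letter by letter
# as meem, reh, hah, beh, alef.
typed = "\u0645\u0631\u062D\u0628\u0627"

# Assumption: the browser submits the form as percent-encoded UTF-8.
body = urlencode({"comment": typed})
print(body)

# What a CGI script would see after decoding the request body.
received = parse_qs(body)["comment"][0]
print([f"U+{ord(ch):04X}" for ch in received])

# The code points arrive in typing (logical) order, not in display order.
assert received == typed
```

The first code point received is the first key the user pressed, even though
that letter is drawn at the right-hand end of the field.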

Go over to microsoft.com and search for "Uniscribe". It's the underlying
subsystem I mentioned earlier.


 
 
Daniel Bleisteiner
 
      05-26-2004
On Wed, 26 May 2004 11:51:46 GMT, rf <(E-Mail Removed)> wrote:

> The presentaional order of the characters happens at display time, that
> is when the characters are actually drawn to the display surface. You, as
> somebody who stores what the user has typed in do not need to know how
> the characters are displayed. You store them as they come. The operating
> system (yes, windows, not the browser) determines the order they are
> displayed on the canvas.


Thanks for the clarification! But I have to know how they are stored
because I need to render them in the software I'm developing. That's why
I'm asking about these details. I have to generate the proper PostScript
that displays the text.

--
Daniel Bleisteiner
 
 
Mark Parnell
 
      05-26-2004
On Wed, 26 May 2004 11:03:16 GMT, "rf" <(E-Mail Removed)> declared in
alt.html:

> You are missing something.


Richard! Welcome back!

--
Mark Parnell
http://www.clarkecomputers.com.au
 
 
rf
 
      05-26-2004

"Daniel Bleisteiner" <(E-Mail Removed)> wrote in message
news(E-Mail Removed)-online.de...
> On Wed, 26 May 2004 11:51:46 GMT, rf <(E-Mail Removed)> wrote:
>
> > The presentaional order of the characters happens at display time, that
> > is when the characters are actually drawn to the display surface. You,

> > as somebody who stores what the user has typed in do not need to know how
> > the characters are displayed. You store them as they come. The operating
> > system (yes, windows, not the browser) determines the order they are
> > displayed on the canvas.

>
> Thanks for the clearance! But I have to know how they are stored


They are stored in the order they were typed in by the user.
User types in [a][b][c], that is what is stored.
TextOut, given the entire string, renders [c][b][a].

More to the point, given [A][B][C][a][b][c], where upper case is English and
lower case is Arabic, TextOut would render
[A][B][C][c][b][a].
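
As a toy illustration of that reordering (and nothing more), the sketch below
uses the same convention: upper case stands for left-to-right letters, lower
case for right-to-left ones. The function name visual_order is invented, a
left-to-right base direction is assumed, and this is a deliberate
simplification, not the real Unicode bidirectional algorithm, which also
deals with digits, neutrals, nesting and mirrored characters:

```python
from itertools import groupby

def visual_order(logical):
    """Reverse each run of 'right-to-left' (lower-case) characters in place."""
    out = []
    # groupby splits the string into consecutive runs of the same direction.
    for is_rtl, run in groupby(logical, key=str.islower):
        chars = list(run)
        out.extend(reversed(chars) if is_rtl else chars)
    return "".join(out)

print(visual_order("abc"))     # cba
print(visual_order("ABCabc"))  # ABCcba
```

The stored string is never changed; only the drawing order is computed from
it, which is why a PostScript generator would need to do this step itself.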

> because I
> need to render them using the software I'm developing.


Ah, I see. In that case you still don't care. You simply pass the string of
characters, in the sequence they were entered, to TextOut, like above.
TextOut is a Uniscribe-enabled API. Uniscribe takes care of all the layout
and the glyph replacement (*). You don't have to worry about it.

> That's why I'm
> asking those details. I have to generate the proper PostScript which
> displays the text.


Don't know about PostScript, never used it. However, if you cannot use the
TextOut API or some equivalent to render the string, then your software must
itself be Uniscribe-enabled. While this is not a trivial exercise (mainly
because of the lack of documentation) it is not too hard. 20 or 30 lines of
C++ code will do it.

(*) You are aware that certain character glyphs are replaced by others,
depending on the character's position in relation to other characters? It's
even worse in Thai. Certain characters are actually split into two separate
glyphs which surround the following character. For example, type in an [a]
and then a [b] and you end up with
[a1][b][a2]. Handling all of this is *not* a trivial exercise. The bloke at
Microsoft that wrote Uniscribe took two or three years to get it right.
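
The glyph replacement for Arabic can be glimpsed from the Unicode character
database. The following sketch, assuming Python 3 and the standard
unicodedata module (the helper presentation_forms is made up for this post),
lists the positional glyph variants that Unicode records for a single stored
letter, here BEH (U+0628). A real shaping engine picks the glyphs from the
font rather than from these compatibility code points, so treat it as an
illustration only:

```python
import unicodedata

def presentation_forms(letter):
    """Map position name -> presentation-form character for one Arabic letter."""
    base = f"{ord(letter):04X}"
    forms = {}
    # Arabic Presentation Forms-B block: U+FE70..U+FEFF
    for cp in range(0xFE70, 0xFF00):
        # Entries look like "<initial> 0628": a tag, then the base code point.
        parts = unicodedata.decomposition(chr(cp)).split()
        if len(parts) == 2 and parts[1] == base:
            forms[parts[0].strip("<>")] = chr(cp)
    return forms

for position, glyph in presentation_forms("\u0628").items():
    print(f"{position:9} U+{ord(glyph):04X} {unicodedata.name(glyph)}")
```

One stored character, four possible shapes depending on its neighbours:
that choice is exactly what has to be made at rendering time.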


 
 
Mark Parnell
 
      05-26-2004
On Wed, 26 May 2004 23:04:50 GMT, "rf" <(E-Mail Removed)> declared in
alt.html:

> The bloke at
> Microsoft that wrote uniscribe took two or three years to get it right.


Microsoft got something right?

--
Mark Parnell
http://www.clarkecomputers.com.au
 
 
rf
 
      05-27-2004

"Mark Parnell" <(E-Mail Removed)> wrote in message
news:1r5kt7e3f4n8l.1059n1yui618e$(E-Mail Removed).. .
> On Wed, 26 May 2004 11:03:16 GMT, "rf" <(E-Mail Removed)> declared in
> alt.html:
>
> > You are missing something.

>
> Richard! Welcome back!


Cheers mate


 
 
rf
 
      05-27-2004

"Mark Parnell" <(E-Mail Removed)> wrote in message
news:12qm2k030ph0x$(E-Mail Removed)...
> On Wed, 26 May 2004 23:04:50 GMT, "rf" <(E-Mail Removed)> declared in
> alt.html:
>
> > The bloke at
> > Microsoft that wrote uniscribe took two or three years to get it right.

>
> Microsoft got something right?


Yep. They got the unicode bit right. However, they did the usual thing with
their only example of how to use unicode: The example is so brain dead that
it will wordwrap a fullstop onto the next line
..

Cheers
Richard.


 