LTR/RTL direction and language tags

Bèrto ëd Sèra

The application for which I'm using AIDA as a GUI stores text and multimedia in some 400 different languages, and we plan to grow to cover most of the more than 6,000 known ISO 639-3 languages. This creates a number of needs that most CMS-based sites do not have, yet I gather a few of them may be useful for everyone. In particular, Hebrew and Arabic are written RTL (right to left), and both are an interesting market for a CMS like Scribo. This matters all the more because most existing CMSs have poor support for directionality, and even worse support for embedding components with different directions into each other.

To take a simple example, a news site like Al Jazeera may well offer readers an embedded release in English, but it does so inside an Arabic page. In our project we publish a dictionary from any language to any language. So a Chinese editor who is fluent in English AND Arabic may want to read both versions of an article/entry to produce his Chinese version. So we have an ideogram-based Chinese GUI and two embedded components, one in LTR Latin English and the other in RTL Arabic. Quite a puzzle, isn't it? Yet the way AIDA manages things is just ideal for this, because it's made exactly for people to "assemble" a patchwork of pre-existing elements.

The main problem I see is that directionality is usually specified in the <html> tag. This tag gets pruned off (together with the HEAD section) when you include a page in another, so we probably need to put the direction somewhere else. For example, as I understand it, such an inclusion is made by generating a <div> into which the elements of the included page fit. Now... one might think it is enough for the page to know what language it's written in, but this is not the case.

Many languages have a number of alternate scripts. To name but one, Serbian may be written in either Latin or Cyrillic. Turkish is now written in LTR Latin, but until the last century it was written in RTL Arabic. So when you include a library of Turkish text you may really end up in trouble by simply saying "it's in Turkish". What a page actually needs to know is two pieces of data:
1) a language
2) a script
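
To illustrate why the script, and not the language, decides the direction, here is a minimal Python sketch. It is only an illustration and not AIDA code: the names `RTL_SCRIPTS`, `direction_for` and `language_tag` are hypothetical, the script codes are assumed to be ISO 15924 codes, and the list of RTL scripts is deliberately partial.

```python
# ISO 15924 codes of some scripts written right to left.
# Assumption: a partial list, for illustration only.
RTL_SCRIPTS = {"Arab", "Hebr", "Syrc", "Thaa", "Nkoo"}


def direction_for(script: str) -> str:
    """Return 'rtl' or 'ltr' for an ISO 15924 script code."""
    return "rtl" if script.title() in RTL_SCRIPTS else "ltr"


def language_tag(lang: str, script: str) -> str:
    """Join language and script into a tag such as 'sr-Cyrl'."""
    return f"{lang.lower()}-{script.title()}"


# The same language can come out with different directions.
for lang, script in [("tr", "Latn"), ("tr", "Arab"), ("sr", "Cyrl")]:
    print(language_tag(lang, script), direction_for(script))
```

With a lookup like this, "Turkish" alone is never enough to pick a direction: the pair (language, script) is.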

The combination of these two values tells us exactly how to produce the HEAD section so that search engines can easily work out the language of the page, and the script tells us what directionality should be used. Once we have both, we can produce a <span> like:
<span lang="XX" dir="rtl">our elements here</span>

The reason why I suggest a <span> is that it doesn't have an influence on the way text is rendered, whereas a <div> would.
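
As a rough sketch of how such a wrapper could be generated, here is a small Python example. It does not reflect how AIDA actually builds pages: `wrap_fragment` and `RTL_SCRIPTS` are hypothetical names, the script codes are assumed to be ISO 15924 codes, and the RTL list is partial.

```python
from html import escape

# Assumption: partial list of right-to-left ISO 15924 script codes.
RTL_SCRIPTS = {"Arab", "Hebr", "Syrc", "Thaa", "Nkoo"}


def wrap_fragment(fragment: str, lang: str, script: str) -> str:
    """Wrap already-rendered inline markup in a <span> carrying lang and dir."""
    script = script.title()
    direction = "rtl" if script in RTL_SCRIPTS else "ltr"
    # Escape the tag so it is always safe inside a quoted attribute.
    tag = escape(f"{lang.lower()}-{script}", quote=True)
    return f'<span lang="{tag}" dir="{direction}">{fragment}</span>'


# An Arabic component embedded in a page of any other direction.
print(wrap_fragment("our elements here", "ar", "Arab"))
```

The fragment is inserted untouched, on the assumption that the embedded component has already produced valid markup of its own.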

This <span> container will hold the embedded elements, while the outer layer (the page) needs to set its own tags as usual, so something like:
<html xmlns="" xml:lang="pms" lang="pms" dir="ltr">

I can do this by altering things to make it work in our app only, but my feeling is that this is more of a general need and that it should sit at the AIDA level. I can volunteer to do it, although I will obviously need to ask loads of questions about the way pages are generated. And obviously only if the people here think it would be useful to have it in AIDA.

So... who wants me to do it?
