smc-devel


From: Krishnamurthy Nagarajan
Subject: [smc-devel] Re: [Indic-computing-devel] javascript indic renderer and community portals
Date: Mon, 12 May 2003 05:32:57 -0700 (PDT)

Hi Suryaprakash, Arun and others,

A good, generic transliteration library for the Indian
languages/scripts is what is needed, IMHO. As anyone
who has studied the structure of Indian languages and
scripts would appreciate, there is a regular structure
in the phonetic input, in how the input syllables are
encoded in a script, and in how they are mapped to a
series of simple or composite glyphs.

In a web-based client-server model of app development,
it would be more appropriate for Indian scripts (unlike
Latin scripts/languages) to do input processing,
transliteration and glyph composition for rendering on
the server side. So, using JavaScript may not be that
feasible for this. At the same time, if all processing
is on the server side, then the application can't
really be that interactive (such as showing the display
to the user for each syllable typed in). Some
intermediate solution would be needed. Also, using PHP
may be a better idea than JavaScript or JSP. What do
you folks think?

A couple of weeks back, I made an enhancement to my
transliteration library (translib under the
indic-computing project on SourceForge) to take a
'word' in an Indian language encoded as a sequence of
Unicode characters (in UTF-8 format), map it
(using a user-defined lookup mapping file) to the most
appropriate Roman phonetic input and then apply the
transliteration rules for that language+script by
looking up the transliteration rule file. The output,
as before, is a sequence of symbolic glyph names that
correspond to glyph indices in a given font file. This
can be fed to a font reading and rendering library
such as freetype2 for final display (Koshy wrote a
Python script to do this using gozer, but he is now
replacing gozer with a Python wrapper to ft2).
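
To make the flow above concrete, here is a rough Python sketch of the
two steps (UTF-8 word to Roman phonetic syllables, then syllables to
symbolic glyph names). It is only an illustration under assumptions:
the tables, the glyph names and the function names are made up for
this example and are not translib's actual mapping/rule file contents
or API, and only three consonants and two vowel signs are covered.

```python
# Hypothetical lookup tables standing in for the user-defined mapping
# and rule files. They are NOT translib's real formats or contents.
CONSONANTS = {"\u0915": "k", "\u092E": "m", "\u0932": "l"}   # KA, MA, LA
VOWEL_SIGNS = {"\u093E": "aa", "\u093F": "i"}                # matras AA, I
VIRAMA = "\u094D"

# Roman consonant -> glyph name (names are invented for illustration).
CONSONANT_GLYPH = {"k": "ka", "m": "ma", "l": "la"}
# Roman vowel -> (glyphs drawn before the consonant, glyphs drawn after).
VOWEL_GLYPHS = {
    "a": ([], []),               # inherent vowel: no extra glyph
    "aa": ([], ["aa_matra"]),
    "i": (["i_matra"], []),      # the i-matra is drawn before the consonant
    "": ([], ["virama"]),        # consonant with its vowel suppressed
}


def to_roman_syllables(word_utf8):
    """Step 1: decode a UTF-8 word and map it to (consonant, vowel) pairs."""
    text = word_utf8.decode("utf-8")
    syllables = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch not in CONSONANTS:
            raise ValueError("no mapping for U+%04X" % ord(ch))
        nxt = text[i + 1] if i + 1 < len(text) else ""
        if nxt in VOWEL_SIGNS:
            vowel, step = VOWEL_SIGNS[nxt], 2
        elif nxt == VIRAMA:
            vowel, step = "", 2
        else:
            vowel, step = "a", 1     # bare consonant carries inherent 'a'
        syllables.append((CONSONANTS[ch], vowel))
        i += step
    return syllables


def to_glyph_names(syllables):
    """Step 2: apply the rule table to get glyph names in display order."""
    glyphs = []
    for cons, vowel in syllables:
        before, after = VOWEL_GLYPHS[vowel]
        glyphs.extend(before)
        glyphs.append(CONSONANT_GLYPH[cons])
        glyphs.extend(after)
    return glyphs


word = "\u0915\u093F\u092E\u093E".encode("utf-8")   # KA + I, MA + AA
print(to_glyph_names(to_roman_syllables(word)))
```

The glyph-name list would then be resolved to glyph indices in a
particular font before being handed to the rendering library; that
step is left out of the sketch.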

I tried out this utf8-to-final-glyph rendering for
Hindi+Devanagari with very minimal mapping done and
did some preliminary testing, and it works OK. All the
intelligence is in the user-defined mapping files and
the source code itself has no knowledge of any Indian
language.
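
As a small illustration of keeping the language knowledge in a data
file rather than in code, here is a Python sketch that loads a lookup
table from text. The "U+XXXX roman" line format and the name
load_mapping are assumptions invented for this example; they are not
the format translib actually reads.

```python
import io


def load_mapping(lines):
    """Parse 'U+XXXX roman' pairs, skipping blank lines and '#' comments.

    The file format is hypothetical, chosen only for this sketch.
    """
    mapping = {}
    for raw in lines:
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        code, roman = line.split()
        # "U+0915" -> the character with code point 0x0915
        mapping[chr(int(code[2:], 16))] = roman
    return mapping


# Sample file contents; a real mapping would cover the whole script.
SAMPLE = """\
# code point   Roman phonetic input
U+0915  k
U+092E  m
U+0932  l
"""

mapping = load_mapping(io.StringIO(SAMPLE))
print(sorted(mapping.values()))
```

Supporting another language or script would then mean supplying a
different file, with no change to the loader.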

Unicode is neither an input mapping scheme nor a glyph
mapping scheme; it's just an encoding scheme, as all
of us know. It has limitations, but with a sound
transliteration library in place, UTF-8 can be used
for storing Indian language content for further
processing (search, sort, display, etc.).
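
To illustrate the encoding point, a short Python sketch showing that
stored UTF-8 carries only code points, with nothing about input keys
or glyphs. The sample word and the helper name describe are chosen
just for this example.

```python
def describe(word):
    """Return the code points and the UTF-8 bytes (as hex) of a word."""
    code_points = ["U+%04X" % ord(ch) for ch in word]
    utf8_hex = " ".join("%02x" % b for b in word.encode("utf-8"))
    return code_points, utf8_hex


word = "\u0915\u092E\u0932"          # Devanagari KA, MA, LA
code_points, utf8_hex = describe(word)
print(code_points)                   # ['U+0915', 'U+092E', 'U+0932']
print(utf8_hex)                      # e0 a4 95 e0 a4 ae e0 a4 b2

# Storage round trip and a simple substring search on the stored form.
stored = word.encode("utf-8")
assert stored.decode("utf-8") == word
assert "\u092E".encode("utf-8") in stored
```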

cheers,
Nagarajan

--- Suryaprakash Kompalli <address@hidden>
wrote:
> Hello,
> > What about Javascript ?  When user types in a text field using some
> > Transliteration scheme or Inscript KeyBoard layout convert it to font
> 
>       For collecting data on Indic scripts, we had created an interface
> that uses Java. I had used ITRANS transliteration and the default Java
> fonts to come up with an input window that accepts transliteration in
> Latin script and outputs Devanagari in another window.
>       I am not familiar with CMS or *nuke, but from what I gather,
> reusable Java classes could be a good way to look at it. We can develop
> classes to work with other languages.
>       Right now, the tool might be a bit bulky to work with since the
> package has a lot of image-processing-related code to go with it. But
> it's good to test offline. People can take a look at it here:
> www.cedar.buffalo.edu/ilt
>       If it's useful, we can plan to make the input system independent
> of the IP part.
> 
> > http://www.wandel.subnet.dk/hindi.html
>       I tried the site out. It didn't need applets, but it didn't work
> on Netscape either: it gave me a message saying it needs Windows to
> work, but then it behaved OK on Mozilla. :) :( What it does is take the
> Unicode value for each character and convert it into an int, and this
> is what we can do with the o/p:
> 
> <html>
> ????
> </html>
> 
>       But since what is in the background is basically Unicode, we
> still need a *properly configured* browser to view the stuff. It came
> out properly on Mozilla but didn't show up on Netscape. Pretty nifty, but
> the addition of transliterated/keyboard-based input might be more welcome.
> The characters present in the interface are the ones from the Unicode
> table for Devanagari.
> 
> Bye,
> Surya

> On 5 May 2003, Arun M wrote:
> 
> > Hi Friends,
> >
> >    We see a lot of community portals coming up these days based on
> > CMS like *nuke. But none of these CMS supports Indic. Also we don't see
> > any community discussion boards in Indian languages (Pls correct me).
> >
> > Some issues I see are:
> >
> >  - There is no Indic (Unicode/non-Unicode) support in most platforms.
> >  - Most browsers don't support Unicode.
> >
> > Any community portal we build should be based on font encoding
> > systems, at least for some more time. An idea is to store
> > the data in Unicode and then convert to font encoding at the server
> > side.  A special proxy should help here.
> >
> > The second issue is entering data from the client side. This is a major
> > problem. Most of the sites use Java applets for entering the data in
> > local languages. This may require a good amount of modification in the
> > CMS we have now, like *nuke.
> >
> > Yesterday I made a proto of this. A crude one. It works.
> >
> > Do you think this will be of use? If so I will work on it and
> > make the code better and generic (I don't know JavaScript much, will
> > have to learn it first).
> >
> >
> > Arun.
> >

