Wednesday, October 30, 2002

Products based on Indian Language Technology @ TDIL (Technology Development for Indian Languages) Site

Kendriya Hindi Sansthan

Indic Standards

In 1983, a character-coding standard emerged in India: ISCII (Indian Script Code for Information Interchange).

Contains mapping of different ISCII codings

FAQ - Indic Scripts and Languages - Unicode.org

GICAS: Linguistic Survey of India

What is Grierson's Linguistic Survey of India?

* Grierson's LSI is the most exhaustive survey of around 800 languages spoken in India (British India).
* It covers languages from all four language families: Indo-Aryan, Dravidian, Tibeto-Burman and Austro-Asiatic.
* It contains a large number of script samples (both printed and hand-written).
* Sample texts from various languages are provided.
* Language area maps are also given.
* It forms the basic material for the study of scripts and languages of the area.

Indian Scripts and Unicode

Unicode, a.k.a. ISO 10646, covers all major scripts used in India today. However, the standard has several inconsistencies, shortcomings and peculiarities, which need to be known to be handled correctly. This document pinpoints the caveats and goes into detail on the conversion of ISCII into Unicode. Further, this document also includes a proposal for idioms to be used for rendering variants. These will be worked out to a complete conjunct table for each script, which will eventually appear as an appendix.

Limitations seen in Unicode and ISCII

Limitations of Unicode and ISCII
A discussion relating to their suitability for linguistic text processing in Indian languages.

Wednesday, October 23, 2002

India-n-indian.com

* Web in Indian Languages
* Rough list of the languages accommodated into the ISO 8859 series
* Multiplicity of Standards in Web
* Partial List of Fonts for Indian Languages in Non-Unicode Encodings
* Language Codes for Indian Languages
* Unicode range for Indian Scripts
* Indian Scripts and Languages
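
As a rough illustration of the "Unicode range for Indian Scripts" item above, here is a minimal sketch that maps a character to its Indic script by Unicode block. The block boundaries come from the Unicode code charts rather than from the linked page, and the names `INDIC_BLOCKS` and `script_of` are made up for this example.

```python
# Unicode blocks for nine major Indic scripts (inclusive code point ranges).
# Boundaries follow the Unicode code charts; the table is illustrative,
# not a complete list of every script used in India.
INDIC_BLOCKS = [
    ("Devanagari", 0x0900, 0x097F),
    ("Bengali",    0x0980, 0x09FF),
    ("Gurmukhi",   0x0A00, 0x0A7F),
    ("Gujarati",   0x0A80, 0x0AFF),
    ("Oriya",      0x0B00, 0x0B7F),
    ("Tamil",      0x0B80, 0x0BFF),
    ("Telugu",     0x0C00, 0x0C7F),
    ("Kannada",    0x0C80, 0x0CFF),
    ("Malayalam",  0x0D00, 0x0D7F),
]


def script_of(ch):
    """Return the Indic script name for a single character, or None."""
    cp = ord(ch)
    for name, lo, hi in INDIC_BLOCKS:
        if lo <= cp <= hi:
            return name
    return None


if __name__ == "__main__":
    # U+0939 is DEVANAGARI LETTER HA, U+0BA4 is TAMIL LETTER TA.
    for ch in ("\u0939", "\u0ba4", "A"):
        print(f"U+{ord(ch):04X}", script_of(ch))
```

A lookup like this only identifies the block a code point falls in; it says nothing about which language the text is written in, since one script can serve several languages.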

Jai Hanuman's Takhti

"With Takhti you can write in Hindi (Devanagari Unicode) on Windows machines. It can also be used on 8-bit Windows 9x. It is quite small (about 100 kilobytes).
It was written specifically for XP (the Inscript keyboard layout is somewhat hard to use once you are accustomed to QWERTY), but it later turned out that most people in India use Win 9x and cannot write in Hindi, because Win 9x offered no way to do so. A lot of effort went into making Takhti run on 8-bit Win 9x machines."

Sharmahd Computing UniPad

SC UniPad is a Unicode plain text editor for the Windows NT, Windows 2000, Windows 9x, Windows ME and Windows XP operating systems.

* Displays over 52000 Unicode characters instantly without installing extra fonts
* On-screen soft keyboard
* Over 60 built-in keyboard layouts
* Character map for easy selection of any Unicode character
* Import / export of over 60 codepages, encodings
* Unicode formats UTF-8, UTF-16, UTF-32, UTF-7, Compression Scheme, \u and more...
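
To make the encoding formats listed above more concrete, the following sketch encodes one Devanagari word with Python's standard codecs and compares the results. It has nothing to do with how UniPad itself works; the helper names `encoded_sizes` and `u_escape` are assumptions made for this example, and the compression scheme is left out because Python's standard library has no codec for it.

```python
# The word "Hindi" in Devanagari, spelled out as six code points:
# HA, VOWEL SIGN I, NA, VIRAMA, DA, VOWEL SIGN II.
TEXT = "\u0939\u093f\u0928\u094d\u0926\u0940"


def encoded_sizes(s):
    """Return the byte length of s in several Unicode encoding forms.

    The big-endian variants are used so that no byte order mark
    is added to the byte count.
    """
    names = ["utf-8", "utf-16-be", "utf-32-be", "utf-7"]
    return {name: len(s.encode(name)) for name in names}


def u_escape(s):
    """Return s in backslash-escape notation using Python's
    unicode_escape codec. Devanagari code points come out as \\uXXXX;
    printable ASCII characters would pass through unchanged.
    """
    return s.encode("unicode_escape").decode("ascii")


if __name__ == "__main__":
    for name, size in encoded_sizes(TEXT).items():
        print(f"{name:10s} {size} bytes")
    print(u_escape(TEXT))
    # Every form is lossless: decoding gives back the original string.
    assert TEXT.encode("utf-7").decode("utf-7") == TEXT
```

Each Devanagari code point takes three bytes in UTF-8, two in UTF-16 and four in UTF-32, which is why the same six-character word has a different size in each form.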

Yudit Home Page

Yudit is a Unicode text editor for the X Window System. She can do True Type font rendering, printing, transliterated keyboard input and handwriting recognition with no dependencies on external engines. Her conversion utilities can convert text between various encodings. Keyboard input maps can also act like text converters. There is no need for a pre-installed multi-lingual environment. Menus are translated into many languages.

Tuesday, October 22, 2002

MT WorldType Devanagari

Known to be the most frequently utilized of the Northern Indic scripts, Devanagari is used to write Hindi, Marathi, Nepali, Kashmiri, Bihari, Rajasthani, as well as some minority languages. Nowadays, it is also the script most commonly used for writing Sanskrit, which is the ancient predecessor of Modern Hindi. All modern-day Indic scripts are descendants of Brahmi, an extinct script which flourished more than two thousand years ago. Over the centuries, the offshoots of Brahmi branched into two broad groups: one for writing the northern Indic, mainly Indo-Aryan, languages, the other for the southern Indic, Dravidian languages.

As an exemplary descendant of Brahmi script, Devanagari embodies all the features which typify the 'Brahmi model':

Monday, October 21, 2002

Itranslator - Omkarananda Ashram Himalayas

Welcome to the World of Itranslator 99 for Windows 95/98/ME/NT/XP/2000
Paramahamsa Omkarananda Saraswati

Itranslator 99 is a free utility which converts ITRANS 5.30 encoded text or text files into Devanagari or Transliteration (True Type), which can be saved in Rich Text Format (.rtf), as web page (.htm), as ANSI text (.txt), or pasted via clipboard in any other application.

GNOME translation status - GNOME 2.0 Fifth Toe hi (Mon Oct 21 04:10:29 2002 UTC)

"Project-Id-Version: anjuta2 VERSION\n"
"POT-Creation-Date: 2002-10-21 05:22+0200\n"
"PO-Revision-Date: 2002-06-08 18:55+0530\n"
"Last-Translator: Anurag Seetha \n"
"Language-Team: Hindi \n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"

Vartalaap, A multi-lingual multiway unicode-based communication system

VARTALAAP is a robust multilingual, interactive communication system, with versatile utility across a wide spectrum of domains. One of the main objectives of the project is to reach out to the masses, breaking language barriers and physical boundaries, through the support for not only Indian languages but for International languages as well.

Vartalaap is a Unicode-based multilingual communication system and thus provides rich support for a number of Indian languages as well as international languages.

There is a question of making such applications accessible to non-English-speaking users, since it would otherwise be impossible to tap their true potential, particularly in countries like India, where the non-English-speaking population is in the majority. An application independent of natural language constraints would not only broaden its user base across domains, but would also further the dissemination of information. The multi-lingual feature assumes significance in rural areas, where a person can use their vernacular to communicate. The tool even allows users to customize the user interface in their own language.

It is developed as part of an ongoing research project at the National Centre for Software Technology (NCST), Bangalore. NCST has offices in Mumbai and Bangalore, India.