Wednesday, October 30, 2002

Products based on Indian Language Technology @ TDIL (Technology Development for Indian Languages) Site

Kendriya Hindi Sansthan

Indic Standards

In 1983 in India there emerged a standard for coding, called ISCII (Indian Script Code for Information Interchange).

Contains mapping of different ISCII codings

FAQ - Indic Scripts and Languages - Unicode.org

GICAS: Linguistic Survey of India

What is Grierson's Linguistic Survey of India?

* Grierson's LSI is the most exhaustive survey of around 800 languages spoken in India (British India).
* It covers languages from all four language families: Indo-Aryan, Dravidian, Tibeto-Burman and Austro-Asiatic.
* It contains a large number of script samples (both printed and hand-written).
* Sample texts from various languages are provided.
* Language area maps are also given.
* It forms the basic material for the study of scripts and languages of the area.

Indian Scripts and Unicode

Unicode, a.k.a. ISO 10646, covers all major scripts used in India today. However, the standard has several inconsistencies, shortcomings and peculiarities, which need to be known in order to be handled correctly. This document pinpoints the caveats and goes into detail on the conversion of ISCII into Unicode. Further, this document also includes a proposal for idioms to be used for rendering variants. These will be worked out into a complete conjunct table for each script, which will eventually appear as an appendix.

Limitations seen in Unicode and ISCII

Limitations of Unicode and ISCII
A discussion relating to their suitability for linguistic text processing in Indian languages.

Wednesday, October 23, 2002

India-n-indian.com

* Web in Indian Languages
* Rough list of the languages accommodated into the ISO 8859 series
* Multiplicity of Standards in Web
* Partial List of Fonts for Indian Languages in Non-Unicode Encodings
* Language Codes for Indian Languages
* Unicode range for Indian Scripts
* Indian Scripts and Languages
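
The "Unicode range for Indian Scripts" item above can be illustrated with a small lookup. This is only a sketch: the block boundaries are the ones published in the Unicode Standard for the nine major Indic scripts, while the names `INDIC_BLOCKS`, `script_of` and `scripts_in` are hypothetical helpers invented for this example and do not come from the linked page.

```python
# Unicode block ranges for nine major Indic scripts (per the Unicode Standard).
# The dictionary and helper names are illustrative assumptions.
INDIC_BLOCKS = {
    "Devanagari": (0x0900, 0x097F),
    "Bengali": (0x0980, 0x09FF),
    "Gurmukhi": (0x0A00, 0x0A7F),
    "Gujarati": (0x0A80, 0x0AFF),
    "Oriya": (0x0B00, 0x0B7F),
    "Tamil": (0x0B80, 0x0BFF),
    "Telugu": (0x0C00, 0x0C7F),
    "Kannada": (0x0C80, 0x0CFF),
    "Malayalam": (0x0D00, 0x0D7F),
}


def script_of(ch):
    """Return the name of the Indic block containing ch, or None."""
    cp = ord(ch)
    for name, (lo, hi) in INDIC_BLOCKS.items():
        if lo <= cp <= hi:
            return name
    return None


def scripts_in(text):
    """Return the sorted list of Indic blocks that occur in text."""
    return sorted({s for s in map(script_of, text) if s is not None})


if __name__ == "__main__":
    print(scripts_in("हिन्दी and தமிழ்"))
```

A block lookup like this only says which code chart a character belongs to; it says nothing about language, since Devanagari alone is shared by Hindi, Marathi, Nepali and others.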

Jai Hanuman's Takhti

"With Takhti you can write in Hindi (Devanagari Unicode) on Windows machines. It can also be used on 8-bit Windows 9x. It is quite small (about 100 kilobytes).
It was written specifically for XP (the Inscript keyboard is somewhat hard to use once you are accustomed to QWERTY), but it later turned out that most people in India use Win 9x and could not write in Hindi, because Win 9x offered no way to do so. Making Takhti run on 8-bit Win 9x machines took a great deal of effort."

Sharmahd Computing UniPad

SC UniPad is a Unicode plain text editor for the Windows NT, Windows 2000, Windows 9x, Windows ME and Windows XP operating systems.

* Displays over 52000 Unicode characters instantly without installing extra fonts
* On-screen soft keyboard
* Over 60 built-in keyboard layouts
* Character map for easy selection of any Unicode character
* Import / export of over 60 codepages, encodings
* Unicode formats UTF-8, UTF-16, UTF-32, UTF-7, Compression Scheme, \u and more...
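
To make the Unicode formats in the UniPad feature list concrete, here is a rough sketch that encodes one Devanagari word in several of them using Python's standard codecs. It has nothing to do with UniPad's own implementation; the sample word and the helper names `encoded_sizes` and `backslash_u` are assumptions made for illustration.

```python
# "Hindi" written in Devanagari: 6 code points, all in the U+0900 block.
SAMPLE = "हिन्दी"

# Standard-library codec names; the "-le" variants avoid a byte order mark.
ENCODINGS = ("utf-8", "utf-16-le", "utf-32-le", "utf-7")


def encoded_sizes(text):
    """Map each encoding name to the number of bytes text occupies in it."""
    return {enc: len(text.encode(enc)) for enc in ENCODINGS}


def backslash_u(text):
    """Return the ASCII-only \\uXXXX escape form of text."""
    return text.encode("unicode_escape").decode("ascii")


if __name__ == "__main__":
    for enc, size in encoded_sizes(SAMPLE).items():
        print(f"{enc:10s} {size} bytes")
    print(backslash_u(SAMPLE))
```

Each Devanagari code point takes 3 bytes in UTF-8, 2 in UTF-16 and 4 in UTF-32, which is why the same six-character word occupies 18, 12 and 24 bytes respectively.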

Yudit Home Page

Yudit is a unicode text editor for the X Window System. She can do True Type font rendering, printing, transliterated keyboard input and handwriting recognition with no dependencies on external engines. Her conversion utilities can convert text between various encodings. Keyboard input maps can also act like text converters. There is no need for a pre-installed multi-lingual environment. Menus are translated into many languages.

Tuesday, October 22, 2002

MT WorldType Devanagari

Known to be the most frequently utilized of the Northern Indic scripts, Devanagari is used to write Hindi, Marathi, Nepali, Kashmiri, Bihari, Rajasthani, as well as some minority languages. Nowadays, it is also the script most commonly used for writing Sanskrit which is the ancient predecessor of Modern Hindi. All modern-day Indic scripts are descendants of Brahmi, an extinct script which flourished more than two thousand years ago. Over the centuries, the offshoots of Brahmi branched into two broad groups: one for writing the northern Indic, mainly Indo-Aryan languages, the other for the southern Indic, Dravidian languages.

As an exemplary descendant of Brahmi script, Devanagari embodies all the features which typify the 'Brahmi model':

Monday, October 21, 2002

Itranslator - Omkarananda Ashram Himalayas

Welcome to the World of Itranslator 99 for Windows 95/98/ME/NT/XP/2000
Paramahamsa Omkarananda Saraswati

Itranslator 99 is a free utility which converts ITRANS 5.30 encoded text or text files into Devanagari or Transliteration (True Type), which can be saved in Rich Text Format (.rtf), as web page (.htm), as ANSI text (.txt), or pasted via clipboard in any other application.

GNOME translation status - GNOME 2.0 Fifth Toe hi (Mon Oct 21 04:10:29 2002 UTC)

"Project-Id-Version: anjuta2 VERSION\n"
"POT-Creation-Date: 2002-10-21 05:22+0200\n"
"PO-Revision-Date: 2002-06-08 18:55+0530\n"
"Last-Translator: Anurag Seetha \n"
"Language-Team: Hindi \n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"

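The header quoted above is what tells tools that the Hindi translation is stored as UTF-8. As a minimal sketch of how such a header can be read, the following treats each quoted line as a string literal and splits it into a field name and value. It assumes the lines contain only simple escapes such as `\n`; `RAW_HEADER`, `parse_po_header` and `charset_of` are hypothetical names, and a real project would more likely rely on a gettext-aware library.

```python
import ast

# A few of the header lines quoted above, kept exactly as they appear in the file.
RAW_HEADER = r'''
"Project-Id-Version: anjuta2 VERSION\n"
"POT-Creation-Date: 2002-10-21 05:22+0200\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"
'''


def parse_po_header(raw):
    """Turn quoted 'Name: value\\n' lines into a dict of header fields."""
    fields = {}
    for line in raw.strip().splitlines():
        # Assumption: each line is a plain double-quoted string literal.
        text = ast.literal_eval(line.strip())
        name, _, value = text.partition(":")
        fields[name.strip()] = value.strip()
    return fields


def charset_of(fields):
    """Extract the charset parameter from the Content-Type field, if any."""
    for part in fields.get("Content-Type", "").split(";"):
        part = part.strip()
        if part.lower().startswith("charset="):
            return part.split("=", 1)[1]
    return None


if __name__ == "__main__":
    print(charset_of(parse_po_header(RAW_HEADER)))
```
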
Vartalaap, A multi-lingual multiway unicode-based communication system

VARTALAAP is a robust multilingual, interactive communication system, with versatile utility across a wide spectrum of domains. One of the main objectives of the project is to reach out to the masses, breaking language barriers and physical boundaries, through the support for not only Indian languages but for International languages as well.

Vartalaap is a Unicode based multilingual Communication system and thus provides rich support to a number of natural Indian languages as well as any International languages.

There is a question of making such applications accessible to non-English speaking users, since it would otherwise be impossible to tap their true potential, particularly in countries like India, where the non-English speaking population is in the majority. An application independent of natural language constraints would not only broaden its user base across domains, but would also further the dissemination of information. The multi-lingual feature assumes significance in rural areas, where a person can use their vernacular to communicate. The tool even allows users to customize the user interface in their language.

It is developed as part of an ongoing research project at the National Centre for Software Technology (NCST), Bangalore. NCST has offices in Mumbai and Bangalore, India.

The Linux Devanagari HOWTO: Index

This document describes how to use Devanagari with "IndiX", a modified X Window System on Linux that has support for Indian scripts. It covers setting up Devanagari fonts, a Devanagari keyboard and some Devanagari applications. You must install the IndiX system. Wherever I use the word "Hindi" in this HOWTO, it is in the context of the Devanagari script.

Please note that some parts of this HOWTO are copied from Unicode-HOWTO and other HOWTOs.

Saturday, October 19, 2002

Ethnologue report for language code: HND

HINDI: a language of India

Formal vocabulary is borrowed from Sanskrit, de-Persianized, de-Arabicized. Literary Hindi, or Hindi-Urdu, has four varieties: Hindi (High Hindi, Nagari Hindi, Literary Hindi, Standard Hindi); Urdu; Dakhini; Rekhta. State language of Delhi, Uttar Pradesh, Rajasthan, Madhya Pradesh, Bihar, Himachal Pradesh. Languages and dialects in the Western Hindi group are Hindustani, Haryanvi, Braj Bhasha, Kanauji, Bundeli; see separate entries. Spoken as mother tongue by the Saharia in Madhya Pradesh. Hindi, Hindustani, Urdu could be considered co-dialects, but have important sociolinguistic differences. National language.

Welcome to the Sanskrit Library

Maintained by:
Peter M. Scharf, Director
Malcolm D. Hyman, Linguist

Access to the library requires registration.

Devanagari Typefaces

ENGLISH-URDU DICTIONARY

This is the first English-Urdu Dictionary which has been made available on the WWW. It was launched exactly 5 years ago. The aim has been to reach those persons as well, who are not familiar with Urdu script. Besides there still are several technical complications which have compelled me to abstain from the usage of Urdu script.

Language in India

Language In India is a monthly online journal devoted to the study of the languages spoken in the Indian sub-continent. We wish to present the scholarly research findings on these languages in popular language. Our focus is on language use in mass media, education and administration, speech and hearing, sociolinguistic and political aspects relating to these languages and the society in the Indian subcontinent. We wish to present the linguistic descriptions, interdisciplinary research, and current issues of importance relating to Indian languages. This online journal publishes not only articles, but also book-length reports and studies.

Indian alphabet comparison page

Eden Golshani's Indian language alphabet comparison page. Compares Nagari, Punjabi, Bengali and Gujarati alphabets.

Indic language fonts

One of the most comprehensive lists of links to Indic fonts. Frequently updated.

Maintained by:
Luc Devroye (copyright)
School of Computer Science
McGill University
Montreal, Canada H3A 2K6

Mellon Project - Univ of Michigan

malhar - Mellonsite for Advanced Levels of Hindi-Urdu Acquisition and Research

This Website uses "Xdvng", a dynamic Devanagari font. In order to see it fully and automatically display, any version of Netscape from 4.0 through 4.76 (not Opera and not version 6.0 or 6.1 of Netscape) may be used on a PC (not on a Mac). It is also visible using IE on a PC, but the shape of the aksharas is less elegant (and at present not every page has been provided with the code that enables "Xdvng" to display in IE). Others, users of Macs and/or Opera, may also see the Devanagari font but only by first downloading and installing it. Sources for the download include jtrans.

Hindi page at UPenn.

Hindi Program: Web based Hindi Materials

Alphabet, Videos, Audio files, Student projects(98), Student projects(99), Student Projects (2002), Hindi Chat Page, Dynamic HTML test page (IE only), Other links, Download XDVNG font for Windows(xdvng.zip), Download Xdvng font for Mac, Download Jaipur font for Windows, Download Jaipur font for Mac, Read document for Jaipur Key assignment

Urdu in India since Independence

by Ralph Russell

The link language of everyday communication in India continues to be, as it was before Independence, one which is as much Hindi as it is Urdu. It is true that since independence the government has shown apathy and worse towards Urdu. But the proponents of Urdu focus almost exclusively on the injustices done to Urdu. They too often call upon somebody else, such as the government, to do something instead of doing it themselves. They have failed to take advantage of factors that favour Urdu. The defence of Urdu requires an increase in the number of people who have a command of it. The first step is education. But one need not depend on government-run Urdu medium schools. Those who have a command of Urdu can start teaching it in their own neighborhoods. Confining Urdu to the Persian script also works against its spread. There is a large readership for Urdu works written in Devanagari script and also for Urdu works introduced through English.

Columbia University

SOME USEFUL SOURCES ON HINDI/URDU LANGUAGE AND LITERATURE

"Review Article: Christopher King, One Language, Two Scripts."

by Srivastava, Sushil
from Social Scientist, v. 23, no. 263-65 (April-June 1995)

Raley, Rita

The Containment and Re-Deployment of English India
A Teleology of Letters; or, From a "Common Source" to a Common Language

by Rita Raley

A theoretical article on the role of Gilchrist in shaping Hindi/Urdu:

Rani Ketaki Ki Kahani

Written by Insha'Allah Khan, this is considered the first Hindi short story. The story is written in Khadi Boli and is thought to have been written around 1803 AD. The Abhivyakti web site presents the story in Shusha font.

An Introduction to Hindi and Urdu : INTRODUCTION : The Development of Hindi and Urdu

from Barz, Richard and Yogendra Yadav. An Introduction to Hindi and Urdu. Delhi, Munshiram Manoharlal, 2000.

The Development of Hindi and Urdu

The Two Meanings of the Term Hindi
Hindi and English
The Linguistic Situation in Ancient India
Hindi Literature : the Heroic Period
Hindi Literature : the Bhakti Period
Hindi and Urdu Literature : the 17th and 18th Centuries
Hindi and Urdu Literature : the Modern Period
Hindi Dialects and Folk Literature
Hindi Outside of India
Hindi and Urdu Grammars and Dictionaries

Rajbhasha.com

A good comprehensive resource about Hindi as a language. Includes constitutional rules, guidelines, grammar, dictionary and literature pieces. The site is a volunteer effort by Dr. R K Gupta. Uses a dynamic font.

Ojha, Gaurisankara Hiracanda. Nagari anka aura akshara. Prayaga: Prayaga [1949]

Snapshot of pages from the book @ Digital South Asia Library. No fonts required.

TITUS Is Testing Unicode Script management

TITUS is Thesaurus Indogermanischer Text- und Sprachmaterialien.

Sample Page of Devanagari Script (to test your browsers)

Devanagari - Test for Unicode support in Web browsers

Alan Wood's Unicode Resource.

A lot of information about publishing Hindi content on the web in Unicode Devanagari.

Dictionaries

* English-Hindi dictionary at IIIT, Hyderabad. Produces output in ISCII, Shusha and iTrans.
* Hindi-English and English-Hindi dictionaries at Rajbhasha.com. Uses Dynamic fonts.
* Hindi-English Electronic Dictionary Maintained by K. Machida

Hindi Language Resources

A site with links to several Hindi resources on the Web, compiled by Yashwant Malaiya.