Reviving India's linguistic heritage through Advanced AI

With a cultural heritage spanning over 1,600 languages, India stands as a living testament to the richness of linguistic diversity that flourishes across its vast landscapes. The Indus Project, a ‘Civilizational’ endeavour, seeks to amplify the voices of all Indic languages rooted in the legacy of the great Indus Civilisation. This pioneering initiative aspires to construct an open-source large language model that caters to the requirements of a staggering 25% of the global population.

An indigenous Large Language Model (LLM)

Tech Mahindra has already embarked on this ground-breaking endeavour with a promise to reshape the landscape of linguistic AI in India. This initiative aims to construct an indigenous Large Language Model (LLM), specifically designed to converse in a multitude of Indic languages. Did you know that around 615 million people speak one of the 40+ dialects of Hindi, while only about 375 million people worldwide consider English their first language?

In the first phase, the Indus Project targets the inclusion of a remarkable 40 Hindi dialects, paving the way for an ever-expanding roster. Constructing an LLM of this magnitude demands an extensive dataset, and Tech Mahindra has drawn inspiration from 'Bhashini', a similar undertaking initiated by PM Narendra Modi to amass datasets on Indic languages.

A monumental opportunity

Rooted in the belief that local dialects are paramount for effective communication, Tech Mahindra's Makers Lab spearheads this ‘Civilizational’ initiative. Beginning with Hindi, the project envisions an eventual expansion to encompass numerous other regional languages, rekindling India's fading linguistic treasures.

Embracing a diverse tapestry of linguistic roots, the Indus Project presents a monumental opportunity for individuals from various linguistic backgrounds to contribute. Dialects such as Dogri (Jammu & Kashmir), Kinnauri, Kangri, Chambyali (Himachal Pradesh), Garhwali, Kumauni, Jaunsari (Uttarakhand), Bhojpuri, Maithili, Magahi (Bihar), and many others can play a pivotal role in shaping this innovative venture.

Emblematic of Tech M’s prowess in NLP

The Indus Project is emblematic of Tech Mahindra's prowess in Natural Language Processing (NLP) and of its commitment to innovation. This initiative is poised to transform industries, from call centres, where language is pivotal, to sectors where the LLM can drive automation and amplify human creativity. As the first milestone approaches with the creation of a data collection portal, the Indus Project reflects Tech Mahindra's dedication to technological excellence and cultural resurgence.

An open invitation to contribute

Tech Mahindra invites everyone fluent in a Hindi dialect to join hands and make a meaningful contribution – a 'bhasha daan' – by submitting a short audio phrase in their local dialect, or even in general Hindi, to aid in training the cutting-edge AI system. We encourage you to extend this invitation to your family and friends as well. This is our golden opportunity to position both Tech M and India prominently on the global AI landscape.

Make your contribution now!
