Saturday, April 11, 2015

Statistical Analysis of Bhagavad Gita and Translation

Bhagavad Gita

I have restarted one of my favorite projects: developing a complete Sanskrit to Bengali translator for the Bhagavad Gita. I started working on this project twice in the last 2-3 years, but each time something else came up (work pressure, office nonsense, laziness and tamas, tensions, etc.) and the project got shelved midway. Also, since these were open-ended ideas rather than time-bound projects, there was no way of tracking progress.

This time I am taking a different approach. I read in the book "Influence" by Prof. Cialdini that writing down one's plan increases one's commitment, which in turn increases the chances of success.

So here I go.

Objective:
To develop a Sanskrit to Bengali translator, using words from the Bhagavad Gita to build the initial dictionary.

Features:
- Should be able to do samdhi analysis and detect tokens appropriately
- Should be able to resolve between different meanings and cases (e.g. phale, which is both nominative dual and locative singular)
- Should be able to detect incorrect spellings for common words and suggest the correct one
- Should translate into spoken Bengali to the extent possible
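
As a rough illustration of the spelling-suggestion feature, here is a minimal sketch using Python's standard `difflib` module. The word list, the function name `suggest_spelling` and the cutoff of 0.6 are my own assumptions for illustration; a real version would use the dictionary built in Phase 1 and would probably need a similarity measure tuned for Sanskrit.

```python
import difflib

# Hypothetical list of known words; in the real project this would come
# from the dictionary built in Phase 1.
KNOWN_WORDS = ["dharma", "karma", "yoga", "arjuna"]

def suggest_spelling(word, known_words=KNOWN_WORDS, cutoff=0.6):
    """Return the closest known word (if any) for a possibly misspelled word."""
    if word in known_words:
        return [word]
    # get_close_matches ranks candidates by a similarity ratio between 0 and 1
    # and drops those below the cutoff.
    return difflib.get_close_matches(word, known_words, n=1, cutoff=cutoff)

print(suggest_spelling("dhrama"))
```

With this toy word list, the misspelling "dhrama" is matched to "dharma", while a string that resembles none of the known words returns an empty list.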

Phase 1:
Developing the dictionary using Bhagavad Gita Chapter 1 initially

1a - Do a statistical analysis of Bhagavad Gita in terms of token count
1b - Do samdhi splits and get those tokens also
1c - Do a count of tokens and arrange the tokens in descending order of frequency
1d - Add to dictionary with forms (part of speech, tense, case, number, person etc) and Bengali meaning
Threshold level: The most common words that together account for 80% of all token occurrences (on a cumulative basis) shall be used.
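
Steps 1a, 1c and the threshold rule could be sketched roughly as follows. This assumes the chapter text is already available as a plain string of space-separated tokens (after samdhi splitting and with verse numbers and punctuation removed); the function names are only placeholders of mine.

```python
from collections import Counter

def frequency_table(text):
    """Count whitespace-separated tokens and list them, most frequent first."""
    tokens = text.split()
    return Counter(tokens).most_common()

def tokens_up_to_threshold(freq_table, threshold=0.80):
    """Pick the most frequent tokens until their cumulative share of all
    token occurrences reaches the threshold."""
    total = sum(count for _, count in freq_table)
    selected = []
    running = 0
    for token, count in freq_table:
        # Stop once the tokens already selected cover the threshold.
        if total == 0 or running >= threshold * total:
            break
        selected.append(token)
        running += count
    return selected

# Toy input, not actual Gita text.
table = frequency_table("ca ca ca ca eva eva eva tu tu aham")
print(table)
print(tokens_up_to_threshold(table))
```

In the toy input, "ca", "eva" and "tu" together cover 9 of the 10 token occurrences, so the rare word "aham" is left out of the initial dictionary.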

Phase 2:
Apply the translator to Chapter 2 and test. Identify missing words and repeat 1a through 1d. Repeat for each chapter.

Note: The dictionary shall be extensible. If one were to replace Bengali words with appropriate Marathi words, theoretically the translation should work.
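
One possible shape for such a dictionary is sketched below. It is only an assumption about the data structure, not a final design: each Sanskrit form maps to a list of grammatical analyses, and each analysis carries its meanings keyed by a language code, so adding Marathi would mean adding one more key. The glosses shown are illustrative and should be verified.

```python
from dataclasses import dataclass, field

@dataclass
class Analysis:
    lemma: str
    part_of_speech: str
    case: str
    number: str
    # Language code -> meaning, e.g. "bn" for Bengali, "mr" for Marathi.
    meanings: dict = field(default_factory=dict)

# Hypothetical sample entry: "phale" has two possible analyses.
DICTIONARY = {
    "phale": [
        Analysis("phala", "noun", "nominative", "dual",
                 {"bn": "দুটি ফল", "mr": "दोन फळे"}),
        Analysis("phala", "noun", "locative", "singular",
                 {"bn": "ফলে", "mr": "फळात"}),
    ],
}

def lookup(token, language):
    """Return every (case, number, meaning) analysis for a token."""
    return [(a.case, a.number, a.meanings.get(language))
            for a in DICTIONARY.get(token, [])]

print(lookup("phale", "bn"))
print(lookup("phale", "mr"))
```

The lookup returns all candidate analyses; choosing between them from the surrounding words is the separate disambiguation problem listed under Features.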

Project Details
Time: 1 month per chapter
Start Date: April 2015
End Date: October 2016

May Lord Ganesha remove all obstacles to this endeavor!

Subhodeep Mukhopadhyay

1 comment:

  1. Very nice project; it will be useful to many people. I have a personal question: "Do the work to achieve some goal, but the goal is uncertain and beyond our control." Why do we not assess this uncertainty in a statistical sense? What advice in the Gita helps minimise this uncertainty?
    Let us join hands for a better solution.
