In India, web development and communication are growing rapidly, with cheaper mobile communication acting as the catalyst. Mobile phones have transformed the culture of communication: even villagers now use technical terms such as SMS. A major difficulty arises, however, when web documents in regional languages are displayed. Understanding the content of such a document, and then communicating about it orally or in text, becomes difficult. This paper addresses that problem and, in the process, attempts a model of how knowledge is created in the mind of an illiterate user. The paper first presents how letters and words, which form the basis of text-based communication, can be used to convey content; content-related words are then chosen as the basis for training an artificial neural network (ANN). A comparison with a statistical approach, termed here the algorithmic approach, is made to show how an ANN could be more effective. © 2012 Springer-Verlag.