June 2004: Speech Recognition and all the lot
Kishore is back, and he is in ASR mode right now, so
let's work on it. I guess the next two months are going to be hectic with this
work, particularly when you want to work with
IIIT students. And yes, there is more work with PICOPETA and maybe OUTSIDE-ECHO.
I also need to pack up Khabrein and get it out this month, so that more research on
that side can be done. Further, I must write papers on the work done this summer.

May 2004: Summer Interns and lots more..
Working with several summer interns on lots of projects. See my FYP/Summer Projects List. Some are core paper projects. Some are add-on modules to systems we have developed, i.e. TTS and Khabrein. Also working on new projects under collaborations.
Further, I have some deliverables to send to PICOPETA.
So lots to do this month.

March 2004: Paper Writing
Writing a couple of long-pending papers. Want to be done with them by mid April.
Yes, I did complete and submit them. See the
publications page for some briefs.

February - March 2004: Khabrein - Online Hindi News Archives
Started to work on a basic system for archiving Hindi (and in
general Indian language) news and providing services using the news
archives: services like one-place news and spoken news bulletins, to be
extended further with search, categorization, news alerts, natural language query, etc. I have a
preliminary version. Write to me and I can send you a link. I am not sure of the
copyright implications etc.

February - March 2004:
Sebsibe H/Mariam is a Ph.D. student working with us in the Speech Lab here. He is a student at IIIT under an exchange program from Ethiopia. He took up a semester project course to develop a Prosody Manipulation system (basically pitch and duration modifier modules). So lately I have been sitting and discussing with him various details of the implementation of the modifiers and associated modules. He has done good work and has come up with nice modules, which we plan to use for some more interesting work in the coming summer.
He has also started working on a Limited Domain Isolated
Word Recognition System (concentrating on Indian languages and Amharic) under
the ARGUS Course. So we decided to build a very simple-to-use toolkit (RecogPack)
for rapidly building limited domain recognizers. An elementary version of
it has been built, but more work needs to be done to make it computationally
practicable. The coming summer will be a good time for it all.

December 2003 - February 2004: Indian Language TTS - Text Processing Front End
At the start of December I decided to have two items on my agenda for the TTS. One of them is to build a good text processing front end for the Indian Language TTS we are building here. By good, I mean a well-designed set of blocks for font converters, text normalization, NLP, and phonetization (& syllabification). I also wanted to get more standardized with the notation that the rest of the department and the world follows. The main push behind the requirement was Kishore's specification that we must support Unicode, ISCII and ITrans. Unicode and ISCII are supported; I am not sure if ITrans is worth the effort, so it is on standby for now. But yes, the converters for a large number of fonts and notations for Hindi and Telugu are ready, language-independent text normalization modules are in place, and a good design for the NLP modules has evolved which is pretty much in line with the efforts going on in the other labs in LTRC.
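To give a feel for what the syllabification block mentioned above has to do, here is a minimal sketch in Python. It is not the code of our front end: it assumes the text has already been converted to Unicode Devanagari, and the function names (`split_aksharas` and its helpers) are made up for this illustration. It only groups characters into orthographic syllables (aksharas), attaching a consonant cluster to the vowel sign that follows it.

```python
import unicodedata

VIRAMA = "\u094d"  # Devanagari virama (halant), joins consonants into a cluster


def is_consonant(ch):
    # Basic Devanagari consonants KA..HA (assumption: no extended letters)
    return "\u0915" <= ch <= "\u0939"


def is_independent_vowel(ch):
    # Independent vowel letters, as opposed to vowel signs (matras)
    return "\u0904" <= ch <= "\u0914"


def split_aksharas(word):
    """Split one Devanagari word into orthographic syllables (aksharas).

    A new unit starts at a consonant or independent vowel, unless the
    previous character was a virama (then the consonant joins the cluster).
    Vowel signs, nukta, anusvara etc. stay attached to the current unit.
    """
    word = unicodedata.normalize("NFC", word)
    units = []
    for ch in word:
        starts_unit = is_consonant(ch) or is_independent_vowel(ch)
        in_cluster = bool(units) and units[-1].endswith(VIRAMA)
        if not units or (starts_unit and not in_cluster):
            units.append(ch)
        else:
            units[-1] += ch
    return units
```

For example, `split_aksharas("हिन्दी")` gives two units, `हि` and `न्दी`. A real block would also need pronunciation rules (schwa deletion and so on), which this sketch ignores.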
Check out a recent presentation on the
Indian Language Text Processing Front End we
have developed.

November 2003: Photorealistic Visual Speech Synthesis
Have started working on developing a Visual Speech
Synthesis System built on corpus-based approaches and synthesis by unit
selection. I have a basic plan in mind: achieving co-articulation through
the use of divisemes, optimal text selection to provide coverage of the divisemes,
collection and segmentation of the corpus, and final synthesis by unit selection
and morphing. Here are some notes I wrote
some time back.

October 2003: Some recent developments in Indian Language TTS work
A Unit Pruning Approach has been implemented, and from preliminary observations we find that we can remove as much as 50% to 70% of the units present in the database without perceivable loss of quality (naturalness as well as intelligibility). This still needs to be established through perceptual testing; some objective measure could also be used for evaluation.

A system based on an Evolutionary Algorithm for Unit Selection has been implemented and is producing quality equivalent to the earlier systems. I need to further tune the various parameters and conduct several experiments and tests.
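One simple way to think about the unit pruning mentioned above is sketched here in Python. This is only an illustration under my own assumptions and not the approach we implemented: it assumes we have a log of which units the synthesizer picked while synthesizing a large text, and it keeps the most used units while making sure that every unit type (say, every syllable) retains at least one instance. The names `prune_units`, `units` and `usage_log` are hypothetical.

```python
from collections import Counter


def prune_units(units, usage_log, keep_fraction=0.4):
    """Return the set of unit ids to keep.

    units:         dict mapping unit id -> unit type (e.g. the syllable)
    usage_log:     iterable of unit ids picked during trial synthesis
    keep_fraction: rough share of the database to retain (assumed knob)
    """
    counts = Counter(usage_log)
    # Most frequently selected units first; ties broken by id for stability
    ranked = sorted(units, key=lambda u: (-counts[u], u))

    kept = set()
    seen_types = set()
    # Pass 1: keep the most used instance of every type to preserve coverage
    for u in ranked:
        if units[u] not in seen_types:
            seen_types.add(units[u])
            kept.add(u)

    # Pass 2: fill the remaining budget with the next most used units
    budget = max(len(kept), round(len(units) * keep_fraction))
    for u in ranked:
        if len(kept) >= budget:
            break
        kept.add(u)
    return kept
```

With `keep_fraction` between 0.3 and 0.5 this corresponds to removing 50% to 70% of the units. Whether quality actually survives is exactly what the perceptual tests have to establish.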
More recently an API for the text to speech system
in Windows has been implemented for robust application development in Windows
environments. For some details see API for Windows.

Currently, we at Speech Lab, LTRC are concentrating our efforts on developing Text to Speech Systems for Indian Languages and related research. As we are working with Data Driven Approaches, the related work involves text corpus selection, speech corpus generation and annotation, and often some perceptual testing as well. So analysing the current outputs, figuring out where things are going wrong and coming up with ideas is what goes on.

I am also attending the Speech Technology Course and coordinating it at the IIIT end, since it is being taught by Mr. S. P. Kishore from CMU, so quite a bit of my time is taken by the course and its assignments. I am also interacting with students of the Speech Course and with another student working in the Speech Lab on a Final Year Project, and I am working along with a Ph.D. candidate, so I figure I have a lot of interaction to do besides research and development.

We are developing systems as well as doing basic research. System development and integration work involves making the current synthesizer a more usable and distributable product, and basic research work involves exploring better ways of synthesis and ways of improving the current synthesis in both the text and the signal domains.

Immediately, I am exploring Speech Coding techniques to compress the corpus to a small size for easy distribution of the system. The L. M. D. S. system is the result of a massive scale-down of the full synthesizer to the bare minimum. See the projects page for a brief of this project. I am also exploring techniques for database pruning. Presently I am also working on developing an Evolutionary Approach for Unit Selection based Synthesis that I came up with recently. I want to spend a significant amount of research time on analysing the results of changing the various parameters of the Evolutionary Algorithm.
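To make the idea of an evolutionary search for unit selection a little more concrete, here is a toy sketch in Python. It is a generic genetic algorithm and not the system described above: the candidate representation (a tuple of target cost, start pitch and end pitch), the cost function and all parameter names are assumptions made for this illustration. Each individual picks one candidate unit per target position, and fitness is the sum of target costs plus join costs between neighbouring units.

```python
import random


def sequence_cost(seq, candidates):
    """Target costs plus join costs (pitch mismatch at each boundary)."""
    chosen = [candidates[i][k] for i, k in enumerate(seq)]
    cost = sum(unit[0] for unit in chosen)
    for left, right in zip(chosen, chosen[1:]):
        cost += abs(left[2] - right[1])  # end pitch vs next start pitch
    return cost


def evolve(candidates, generations=50, pop_size=30, mutation_rate=0.1, seed=0):
    """Search for a low cost unit sequence; returns (best_seq, history)."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    n = len(candidates)

    def cost(seq):
        return sequence_cost(seq, candidates)

    population = [[rng.randrange(len(candidates[i])) for i in range(n)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=cost)
        history.append(cost(population[0]))
        parents = population[: pop_size // 2]  # truncation selection
        children = [population[0][:]]          # elitism: best one survives
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n) if n > 1 else 0
            child = a[:cut] + b[cut:]          # one point crossover
            for i in range(n):
                if rng.random() < mutation_rate:
                    child[i] = rng.randrange(len(candidates[i]))
            children.append(child)
        population = children
    best = min(population, key=cost)
    history.append(cost(best))
    return best, history
```

The knobs here (`pop_size`, `mutation_rate`, `generations`) are the kind of parameters whose effect needs to be analysed. Because the best individual is always carried over, the best cost in `history` never gets worse from one generation to the next.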
Also, there are a lot of ideas jumping around in my head which I want to work on some time, possibly later in life or as projects with some IIIT students.

Also See: < My Projects > < My Ideas >

I maintain a technical blog of things I need to discuss with my guide, my boss, my colleague and everyone else (there is only one other person). The blog is at http://speech.iiit.net/~speech/rohit/
This Page Is:
http://speech.iiit.net/~rohit/work.htm
Last Update On: Saturday. 12. June. 2004