GSoC - Unified Subtitle Parser for XBMC DVDPlayer

Unified Subtitle Parser and Subtitle Display Filter Library for XBMC's in-house DVDPlayer video-player core


 * Proposed by student: betti
 * Proposed primary mentor: spiff
 * Proposed backup mentor: Elupus

= GSoC proposal for 2008 =
This is a discussion about a proposal for XBMC's Google Summer of Code 2008.

Name
Betul Kuruoglu

XBMC Forum: betti

XBMC.org wiki: ?

IRC: ?

CV: http://ceng.metu.edu.tr/~e1448943/BetulKuruoglu_CV/

Project Title
Unified Subtitle Parser and Subtitle Display Filter Library for XBMC's in-house DVDPlayer video-player core

Project Synopsis
This project consists of a subtitle parser and a subtitle filter library for the XBMC platform. The main idea is to improve XBMC's subtitle support. The project will be written in C++ and designed to be flexible enough to accommodate future improvements, such as new subtitle formats.

Project Goals
The goal is to implement a unified subtitle parser and subtitle display filter library for XBMC's in-house DVDPlayer video-player core. The main purpose is a single library that covers all common subtitle formats, gathering the subtitle parsers (improving the ones in use now) together with a subtitle display filter. Improved MPEG filters will be used. With the addition of the parser, the project will be usable mainly for XBMC.
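As a rough illustration of what "unified" could mean here, the sketch below shows one possible shape: a common parser interface, one format-specific implementation (SubRip/SRT), and a registry that picks a parser by file extension. All names (SubtitleCue, ISubtitleParser, SrtParser, SubtitleParserRegistry) are hypothetical and are not taken from the XBMC code base; the SRT handling is deliberately minimal and the code uses modern C++ for brevity.

```cpp
#include <cstddef>
#include <cstdio>
#include <memory>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical cue type: one subtitle entry with its display interval.
struct SubtitleCue {
    long startMs = 0;
    long endMs = 0;
    std::string text;
};

// Hypothetical common interface that every format parser would implement.
class ISubtitleParser {
public:
    virtual ~ISubtitleParser() = default;
    virtual bool CanParse(const std::string& extension) const = 0;
    virtual std::vector<SubtitleCue> Parse(const std::string& content) const = 0;
};

// Converts an SRT timestamp "HH:MM:SS,mmm" to milliseconds.
static bool ParseSrtTime(const std::string& s, long& ms) {
    int h = 0, m = 0, sec = 0, milli = 0;
    if (std::sscanf(s.c_str(), "%d:%d:%d,%d", &h, &m, &sec, &milli) != 4)
        return false;
    ms = ((h * 60L + m) * 60L + sec) * 1000L + milli;
    return true;
}

// Removes a trailing '\r' so files with Windows line endings also work.
static void StripCr(std::string& line) {
    if (!line.empty() && line.back() == '\r') line.pop_back();
}

// Minimal SRT parser: finds timing lines, then collects text until a blank line.
class SrtParser : public ISubtitleParser {
public:
    bool CanParse(const std::string& extension) const override {
        return extension == "srt";
    }

    std::vector<SubtitleCue> Parse(const std::string& content) const override {
        std::vector<SubtitleCue> cues;
        std::istringstream in(content);
        std::string line;
        while (std::getline(in, line)) {
            StripCr(line);
            const std::size_t arrow = line.find(" --> ");
            if (arrow == std::string::npos) continue;  // index line or blank
            SubtitleCue cue;
            if (!ParseSrtTime(line.substr(0, arrow), cue.startMs) ||
                !ParseSrtTime(line.substr(arrow + 5), cue.endMs))
                continue;  // malformed timing line: skip it
            while (std::getline(in, line)) {
                StripCr(line);
                if (line.empty()) break;  // blank line ends the cue
                if (!cue.text.empty()) cue.text += '\n';
                cue.text += line;
            }
            cues.push_back(cue);
        }
        return cues;
    }
};

// Registry: supporting a new format means registering one more parser.
class SubtitleParserRegistry {
public:
    void Register(std::unique_ptr<ISubtitleParser> parser) {
        parsers_.push_back(std::move(parser));
    }

    // Returns the first parser that accepts the extension, or nullptr.
    const ISubtitleParser* Find(const std::string& extension) const {
        for (const auto& p : parsers_)
            if (p->CanParse(extension)) return p.get();
        return nullptr;
    }

private:
    std::vector<std::unique_ptr<ISubtitleParser>> parsers_;
};
```

Under this assumed design, the player would only depend on ISubtitleParser and the registry, so a new subtitle format could be added without touching the display side.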

Project Benefits
 * A unified subtitle parser for the XBMC community: This will improve XBMC's subtitle support. The parser will cover all the common subtitle formats, which is what makes it useful. The parser will also be open source, so others can extend it in the future as new subtitle formats appear.
 * A subtitle display filter library: This will be a subtitle display filter with more features than the ones currently in use. It will also be very flexible with respect to future changes.

Project Tasks and Deliverables & Project Schedule
Periods: April 14th - 30th; May 1st - 15th; May 16th - 30th; June 16th - 30th; July; August; after the GSoC deadline.
Tasks:
 * Survey all documentation on recently improved subtitle parsers
 * Start coding the subtitle display filter
 * Improve MPEG knowledge and coding skills
 * Read documentation about parsers and study their code
 * Analyze community and developer demands for this project
 * Research other players' subtitle-related libraries
 * Prepare a requirements analysis document
 * Build the main part of the subtitle display filter
 * Start coding the parser part
 * Develop the application (detailed timeline to be determined)
 * Test and optimize the application
 * Write documentation
 * Improve the application in response to new demands

Personal Statement
I'm a third-year undergraduate student at Middle East Technical University (METU), Turkey. It is one of the best and most prestigious universities in Turkey, and I have good grades there. My major is computer engineering.

Unlike most successful programmers of my age, I had no programming background from high school, but it took me only a short time to make up for this. For that reason I can say that I learn well on my own, understand new material easily, and solve and implement problems quickly. I also have a good understanding of algorithms, data structures, programming paradigms and so forth.

We do challenging homework assignments and projects as part of our education. These include implementing algorithms (sorting, searching, hashing; tree, graph, maze and network flow algorithms; some heuristic algorithms) and writing parsers and evaluators. We have also gained experience with signals, pipelining, optimization, cracking, buffer overflows and some other important system concepts (e.g. most recently we implemented "A Linux Shell Using Ansi-C"). In addition, we do most of our homework and projects in C/C++, and I have very good skills in these programming languages. I am also good at programming that involves inheritance.

All of our homework and projects must be developed on Linux, and I have been a Linux user for almost 3 years, so I can easily find workarounds and solutions to problems there. I have also used Mac OS X 10.4.x (Tiger) for 8 months, so I feel very comfortable working on both Mac OS and Linux/Unix platforms. Even though I use Mac OS in my daily life, I have not had any trouble moving between operating systems. I also frequently use and write code on Windows, so I feel very comfortable on that platform too.

Most importantly for this project, I have previously completed an internship related to subtitles.

Details of the internship project: Named Entity Recognition on Subtitle Files, Converting to MPEG7 Format

I did my summer practice at the Intelligent Systems Lab, Department of Computer Engineering, Middle East Technical University. During it, I wrote a Java program that converts subtitle files to MPEG7 format files based on the proper nouns they contain. The project has an NLP (Natural Language Processing) component: subtitle files are written in natural language, so the project is a kind of named entity recognizer specialized for subtitle files. It aims for a high degree of reliability when searching for proper nouns, and its success rate is over 72%. The project combines 2 different named entity recognizers with my own improvements that address the recognizers' shortcomings.

 * Programming Languages: C/C++, Java, Haskell, Simply Scheme, Prolog, MIPS, PIC
 * Web Development: HTML, XML, Photoshop
 * Database Tools: MySQL
 * IDEs: Eclipse