Archive:HOW-TO:Write plugins for XBMC: Difference between revisions

From Official Kodi Wiki
'''Note!''' Most plugins work by scraping a website, so they often stop working (partially or completely) when the website owner redesigns or otherwise changes the site. If a third-party plugin does not work, the cause is usually not a problem with XBMC but a change to the website it scrapes. Someone will have to manually edit the plugin code to account for the changes, and you will then have to get that updated version of the plugin.


= Introduction/Prerequisites =

You will need the following items:

* Wireshark - http://prdownloads.sourceforge.net/wireshark/
* Python - http://www.python.org/download/
* Tutorial source code - http://voinage-xbmc-plugins.googlecode.com/files/Plugin-tutorial.rar

There are also lots of streaming sources available for building plugins with:

* The [http://www.ovguide.com/ OVGuide].
* The [http://forum.xbmc.org/showthread.php?t=33668 plugin requests thread].

Before you begin, search the XBMC forum for the name of the plugin you are planning to create. It may already exist. Install Python and find where you put the [[source code]].

Right-click on the file 'Default.py' and select 'Edit with IDLE'.

= Plugin Template : Breakdown. =

This is just to familiarise yourself with the layout of the plugin template.

''Remember - it's all about the tabs.''

Python is just like you and me: it has the ability to use information from other sources. If you don't know anything about Python, you can read a book from a library and gain the knowledge.
It's the same with Python and its libraries.
A Python library is a collection of useful functions that the main program can call upon and use.

'''import urllib,urllib2,re,xbmcplugin,xbmcgui'''

* '''urllib -''' a collection of HTTP routines.
* '''urllib2 -''' a more advanced collection of HTTP routines.
* '''re -''' a collection of regular expression (string matching) routines.
* '''xbmcplugin -''' XBMC-specific routines.
* '''xbmcgui -''' XBMC-specific routines.

Without the above imports, Python has no reference to these routines and the plugin will error and fail.

The next thing you will see is the '''CATEGORIES()''' function.
This handy little function takes the stress out of XBMC directory listing.
It's really '''addDir()''' that lists and adds, but we are not going in depth ''yet''.

You define a function in Python with the '''def''' keyword.
This function adds the first directories that you see when you start the plugin.
Let's break it down. Scan down the plugin template.

I'm not going to go into too much detail here. When you call '''addDir()''' you are calling the function from the template. All that happens is that a name, URL, mode and image are passed to '''XBMC''' in a format it likes, and it is told to display them.

'''name''' = The name of the directory, either a string or the product of a scrape.

To make a string, give it a name, then type your string in quotes.

'''a = "this is a string"''' - notice how it turns green.

For the name that '''addDir''' needs, you can pass either a string:

'''addDir("string", "", 1, "")''' or a stored string: '''addDir(name, "", 1, "")'''

We will examine stored strings later.

'''url''' = The web address, either a string or the product of a scrape.

As above, pass either a string, "hxxp://www.testing.com", or a stored URL string, '''url'''.

Next on the list is the mode.

We have 3 modes here, but you can have as many as you need.

* '''mode 0''' = CATEGORIES()
* '''mode 1''' = INDEX(url,name)
* '''mode 2''' = VIDEOLINKS(url,name)

So in the '''CATEGORIES()''' function we call '''addDir("", "", 1, "")''', which tells the plugin to move on to '''mode 1 = INDEX(url)'''.

In short: tell '''XBMC''' to add the first directories, then move on to the '''INDEX''' function.

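The mode list above can be pictured as a small dispatcher. This is only a sketch: the three functions are stand-ins that return a description instead of talking to XBMC, and the name <code>dispatch</code> is made up for this example rather than taken from the template.

```python
# Stand-in functions: in the real plugin these would add items to XBMC.
def CATEGORIES():
    return "CATEGORIES()"

def INDEX(url, name):
    return "INDEX(%s, %s)" % (url, name)

def VIDEOLINKS(url, name):
    return "VIDEOLINKS(%s, %s)" % (url, name)

def dispatch(mode, url=None, name=None):
    """Pick the function that belongs to a mode number (hypothetical helper)."""
    if mode is None or mode == 0:
        return CATEGORIES()
    elif mode == 1:
        return INDEX(url, name)
    elif mode == 2:
        return VIDEOLINKS(url, name)
    raise ValueError("unknown mode: %r" % mode)

print(dispatch(0))
print(dispatch(1, "hxxp://www.testing.com", "MOVIES"))
```

Adding a fourth mode would mean one more function and one more <code>elif</code> branch.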
Basically a thumbnail.

'''addDir(name, url, 1, "http://testing.com/image.gif")''' or
a stored string: '''addDir(name, url, 1, thumb)'''

'''XBMC''' can handle pretty much ''all image formats: jpg, gif, png etc.''

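To make the four addDir arguments more concrete, here is a rough sketch (in Python 3, runnable outside XBMC) of how a name, URL, mode and thumbnail could be bundled into one list entry. The helper <code>build_item</code>, the <code>plugin://plugin.video.example/</code> base address and the dictionary layout are all assumptions for illustration; the template's own addDir talks to XBMC directly.

```python
from urllib.parse import urlencode, parse_qs

def build_item(base, name, url, mode, iconimage, is_folder=True):
    """Hypothetical helper: describe one list entry as plain data."""
    query = urlencode({"url": url, "mode": mode, "name": name})
    return {
        "label": name,
        "callback": base + "?" + query,  # address XBMC would call back with
        "thumbnail": iconimage,
        "is_folder": is_folder,          # True for a directory entry
    }

item = build_item("plugin://plugin.video.example/", "MOVIES",
                  "hxxp://www.testing.com/movies.html", 1,
                  "http://testing.com/image.gif")

# Decoding the callback address gives the url, mode and name back.
params = parse_qs(item["callback"].split("?", 1)[1])
print(params["mode"], params["url"], params["name"])
```

The point of the round trip is that the mode number travels with the entry, so the plugin knows which function to run when the entry is selected.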
'''addLink(name,url,iconimage)''' is exactly the same, except that the '''isfolder=''' option is set to false to tell XBMC that this is a video link and not a folder.

The next function is '''INDEX(url)'''. It opens the webpage of your chosen website, pretends to be a browser, reads the page, stores it and then scans it for links to the sacred video files.


'''url''' is the web address that we passed from the previous '''CATEGORIES()''' call:
'''addDir(name, url, 1, thumb)'''

We have added a header, User-Agent.

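For readers on Python 3, where <code>urllib2</code> has become <code>urllib.request</code>, the request with its User-Agent header might be sketched as follows. The function names <code>build_request</code> and <code>fetch</code> are made up for this example; <code>fetch</code> is defined but not called, because it needs a network connection.

```python
import urllib.request

# Browser identification string used later in this tutorial.
USER_AGENT = ("Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.9.0.3) "
              "Gecko/2008092417 Firefox/3.0.3")

def build_request(url):
    """Create a request that pretends to come from a browser."""
    req = urllib.request.Request(url)
    req.add_header("User-Agent", USER_AGENT)
    return req

def fetch(url):
    """Download a page as text. Not called here: it needs a network."""
    response = urllib.request.urlopen(build_request(url))
    try:
        return response.read().decode("utf-8", "replace")
    finally:
        response.close()

req = build_request("http://www.tvdash.com/movies.html")
# urllib stores header names with only the first letter capitalised.
print(req.get_header("User-agent"))
```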
<source lang="python">
response = urllib2.urlopen(req)       # Opens the connection and sends the request.
link=response.read()                  # Holds the webpage that was read via the response.read() command.
response.close()                      # Closes the connection after we have read the webpage.
match=re.compile('').findall(link)    # The regular expression goes between the quotes.
</source>

Just familiarise yourself for now.

''Open hxxp://xbmc.org/forum/register.php in your browser (I use Firefox).''

Right-click and select '''View page source.'''
Scroll right down to the bottom of the page. Do you see:

Imagine that this was the URL and name of a video.

* '''match[0] = smallfont'''
* '''match[1] = center'''

You can experiment with regular expressions like these using a free online tool such as [http://gskinner.com/RegExr/ RegExr].

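If you would rather experiment in the Python shell, the sketch below shows what <code>findall</code> hands back. The markup in <code>page</code> is invented for the example and is not the real forum page. The code is written for Python 3, so <code>print</code> takes brackets.

```python
import re

# Invented sample markup, only for illustration.
page = '<div class="smallfont" align="center">footer</div>'

# One pair of brackets: findall returns a list of strings.
classes = re.compile('class="(.+?)"').findall(page)

# Two pairs of brackets: findall returns a list of tuples,
# one tuple per place where the whole pattern matched.
pairs = re.compile('class="(.+?)" align="(.+?)"').findall(page)

print(classes)
print(pairs)
print(pairs[0][0], pairs[0][1])
```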
As you can see, the '''VIDEOLINKS()''' function is much the same, except that it uses '''addLink instead of addDir.'''

Minimize the template, right-click on '''test.py''', select '''Edit with IDLE''', open your browser and let's really do something.

'''If you are still unsure about what you have read, re-read it.'''

== Test, test, test again. ==

Right, let's have some fun.

Open your browser (I use Firefox for the tutorial).

'''Ctrl & click the link:''' http://www.ovguide.com/

Scroll down and find TV DASH; that's the fellow for this tutorial.

'''Click it, or Ctrl & click the link:''' http://www.tvdash.com/

Once it's open, bookmark it.

[[File:image1.jpg]]

Excellent choice. At first glance we have thumbnails for the movies and TV shows, which is great for XBMC.

Look at the '''top center of the screen: Home, Movies, TV Shows, Games.'''

They will become our main directories and, as there are only two that we need, we will add them manually.

Before we do that, let's get the scraping and testing done.

'''Click on the 'MOVIES' link.'''

[[File:image2.jpg]]

'''url="hxxp://www.tvdash.com/movies.html"'''

Press '''F5''', which is really "Run Module" in IDLE.

[[File:image3.jpg]]

The parentheses are the graphical representation of what's stored in match.

'''In this case, nothing, yet.'''

The '''print match''' command causes the display; you can print any variable or string.

Remove it by putting a # in front of it: '''#print match'''. See how it turns red.

Any text you put after a # character becomes a non-executed '''note'''. Great for adding your plugin title.

The reason for the empty '''match''' variable is that we haven't provided any regular expression to scrape the data. We will soon.

All commands are case-sensitive, so type exactly what you have in '''test.py''' only.

Type '''print response.info()''' and press ENTER. This will bring up the server response from the website.

[[File:image4.jpg]]

See how powerful Python is?

'''Press Ctrl & U to view the page source.'''

At this point it is '''very easy''' to be put off. Don't be: once your eye is in, it becomes second nature. Honest.

[[File:image5.jpg]]

Hmmm ... Press '''Ctrl & F''' to bring up the search and enter the name of the first movie.

'''10 things i hate about you''' - now this is more like it.

[[File:image6.jpg]]

The '''name, iconimage (thumbnail) and url are all here.'''

Once it has completed, you will have a display of the webpage ready for scraping.

[[File:image7.jpg]]

Press '''Ctrl & F''' to bring up the search, enter '''10 things I hate about you''' and press ENTER.

The following bits are what we need for the tutorial.

* '''http://farm4.static.flickr.com/3190/2881209823_799288181c_m.jpg = Thumbnail'''
* '''http://www.tvdash.com/movies/10_Things_I_Hate_About_You_(1999)_(Requested).html = Url'''
* '''10 Things I Hate About You (1999) (Requested) = name'''

Close your Python shell and restore your '''test.py'''.

Place your cursor in between the quotes of '''match=re.compile("").findall(link)'''

Press '''Ctrl & V.''' You should see this:

<source lang="python">
match=re.compile('src="http://farm4.static.flickr.com/3190/2881209823_799288181c_m.jpg" border="0" /></td>\n <td class="namewidth"><a title="Watch 10 Things I Hate About You (1999) (Requested) Online" href="http://www.tvdash.com/movies/10_Things_I_Hate_About_You_(1999)_(Requested).html">10 Things I Hate About You (1999) (Requested)</a>').findall(link)
</source>

The text between the single quotes is green because Python recognizes a string and highlights it.

* '''Green = good'''
* '''Red = bad''', provided the code is not after a # character.

Python uses # to mark notes.

Future reference.

If the scrape has single quotes in it, such as '''src= ' http://tester.html ' '''

you will need to change the wrapping quotes from '''match=re.compile( 'src= ' http://tester.html ' ' )'''

to '''match=re.compile( "src= ' http://tester.html ' " )''' so as not to confuse Python.

Okay, let's scrape that site.

There are many useful regular expressions that Python understands, but for basic site scraping you generally need only one: put brackets around what you want to capture, then delete the text inside them and replace it with:

(.+?) - Yep, it really can be that simple, sometimes.

Let's just say it means take everything inside the brackets. It is a basic tutorial, after all.

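Both points, the wrapping quotes and the '''(.+?)''' capture, can be tried together in a few lines. This is a small Python 3 sketch; the <code>link</code> text is made up around the tester.html example.

```python
import re

# Made-up page text containing single quotes.
link = "<img src= ' http://tester.html ' >"

# The pattern contains single quotes, so it is wrapped in double quotes.
# (.+?) captures whatever sits between the two quoted spaces.
match = re.compile("src= ' (.+?) ' ").findall(link)

print(match)
```

Wrapping the same pattern in single quotes would end the string at the first inner quote and Python would report a syntax error.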
'''Try here for really in-depth help - I did.'''

* http://www.amk.ca/python/howto/regex/
* http://www.python.org/doc/2.5.2/lib/re-syntax.html This will explain the true meaning.
* http://www.python.org/doc/2.5.2/lib/module-re.html
* http://www.regular-expressions.info/python.html

Shall we scrape?

<source lang="python">
match=re.compile('src="(.+?)" border="0" /></td>\n <td class="namewidth"><a title=".+?" href="(.+?)">(.+?)</a>').findall(link)
</source>

If the movie is '''Friday 13th''', the title would say '''Watch Friday 13th''',
not '''Watch 10 things I hate about you''',

so we tell Python to skip it with '''.+?'''

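To see the three captures and the skipped title at work without visiting the site, the pattern can be run against a hand-built snippet. In this Python 3 sketch the first row reuses the values quoted earlier; the second row, including its example.com addresses, is invented to show that the title really is skipped.

```python
import re

# One table row as it appears in the page source, with gaps to fill in.
ROW = ('<img src="%s" border="0" /></td>\n'
       ' <td class="namewidth"><a title="Watch %s Online" href="%s">%s</a>')

link = "\n".join([
    ROW % ("http://farm4.static.flickr.com/3190/2881209823_799288181c_m.jpg",
           "10 Things I Hate About You (1999) (Requested)",
           "http://www.tvdash.com/movies/10_Things_I_Hate_About_You_(1999)_(Requested).html",
           "10 Things I Hate About You (1999) (Requested)"),
    # Invented second row.
    ROW % ("http://example.com/friday.jpg", "Friday 13th",
           "http://example.com/Friday_13th.html", "Friday 13th"),
])

pattern = ('src="(.+?)" border="0" /></td>\n <td class="namewidth">'
           '<a title=".+?" href="(.+?)">(.+?)</a>')
match = re.compile(pattern).findall(link)

for thumbnail, url, name in match:
    print(name, "|", url, "|", thumbnail)
```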
Okay, go back and re-read the last section.

Honestly, people today. It wouldn't have happened in my day...

Ready now?

[[File:image8.jpg]]

Maximize your Python shell, type '''match''' at the '''>>>''' prompt and press ENTER.

[[File:image9.jpg]]

Houston, we have an RE match.

The variable match now holds the information we wanted in the form of '''tuples''', one per movie, collected in a list.

[[File:tribble.gif]] No, not a Tribble - bad Trekkie.

'''For information on tuples try here:'''

Let's use the '''len()''' command, which gives the length of a variable.

Type '''len(match)''' at the '''>>>''' prompt and press ENTER.

We get '''251'''

'''>>>match[10] or match[-10] etc.'''

Each entry in match holds 3 things. Try: '''match[0][0]''' hmmmm
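The indexing can be practised on a tiny hand-made list, sketched here in Python 3. The three tuples are placeholders, not real scrape results.

```python
# Placeholder data in the same shape as the scrape result:
# a list of (thumbnail, url, name) tuples.
match = [
    ("thumb0.jpg", "page0.html", "Movie 0"),
    ("thumb1.jpg", "page1.html", "Movie 1"),
    ("thumb2.jpg", "page2.html", "Movie 2"),
]

print(len(match))     # how many tuples were found
print(match[0])       # the first tuple
print(match[-1])      # the last tuple, counting from the end
print(match[0][0])    # first item of the first tuple: the thumbnail

# A tuple can also be unpacked into three names in one go.
thumbnail, url, name = match[1]
print(name)
```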


=== Putting it together. ===

Well, where does that leave us?

If you would like '''XBMC''' to display a custom thumbnail for the '''MOVIES''' directory,

put a URL to an image or a path to an image inside the single quotes in the thumbnail area.

[[File:image10.jpg]]

The path should be on your Xbox, not your PC.

We now should have:

[[File:image11.jpg]]

Here we go: for every occurrence of the '''tuple (thumbnail,url,name)''' found in '''match''', add the directory:

'''MOVIE NAME, URL, MODE=2, THUMBNAIL to XBMC.'''
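The loop just described might look like the Python 3 sketch below. Here <code>addDir</code> is a stand-in that only records each call in a list, and the two tuples in <code>match</code> are made-up sample data.

```python
added = []

def addDir(name, url, mode, iconimage):
    """Stand-in for the template's addDir: records the call instead of telling XBMC."""
    added.append((name, url, mode, iconimage))

# Made-up scrape result: (thumbnail, url, name) tuples.
match = [("thumbA.jpg", "hxxp://www.testing.com/a.html", "Movie A"),
         ("thumbB.jpg", "hxxp://www.testing.com/b.html", "Movie B")]

# Unpack every tuple and add one directory per movie, all with mode 2.
for thumbnail, url, name in match:
    addDir(name, url, 2, thumbnail)

print(added)
```

Note the order: the tuple arrives as thumbnail, url, name, while addDir wants name, url, mode, thumbnail.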


Let's go to '''mode=2''', which is '''VIDEOLINKS(url,name)'''.

We are passing the '''url and the name from INDEX(url)''', which only received the '''url from CATEGORIES()''', along to '''VIDEOLINKS(url,name)'''. Whatever lives in the brackets is passed along.

If you change this, remember to change the mode area also!

'''Right-click''' on the first movie link and select '''Copy Link Location.'''

[[File:image12.jpg]]

Before we move on, it's worth pointing out that nearly all video on websites is '''hidden, or at least never just a direct link:'''

They will generally be hosted elsewhere by a provider such as:

'''Megavideo, Tudou, Youku, YouTube, Google, Veoh.'''

They require a lot more handling than '''direct links''', but the routines are provided for you.

Paste the link into your restored browser address bar and press ENTER.

[[File:image13.jpg]]

Well, what do we look for?

Press '''F5''' to run '''test.py'''.

Type '''link''' at the '''>>>''' prompt and press ENTER.

[[File:image14.jpg]]

Press '''Ctrl & F''' and type '''Megavideo'''.

We know it's '''Megavideo because of the red play button.'''

Close the Python shell, then select '''test.py''',

press '''F5''' and type '''match''' at the '''>>>''' prompt.

[[File:image15.jpg]]

So '''match[0]''' now holds the '''url'''.

We will find that later. For now:

'''Copy the RE from match=re.compile(' here ') in test.py'''

Restore '''Default.py''' and paste it into '''match=re.compile(' here ').findall(link)'''

in '''VIDEOLINKS(url,name)'''.

Don't panic. Well, maybe just a little.

[[File:image16.jpg]]

Highlight and copy '''part 1, then restore Default.py''' and paste it below '''VIDEOLINKS(url,name)'''

Restore your browser, click on 'MOVIES' and scroll down

until you find '''Streets of fire (1984)'''.

[[File:image17.jpg]]

Click on it. Whoa, a new video provider.

See the download link... hmmmmm

[[File:image18.jpg]]

I will help you a bit here, as it isn't as easy as it seems.

'''Ctrl & V'''

Press '''F5''', type '''match''' at the '''>>>''' prompt and press '''ENTER'''.

[[File:image19.jpg]]

[[File:image20.jpg]]

'''match''' comes up empty; the '''RE''' is different.

Close the Python shell and restore '''test.py'''.

Press '''F5''', type '''link''' at the '''>>>''' prompt and press '''ENTER'''.

[[File:image21.jpg]]

Try '''Ctrl & F''', then search for '''stage'''.

Press ENTER.

All it means is: if the length of match is less than 1. The ''':''' tells Python that the instruction is complete and that an indented block follows.

See how Python uses tabs? This is correct, don't worry.

Try leaving the ''':''' out and press ENTER. See what happens.

This will lead to an error when you press 'F5'.

[[File:error.jpg]]

Python likes colons.

Press '''OK''' and put the ''':''' back.
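As a rough picture of where this is heading, the "if the first pattern finds nothing, try another" idea can be sketched in Python 3 as below. The function name and the <code>embed</code> pattern are invented placeholders; only the iframe pattern is the one used in this tutorial.

```python
import re

def find_video_page(link):
    # First try: a made-up pattern standing in for the first provider.
    match = re.compile('<embed src="(.+?)"').findall(link)
    if len(match) < 1:  # the colon ends the 'if' line; the indented block follows
        # Nothing found, so try the iframe pattern for the second provider.
        match = re.compile("<iframe style='border: 0; width: 750px; "
                           "height: 480px' src='(.+?)' scrolling='no'>"
                           "</iframe>").findall(link)
    return match

sample = ("<iframe style='border: 0; width: 750px; height: 480px' "
          "src='http://example.com/embed?a=1&amp;b=2' scrolling='no'></iframe>")
print(find_video_page(sample))
print(find_video_page('<embed src="x.swf">'))
print(find_video_page(""))
```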


Next, after checking the tabs, type

<source lang="python">
           match=re.compile('<iframe style=\'border: 0; width: 750px; height: 480px\' src=\'(.+?)\' scrolling=\'no\'></iframe>').findall(link)
           match[0]=re.sub('&amp;','&',match[0])
</source>

Type '''match''' at the '''>>>''' prompt and press ENTER.

[[File:image22.jpg]]

OK, the '''StageVu''' URL.

Under the last match, type:

'''match[0]=re.sub('&amp;','&',match[0])'''
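What that line does can be checked on its own. In this Python 3 sketch the address in <code>match</code> is made up; <code>html.unescape</code> is shown as an alternative that is not part of the tutorial's code.

```python
import re
import html

# Made-up scraped address: page source writes & as &amp;.
match = ["http://example.com/embed?a=1&amp;b=2"]

# Replace every '&amp;' with a plain '&' so the address can be requested.
match[0] = re.sub('&amp;', '&', match[0])
print(match[0])

# Alternative: the standard library can decode HTML entities in general.
print(html.unescape("http://example.com/embed?a=1&amp;b=2"))
```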
We now have a link to a page that has the direct link to a video file.

[[File:image23.jpg]]

Right-click and select '''View page source'''.

'''Ctrl & F''', type '''avi'''.

You will get three choices.

Press '''F3''' to keep searching.

'''<param name="src" value="(.+?)" />''' is the best choice.

[[File:image24.jpg]]

So, close the page source tab or window and restore '''test.py'''.

Press '''F5'''.

'''Ctrl & F'''

Type '''avi''' and press '''F3'''.

In other words: try a piece of code but, if an error occurs, use another piece of code.

It's a lifesaver. [[File:lifesaver.jpg]]
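A minimal Python 3 sketch of the idea, assuming the error in question is the IndexError raised when a pattern finds nothing and <code>[0]</code> is asked for anyway. The function name and the fallback pattern are made up for illustration.

```python
import re

def first_video_link(link):
    try:
        # Asking for [0] raises IndexError when the pattern finds nothing.
        return re.compile('<param name="src" value="(.+?)" />').findall(link)[0]
    except IndexError:
        # Use another piece of code: a second, hypothetical pattern.
        fallback = re.compile('href="(.+?\\.avi)"').findall(link)
        return fallback[0] if fallback else None

print(first_video_link('<param name="src" value="http://example.com/film.avi" />'))
print(first_video_link('<a href="http://example.com/other.avi">x</a>'))
print(first_video_link(""))
```

Catching only IndexError, rather than every error, keeps real mistakes such as typos visible.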


The last time we looked at '''test.py''' it looked like this.

You will thank me later.

Cheers, Vin. [[File:devil.jpg]]

Line 1,096: Line 1,096:
It's called by typing its name with the video link to resolve in the brackets.

'''Megavideo(match[0])''' - Remember that the Megavideo link is stored in '''match[0]'''

It takes the url that you supply and resolves it into a Megavideo flv, then adds a playable link.
Line 1,119: Line 1,119:


<source lang="python">
<source lang="python">
match=re.compile('<iframe style=\'border: 0; width: 750px; height: 480px\' src=\'(.+?)\' scrolling=\'no\'></iframe>').findall(link)
match[0]=re.sub('&amp;','&',match[0])
req = urllib2.Request(match[0])
req.add_header('User-Agent', 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.9.0.3) Gecko/2008092417 Firefox/3.0.3')
Line 1,141: Line 1,141:
We then scrape it with the new '''RE''' and add a link so xbmc can play it.

The full code is available if you open and run '''Toughlove.py'''

Before you run it put a '''#''' in front of addlink as that command is only good for xbmc.
Line 1,205: Line 1,205:
If it`s working and formed well you should see.

[[File:image25.jpg]]

That`s fine.
Line 1,217: Line 1,217:
Don`t give up on getting it right if you used the Last resort !

==== Looking back , scanning forward. ====


Well folksages,

That about wraps it up.

Make a folder and name it :- Whatever you want, try '''Tvdash'''
Line 1,228: Line 1,228:
Open your Tv Dash bookmark and Right click on the Tv Dash logo.

Select '''View Background image''' right click on the image

and Save image as '''try default.tbn'''

Everytime you highlight the tvdash folder in xbmc it will display that image.
Line 1,243: Line 1,243:




===== PS =====

Oh , I almost forgot.
Line 1,263: Line 1,263:
http://tvdash.com/main.php

so in the '''addLink(name,’http://tvdash.com/’+url,’’)''' simple eh.

Pm me if you need help
Line 1,278: Line 1,278:




====== Swimming with Wireshark ======

I`m back . . because nobody else would come.
Line 1,297: Line 1,297:
If you are running '''Windows'''

* Right click on your internet connection icon - located bottom right, near the time.

* Select - Open Network connections

* Move the mouse pointer over your connection and write down the network card name.

Open Wireshark, '''cue scary music'''
Line 1,307: Line 1,307:
Firstly, go to:

* Edit --->Preferences--->Capture

[[File:wire1.jpg]]

* Make sure that the '''Default Interface''' is set to your network card name.

* Capture packets in promiscuous mode is unchecked.

* Apply your changes.

Let`s hunt some Internet Truffles.
Line 1,321: Line 1,321:
-Will be continued, a deadline approaches.

[[Category:How To|Python Tutorial]]
[[Category:Inner Workings]]
[[Category:Python]]
[[Category:Plugins]]
[[Category:Development]]
[[Category:Skin Development]]
[[Category:Add-ons]]
[[Category:To-Do]]

Revision as of 16:13, 7 September 2011

THIS PAGE IS OUTDATED:

This page or section has not been updated in a long time, no longer applies, refers to features that have been replaced/removed, and/or may not be reliable.

This page is only kept for historical reasons, or in case someone wants to try updating it.

XBMC features a Python engine, and the XBMC GUI has built-in 'Add-on' support for displaying content listings.

Plugins, unlike scripts, do not really provide new functionality to XBMC. What they do is provide an easy way to present content listings in XBMC through the native GUI. Content usually comes from online sources such as Internet audio and video streams (Internet TV channels, radio stations, or podcasts), or pictures from websites such as Flickr and Picasa Web.

Please feel free to add more samples of simple plugin functions, with comments, that you know or figure out while you are learning to write or edit plugins for XBMC. Add anything at all, no matter how basic: if it's not already here, someone will more than likely benefit from it. The more difficult your snippet, the more heavily you should comment it with "#". Don't be stingy with the comments; what's simple to you may make no sense at all to someone else. URLs that were helpful to what you were doing (python sites, chat rooms, etc.) would also be great to add to your snippet or to the bookmark section.

For end-user instructions on how to install plugins for XBMC please see HOW-TO install and use plugins in XBMC.

Note! As most plugins work by scraping a website they often stop working (partially or completely) when changes (redesign) are made to the website by the website owner. So if a third-party plugin does not work know that it is not usually due to a problem with XBMC, but rather the website it scrapes has had changes done to it, meaning someone will have to manually edit the plugin code to account for the new changes, then you will have to get that updated version of the plugin.

Introduction/Prerequisites

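You will need the following items:

  • Wireshark - http://prdownloads.sourceforge.net/wireshark/
  • Python - http://www.python.org/download/
  • Tutorial source code – http://voinage-xbmc-plugins.googlecode.com/files/Plugin-tutorial.rar

There are also lots of streaming sources available for building plugins with:

  • The OVGuide - http://www.ovguide.com/
  • The plugin requests thread - http://forum.xbmc.org/showthread.php?t=33668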

Before you begin, search the XBMC forum for the name of the plugin you are planning on creating. It may already exist. Install Python and find where you put the source code.

Right click on the file 'Default.py' and select 'Edit with Idle'.

Plugin Template : Breakdown.

This is just to familiarise yourself with the layout of the plugin template. It`s not in depth – we will delve deeper later.

Remember – it`s all about the Tabs.

Python is just like you and me: it can use information from other sources. If you don't know anything about Python, you can read a book from a library and gain the knowledge. It's the same with Python and its libraries. A Python library is a collection of useful functions that the main program can import and call.

To use them we need to import them. You will only need five at most for basic plugin creation.

import urllib,urllib2,re,xbmcplugin,xbmcgui

  • urllib – a collection of http routines.
  • urllib2 – a more advanced collection of http routines.
  • re – a collection of string manipulation routines.
  • xbmcplugin – specific Xbmc routines.
  • xbmcgui – specific Xbmc routines.

Without the above imports, python has no reference and the plugin will error and fail.

The next thing you will see is the CATEGORIES() function. This handy little function takes the stress out of XBMC directory listing. It's really addDir() that lists and adds, but we are not in depth - YET.

def CATEGORIES():
      addDir("","",1,"")
      addDir( "","",1,"")
      addDir( "","",1,"")
      addDir("","",1,"")

You define a function in python with the def command. This function adds the first Directories that you see when you start the plugin. Let`s break it down. Scan down the Plugin template.

def addDir(name,url,mode,iconimage):
    # sys.argv[0] is the plugin's own address, so sys must be imported as well
    u=sys.argv[0]+"?url="+urllib.quote_plus(url)+"&mode="+str(mode)+"&name="+urllib.quote_plus(name)
    ok=True
    liz=xbmcgui.ListItem(name, iconImage="DefaultFolder.png",thumbnailImage=iconimage)
    liz.setInfo( type="Video", infoLabels={ "Title": name })
    # isFolder=True tells XBMC that this entry opens another listing
    ok=xbmcplugin.addDirectoryItem(handle=int(sys.argv[1]),url=u,listitem=liz,isFolder=True)
    return ok


I'm not going to go into too much detail here. When you call addDir() you are calling the function above. All that happens is that a name, url, and image are passed to XBMC in a format it likes, and XBMC is told to display them.
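The first line of addDir() can be tried outside XBMC. Here is a minimal sketch, assuming a made-up base address in place of sys.argv[0]; the helper name build_plugin_url is hypothetical and only there for illustration. The import shim is an addition so the snippet also runs on Python 3, where quote_plus lives in urllib.parse.

```python
try:
    from urllib import quote_plus          # Python 2, as used in this tutorial
except ImportError:
    from urllib.parse import quote_plus    # Python 3

def build_plugin_url(base, url, mode, name):
    # Same shape as the string addDir() builds, with '&' between every value
    return (base + "?url=" + quote_plus(url)
            + "&mode=" + str(mode)
            + "&name=" + quote_plus(name))

# 'plugin://plugin.video.tvdash/' is a hypothetical stand-in for sys.argv[0]
u = build_plugin_url("plugin://plugin.video.tvdash/",
                     "http://www.tvdash.com/movies.html", 1, "MOVIES")
print(u)
```

quote_plus() is what keeps the slashes and colons of the web address from being confused with the plugin's own address.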

name = The name of the Directory, either a string or the product of a scrape.

Python likes strings or stored strings.

To make a string, give it a name then type your string in quotes.

a = 'this is a string' , notice how it turns green.

For the name that addDir needs you can either pass it a string

addDir('string', '', 1, '') or a stored string addDir(name, '', 1, '')

We will examine stored strings later.

url = The web address, either a string or the product of a scrape.

As above, pass either a string 'hxxp://www.testing.com' or a stored url string – url.

Next on the list is:

mode = Where in the program to jump to .

Look at the bottom of the Template.

if mode==None or url==None or len(url)<1:
        print ""
        CATEGORIES()
     
elif mode==1:
        print ""+url
        INDEX(url)
        
elif mode==2:
        print ""+url
        VIDEOLINKS(url,name)

We have 3 modes here, but, you can have as many as you need.

  • mode None = CATEGORIES()
  • mode 1 = INDEX(url)
  • mode 2 = VIDEOLINKS(url,name)

So in the CATEGORIES() function, we call addDir('','',1,'') which tells the plugin to move on to mode 1 = INDEX(url).

Tell Xbmc to add the first Directories then move on to INDEX function.

See it`s not awful.
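To see how the mode travels from addDir() back into the if/elif block, here is a rough sketch that reads the values out of a query string and picks the function to run. It is an illustration only: pick_function is a made-up name, it returns the function's name instead of calling it, and the sample query string is an assumption (in XBMC plugins the query normally arrives as sys.argv[2]).

```python
try:
    from urlparse import parse_qs          # Python 2
except ImportError:
    from urllib.parse import parse_qs      # Python 3

def pick_function(query):
    # query looks like '?url=...&mode=1&name=...'
    params = parse_qs(query.lstrip("?"))
    url = params.get("url", [None])[0]
    mode = params.get("mode", [None])[0]
    if mode is None or not url:
        return "CATEGORIES"      # nothing chosen yet, show the first folders
    if int(mode) == 1:
        return "INDEX"
    if int(mode) == 2:
        return "VIDEOLINKS"
    return "CATEGORIES"

print(pick_function(""))
print(pick_function("?url=http%3A%2F%2Fwww.tvdash.com%2Fmovies.html&mode=1&name=MOVIES"))
```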

The last term here is :

Iconimage = An image for display with the Directory or link, either string or stored string. Basically a thumbnail.

addDir (name, url , 1 , ’http://testing.com/image.gif’) or a stored string addDir(name, url , 1 ,thumb )

Xbmc can handle pretty much all image formats : jpg ,gif, png etc

The addLink(name,url,iconimage) function is exactly the same, except the isFolder= option is set to False to tell Xbmc that this is a video link and not a folder.

The next Function is the INDEX (url) : this opens the webpage of your chosen website, pretends to be a browser, reads it, stores the page and then scans the page for links to the sacred video files.

def INDEX(url):
        req = urllib2.Request(url)
        req.add_header('User-Agent', ' Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.9.0.3) Gecko/2008092417 Firefox/3.0.3')
        response = urllib2.urlopen(req)
        link=response.read()
        response.close()
        match=re.compile('').findall(link)
        for thumbnail,url,name in match:
                addDir(name,url,2,'')

We are going to call on the urllib2 library not the urllib library to connect to the web page because we can spoof the website into believing that the plugin is a real person using a browser.

req = urllib2.Request(url)
req.add_header('User-Agent', ' Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.9.0.3) Gecko/2008092417 Firefox/3.0.3')

url - is the web address that we passed during the previous CATEGORIES () - addDir(name, url , 1 ,thumb )

We have added a header, User-Agent. This allows Python to pretend to be anything else: Firefox, Internet Explorer, an iPod, a phone, Veoh.

We can send any headers here using the same format. Just copy the req.add_header line and put it under the other one.

Some sites need specific headers to access their information. The example below pretends to be Firefox 3.0.3.

req.add_header('User-Agent', ' Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.9.0.3) Gecko/2008092417 Firefox/3.0.3')
req.add_header('Content-type', 'octet-stream/flv')
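If you want to check which headers will be sent before touching the network, something like the sketch below works. It assumes a made-up helper called build_request and a Referer header chosen purely as an example; the import shim is an addition so it also runs on Python 3, where urllib2 became urllib.request.

```python
try:
    import urllib2                          # Python 2, as used in this tutorial
except ImportError:
    import urllib.request as urllib2        # Python 3 equivalent

USER_AGENT = ('Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.9.0.3) '
              'Gecko/2008092417 Firefox/3.0.3')

def build_request(url, extra_headers=None):
    # Nothing is sent here; the connection only happens in urllib2.urlopen(req)
    req = urllib2.Request(url)
    req.add_header('User-Agent', USER_AGENT)
    for key, value in (extra_headers or {}).items():
        req.add_header(key, value)
    return req

# The Referer value is only an example of an extra header
req = build_request('http://www.tvdash.com/movies.html',
                    {'Referer': 'http://www.tvdash.com/'})
print(req.header_items())
```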

response – sends the request and opens the connection to the webpage.

response = urllib2.urlopen(req)
link=response.read()      # holds the webpage that was read via response.read()
response.close()          # closes the connection after we have read the webpage
match=re.compile('').findall(link)
for thumbnail,url,name in match:

link - holds the contents of the page whose url was passed on from CATEGORIES. We close the open connection because we already have the page and no longer need it.

Now we scrape the screen for the relevant information. We are using the re library here:

match=re.compile("").findall(link)

match- is the holder for the compiled bits of information we have scraped. We will explore this in depth in the next part of the tutorial.

Just familiarise yourself for now.

Open hxxp://xbmc.org/forum/register.php in your browser ( I use firefox ).

Right click and select View page source. Scroll right down to the bottom of the page. Do you see : -

<div class="smallfont" align="center">

To scrape the words smallfont and center we would use –

match=re.compile('<div class="(.+?)" align="(.+?)">').findall(link)

so match now holds a list with one tuple in it – ('smallfont', 'center'). The order matters, so remember smallfont is first and center is second. Imagine that this was the url and name of a video.

  • match[0][0] = smallfont
  • match[0][1] = center

You can experiment with regular expressions like those using a free online tool like RegExr.
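You can also try the same thing straight in Python. A small sketch, using the line from the page source above as the text to search:

```python
import re

link = '<div class="smallfont" align="center">'
match = re.compile('<div class="(.+?)" align="(.+?)">').findall(link)

print(match)        # a list with one tuple in it
print(match[0][0])  # first bracket of the first match
print(match[0][1])  # second bracket of the first match
```

With two sets of brackets in the pattern, findall() gives back one tuple per match, and each tuple holds the brackets in the order they were written.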

Lets move on.

for ? in match: - Goes through everything stored in match and creates multiple links or directories on your screen.

for url,name in match:
	addDir(name,url,1,'')

  • url = the first thing held in each match
  • name = the second

You can call them whatever you wish.

So a directory will be created named center with the url smallfont.

As you can see, the VIDEOLINKS() function is just the same really, except for addLink instead of addDir.
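Here is a small sketch of that loop running outside XBMC. The page fragment with two div tags is made up, and addDir() is replaced by a stand-in that only records what it was given, so treat it as an illustration rather than plugin code.

```python
import re

# Made-up page fragment with two <div> tags, standing in for a real page
link = ('<div class="smallfont" align="center">'
        '<div class="bigfont" align="left">')
match = re.compile('<div class="(.+?)" align="(.+?)">').findall(link)

listing = []

def addDir(name, url, mode, iconimage):
    # Stand-in for the real addDir(): it only records what it was given
    listing.append((name, url, mode, iconimage))

for url, name in match:
    addDir(name, url, 1, '')

print(listing)
```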

Minimize the template, open your browser, right click -->edit with idle - on test.py and lets really do something.

If you are still unsure about what you have read, re–read.

Test, test, test again.

Right, let`s have some fun.

Open your Browser ( I use Firefox for the tutorial)

Ctrl & click the link : - http://www.ovguide.com/

Scroll down – see TV DASH, that`s the fellow for this tutorial.

Click it or Ctrl & click the link : - http://www.tvdash.com/

Once it`s open – Bookmark it.

Image1.jpg

Excellent Choice, at first glance we have thumbnails for the movies & tv – great for Xbmc.

The site looks clean and tidy – great for grabbing urls, without headaches.

Look at the top center of the screen: Home, Movies, TV Shows, Games.

They will become our main directories and as there are only two that we need, we will add them manually.

Before we do that lets get the scraping and testing done.

Click on the ‘ MOVIES ‘ link.


Image2.jpg


A lovely list of movies, ha, nothing makes me feel more like a Pirate than this – Aaargh Jim`lad.

Hoist the test.py

Remember the name of the first movie listed here, we will need it later.

First copy the url from the address bar of your browser.

Right click and copy.

Click to restore the test.py window you will see this or something like it.

import urllib2,urllib,re

url=' Delete me'

req = urllib2.Request(url)
req.add_header('User-Agent', 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.9.0.3) Gecko/2008092417 Firefox/3.0.3')
response = urllib2.urlopen(req)
link=response.read()
response.close()
match=re.compile('Delete me 2').findall(link)
print match

Click and delete the text url="Delete me" ---> url=" "

Replace it with the url you have stored. Ctrl & V.

url="http://www.tvdash.com/movies.html"

Press ‘ F5 ’ – which is really run module in Python terms.


Image3.jpg


The square brackets [] are how Python displays what's stored in match.

In this case, nothing, yet.

The 'print match' command causes the display. You can print any variable or string.

Remove it by putting # in front of it: #print match. See how it turns red.

Any text you put after a # character becomes a comment that Python ignores. Great for adding your plugin title.

The reason for the empty match variable is that we haven`t provided any regular expression to scrape the data, we will soon.

But for now, we have a nice Python shell open.

All commands are case sensitive so type exactly what you have in test.py only.

Type ‘print response.info()‘ -->ENTER this will bring up the server response from the website.

Image4.jpg

See how powerful Python is ?

This command is really useful, but not to be used just yet.

Sometimes Cookies that are required to access sites reside here and can be accessed and manipulated to your benefit.

Commands and tabulation are what Python is all about.

Back Sir/Madam, back I say, to your browser and let`s look at the source of the page.

Press Ctrl & U to view the page source.

At this point it is very easy to be put off, don`t be, once your eye is in – it becomes second nature. Honest.

File:Image5.jpg

Hmmm ....... Press Ctrl & f to bring up the search and enter the name of the first movie.

10 things i hate about you , Now this is more like it.

File:Image6.jpg

The name, iconimage (Thumbnail), url are all here.

Great.

First, maximize your Python shell.

In your Python shell type : - link --->ENTER

Once it has completed, you will have a display of the webpage ready for scraping.

File:Image7.jpg

Press Ctrl & f to bring up the search and enter 10 things I hate about you --->ENTER

You will see the above section.

The highlighted area is exactly what we need, it has the thumbnail,name and url.

src="http://farm4.static.flickr.com/3190/2881209823_799288181c_m.jpg" border="0" /></td>\n <td class="namewidth"><a title="Watch 10 Things I Hate About You (1999) (Requested) Online" href="http://www.tvdash.com/movies/10_Things_I_Hate_About_You_(1999)_(Requested).html">10 Things I Hate About You (1999) (Requested)</a>

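The following bits are what we need for the tutorial.

  • http://farm4.static.flickr.com/3190/2881209823_799288181c_m.jpg = thumbnail
  • http://www.tvdash.com/movies/10_Things_I_Hate_About_You_(1999)_(Requested).html = url
  • 10 Things I Hate About You (1999) (Requested) = name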


To scrape the text we need to show Python what to take and where it is.

Have a go at highlighting that whole section, hold your left mouse button and drag it across the section.

Then press Ctrl & c to copy to clipboard.

Close your Python shell and restore your test.py

Place your cursor in between the ' ' of the match=re.compile('').findall(link)

Press Ctrl & V. You should see this.

match=re.compile('src="http://farm4.static.flickr.com/3190/2881209823_799288181c_m.jpg" border="0" /></td>\n <td class="namewidth"><a title="Watch 10 Things I Hate About You (1999) (Requested) Online" href="http://www.tvdash.com/movies/10_Things_I_Hate_About_You_(1999)_(Requested).html">10 Things I Hate About You (1999) (Requested)</a>').findall(link)

The text between the single quotes is green because Python recognizes a string and highlights it.

  • Green = good
  • Red = bad , providing the code was not after the # character.

Python uses this to mark notes.

Future reference.

If the scrape has single quotes in it - src='http://tester.html'

you will need to change the wrapping quotes – match=re.compile('src='http://tester.html'')

to match=re.compile("src='http://tester.html'") to not confuse Python.
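A quick sketch of both ways around the quote problem, using a made-up img tag as the page text. The second option, a backslash in front of each inner quote, is the one IDLE will show you when it prints a string that contains single quotes.

```python
import re

# Made-up page fragment that uses single quotes around the address
link = "<img src='http://tester.html'>"

# Option 1: wrap the pattern in double quotes
match_a = re.compile("src='(.+?)'").findall(link)

# Option 2: keep single quotes and escape the inner ones with a backslash
match_b = re.compile('src=\'(.+?)\'').findall(link)

print(match_a)
print(match_b)
```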

Okay, let`s scrape that site.

There are many useful regular expressions that Python understands, but for basic site scraping you generally only need one: put brackets around what you want to capture, then delete the text inside them and replace it with:

(.+?) – Yep, it really can be that simple – sometimes.

Let's just say it means take one or more of any character, as few as possible, until the next bit of the pattern fits – it is a basic tutorial. He, he.
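The ? inside (.+?) matters more than it looks. A minimal sketch on a made-up line with two links shows what happens with and without it:

```python
import re

# Made-up page fragment with two links on one line
link = '<a href="one.html">One</a> <a href="two.html">Two</a>'

lazy = re.compile('href="(.+?)"').findall(link)    # stops at the first closing quote
greedy = re.compile('href="(.+)"').findall(link)   # runs on to the last quote it can find

print(lazy)
print(greedy)
```

The lazy version gives one clean address per link, while the greedy one swallows everything between the first and the last quote, which is rarely what you want when scraping.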

When trouble brews and trust me , it will.

Try here for really in depth help – I did.

Shall we scrape ?

Following the tutorial - put brackets around what you want to capture then delete the text and replace with (.+?) we get :

match=re.compile('src="(.+?)" border="0" /></td>\n <td class="namewidth"><a title=".+?" href="(.+?)">(.+?)</a>').findall(link)

So here we have scraped the image,url and name.

You noticed didn`t you.

One section has been replaced with .+? because that text will not be the same for every movie on the page.

If the movie is 'Friday 13th' it would say 'Watch Friday 13th' not 'Watch 10 things I hate about you'

so we tell python to skip it with .+?

– No brackets, no scrape.
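To convince yourself the pattern copes with more than one movie, here is a sketch that runs it over two rows. The first row is the one from the page source above; the second row and its example.invalid addresses are invented for the test.

```python
import re

# Two rows in the layout shown above; the second row and its URLs are invented
link = ('src="http://farm4.static.flickr.com/3190/2881209823_799288181c_m.jpg" border="0" /></td>\n'
        ' <td class="namewidth"><a title="Watch 10 Things I Hate About You (1999) (Requested) Online" '
        'href="http://www.tvdash.com/movies/10_Things_I_Hate_About_You_(1999)_(Requested).html">'
        '10 Things I Hate About You (1999) (Requested)</a>'
        '<img src="http://example.invalid/friday.jpg" border="0" /></td>\n'
        ' <td class="namewidth"><a title="Watch Friday 13th Online" '
        'href="http://example.invalid/Friday_13th.html">Friday 13th</a>')

match = re.compile('src="(.+?)" border="0" /></td>\n <td class="namewidth"><a title=".+?" href="(.+?)">(.+?)</a>').findall(link)

for thumbnail, url, name in match:
    print(name + ' -> ' + url)
```

The title text differs between the two rows, but because it is covered by .+? without brackets it is skipped and never ends up in match.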

Let`s move on . .you should end up with thi . . .

What do you mean you`re not ready.

Okay go back and re-read the last section.

Honestly people today, wouldn`t have happened in my day . .

Ready now ?

File:Image8.jpg

Maximize your Python shell and type >>>match --->ENTER (lower case, remember it is case sensitive)

File:Image9.jpg

Houston, we have an RE match.

The variable match now holds the information we wanted in the form of a list of Tuples, one Tuple per movie.

File:Tribble.gif No not a Tribble – bad Trekkie.

For information on tuples try here :

http://en.wikibooks.org/wiki/Python_Programming/Tuples

How many matches did we get ?

Let`s use the len() command – length of variable.

Type >>> len(match) -->ENTER

We get >>>251

so our RE pattern matched 251 times.

That`s 251 movies.

Python always starts indexing (counting) from zero.

So the first match is match[0] not match[1]

That means the length is still 251, but the highest index is 250

so to address the last item it is match[250] or match[-1]

A negative index number goes backwards, handy eh.

Have a go, try a few things

>>>match[10] or match[-10] etc

Each match holds 3 things (thumbnail, url, name), try : match[0][0] hmmmm
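If the shell is closed, you can practise the indexing on a short made-up list in the same shape as match. This is only a sketch; the three entries are invented.

```python
# A short made-up list in the same shape as match: (thumbnail, url, name)
match = [('a.jpg', 'a.html', 'Movie A'),
         ('b.jpg', 'b.html', 'Movie B'),
         ('c.jpg', 'c.html', 'Movie C')]

print(len(match))     # how many matches there are
print(match[0])       # the first match, a tuple of three strings
print(match[-1])      # the last match, same as match[len(match) - 1]
print(match[0][2])    # third item of the first match: the name
```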

Putting it together.

Well where does that leave us ?

So far we have scraped some information and stored it in the variable match. We have established that information stored in python variables can be accessed individually.

We have established how to read a webpage contents then store and display them in python.

Let`s put it together..

Restore Default.py and put your cursor between the quotes of the name area.

Type 'MOVIES' and again with the url area: this time restore test.py and copy, using Ctrl & c, the address from the url='http://www.tvdash.com/movies.html' text.

Go back to Default.py and paste using Ctrl&V between the single quotes in the url area.

addDir("MOVIES","http://www.tvdash.com/movies.html",1,"http://reggiebibbs.files.wordpress.com/2007/12/the-movies360-crop.jpg")

If you would like Xbmc to display a custom thumbnail for the MOVIES directory.

Put a url to an image or a path to an image, inside the single quotes in the thumbnail area.

File:Image10.jpg

The path should be on your xbox not your PC.

Q:/XBMC/PLUGINS/VIDEO/YOUR PLUGIN/IMAGES/Movie.png

Where Q:/ is a Directory of xbmc for xbox. For other platforms use a Directory that you wish for your platform.

Or try :

translatePath(...)
    translatePath(path) -- Returns the translated path.
     
    path           : string or unicode - Path to format
     
    *Note, Only useful if you are coding for both Linux and the Xbox.
           e.g. Converts 'T:\script_data' -> '/home/user/XBMC/UserData/script_data'
           on Linux. Would return 'T:\script_data' on the Xbox.
     
    example:
      - fpath = xbmc.translatePath('T:\script_data')

So the previous.

addDir("MOVIES","http://www.tvdash.com/movies.html",1,"http://reggiebibbs.files.wordpress.com/2007/12/the-movies360-crop.jpg")

Adds a directory folder named MOVIES to Xbmc with the url

http://www.tvdash.com/movies.html

moving to mode=1 with the thumbnail

http://reggiebibbs.files.wordpress.com/2007/12/the-movies360-crop.jpg

Let`s move to mode=1 which is INDEX(url)

Go to test.py and copy the text between the single quotes of

match=re.compile('')

Go to Default.py and delete the text between the match=re.compile('') quotes and paste.

We now should have:

File:Image11.jpg

for thumbnail,url,name in match:
                addDir(name,url,2,thumbnail)

Here we go, for every occurrence of the Tuple (thumbnail,url,name) found in match add the directory :-

MOVIE NAME, URL, MODE=2, THUMBNAIL to Xbmc.

Lets go to mode=2 which is VIDEOLINKS(url,name) :-

We are passing the Url and the name from the INDEX(url) - which only received the Url from CATEGORIES()

– along to VIDEOLINKS (url,name) whatever lives in the brackets is passed along.

If you change this remember to change the mode area also !

Close Python shell to clear the shell page.

Restore test.py and restore your browser window. Right click on the first movie link and Copy Link location.

File:Image12.jpg

Before we move on it`s worth pointing out that nearly all video on websites is hidden or at least never just a direct link :

- http://tvdash.com/10_things_i_h8_aboutyou.flv

They will generally be hosted elsewhere by a provider such as:

Megavideo, Tudou, Youku, Youtube, Google, Veoh.

They require a lot more handling than direct links, but the routines are provided for you.

Rather than creating a class for you to import.

(Well I did, but this is a basic tutorial)

Just copy and paste them into Default.py as functions.

Paste the link into your restored browser address bar and press ENTER

File:Image13.jpg

Well, what do we look for ?

See the evil red play button.

This equals Megavideo – bane of streaming with it`s cut offs.

It`s like the eye of Sauron isn`t it

Megavideo is very tricky to resolve but thanks to Coolblaze we can do it rather well.


Select and restore test.py

Delete the url text inside the single quotes and paste in the new url with Ctrl & V

We get : url='http://www.tvdash.com/movies/10_Things_I_Hate_About_You_(1999)_(Requested).html'

Press ‘F5’ – to run test.py

Type >>>link -->ENTER

File:Image14.jpg

Press Ctrl & F and type 'Megavideo'

We know it`s Megavideo because of the Red play button.

Look for :

<embed src="http://www.megavideo.com/v/JE9VCAET" type="application/x-shockwave-flash"

The above is all we need for the RE string so highlight it in the Python shell window and copy it with Ctrl & C and close the python shell.

Restore the test.py and delete the text inside the single quotes of match=re.compile('')

Paste the copied text.

You should have :

match=re.compile('<embed src="http://www.megavideo.com/v/JE9VCAET" type="application/x-shockwave-flash"').findall(link)

Set your RE :

match=re.compile('<embed src="(.+?)" type=".+?"').findall(link)

So we will take the url to the video, skipping the type=".+?"

Try your own RE expressions.

Close the Python shell then select test.py

press ‘ F5 ’ and type >>>match

File:Image15.jpg

So match[0] now holds the url.

Unfortunately, Xbmc cannot follow this type of link (YET) as it calls a hidden swf file.

Xbmc needs a directly resolved link.

http://www723.megavideo.com/files/c7302786700985d0f6ea102f556dae8b/

We will find that later, for now.

Copy the RE from match=re.compile(' here ') in test.py

Restore Default.py and paste into the match=re.compile(' here ').findall(link)

In VIDEOLINKS (url,name)

If this was a direct link to an flv or an avi you would use

for url in match:
                addLink(name,url,'')

Or just simply:-

addLink(name,match[0],'')


As we need a resolved Megavideo url let`s press on.

Open up Videoroutines.py. You should see a very scary looking routine.

Don`t panic, well maybe just a little.

File:Image16.jpg

Highlight and copy part 1 then restore Default.py and paste it below VIDEOLINKS(url,name)

It starts with def ajoin(arr) and ends with return endstr.

Great, at this point don`t try to understand it , it`s quite frightening.

The next step is to copy part 2, so back to Videoroutines.py

Copy part 2 with Ctrl & C

Restore Default.py and paste it below part 1.

Ok that`s the Megavideo routine in place.

What it does is call up a hidden page that Megavideo uses for information on their files, scrape the information, then use it to decrypt and resolve the Flv file.

A round of applause for Coolblaze of Xbmc forums and other fame, hurrah !

For now, close all open Python windows.

Restore your browser and click on 'MOVIES', scroll down

until you find 'Streets of fire (1984)'

File:Image17.jpg

Click on it. Whoa, a new video provider.

It`s StageVu – thankfully they use direct links.

See the download link..... hmmmmm

File:Image18.jpg

I will help you a bit here as it isn`t as easy as it seems.

View source

Ctrl & f

Look for stagevu

<iframe style='border: 0; width: 750px; height: 480px' src='http://stagevu.com/embed?width=720&amp;height=405&amp;uid=hugygadfizom' scrolling='no'></iframe>

This is different than the Megavideo RE.

<embed src="http://www.megavideo.com/v/JE9VCAET" type="application/x-shockwave-flash"

Also our VIDEOLINKS(url,name) is coded for the Megavideo format.

Right.

We need to change the Function, but for now open test.py

Right click and -->Edit with Idle.

Restore your browser and copy the url from the address bar.

Select test.py and replace the old url with the new one.

Ctrl & V

Press ‘ F5 ’ and type >>>match -->ENTER

File:Image19.jpg

File:Image20.jpg

match comes up empty, the RE is different.

Remember the len() command from earlier, it`s time to use it.

Close the Python shell and restore test.py

Press ‘F5’ type >>>link -->ENTER

File:Image21.jpg

Try Ctrl & f then ‘stage’

Anything ?

Just a beep. They have tried to stop you finding the link.

Scroll up until you find the highlighted section.

Copy it. Close the Python shell.

Restore test.py

Under the last match type

if len(match)<1:

Press ENTER

All it means is: if the length of match is less than 1, in other words if the first RE found nothing.

The : at the end tells Python that the condition is complete and that the indented instructions come next.

See how Python uses tabs, this is correct don`t worry.

Try leaving the : out and press ENTER , see what happens.

This will lead to an error , press ‘F5’

Error.jpg

Python likes Colons.

Press OK and put the : back.
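The whole "try one RE, and if it finds nothing try another" idea can be wrapped up like this. It is a sketch only: find_video_page is a made-up name, and the two page fragments are cut down from the Megavideo and StageVu source lines shown earlier.

```python
import re

EMBED_RE = '<embed src="(.+?)" type=".+?"'
IFRAME_RE = "<iframe style='border: 0; width: 750px; height: 480px' src='(.+?)' scrolling='no'></iframe>"

def find_video_page(link):
    match = re.compile(EMBED_RE).findall(link)
    if len(match) < 1:
        # First pattern found nothing, so try the StageVu layout instead
        match = re.compile(IFRAME_RE).findall(link)
    return match

megavideo_page = '<embed src="http://www.megavideo.com/v/JE9VCAET" type="application/x-shockwave-flash"'
stagevu_page = ("<iframe style='border: 0; width: 750px; height: 480px' "
                "src='http://stagevu.com/embed?width=720&amp;height=405&amp;uid=hugygadfizom' "
                "scrolling='no'></iframe>")

print(find_video_page(megavideo_page))
print(find_video_page(stagevu_page))
```

If neither pattern fits, the function hands back an empty list, so check len() again before reaching for match[0].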

Next after checking the tabs, type

match=re.compile('').findall(link)

Paste the copied text between the single quotes.

You should have :

match=re.compile('<iframe style=\'border: 0; width: 750px; height: 480px\' src=\'http://stagevu.com/embed?width=720&amp;height=405&amp;uid=hugygadfizom\' scrolling=\'no\'></iframe>').findall(link)

Delete everything within the iframe src='' and replace with (.+?)

Below is what your test.py should look like, press 'F5'


import urllib2,urllib,re

url='http://www.tvdash.com/movies/Streets_of_Fire_(1984).html'

req = urllib2.Request(url)
req.add_header('User-Agent', 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.9.0.3) Gecko/2008092417 Firefox/3.0.3')
response = urllib2.urlopen(req)
link=response.read()
response.close()
match=re.compile('<embed src="(.+?)" type=".+?"').findall(link)
if len(match)<1:
          match=re.compile('<iframe style=\'border: 0; width: 750px; height: 480px\' src=\'(.+?)\' scrolling=\'no\'></iframe>').findall(link)

Type >>>match -->ENTER

File:Image22.jpg

Ok, the StageVu url.

No it isn`t.

It`s a link to another page because the website used a frame

Hence the <iframe> tag.

Look at the link.

Does it look okay ?

http://stagevu.com/embed?width=720&amp;height=405&amp;uid=hugygadfizom

Try copying and following the link in your browser.

‘GHOST IN THE SHELL’ , what the hell ?

The &amp; is the problem. The page source writes every & as &amp; and the link only works with a plain &.

Welcome to the re.sub('area1','area2',string) command.

Regular expression substitute.

Basically, swap whatever you put in area1 for whatever is in area2.

The string is whatever holds the text to be swapped, in this case match[0]

It could be a text file, a webpage stored from response.read(), anything really.

So we now have a program that checks two REs: if the first one fails, it tries the second and stores the result in match.

We access that with match[0] as there is only one link.
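
A quick sketch of re.sub on its own, using the link from above so you can try it without loading the page:

```python
import re

# The scraped link, still carrying the HTML-encoded ampersands.
match = ['http://stagevu.com/embed?width=720&amp;height=405&amp;uid=hugygadfizom']

# area1 is '&amp;', area2 is '&', and the string is match[0].
match[0] = re.sub('&amp;', '&', match[0])

print(match[0])
```

For a plain swap like this one, match[0].replace('&amp;','&') would give the same result; re.sub earns its keep when area1 is a real pattern.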


Under the last match type:

match[0]=re.sub('&amp;','&',match[0])

The whole thing should look like this.

import urllib2,urllib,re

url='http://www.tvdash.com/movies/Streets_of_Fire_(1984).html'

req = urllib2.Request(url)
req.add_header('User-Agent', 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.9.0.3) Gecko/2008092417 Firefox/3.0.3')
response = urllib2.urlopen(req)
link=response.read()
response.close()
match=re.compile('<embed src="(.+?)" type=".+?"').findall(link)
if len(match)<1:
          match=re.compile('<iframe style=\'border: 0; width: 750px; height: 480px\' src=\'(.+?)\' scrolling=\'no\'></iframe>').findall(link) 
          match[0]=re.sub('&amp;','&',match[0])

Press ‘F5’ and type >>> match[0] -->ENTER

The new link:

http://stagevu.com/embed?width=720&height=405&uid=hugygadfizom

The old link:

http://stagevu.com/embed?width=720&amp;height=405&amp;uid=hugygadfizom


Bit of a difference, eh ?

Close the Python shell.

Try following both links to see the difference.

Leave the new link open & close the old one.

We now have a link to a page that has the direct link to a video file.

File:Image23.jpg

Right Click and View page source

Ctrl & F, type ‘avi’

You will get three choices.

Press ‘F3’ to keep searching.

The <param name="src" ... /> tag is the best choice; as an RE it becomes <param name="src" value="(.+?)" />

File:Image24.jpg

So, close the page source tab or window and restore test.py

Press ‘F5’

Ctrl & F

type ‘avi’ and press ‘F3’

until you find the <param name= link.

Copy it and close the python shell.
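
As a sketch of where this is heading, here is the <param> RE run against a made-up piece of page source. The example.com address and the second <param> tag are assumptions for illustration, not what StageVu actually serves.

```python
import re

# Hypothetical fragment of the embed page, modelled on the <param> tag found above.
page = ('<object>'
        '<param name="src" value="http://example.com/vlink/sample.avi" />'
        '<param name="type" value="video/divx" />'
        '</object>')

# Only the tag whose name is "src" fits the pattern; the other <param> is skipped.
match = re.compile('<param name="src" value="(.+?)" />').findall(page)
print(match[0])
```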

It`s time to introduce basic error handling.

try:
	whatever
except:
	whatever

In other words try a piece of code, but, if an error occurs use another piece of code.

It`s a lifesaver. File:Lifesaver.jpg
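
Here is a minimal sketch of try and except at work. The function name first_link and both sample pages are invented for this example. It also names the error it expects (IndexError, which is what match[0] raises on an empty list) rather than catching everything, which is a slightly stricter habit than the bare except: used in this tutorial.

```python
import re

def first_link(page):
    """Return the <embed> address, or fall back to the <iframe> address."""
    try:
        match = re.compile('<embed src="(.+?)" type=".+?"').findall(page)
        return match[0]  # raises IndexError when match is an empty list
    except IndexError:
        # The first RE found nothing, so try the second layout instead.
        match = re.compile("<iframe .+?src='(.+?)'").findall(page)
        return match[0]

# Hypothetical pages, one of each kind.
embed_page = '<embed src="http://example.com/a" type="video/x">'
iframe_page = "<iframe style='x' src='http://example.com/b' scrolling='no'></iframe>"

print(first_link(embed_page))
print(first_link(iframe_page))
```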

The last time we looked at test.py it looked like this.


import urllib2,urllib,re

url='http://www.tvdash.com/movies/Streets_of_Fire_(1984).html'

req = urllib2.Request(url)
req.add_header('User-Agent', 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.9.0.3) Gecko/2008092417 Firefox/3.0.3')
response = urllib2.urlopen(req)
link=response.read()
response.close()
match=re.compile('<embed src="(.+?)" type=".+?"').findall(link)
if len(match)<1:
          match=re.compile('<iframe style=\'border: 0; width: 750px; height: 480px\' src=\'(.+?)\' scrolling=\'no\'></iframe>').findall(link) 
          match[0]=re.sub('&amp;','&',match[0])

Well we need to change it around to include the error handling.


import urllib2,urllib,re

url='http://www.tvdash.com/movies/Streets_of_Fire_(1984).html'

req = urllib2.Request(url)
req.add_header('User-Agent', 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.9.0.3) Gecko/2008092417 Firefox/3.0.3')
response = urllib2.urlopen(req)
link=response.read()
response.close()
try:
        match=re.compile('<embed src="(.+?)" type=".+?"').findall(link)
        Megavideo(match[0])
except:
        match=re.compile('<iframe style=\'border: 0; width: 750px; height: 480px\' src=\'(.+?)\' scrolling=\'no\'></iframe>').findall(link) 
        match[0]=re.sub('&amp;','&',match[0])
        req = urllib2.Request(match[0])
        req.add_header('User-Agent', 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.9.0.3) Gecko/2008092417 Firefox/3.0.3')
        response = urllib2.urlopen(req)
        link=response.read()
        response.close()
        match=re.compile('<param name="src" value="(.+?)" />').findall(link)
        addLink(name,match[0],'')

It seems like a big leap, it isn`t really.

You should be getting more familiar with IDLE

now so change the code to resemble the above.

This will be a really good lesson in Python formatting. He he he.

Remember to press enter after each command and you shouldn`t go far wrong.

Remember to press ‘F5’ every once in a while to see if you have made a mistake.

Remember Python is case sensitive.

Python will tell you about every mistake in no uncertain terms. You may hate it, and it may have spoiled your fun for a while.

It`s tough love, honest.

You will thank me later.

Cheers, Vin. Devil.jpg


Did that take a while . . . . Sorry.

What were we doing, hmmmm , yes.

Code explained.

Try : - finding a match then call the Megavideo function we dropped in earlier.

That`s right time to add it.

It's called by typing its name with the video link to resolve in the brackets.

Megavideo(match[0]) - Remember that Megavideo link is stored in match[0]

It takes the url that you supply and resolves it into a megavideo flv then adds a playable link.

addLink(name, movielink+'?.flv',)

Sometimes XBMC doesn't recognize that the Megavideo link is an flv, so the

+'?.flv'

simply adds ?.flv to the end of movielink.
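
A tiny sketch of joining strings with +, using a made-up address in place of a real resolved Megavideo link:

```python
# Hypothetical resolved link; the real one comes out of the Megavideo function.
movielink = 'http://example.com/video/12345'

# + glues two strings together, end to end.
playable = movielink + '?.flv'

print(playable)
```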

Try here for more on strings :

http://www.python.org/doc/2.3.5/tut/node5.html

Except: -

if an error occurs, like not finding a match, then execute some new code.

Which in this case is the RE and urllib2 actions needed to get that direct StageVu link.

match=re.compile('<iframe style=\'border: 0; width: 750px; height: 480px\' src=\'(.+?)\' scrolling=\'no\'></iframe>').findall(link)
match[0]=re.sub('&amp;','&',match[0])
req = urllib2.Request(match[0])
req.add_header('User-Agent', 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.9.0.3) Gecko/2008092417 Firefox/3.0.3')
response = urllib2.urlopen(req)
link=response.read()
response.close()
match=re.compile('<param name="src" value="(.+?)" />').findall(link)
addLink(name,match[0],'')

So first we scrape the link using our new RE.

Then we fix the link with the re.sub command, but it`s a new page and we need the direct link.

So, we open and read the new page.

req = urllib2.Request(match[0])

We request match[0] as that is the index representation of the link stored in match.

We then scrape it with the new RE and add a link so xbmc can play it.
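
The same chain of steps can be tried offline. In this sketch the web is replaced by a small dictionary: PAGES, fetch and resolve are hypothetical names, and the example.com addresses are made up. Only the two REs and the re.sub step are taken from the code above.

```python
import re

# Stand-in for the web: address -> page source (made-up content).
PAGES = {
    'http://example.com/movie.html':
        "<iframe style='border: 0; width: 750px; height: 480px' "
        "src='http://example.com/embed?width=720&amp;uid=abc' scrolling='no'></iframe>",
    'http://example.com/embed?width=720&uid=abc':
        '<param name="src" value="http://example.com/vlink/abc.avi" />',
}

def fetch(url):
    """Hypothetical replacement for the urllib2 request/read/close lines."""
    return PAGES[url]

def resolve(url):
    link = fetch(url)
    # Step 1: scrape the framed page address.
    match = re.compile('<iframe style=\'border: 0; width: 750px; height: 480px\' src=\'(.+?)\' scrolling=\'no\'></iframe>').findall(link)
    # Step 2: fix the encoded ampersands so the address can be requested.
    match[0] = re.sub('&amp;', '&', match[0])
    # Step 3: open the framed page and scrape the direct video link.
    link = fetch(match[0])
    match = re.compile('<param name="src" value="(.+?)" />').findall(link)
    return match[0]

print(resolve('http://example.com/movie.html'))
```

Skip step 2 and fetch() fails with a KeyError, because the address with &amp; in it is not a page that exists. That is the offline version of landing on the wrong video.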

The full code is available if you open and run Toughlove.py

Before you run it, put a # in front of addLink, as that command only works inside XBMC.

Type >>> match[0]

Direct StageVu link.

http://n13.stagevu.com/vlink/d583998babb6d14b49214f2a15f31260.avi

Right, folks time to copy the new Videolinks code over to Default.py

Close everything other than test.py and Default.py

Select Default.py :- in the VIDEOLINKS(url,name)

highlight and delete the match=re.compile('') line.

It should look like this:

def VIDEOLINKS(url,name):
          req = urllib2.Request(url)
          req.add_header('User-Agent', 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.9.0.3) Gecko/2008092417 Firefox/3.0.3')
          response = urllib2.urlopen(req)
          link=response.read()
          response.close()

Select test.py and highlight and copy everything below response.close()

Back to Default.py and paste below response.close()

We should now have :


def VIDEOLINKS(url,name):
          req = urllib2.Request(url)
          req.add_header('User-Agent', 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.9.0.3) Gecko/2008092417 Firefox/3.0.3')
          response = urllib2.urlopen(req)
          link=response.read()
          response.close()
          try:
              match=re.compile('<embed src="(.+?)" type=".+?"').findall(link)
              Megavideo(match[0])
          except:
              match=re.compile('<iframe style=\'border: 0; width: 750px; height: 480px\' src=\'(.+?)\' scrolling=\'no\'></iframe>').findall(link) 
              match[0]=re.sub('&amp;','&',match[0])
              req = urllib2.Request(match[0])
              req.add_header('User-Agent', 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.9.0.3) Gecko/2008092417 Firefox/3.0.3')
              response = urllib2.urlopen(req)
              link=response.read()
              response.close()
              match=re.compile('<param name="src" value="(.+?)" />').findall(link)
              addLink(name,match[0],'')

Press ‘F5’ , any errors ?

If so fix your formatting.

If it`s working and formed well you should see.

Image25.jpg

That`s fine.

You will probably get: No module named xbmc, and so on. Those modules only exist inside XBMC, so the error is expected when you run the file in IDLE.

I have one module running.

If you still can`t get it working use the full program : - LastResort.py

Don`t give up on getting it right if you used the Last resort !

Looking back , scanning forward.

Well folksages,

That about wraps it up.

Make a folder, name it whatever you want (try Tvdash), and copy Default.py into it.

Open your Tv Dash bookmark and right click on the Tv Dash logo.

Select View Background Image, then right click on the image

and choose Save Image As. Save it in the new folder as default.tbn

Every time you highlight the Tvdash folder in XBMC it will display that image.

FTP the folder over to XBMC/PLUGINS/VIDEO

Enjoy your hard work.

Thanks for reading,

Your friendly neighbourhood Voinage / Vin.


PS

Oh , I almost forgot.

There is still a TV section, have a go yourself.

A few things to keep in mind.

The Tv section is very different and you will have to create two new functions.

Remember the modes: they allow you to direct the flow of the plugin and add new functions.

Sometimes scraped urls are not the full url, for example

main.php

when it needs to be

http://tvdash.com/main.php

so put the site address in front of it in the call: addLink(name,'http://tvdash.com/'+url,''). Simple, eh?
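
If some scraped urls are already complete and others are not, a small helper can sort that out. This is only a sketch: full_url is a hypothetical name and is not part of the tutorial's plugin code.

```python
def full_url(url, base='http://tvdash.com/'):
    """Return url unchanged if it is already complete, else prefix the site address."""
    if url.startswith('http://') or url.startswith('https://'):
        return url
    # Drop a leading slash so the result does not end up with a double //.
    return base + url.lstrip('/')

print(full_url('main.php'))
print(full_url('http://tvdash.com/main.php'))
```

In the plugin the call would then read addLink(name,full_url(url),'').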

Pm me if you need help

Try looking at the Python in my plugins for new Video routines or new plugin tricks.

Rest easy Brothers and Sisters.

Voinage has left the building.

(Try LastResort.py for the full plugin with a few errors)

Fix them people, test your skills.


Swimming with Wireshark

I`m back . . because nobody else would come.

If you haven`t downloaded and installed Wireshark, do it now, or be forever roasted in the bowels of a scrape monkey.

The link is here:

http://www.wireshark.org/download.html


This is going to be slightly less informative than a tv show with David Attenborough, although I would like to see him coding in Python.

Here we have a small coding Python from dual core India, today we will be following etc

Finally, it`s installed, I can shut up now.

If you are running Windows

  • Right click on your internet connection icon - located bottom right, near the time.
  • Select - Open Network connections
  • Move the mouse pointer over your connection and write down the network card name.

Open Wireshark, cue scary music

Firstly, go to:

  • Edit --->Preferences--->Capture

File:Wire1.jpg

  • Make sure that the Default Interface is set to your network card name.
  • Make sure that Capture packets in promiscuous mode is unchecked.
  • Apply your changes.

Let`s hunt some Internet Truffles.

-To be continued, a deadline approaches.