Scrapers: Difference between revisions

From Official Kodi Wiki
m (Robot: Changing Category:Video Library to Category:Video library)
(Convert to Portal)
(13 intermediate revisions by 5 users not shown)
{{mininav|[[Development]]|[[Add-on development]]}}
{{mininav|[[Development]] {{l2| [[Music library]] }} {{l2|[[Video library]]}} }}
{{see also|Video library|Music library}}
{{cleanup}}
Scrapers (more commonly referred to outside the XBMC community as "''[http://en.wikipedia.org/wiki/Web_scraper web scrapers]''" or "''[http://en.wikipedia.org/wiki/Importer_(computing) importers]''") are used by XBMC Media Center to go online and automatically retrieve [http://en.wikipedia.org/wiki/Metadata metadata] and artwork ([[thumbnails]], posters, covers, [[fanart]]) from websites for Games, Music, Videos, Movies, and TV Shows. This metadata and artwork is then used for the XBMC program, music, and video [[Library|libraries]].<br />


= Introduction to Scrapers =
Before version 2.0 of XBMC Media Center, the only available web scraper was an [[IMDb (Video Library)|IMDb]] importer. Later versions made it possible to create custom scrapers that collect data from almost any web page that lists information in a consistent way, and today XBMC comes with many scrapers for several languages and different websites.
To learn more about actually using scrapers in XBMC, please look at:
* [[Set Content (Video Library)]]
And to learn more about '''creating''' scrapers, please look at this article:
* [[HOW-TO Write Media Info Scrapers]]


XBMC comes with several scrapers for Movies, TV shows and Music Videos, which are stored in xbmc\system\scrapers\video. They are simply specially formatted [http://en.wikipedia.org/wiki/XML XML] files.
= Using Scrapers =
* The location of the scrapers changed in Eden Beta 3: the \scrapers directory is obsolete and scrapers are now add-ons. See 'the movieDB' scraper as an example: copy the directories metadata.themoviedb.org and metadata.common.themoviedb.org under your own names, change a few XML details, and you will have a new scraper to experiment with. After that, these instructions become relevant again.
Scrapers are used to contact online information providers to download metadata for video and music items to be added to the library.


The scraper XML file consists of text processing operations that work over a set of text buffers, labelled $$1 to $$20. The basic operation is RegExp. A single RegExp is a text processing command that takes text in buffer 'input', runs <expression> on it... formats the matches with 'output' and places (or appends) the result in buffer 'dest'. A scraper XML file must use RegExps to implement functions that XBMC will call such as GetSearchResults.
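
The RegExp step described above can be mimicked outside XBMC to get a feel for how buffers, matches and the output template interact. The following is only a minimal Python sketch under stated assumptions: the function name <code>run_regexp</code>, the dictionary of buffers and the <code>append</code> flag are hypothetical, and Python's <code>re</code> module is not the engine XBMC uses.

```python
import re

def run_regexp(buffers, expression, input_buf, output, dest, append=False):
    """Apply one RegExp-style step to a dict of numbered text buffers.

    All names here are hypothetical; this mimics the description above,
    not the actual XBMC implementation.
    """
    text = buffers.get(input_buf, "")
    if expression == "":
        # A blank expression puts the whole input buffer into the first match.
        groups = (text,)
    else:
        match = re.search(expression, text)
        if match is None:
            return buffers  # no match: this sketch leaves dest untouched
        groups = match.groups()

    def fill(placeholder):
        # Replace a backslash-digit placeholder with the matching group.
        index = int(placeholder.group(1)) - 1
        if index < len(groups) and groups[index] is not None:
            return groups[index]
        return ""

    result = re.sub(r"\\([1-9])", fill, output)
    if append:
        result = buffers.get(dest, "") + result
    buffers[dest] = result
    return buffers


# Buffer 1 holds the search string; buffer 3 receives the formatted result.
buffers = {1: "Blade Runner"}
run_regexp(buffers, "", 1, "http://www.example.com/search.php?title=\\1", 3)
print(buffers[3])
```

In this sketch a failed match simply leaves the destination buffer alone; how the real engine treats that case depends on attributes such as clear.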
A number of scrapers are installed by default when installing {{kodi}}. Other scrapers are available for installation from the official repository. The table below lists all official scrapers and their compatibility.


== Prerequisites ==
* Knowledge of [[Regular Expression (RegEx) Tutorial|regular expressions]]
* Basic [http://en.wikipedia.org/wiki/XML XML] syntax knowledge.
{|
! style="background-color:salmon;" | Scrapers compatible with v18 Leia
! style="background-color:#a0e75a;" | Scrapers compatible with v19 Matrix
! style="background-color:skyBlue;" | Scrapers compatible with v20 Nexus
|-
|
{| class="mw-collapsible mw-collapsed wikitable" style="margin: 0.0em 0 0.0em 0em; {{border-radius|5px}}; {{box-shadow|0px|1px|3px|#878787}};"
! scope="row" style="width:500px; background-color:#dfedf0; text-align:left;"| List of {{red|Album}} scrapers compatible with v18 Leia
|-
|
<dpl>
category = Leia_add-on_repository
category = Album_scraper_add-ons
</dpl>
|}
|
{| class="mw-collapsible mw-collapsed wikitable" style="margin: 0.0em 0 0.0em 0em; {{border-radius|5px}}; {{box-shadow|0px|1px|3px|#878787}};"
! scope="row" style="width:500px; background-color:#dfedf0; text-align:left;"| List of {{red|Album}} scrapers compatible with v19 Matrix
|-
|
<dpl>
category = Matrix_add-on_repository
category = Album_scraper_add-ons
</dpl>
|}
|
{| class="mw-collapsible mw-collapsed wikitable" style="margin: 0.0em 0 0.0em 0em; {{border-radius|5px}}; {{box-shadow|0px|1px|3px|#878787}};"
! scope="row" style="width:500px; background-color:#dfedf0; text-align:left;"| List of {{red|Album}} scrapers compatible with v20 Nexus
|-
|
<dpl>
category = Nexus_add-on_repository
category = Album_scraper_add-ons
</dpl>
|}
|-
|
{| class="mw-collapsible mw-collapsed wikitable" style="margin: 0.0em 0 0.0em 0em; {{border-radius|5px}}; {{box-shadow|0px|1px|3px|#878787}};"
! scope="row" style="width:500px; background-color:#dfedf0; text-align:left;"| List of {{red|Artist}} scrapers compatible with v18 Leia
|-
|
<dpl>
category = Leia_add-on_repository
category = Artist_scraper_add-ons
</dpl>
|}
|
{| class="mw-collapsible mw-collapsed wikitable" style="margin: 0.0em 0 0.0em 0em; {{border-radius|5px}}; {{box-shadow|0px|1px|3px|#878787}};"
! scope="row" style="width:500px; background-color:#dfedf0; text-align:left;"| List of {{red|Artist}} scrapers compatible with v19 Matrix
|-
|
<dpl>
category = Matrix_add-on_repository
category = Artist_scraper_add-ons
</dpl>
|}
|
{| class="mw-collapsible mw-collapsed wikitable" style="margin: 0.0em 0 0.0em 0em; {{border-radius|5px}}; {{box-shadow|0px|1px|3px|#878787}};"
! scope="row" style="width:500px; background-color:#dfedf0; text-align:left;"| List of {{red|Artist}} scrapers compatible with v20 Nexus
|-
|
<dpl>
category = Nexus_add-on_repository
category = Artist_scraper_add-ons
</dpl>
|}
|-
|
{| class="mw-collapsible mw-collapsed wikitable" style="margin: 0.0em 0 0.0em 0em; {{border-radius|5px}}; {{box-shadow|0px|1px|3px|#878787}};"
! scope="row" style="width:500px; background-color:#dfedf0; text-align:left;"| List of {{red|Music Video}} scrapers compatible with v18 Leia
|-
|
<dpl>
category = Leia_add-on_repository
category = Music_video_scraper_add-ons
</dpl>
|}
|
{| class="mw-collapsible mw-collapsed wikitable" style="margin: 0.0em 0 0.0em 0em; {{border-radius|5px}}; {{box-shadow|0px|1px|3px|#878787}};"
! scope="row" style="width:500px; background-color:#dfedf0; text-align:left;"| List of {{red|Music Video}} scrapers compatible with v19 Matrix
|-
|
<dpl>
category = Matrix_add-on_repository
category = Music_video_scraper_add-ons
</dpl>
|}
|
{| class="mw-collapsible mw-collapsed wikitable" style="margin: 0.0em 0 0.0em 0em; {{border-radius|5px}}; {{box-shadow|0px|1px|3px|#878787}};"
! scope="row" style="width:500px; background-color:#dfedf0; text-align:left;"| List of {{red|Music Video}} scrapers compatible with v20 Nexus
|-
|
<dpl>
category = Nexus_add-on_repository
category = Music_video_scraper_add-ons
</dpl>
|}
|-
|
{| class="mw-collapsible mw-collapsed wikitable" style="margin: 0.0em 0 0.0em 0em; {{border-radius|5px}}; {{box-shadow|0px|1px|3px|#878787}};"
! scope="row" style="width:500px; background-color:#dfedf0; text-align:left;"| List of {{red|Movie}} scrapers compatible with v18 Leia
|-
|
<dpl>
category = Leia_add-on_repository
category = Movie_scraper_add-ons
</dpl>
|}
|
{| class="mw-collapsible mw-collapsed wikitable" style="margin: 0.0em 0 0.0em 0em; {{border-radius|5px}}; {{box-shadow|0px|1px|3px|#878787}};"
! scope="row" style="width:500px; background-color:#dfedf0; text-align:left;"| List of {{red|Movie}} scrapers compatible with v19 Matrix
|-
|
<dpl>
category = Matrix_add-on_repository
category = Movie_scraper_add-ons
</dpl>
|}
|
{| class="mw-collapsible mw-collapsed wikitable" style="margin: 0.0em 0 0.0em 0em; {{border-radius|5px}}; {{box-shadow|0px|1px|3px|#878787}};"
! scope="row" style="width:500px; background-color:#dfedf0; text-align:left;"| List of {{red|Movie}} scrapers compatible with v20 Nexus
|-
|
<dpl>
category = Nexus_add-on_repository
category = Movie_scraper_add-ons
</dpl>
|}
|-
|
{| class="mw-collapsible mw-collapsed wikitable" style="margin: 0.0em 0 0.0em 0em; {{border-radius|5px}}; {{box-shadow|0px|1px|3px|#878787}};"
! scope="row" style="width:500px; background-color:#dfedf0; text-align:left;"| List of {{red|TV Show}} scrapers compatible with v18 Leia
|-
|
<dpl>
category = Leia_add-on_repository
category = TV_scraper_add-ons
</dpl>
|}
|
{| class="mw-collapsible mw-collapsed wikitable" style="margin: 0.0em 0 0.0em 0em; {{border-radius|5px}}; {{box-shadow|0px|1px|3px|#878787}};"
! scope="row" style="width:500px; background-color:#dfedf0; text-align:left;"| List of {{red|TV Show}} scrapers compatible with v19 Matrix
|-
|
<dpl>
category = Matrix_add-on_repository
category = TV_scraper_add-ons
</dpl>
|}
|
{| class="mw-collapsible mw-collapsed wikitable" style="margin: 0.0em 0 0.0em 0em; {{border-radius|5px}}; {{box-shadow|0px|1px|3px|#878787}};"
! scope="row" style="width:500px; background-color:#dfedf0; text-align:left;"| List of {{red|TV Show}} scrapers compatible with v20 Nexus
|-
|
<dpl>
category = Nexus_add-on_repository
category = TV_scraper_add-ons
</dpl>
|}
|}


== Layout ==
To see a full scraper, see the [http://xbmc.svn.sourceforge.net/viewvc/xbmc/trunk/addons/metadata.themoviedb.org/tmdb.xml?revision=35744&view=markup themoviedb reference implementation] in SVN.
The general layout of scraper.xml is as follows:


  <scraper>
    <NfoUrl>
      <RegExp>
        <expression></expression>
      </RegExp>
    </NfoUrl>
    <CreateSearchUrl>
      <RegExp>
        <expression></expression>
      </RegExp>
    </CreateSearchUrl>
    <GetSearchResults>
      <RegExp>
        <expression></expression>
      </RegExp>
    </GetSearchResults>
    <GetDetails>
      <RegExp>
        <expression></expression>
      </RegExp>
    </GetDetails>
    <CustomFunction>
      <RegExp>
        <expression></expression>
      </RegExp>
    </CustomFunction>
  </scraper>


If RegExp tags are nested, they are processed in a [http://en.wikipedia.org/wiki/Lifo LIFO] manner.


== XBMC/Scraper Interaction ==
[[File:Scraperinterface.png]]


{| style="width: 100%;"
|-
| style="width: 5%; vertical-align:top;" | [[File:language_globe_icon.png|50px|left]]
| style="width: 44%; vertical-align:top;" |'''{{resize|1.3em| [[HOW-TO:Create_Video_Library|Creating your Video Library]]}}'''
Setting up your video library for the first time? Follow this How To guide
| style="width: 2%;" |
| style="width: 5%; vertical-align:top;" | [[File:language_globe_icon.png|50px|left]]
| style="width: 44%; vertical-align:top;" | '''{{resize|1.3em| [[HOW-TO:Create_Music_Library|Creating your Music Library]]}}'''
Setting up your music library for the first time? Follow this How To guide
|-
| style="width: 5%; vertical-align:top;" | [[File:language_globe_icon.png|50px|left]]
| style="width: 44%; vertical-align:top;" | '''{{resize|1.3em| [[Incorrect_and_missing_videos|Video scraping problems]]}}'''
Something not working as expected? Check this page for common solutions.
| style="width: 2%;" |
| style="width: 5%; vertical-align:top;" |
| style="width: 44%; vertical-align:top;" |
|-
| style="width: 5%; vertical-align:top;" | [[File:language_globe_icon.png|50px|left]]
| style="width: 44%; vertical-align:top;" | '''{{resize|1.3em| [[Changing Scrapers]]}}'''
| style="width: 2%;" |
| style="width: 5%; vertical-align:top;" | [[File:language_globe_icon.png|50px|left]]
| style="width: 44%; vertical-align:top;" | '''{{resize|1.3em| [[:Category:Scraper|List of all Scraper pages]]}}'''
|}


== XML character entity references ==
Unlike traditional HTML with its large range of character entity references, in XML there are only five predefined character entity references. These are used to escape characters that are markup sensitive in certain contexts:


* &amp;amp; → &
* &amp;lt; → <
* &amp;gt; → >
* &amp;quot; → "
* &amp;apos; → '
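
The five substitutions listed above can be applied mechanically rather than by hand. As a hedged illustration only, the Python sketch below uses the standard library's <code>xml.sax.saxutils.escape</code> to prepare a regular expression and a URL for embedding in a scraper XML file; the helper name <code>escape_for_scraper_xml</code> and the sample strings are assumptions made for this example.

```python
from xml.sax.saxutils import escape

def escape_for_scraper_xml(text):
    """Escape the five markup-sensitive characters for use inside XML.

    Hypothetical helper: escape() handles the ampersand and the angle
    brackets by default; the two quote characters are passed in explicitly.
    """
    return escape(text, {'"': "&quot;", "'": "&apos;"})


# A raw regular expression and output URL as one would naturally type them.
raw_expression = '<a href="example">.[^<]*'
raw_output = "http://www.example.com/search.php?title=\\1&new=true"

escaped_expression = escape_for_scraper_xml(raw_expression)
escaped_output = escape_for_scraper_xml(raw_output)

print(escaped_expression)
print(escaped_output)
```

The ampersand is escaped first, so already-escaped text should not be passed through the helper a second time.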


This means that in the regular expressions you actually have to use entities instead of the actual characters.


For example, the following would be wrong:


  <CreateSearchUrl dest="3">
    <RegExp input="$$1" output="http://www.example.com/search.php?title=\1&new=true" dest="3">
      <expression noclean="1"> <a href="example">.[^<]* </expression>
    </RegExp>
  </CreateSearchUrl>


Use instead:


  <CreateSearchUrl dest="3">
    <RegExp input="$$1" output="http://www.example.com/search.php?title=\1&amp;amp;new=true" dest="3">
      <expression noclean="1"> &amp;lt;a href=&amp;quot;example&amp;quot;&amp;gt;.[^&amp;lt;]* </expression>
    </RegExp>
  </CreateSearchUrl>


If you open the scraper XML file in a web browser, the actual characters are shown in place of the entities, so be careful when you cut and paste.


= Scraper Development =
Historically, {{kodi}} scrapers were XML based. While XML scrapers have been used successfully for well over a decade, their use has been deprecated, since the release of {{kodi}} v19 Matrix, in favour of Python based scrapers.


XML scrapers have become difficult to develop and maintain as they are no longer suited to the advanced REST APIs used by the information providers, which results in frequent maintenance and updates to cope with API changes.


The following pages are available to help with the development process of scrapers. Although XML scrapers are deprecated, and no future XML scrapers will be accepted into the official repository, the two pages dedicated to XML scrapers contain useful information.


{| style="width: 100%;"
|-
| style="width: 5%; vertical-align:top;" | [[File:Crystal Project Network Connection Manager.png|50px|left]]
| style="width: 44%; vertical-align:top;" |'''{{resize|1.3em| [[Python_movie_scraper_development|Developing a Movie scraper in Python]]}}'''
| style="width: 2%;" |
| style="width: 5%; vertical-align:top;" | [[File:Crystal Project Network Connection Manager.png|50px|left]]
| style="width: 44%; vertical-align:top;" | '''{{resize|1.3em| [[Scraper development XML|XML Scraper Development]]}}'''
XML scrapers are deprecated. This page may still contain useful development information.
|-
| style="width: 5%; vertical-align:top;" | [[File:Crystal Project Network Connection Manager.png|50px|left]]
| style="width: 44%; vertical-align:top;" |'''{{resize|1.3em| [[Python_tv_scraper_development|Developing a TV Show scraper in Python]]}}'''
| style="width: 2%;" |
| style="width: 5%; vertical-align:top;" | [[File:Crystal Project Network Connection Manager.png|50px|left]]
| style="width: 44%; vertical-align:top;" | '''{{resize|1.3em| [[HOW-TO:Write_media_scrapers|Writing an XML scraper]]}}'''
XML scrapers are deprecated. This page may still contain useful development information.
|}


== Regular Expression Engine ==
There are a few things to note about the regular expression engine:
* Laziness doesn't work. e.g. <xml>.*?</xml> (For a workaround see http://www.regular-expressions.info/repeat.html An Alternative to Laziness)
* \w or \d does not work, use [a-zA-Z] and [0-9] instead
* Regular expressions are case sensitive
* A dot matches a new line
* Beginning of line ^ and end of line $ do not seem to work
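
The first two restrictions in the list above are usually worked around with negated character classes and explicit ranges. The Python sketch below only illustrates that the rewritten patterns select the same text as their lazy or shorthand counterparts; Python's <code>re</code> module is used as a stand-in and is not the engine used by scrapers, and the sample strings are made up for the example.

```python
import re

html = "<b>first</b> and <b>second</b>"

# Lazy quantifier: shown for comparison, not available in scraper expressions.
lazy = re.findall(r"<b>(.*?)</b>", html)

# Workaround: a negated character class stops at the next "<" on its own.
negated = re.findall(r"<b>([^<]*)</b>", html)

# Explicit ranges in place of the \d and \w shorthands.
digits = re.findall(r"[0-9]+", "SKU: 12345")
letters = re.findall(r"[a-zA-Z]+", "SKU: 12345")

print(lazy, negated, digits, letters)
```

The negated class only works when the delimiter character cannot occur inside the text being captured, which is normally true for text between HTML tags.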


== Tags in Detail ==


=== <scraper> ===


  <scraper name="jaded video" content="movies" thumb="jaded.jpg" framework="1.0" date="2009-05-22"></scraper>
{{updated|19}}


The sole purpose of this tag is to define the name, type and thumbnail of the scraper.
[[Category:Karellen]]
 
* name: The name of the scraper
* content: movies/tvshows/mvid
* thumb: relative path to a scraper thumbnail
* framework: Scraper Framework being used. The latest is 1.0 (as used in XBMC 9.04 Babylon)
* date: Date on which this version of the scraper was released, in [http://en.wikipedia.org/wiki/ISO_8601 ISO 8601] format
 
=== <url></url> ===
This is an output string that the scraper.xml generates and passes to XBMC. Url tags can have the following attributes:<br />
* spoof: The referrer url which should be sent. This is sometimes needed for pages with direct linking protection.
* post: If the post attribute is present, variables in the url will be sent via POST. (See http://trac.xbmc.org/browser/trunk/XBMC/system/scrapers/video/adultfilmdatabase.xml?view=markup as a reference on how to use post.)
* function: See GetDetails.
Example:
<url spoof="http://www.google.com">http://akas.imdb.com/find?s=tt;q=\1</url>
Every url returned from the scrapers has to be inside a url tag. An exception is CreateSearchUrl, where the url tag is optional.
 
=== <RegExp><expression></RegExp> ===
A RegExp is a text processing command that takes text in buffer 'input', runs <expression> on it... formats the matches with 'output' and places the result in buffer 'dest'
 
  <RegExp conditional="override" input="$$5" output="&lt;details&gt;\1&lt;/details&gt;" dest="3">
    <expression repeat="yes" noclean="1" trim="1" clear="no">([a-zA-Z][^,]*)</expression>
  </RegExp>
 
==== <expression> ====
The <expression> tag holds the regular expression. Here it is: <xml>([a-zA-Z][^,]*)whatever (.[^-]*)</xml>
The regular expression matches are stored in \1, \2, ... , \9 and can be used in the output attribute of the RegExp tag. If the expression is blank, the input buffer contents is put into match \1.
 
* repeat="yes/no": Repeat the regular expression.
* noclean="1,..,9": By default html tags and special characters are stripped from the matches \1, ..., \9. By setting noclean="1, ..., 9" you can stop this behavior.
* trim="1,..,9": Trim white spaces from the end of matches 1 to 9.
* clear="yes/no": If set to yes, dest is cleared if the expression fails.
 
==== <RegExp> ====
The RegExp tag sets the input that is searched by the regular expression.
 
* input="$$x" where x=1 to 9: The input variable holds the text that will get searched by the regular expression
* output: Defines what the output should look like. Here you can use \1, ..., \9, which represent the regular expression matches.
* dest="x" where x=1 to 9: The variable in which the output should be stored. If clear is set to yes, its previous content is cleared if the expression fails.
* conditional="<condition>": A condition that must resolve to TRUE for the particular RegExp to be run. Currently the only available condition is "override", which is set based on the Language Override setting in the scraper.
 
=== <NfoUrl> ===
It's the task of NfoUrl to return the url for GetDetails from an nfo.<br />
'''Inputs''':<br />
* $$1: Variable 1 holds the content of the Nfo file.
 
=== <CreateSearchUrl> ===
 
  <CreateSearchUrl dest="3">
    <RegExp input="$$1" output="http://akas.imdb.com/find?s=tt;q=\1" dest="3">
      <expression noclean="1"></expression>
    </RegExp>
  </CreateSearchUrl>
 
'''Inputs''':<br />
* $$1: Variable 1 holds the search string. This is usually the filename stripped by some words e.g. DVDRip, Xvid ..
 
The purpose of this function is to create a variable which will hold the url of the search result page.
* dest="x" where x=1 to 9: Variable x shall hold the url of the search result page.
 
In the example above <CreateSearchUrl dest="3"> means that variable 3 should hold the search results page url.
The Regular expression tag searches through variable 1 and stores the searchstring in variable 3. Since no expression is specified the entire variable $$1 is matched and stored in \1. Since this does not contain html tags it is not necessary to clean \1.
 
=== <GetSearchResults> ===
 
    <GetSearchResults dest="8">
      <RegExp input="$$5" output="&lt;?xml version="1.0" encoding="iso-8859-1" standalone="yes"?&gt;&lt;results&gt;\1&lt;/results&gt;" dest="8">
        <RegExp input="$$1" output="&lt;entity&gt;&lt;title&gt;\1&lt;/title&gt;&lt;url&gt;http://jadedvideo.com/yz_resultJAVA.asp?PRODUCT_ID=\2&lt;/url&gt;&lt;/entity&gt;" dest="5">
          <expression repeat="yes">&lt;font color="#000FF"&gt;(.[^"]*)&lt;/big&gt;.[^"]*SKU: ([0-9]+)&lt;br&gt;</expression>
        </RegExp>
        <expression noclean="1"></expression>
      </RegExp>
    </GetSearchResults>
 
'''Inputs''':<br />
* $$1: Variable 1 holds the content of the search URL returned by CreateSearchUrl.
 
It is the task of this function to return a list of search results in the following format:
 
    <?xml version="1.0" encoding="iso-8859-1" standalone="yes"?>
    <results>
      <entity>
        <title></title>
        <url></url>
        <url></url>
      </entity>
      <entity>
        <title></title>
        <url></url>
        <url></url>
      </entity>
    </results>
 
There can be up to 9 <url> tags within each <entity> tag. The webpages specified in the url tags should hold detailed information about the movie/tv series/mvid. In most cases one url will be enough. The url tag can have the argument function="nameoffunction" e.g.
<xml><url function="CustomFunction">theurl</url></xml>
In this case, "CustomFunction" will be executed in addition to GetDetails, after GetDetails has run.
 
When using the scraper the user will be asked in XBMC to select one of the movies from the list which was returned by this function. The one which gets selected will be processed from there on.
 
By default XBMC will try to intelligently re-order the list of returned entries by comparing titles with the search string. If the results you are returning are already in a properly sorted order set <results sorted="yes">.
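
To make the expected result format concrete, the following Python sketch assembles a <results> document with the standard library's <code>xml.etree.ElementTree</code>. It is an illustration under assumptions, not how XBMC or any scraper builds the list: the function name <code>build_results</code>, its parameters and the sample URL are hypothetical, and the XML declaration line is left out for brevity.

```python
import xml.etree.ElementTree as ET

MAX_URLS_PER_ENTITY = 9  # limit on <url> tags per entity, as described above

def build_results(entries, already_sorted=False):
    """Build a <results> document from (title, [urls]) pairs.

    Hypothetical helper for illustration; returns the XML as a string.
    """
    results = ET.Element("results")
    if already_sorted:
        # Tells XBMC not to re-order the entries by title similarity.
        results.set("sorted", "yes")
    for title, urls in entries:
        entity = ET.SubElement(results, "entity")
        ET.SubElement(entity, "title").text = title
        for url in urls[:MAX_URLS_PER_ENTITY]:
            ET.SubElement(entity, "url").text = url
    return ET.tostring(results, encoding="unicode")


xml_text = build_results(
    [("Example Movie", ["http://www.example.com/movie/1"])],
    already_sorted=True,
)
print(xml_text)
```

Building the tree with a library also takes care of escaping markup-sensitive characters in titles and URLs.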
 
=== <GetDetails> ===
This function retrieves the actual data that is saved in the database. For movies the following format should be returned:
 
  <details>
    <title></title>
    <year></year>
    <director></director>
    <top250></top250>
    <mpaa></mpaa>
    <tagline></tagline>
    <runtime></runtime>
    <thumb></thumb>
    <credits></credits>
    <rating></rating>
    <votes></votes>
    <genre></genre>
    <actor>
      <name></name>
      <role></role>
    </actor>
    <outline></outline>
    <plot></plot>
  </details>
 
'''Inputs:'''<br />
* $$x where x= 1 to 9: Variable x holds the content of the URL specified in the <url> tag number x in GetSearchResults.
e.g. if there are two <url> tags (without argument function) returned by GetSearchResults. Variable 1 will contain the content of the first url tag and variable 2 the content of the second url tag.
 
'''<thumb>:'''<br />
The thumb tag can have a spoof argument (like <url>). This feature can be used to circumvent direct linking protections.
The Firefox extension refspoof (http://refspoof.mozdev.org/) can help in testing whether referrer spoofing works.
(See http://xbmc.svn.sourceforge.net/viewvc/xbmc/trunk/XBMC/system/scrapers/video/adultcdmovies.xml?view=markup for an example.)
 
'''Multiple tags:'''<br />
* The following tags can appear more than once in <details>:
** genre
** credits
** director
** actor
 
=== <CustomFunction> ===
This function should return an xml similar to GetDetails. For tags that can have multiple values the values of CustomFunction and GetDetails are combined (genre etc). For the rest the last CustomFunction will take precedence.
 
'''Inputs:'''
* $$1: Variable 1 holds the content of the URL specified in the url tag with the function="CustomFunction" argument.
e.g. for
 
  <url function="CustomFunction">http://www.imdb.com/title/tt0452624/</url>
 
variable 1 would hold the content of "http://www.imdb.com/title/tt0452624/".
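
The merging rule described for CustomFunction (multi-value tags are combined, everything else is overwritten by the later result) can be pictured with a small Python sketch. The representation of details as a dictionary mapping tag names to lists of values, and the name <code>merge_details</code>, are assumptions made purely for illustration; the set of multi-value tags is taken from the GetDetails section.

```python
# Tags that may appear more than once in <details>, per the GetDetails section.
MULTI_VALUE_TAGS = {"genre", "credits", "director", "actor"}

def merge_details(base, extra):
    """Merge the details from a custom function into the GetDetails result.

    Hypothetical model: both arguments map a tag name to a list of values.
    """
    merged = {tag: list(values) for tag, values in base.items()}
    for tag, values in extra.items():
        if tag in MULTI_VALUE_TAGS:
            # Multi-value tags are combined.
            merged.setdefault(tag, []).extend(values)
        else:
            # For single-value tags the later result takes precedence.
            merged[tag] = list(values)
    return merged


get_details = {"title": ["Example"], "genre": ["Drama"]}
custom_function = {"title": ["Example (Extended)"], "genre": ["Thriller"]}
print(merge_details(get_details, custom_function))
```

Calling the helper once per custom function, in the order the functions run, gives the "last one wins" behaviour for single-value tags.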
 
[[Category:XBMC Manual]]
[[Category:Development]]
[[Category:Scraper]]
[[Category:Music library]]
[[Category:Video library]]
[[Category:Music Library]]
[[Category:Disambiguation_pages]]

Revision as of 04:44, 29 May 2022



Using Scrapers

Scrapers are used to contact online information providers to download metadata for video and music items to be added to the library.

A number of scrapers are installed by default when installing Kodi. Other scrapers are available for installation from the official repository. The table below lists all official scrapers and their compatibility.

Scrapers compatible with v18 Leia Scrapers compatible with v19 Matrix Scrapers compatible with v20 Nexus
List of Album scrapers compatible with v18 Leia
List of Album scrapers compatible with v19 Matrix
List of Album scrapers compatible with v20 Nexus
List of Artist scrapers compatible with v18 Leia
List of Artist scrapers compatible with v19 Matrix
List of Artist scrapers compatible with v20 Nexus
List of Music Video scrapers compatible with v18 Leia
List of Music Video scrapers compatible with v19 Matrix
List of Music Video scrapers compatible with v20 Nexus
List of Movie scrapers compatible with v18 Leia
List of Movie scrapers compatible with v19 Matrix
List of Movie scrapers compatible with v20 Nexus
List of TV Show scrapers compatible with v18 Leia
List of TV Show scrapers compatible with v19 Matrix
List of TV Show scrapers compatible with v20 Nexus


Creating your Video Library

Setting up your video library for the first time? Follow this How To guide

Creating your Music Library

Setting up your music library for the first time? Follow this How To guide

Video scraping problems

Something not working as expected? Check this page for common solutions.

Changing Scrapers
List of all Scraper pages


Scraper Development

Historically, Kodi scrapers were XML based. While XML scrapers have been used successfully for well over a decade, their use has been deprecated, since the release of Kodi v19 Matrix, in favour of Python based scrapers.

XML scrapers have become difficult to develop and maintain as they are no longer suited to the advanced REST APIs used by the information providers, which results in frequent maintenance and updates to cope with API changes.

The following pages are available to help with the development process of scrapers. Although XML scrapers are deprecated, and no future XML scrapers will be accepted into the official repository, the two pages dedicated to XML scrapers contain useful information.


Developing a Movie scraper in Python
XML Scraper Development

XML scrapers are deprecated. This page may still contain useful development information.

Developing a TV Show scraper in Python
Writing an XML scraper

XML scrapers are deprecated. This page may still contain useful development information.