Python movie scraper development
Revision as of 08:16, 7 June 2022
Wiki revamp: This page or section of the wiki is undergoing a revamp. Content is still being added and organized. Please pardon the dust and anything that might not be working.
Introduction
Historically, Kodi has supported XML scraping addons that parse online information sources about movies, TV shows, music and so on. However, this approach has its limitations. First, XML parsing definitions built on regular expressions are difficult to write and maintain. Second, many information sources now provide REST APIs, and regular expressions are poorly suited to parsing JSON data. That is why Python scrapers were introduced.
Python scrapers are written in Python and have a structure similar to media addons (media plugins). Like media plugins, they are called via special plugin:// URLs with query parameters. The main query parameter is action, which defines the scraping stage. Scrapers also interact with Kodi via the functions of the xbmcplugin Python API module and pass information to Kodi through xbmcgui.ListItem instances.
Basically, a Python scraper is a media plugin that passes information to Kodi instead of presenting lists of playable media items. A movie scraper must support "action" calls that are described below.
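The call convention described above can be sketched as a small dispatcher. This is a minimal illustration, not the reference implementation: it assumes the scraper is invoked like a media plugin, with the handle in sys.argv[1] and the query string in sys.argv[2], and the helper names parse_params and route are hypothetical. Action names are compared in lower case, which is also an assumption.

```python
from urllib.parse import parse_qsl


def parse_params(query_string):
    """Turn a query string such as '?action=find&title=...' into a dict."""
    return dict(parse_qsl(query_string.lstrip('?')))


def route(params, handlers):
    """Call the handler registered for the 'action' query parameter.

    `handlers` maps lower-case action names to callables that take the
    params dict. Comparing in lower case is an assumption of this sketch.
    """
    action = params.get('action', '').lower()
    handler = handlers.get(action)
    if handler is None:
        raise ValueError('Unsupported scraper action: {!r}'.format(action))
    return handler(params)


# Inside Kodi the entry point would look roughly like this (assumed layout):
#   import sys
#   handle = int(sys.argv[1])
#   params = parse_params(sys.argv[2])
#   route(params, {'find': ..., 'nfourl': ..., 'getdetails': ..., 'getartwork': ...})
```

Keeping the parsing and routing free of Kodi imports makes this part of a scraper easy to test outside Kodi.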
Actions
Find
The find action searches for a specific movie by title and, optionally, a year, both of which are passed as additional query parameters of the plugin call. This action should use xbmcplugin.addDirectoryItem or xbmcplugin.addDirectoryItems to pass xbmcgui.ListItem instances to Kodi.
If only one instance is passed, it is considered a perfect match. Otherwise the media file won't be matched automatically and must be resolved manually by selecting the correct item from a list.
The xbmcgui.ListItem instances must be assigned the following properties:
- label: passed as the label parameter to the class constructor. This is the label that is presented to the user during scraping.
- url: passed as the url parameter to the class constructor. This should be a unique string that can be used to request all the necessary movie info from the data provider's website or API. It can be, for example, a link to a movie page on a movie information website or a unique ID used to request movie information from a REST API.
- thumb (optional): set via the setArt() instance method of the xbmcgui.ListItem class. This should be the URL of a movie poster, for example.
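A find handler along these lines might look as follows. This is a sketch under stated assumptions: the shape of the search results (dicts with id, title, year and poster keys) is hypothetical, the thumbnail is set with ListItem.setArt(), and the url is supplied through the url argument of xbmcplugin.addDirectoryItem, as the demo addon appears to do. The Kodi modules are imported defensively so that the pure helper can be exercised outside Kodi.

```python
try:
    import xbmcgui
    import xbmcplugin
except ImportError:  # running outside Kodi, e.g. in unit tests
    xbmcgui = xbmcplugin = None


def build_candidates(results):
    """Convert hypothetical provider results into (label, url, thumb) tuples."""
    candidates = []
    for item in results:
        label = item['title']
        if item.get('year'):
            label = '{} ({})'.format(label, item['year'])
        # The url is any unique string the later stages can resolve;
        # this sketch uses the provider's ID.
        candidates.append((label, str(item['id']), item.get('poster')))
    return candidates


def find(handle, results):
    """Pass one ListItem per search result to Kodi (only works inside Kodi)."""
    for label, url, thumb in build_candidates(results):
        list_item = xbmcgui.ListItem(label, offscreen=True)
        if thumb:
            list_item.setArt({'thumb': thumb})
        xbmcplugin.addDirectoryItem(handle, url, list_item, isFolder=True)
    xbmcplugin.endOfDirectory(handle)
```

Returning exactly one candidate is what lets Kodi treat the result as a perfect match, so a real scraper would normally filter or rank results before this step.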
NfoUrl
The NfoUrl action is called as an alternative to find if an .NFO file is present alongside a movie file. The entire contents of the .NFO file are passed as the nfo parameter. This action should use xbmcplugin.addDirectoryItem to pass a single xbmcgui.ListItem instance to Kodi.
The xbmcgui.ListItem instance must be assigned the following properties:
- url: passed as the url parameter to the class constructor. This should be a unique string that can be used to request all the necessary movie info from the data provider's website or API. It can be, for example, a link to a movie page on a movie information website or a unique ID used to request movie information from a REST API.
- Unique IDs: set via the ListItem.setUniqueIDs() instance method. A scraper should set at least the default unique ID from the movie information site it works with. Optionally, it can also set unique IDs from other online movie databases.
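As an illustration of the NfoUrl stage, the sketch below looks for an IMDb-style ID (tt followed by digits) in the .NFO text and passes a single item to Kodi. Choosing IMDb as the default ID type, using the ID itself as the url, and the helper names are all assumptions made for this example; a real scraper would look for links or IDs of the site it actually works with.

```python
import re

try:
    import xbmcgui
    import xbmcplugin
except ImportError:  # running outside Kodi, e.g. in unit tests
    xbmcgui = xbmcplugin = None

# IMDb title IDs look like 'tt0133093'; matching them is this sketch's choice.
IMDB_ID = re.compile(r'tt\d{7,}')


def extract_imdb_id(nfo_text):
    """Return the first IMDb-style ID found in the NFO contents, or None."""
    match = IMDB_ID.search(nfo_text)
    return match.group(0) if match else None


def nfo_url(handle, nfo_text):
    """Pass a single ListItem to Kodi if the NFO contains a known ID."""
    imdb_id = extract_imdb_id(nfo_text)
    if imdb_id is None:
        return  # nothing recognised; no item is added
    list_item = xbmcgui.ListItem(imdb_id, offscreen=True)
    # Second argument names the default unique ID type.
    list_item.setUniqueIDs({'imdb': imdb_id}, 'imdb')
    xbmcplugin.addDirectoryItem(handle, imdb_id, list_item, isFolder=True)
```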
getdetails
The getdetails action must pass a single xbmcgui.ListItem via the xbmcplugin.setResolvedUrl() function. This action receives the url query parameter from the previous stages and must set as much information on the xbmcgui.ListItem instance as possible.
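A getdetails handler could be sketched as below. The provider record (keys title, year, overview, genres, imdb_id) and the fetch_movie callable are hypothetical stand-ins for whatever the data provider returns. The sketch uses ListItem.setInfo('video', ...) with a few common keys and sets the default unique ID, then hands the item to Kodi with setResolvedUrl.

```python
try:
    import xbmcgui
    import xbmcplugin
except ImportError:  # running outside Kodi, e.g. in unit tests
    xbmcgui = xbmcplugin = None


def to_video_info(movie):
    """Map a hypothetical provider record to a dict for setInfo('video', ...)."""
    info = {'title': movie['title']}
    if movie.get('year'):
        info['year'] = int(movie['year'])
    if movie.get('overview'):
        info['plot'] = movie['overview']
    if movie.get('genres'):
        info['genre'] = list(movie['genres'])
    return info


def get_details(handle, url, fetch_movie):
    """Resolve `url` to full movie details and pass one ListItem to Kodi.

    `fetch_movie` is a hypothetical callable that takes the url from the
    earlier stage and returns the provider record as a dict.
    """
    movie = fetch_movie(url)
    list_item = xbmcgui.ListItem(movie['title'], offscreen=True)
    list_item.setInfo('video', to_video_info(movie))
    list_item.setUniqueIDs({'imdb': movie['imdb_id']}, 'imdb')
    xbmcplugin.setResolvedUrl(handle, True, list_item)
```

Injecting fetch_movie keeps network access out of the handler, so the mapping logic can be checked without Kodi or a live API.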
getartwork
The getartwork action must pass a single xbmcgui.ListItem via the xbmcplugin.setResolvedUrl() function. This action receives the id query parameter, which is the default unique ID set by the NfoUrl or getdetails actions. This action should set the available artwork using the ListItem.addAvailableArtwork() and ListItem.setAvailableFanart() instance methods.
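The artwork stage can be sketched in the same style. The fetch_art callable and the shape of its result (a dict with posters and fanart URL lists) are hypothetical. The sketch assumes setAvailableFanart() takes a list of dicts with image and preview keys, and reuses the full-size URL as the preview for simplicity.

```python
try:
    import xbmcgui
    import xbmcplugin
except ImportError:  # running outside Kodi, e.g. in unit tests
    xbmcgui = xbmcplugin = None


def fanart_entries(urls):
    """Build the list of dicts passed to ListItem.setAvailableFanart()."""
    return [{'image': url, 'preview': url} for url in urls]


def get_artwork(handle, movie_id, fetch_art):
    """Attach available artwork for `movie_id` and pass one ListItem to Kodi.

    `fetch_art` is a hypothetical callable returning a dict such as
    {'posters': [url, ...], 'fanart': [url, ...]}.
    """
    art = fetch_art(movie_id)
    list_item = xbmcgui.ListItem(movie_id, offscreen=True)
    for poster_url in art.get('posters', []):
        list_item.addAvailableArtwork(poster_url, 'poster')
    list_item.setAvailableFanart(fanart_entries(art.get('fanart', [])))
    xbmcplugin.setResolvedUrl(handle, True, list_item)
```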
Path Settings
A scraper receives the current path settings via the pathSettings query parameter as a JSON-encoded string with each action call.
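Decoding the path settings is a plain JSON operation. The sketch below assumes the query string has already been parsed into a dict named params; the function name read_path_settings and the language key in the usage comment are hypothetical. Falling back to an empty dict on missing or malformed input is a design choice of this example.

```python
import json


def read_path_settings(params):
    """Decode the JSON-encoded pathSettings query parameter into a dict."""
    raw = params.get('pathSettings')
    if not raw:
        return {}
    try:
        settings = json.loads(raw)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return {}
    # Only a JSON object is useful as a settings mapping.
    return settings if isinstance(settings, dict) else {}


# Usage (the 'language' key is a made-up setting name):
#   settings = read_path_settings(params)
#   language = settings.get('language', 'en')
```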
Example Movie Scraper
A very high-level example of a Python movie scraper can be found in the Kodi source code as the metadata.demo.movies addon (https://github.com/xbmc/xbmc/tree/master/addons/metadata.demo.movies).