Python movie scraper development

From Official Kodi Wiki

Revision as of 13:23, 6 June 2022


1 Introduction

Historically, Kodi has supported XML scraper addons that parse online information sources about movies, TV shows, music and so on. However, this approach has its limitations. First, XML parsing definitions based on regular expressions are difficult to write and maintain. Second, many information sources now provide REST APIs, and regular expressions are not suitable for parsing JSON data. That is why Python scrapers have been introduced.

Python scrapers are written in Python and have a structure similar to media addons (media plugins). Like media plugins, they are called through special plugin:// URLs with query parameters. The main query parameter is action, which defines the scraping stage. Scrapers also interact with Kodi via the functions of the xbmcplugin Python API module and pass information to Kodi through xbmcgui.ListItem instances.

Basically, a Python scraper is a media plugin that passes information to Kodi instead of presenting lists of playable media items. A movie scraper must support "action" calls that are described below.
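As a minimal sketch of how the query parameters of a plugin:// call might be read: the example below assumes the usual Kodi plugin convention in which the query string arrives as a single string starting with "?" (normally sys.argv[2], with the plugin handle in sys.argv[1]). The helper name parse_params and the sample query are illustrative only and are not part of the Kodi API.

```python
from urllib.parse import parse_qsl


def parse_params(query):
    """Convert a query string such as '?action=find&title=...' into a dict.

    Hypothetical helper: inside Kodi the query string would normally be
    taken from sys.argv[2].
    """
    return dict(parse_qsl(query.lstrip('?')))


# Sample query for illustration; the title and year are made-up values.
params = parse_params('?action=find&title=Blade+Runner&year=1982')
action = params.get('action')  # selects the scraping stage
```

A scraper would then typically branch on the value of action and call the handler for that stage.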

2 Actions

2.1 Find

The find action is used to search for a specific movie by title and, optionally, by year. Both are passed as additional query parameters of the plugin call. This action should use xbmcplugin.addDirectoryItem or xbmcplugin.addDirectoryItems to pass xbmcgui.ListItem instances to Kodi. If only one instance is passed, it is considered a perfect match. Otherwise the media file is not matched automatically and has to be resolved manually by selecting the correct item from a list. The xbmcgui.ListItem instances must be assigned the following properties:

label: passed as the label parameter to the xbmcgui.ListItem constructor. This is the label that is presented to the user during scraping.

url: passed as the url parameter of xbmcplugin.addDirectoryItem together with the list item (the xbmcgui.ListItem constructor itself has no url parameter). This should be a unique string that can be used to request all the necessary movie info from the data provider's website or API. It can be, for example, a link to a movie page on a movie information website or a unique ID used to request movie information from a REST API.

thumb (optional): passed via the setArt() method of the xbmcgui.ListItem instance, for example setArt({'thumb': poster_url}). This should be a URL of an image such as a movie poster.
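The properties above can be put together roughly as follows. This is a sketch under stated assumptions, not a reference implementation: search_provider and its in-memory catalogue are hypothetical stand-ins for a real website or REST API lookup, and the IDs and poster URLs are made up. The url string is handed to xbmcplugin.addDirectoryItem and the poster is set with setArt(), which is how the Kodi Python API is understood to work here. The Kodi modules are imported inside find() because they are only available when the code runs inside Kodi.

```python
def search_provider(title, year=None):
    """Hypothetical provider lookup: filters a made-up in-memory catalogue."""
    catalogue = [
        {'id': '1001', 'title': 'The Example', 'year': '1999',
         'poster': 'https://example.com/posters/1001.jpg'},
        {'id': '1002', 'title': 'The Example', 'year': '2015',
         'poster': 'https://example.com/posters/1002.jpg'},
        {'id': '1003', 'title': 'Another Film', 'year': '2015',
         'poster': 'https://example.com/posters/1003.jpg'},
    ]
    # Case-insensitive substring match on the title
    matches = [m for m in catalogue if title.lower() in m['title'].lower()]
    if year:
        # Narrow the results only when a year was supplied
        matches = [m for m in matches if m['year'] == str(year)]
    return matches


def find(handle, title, year=None):
    """Sketch of a find action handler; must run inside Kodi."""
    import xbmcgui
    import xbmcplugin

    for movie in search_provider(title, year):
        # label: shown to the user during scraping
        label = '{} ({})'.format(movie['title'], movie['year'])
        list_item = xbmcgui.ListItem(label, offscreen=True)
        # thumb (optional): URL of a poster image
        list_item.setArt({'thumb': movie['poster']})
        # url: unique string used later to request the full movie info
        xbmcplugin.addDirectoryItem(handle=handle, url=movie['id'],
                                    listitem=list_item, isFolder=True)
    xbmcplugin.endOfDirectory(handle)
```

With this sample catalogue, a search for "the example" alone yields two candidates, so the user would have to choose manually, while adding the year 2015 narrows the result to a single item that Kodi treats as a perfect match.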