Difference between revisions of "Python movie scraper development"

From Official Kodi Wiki
(First Creation)
 
{{mininav|[[Scrapers]] {{l2|[[Development]]}} }}
=Introduction=
Historically, Kodi has supported [https://kodi.wiki/view/HOW-TO:Write_media_scrapers XML scraping addons] that parse online information sources
about movies, TV shows, music and so on. However, this approach has its limitations. First, XML parsing definitions based on regular expressions are difficult to write and maintain.
Second, many information sources now provide REST APIs, and regular expressions are poorly suited to parsing JSON and other structured data. That is why Python
scrapers were introduced.
Python scrapers are written in Python and have a structure similar to [https://kodi.wiki/view/HOW-TO:Video_addon media addons] (plugins).
Like plugins, they are called via special <code>plugin://</code> URLs with query parameters. The main query parameter is <code>action</code>, which defines the scraping stage.
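As a rough illustration of how a scraper might react to the <code>action</code> parameter, the sketch below parses a query string and dispatches to a handler. It is only a minimal sketch under stated assumptions: the action values <code>find</code> and <code>getdetails</code>, the parameters <code>title</code> and <code>url</code>, and all function names are hypothetical placeholders chosen for this example and are not taken from the text above.

```python
from urllib.parse import parse_qsl


def parse_params(query_string):
    """Convert a query string such as '?action=find&title=Alien' to a dict."""
    return dict(parse_qsl(query_string.lstrip('?')))


def find_movie(params):
    # Hypothetical handler: a real scraper would query an online source here.
    return 'search for: ' + params.get('title', '')


def get_details(params):
    # Hypothetical handler: a real scraper would fetch full movie details here.
    return 'details for: ' + params.get('url', '')


# Assumed action names, used only for illustration.
HANDLERS = {
    'find': find_movie,
    'getdetails': get_details,
}


def route(query_string):
    """Pick a handler based on the 'action' query parameter."""
    params = parse_params(query_string)
    action = params.get('action')
    handler = HANDLERS.get(action)
    if handler is None:
        raise ValueError('Unsupported action: {!r}'.format(action))
    return handler(params)
```

In a plugin-style addon the query string would typically arrive as a command-line argument (<code>sys.argv[2]</code> in ordinary Kodi plugins), so an entry point could plausibly call <code>route(sys.argv[2])</code>. The handlers here only return strings so that the routing logic can be tried outside Kodi.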

Revision as of 07:41, 25 May 2022

