Scraping Music

Revision as of 04:28, 21 July 2020


Introduction

Music Scraping downloads additional information about the Artists and Albums in your library, as well as downloading any missing artwork. This step is optional, but if you choose to scrape, it must be performed after your music is Scanned into the library.

The source of information for the scrape can be either:

  1. An online source using one of the Kodi Scrapers
  2. Local information from either NFO files or data in a Single File export.

For this guide, it is assumed this is your first setup, so you will have no local files. The scraped data is being retrieved from an online source using the scrapers.

v18 introduces a number of changes and improvements to scraping, the most prominent being the Artist information folder. Also new is the ability for Kodi and the scrapers to download and cache Extended Artwork. These are covered below.


What is Scraped

If the information and artwork are available at the online site used by the active scrapers, the scraping process will download the additional data and artwork into your existing library.


Artists

The table below details the additional information Kodi scrapes into the library. This data cannot be scanned from tags.

Artist Metadata: biography, birthday date and place, date and place died, disambiguation, discography, gender, genres, instruments, mbid's, moods, SortName, type, styles, years active, (band) formed, (band) disbanded

Artist Artwork: banner, clearart, clearlogo, fanart, fanart# (extrafanart), folder (thumb), landscape


Albums

The table below details the additional information Kodi scrapes into the library. This data cannot be scanned from tags.

Album Metadata: moods, mbid's (artwork related), musicbrainzreleasegroupid, rating, review, styles, theme, votes

Album Artwork: back, discart, fanart, folder (thumb), spine


What is Overwritten

Some information that has been scanned from music file tags can be overwritten with data derived from the scraping process. This can be useful if your tags are incomplete or outdated. It allows the library (not the music file tags) to be updated with current data.

To allow this, enable the following setting. Leave it disabled if you are happy with your music file tags.

See: Prefer Online Information

The two tables in the following link detail which entries can be overwritten for artists and albums.

See: Available NFO Tags


Scrapers

A number of scrapers are available to install, most catering to specific music genres.

By default Kodi uses the following scrapers in v18 and below:

  • Artists- Universal Artist Scraper
  • Albums- Universal Album Scraper

These two scrapers will search MusicBrainz, last.fm, allmusic.com and amazon.de for metadata and fanart.tv, last.fm and allmusic.com for artwork.

They are also the only two scrapers that currently download Extended Artwork.


Future Matrix (v19) feature: The XML based Universal scrapers have been replaced by the new Python scrapers, currently labelled as "Generic Artist Scraper" and "Generic Album Scraper".


Extended Artwork

New in v18 is the feature to scrape and cache extended artwork in the music (and video) library without the need for additional add-ons. This is still an experimental feature (to work out bugs). [1]

Basic Artwork consists of:

  • Artists- fanart, folder
  • Albums- folder

Extended Artwork consists of:

  • Artists- banner, clearart, clearlogo, extra fanart, landscape
  • Albums- back, discart, spine


To use this feature, there are two requirements as follows:

1. The following tags must be added to the advancedsettings.xml file, and Kodi must be restarted, before scraping extended artwork.
<advancedsettings>
	<musiclibrary>
		<artistextraart>
			<arttype>banner</arttype>
			<arttype>clearart</arttype>
			<arttype>clearlogo</arttype>
			<arttype>landscape</arttype>
			<arttype>fanart1</arttype>
			<arttype>fanart2</arttype>
			<arttype>fanart3</arttype>
			<arttype>fanart4</arttype>
		</artistextraart>
		<albumextraart>
			<arttype>back</arttype>
			<arttype>discart</arttype>
			<arttype>spine</arttype>
		</albumextraart>
	</musiclibrary>
</advancedsettings>
2. Scraper Settings

The default Universal Artist Scraper and Universal Album Scraper have been updated to scrape extended artwork. These settings are disabled by default and need to be enabled in the scraper settings pages.

See: Universal Artist Scraper
See: Universal Album Scraper
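
For illustration only, the whitelist from requirement 1 could be generated with a short Python script instead of being typed by hand. This is a minimal sketch, not a Kodi tool: the function name build_art_whitelist and its extra_fanart_count parameter are hypothetical, while the element names are copied from the advancedsettings.xml example in this section.

```python
import xml.etree.ElementTree as ET


def build_art_whitelist(extra_fanart_count):
    """Return advancedsettings.xml text that whitelists extended music artwork.

    extra_fanart_count is a hypothetical parameter: how many fanart1..fanartN
    entries to add for artists.
    """
    root = ET.Element("advancedsettings")
    music = ET.SubElement(root, "musiclibrary")

    # Artist art types, as listed in the example in this section
    artist = ET.SubElement(music, "artistextraart")
    artist_types = ["banner", "clearart", "clearlogo", "landscape"]
    # fanart1, fanart2, ... are numbered without zero padding
    artist_types += [f"fanart{n}" for n in range(1, extra_fanart_count + 1)]
    for name in artist_types:
        ET.SubElement(artist, "arttype").text = name

    # Album art types, as listed in the example in this section
    album = ET.SubElement(music, "albumextraart")
    for name in ["back", "discart", "spine"]:
        ET.SubElement(album, "arttype").text = name

    # ET.indent only exists in Python 3.9+; older versions emit a single line
    if hasattr(ET, "indent"):
        ET.indent(root, space="\t")
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_art_whitelist(4))
```

Calling build_art_whitelist(4) produces the same set of entries as the example above. If your advancedsettings.xml already contains other settings, merge the musiclibrary element into it by hand rather than overwriting the file.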

Extra Fanart

Extra Fanart was traditionally saved in the extrafanart folder. Additional fanart images were saved to this folder for use as a slideshow by a compatible skin, usually by utilising the Skin Helper Service add-on or the Artist Slideshow add-on. The downside to this method is that the skin needs to read the extra fanart directly from the extrafanart folder. This requires hard drives to wake, causing delays in the GUI while waiting for the drives to become accessible. Filenames were not important, as the skin slideshow simply displayed any image file in the folder.

v18 introduces caching of extra fanart, just like any other artwork in the library. This feature is also considered experimental. The advantages are quicker loading times and fewer hard disk wake-ups. The disadvantage is that this method is more difficult for skinners to implement.

Only a few skins currently support this method, but more skins will implement it in the future.

There are three requirements to using this method...

1. Compatible Skin
Check with the skinner if this method is used by the skin.
2. advancedsettings.xml
Tags are required in the advancedsettings.xml file as shown in the previous section.
3. Saving Extra Fanart
The extra fanart must be saved alongside the other artwork (as shown in the image) with the following requirements:
  • fanart# is Extra Fanart. As many or as few can be added, but there must be a corresponding whitelist entry in the advancedsettings.xml file as shown above.
  • Do not zero pad the digits, i.e. use 1, not 01; use 12, not 012
  • The Music Library does not support the use of the Extrafanart folder
  • The Universal Artist Scraper does not download extra fanart. The extra fanart must be manually sourced and added to the Artist folder
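
The naming rules above can be checked before refreshing the library. The following is a minimal sketch under stated assumptions: the function name check_extra_fanart is hypothetical, the image extensions are assumed to be jpg or png, and the whitelist is simply the set of arttype values you entered in advancedsettings.xml. It does not read any Kodi files itself.

```python
import re

# Assumption: extra fanart is saved as fanart<number>.jpg or fanart<number>.png
FANART_RE = re.compile(r"^fanart(\d+)\.(jpg|png)$", re.IGNORECASE)


def check_extra_fanart(filenames, whitelist):
    """Return a list of human-readable problems found in extra fanart filenames."""
    problems = []
    for filename in filenames:
        match = FANART_RE.match(filename)
        if not match:
            continue  # not numbered extra fanart (e.g. fanart.jpg, banner.jpg)
        digits = match.group(1)
        if len(digits) > 1 and digits.startswith("0"):
            # Zero padded numbers such as 01 or 012 are not allowed
            problems.append(f"{filename}: zero padded, use fanart{int(digits)} instead")
            continue
        if f"fanart{digits}" not in whitelist:
            problems.append(f"{filename}: no whitelist entry for fanart{digits}")
    return problems


if __name__ == "__main__":
    whitelist = {"fanart1", "fanart2", "fanart3", "fanart4"}
    files = ["fanart.jpg", "fanart1.jpg", "fanart01.jpg", "fanart5.png"]
    for problem in check_extra_fanart(files, whitelist):
        print(problem)
```

With the sample input, the script reports fanart01.jpg as zero padded and fanart5.png as missing from the whitelist, while fanart.jpg and fanart1.jpg pass.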

Performing the Scrape

The following two sections will guide you through the scrape process. Be aware that scraping Artists and scraping Albums are two separate processes.


Artists

To scrape additional information for Artists, follow these steps:

  1. Select Music from the main menu (Image 1)
  2. Select Artists from the category list (Image 2)
  3. Highlight any Artist (Image 3)
  4. Call up the Context Menu for the selected Artist (Image 3)
  5. Select Query info for all artists (Image 3)

Depending upon the size of your library, this process could take many hours. As an example, a library of around 900 artists required over three hours to complete the scrape. This is dependent on your internet connection and the load on the servers hosting the scraper database. It is best to perform the scrape overnight when Kodi is not in use.

Once complete, the scrape should be run a second time to pick up any artists that were missed because the scraper site returned "busy" responses from overloaded servers.


Albums

To scrape additional information for Albums, follow these steps:

  1. Select Music from the main menu (Image 1)
  2. Select Albums from the category list (Image 2)
  3. Highlight any Album (Image 3)
  4. Call up the Context Menu for the selected Album (Image 3)
  5. Select Query info for all albums (Image 3)

Depending upon the size of your library, this process could take many hours. As an example, a library of around 300 albums required almost two hours to complete the scrape. This is dependent on your internet connection and the load on the servers hosting the scraper database. It is best to perform the scrape overnight when Kodi is not in use.

Once complete, the scrape should be run a second time to pick up any albums that were missed because the scraper site returned "busy" responses from overloaded servers.
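
As a rough guide, the two timing examples in this guide (around 900 artists in just over three hours, around 300 albums in almost two hours) work out to roughly 12 seconds per artist and 24 seconds per album. The sketch below only does that arithmetic; the function names are hypothetical, and real scrape times depend on your internet connection and server load, so treat the result as an order-of-magnitude estimate.

```python
def seconds_per_item(items, hours):
    """Average scrape time per item, derived from an observed run."""
    return hours * 3600 / items


def estimate_hours(items, secs_per_item):
    """Estimated total scrape time in hours for a library of the given size."""
    return items * secs_per_item / 3600


if __name__ == "__main__":
    # Figures taken from the examples in this guide, rounded to whole hours
    per_artist = seconds_per_item(900, 3)
    per_album = seconds_per_item(300, 2)
    print(f"about {per_artist:.0f} s per artist, {per_album:.0f} s per album")

    # Hypothetical library size, for illustration only
    print(f"2000 artists: about {estimate_hours(2000, per_artist):.1f} hours")
```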


Unscraped Items

Unfortunately, there are numerous[2] [3] online music databases that contain artist and album information. Not all of them allow API access, and no single site is complete, as every user has their own preference for which site to update. Simply put, there is no IMDB or TheMovieDB equivalent for music, though discogs.com seems to be the largest of the online databases.

When browsing your library, it will become apparent that not all Artists or Albums had additional information scraped. When a scrape fails because no data is available, your options are:

  1. Contribute back to the free sites used by the Kodi scrapers and update missing and incorrect entries (preferred option)
  2. Create NFO files which contain the additional information.

If using the NFO files option, a quick guide is...

  1. Create your Artist information folder without subfolders somewhere outside of your Music Source(s).
  2. Set the Artist information folder in the Music Settings page
  3. Use the export option Artist Folders only to create the artist subfolders
  4. Locate artwork
    • Artists- save it in the Artist's folder in the Artist information folder, using the naming scheme displayed above
    • Albums- save it in the Album folder containing the music files
  5. Create NFO files and enter the additional data to be added to the library
    • Artists- save it in the Artist's folder in the Artist information folder, named artist.nfo
    • Albums- save it in the Album folder containing the music files and name it album.nfo
    • See Music NFO files for sample Artist and Album nfo files.
  6. Refresh the new information into the library


End of Setup

Congratulations! You have reached the end of the second module of the Setup Guide. You should now have a fully functioning Music library full of information and artwork.

The following steps in the guide will help you tweak and modify your library. There is no requirement for you to read on at this stage. Become comfortable with your new library. When you feel there is need for a change, come back to this guide and continue with the remaining modules. We do recommend you take care to back up your library.



The remainder of this page deals with special use cases or provides information for technical interest and reference only. It can be safely disregarded if these do not apply to you.


Next step: Update Music Library


Technical Details

Scraping of additional artist and album information does not happen when you run a library update (including when first adding a source) unless "Fetch additional info on update" is enabled. Currently this defaults to disabled. It is strongly recommended to keep this disabled for several reasons:

  • It allows you to perform the initial tag scan quickly. In v17 and below, when Fetch additional info on update is enabled, the two steps happen immediately as each album and artist is added. This slows down populating the library. It is a background task, but it still gives the impression that populating the library takes a considerable time. In reality it is slowed down by the online scraping, not the local tag scanning. This will be changed in v18, where Kodi will perform the initial tag scan and then, once that is complete, commence the online scrape.
  • The strategy of immediately scraping each album and artist as it is added can also have unwanted consequences. A common example involves compilation albums. Say, for example, you have both Fetch additional info on update and Show song and album artists enabled, and a folder called "Compilations" containing all your Various Artists compilation albums. As the processing takes place in alphabetical order, this folder is processed before the albums of artists in folders beginning with a letter later in the alphabet. The additional information for the song artists from these compilation albums is added to the library and scraped before the other albums by that artist have been scanned, so any artist.nfo file is never found and applied.
  • Bad tagging makes a messy library, and the online scrape then compounds the errors and makes the library difficult to unscramble. It is far better for the user to perform the initial scan, find and fix the bad tags, then perform the online scrape knowing there will be no further errors.
  • Scraping online also often leaves many gaps on the first pass, so the user has to manually query for all anyway, often several times, to fill these gaps. This is because a lack of correct throttling and general server overload have meant that online scraping often failed (seen in the log as 503 server errors). This is expected to be fixed for v18.
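
The last point explains why repeating Query info for all fills gaps: items that failed on a busy server simply get another attempt. The sketch below illustrates that idea only. It is not how Kodi implements scraping, and both scrape_in_passes and the fetch callable are hypothetical; fetch is assumed to return None when the server was busy.

```python
def scrape_in_passes(items, fetch, max_passes=3):
    """Call fetch(item) for each item, retrying failed items in later passes.

    fetch is a hypothetical callable that returns data for an item, or None
    when the server was busy. Returns (results, still_missing).
    """
    results = {}
    pending = list(items)
    for _ in range(max_passes):
        if not pending:
            break  # everything was scraped, no further pass needed
        failed = []
        for item in pending:
            data = fetch(item)
            if data is None:
                failed.append(item)  # try again on the next pass
            else:
                results[item] = data
        pending = failed
    return results, pending


if __name__ == "__main__":
    attempts = {}

    def fake_fetch(item):
        # Simulated server: "Artist B" is busy on the first attempt only
        attempts[item] = attempts.get(item, 0) + 1
        if item == "Artist B" and attempts[item] == 1:
            return None
        return f"info for {item}"

    found, missing = scrape_in_passes(["Artist A", "Artist B"], fake_fetch)
    print(found)
    print(missing)
```

Only the items that failed are retried, so each extra pass is shorter than the first.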


References


