General information about migration to Python 3

From Official Kodi Wiki

1 Overview

Python 3 was released 9 years ago, and end-of-life (EOL) for Python 2.7 is scheduled for 2020. More and more Python projects are switching to Python 3 or to 2/3-compatible code. The issue of migrating Kodi’s built-in Python interpreter to Python 3.x has been brought up on the Kodi forum several times. Now, thanks to a successful GSoC 2017 project, we have a working Python 3 interpreter for Kodi. That is why, at Kodi DevCon 2017, Team Kodi decided to switch to Python 3 in a future Kodi release (preliminarily v19).

Unfortunately, Python 3.x is not backward-compatible with Python 2.x, so we urge addon developers to start porting their addons now by making them compatible with both Python 2 and 3 and submitting them to the official repo. This way, when Kodi with Python 3 is released, we won’t end up in a situation where there are no working addons.

2 The Process

  • Starting from Kodi 18 (Leia), only addons that are compatible with both Python 2 and 3 will be accepted to the official addon repository.
  • We will provide Kodi builds with Python 3 so that addon developers can test whether their addons are really compatible with Python 3.
  • A thread will be created on the Kodi forum so that addon devs can get help with their migration process.

3 Python 3 And 2 Differences

Below is a brief overview of the main differences between Python 3 and 2.

3.1 Unicode strings by default

In Python 2 the built-in str type holds a sequence of bytes, so it can store both binary data and text in ASCII or any 8-bit fixed-length encoding (there were plenty). Python 2 also has the unicode type, which can store text in various writing systems. The minimal unit of a unicode object is a Unicode code point (a Unicode character). str and unicode objects can be mixed and matched, for example concatenated with the + operator.

In Python 3 the str type holds Unicode characters, and a new bytes type was added for binary data. The bytes name is also available in Python 2.7 to simplify porting, but there it is simply an alias for str. In Python 3 the str and bytes types are incompatible, and an attempt to combine them raises a TypeError exception. A bytes object is converted to str with the .decode() method, and a str is converted to bytes with the .encode() method.
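The conversion described above can be illustrated with a short sketch. It is meant to be run under Python 3; the variable names are only illustrative, and the sample word is an arbitrary choice with one non-ASCII character.

```python
# Intended for Python 3, where str (text) and bytes (binary) are distinct types.
text = u'caf\u00e9'              # 4 Unicode characters
data = text.encode('utf-8')      # text -> binary; the accented letter takes 2 bytes
restored = data.decode('utf-8')  # binary -> text

# Mixing the two types is an error in Python 3.
try:
    text + data
    mixing_failed = False
except TypeError:
    mixing_failed = True

print(len(text), len(data), restored == text, mixing_failed)
```

Note that the text and its encoded form have different lengths, which is a quick way to see that they are not the same thing.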

3.2 Absolute Import

Python 3 uses so-called “absolute import” by default, and you can no longer import neighboring modules from inside a package using only a module name. You need to use either a fully qualified name or the “dotted” (explicit relative) notation.

Let’s assume that you have a foo package that contains bar and spam modules:

foo/
  __init__.py
  bar.py
  spam.py    

And you need to import bar module from spam. In Python 2 you can do:

import bar

But in Python 3 you need to use either

import foo.bar

or

from . import bar
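To try this out without touching a real addon, the sketch below builds a throwaway copy of the foo package in a temporary directory and imports spam from it. The file contents (VALUE, RESULT) are made up for the demonstration.

```python
import importlib
import os
import sys
import tempfile

# Build a throwaway copy of the foo package in a temporary directory.
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, 'foo')
os.mkdir(pkg_dir)

files = {
    '__init__.py': '',
    'bar.py': 'VALUE = 42\n',
    # spam.py uses the dotted form, which works in Python 2.7 and 3.
    'spam.py': 'from . import bar\nRESULT = bar.VALUE\n',
}
for name, source in files.items():
    with open(os.path.join(pkg_dir, name), 'w') as fo:
        fo.write(source)

# Make the package importable and load foo.spam by its fully qualified name.
sys.path.insert(0, root)
spam = importlib.import_module('foo.spam')
print(spam.RESULT)
```

If spam.py contained a bare "import bar" instead, the import would fail under Python 3 because bar is not a top-level module.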

3.3 Floating Point Division

In Python 2, applying the division operator to int numbers produces an int result. For example:

>>> 4 / 3
1

In Python 3 the division operator always gives you a float result:

>>> 4/3
1.3333333333333333

To get an int result you need to use integer division operator:

>>> 4 // 3
1

3.4 Print Function

In Python 3 the print statement has been replaced with the print() function. This is not very relevant to Kodi, because using print in Python addons is discouraged; the xbmc.log() function should be used to write messages to the Kodi log.
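A minimal sketch of a logging helper built on xbmc.log() is shown below. The helper name log_info and the 'my.addon' prefix are placeholders, and the fallback to print() exists only so that the snippet can also be run outside Kodi, where the xbmc module is not available.

```python
try:
    import xbmc  # available only inside Kodi
except ImportError:
    xbmc = None  # lets the sketch run outside Kodi


def log_info(message):
    """Write a message to the Kodi log (or to stdout outside Kodi)."""
    line = 'my.addon: {0}'.format(message)  # 'my.addon' is a placeholder prefix
    if xbmc is not None:
        xbmc.log(line, xbmc.LOGINFO)
    else:
        print(line)
    return line


log_info('started')
```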

3.5 Default Iterators

In Python 3 most functions and methods that used to return lists now return iterators or other lazy objects instead. For example, the range() function and the .items(), .keys() and .values() methods of the dict type no longer build lists, and their Python 2 analogues (the xrange() function and the .iter* methods of the dict type) have been removed.
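In practice this mostly matters when code indexes or re-uses the result. The sketch below (with a made-up settings dict) shows patterns that behave the same in both versions.

```python
settings = {'volume': 10, 'brightness': 7}

# Materialize a list (here via sorted()) when you need indexing.
keys = sorted(settings.keys())
first_key = keys[0]

# Plain iteration needs no wrapper and works the same in Python 2 and 3.
total = 0
for name, value in settings.items():
    total += value

# range() is fine in loops; wrap it in list() only if a real list is required.
indices = list(range(3))

print(first_key, total, indices)
```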

3.6 No long Integers

In Python 3 the long integer type has been removed, and the int type can now hold a number of arbitrary length.
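For portable code this mainly means avoiding the long name and the 123L literal suffix. One possible approach, sketched below, is to check against numbers.Integral from the standard library, which covers int in Python 3 and both int and long in Python 2.

```python
import numbers

big = 2 ** 64        # too large for a 64-bit machine word
product = big * big  # still exact: no overflow, no precision loss

# Portable integer check: no reference to the Python 2-only `long` type.
is_integer = isinstance(product, numbers.Integral)

print(big, is_integer)
```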

Those are only some of the differences between Python 2 and 3. For a more complete list see the official Python documentation.

4 Some Useful Information

As mentioned above, Python 3 includes a number of incompatible changes. To avoid a situation where most Python addons break after Kodi is upgraded to Python 3, developers should start porting their addons to be compatible with both Python 2 and 3. This is not a trivial task, but fortunately there are a number of tools and recommendations that simplify the process. Here I’ll give you some advice and information about tools that help with creating portable code that runs on both Python 2 and 3.

4.1 Know Your Strings!

This is the most problematic part of porting Python 2 code to Python 3, so I put it first. One of the most notable differences between Python 2 and 3 is the clear separation between “textual” and “binary” strings, that is, between textual data and their binary representation. This difference is often difficult to understand, especially for novice developers (not only in Python), and the fact that Python 2 lets you mix str (binary data) and unicode (textual data) adds to that difficulty.

There are many explanations on the Internet, but here are the most basic things about strings that you need to know:

“Textual” or Unicode strings (unicode in Python 2 and str in Python 3) hold text as a sequence of characters. A minimal unit in such strings is a Unicode character: an abstract entity that represents a letter of an alphabet, a punctuation mark or another symbol included in the Unicode character table. Of course, such symbols have a concrete internal binary representation (a computer is a binary machine after all), but this representation is not relevant to us. All you need to know is that “textual” strings contain text units without being tied to a concrete binary representation.

Unlike Unicode strings, binary strings (str in Python 2 and bytes in both Python 2 and 3) hold binary data, and the minimal unit of such strings is a byte, or 8 bits. Historically, in Python 2 such strings were used for textual data encoded in ASCII or another 1-byte fixed-length encoding (e.g. Windows-1251 for Cyrillic alphabets). However, this approach has its limitations, which is why the unicode type was introduced in Python 2; in Python 3 it was renamed to str and made the default container for textual data.

Unfortunately, Python 2 allows you to mix and match Unicode and binary strings in the same context, and this creates big problems when porting Python 2 code to Python 3. So in order to port your code successfully you need to consider carefully how textual data are processed in your addon. The following recommendations will help you avoid problems when writing Python 2/3-compatible code.

4.2 Avoid using binary strings for text

Store all your text as Unicode strings. This means that all binary strings received from external sources (text files, websites, various APIs) need to be decoded using the appropriate encoding (UTF-8 in most cases). If you need to use string literals (although using hardcoded strings for user-facing text in Kodi is strongly discouraged), they need to be defined as Unicode strings as well. You can use either

from __future__ import unicode_literals

at the beginning of your modules to enable Unicode strings by default, as in Python 3, or u-strings, e.g. u'foo' (Python 3.3 and later understand them too). It doesn't really matter which one you choose, as long as you do it consistently.
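As a small illustration of the u-string variant, the sketch below keeps text and binary literals visibly separate. The variable names and the sample values are made up; with unicode_literals enabled at the top of the module, the u prefix could simply be dropped.

```python
title = u'Movies'   # text: unicode in Python 2, str in Python 3
raw = b'\x89PNG'    # binary data always needs an explicit b prefix

# bytes is an alias for str in Python 2, so this check is portable.
title_is_text = not isinstance(title, bytes)
raw_is_binary = isinstance(raw, bytes)

print(title_is_text, raw_is_binary)
```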

4.3 Know the libraries you are using

When working with the Python Standard Library and third-party libraries (including the Kodi Python API), you should know what types their functions and methods expect and what types they return. For example, in the popular requests library the Response.text property returns a Unicode string and Response.content returns a binary string.

If a function/method/property returns a binary string, you need to decode it to a Unicode string using the .decode() method. Don’t use the unicode class constructor, because the unicode type does not exist in Python 3.
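A minimal sketch of this rule follows. The payload is a hand-written stand-in for binary data returned by a library (for example, something like Response.content), not real output of any API.

```python
import json

# Stand-in for a binary string received from a library or a website.
payload = b'{"title": "Caf\xc3\xa9"}'

# Portable: .decode() exists on binary strings in both Python 2 and 3.
text = payload.decode('utf-8')
# By contrast, unicode(payload, 'utf-8') raises NameError in Python 3.

title = json.loads(text)['title']
print(title == u'Caf\u00e9')
```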

If you are reading text files from disk, it is better to use io.open() instead of the built-in open(). io.open() can decode file contents to Unicode using the specified encoding and works identically in both Python 2 and 3. Example:

import io

with io.open('foo.txt', 'r', encoding='utf-8') as fo:
    text = fo.read()
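The same function works for writing. The sketch below does a full round trip through a temporary file (the path and the sample text are arbitrary) and also reads the file in binary mode to show what actually ends up on disk.

```python
import io
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'foo.txt')

# In text mode io.open() expects Unicode text and encodes it on the way out.
with io.open(path, 'w', encoding='utf-8') as fo:
    fo.write(u'na\u00efve caf\u00e9')

# Reading in text mode decodes the bytes back to Unicode.
with io.open(path, 'r', encoding='utf-8') as fo:
    text = fo.read()

# Reading in binary mode returns the raw UTF-8 bytes.
with io.open(path, 'rb') as fo:
    raw = fo.read()

print(len(text), len(raw))
```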

4.4 Potential pitfalls

When writing Python 2/3-compatible code or converting an existing Python 2 codebase, pay attention to the following language constructs, where TypeError exceptions may occur because of incompatible string types if you don’t get your strings in order:

  • String concatenations with + operator ('foo' + 'bar').
  • String formatting — both “old style” ('foo %s' % 'bar') and “new style” ('foo {}'.format('bar')).
  • String joining (', '.join(['foo', 'bar'])).

In all those cases always make sure that you work with Unicode strings only. Another source of subtle bugs is functions and methods that accept and return binary strings (str type) in Python 2 but Unicode strings in Python 3 (again, str type but with different properties). You can use the following utility functions to “normalize” strings that are provided to such functions/methods and received from them:

import sys

__all__ = ['PY2', 'py2_encode', 'py2_decode']

PY2 = sys.version_info[0] == 2


def py2_encode(s, encoding='utf-8'):
    """
    Encode Python 2 ``unicode`` to ``str``

    In Python 3 the string is not changed.
    """
    if PY2 and isinstance(s, unicode):
        s = s.encode(encoding)
    return s


def py2_decode(s, encoding='utf-8'):
    """
    Decode Python 2 ``str`` to ``unicode``

    In Python 3 the string is not changed.
    """
    if PY2 and isinstance(s, str):
        s = s.decode(encoding)
    return s

Those utility functions are included in the Kodi Six library, which is described in the following section.
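For the concatenation, formatting and joining pitfalls listed earlier in this section, one possible approach is to normalize every value to Unicode before combining. The to_text helper below is a hypothetical illustration written for this page; it is not part of Kodi Six or any other library.

```python
def to_text(value, encoding='utf-8'):
    """Return Unicode text for either a binary or a Unicode string."""
    if isinstance(value, bytes):  # str in Python 2, bytes in Python 3
        return value.decode(encoding)
    return value


# A mixed list like this breaks u', '.join(parts) in Python 3.
parts = [b'foo', u'bar']

joined = u', '.join(to_text(p) for p in parts)
message = u'{0} items: {1}'.format(len(parts), joined)
print(message)
```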

5 Helper Libraries and Tools

5.1 2to3

The 2to3 script was created by the Python developers to help convert existing Python 2 code to Python 3. On Windows it is included in the Python distribution (Tools subfolder), but on other OSes you may need to install it separately. For example, on Ubuntu it is included in the python-examples package. Note that this script was created for conversion from Python 2 to 3, not for writing portable code, so you need to treat its output with care.

5.2 Modernize

The Modernize script (https://python-modernize.readthedocs.io/en/latest/) works on top of 2to3 and is supposed to help you convert your existing Python 2 code to Python 2/3-compatible code. However, in my experiments it did not work reliably even in simple cases, so I cannot recommend it.

5.3 Six

Six was the first library developed to simplify Python 2 to 3 migration. It provides a set of wrappers that hide differences between Python 2 and 3 behind its API. It is less intrusive than Future (described below) because it does not monkey-patch built-in names, but in order to use Six effectively you need to learn its API. Six can be used both for writing new Python 2/3-compatible addons and for converting existing ones, but it requires good knowledge of Python 2/3 differences to pick the Six features that address specific differences.

Six library is included in the Kodi addon repository as script.module.six addon.
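To give an idea of what such wrappers look like, here is a hand-rolled sketch that mirrors a few names Six exposes (PY2, text_type, binary_type, iteritems). It is only an illustration of the approach, not the actual Six source; in a real addon you would import these names from six.

```python
import sys

PY2 = sys.version_info[0] == 2

if PY2:
    text_type = unicode  # noqa: F821 -- this name exists only in Python 2
    binary_type = str
else:
    text_type = str
    binary_type = bytes


def iteritems(d):
    """Iterate over (key, value) pairs without building a list."""
    return d.iteritems() if PY2 else iter(d.items())


pairs = sorted(iteritems({'a': 1, 'b': 2}))
print(pairs)
```

Calling code then refers to text_type or iteritems() and never to version-specific names.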

5.4 Future

The Future library, like Six, was created to simplify porting existing Python 2 code to 3, but it uses a different approach. Future monkey-patches built-in Python objects to make Python 2 behave like Python 3. The advantage of this approach is that code written using Future is close to idiomatic Python 3 and requires little rework if you decide to drop Python 2 support in the future. However, this approach may cause problems in some rare edge cases.

Future library also includes futurize command-line utility for converting existing Python 2 code to 2/3-compatible and in my experiments this utility showed good results. Future library (without additional utilities) is included in the Kodi addon repository as script.module.future addon.

5.5 Kodi Six

Kodi Six library (available as script.module.kodi-six addon) is created to normalize string handling in Kodi Python API based on different Python versions by applying wrappers around Kodi API functions and classes. In Kodi API based on Python 2 Kodi Six wrappers have the following effects:

  • All functions and methods that expect string arguments can accept both UTF-8 encoded str objects and unicode objects.
  • All functions and methods that return string data return unicode objects.

In the Kodi API based on Python 3 the wrappers have no effect, because the Python 3-based API works only with str (Unicode) objects. This eliminates the need for ad hoc encoding and decoding of strings when working with the Kodi API. To use the wrappers you need to import the Kodi API xbmc* modules from Kodi Six instead of importing them directly:

from kodi_six import xbmc, xbmcaddon, xbmcgui, xbmcplugin, xbmcvfs

5.6 PyCharm

PyCharm, besides being a really good Python IDE, also provides a code compatibility inspection that helps you write Python 2/3-portable code. Open Settings (Alt+F7) > Editor > Inspections > Python > Code compatibility inspections and check the Python versions that you want to support.

PyCharm Community Edition is free and provides all features you need for creating Python addons for Kodi.

6 Writing Portable Code

Here I’ll give you some tips about how to write portable code:

  • Learn the differences between Python 2 and 3. You need to know at least the most important differences between the two major Python versions.
  • Use a version control system (git or mercurial) to track changes in your code. If you are porting existing code, do it in a separate branch.
  • Whether you are going to write a brand new addon or port an existing one to Python 3, choose your tools carefully. It is entirely possible to write portable code without any helper tools and libraries, but you need to know what you are doing. In most cases, however, I’d recommend using the Future library and its utilities. Read the Future documentation carefully.
  • Put the following line at the beginning of all your modules:
from __future__ import absolute_import, division, unicode_literals

This will enable the respective Python 3 features in your code. You don’t need to import print_function (another Python 3 feature) because in Kodi addons you should use xbmc.log() to write messages to the Kodi log file.

6.1 Writing New Addons

Check the Quick Start Guide section of the Future library documentation. You can also use the Six library but, as mentioned previously, you need to learn its API to pick the features that address specific Python version differences, while Future allows you to write your code in (almost) idiomatic Python 3.

A brief procedure for writing new Python 2/3-compatible addons with Future library:

  1. Create a new virtual environment with a Python 3 interpreter and activate it.
  2. Install the Future library: pip install future.
  3. Point your IDE (Integrated Development Environment) to that environment. For example, in PyCharm: File > Settings > Project > Project Interpreter.
  4. Put the following lines at the beginning of your Python code:
from __future__ import absolute_import, division, unicode_literals
from future import standard_library
from future.builtins import *
standard_library.install_aliases()

Write your addon using Python 3 syntax and standard library names. However, you may still need to use wrappers from Future library for some specific cases (e.g. iterators over dict elements). Read Future docs for more info.

6.2 Porting Existing Addons

A brief procedure for porting existing code to Python 2/3-compatible code with the Future library:

  1. Install Future library into your working Python 2 virtual environment: pip install future.
  2. Run futurize utility to convert all your scripts and modules to portable code.

Test your new or converted addon in Kodi with the Python 2 interpreter and fix all issues found. After that, test the addon in Kodi with the Python 3 interpreter and again fix all issues found. You can use the Web-PDB debugger to troubleshoot issues in your code; it is compatible with both Python 2 and 3.

7 Links