Python 2 vs Python 3: Syntax differences
Vadim Kolobanov
Posted on November 6, 2021
Python 2.x is not worth learning for anything beyond curiosity. The 2.x line is no longer maintained, and any project still using it is legacy code.
Print function
The print statement has been replaced by the print() function, with keyword arguments replacing most of the special syntax of the old print statement. Examples:
Python2: print "The answer is", 2*2
Python3: print("The answer is", 2*2)
Python2: print x, # The comma at the end suppresses the line feed
Python3: print(x, end=" ") # Adds a space instead of a line feed
Python2: print # Prints a line feed
Python3: print() # We need to call the function!
Python2: print >>sys.stderr, "fatal error"
Python3: print("fatal error", file=sys.stderr)
Python2: print (x, y) # print repr((x, y))
Python3: print((x, y)) # Not to be confused with print(x, y)!
You can also set up a separator between the elements, for example:
print("There are <", 2**32, "> possibilities!", sep="")
There are <4294967296> possibilities!
The print() function does not support the "softspace" feature of the old print statement. For example, in Python 2, print "A\n", "B" writes "A\nB\n", but in Python 3, print("A\n", "B") writes "A\n B\n".
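The keyword arguments described above can be tried out with a short Python 3 sketch. It is only an illustration: the in-memory buffer and the variable names are assumptions made for this example, so that the output can be inspected instead of just scrolling past.

```python
import io
import sys

# Capture output in memory so the exact characters can be inspected.
buffer = io.StringIO()

# sep goes between the arguments, end follows the last one.
print("The answer is", 2 * 2, file=buffer)
print("a", "b", "c", sep="-", end="!\n", file=buffer)

# No softspace: the separator is written even after an argument ending in "\n".
print("A\n", "B", file=buffer)

captured = buffer.getvalue()
print(repr(captured))

# The file argument replaces the old "print >>sys.stderr, ..." form.
print("fatal error", file=sys.stderr)
```

Printing repr(captured) makes the space after the first newline visible, which is the softspace difference mentioned above.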
Mappings and iterators instead of lists
Some well-known methods do not return lists in Python 3:
The dict.keys(), dict.items() and dict.values() dictionary methods return "views" instead of lists. For example, this no longer works:
k = d.keys()
k.sort()
Use k = sorted(d) instead.
The dict.iterkeys(), dict.iteritems() and dict.itervalues() methods are no longer supported.
map() and filter() return iterators. If you really need a list, a quick fix is list(map(...)), but a better fix is often a list comprehension (especially when the original code uses lambda), or rewriting the code so that it doesn't need a list at all. Particularly tricky is map() invoked only for the side effects of the function; the correct transformation is a regular for loop, since creating a list would just be wasteful.
range() now behaves like xrange(), but works with values of any size. xrange() no longer exists.
zip() returns an iterator.
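A minimal Python 3 sketch of the points in this list. The dictionary contents and variable names are made up for illustration, and the key order noted in the comments assumes Python 3.7 or later, where dictionaries keep insertion order.

```python
d = {"b": 2, "a": 1}

keys = d.keys()                      # a view object, not a list
has_sort = hasattr(keys, "sort")     # views have no sort() method
sorted_keys = sorted(d)              # the replacement for k.sort()

# A view reflects later changes to the dictionary.
d["c"] = 3
live_keys = list(keys)               # insertion order on Python 3.7+

# A list comprehension instead of list(map(lambda x: x * x, ...)).
squares = [x * x for x in range(5)]

# filter() is lazy too; wrap it in list() only when a list is needed.
evens = list(filter(lambda x: x % 2 == 0, range(10)))

# Where map() was used only for side effects, use a plain loop.
seen = []
for item in ("x", "y"):
    seen.append(item.upper())

# zip() returns an iterator, so it can be consumed only once.
pairs = zip("ab", [1, 2])
first_pass = list(pairs)
second_pass = list(pairs)

# range() is lazy and accepts values beyond the old sys.maxint.
big_range = range(10**20)
contains_big = 10**19 in big_range
```

The empty second_pass is the usual surprise when porting: code that iterated over the result of zip() or map() twice needs an explicit list().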
Comparison operators
Python 3 has simplified the rules for comparison operators:
Ordering comparison operators (<, <=, >=, >) raise a TypeError exception when the operands have no meaningful natural ordering. Thus, expressions like 1 < '', 0 > None or len <= len are no longer valid, and, for example, None < None raises TypeError instead of returning False. A consequence is that sorting a list of mixed data types no longer makes sense: all elements must be comparable to each other. Note that this does not apply to the == and != operators: objects of different incomparable types always compare unequal to each other.
builtins.sorted() and list.sort() no longer accept the cmp argument providing a comparison function. Use the key argument instead. The key and reverse arguments are now keyword-only.
The cmp() function is gone in Python 3, and the __cmp__() special method is no longer supported.
Use __lt__() for sorting, and __eq__() together with __hash__() for comparison. (If you really need the cmp() functionality, you can use the expression (a > b) - (a < b) as an equivalent of cmp(a, b).)
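The comparison rules above can be sketched as follows. The Version class and the word list are hypothetical examples, and functools.cmp_to_key() is not mentioned in the text above; it is a standard-library helper (available since Python 3.2) that is shown here only as one possible way to keep an old comparison function working.

```python
import functools

# Ordering comparisons between unrelated types raise TypeError.
try:
    1 < ""
except TypeError:
    ordering_raises = True
else:
    ordering_raises = False

# Equality never raises: incomparable types are simply unequal.
mixed_equal = (1 == "")

# key replaces cmp; key and reverse must be passed by keyword.
words = ["pear", "Apple", "fig"]
by_length = sorted(words, key=len)
reverse_ignoring_case = sorted(words, key=str.lower, reverse=True)


def cmp(a, b):
    """Stand-in for the removed built-in cmp()."""
    return (a > b) - (a < b)


# Wrap a legacy comparison function so it can be used as a key.
legacy_sorted = sorted([3, 1, 2], key=functools.cmp_to_key(cmp))


class Version:
    """Hypothetical class: ordered via __lt__, compared via __eq__ and __hash__."""

    def __init__(self, major, minor):
        self.major = major
        self.minor = minor

    def _key(self):
        return (self.major, self.minor)

    def __lt__(self, other):
        return self._key() < other._key()

    def __eq__(self, other):
        return self._key() == other._key()

    def __hash__(self):
        return hash(self._key())


oldest = sorted([Version(3, 0), Version(2, 7)])[0]
```

In most ported code a plain key function is simpler than cmp_to_key(); the wrapper is mainly useful when the comparison logic is too involved to rewrite right away.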
Integers
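PEP 0237: long has been renamed to int. There is now only one built-in integer type, int, and it behaves mostly like the old long type.
PEP 0238: An expression of the form 1/2 returns a float. Use 1//2 to get the floored integer result. (The // syntax has been available since Python 2.2.)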
The sys.maxint constant has been removed since there is no longer a limit to the value of integers. However, sys.maxsize can be used as a number larger than any practical index of a list or string. sys.maxsize corresponds to the "natural" size of an integer and, as a rule, has the same value as sys.maxint on the same platform (provided the same build parameters).
repr() of a long integer no longer includes the trailing L, so code that unconditionally strips that character will chop off the last digit instead. (Use str() instead.)
Octal literals no longer have the form 0720; use 0o720.
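A short Python 3 sketch of the integer changes listed above. The variable names are illustrative only.

```python
import sys

true_division = 1 / 2          # always a float
floor_division = 1 // 2        # integer result
negative_floor = -7 // 2       # floors toward negative infinity, not toward zero

# One integer type with arbitrary precision: no separate long.
big = 2 ** 100
big_type = type(big).__name__

# sys.maxsize is not a ceiling for int values.
beyond_maxsize = sys.maxsize + 1
still_int = isinstance(beyond_maxsize, int)

# repr() has no trailing "L", so there is nothing to strip.
big_repr = repr(big)

# Octal literals need the 0o prefix; a bare 0720 is a syntax error.
octal = 0o720
```

Note that // floors rather than truncates, so -7 // 2 gives -4; code that relied on Python 2 integer division with negative operands behaves the same way under //.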
Text, Unicode, and 8-bit strings
Everything you knew about binary data and Unicode has changed.
Python 3 uses the concepts of text and (binary) data instead of Unicode strings and 8-bit strings.
All text is Unicode; however, encoded Unicode is represented as binary data. The type used to hold text is str, and the type used to hold data is bytes. The biggest difference from Python 2.x is that any attempt to mix text and data in Python 3 raises a TypeError, whereas mixing Unicode and 8-bit strings in Python 2.x worked if the 8-bit string happened to contain only 7-bit (ASCII) bytes, but raised a UnicodeDecodeError if it contained non-ASCII values. This value-dependent behavior has caused numerous sad faces over the years.
As a consequence of this change in philosophy, much of the code that uses Unicode, encodings or binary data will likely have to change. The change is for the better, since Python 2.x code had numerous bugs related to mixing encoded and unencoded text. To prepare while still on Python 2.x, start using unicode for all unencoded text, and str only for binary or encoded data. The 2to3 tool will then do most of the work for you.
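In Python 3.0 the u"..." literal for Unicode text was no longer accepted. Python 3.3 accepts it again (PEP 414) to ease porting, and it means the same as a plain "..." literal. Binary data, however, must be written with the b"..." literal.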
Since str and bytes cannot be mixed, you should always explicitly convert them. Use str.encode() to go from str to bytes and bytes.decode() to go from bytes to str. You can also use bytes(s, encoding=...) and str(b, encoding=...), respectively.
Like str, the bytes type is immutable. There is a separate mutable type for binary data, bytearray. Almost all functions that accept bytes also accept bytearray.
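The str and bytes conversions described above, as a minimal Python 3 sketch. The sample text and the UTF-8 encoding are assumptions chosen for illustration.

```python
text = "price: 5 \u20ac"            # str: Unicode text, 10 characters

# Explicit conversions between text and data.
data = text.encode("utf-8")         # bytes: the euro sign takes 3 bytes
round_trip = data.decode("utf-8")
same_data = bytes(text, encoding="utf-8")
same_text = str(data, encoding="utf-8")

# Mixing text and data raises TypeError, regardless of the contents.
try:
    "abc" + b"def"
except TypeError:
    mixing_raises = True
else:
    mixing_raises = False

# bytes is immutable; bytearray is the mutable counterpart.
buf = bytearray(b"abc")
buf[0] = ord("x")
as_bytes = bytes(buf)
```

The lengths differ (10 characters versus 12 bytes), which is a quick way to tell whether a value has already been encoded.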
All backslashes in "raw" string literals are interpreted literally. This means that "\U" and "\u" in raw strings are not treated specially. For example, r"\u20ac" is a string of 6 characters in Python 3.0, while in 2.6, ur"\u20ac" was the single "euro" character. (Of course, this change only affects raw string literals.)
The built-in basestring abstract type has been removed. Use str instead. str and bytes don't have enough functionality in common to justify a shared base class. The 2to3 tool replaces each occurrence of basestring with str.
PEP 3138: repr() of a string no longer escapes non-ASCII characters. However, it still escapes control characters and code points that the Unicode standard marks as non-printable.
PEP 3120: The default encoding of the source code is now UTF-8.
PEP 3131: Non-ASCII characters are allowed in identifiers. (However, the standard library remains ASCII, except for the authors' names in the comments.)
The StringIO and cStringIO modules have been removed. Instead, import the io module and use io.StringIO or io.BytesIO for text and data respectively.
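A few of the items above can be checked directly in Python 3. This is only a sketch; the sample strings are made up for the example.

```python
import io

# Raw strings keep every backslash, so \u is not an escape here.
raw = r"\u20ac"
raw_length = len(raw)

# In a normal literal the same escape is a single character.
euro = "\u20ac"
euro_length = len(euro)

# repr() keeps printable non-ASCII characters but escapes control characters.
euro_repr = repr(euro)
newline_repr = repr("\n")

# io.StringIO holds text, io.BytesIO holds binary data.
text_stream = io.StringIO()
text_stream.write("h\u00e9llo")
binary_stream = io.BytesIO()
binary_stream.write("h\u00e9llo".encode("utf-8"))

text_value = text_stream.getvalue()
binary_value = binary_stream.getvalue()
```

Writing a str to a BytesIO, or bytes to a StringIO, raises TypeError, in line with the strict separation of text and data.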
Built-in functions
PEP 3135: New super(). Now you can call super() without arguments and (provided that it is an instance method defined inside the class definition) the class and instance will be automatically selected. With arguments, the behavior of super() remains unchanged.
PEP 3111: raw_input() was renamed to input(), which returns the entered line as a string without evaluating it. To get the old Python 2 input() behavior, use eval(input()).
Added the next() function, which calls the __next__() method of an object.
Moved intern() to sys.intern().
Removed: apply(). Instead of apply(f, args), use f(*args).
Removed in Python 3.0: callable(). Instead of callable(f), use hasattr(f, "__call__"). The operator.isCallable() function has also been removed. (callable() itself was restored in Python 3.2.)
Removed: coerce().
Removed: execfile(). Instead of execfile(fn), use exec(open(fn).read()).
Removed: file. Use open().
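Moved: reduce() to functools.reduce().
Moved: reload() to imp.reload(). The imp module was later deprecated and removed in Python 3.12; on current versions use importlib.reload().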
Removed: dict.has_key(). Use the in operator.
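Several of the replacements listed above, sketched in Python 3. The classes, the add() function and the config dictionary are hypothetical names introduced only for this example.

```python
import functools
import sys


class Base:
    def greet(self):
        return "hello"


class Child(Base):
    def greet(self):
        # Zero-argument super(): class and instance are found automatically.
        return super().greet() + " world"


greeting = Child().greet()

# next() calls the iterator's __next__() method.
iterator = iter([10, 20])
first = next(iterator)


def add(a, b):
    return a + b


# f(*args) replaces apply(f, args).
args = (1, 2)
applied = add(*args)

# reduce() now lives in functools.
total = functools.reduce(add, [1, 2, 3, 4])

# The in operator replaces dict.has_key().
config = {"debug": True}
has_debug = "debug" in config

# intern() moved to sys.intern().
interned = sys.intern("some_identifier")
```

Each line maps to one entry in the list, so the sketch can double as a checklist when reading 2to3 output.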