Skip to content Skip to sidebar Skip to footer

How Can I Get 2.x-like Sorting Behaviour In Python 3.x?

I'm trying to replicate (and if possible improve on) Python 2.x's sorting behaviour in 3.x, so that mutually orderable types like int, float etc. are sorted as expected, and mutual

Solution 1:

Stupid idea: make a first pass to divide all the different items in groups that can be compared between each other, sort the individual groups and finally concatenate them. I assume that an item is comparable to all members of a group, if it is comparable with the first member of a group. Something like this (Python3):

import itertools

defpython2sort(x):
    it = iter(x)
    groups = [[next(it)]]
    for item in it:
        for group in groups:
            try:
                item < group[0]  # exception if not comparable
                group.append(item)
                breakexcept TypeError:
                continueelse:  # did not break, make new group
            groups.append([item])
    print(groups)  # for debuggingreturn itertools.chain.from_iterable(sorted(group) for group in groups)

This will have quadratic running time in the pathetic case that none of the items are comparable, but I guess the only way to know that for sure is to check all possible combinations. See the quadratic behavior as a deserved punishment for anyone trying to sort a long list of unsortable items, like complex numbers. In a more common case of a mix of some strings and some integers, the speed should be similar to the speed of a normal sort. Quick test:

In [19]: x = [0, 'one', 2.3, 'four', -5, 1j, 2j,  -5.5, 13 , 15.3, 'aa', 'zz']

In [20]: list(python2sort(x))
[[0, 2.3, -5, -5.5, 13, 15.3], ['one', 'four', 'aa', 'zz'], [1j], [2j]]
Out[20]: [-5.5, -5, 0, 2.3, 13, 15.3, 'aa', 'four', 'one', 'zz', 1j, 2j]

It seems to be a 'stable sort' as well, since the groups are formed in the order the incomparable items are encountered.

Solution 2:

This answer aims to faithfully re-create the Python 2 sort order, in Python 3, in every detail.

The actual Python 2 implementation is quite involved, but object.c's default_3way_compare does the final fallback after instances have been given a chance to implement normal comparison rules. This is after individual types have been given a chance to compare (via the __cmp__ or __lt__ hooks).

Implementing that function as pure Python in a wrapper, plus emulating the exceptions to the rules (dict and complex numbers specifically) gives us the same Python 2 sorting semantics in Python 3:

from numbers import Number


# decorator for type to function mapping special casesdefper_type_cmp(type_):
    try:
        mapping = per_type_cmp.mapping
    except AttributeError:
        mapping = per_type_cmp.mapping = {}
    defdecorator(cmpfunc):
        mapping[type_] = cmpfunc
        return cmpfunc
    return decorator


classpython2_sort_key(object):
    _unhandled_types = {complex}

    def__init__(self, ob):
       self._ob = ob

    def__lt__(self, other):
        _unhandled_types = self._unhandled_types
        self, other = self._ob, other._ob  # we don't care about the wrapper# default_3way_compare is used only if direct comparison failedtry:
            return self < other
        except TypeError:
            pass# hooks to implement special casing for types, dict in Py2 has# a dedicated __cmp__ method that is gone in Py3 for example.for type_, special_cmp in per_type_cmp.mapping.items():
            ifisinstance(self, type_) andisinstance(other, type_):
                return special_cmp(self, other)

        # explicitly raise again for types that won't sort in Python 2 eitheriftype(self) in _unhandled_types:
            raise TypeError('no ordering relation is defined for {}'.format(
                type(self).__name__))
        iftype(other) in _unhandled_types:
            raise TypeError('no ordering relation is defined for {}'.format(
                type(other).__name__))

        # default_3way_compare from Python 2 as Python code# same type but no ordering defined, go by idiftype(self) istype(other):
            returnid(self) < id(other)

        # None always comes firstif self isNone:
            returnTrueif other isNone:
            returnFalse# Sort by typename, but numbers are sorted before other types
        self_tname = ''ifisinstance(self, Number) elsetype(self).__name__
        other_tname = ''ifisinstance(other, Number) elsetype(other).__name__

        if self_tname != other_tname:
            return self_tname < other_tname

        # same typename, or both numbers, but different type objects, order# by the id of the type objectreturnid(type(self)) < id(type(other))


@per_type_cmp(dict)defdict_cmp(a, b, _s=object()):
    iflen(a) != len(b):
        returnlen(a) < len(b)
    adiff = min((k for k in a if a[k] != b.get(k, _s)), key=python2_sort_key, default=_s)
    if adiff is _s:
        # All keys in a have a matching value in b, so the dicts are equalreturnFalse
    bdiff = min((k for k in b if b[k] != a.get(k, _s)), key=python2_sort_key)
    if adiff != bdiff:
        return python2_sort_key(adiff) < python2_sort_key(bdiff)
    return python2_sort_key(a[adiff]) < python2_sort_key(b[bdiff])

I incorporated handling dictionary sorting as implemented in Python 2, since that'd be supported by the type itself via a __cmp__ hook. I've stuck to the Python 2 ordering for the keys and values as well, naturally.

I've also added special casing for complex numbers, as Python 2 raises an exception when you try sort to these:

>>> sorted([0.0, 1, (1+0j), False, (2+3j)])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: no ordering relation is defined forcomplex numbers

You may have to add more special cases if you want to emulate Python 2 behaviour exactly.

If you wanted to sort complex numbers anyway you'll need to consistently put them with the non-numbers group; e.g.:

# Sort by typename, but numbers are sorted before other typesifisinstance(self, Number) andnotisinstance(self, complex):
    self_tname = ''else:
    self_tname = type(self).__name__
ifisinstance(other, Number) andnotisinstance(other, complex):
    other_tname = ''else:
    other_tname = type(other).__name__

Some test cases:

>>>sorted([0, 'one', 2.3, 'four', -5], key=python2_sort_key)
[-5, 0, 2.3, 'four', 'one']
>>>sorted([0, 123.4, 5, -6, 7.89], key=python2_sort_key)
[-6, 0, 5, 7.89, 123.4]
>>>sorted([{1:2}, {3:4}], key=python2_sort_key)
[{1: 2}, {3: 4}]
>>>sorted([{1:2}, None, {3:4}], key=python2_sort_key)
[None, {1: 2}, {3: 4}]

Solution 3:

Not running Python 3 here, but maybe something like this would work. Test to see if doing a "less than" compare on "value" creates an exception and then do "something" to handle that case, like convert it to a string.

Of course you'd still need more special handling if there are other types in your list that are not the same type but are mutually orderable.

from numbers import Real
from decimal import Decimal

defmotley(value):
    numeric = Real, Decimal
    ifisinstance(value, numeric):
        typeinfo = numeric
    else:
        typeinfo = type(value)

    try:
        x = value < value
    except TypeError:
        value = repr(value)

    returnrepr(typeinfo), value

>>> printsorted([0, 'one', 2.3, 'four', -5, (2+3j), (1-3j)], key=motley)
[-5, 0, 2.3, (1-3j), (2+3j), 'four', 'one']

Solution 4:

One way for Python 3.2+ is to use functools.cmp_to_key(). With this you can quickly implement a solution that tries to compare the values and then falls back on comparing the string representation of the types. You can also avoid an error being raised when comparing unordered types and leave the order as in the original case:

from functools import cmp_to_key

defcmp(a,b):
    try:
        return (a > b) - (a < b)
    except TypeError:
        s1, s2 = type(a).__name__, type(b).__name__
        return (s1 > s2) - (s1 < s2)

Examples (input lists taken from Martijn Pieters's answer):

sorted([0, 'one', 2.3, 'four', -5], key=cmp_to_key(cmp))
# [-5, 0, 2.3, 'four', 'one']sorted([0, 123.4, 5, -6, 7.89], key=cmp_to_key(cmp))
# [-6, 0, 5, 7.89, 123.4]sorted([{1:2}, {3:4}], key=cmp_to_key(cmp))
# [{1: 2}, {3: 4}]sorted([{1:2}, None, {3:4}], key=cmp_to_key(cmp))
# [None, {1: 2}, {3: 4}]

This has the disadvantage that the three-way compare is always conducted, increasing the time complexity. However, the solution is low overhead, short, clean and I think cmp_to_key() was developed for this kind of Python 2 emulation use case.

Solution 5:

I tried to implement the Python 2 sorting c code in python 3 as faithfully as possible.

Use it like so: mydata.sort(key=py2key()) or mydata.sort(key=py2key(lambda x: mykeyfunc))

defdefault_3way_compare(v, w):  # Yes, this is how Python 2 sorted things :)
    tv, tw = type(v), type(w)
    if tv is tw:
        return -1ifid(v) < id(w) else (1ifid(v) > id(w) else0)
    if v isNone:
        return -1if w isNone:
        return1ifisinstance(v, (int, float)):
        vname = ''else:
        vname = type(v).__name__
    ifisinstance(w, (int, float)):
        wname = ''else:
        wname = type(w).__name__
    if vname < wname:
        return -1if vname > wname:
        return1return -1ifid(type(v)) < id(type(w)) else1defpy2key(func=None):  # based on cmp_to_keyclassK(object):
        __slots__ = ['obj']
        __hash__ = Nonedef__init__(self, obj):
            self.obj = func(obj) if func else obj

        def__lt__(self, other):
            try:
                return self.obj < other.obj
            except TypeError:
                return default_3way_compare(self.obj, other.obj) < 0def__gt__(self, other):
            try:
                return self.obj > other.obj
            except TypeError:
                return default_3way_compare(self.obj, other.obj) > 0def__eq__(self, other):
            try:
                return self.obj == other.obj
            except TypeError:
                return default_3way_compare(self.obj, other.obj) == 0def__le__(self, other):
            try:
                return self.obj <= other.obj
            except TypeError:
                return default_3way_compare(self.obj, other.obj) <= 0def__ge__(self, other):
            try:
                return self.obj >= other.obj
            except TypeError:
                return default_3way_compare(self.obj, other.obj) >= 0return K

Post a Comment for "How Can I Get 2.x-like Sorting Behaviour In Python 3.x?"