Sorting In Sparse Matrix
I have a sparse matrix. I need to sort this matrix row-by-row and create another [sparse] matrix. Code may explain it better: # for `rand` function, you need newer version of scipy
Solution 1:
If you're willing to ignore the zero-value elements of the matrix, the code below should work. It is also much faster than implementations that use the getrow method, which is rather slow.
from itertools import izip
def sort_coo(m):
    tuples = izip(m.row, m.col, m.data)
    return sorted(tuples, key=lambda x: (x[0], x[2]))
For example:
    >>> from numpy.random import rand
    >>> from scipy.sparse import coo_matrix
    >>>
    >>> d = rand(10, 20)
    >>> d[d > .05] = 0
    >>> s = coo_matrix(d)
    >>> sort_coo(s)
    [(0, 2, 0.004775589084940246),
     (3, 12, 0.029941507166614145),
     (5, 19, 0.015030386789436245),
     (7, 0, 0.0075044957259399192),
     (8, 3, 0.047994403933129481),
     (8, 5, 0.049401058471327031),
     (9, 15, 0.040011608000125043),
     (9, 8, 0.048541825332137023)]
Depending on your needs you may want to tweak the sort keys in the lambda or further process the output. If you want everything in a row indexed dictionary you could do:
from collections import defaultdict
sorted_rows = defaultdict(list)
for i in sort_coo(m):
     sorted_rows[i[0]].append((i[1], i[2]))
Solution 2:
My bad solution is like this:
from scipy.sparse import coo_matrix
import numpy as np
a = []
for i in xrange(m.shape[0]): # assume m is square matrix.
   d = m.getrow(i)
   n = len(d.indices)
   s = zip([i]*n, d.indices, d.data)
   sorted_s = sorted(s, key=lambda v: v[2], reverse=True)
   a.extend(sorted_s)
a = np.array(a)
new_m = coo_matrix((a[:,2], (a[:,0], a[:,1])), m.shape)
There can be some simple mistakes above because I have not checked it yet. But the idea is intuitive, I guess. Is there any good solution?
Edit
This new matrix creation may be useless because if you call getrow method then the order is broken again.
Only coo_matrix.col keeps the order.
Another Solution
This one is not exact solution but it may be helpful:
def sortSparseMatrix(m, rev=True, only_indices=True):
    """ Sort a sparse matrix and return column index dictionary
    """
    col_dict = dict() 
    for i in xrange(m.shape[0]): # assume m is square matrix.
        d = m.getrow(i)
        s = zip(d.indices, d.data)
        sorted_s = sorted(s, key=lambda v: v[1], reverse=True)
        if only_indices:
            col_dict[i] = [element[0] for element in sorted_s]
        else:
            col_dict[i] = sorted_s
    return col_dict
>>> print sortSparseMatrix(m)
{0: [5, 1, 0],
 1: [1, 3, 5],
 2: [1, 2, 3, 4],
 3: [1, 5, 2, 4],
 4: [0, 3, 5, 1],
 5: [3, 4, 2]}
Post a Comment for "Sorting In Sparse Matrix"