[GSoC 2019] WEEK 7!

This week is also dedicated to the implementation of new format for sparse array. #17160 I have figured out the way to dynamically handle the dimension problem for different formats.

The most difficult part is for Compressed Sparse Row (CSR) format. This format is used by Matlab and Julia as a default format. (In fact the exact foramt is Compressed Sparse Column, which is transposed storage for CSR format.) CSR format used 3 list for storage: a flattened list for all non-zero value, a list of column index for each non-zero value and a list of pointer for first non-zero value in each row. In a two-dimension matrix, the notion of row is straight forward. But how to define a row in a multidimensional array?

In order solve this problem, I have defined the last dimension of each array as a row and flattened the first n-1 dimensions, with n as the rank of the array. I this way, a sparse array can be stored in 3 lists as the way CSR proposes.

This format is said to have a very good performance in item accessing and slicing, as well as vector product. Unfortunately, I haven’t figured out how to implement the __mul__ function to make good use of this format.

By the end of this week, 3 formats are basically implemented: List Of List(LIL), Coordiante list(COO) and Compressed Sparse Row(CSR).