Velocity Reviews > Sparse Matrix implemented as a doubly-linked list

Sparse Matrix implemented as a doubly-linked list

Guest
Posts: n/a

 06-22-2007
Hi there,

I'm a novice C programmer working with a series of large (30,000 x
30,000) sparse matrices on a Linux system using the GCC compiler. To
represent and store these matrices, I'd like to implement the sparse
matrices as a doubly-linked list, in which each non-zero cell is
stored roughly as follows:

int rownum
int colnum
double cellvalue
cell *rightptr
cell *downptr

I think this implementation makes the most sense because I need to
traverse the matrix both across rows and down columns -- the i-j-th
cell of one matrix is calculated as the matrix-product of the i-th row
by the j-th column of the other matrix.

First, does this sound like the right approach? Any thoughts or other
ideas would be most appreciated.

Second, can anybody point me to some existing code that implements a
found dozens of libraries, but I don't know how to evaluate which of
them suits my needs best. Thanks very much,

Ben Pfaff
Guest
Posts: n/a

 06-22-2007
"(E-Mail Removed)" <(E-Mail Removed)> writes:

> I'm a novice C programmer working with a series of large (30,000 x
> 30,000) sparse matrices on a Linux system using the GCC compiler. To
> represent and store these matrices, I'd like to implement the sparse
> matrices as a doubly-linked list, in which each non-zero cell is
> stored roughly as follows:
>
> int rownum
> int colnum
> double cellvalue
> cell *rightptr
> cell *downptr

This sounds feasible, but I'd like to point out that this is not
what I would call a doubly linked list. The term "doubly linked
list" usually refers to a structure in which a pair of pointers
in each element link to the previous and the next item in the
list. In your structure, on the other hand, the pointers do not
allow a single list to be traversed in both forward and reverse
order. You have a pair of singly linked lists, not just one

> First, does this sound like the right approach? Any thoughts or other
> ideas would be most appreciated.

It sounds feasible, but I know very little about matrix arithmetic.
--
"IMO, Perl is an excellent language to break your teeth on"
--Micah Cowan

CBFalconer
Guest
Posts: n/a

 06-22-2007
"(E-Mail Removed)" wrote:
>
> I'm a novice C programmer working with a series of large (30,000 x
> 30,000) sparse matrices on a Linux system using the GCC compiler. To
> represent and store these matrices, I'd like to implement the sparse
> matrices as a doubly-linked list, in which each non-zero cell is
> stored roughly as follows:
>
> int rownum
> int colnum
> double cellvalue
> cell *rightptr
> cell *downptr
>
> I think this implementation makes the most sense because I need to
> traverse the matrix both across rows and down columns -- the i-j-th
> cell of one matrix is calculated as the matrix-product of the i-th row
> by the j-th column of the other matrix.

.....

Only if you build the matrix once and then peruse it. Otherwise
maintaining the pointers will be a nightmare. comp.programming is
probably a better newsgroup for this.

--
<http://www.cs.auckland.ac.nz/~pgut001/pubs/vista_cost.txt>
<http://www.securityfocus.com/columnists/423>
<http://www.aaxnet.com/editor/edit043.html>
cbfalconer at maineline dot net

--
Posted via a free Usenet account from http://www.teranews.com

Guest
Posts: n/a

 06-23-2007
http://www.velocityreviews.com/forums/(E-Mail Removed) wrote:
> Hi there,
>
> I'm a novice C programmer working with a series of large (30,000 x
> 30,000) sparse matrices on a Linux system using the GCC compiler. To
> represent and store these matrices, I'd like to implement the sparse
> matrices as a doubly-linked list, in which each non-zero cell is
> stored roughly as follows:
>
> int rownum
> int colnum
> double cellvalue
> cell *rightptr
> cell *downptr
>
> I think this implementation makes the most sense because I need to
> traverse the matrix both across rows and down columns -- the i-j-th
> cell of one matrix is calculated as the matrix-product of the i-th row
> by the j-th column of the other matrix.
>
> First, does this sound like the right approach? Any thoughts or other
> ideas would be most appreciated.

It sounds reasonable if you access in the order you describe.

I suggest separating the access routines so that the application code
doesn't care what structure is being used for the sparse array.

Following is an example of a header file that I would define if I were
implementing such a 2D sparse array. The interface doesn't rely on the
structure used to implement the sparse array. You could try different
techniques for implementing the array without changing your application.
Someone here has been known to provide links to a hash table library
that could provide the basis for one implementation.

Comments welcome on the proposed interface, of course.

/* Dynamic Array Interface */

/* darray is a pointer to a dynamic sparse array descriptor. */
typedef struct darray_desc *darray;

/* Function: DarrayCreate
** Description: This function creates a dynamic sparse two-
** dimensional array with default values of zero.
** It is dynamic in the sense that storage is
** allocated for elements as stored, so a fixed size
** is not needed when the array is created.
*/
darray * /* ptr to descriptor or NULL if out of memory */
DarrayCreate (void);

/* Function: DarrayStore
** Description: This function stores a value in a specified location
** of dynamic array a. If an error occurs, x will
** not be stored.
** On entry: a is the NULL value returned by DarrayCreate.
** On exit: if return 0, x has been stored in a[i][j].
*/
int /* 0: OK, 1: out of memory */
DarrayStore (
darray a, /* darray to receive value */
int i, /* row index */
int j, /* column index */
double x /* value to be stored */
);

/* Function: DarrayGet
** Description: This function returns the last-stored value in
** a[i][j], where a is a dynamic array. If no value
** has been stored in a[i][j], 0.0 is returned.
** On entry: a is the value returned by DarrayCreate.
** On exit: a[i][j] is returned.
*/
double /* last stored value of a[i][j] */
DarrayGet (
darray a, /* darray source */
int i, /* row index */
int j /* column index */
);

/* Function: DarrayError
** Description: This function returns status indicating whether
** an access error has occurred as a result of
** calling DarrayCreate, DarrayStore, or DarrayGet
** since the call to DarrayCreate for this array.
** On entry: a is the value returned by DarrayCreate.
*/
int /* 0: no error, 1: out of memory */
DarrayError (
darray a /* darray being checked */
);

/* Function: DarrayFree
** Description: This function frees memory allocated for a dynamic
** array.
** On entry: a is the non-NULL value returned by DarrayCreate.
** On exit: The memory used for a and associated data have
** been deallocated. The value *a is no longer
** valid.
*/
void DarrayFree (
darray a /* darray to be freed */
);

/* Sample use:
** darray a, b;
** a = DarrayCreate();
** b = DarrayCreate();
** DarrayStore(a, 10, 200, 123.);
** DarrayStore(b, 3000, 4000,
** DarrayGet(a, 10, 200) + DarrayGet(b, 3001, 4000));
** // b[3000][4000] = 123.
** if (DarrayError(a) || DarrayError(b)) {
** printf ("Darray error\n");
** }
** DarrayFree(a);
** DarrayFree(b);
*/

--

Roberto Waltman
Guest
Posts: n/a

 06-24-2007
<(E-Mail Removed)> wrote:
> ...working with a series of large (30,000 x
>30,000) sparse matrices on a Linux system using the GCC compiler. To
>represent and store these matrices, I'd like to implement the sparse
>matrices as a doubly-linked list, in which each non-zero cell is
>stored roughly as follows:
>
>int rownum
>int colnum
>double cellvalue
>cell *rightptr
>cell *downptr

Unless your needs are unique, or you want to implement your own
library as a learning stepping stone to greater things, I think you
would be better off using an already invented wheel.

>I think this implementation makes the most sense because I need to
>traverse the matrix both across rows and down columns -- the i-j-th
>cell of one matrix is calculated as the matrix-product of the i-th row
>by the j-th column of the other matrix.
>
>First, does this sound like the right approach? Any thoughts or other
>ideas would be most appreciated.

That may or not be adequate depending on the distribution pattern of
the non-zero cells, which in turn, depends on the problems you are
trying to solve. For example, if all the non-zero elements in a single
column are clustered in one (or at most a few) group(s) of consecutive
cells, other implementations could be used, more space/time/both
efficient. (Search for "sparse matrix collection" for more examples)

>Second, can anybody point me to some existing code that implements a
>sparse matrix as a doubly-linked list?

As long as a sparse matrix package provides the functionality and
performance you need, for the particular type of problems you need to
solve, why should you care if internally it uses single, double, or

As a starting point, check the CSparse package from the book "Direct
Methods for Sparse Linear Systems":

http://www.ec-securehost.com/SIAM/FA02.html
http://www.cise.ufl.edu/research/sparse/CSparse/
http://www.cise.ufl.edu/research/sparse/

Source code snippet:
....

/* --- primary CSparse routines and data structures
------------------------- */
typedef struct cs_sparse /* matrix in compressed-column or
triplet form */
{
int nzmax ; /* maximum number of entries */
int m ; /* number of rows */
int n ; /* number of columns */
int *p ; /* column pointers (size n+1) or col indices (size
nzmax) */
int *i ; /* row indices, size nzmax */
double *x ; /* numerical values, size nzmax */
int nz ; /* # of entries in triplet matrix, -1 for
compressed-col */
} cs ;
....

> I've Googled sparse matrix and
>found dozens of libraries, but I don't know how to evaluate which of
>them suits my needs best. Thanks very much,

Right. There are many sparse matrix libraries, some generic, some
optimized for a particular problem domain.
A group or mailing list on mathematics may be a better place than

Roberto Waltman

Guest
Posts: n/a

 06-25-2007
Thank you all for the comments. I would definitely prefer to find a
suitable library rather than write my own. Because I need to traverse
both rows and columns, I've been looking for an implementation like
the one I described -- CSparse (for example, but many others are
similar) uses a column-dominant implementation, meaning row-based
operations will be much slower. Having said that, Roberto Waldman is
right, all I care about is whether the library will be "fast enough",
it need not be optimally fast. I'll give it a try. Thanks again
everyone,