Sparse Matrix—A Case Study in Layered Abstraction

Thomas F. Hain, Ph.D.

School of Computer & Information Sciences,

University of South Alabama

Mobile, Alabama

hain@cis.usouthal.edu

© May 5, 1999

Exposition for Instructor

A sparse matrix behaves like a two dimensional array, with the characteristic that most of the elements are zero. To save memory space, only the nonzero elements are actually saved, together with enough information to save or infer the indexes. In the described laboratory assignment, the mechanism of the sparse matrix is an array (row index) of linked lists storing only the nonzero values within a row, together with the column index of the saved element.

A sparse matrix design/implementation has been a very successful second laboratory assignment in a Data Structures course using an object-oriented implementation language (here using C++ as the sample language). In my case, the students have had CS1 and CS2 (using Java). In Data Structures I emphasize the use of C++ in the implementation of ADTs, and do not place much emphasis on inheritance and polymorphism, leaving those issues to CS2, Computer Languages, and Software Engineering. The first lab taken by the students is a very short, get-acquainted-with-the-syntax, exercise.

The educational goals of the described lab are to:

 

Laboratories

The lab assignment could be to design and implement a sparse matrix class to support a client program provided by the instructor. This client program may be nothing more than a test driver. Naturally, the public interface for the sparse matrix class would have to be specified, or can be induced from the client program. Associated with this simple lab specification must be a class discussion or handout of the mechanics of a sparse matrix (see below), and appropriate layers of abstraction. The layers of abstraction imply the design, implementation, and testing appropriate classes.

The layers are:

1. A keyed data element in a list, consisting of a data value, and a key (later used as an index). This is a rather trivial class, but eventually allows the clean separation of data from links in a linked list node.

2. A list of data values, each with an associated key, implemented as a linked list.

3. A sparse vector, whose public behavior mimics that of an ordinary one dimensional array, but whose implementation is a (linked) list of all non-zero data values, each with an associated index, in the array.

4. A sparse array, whose public interface mimics that of an ordinary two dimensional array, but whose underlying implementation is an array of sparse matrices.

Another approach would be to break the assignment into two labs or phases:

1. Design, implement, and test an unsorted list of values together with an integer key, with the usual operations: search for element (by key), insert element (maintaining unique keys), delete element, copy list, etc. The implementation should be based on the idea of a linked list class. Discuss the relative advantages/disadvantages of singly versus doubly linked, linear versus circular, sentinel controlled versus no sentinel, lists. It should be clear that singly linked, circular, sentinel controlled lists fit the bill and not cause excessive overhead.

2. Design, implement, and test a sparse vector class using the list developed in lab 1. Then design, implement and test a sparse array using the sparse vector class

Yet a third approach would be to break the project into three phases:

1. Create a list class, as above.

2. Create a sparse vector class, using the list class.

3. Use C++ templates to make the list into a generic list, and do the same for the sparse vector class. Then build a generic sparse array class using the generic sparse vector class.

 

Guide to Instructor

One of the most difficult aspect in the implementation of a sparse vector is to provide a natural looking interface. C++ has the mechanisms to provide that, in this case through operator overloading. We would like to support the client of the sparse vector class to be able to do things like:

1. sv[4] = 5;

2. x = sv[4] + 5;

3. x = 5 + sv[5];

4. cout << sv[5];

5. cout << sv;

6. sv[1] = sv[2] = sv[3] = 0;

It should be noted that this most natural syntax, and the concomitant expected semantics, is partly provided by overloading the index operator in C++, operator[]. However, the return type of operator[] is used as both an lvalue (lines 1 and 6), and rvalue (lines 2, 3, 4, and 6.) To provide the described behavior requires that the returned type of operator[] be something that has a type that I call a ValueAgent. In the ValueAgent class one can overload the assignment operator, which will take care of lines 1 and 6 above, one can provide a cast operation (which will be used automatically) to type Value (the type of the sparse matrix elements), for lines 2, 3, and 6, and one can overload operator<<(ostream&, ValueAgent&) to provide the correct behavior for lines 4 and 5.

Lab Specification Handout

The following lab handout was given to the students. It is short and sweet by the nature of the assignment, but followed up with an exposition in class, and periodic class discussions as needed. It was regarded as a difficult, but satisfying lab.

Specification of Sparse Array Abstract Data Type

You are to implement an abstract data type for a sparse (2 dimensional) array, using the class mechanism of C++ to provide appropriate data abstraction and encapsulation. The operations to be provided are

SparseArray a(100,200), b(100,200) // declaration/definition with array // size parameters.

c = a[i][j] // array element copy with bounds checking.

a[i][j] = c // array element update with bounds checking. b = a; // array assignment

 

Implementation Notes

The implementation of the sparse array will be as an array of sparse vectors. Thus a sparse vector class will be built.

The sparse vector will be implemented as a linked list of nodes each containing data of type ElementType (e.g. int) and an index. Nodes themselves will be implemented as a simple class which will have class SparseVector as a "friend". More details will be given in class (note the overloading of this word!).

Client Program

The client program will be given, and will only involve sparse array operations (i.e., on objects of type SparseArray).

 

Sample Code

The following sample code contains a full and tested implementation of the project using C++. The files listed are:

The output of the test driver is also provided.

 

// misc.h

////////////////////////////////////////////////////////////////////

//

// Miscellaneous declarations and directives

// Written by T.F. Hain, Jan '98

//

////////////////////////////////////////////////////////////////////

#ifndef _MISC_H

#define _MISC_H

#include <iostream.h>

#include <iomanip.h>

#include <stdlib.h>

#include <alloc.h>

//#define ANSI

#define BORLANDC++

#ifdef BORLANDC++

// used to test for existance of memory leaks (Borland C++)

#define BEGIN { unsigned long core=coreleft(); {

#define END } cout << "core used: " << (core-coreleft()) << endl; return 0; }

#else

#define BEGIN {

#define END }

#endif

typedef enum {FALSE, TRUE} Bool;

inline void testAlloc(void *v)

{

if (!v) {

cerr << "Memory allocation error/n/n";

exit(1);

}

}

#endif

=========================================================================

// list.h

///////////////////////////////////////////////////////////////////////

//

// Element, ListNode and List classes.

// These are helper classes used by client class SparseVector.

//

// Written by T. Hain, April 1998

//

///////////////////////////////////////////////////////////////////////

#ifndef _LIST_H

#define _LIST_H

#include "misc.h"

typedef int Key;

typedef int Value;

///////////////////////////////////////////////////////////////////////

class Element {

Key ndx_;

Value value_;

Element();

public:

Element(Key _ndx, Value _value);

Value value() const;

Key index() const;

};

///////////////////////////////////////////////////////////////////////

class ListNode {

Key ndx_;

Element element_;

ListNode *next;

ListNode();

ListNode(Key _ndx, Element _element, ListNode *_next = NULL);

friend class List;

};

///////////////////////////////////////////////////////////////////////

//

// class List

//

// List is implemented as a circular singly linked list, with a sentinel.

// It is not explicitely accessed by the client.

//

// void List::copyList(ListNode *s)

// pre: current list has only sentinel node.

// s points to sentinel of list to be copied to current list.

// post: current list contains a copy of the list pointed to by s.

//

// ListNode* List::parent(Key k)

// pre: none

// post: returns pointer to parent ListNode of node containing

// Key k, or NULL if Key k is not found in current list.

//

// void List::add(Element e)

// pre: none.

// post: element e is added to the front of the list.

// Duplicate keys are not tested for.

//

// void List::remove(Key k)

// pre: none

// post: first element with Key k is removed. If k is not

// found, no effect.

//

// Element* List::getElement(Key ndx)

// pre: a valid (possibly empty) list exists.

// post: pointer to element with Key ndx is returned, or NULL

// if ndx does not exist.

//

// void List::removeAll()

// pre: a valid (possibly empty) list exists.

// post: the list contains only a sentinel node (i..e., is empty.)

//

// int List::size() const

// pre: none

// post number of (nonzero) elements in list.

//

///////////////////////////////////////////////////////////////////////

class List {

int size_;

ListNode *sentinel;

void copyList(ListNode *);

ListNode* parent(Key);

public:

List();

List(const List &l);

~List();

const List& operator=(const List &l);

void add(Element);

void remove(Key);

Element* getElement(Key ndx);

void removeAll();

int size() const;

};

///////////////////////////////////////////////////////////////////////

inline Element::Element()

{}

inline Element::Element(Key _ndx, Value _value): ndx_(_ndx), value_(_value)

{}

inline Value Element::value() const

{ return value_; }

inline Key Element::index() const

{ return ndx_; }

inline ListNode::ListNode(Key _ndx, Element _element, ListNode *_next)

: ndx_(_ndx), element_(_element), next(_next)

{}

inline List::List() : size_(0)

{

testAlloc(sentinel = new ListNode((Key)0, Element(0,0)));

sentinel->next = sentinel;

}

inline List::~List()

{

removeAll();

delete sentinel;

}

inline Element* List::getElement(Key ndx)

{

ListNode *p = parent(ndx);

return p ? &(p->next->element_) : NULL;

}

inline int List::size() const

{ return size_; }

#endif

=========================================================================

// list.cc

///////////////////////////////////////////////////////////////

//

// Implementation file for Element and List classes

// Written by T. Hain, April 1998

//

///////////////////////////////////////////////////////////////

#include "list.h"

//-------------------------------------------------------------

List::List(const List &l) : size_(l.size_)

{

testAlloc(sentinel = new ListNode((Key)0, Element(0,0)));

sentinel->next = sentinel;

copyList(l.sentinel);

}

//-------------------------------------------------------------

void List::removeAll()

{

for (ListNode *p = sentinel->next; p != sentinel; ) {

ListNode *q = p;

p = p->next;

delete q;

}

sentinel->next = sentinel;

size_ = 0;

}

//-------------------------------------------------------------

void List::copyList(ListNode *s)

{

for (ListNode *p = s->next, *q = sentinel; p != s; p = p->next, q = q->next)

testAlloc(q->next = new ListNode(p->element_.index(), p->element_));

q->next = sentinel;

}

//-------------------------------------------------------------

ListNode* List::parent(Key _ndx)

{

sentinel->ndx_ = _ndx;

for (ListNode *p = sentinel; p->next->ndx_ != _ndx; p = p->next);

return (p->next == sentinel) ? NULL : p;

}

//-------------------------------------------------------------

const List& List::operator=(const List &l)

{

removeAll();

copyList(l.sentinel);

size_ = l.size_;

return *this;

}

//-------------------------------------------------------------

void List::add(Element e)

{

testAlloc(sentinel->next = new ListNode(e.index(), e, sentinel->next));

++size_;

}

//-------------------------------------------------------------

void List::remove(Key _ndx)

{

ListNode *p = parent(_ndx);

if (p) {

ListNode *c = p->next;

p->next = p->next->next;

delete c;

--size_;

}

}

=========================================================================

// spvec.h

//////////////////////////////////////////////////////////////////////

//

// SparseVector class header

// Written by T. Hain, April 1998

//

// Public Interface:

//

// SparseVector(Index ndx)

// pre: none

// post: (sparse) vector with index values in range 0 - (ndx-1)

// is created. Parameter ndx has default value of 0.

//

// operator[](Index ndx)

// pre: none

// post: indexed element of sparse vector is returned, and can

// be used either as lvalue or rvalue.

// exmpls: sv[3] = 3;

// a = sv[4] + 3;

// a = 3 + sv[4];

// cout << sv[4];

//

// void setSize(Index max)

// pre: no non-zero element already exists having index >= max.

// post: (sparse) vector has index range 0 - (max-1).

// E.g., used to initialize arrays of sparse vectors.

//

//////////////////////////////////////////////////////////////////////

#ifndef _SPVEC_H

#define _SPVEC_H

#include "list.h"

typedef Key Index;

class ValueAgent;

////////////////////////////////////////////////////////////////////////

class SparseVector {

List nonZeroList;

Index maxIndex;

public:

SparseVector(Index max = 0);

ValueAgent operator [](Index ndx);

void setSize(Index max);

friend class ValueAgent;

};

///////////////////////////////////////////////////////////////////////

//

// The ValueAgent class is not explicitely used by the (public) client,

// but is used to allow vector elements to be used as both

// lvalues and as rvalues. Objects of this type are returned from

// SparseVector index operations, and act as an agent to provide correct

// behaviour as lvalue (assignment) and rvalue (returning object of type

// Value).

//

// Value() const

// pre: none

// post: casts ValueAgent to public type Value for rvalue usage,

// e.g., in "x = sv[3];"

//

// operator=(Value)

// pre: none

// post: allows sparse vector element to be used as lvalue.

// e.g., in "sv[3] = x;"

//

// ostream& operator<<(ostram os, const ValueAgent va)

// pre: output device is open. Object of type Value has << operator.

// post: allows sparse vector element to be printed,

// e.g., "cout << sv[3];"

//

///////////////////////////////////////////////////////////////////////

class ValueAgent {

SparseVector *sparseVectorP;

Key ndx;

public:

ValueAgent(SparseVector *svP, Index index);

operator Value() const;

const ValueAgent& operator = (Value);

friend ostream& operator<<(ostream &os, const ValueAgent &vo);

};

///////////////////////////////////////////////////////////////////////

//

// inline definitions

//

///////////////////////////////////////////////////////////////////////

 

inline SparseVector::SparseVector(Index max): nonZeroList(), maxIndex(max)

{}

inline ValueAgent SparseVector::operator [](Key ndx)

{ return ValueAgent(this, ndx); }

inline void SparseVector::setSize(Index max)

{ maxIndex = max; }

inline ValueAgent::ValueAgent(SparseVector *svP, Key index)

: sparseVectorP(svP), ndx(index)

{}

inline ValueAgent::operator Value() const

{

Element *e = sparseVectorP->nonZeroList.getElement(ndx);

return e ? e->value() : 0;

}

inline ostream& operator<<(ostream &os, const ValueAgent &vo)

{

os << Value(vo);

return os;

}

#endif

=========================================================================

// spvec.cc

///////////////////////////////////////////////////////////

//

// SparseVector class implementation file

// Written by T. Hain, 1998

//

///////////////////////////////////////////////////////////

#include "spvec.h"

 

const ValueAgent& ValueAgent::operator = (Value value)

{

sparseVectorP->nonZeroList.remove(ndx);

if (value)

sparseVectorP->nonZeroList.add(Element(ndx, value));

return *this;

}

=========================================================================

#ifndef _SPARR_H

#define _SPARR_H

// sparr.h

//////////////////////////////////////////////////////////////////

//

// Class SparseArray -- a sparse array class

// Written by T. Hain, April, 1998

//

// Public interface:

//

// SparseArray(Index maxI, Index maxJ)

// Pre: maxI, maxJ >= 0

// Post: sparse array with given dimensions is created, and

// filled with all zeros.

//

// operator=(SparseArray sa)

// Pre: none

// Post: sa is assigned to the current matrix. Dimensions of

// both matrices are checked to be the same, otherwise

// program is terminated with error message.

//

// SparseVector operator[](Index i)

// Pre: none

// Post: returns the sparse vector in the "i"th row.

// Notes: generally used in conjuction with class SparseVector, e.g.,

// sa[2][3] = x;

// x = sa[2][3];

// cout << sa[2][3];

//

// int maxIndexI() const, int MaxIndexJ() const

// Pre: none

// Post: returns the i and j dimensions of the matrix

//

//////////////////////////////////////////////////////////////////

#include "spvec.h"

class SparseArray {

SparseVector *svp;

Index maxNdxI, maxNdxJ;

public:

SparseArray(Index maxI, Index maxJ);

SparseArray(SparseArray &sa);

~SparseArray();

SparseArray& operator=(SparseArray &sa);

SparseVector& operator[](Index i) const;

Index maxIndexI() const;

Index maxIndexJ() const;

friend ostream& operator<<(ostream &os, const SparseArray &sa);

};

//////////////////////////////////////////////////////////////////

//

// Implementation Notes

//

// SparseArray is implemented in terms of SparseVector.

// SparseVector uses ValueAgent to allow vector element to be used

// as both lvalues and rvalues.

// SparseVector is implemented in terms of List, and its helper

// class, Element and ListNode.

// List stores only nonzero elements of a SparseVector as a circular,

// sentinel controlled linked list in non-sorted order, as their

// Key/Value tuples.

//

//////////////////////////////////////////////////////////////////

//////////////////////////////////////////////////////////////////

//

// inline definitions

//

//////////////////////////////////////////////////////////////////

inline SparseArray::~SparseArray()

{ delete [] svp; }

inline Index SparseArray::maxIndexI() const

{ return maxNdxI; }

inline Index SparseArray::maxIndexJ() const

{ return maxNdxJ; }

#endif

=========================================================================

// sparr.cc

///////////////////////////////////////////////////////////

//

// SparseArray class implementation file

// Written by T. Hain, 1998

//

///////////////////////////////////////////////////////////

 

#include "sparr.h"

SparseArray::SparseArray(Index maxSizeI, Index maxSizeJ)

{

maxNdxI = maxSizeI-1;

maxNdxJ = maxSizeJ-1;

testAlloc(svp = new SparseVector[maxSizeI]);

for (int i = 0; i <= maxNdxI; i++)

svp[i].setSize(maxSizeJ);

}

SparseArray::SparseArray(SparseArray &sa)

{

maxNdxI = sa.maxNdxI;

maxNdxJ = sa.maxNdxJ;

testAlloc(svp = new SparseVector[maxNdxI+1]);

for (int i = 0; i <= maxNdxI; i++)

svp[i] = sa.svp[i];

}

SparseArray& SparseArray::operator=(SparseArray &sa)

{

if (this == &sa) return *this;

if ((maxNdxI == sa.maxNdxI) && (maxNdxJ == sa.maxNdxJ))

for (int i = 0; i <= maxNdxI; i++)

svp[i] = sa.svp[i];

else {

cerr << ("SparseArray: cannot assign to a sparse array of diff. size");

exit(1);

}

return *this;

}

SparseVector& SparseArray::operator[](Index i) const

{

if (i > maxNdxI) {

cerr << ("SparseArray: array index (in reference) is out of bounds");

exit(1);

}

return svp[i];

}

ostream& operator<<(ostream &os, const SparseArray &sa)

{

for (int i = 0; i <= sa.maxIndexI(); i++) {

for (int j = 0; j <= sa.maxIndexJ(); j++)

cout << setw(3) << sa[i][j] << " ";

cout << endl;

}

cout << endl;

return os;

}

=========================================================================

// spvec.cc

/////////////////////////////////////////////////////////////

//

// Test driver for SparseVector class

// Written by T. Hain, April 1998

//

/////////////////////////////////////////////////////////////

#include "spvec.h"

main()

{

SparseVector sv1(4);

sv1[0] = 5; // tests of sparse vector element update

sv1[1] = 1;

sv1[2] = 2;

sv1[2] = 0;

sv1[3] = 4;

sv1[3] = 3;

SparseVector sv2 = sv1, sv3(5); // test sparse vector copy constructor

sv3 = sv2; // test sparse vector assignment

for (int i = 0; i < 4; i++) // test sparse vector element access

cout << "sv2[" << i << "] = " << sv2[i] << "\n";

return 0;

}

=========================================================================

output from spvect.cc

sv2[0] = 5

sv2[1] = 1

sv2[2] = 0

sv2[3] = 3

=========================================================================

// sparrt.cc

//////////////////////////////////////////////////////////////////////

//

// Sparse Array test driver

// Written by T. Hain, April 1998

//

//////////////////////////////////////////////////////////////////////

#include "sparr.h"

main()

BEGIN

SparseArray sa(5,5);

sa[1][1] = 0; // test sparse array element update

sa[2][3] = 23;

sa[2][3] = 0;

sa[0][0] = 100;

sa[0][2] = 2;

sa[4][4] = 44;

sa[1][4] = sa[4][1] = 14; // test sparse array element access

sa[4][1] = 41;

sa[2][0] = sa[1][4] + 6; // test use in expression

sa[3][2] = 18 + sa[4][1]; // test use in expression

SparseArray sa2 = sa, sa3(5,5); // test copy constructor

sa3 = sa2; // test assignment

cout << sa;

cout << sa3;

END

=========================================================================

output from sparr.cc

100 0 2 0 0

0 0 0 0 14

20 0 0 0 0

0 0 59 0 0

0 41 0 0 44

100 0 2 0 0

0 0 0 0 14

20 0 0 0 0

0 0 59 0 0

0 41 0 0 44

core used: 0