Programming, Scripting and Linux: December 2009

Tuesday, December 22, 2009

C++ Defining Multi Dimensional Arrays with empty brackets []

An array with 1 dimension can be defined as
int matrixA[] = {1,2,3,4};

Say we want to define a 2 dimensional matrix. It seems logical to do:

int matrixA[][] = {
                   {1,2,3,4},
                   {5,6,7,8},
                   {1,3,5,7},
                   {2,4,6,8}
    };

However this gives a compile error:

error: declaration of `matrixA' as multidimensional array must have bounds for all dimensions except the first

If we get rid of the inner curly brackets, like this, we still get the same error:
int matrixA[][] = { 1,2,3,4, 5,6,7,8, 1,3,5,7, 2,4,6,8 };

Note: because every row here is fully populated, the inner curly brackets make no difference to the compiler; they just make the code more readable for humans.

Finally, if we drop one pair of square brackets from our definition, it will compile:
int matrixA[] = { 1,2,3,4, 5,6,7,8, 1,3,5,7, 2,4,6,8 };

However, we have now declared a 1 dimensional matrix and not a 2 dimensional matrix, which is not what we wanted.

But here is the clue as to what is going wrong. When we initialise a multi dimensional array like this:
int matrixA[4][4] = { 1,2,3,4, 5,6,7,8, 1,3,5,7, 2,4,6,8 };

and we access the last element by matrixA[3][3]

As the array is stored as one block of memory, this is equivalent to
int matrixA[16] = { 1,2,3,4, 5,6,7,8, 1,3,5,7, 2,4,6,8 };

and we access the last element by matrixA[15]
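
The flat layout described above can be checked directly. The following is only a sketch for illustration: the helper names flatIndex and sameLayout are made up here and are not part of the original example. It copies the bytes of a 4x4 array into a 16 element array with std::memcpy and compares every element using the row-major formula row * 4 + col.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: row-major position of [row][col] in a matrix
// that has `cols` elements per row.
std::size_t flatIndex(std::size_t row, std::size_t col, std::size_t cols) {
    return row * cols + col;
}

// Hypothetical check: copy the raw bytes of a 4x4 array into a flat
// 16 element array and confirm that grid[r][c] == flat[r * 4 + c].
bool sameLayout() {
    int grid[4][4] = { {1,2,3,4}, {5,6,7,8}, {1,3,5,7}, {2,4,6,8} };
    int flat[16];

    // Both objects must occupy 16 * sizeof(int) bytes.
    if (sizeof grid != sizeof flat) return false;
    std::memcpy(flat, grid, sizeof grid);

    for (std::size_t r = 0; r < 4; ++r)
        for (std::size_t c = 0; c < 4; ++c)
            if (grid[r][c] != flat[flatIndex(r, c, 4)]) return false;
    return true;
}
```

Here flatIndex(3, 3, 4) gives 15, which matches the matrixA[3][3] and matrixA[15] pair above.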

Since the compiler just allocates 16 consecutive elements when we declare matrixA[4][4], the 4 rows and 4 columns exist only in the array's type and not in the memory layout. Therefore when we do

int matrixA[] = { 1,2,3,4, 5,6,7,8, 1,3,5,7, 2,4,6,8 };

the compiler knows that this is the same as
int matrixA[16] = { 1,2,3,4, 5,6,7,8, 1,3,5,7, 2,4,6,8 };

But when we do
int matrixA[][] = { {1,2,3,4}, {5,6,7,8}, {1,3,5,7}, {2,4,6,8} };

which to the compiler is the same as
int matrixA[][] = { 1,2,3,4, 5,6,7,8, 1,3,5,7, 2,4,6,8 };

The compiler does not know that what we want is a table of 4 rows and 4 columns: 16 values could equally be 2 rows of 8 or 8 rows of 2.

However we can help the compiler by telling it that this is a 2 dimensional array where the number of elements in each row is 4:

int matrixB[][4] = { {1,2,3,4}, {5,6,7,8}, {1,3,5,7}, {2,4,6,8} };

or (when the inner brackets are removed)

int matrixB[][4] = { 1,2,3,4, 5,6,7,8, 1,3,5,7, 2,4,6,8 };
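
As a rough check of what the compiler deduces, the sketch below recovers the row count with sizeof. The helper names rowCount, colCount and paddedRows, and the second array padded, are assumptions made for this illustration only.

```cpp
#include <cstddef>

// Only the first bound is left empty; the compiler deduces it
// from the number of rows in the initializer.
int matrixB[][4] = { {1,2,3,4}, {5,6,7,8}, {1,3,5,7}, {2,4,6,8} };

// Hypothetical helpers: recover the deduced sizes with sizeof.
std::size_t rowCount() { return sizeof matrixB / sizeof matrixB[0]; }
std::size_t colCount() { return sizeof matrixB[0] / sizeof matrixB[0][0]; }

// A short last row is still a full row of 4: the missing
// elements are zero-initialized.
int padded[][4] = { {1,2,3,4}, {5,6} };
std::size_t paddedRows() { return sizeof padded / sizeof padded[0]; }
```

For matrixB both helpers return 4, and padded ends up with 2 rows where padded[1][2] and padded[1][3] are 0.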

Since we discussed that
int matrixA[4][4]
is the same as
int matrixA[16]

You might think we could try to cast a one dimensional matrix to a 2 dimensional matrix like this:

int matrixA[] = { 1,2,3,4, 5,6,7,8, 1,3,5,7, 2,4,6,8 };
    int matrixB[4][4];
    matrixB = matrixA;

But this gives us

error: incompatible types in assignment of `int[16]' to `int[4][4]'

Even using a reinterpret_cast does not help e.g.

typedef int** matrix_type;
int matrix3[4][4] = reinterpret_cast<matrix_type>(matrixA);

gives us

error: invalid initializer

If we declare matrix3 this way:

typedef int** matrix_type;
int** matrix3 = reinterpret_cast<matrix_type>(matrixA);

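The code will compile, but the result is not safe to use. An int** expects to point at an array of pointers, whereas matrixA holds plain int values, so matrix3[i][j] would treat an int as an address. I have not found any documentation that says this is legal, and at best we can expect undefined behaviour.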
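
Two alternatives that avoid the cast are sketched below. This is not from the original post: the names ROWS, COLS, at and toMatrix are hypothetical, and the sketch assumes a fixed 4x4 size. The first option keeps the one dimensional array and computes the position by hand; the second copies the 16 values into a real 2 dimensional array.

```cpp
#include <cstddef>
#include <cstring>

const std::size_t ROWS = 4;
const std::size_t COLS = 4;

// Option 1 (hypothetical helper): keep the flat array and
// compute the row-major position by hand.
int at(const int* flat, std::size_t row, std::size_t col) {
    return flat[row * COLS + col];
}

// Option 2 (hypothetical helper): copy the 16 values into a real
// 4x4 array. Arrays cannot be assigned, but their bytes can be copied.
void toMatrix(const int (&flat)[ROWS * COLS], int (&out)[ROWS][COLS]) {
    std::memcpy(out, flat, sizeof out);
}
```

With the matrixA values used above, at(matrixA, 2, 1) reads the same element that matrixB[2][1] holds after toMatrix(matrixA, matrixB).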

Another behaviour you may find when working with multi dimensional arrays is that while these are valid:
int matrixA[4][4] = { {1,2,3,4}, {5,6,7,8}, {1,3,5,7}, {2,4,6,8} };
int matrixA[4][4] = { 1,2,3,4 , 5,6,7,8 , 1,3,5,7 , 2,4,6,8 };

This is not valid:
int matrixA[16] = { {1,2,3,4}, {5,6,7,8}, {1,3,5,7}, {2,4,6,8} };
It gives error: brace-enclosed initializer used to initialize 'int'

Although the compiler accepts the definition of matrixA[4][4] with or without the inner brackets, once you supply inner brackets it expects a multi dimensional declaration such as matrixA[4][4] and does not accept matrixA[16].
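
One case where the inner brackets do change the result is a partially filled row. The sketch below is an added illustration with made-up array names (braced and plain), not part of the examples above.

```cpp
// Inner brackets start a new row, so each row is zero-filled separately.
int braced[2][3] = { {1}, {2} };   // rows: 1 0 0 / 2 0 0

// Without inner brackets the values simply fill the memory in order.
int plain[2][3] = { 1, 2 };        // rows: 1 2 0 / 0 0 0
```

So the two spellings are only interchangeable when every row is written out in full, as in the 4x4 examples.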

Monday, December 7, 2009

Compiling C++ pthreads code from vim or eclipse

In vim I use this setting to compile an ordinary C++ program:
set makeprg=g++\ -ggdb\ -o\ %<\ %

But if I'm using pthreads, I need to specify the pthread library (libraries go after the source file, so the linker sees them after the code that uses them)
set makeprg=g++\ -ggdb\ -o\ %<\ %\ -lpthread

And if using Boost.Thread this is required
set makeprg=g++\ -ggdb\ -I/usr/include\ -o\ %<\ %\ -lboost_thread\ -lpthread

In eclipse we must specify the pthread library:
1) Select the Project
2) Go to
File->Properties->GCC C++ Linker
3) Add pthread to "Libraries (-l)"

Thursday, December 3, 2009

Pitfalls of class and instance variables in Python

Here is how you can get really confused with class (static) and instance variables in Python. In this short tutorial I am using IPython, which is why each input starts with 'In' and each evaluated result begins with 'Out'.

Create a class C
In [1]: class C(object):
   ...:     x = 10
   ...:

Then check that x exists in the namespace of C
In [2]: print C.x
10

In [3]: print C.__dict__
{'__dict__': <attribute '__dict__' of 'C' objects>, 'x': 10, '__module__': '__main__', '__weakref__': <attribute '__weakref__' of 'C' objects>, '__doc__': None}

Now create an object of the class C
In [4]: foo = C()

And now check that x exists in the namespace of foo
In [5]: foo.x
Out[5]: 10

Just to double check, let's examine the __dict__
In [6]: print foo.__dict__
{}

What actually occurs when you do foo.x is that the object foo is checked for an instance variable x which fails. Then the class C (which is foo's type) is checked for the class variable x. The variable x is found and the value is printed. The pitfall here is that it now looks like the object foo has an instance variable called x with a value of 10.

Now let's try to modify the value of x in the class
In [7]: C.x = 11

In [8]: print C.x
11

In [28]: print foo.x
11

Here you might be surprised that foo.x was updated, unless you know that foo.x is actually printing out the value of C.x.

Now let's try to modify the value of x in the object foo
In [9]: foo.x = 12

In [10]: print foo.x
12

In [11]: print C.x
11

Note that foo.x was set to 12 as we expected, but C.x remains as 11.
The reason for this is that when foo.x is assigned a value, the object foo creates a new instance variable called 'x'. This can be checked by looking at the namespace of foo:
In [12]: print foo.__dict__
{'x': 12}

Remember that before we gave foo.x a value, the namespace of foo did not contain any variables.

The difference between using instance and class variables inside of class methods is that you use 'self' to refer to instance variables. Here we create a class C which has both a class and instance variable called 'x':

In [13]: class C:
   ....:     x = 10
   ....:     def __init__(self):
   ....:         self.x = 11
   ....:

In [14]: bar = C()

In [15]: print C.__dict__
{'x': 10, '__module__': '__main__', '__doc__': None, '__init__': <function __init__ at 0x...>}

In [16]: print bar.__dict__
{'x': 11}

This shows that the class C has a class variable x of value 10 and the object bar has an instance variable x (self.x) of value 11.

My recommendation is to not use the same name for class and instance variables in the same class.

Wednesday, December 2, 2009

Difference between a copy constructor and assignment operator

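Assignment Operator:
MyClass* x = new MyClass;
MyClass* y = new MyClass;
*x = *y; // here x->operator=(*y) is called
Note that writing x = y would only copy the pointer and would not call operator=. See the example below for a version that does not use pointers.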

Copy Constructor:
Called in many situations, including:
- Pass by value into a method
- Returning by value from a method

Implemented by
MyClass* x = new MyClass;
MyClass* y = new MyClass(*x); // copy-constructs a new object from *x

Example Code:

#include <iostream>
using namespace std;

class MyClass {
    public:
        MyClass() : value(0) {cout << "default ctr called" << endl;}
        ~MyClass() {cout << "default dtr called" << endl;}

        // Copy Constructor
        MyClass(const MyClass& other) {
            cout << "copy ctr called" << endl;
            value = other.value;
        }
        
        // Assignment Operator
        MyClass& operator= (const MyClass& other) {
            cout << "assignment operator called" << endl;
            value = other.value;
            return *this;
        }
        void setValue(const int& newValue) { value = newValue; }
        int getValue() const { return value; }
    private:
        int value;
};

int main () {
    MyClass x;
    MyClass y;
    x.setValue('x');
    y.setValue('y');

    cout << "x is now " << x.getValue() << endl;
    cout << "y is now " << y.getValue() << endl;

    cout << "calling \'x = y\'" << endl;
    x = y; // assignment operator

    cout << "x is now " << x.getValue() << endl;
    cout << "y is now " << y.getValue() << endl;

    cout << "calling 'MyClass z = y'" << endl;
    MyClass z = y; // copy constructor

    cout << "x is now " << x.getValue() << endl;
    cout << "z is now " << z.getValue() << endl;
}

Expected Output:
default ctr called
default ctr called
x is now 120
y is now 121
calling 'x = y'
assignment operator called
x is now 121
y is now 121
calling 'MyClass z = y'
copy ctr called
x is now 121
z is now 121
default dtr called
default dtr called
default dtr called

In summary, if assigning values to an object which already exists, the operator= method is used.

However, if assigning values to a new object, the copy constructor is used.
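
The pass by value case mentioned earlier can be observed with a small counting class. This is only a sketch: the class Counted and the functions byValue and byReference are hypothetical names introduced for this illustration. Return by value is left out on purpose, because compilers are allowed to elide that copy.

```cpp
// Hypothetical class that counts how often each special member runs.
struct Counted {
    static int copies;
    static int assignments;

    Counted() {}
    Counted(const Counted&) { ++copies; }
    Counted& operator=(const Counted&) { ++assignments; return *this; }
};

int Counted::copies = 0;
int Counted::assignments = 0;

// Taking the parameter by value copy-constructs it from the caller's object.
void byValue(Counted) {}

// Taking it by const reference makes no copy.
void byReference(const Counted&) {}
```

Given two existing objects a and b, b = a increases Counted::assignments by one, while Counted c = a and byValue(a) each increase Counted::copies by one. byReference(a) changes neither counter.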

The law of the big 3

If a class owns a pointer to some object (e.g. heap memory or a file handle), the following should be created
- copy ctr
- destructor
- assignment operator
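
A minimal sketch of such a class is shown below, assuming the owned resource is a heap-allocated array of ints. The class name IntBuffer and its members are made up for this illustration. The assignment operator uses the copy-and-swap approach, which is one common way to write it, not the only one.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>

// Hypothetical class that owns a heap-allocated array of ints.
class IntBuffer {
public:
    // Allocates n ints, all initialized to zero.
    explicit IntBuffer(std::size_t n) : size_(n), data_(new int[n]()) {}

    // Destructor: release the memory this object owns.
    ~IntBuffer() { delete[] data_; }

    // Copy constructor: allocate new storage and copy the values,
    // so the two objects never share one array.
    IntBuffer(const IntBuffer& other)
        : size_(other.size_), data_(new int[other.size_]) {
        std::copy(other.data_, other.data_ + size_, data_);
    }

    // Assignment operator: copy first, then swap. The old array is
    // freed when tmp is destroyed, and self-assignment is harmless.
    IntBuffer& operator=(const IntBuffer& other) {
        IntBuffer tmp(other);
        std::swap(size_, tmp.size_);
        std::swap(data_, tmp.data_);
        return *this;
    }

    int& operator[](std::size_t i) { return data_[i]; }
    std::size_t size() const { return size_; }

private:
    std::size_t size_;
    int* data_;
};
```

Without the hand-written copy constructor and assignment operator, the compiler-generated versions would copy only the pointer, and both objects would later try to delete the same array.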

If a class is to be stored in a standard library container template (e.g. std::vector), the following are required
- copy ctr
- destructor
- assignment operator

Other things which a template might require, depending on the operations used:
- default ctrs
- equality operator
- less than operator
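
The sketch below shows where each of these optional members gets used by the standard library. The type Point and the helper functions makeOrigins, contains and countDistinct are hypothetical and exist only for this illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical value type used to illustrate the requirements.
struct Point {
    int x;
    int y;
    Point() : x(0), y(0) {}                  // default ctr
    Point(int x_, int y_) : x(x_), y(y_) {}
};

// Equality operator: std::find compares elements with ==.
bool operator==(const Point& a, const Point& b) {
    return a.x == b.x && a.y == b.y;
}

// Less than operator: std::set orders its elements with <.
bool operator<(const Point& a, const Point& b) {
    return a.x != b.x ? a.x < b.x : a.y < b.y;
}

// Default ctr in use: creates n default-constructed points.
std::vector<Point> makeOrigins(std::size_t n) {
    return std::vector<Point>(n);
}

// Equality operator in use.
bool contains(const std::vector<Point>& v, const Point& p) {
    return std::find(v.begin(), v.end(), p) != v.end();
}

// Less than operator in use: duplicates collapse inside a std::set.
std::size_t countDistinct(const std::vector<Point>& v) {
    return std::set<Point>(v.begin(), v.end()).size();
}
```

If Point had no operator<, the call to countDistinct would fail to compile, which is how a missing requirement usually shows up.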
