Wednesday, March 27, 2013

How to implement a Restricted Boltzmann Machine in C#

If anyone wants to "feel" the difference between Matlab or Python and languages such as C#, I suggest that the first thing they do is try to program basic mathematical fundamentals, such as linear algebra.
C# is a very robust language, however, it does not help you in any way, if you want to simplify coding math oriented subjects.
You basically need to write everything from scratch.

Now, AI is usually built on neural networks, and neural networks usually process data in vectors or matrices.
And if we have matrices and vectors, we usually need outer products, dot products, triangulations, decompositions,etc...

Since Matlab is built for math, and python has the Numpy library, what would take a few lines of code using those languages would take hundreds if not thousands of lines of code in C#.
Now, yes, I am stating the "obvious", you could say that if I had an isomorphic mode code library in C#, it would be almost the same.
That is true, to a degree, even if we would have had all the same functions, the syntax itself is not suited for algebraic manipulations.
For example:
In python if we want to update the 2nd to 4th element in the first column of a matrix with a vector we need only write:
X[2:4,0] = V

Now no matter what methods you have in C#, CLR syntax does not allow such typing.
But that's the thing, isn't it. Let's say I want to test a theory right now, and if it's not good i want to move on.
I don't want to spend hours or days building or searching for helper classes just so i can "test" a simple theory.

This is where I take my hat off to Matlab and even Python, and I strongly recommend that although I think that .net (c#) is one of the best pieces of software engineering languages ever created, the simplicity and readability of Matlab / Python has a greater advantage in rapid development, specifically for research.

A Restricted Boltzmann machine is an interesting unsupervised machine learning algorithm. It is a generative stochastic neural network that can learn a probability distribution over its set of inputs.

Feature extraction really gets interesting when you stack the RBMs one on top of the other creating a Deep Belief Network.
I will not go into the theory of the Boltzmann machine, regular or restricted.

I recommend looking at the original papers by Geoffrey E. Hinton, Yoshua Bengio and more.
Also E. Chen's post on the subject and python implementation is very good and intuitive.

A word about Arrays in C#:
Standard multidimensional arrays in C# are similar in syntax to C++ and take the form of (e.g.) int[][].
They are called Jagged arrays.
In C# "Multidimensional" arrays are actually typed as: int[,] meaning they are in x,y Cartesian coordinate system.
These are much more intuitive, and require no further initialization of inner rows.
However, due to a poor CLR implementation they are much slower than jagged arrays.
Thus requiring even further code to be written in order to achieve simple and readable code for building our neural networks.

In my project I build a matrix class, and a vector class (which i don't really use) actually I think that the vector class should be inherited from a matrix class.
Our matrix class encapsulates a jagged array and has a multidimensional indexer for our ease of use.
This is the basis of creating clean and short code for our algorithm.

A Single RBM is inefficient in extracting small number of features from a large input source, so stacked RBMs (what are called deep learning networks) provide a more efficient way to spread the feature learning from layer to layer.

In the following example we trained a dual layer RBM, 1024 visible inputs to 50 hidden outputs and on to 16 hidden outputs in the 2nd layer.
We trained 100 handwritten digits, 50 epochs for the first layer, and 1500 epochs for the 2nd layer.
As you can see, the original digit is on the left, and the reconstructed (from 16 boolean features) image is on the right.
It is quite similar to the original but not perfect, cause naturally it learned other examples or writing the number "2".


00000000011111100000000000000000
00000000011111111000000000000000
00000000111111111100000000000000
00000001111111111100000000000000
00000001111111111110000000000000
00000000111111111111100000000000
00000001111111111111100000000000
00000001111111011111100000000000
00000001111110011111100000000000
00000011111110011111100000000000
00000000111111011111110000000000
00000000111110011111100000000000
00000000011110011111100000000000
00000000001100111111100000000000
00000000000000011111100000000000
00000000000000011111100000000000
00000000000000111111100000000000
00000000000000111111000000000000
00000000000000111110000000000000
00000000000000111111000000000000
00000000000001111110000000000000
00000000000001111110000000000000
00000000000011111100000000000000
00000000000011111110000000000000
00000000000011111100000000000000
00000000001111111111000000000000
00000000011111111111111111100000
00000000111111111111111111110000
00000000111111111111111111111000
00000000111111111111111111111100
00000000111111111111111111111100
00000000011111111111111111111000

00000000010001001000000000000000
00000000011011111000001100001000
00000000001111110100000000000000
00010000111111111011100000000000
00000011111111111101000100000000
00000011011101011111111001000000
00000011101110111111111000000000
00000001111000000111111000000000
00000001110010010111100000000000
00000110111000000011011000000000
00000000110101001111110000000000
00000001011000101111100000000000
00100001001010001111000000000000
00000001111000101111111000000000
00000000001000001110101000000000
00000000000010001011110001000000
00000000110010001110100001000000
00000000000100111111000000000000
00000010100000110111011001010000
00000000100001101111010000000000
00000000001000011110011001000000
00000011000101111100000011000000
00000000000000111000000000100000
00000000101101111100000101000000
00010000010111011010000100000000
00000000000111111001100100100000
00000100011111111010111000000010
00000000111111110111111111111100
00000000111111111111011111111100
00000000111111111111101101111000
00000000001101100111101101100000
00000001010110000000010001011000

So now the challange is to make our computations as efficient and fast as possible to make it more realistic for applications.

I've added the "Daydreaming" functionality.
It requires more study on my part, but it is interesting to see what features pop up when random data is inputted in to the machine.

Please feel free to view/download or fork my code on Github

Further work to be done:
This is "Simple" example.
Next phase is to add Greedy-Layer Wise Pre-Training, and supervised fine tuning.

Sources and further reading:

"Learning Deep Architectures for AI" By Joshua Bengio
"A Practical Guide to Training Restricted Boltzmann Machines" By Geoffrey Hinton
E.Chen's Python implementation 

Sunday, March 3, 2013

Nested Class Constructor Accessibility

I wanted to share this very nice, small, creative piece of code that I've stumbled upon while looking for different perspectives on the subject.

The problem at hand is, let us say, that you have a Class, a "Parent" or "Container" class, and you want this class to have the ability to instantiate other classes.
This is a very real and sometimes serious situation, where access to constructor is forbidden to everyone and everything except one controlling object.

One "usual" pattern is using a factory builder. however, usually you will be able to either access the constructor still, or accessing the factory method from almost everywhere.

There are a few good solutions out there involving interfaces. The only downside to them is that they require quite a bit of code to be written down, for such a simple functionality.

And to be honest, what are we asking for???
To have a class with a nested class that only the containing class can access!

http://stackoverflow.com/questions/1664793/how-to-restrict-access-to-nested-class-member-to-enclosing-class

Here you can see different implementations that deal with this problem.
Check out Ray Burns' answer: Answer


public class Journal
{
  private static Func<object, JournalEntry> _newJournalEntry;

  public class JournalEntry
  {
    static JournalEntry()
    {
      _newJournalEntry = (object) => new JournalEntry(object);
    }
    private JournalEntry(object value)
    {
      ...

Here we can see, that in the nested class's static constructor we initialize a delegate to our true nested class's constructor, this is a private static field in the containing class, so it can only be accesses within the containing class.

Thus, in this example, JournalEntry can only be instantiated by functionality that has to be implemented inside the Journal object.

It might seem trivial to some, and might seem complex to others, I must admit I never thought of attacking the subject in this fashion, and it is quite creative.
Kudos.