
Machine Learning Projects

Learning Robot Dynamics

Using a public robotic navigation dataset from UTIAS, this project developed and applied the Locally Weighted Linear Regression (LWLR) algorithm to learn the dynamics of a robot from the available trajectory and control data. A partner performed the same task using a different learning method (a neural network) and a different data representation. The project was a study of how different dataset formulations can impact results for the same learning method, and how different learning methods can impact results for the same dataset formulation.


The two dataset formulations are as follows:


[Figure: data_representation.png - the two dataset formulations]

Dataset 1 uses the velocity command along with one of the current position components to learn the change in state. Dataset 2 uses the full current state and control to directly learn the next state. Both datasets are meant to accomplish the same task of learning the robot dynamics, but they differ in their inputs and expected outputs. I prepared dataset 1, while a partner prepared dataset 2 and provided it to me.
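As a rough sketch, the two formulations might be assembled from the trajectory arrays like this; the variable names, shapes, and the particular position component used by dataset 1 are illustrative assumptions, not the project's actual code:

```python
import numpy as np

def build_datasets(states, controls):
    """states: (T, 3) array of (x, y, theta); controls: (T, 2) array of (v, w).
    Both the shapes and the chosen state component are assumptions."""
    # Dataset 1: velocity command plus one current position component
    # (heading is assumed here) -> change in state.
    X1 = np.hstack([controls[:-1], states[:-1, 2:3]])
    y1 = states[1:] - states[:-1]

    # Dataset 2: full current state and control -> next state directly.
    X2 = np.hstack([states[:-1], controls[:-1]])
    y2 = states[1:]
    return (X1, y1), (X2, y2)
```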


Both LWLR and neural networks are supervised learning methods, so each dataset was split into a training set and a testing set. The testing data was the section used to generate the trajectories below; all other points not shown were used as training data.

[Figure: data_rep_comp.png - LWLR trajectories for Dataset 1 and Dataset 2]

The plots above compare the LWLR algorithm applied to the two dataset formulations over the same approximate section of the trajectory. The odometry curve was generated using the kinematic model for a differential-drive robot, the actual/groundtruth curve is the true position provided in the original dataset, and the LWLR trajectory/prediction curve is the output of the LWLR algorithm. In this example, the second dataset formulation was clearly much better at predicting the next state and thus at learning the robot dynamics. This is most likely because that data representation is highly locally linear without many outliers, which allows LWLR to perform very well.
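For reference, below is a minimal sketch of LWLR prediction with a Gaussian kernel; the bandwidth tau and the use of a bias column are illustrative choices, not necessarily the report's exact setup:

```python
import numpy as np

def lwlr_predict(x_query, X_train, y_train, tau=0.1):
    """Locally weighted linear regression: fit a weighted least-squares
    model around x_query using a Gaussian kernel of bandwidth tau."""
    # Gaussian weights: nearby training points count more.
    d2 = np.sum((X_train - x_query) ** 2, axis=1)
    w = np.exp(-d2 / (2 * tau ** 2))

    # Augment with a bias column and solve the weighted normal equations.
    A = np.hstack([X_train, np.ones((len(X_train), 1))])
    AtW = A.T * w                       # equivalent to A.T @ diag(w)
    theta, *_ = np.linalg.lstsq(AtW @ A, AtW @ y_train, rcond=None)
    return np.append(x_query, 1.0) @ theta
```

To roll out a full trajectory, each prediction would be fed back in as part of the next query point.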

[Figure: nn_lwlr_comp.png - LWLR (left) and Neural Network (right) on dataset 2]

The plots above compare the LWLR algorithm (left) and the neural network (right) applied to dataset 2 over the same approximate section of the trajectory. In this case, both algorithms performed well, with a slight edge to LWLR.


The code for this project is included in the repository linked above, along with the datasets used and detailed reports on the LWLR algorithm and the comparison.

MNIST Handwriting Classification

This experiment demonstrates how data manipulation can affect the ability to learn classification boundaries. The learning method was multi-class softmax regression trained with mini-batch gradient descent for faster learning.
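A minimal sketch of this setup, assuming flattened inputs X of shape (N, D) and integer labels y; the hyperparameters match the values quoted below, while everything else is illustrative:

```python
import numpy as np

def train_softmax(X, y, n_classes=10, lr=0.3, batch=200, epochs=30, seed=0):
    """Multi-class softmax regression via mini-batch gradient descent.
    X: (N, D) inputs, y: (N,) integer labels. Illustrative sketch only."""
    rng = np.random.default_rng(seed)
    Xb = np.hstack([X, np.ones((len(X), 1))])     # append a bias column
    Y = np.eye(n_classes)[y]                      # one-hot labels
    W = np.zeros((Xb.shape[1], n_classes))
    for _ in range(epochs):
        idx_splits = np.array_split(rng.permutation(len(X)),
                                    max(1, len(X) // batch))
        for idx in idx_splits:
            logits = Xb[idx] @ W
            logits -= logits.max(axis=1, keepdims=True)    # numerical stability
            P = np.exp(logits)
            P /= P.sum(axis=1, keepdims=True)              # softmax probabilities
            W -= lr * Xb[idx].T @ (P - Y[idx]) / len(idx)  # cross-entropy gradient
    return W
```

Test predictions are then the argmax of the test logits, and the misclassification rate is the fraction of predictions that disagree with the labels.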

 

The MNIST handwriting dataset consists of handwritten digits 0-9, where each sample is a 28x28 greyscale image paired with the digit that is written. For each test, the data was split into 50,000 training samples and 20,000 testing samples, and each test used a mini-batch size of 200 with a learning rate of 0.3.


The first test used the normalized raw greyscale pixel values, giving an input size of 784 per image. This test required very minimal data manipulation, and its results are shown in the plots below with the "Pixel Values" label.
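In code, this representation amounts to flattening and rescaling each image; a one-function sketch, with the input format assumed:

```python
import numpy as np

def pixel_features(images):
    """Flatten (N, 28, 28) greyscale images to (N, 784), scaled to [0, 1]."""
    return images.reshape(len(images), -1) / 255.0
```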


The second test used edge-based histograms to represent each image. To build this new input array, the raw image was first divided into non-overlapping 3x3 blocks, and each block was convolved with a horizontal and a vertical edge kernel to perform edge detection. This yields two numbers per block that can be treated as a vector: the x component is the result of the convolution with the horizontal kernel, and the y component is the result of the convolution with the vertical kernel. Because a 3x3 block size was selected, the last row and column of pixels were not considered, effectively reducing the image size to 27x27. This also means each image was broken into 81 blocks.


The next step is to build a histogram for each block from its convolution values. The histogram bins correspond to the angle of the detected edge in each block and are broken into 22.5 deg increments spanning 0 to 157.5 deg, giving 8 total bins. Treating the two convolution values as a vector, the edge angle was calculated from the arctangent of the vector components. If both components were near zero, no edge was detected in that block. After calculating the edge angle, the value of the nearest bin was set to 1 while all others remained 0; if no edge was detected, all bins remained 0.


So instead of being represented by its raw pixel values, each image is now represented by 81 histograms, one per block, for a total input array of 648 values (8 values per histogram). The results for this representation are shown with the "Edge-based Histogram" label.
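Putting the steps above together, a sketch of the feature extraction might look like the following; the specific edge kernels and the near-zero threshold are assumptions, since the text does not quote them:

```python
import numpy as np

# Assumed Sobel-style kernels; placeholders for the project's actual kernels.
KX = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]])   # horizontal edges
KY = KX.T                                             # vertical edges

def edge_histogram(image, eps=1e-3):
    """28x28 image -> 648 features: an 8-bin edge-angle histogram for each
    of the 81 non-overlapping 3x3 blocks (last row/column dropped)."""
    feats = np.zeros((9, 9, 8))
    for i in range(9):
        for j in range(9):
            block = image[3 * i:3 * i + 3, 3 * j:3 * j + 3]
            gx = np.sum(block * KX)                   # "x component"
            gy = np.sum(block * KY)                   # "y component"
            if abs(gx) < eps and abs(gy) < eps:
                continue                              # no edge: bins stay 0
            angle = np.degrees(np.arctan2(gy, gx)) % 180.0   # fold into [0, 180)
            feats[i, j, int(round(angle / 22.5)) % 8] = 1.0  # nearest 22.5 deg bin
    return feats.ravel()
```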

[Figures: CostHistory.png and MisClassHistory.png - cost and misclassification training history]

The plots above show the training history over the 50,000 training samples. Training was stopped after 30 epochs, but based on the cost history plot, further training would have yielded only marginal improvement.


Applying the learned weights to the testing set yields the following results:

  • Pixel Values: 2091 misclassifications, or 10.5%

  • Edge-based Histogram: 992 misclassifications, or 5.0%


Check out the project notebook linked above for the code.

