Supervised Learning of Sign Language Characters

Project Description

For this project, you will need to download and install the Weka machine learning code from www.cs.waikato.ac.nz/ml/weka/. This machine learning code can run different learning algorithms on the same input data. The learning algorithms that you will test are Naive Bayes, a multilayer perceptron, the IB1 instance-based learner, and the J4.8 decision tree algorithm. You will find details on specifying the input data and attributes, selecting classifiers, and interpreting the output in the Weka tutorial downloaded with the code. More extensive details are found in the "Data Mining" book written by Witten and Frank.

Weka is designed to accept an ARFF file as input. Example input files are found in the data directory. An ARFF file specifies the name of the learning problem (the relation), followed by the attributes (which can be nominal or real) and then the data.

We will use the learning algorithms to recognize sign language letters. This type of learning problem has potential for use not only in automatically recognizing and understanding sign language, but also in gesture recognition and other related image-based recognition tasks. I have downloaded six 25x25 black and white images for each of the letters "c", "d", and "e". These are stored in PGM (ASCII) format. Each of the 400 pixels (features) is represented by a value in the range 0-255.

a) For your first step, you will use the specified machine learning algorithms implemented in Weka to learn a two-class concept that distinguishes the sign language "c" letters from the "d" letters. Submit the input files that you used and the output concept that was generated, and test the models on the training data.

b) Next, test the performance using 3-fold cross validation. How does this affect the performance results, and why? Comment on the performance of each algorithm: why do you think some algorithms outperformed others?
Why are the results poorer here than when the training data was used for testing?

c) Devise a method of using these learning algorithms to learn a three-class problem that distinguishes the "c", "d", and "e" letters from each other, as a set of two-class problems. Explain the method you used, submit the input and output files, and summarize the results.

d) Compare and contrast the concept representations that the alternative learning algorithms provide. Note that these algorithms provide a visualization option in Weka to help interpret the generated concept. What are some of the advantages and disadvantages of the alternative representations?

e) Finally, test one mechanism for improving the classification accuracy of the learning algorithms. The mechanism may be adding more training data, thresholding the images (values below x are mapped to 0, the rest are mapped to 255), or another improvement that you design. Provide a discussion of your enhancement and summarize the cross-validation results.
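
As a starting point for preparing the input files, the following is a minimal Python sketch of one way to turn ASCII PGM images into ARFF text, with the optional thresholding described in part e). It is an illustration under stated assumptions, not a required approach: the function names, the relation name, and the attribute names p0, p1, ... are all hypothetical, and the image size is read from each file's header rather than hard-coded.

```python
def parse_pgm_ascii(text):
    """Parse an ASCII (magic number P2) PGM string.

    Returns (width, height, maxval, pixels), with pixels as a flat list.
    """
    tokens = []
    for line in text.splitlines():
        line = line.split("#", 1)[0]  # drop PGM comments
        tokens.extend(line.split())
    if not tokens or tokens[0] != "P2":
        raise ValueError("not an ASCII (P2) PGM file")
    width, height, maxval = (int(t) for t in tokens[1:4])
    pixels = [int(t) for t in tokens[4:4 + width * height]]
    if len(pixels) != width * height:
        raise ValueError("pixel count does not match the header")
    return width, height, maxval, pixels


def threshold(pixels, x, high=255):
    """Map values below x to 0 and all other values to `high`."""
    return [0 if p < x else high for p in pixels]


def to_arff(relation, examples, class_labels):
    """Build ARFF text from a list of (pixels, label) pairs.

    Attribute names p0, p1, ... are an assumption made for this sketch.
    """
    n = len(examples[0][0])
    lines = ["@relation " + relation, ""]
    lines += ["@attribute p%d numeric" % i for i in range(n)]
    lines.append("@attribute class {%s}" % ",".join(class_labels))
    lines += ["", "@data"]
    for pixels, label in examples:
        if len(pixels) != n:
            raise ValueError("all images must have the same number of pixels")
        lines.append(",".join(str(p) for p in pixels) + "," + label)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Tiny made-up 2x2 image, used only to show the output layout.
    sample = "P2\n# tiny example\n2 2\n255\n0 100\n200 255\n"
    width, height, maxval, px = parse_pgm_ascii(sample)
    print(to_arff("sign_c_vs_d", [(threshold(px, 128), "c")], ["c", "d"]))
```

In practice you would read each PGM file from disk, pair its pixel list with the letter it shows, and write the returned text to a file with the .arff extension. Leaving out the call to threshold keeps the original gray values.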