CSCI 479 -- Machine Learning
Spring 2026 - Assignment 2
Submit deadline: 10:00, 13 February 2026, Friday

This assignment uses the same data format as the Netflix Prize, a competition that Netflix ran from 2006 to 2009 in search of the best recommendation model it could get.

Together, the training data set and the testing data set contain 12,672 ratings that 1000 users provided on 100 movies over twenty days. The first 990 users are in the training data set and the last 10 users are in the testing data set. Each data item is a quadruplet of the form <user, movie, date of grade, grade>. The user and movie fields are the integer IDs of the user and the movie respectively, the date of grade takes the format "yyyy-mm-dd", and grades are integers from 1 to 5 stars (inclusive).
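
As a minimal sketch of reading one such quadruplet, the Python snippet below parses a single record into typed fields. It assumes each record is stored as one comma-separated text line in the order user, movie, date, grade; the actual file layout is not stated here, so check the data files before relying on it. The names Rating and parse_rating are hypothetical, chosen only for illustration.

```python
from datetime import date
from typing import NamedTuple


class Rating(NamedTuple):
    """One <user, movie, date of grade, grade> quadruplet (hypothetical type)."""
    user: int
    movie: int
    day: date
    grade: int


def parse_rating(line: str) -> Rating:
    """Parse one record, ASSUMING the layout 'user,movie,yyyy-mm-dd,grade'."""
    user, movie, day, grade = (field.strip() for field in line.split(","))
    rating = Rating(int(user), int(movie), date.fromisoformat(day), int(grade))
    # Grades are integers from 1 to 5 inclusive; reject anything else early.
    if not 1 <= rating.grade <= 5:
        raise ValueError(f"grade out of range: {rating.grade}")
    return rating
```

Validating the grade range while parsing is one way to surface malformed records during data exploration rather than later, inside the similarity function.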

Note that these numbers are for reference only. You should explore the data sets yourself to get a more accurate understanding of the data.

Your tasks:

Overall, your task is to build a similarity-based model to recommend movie(s) to the 10 users in the testing data set.

Specifically, you need to:

  1. Design and implement a program to transform the data set from its current format to a traditional vector format where one user is one vector.
    You don't have to use this recommended user vector format. You can propose your own data format if 1) it can be used consistently in your similarity function and recommendation algorithm; and 2) you can justify that your recommendation system using it performs not much worse than one using the traditional vector format and its associated recommendation algorithms.
  2. Design and implement a similarity (or distance) function that calculates the distance between any two user objects.
    (The first two parts are the same tasks as described in Lab 3.)
  3. Design and implement the K Nearest Neighbors algorithm to make recommendations to a user, using the 10 users in the testing data set as test cases.
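
The three steps above can be sketched in Python as follows. This is one possible design under stated assumptions, not a reference solution: each user vector is stored sparsely as a {movie: grade} dictionary, unrated movies count as 0, cosine similarity is the similarity function, and a candidate movie is scored by the mean grade its raters among the k nearest neighbors gave it. The rating date is ignored. The function names and the parameters k and top_n are hypothetical.

```python
import math
from collections import defaultdict


def build_user_vectors(ratings):
    """Group (user, movie, grade) triples into {user: {movie: grade}}."""
    vectors = defaultdict(dict)
    for user, movie, grade in ratings:
        vectors[user][movie] = grade
    return dict(vectors)


def cosine_similarity(a, b):
    """Cosine similarity of two sparse vectors; unrated movies count as 0."""
    shared = a.keys() & b.keys()
    if not shared:
        return 0.0
    dot = sum(a[m] * b[m] for m in shared)
    norm_a = math.sqrt(sum(g * g for g in a.values()))
    norm_b = math.sqrt(sum(g * g for g in b.values()))
    return dot / (norm_a * norm_b)


def recommend(target, train, k=3, top_n=1):
    """Recommend up to top_n movies the target user has not rated yet."""
    # Step 1: pick the k training users most similar to the target.
    neighbors = sorted(
        train.values(),
        key=lambda vec: cosine_similarity(target, vec),
        reverse=True,
    )[:k]
    # Step 2: collect the neighbors' grades for movies new to the target.
    grades = defaultdict(list)
    for vec in neighbors:
        for movie, grade in vec.items():
            if movie not in target:
                grades[movie].append(grade)
    # Step 3: rank by mean grade, breaking ties by smaller movie ID.
    ranked = sorted(grades, key=lambda m: (-sum(grades[m]) / len(grades[m]), m))
    return ranked[:top_n]


# Tiny made-up example (not taken from the assignment data).
train = build_user_vectors([
    (1, 10, 5), (1, 11, 4), (1, 12, 5),
    (2, 10, 1), (2, 11, 2), (2, 13, 5),
    (3, 10, 5), (3, 11, 5), (3, 14, 3),
])
print(recommend({10: 5, 11: 4}, train, k=2, top_n=2))
```

Other choices are equally defensible, for example Pearson correlation or a distance computed only over co-rated movies, or weighting each neighbor's grade by its similarity. Whatever you choose, justify it in your document.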

What to Submit:

Submit a document that explains the whole process of building and using your recommendation system. This document should include at least the following sections:

How to submit:

Choose one of the following two ways to submit your work:


Last updated: January 27, 2026