ML4T Software Installation

From Quantitative Analysis Software Courses

Attention

The information on this page is for those who are interested in having a Python development environment on their own machine. Keep in mind that even if you set up your own environment, your code must still run correctly on the GT servers. Please see ML4T_Software_Setup for information on how to use those servers and how to check out the code scaffolding for the projects.

Overview

There are two main environments available to you to develop and test your code for this class:

  1. An Ubuntu Linux image we have created that you can run in a VM on your machine
  2. One of several high performance machines at Georgia Tech

Both have been set up with the same, correct software libraries. Your code MUST run properly in one of these environments; otherwise it may not run correctly in our auto grader. If your code fails to run in the auto grader environment, you might not get credit for the assignment, so it is very important that you have access to one of these environments.

You may, for convenience, also choose to install the software manually on your personal machine. Keep in mind, however, that this is not officially supported and is at your own risk. See ML4T_Software_Manual_Installation.

Important note: We use a specific, static dataset for this course, which we will provide. If you download your own data from Yahoo (or elsewhere), you will get wrong answers on assignments.

Install, set up and test a virtual machine

If you don't want to connect remotely to GT machines, we have created a VM image with the same operating system and software libraries that you can download here:

and run using VirtualBox, Oracle's open source virtual machine monitor (VMM). The main account on this image uses ml4t (case sensitive) as both the username and the password, should you need to make changes. To be as platform agnostic as possible, the image is configured without any optimizations enabled, but we encourage you to enable hardware acceleration, graphics acceleration, and guest additions to improve performance. Changing the virtual disk image from dynamically allocated to fixed may also improve performance.

Optional software

Data

It is no longer necessary to download and install the data separately. It is provided as part of the GitHub repository.

Test installation

Test your environment by running the script validate_env.py from the ml4t/ directory:

   python validate_env.py

If the script reports any problems, or if any installed library version is older than the desired version, fix the problems and then run the script again.
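The checks that validate_env.py actually performs are not reproduced on this page. To illustrate what "older than the desired version" means in practice, here is a minimal sketch of a version comparison, assuming versions are dotted numeric strings. The function names (version_tuple, is_older, find_problems) and the version numbers in the example are hypothetical and are not taken from the course script.

```python
import re


def version_tuple(version):
    """Turn '1.10.2' into (1, 10, 2); stops at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def is_older(have, wanted):
    """True if version string `have` is strictly older than `wanted`."""
    a, b = version_tuple(have), version_tuple(wanted)
    # Pad with zeros so that '1.10' and '1.10.0' compare as equal.
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a < b


def find_problems(installed, desired):
    """Return messages for missing or outdated libraries.

    installed: dict of library name -> installed version string
    desired:   dict of library name -> minimum desired version string
    """
    problems = []
    for name, wanted in sorted(desired.items()):
        have = installed.get(name)
        if have is None:
            problems.append("%s: not installed (want >= %s)" % (name, wanted))
        elif is_older(have, wanted):
            problems.append("%s: %s is older than %s" % (name, have, wanted))
    return problems


if __name__ == "__main__":
    # Hypothetical version numbers, for illustration only.
    desired = {"numpy": "1.10", "pandas": "0.17"}
    installed = {"numpy": "1.9.2", "pandas": "0.17.1"}
    for line in find_problems(installed, desired):
        print(line)
```

Comparing versions as tuples of integers rather than as raw strings matters here: as plain strings, "1.9.2" sorts after "1.10", which would hide an outdated library.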

A clean output from validate_env.py is required for MC1-Homework-2.