ML4T Software Installation
Attention
The information on this page is for those who want a Python development environment on their own machine. Keep in mind that even if you set up your own environment, your code must still run correctly on the GT servers. See ML4T_Software_Setup for information on how to use those servers and how to check out the code scaffolding for the projects.
Overview
There are two main environments available to you to develop and test your code for this class:
- An Ubuntu Linux image we have created that you can run in a VM on your machine
- One of several high performance machines at Georgia Tech
Both have been set up with the same, correct software libraries. Your code MUST run properly in one of these environments; otherwise it may not run correctly in our auto grader. If your code fails to run in the auto grader environment, you might not get credit for the assignment, so make sure you have access to one of these environments.
You may, for convenience, also install the software manually on your personal machine. Keep in mind, however, that this is not officially supported and is at your own risk. See ML4T_Software_Manual_Installation.
Important note: We use a specific, static dataset for this course, which we will provide. If you download your own data from Yahoo (or elsewhere), you will get wrong answers on assignments.
Install, set up and test a virtual machine
If you don't want to connect remotely to GT machines, you can download a VM image we have created with the same operating system and software libraries:
- Download: ml4t-image-0.1.zip (md5 1dca6c95b48817d826a20bed478c89f4)
Run the image using VirtualBox, Oracle's open source VMM. The main account on this image uses ml4t (case sensitive) as both the username and the password, should you need to make changes. The image is configured without any optimizations enabled so that it is as platform agnostic as possible, but we encourage you to enable hardware acceleration, graphics acceleration, and guest additions to improve performance. Changing the virtual disk image from dynamically allocated to fixed may also improve performance.
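Before importing the image, it may be worth confirming that the download is intact by comparing it against the MD5 checksum listed next to the download link. The following is a minimal sketch, not an official course tool; the helper names md5_of_file and matches_expected are made up for illustration, and the file path is assumed to be wherever you saved the zip.

```python
import hashlib

# Checksum published next to the download link on this page.
EXPECTED_MD5 = "1dca6c95b48817d826a20bed478c89f4"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks
    so a large VM image does not have to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_MD5):
    """True if the file's MD5 digest equals the expected hex string."""
    return md5_of_file(path) == expected.lower()
```

For example, matches_expected("ml4t-image-0.1.zip") should return True for an uncorrupted download saved under that name in the current directory. A False result suggests that the file should be downloaded again.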
Optional software
Data
It is no longer necessary to download and install the data separately. It is provided as part of the GitHub repository.
Test installation
Test your environment by running the script validate_env.py from the ml4t/ directory:
python validate_env.py
If the script reports errors, or if any installed library version is older than the desired version, fix the problems and run the script again.
A clean output from validate_env.py is required for MC1-Homework-2.
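The contents of validate_env.py are not reproduced on this page. As a rough illustration of the kind of check described above (comparing installed library versions against desired minimums), here is a hedged sketch. The function names and any library names or version numbers passed to it are assumptions for illustration only; they are not taken from the actual script.

```python
import importlib


def parse_version(text):
    """Turn '1.10.2' into (1, 10, 2), keeping only leading digits of each
    dot-separated part and stopping at the first part with none."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def is_older(installed, minimum):
    """True if the installed version is strictly older than the minimum.
    Shorter versions are zero-padded so '1.10' equals '1.10.0'."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a < b


def check_versions(required):
    """Given {module_name: minimum_version}, return a list of problem
    messages. An empty list means every check passed."""
    problems = []
    for name, minimum in sorted(required.items()):
        try:
            module = importlib.import_module(name)
        except ImportError:
            problems.append("%s: not installed" % name)
            continue
        installed = getattr(module, "__version__", None)
        if installed is None:
            problems.append("%s: version unknown" % name)
        elif is_older(installed, minimum):
            problems.append("%s: found %s, want >= %s" % (name, installed, minimum))
    return problems
```

Calling check_versions with a dictionary of module names and minimum versions returns one message per missing or outdated library. This is only a supplementary check; the output of validate_env.py itself remains the reference.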