Holy Hand Grenade of Antioch
Updates / FAQs
- 2019-08-08 First draft
Overview
In this optional project you will implement an agent that trades in a simulated environment that includes dozens of other trading agents. The success of your contributed code and your score on the project will depend on how profitable your agent's trading is. The following rules apply:
- Your agent starts each morning with $100,000 in cash.
- You will trade only one asset, JPM.
- Trading begins at 9:30AM, and the market closes at 4:00PM.
- Your score depends on the value of your portfolio as of market close, including cash and stock positions.
- Your agent should never initiate a trade that will cause your portfolio to exceed a leverage of 1.0.
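For reference, here is a minimal sketch of a pre-trade leverage check. The leverage definition used here (gross stock exposure divided by net portfolio value) is an assumption for illustration; confirm the definition used for grading before relying on it.

def leverage(shares, price, cash):
    # Gross exposure of the (single) stock position, long or short.
    exposure = abs(shares * price)
    # Net portfolio value: cash plus the signed value of the stock position.
    net_value = cash + shares * price
    return exposure / net_value

# Example: holding 500 shares of JPM at $100 with $50,000 cash remaining gives
# leverage(500, 100.0, 50000) == 50000 / 100000 == 0.5, safely under 1.0.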
Background:
- A documentary film about the Holy Hand Grenade of Antioch: https://www.youtube.com/watch?v=xOrgLj9lOwk
About the ABIDES simulator and getting started
You will implement your trading agent to run within the Agent-Based Interactive Discrete Event Simulation (ABIDES). ABIDES was designed by Prof. Tucker Balch and David Byrd at Georgia Tech; David is the lead architect and developer of ABIDES. The development of ABIDES has been supported by the NSF, and it is now available as open source on GitHub.
ABIDES is used in research at J.P. Morgan to develop and evaluate trading algorithms and models of market structure.
- Please see our arXiv paper for preliminary documentation: https://arxiv.org/abs/1904.12066
- This wiki page describes the environment your agent will run in: https://github.com/abides-sim/abides/wiki/SparseDiscreteMeanRevertingMarket
Get the ABIDES simulation distribution at GitHub here: https://github.com/abides-sim/abides
Note that you should never issue a pull request to this repo. It will expose your code to others, and you will be perceived as uncool. To run an example simulation in the default configuration of background agents, use the following Unix command lines:
[XX please replace the below with correct command lines]
cd *blah*
python *blah*
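Until the placeholders above are filled in, a plausible invocation, assuming the top-level abides.py entry point from the public ABIDES repository (the configuration name below is a placeholder; see the repo README for the configurations actually shipped), would be:

cd ABIDES
python abides.py -c <config_name>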
You will find a subdirectory
ABIDES/contributed_traders/*blah*
The contributed_traders directory is where we collect agents that will contribute to our ecosystem of traders. We hope, perhaps, to include yours there in the future. There is a subdirectory for each participating agent. Note that the subdirectory *blah* contains a basic example agent after which you can pattern the trading agent you design.
[XX I made up the above as a way to handle (eventually) many contributed traders. Let me know if you end up thinking it is a decent approach or if you have a better way to handle it.]
What you should do
Create your own directory:
ABIDES/contributed_traders/yourID_agentname
where yourID is a set of characters that uniquely identifies you. If you are a Georgia Tech student, for instance, this should be your login ID (e.g., mine is tb34). "agentname" is your specific name for this agent. We need separate names for agents submitted by the same person because in the future you might improve the one you wrote, or you might want to contribute a new one with a different name. An example directory and agent name might then be tb34_shark. Copy the template code into that subdirectory, rename it "yourID_agentname.py", and be sure that your agent's class name is also "yourID_agentname".
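Using the example names from above, the resulting layout would be:

ABIDES/contributed_traders/tb34_shark/tb34_shark.py   (defines class tb34_shark)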
You can assume that your agent will have read and write access to "your" subdirectory. So you can store a learned policy there and perhaps update it between runs. Do not attempt to read or write from any other directory. It is required that you follow the path name conventions used by the example agent provided, namely that the subdirectory location is relative. If you do not follow that convention, your code will break and we will not grade it. [XX note that this will require someone to add that example into the distributed template]
Take advantage of the Agent.kernelInitializing() method to initialize your agent. This is where, for instance, you can read in a policy if you have trained one up. Also use Agent.kernelTerminating() as the trading day wraps up to save any logs or new learning results.
Be sure to update the author(), agentname(), and number_of_counting() methods to report yourID, your agent's full name (i.e., yourID_agentname), and the correct number of the counting. We will call them at evaluation time in order to build a report that describes the performance of the contributed agents.
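Putting these pieces together, here is a minimal sketch of the hypothetical tb34_shark agent from above. The base class import follows the layout of the public ABIDES repository and the hook signatures follow the description above, but treat both as assumptions and pattern the real details after the provided example agent. Note the relative policy path, per the convention required above.

import os
import pickle

from agent.TradingAgent import TradingAgent  # base class location is an assumption

# Relative path into "your" subdirectory; policy.pkl is a hypothetical file name.
POLICY_FILE = os.path.join("contributed_traders", "tb34_shark", "policy.pkl")

class tb34_shark(TradingAgent):

    def kernelInitializing(self, kernel):
        super().kernelInitializing(kernel)
        self.policy = None
        if os.path.exists(POLICY_FILE):      # load a previously trained policy, if any
            with open(POLICY_FILE, "rb") as f:
                self.policy = pickle.load(f)

    def kernelTerminating(self):
        super().kernelTerminating()
        if self.policy is not None:          # save logs / new learning results
            with open(POLICY_FILE, "wb") as f:
                pickle.dump(self.policy, f)

    def author(self):
        return "tb34"         # yourID

    def agentname(self):
        return "tb34_shark"   # yourID_agentname

    def number_of_counting(self):
        return 3              # the number of the counting (an assumption, of course)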
Now, build an agent that performs well through a trading day.
At the end of the day, we will call your agent's getportfoliovalue() method to see how profitable its trading has been. [XX do we have such a method? If not, we probably need one.]
How to submit your entry
When you are ready to submit your code, take the following steps to create a zip file of your code. Note that this may be the only situation in this course for which we allow submission of a zip file.
cd ABIDES/contributed_traders
zip -ry yourID_youragentname.zip yourID_youragentname
You should submit that zip file to us for assessment. Note that we will copy your zip file to our own copy of ABIDES/contributed_traders and unzip it there. It will be irritating if the wrong stuff appears at that time, so please be sure you zip it correctly.
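To sanity-check the archive before submitting, list its contents; it should contain only your yourID_youragentname subdirectory:

unzip -l yourID_youragentname.zip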
Strategy, evaluation and scoring
Your trader will be evaluated in two ways. First, we will run it in the provided environment of other trading agents and evaluate whether or not it is profitable. The majority of agents populating the simulation are ZI, or Zero Intelligence, agents. The ZI agents assume a mean-reverting market, and they trade based on that assumption. You can exploit that information for your own profitability.
Note that our intent is to exercise your agent across an entire day of trading, but if the simulation takes too long to run we may need to shorten that time. You can assume the market will open at 9:30AM, but don't assume that you can wait until 4:00PM to make a profit.
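As an illustration only (not a reference strategy), here is a minimal sketch of a mean-reversion decision rule of the kind that could exploit the ZI agents' assumption. The window and band parameters are hypothetical, and the hookup to actual order placement is left to the example-agent API.

import numpy as np

class MeanReversionRule:
    def __init__(self, window=50, band=0.005):
        self.prices = []       # recent mid-price history
        self.window = window   # lookback length for the rolling mean
        self.band = band       # fractional distance from the mean before acting

    def decide(self, mid_price):
        # Returns +1 (buy), -1 (sell), or 0 (hold) for the current mid price.
        self.prices.append(mid_price)
        if len(self.prices) < self.window:
            return 0                                  # not enough history yet
        mean = np.mean(self.prices[-self.window:])
        if mid_price < mean * (1 - self.band):
            return 1                                  # below the mean: expect reversion up
        if mid_price > mean * (1 + self.band):
            return -1                                 # above the mean: expect reversion down
        return 0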
Scoring Part 1 (up to 50 points): We will run your trader 10 times and count the number of times it is profitable. Note that we would expect a trader that acts randomly to be profitable 50% of the time, so you must do better than random to get any points. The Part 1 scoring will work as follows:
- +25 points if profitable more than 50% of the time.
- +5 points * (number of times profitable - 5). Minimum 0 points, maximum 25 points for this part.
Examples: Profitable 6 times, Part 1 score = 30 points. Profitable 2 times, score = 0 points.
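In code form, the Part 1 formula works out to:

def part1_score(times_profitable):
    # Out of 10 runs; must beat random (5 of 10) to score at all.
    if times_profitable <= 5:
        return 0
    return 25 + min(5 * (times_profitable - 5), 25)

# part1_score(6) == 30 and part1_score(2) == 0, matching the examples above.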
Scoring Part 2 (up to 50 points): Your trader is only eligible for this phase if it scored more than 0 points in Part 1. We will pool your trader with the other traders that scored more than 0 points in Part 1; they will trade against one another, along with the other agents present in the provided environment. We will group the traders into quintiles and award points as follows:
- +50 points if in top quintile
- +40 points if in second quintile
- +30 points if in third quintile
- +20 points if in fourth quintile
- +10 points if in fifth quintile
Grading policies
- Late submissions cannot be considered.
- Please attempt this project only for the fun of it, because you enjoy the trading problem. It is likely to require a lot of effort that may not be worth it in terms of the points received. Do not count on points from this assignment to make a change in your grade.
- We will test your code on one of the buffet servers. If your submitted code does not work because you did not test it on a buffet server, we will not consider it further.
- If your code fails to run for other reasons, we may not grade it, even if it is not your fault.
- There are many ways to cheat on this project, such as to "forge" the component that returns your profit, or to use extra leverage. We will check for these things and treat them as violations of the honor code.
Can we use your code?
One motivation for this effort is to create an "ecosystem" of trading agents that will combine to provide a realistic environment for trading research. If your agent performs well (and runs fast), we may want to add it to our publicly distributed collection of trading agents. Note that it is our intention to include the best agents from this semester in the distribution for use next semester, so the next group of students will have to compete against your contributed agents. We would be pleased if you would allow us to use your code!
If you are willing to allow us to distribute your agent code, please add the following comment to all of the files you submit:
# Author: Your Name
# Agent Name:
#
# The author of this code hereby permits it to be included as a part of the ABIDES distribution,
# and for it to be released under any open source license the ABIDES authors choose to release ABIDES under.
Note that if you include other people's code (e.g., snippets from scikit-learn) we may not be able to use your contribution.
Hints & resources
The main difference between this problem and the earlier one is that you must deal with a continuous state space. Deep Q-Learning is one approach to this problem, and you are welcome to consider other solutions if you like. Here are some links to Deep Q-Learning approaches; a small sketch of one simple function-approximation alternative follows the links:
- This blog is a good starting point: http://karpathy.github.io/2016/05/31/rl/
- An overview of Deep RL: https://arxiv.org/abs/1701.07274
- An article in Nature: https://www.nature.com/nature/journal/v518/n7540/full/nature14236.html
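Since a full deep network may be more than you need to get started, here is a minimal sketch of Q-learning with linear function approximation over a continuous state vector, using only NumPy (one of the allowed libraries). The feature encoding, action set, and hyperparameters are all hypothetical; this illustrates the general technique, not a reference solution.

import numpy as np

class LinearQLearner:
    def __init__(self, num_features, num_actions, alpha=0.01, gamma=0.99, epsilon=0.1):
        self.w = np.zeros((num_actions, num_features))  # one weight vector per action
        self.alpha = alpha       # learning rate
        self.gamma = gamma       # discount factor
        self.epsilon = epsilon   # exploration rate
        self.num_actions = num_actions

    def q_values(self, state):
        return self.w @ state    # Q(s, a) = w_a . phi(s) for each action a

    def act(self, state):
        if np.random.rand() < self.epsilon:             # epsilon-greedy exploration
            return np.random.randint(self.num_actions)
        return int(np.argmax(self.q_values(state)))

    def update(self, state, action, reward, next_state):
        # Semi-gradient TD(0) update for the action actually taken.
        target = reward + self.gamma * np.max(self.q_values(next_state))
        td_error = target - self.q_values(state)[action]
        self.w[action] += self.alpha * td_error * state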
What to turn in
Turn your project in via T-Square. All the code necessary to run your learner must be submitted. We will call only the methods of your CRLearner, following the specification described above. You are allowed to access/use library code, but it must be submitted and run as .py files. If you do use code that was not written by you, you must include comments providing proper credit and citations.
- Your CRLearner as CRLearner.py
- Other python files as necessary to support your learner.
Rubric
Only your CRLearner class will be tested.
- The code for the learner must reflect an effort to create a continuous state learner (not a repackaged discrete state learner like Q-Learning).
- We will create a number of groups of test cases, where each group reflects essentially the same navigation problem but at progressively higher resolution, e.g., square worlds of increasing size: 5x5, 10x10, 100x100, 1000x1000, etc. Your learner will not know the dimensions of the world it is in.
- We will test your learner against N test worlds (value of N to be determined later) with 500 iterations in each world. One "iteration" means your robot reaches the goal one time, or the simulation times out. Your CRLearner retains its state, and then we allow it to navigate to the goal again, over and over, 500 times.
- Benchmark: We do not have a reference solution for this problem. We will instead use the best student's submission as the benchmark. We will select a number of test cases that the benchmark can solve, then use those as the cases we test other submissions against. We will take the median reward of the benchmark across all of those 500 iterations.
- Your score: For each world we will take the median reward your solution achieves across all 500 iterations.
- For a test to be successful, your learner should find a total reward >= 1.5 x the benchmark.
- There will be 10 test cases, each test case is worth 9.0 points.
- Is the author() method correctly implemented (-100% if not)
Required, Allowed & Prohibited
Required:
- Your project must be coded in Python 3.x.
- Your code must run on one of the university-provided computers (e.g. buffet02.cc.gatech.edu).
- All code required to run the agent must be submitted. We will not debug your code.
- All code in your submitted agent must be written by you.
Allowed:
- You can develop your code on your personal machine, but it must also run successfully on one of the university-provided machines or virtual images.
- Your code may use standard Python libraries.
- You may use the NumPy, SciPy, matplotlib and Pandas libraries. Be sure you are using the correct versions.
- You may reuse sections of code (up to 5 lines) that you collected from other students or the internet.
- Code provided by the instructor, or allowed by the instructor to be shared.
- You may reuse code from the internet that you include as support files (it must be credited and cited).
Prohibited:
- Any libraries not listed in the "allowed" section above.