This week I worked on a new algorithm class and analysis of our two algorithmic classes for what components can be optimized. The new algorithm class is a UCB1-based load balancer with a key difference being that exploration is enforced directly on the allowed server choices at decision time rather than being a component of score. This is done by maintaining two data structures; one contains the set of k server indices least recently chosen and the other is an ordered list of n-k servers most recently chosen. After a choice is made, the structures are updated accordingly. The least-recent choice is made via some scoring system that is variable. We will test both a direct best response time choice as well as a random variable with response-time based distribution. The number k is also an adjustable variable between 1 and the server count n.
I also find two adjustable variables of interest in last week’s e-greedy algorithm. Specifically, the random variable of epsilon, which can range from 0 to 1 exclusive, the number of round robin iterations for initial exploration, and the behavior to take in the non-epsilon case. This behavior can be easily tested with both random and round robin in the 1-epsilon cases. Next week I hope to create two additional load balancers that use these algorithms with our new metric of Network I/O (regarding which I will also assist Nakul in the data retrieval portion).