Mitul’s Status Report for 4/30

This week I worked on a new algorithm class and analyzed our two existing algorithm classes to identify which components can be optimized. The new class is a UCB1-based load balancer with one key difference: exploration is enforced directly on the set of allowed server choices at decision time rather than being a component of the score. This is done by maintaining two data structures: a set of the k server indices least recently chosen and an ordered list of the n - k servers most recently chosen. After each choice, both structures are updated accordingly. The selection among the least recently chosen servers uses a configurable scoring scheme; we will test both a direct lowest-average-response-time choice and a random choice with a response-time-based distribution. The number k is also an adjustable parameter between 1 and the server count n.
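Below is a minimal sketch of that forced-exploration step (the class and variable names are placeholders, and I assume a lowest-average-response-time score for the candidate choice):

```ts
// Sketch only: servers are indices 0..n-1; avgResponseTime(i) is assumed to
// return the average of server i's recent response times (lower is better).
class ForcedExplorationBalancer {
  private leastRecent: Set<number>;  // the k servers least recently chosen (candidates)
  private mostRecent: number[] = []; // the other n - k servers, ordered oldest -> newest

  constructor(
    private n: number,
    private k: number,
    private avgResponseTime: (i: number) => number,
  ) {
    // Start by treating the first k servers as the candidate set.
    this.leastRecent = new Set(Array.from({ length: k }, (_, i) => i));
    for (let i = k; i < n; i++) this.mostRecent.push(i);
  }

  choose(): number {
    // Exploration enforced at decision time: only the k least recently chosen
    // servers are allowed choices.
    let best = -1;
    for (const i of this.leastRecent) {
      if (best === -1 || this.avgResponseTime(i) < this.avgResponseTime(best)) best = i;
    }
    // Update both structures: the chosen server becomes the newest entry, and
    // the oldest previously chosen server rejoins the candidate set.
    this.leastRecent.delete(best);
    this.mostRecent.push(best);
    const oldest = this.mostRecent.shift();
    if (oldest !== undefined) this.leastRecent.add(oldest);
    return best;
  }
}
```

The same structure supports the weighted-random variant: instead of taking the minimum over the candidate set, the candidates would be sampled with probabilities derived from their response times.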

I also identified several adjustable variables of interest in last week's e-greedy algorithm: the value of epsilon, which can range from 0 to 1 exclusive; the number of round-robin iterations used for initial exploration; and the behavior to take in the non-epsilon case (see the sketch below). The latter can be easily tested with both random and round-robin selection in the 1 - epsilon case. Next week I hope to create two additional load balancers that use these algorithms with our new metric of network I/O (I will also assist Nakul with the data retrieval portion of that metric).
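For reference, these knobs could be collected into a single options object along these lines (names and defaults are placeholders, not tuned values):

```ts
// Placeholder names; defaults are illustrative only. Note that in our convention
// a draw below epsilon exploits, so epsilon is expected to sit close to 1.
interface EpsilonGreedyOptions {
  epsilon: number;                          // exploit threshold, in (0, 1) exclusive
  warmupRoundRobinPasses: number;           // round-robin passes for initial exploration
  explorationMode: "random" | "roundRobin"; // behavior taken in the 1 - epsilon case
}

const defaultOptions: EpsilonGreedyOptions = {
  epsilon: 0.9,
  warmupRoundRobinPasses: 2,
  explorationMode: "roundRobin",
};
```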

Team Status Report for 4/30

This week, we worked as a group to consolidate and finalize our testing implementation, which currently consists of 3 user classes, each running 20 parallel users (60 total), over a 1-hour test plan. We also expanded our server architecture implementation plan to include 4 different virtual machine setups, each with either uniform or varied VM hardware specifications and either uniform or varied geographic locations. All of this will create additional contexts for testing the relative performance of our load balancers in different environments and use cases.

We have also finalized our load balancer algorithm specifications, including which variables in our custom algorithms can be optimized through repeated testing. Over the next week, we hope to apply our testing suite to the algorithms, both to tune our custom decision-makers and to do the final comparisons and data presentation. We will compile this final information into our last project deliverables (poster, video, and final report).

Mitul’s Status Report for 4/23

This week I leveraged the response time header implementation I completed last week as a metric for load balancing decisions. I completed our first custom load balancing algorithm, which is based on a simple solution to the multi-armed bandit problem. Multi-armed bandit is a classic reinforcement learning scenario in which a gambler (decision maker) must repeatedly play one of many slot machines with initially unknown reward probabilities, trying to maximize reward. I find the multi-armed bandit setting highly related to the load balancing decision, where the performance of each load balancing choice can be thought of as a slot machine reward. The key dilemma is exploitation (choosing the currently best-estimated server) versus exploration (improving the estimates for all servers).

The algorithm is based on a pattern called epsilon-greedy. It starts with an initialization period in which response time info is collected for each server in round-robin fashion. Then, a random value between 0 and 1 is drawn for each decision. If it is less than epsilon, the server with the lowest average response time is chosen (exploitation). Otherwise, a server is chosen at random or in round-robin fashion (exploration). A key difference between load balancing and multi-armed bandit is that server performance is not fixed. Thus, only the most recent 10 response times are retained per server to ensure data relevance and limit memory needs.
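A rough sketch of this selection logic, following the convention above where a draw below epsilon exploits (names are placeholders, not our final code):

```ts
// Sketch of the e-greedy choice described above. recentTimes[i] holds up to the
// 10 most recent response times observed for server i.
class EpsilonGreedyBalancer {
  private recentTimes: number[][];
  private rrIndex = 0; // round-robin cursor, also used during the warm-up period

  constructor(private n: number, private epsilon: number, private windowSize = 10) {
    this.recentTimes = Array.from({ length: n }, () => []);
  }

  recordResponseTime(server: number, ms: number): void {
    const window = this.recentTimes[server];
    window.push(ms);
    if (window.length > this.windowSize) window.shift(); // keep only recent data
  }

  choose(): number {
    // Warm-up: keep cycling round robin until every server has at least one sample.
    if (this.recentTimes.some((w) => w.length === 0)) {
      return this.rrIndex++ % this.n;
    }
    if (Math.random() < this.epsilon) {
      // Exploit: pick the lowest average recent response time.
      const avg = (w: number[]) => w.reduce((a, b) => a + b, 0) / w.length;
      let best = 0;
      for (let i = 1; i < this.n; i++) {
        if (avg(this.recentTimes[i]) < avg(this.recentTimes[best])) best = i;
      }
      return best;
    }
    // Explore: random choice shown here; a round-robin variant is equally simple.
    return Math.floor(Math.random() * this.n);
  }
}
```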

I also wrote an outline for our final presentation slides and am almost finished with another load balancing algorithm based on UCB1. Over the next week, I plan to complete this new algorithm and then build two more, analogous to the e-greedy and UCB1-based ones, that leverage a new metric our team is implementing for the video servers: network input/output volume.

Team Status Report for 4/23

This week we met several times as a team to pin down some important implementation details of our project. On testing, we moved from our previous baseline of a custom-built server that sent HTTP requests to a plan to use a third-party load testing platform that can run user simulation scripts. This change was made to accommodate the need for realistic user request patterns (requesting video chunks according to the different buffer rates of videos with different specs such as dynamism and resolution), which would be difficult without direct HTML interaction. Furthermore, a third-party platform lets us consolidate user-collected metrics in one place that can compile them into graphs and reports directly. We attempted debugging several different platforms (Flood.io, Dotcom-Monitor, BlazeMeter) and script types (Taurus, Selenium, JMeter) but do not have a fully satisfactory load testing setup yet. We hope to have that early next week.

We also adjusted our plan for video server metric collection. Some analysis of the impact of video server responses showed us that network I/O is a much more likely bottleneck for our video servers than processor utilization, so we now intend to retrieve that metric instead for a custom load balancing decider. The metric would be obtained either via asynchronous background monitoring in our own program or via API calls to Amazon CloudWatch, as sketched below. Over the next week, we hope to integrate our testing environment fully and finish three custom load balancing deciders for comparison.
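If we take the CloudWatch route, the retrieval might look roughly like the following (AWS SDK v3 assumed; the region, instance IDs, and one-minute period are placeholders, and one-minute granularity requires detailed monitoring on the instances):

```ts
import { CloudWatchClient, GetMetricStatisticsCommand } from "@aws-sdk/client-cloudwatch";

const cloudwatch = new CloudWatchClient({ region: "us-east-1" }); // placeholder region

// Fetch the average NetworkOut bytes for one video server over the last minute.
async function recentNetworkOut(instanceId: string): Promise<number | undefined> {
  const end = new Date();
  const start = new Date(end.getTime() - 60_000);
  const result = await cloudwatch.send(new GetMetricStatisticsCommand({
    Namespace: "AWS/EC2",
    MetricName: "NetworkOut",
    Dimensions: [{ Name: "InstanceId", Value: instanceId }],
    StartTime: start,
    EndTime: end,
    Period: 60,
    Statistics: ["Average"],
  }));
  return result.Datapoints?.[0]?.Average;
}
```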

Mitul’s Status Report for 4/16

This week I worked on the traditional load balancer based on processor utilization percentage, as well as response time calculation for our custom load balancer. We originally planned to get processor utilization metrics from AWS CloudWatch, an interactive monitoring tool for AWS EC2 instances. Unfortunately, I had issues trying to collect data in increments shorter than 5 minutes, and such a delay would be excessive for load balancing decisions that happen in seconds. My current approach is to calculate processor utilization directly in our video server code and send it back as an additional response header for the LB to parse. I may have to adjust how often this calculation happens based on how much load it creates.
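A sketch of that in-process approach, assuming our Node.js video server (the header name and one-second sampling interval are placeholders):

```ts
import os from "os";

// Background CPU sampling so each request only reads a cached value.
let lastTimes = sumCpuTimes();
let cpuUtilization = 0; // fraction of non-idle CPU time over the last interval

function sumCpuTimes() {
  let busy = 0, total = 0;
  for (const { times } of os.cpus()) {
    busy += times.user + times.nice + times.sys + times.irq;
    total += times.user + times.nice + times.sys + times.irq + times.idle;
  }
  return { busy, total };
}

setInterval(() => {
  const now = sumCpuTimes();
  const totalDelta = now.total - lastTimes.total;
  cpuUtilization = totalDelta > 0 ? (now.busy - lastTimes.busy) / totalDelta : 0;
  lastTimes = now;
}, 1000);

// In the response path (e.g. with Express), the cached value would be attached as:
//   res.setHeader("x-cpu-utilization", cpuUtilization.toFixed(3));
```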

On the response time end, we had to significantly change our plan for collecting response time due to our new understanding that expecting this information from users, simulated or otherwise, is infeasible. Instead, we now plan to measure response time from when the load balancer forwards a request to its chosen server to when that server sends its corresponding response. Specifically, the load balancer would send the start time as a request header, and the video server would include the response time as a response header.
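A rough sketch of that header handshake, with node-http-proxy on the load balancer side and Express on the video server side (both sides shown together for brevity; the header names and route are placeholders):

```ts
import httpProxy from "http-proxy";
import express from "express";

// Load balancer side: stamp the dispatch time onto every forwarded request.
const proxy = httpProxy.createProxyServer({});
proxy.on("proxyReq", (proxyReq) => {
  proxyReq.setHeader("x-lb-start-time", Date.now().toString());
});

// Video server side: read the stamp and report the elapsed time back.
const app = express();
app.get("/chunk/:id", (req, res) => {
  const start = Number(req.header("x-lb-start-time"));
  const chunk = loadChunk(req.params.id); // placeholder for the real chunk lookup
  if (!Number.isNaN(start)) {
    // Set just before sending so it covers queueing plus processing time.
    res.setHeader("x-response-time-ms", String(Date.now() - start));
  }
  res.send(chunk);
});

// Stand-in for the real chunk retrieval.
function loadChunk(id: string): Buffer {
  return Buffer.from(`chunk ${id}`);
}
```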

I also documented a more refined initial approach for our custom load balancer. Because the volume of datapoints is not a good metric for choosing a server (old data is far less useful in load balancing than in traditional multi-armed bandit reinforcement learning), I find that epsilon-greedy is a better starting-point algorithm than UCB1. Furthermore, to ensure recent data is valued most and to reduce memory concerns, the algorithm will discard older datapoints (e.g., only the most recent 20 response times are kept for decisions). I also plan to prevent the algorithm from choosing any of the last k chosen servers again. This accounts for delays in metric updates while also promoting dynamism in server exploration and monitoring.

Team Status Report for 4/10

This week we met regularly to share personal updates and repository code for our working video stream with load balancing. Specifically, a new repository was made for streaming from the Amazon S3 bucket, and we communicated the details of the script additions necessary for EC2 deployment. A similar process was done later in the week for the random and round-robin proxy servers that load balanced across the streaming deployment group. We also logged several different repositories and their respective purposes, categorized as experimental, locally working, or deployment ready.

We also managed to resolve a lengthy issue with the initial deployment of the video streaming servers. We intended to use AWS CodeDeploy to automate group deployment of many identical servers. However, this was an unfamiliar process for us and presented new problems. The issues were threefold: missing security permissions for our intended port usage, improper cleanup of residual files when processes ended, and inaccurate file navigation in the virtual machine. We now know to keep an eye on these aspects for future deployments, and we benefited from a smooth deployment of the load balancing proxies. Solving major issues from individual assignments as a group worked well, so we plan to continue that practice going forward.

Mitul’s Status Report for 4/10

Last week I was in the process of adjusting our video application to retrieve video from separate MongoDB database servers. However, we discussed it and instead decided to replace that with a single auto-scaling Amazon S3 bucket to serve our database needs. This will let us more easily upload new videos and verify any database integration errors on the AWS consoles. This past week I completed the app adjustment to retrieve video from S3. Unfortunately, the AWS Node.js SDK did not fully follow the structure of our previous video chunk retrieval framework. I cut out that framework so that we could have a working deployment, but unfortunately the video currently preloads in full and does not support scrubbing. I will be fixing that issue over the following week.
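One likely direction for that fix is to honor the browser's HTTP Range requests against S3 rather than returning the whole object, which is what lets the video element seek. A rough sketch with the AWS SDK v3 (the bucket name, region, and route are placeholders):

```ts
import express from "express";
import { S3Client, GetObjectCommand } from "@aws-sdk/client-s3";
import { Readable } from "stream";

const s3 = new S3Client({ region: "us-east-1" }); // placeholder region
const app = express();

// Serve video objects with Range support so the player can scrub/seek.
app.get("/video/:key", async (req, res) => {
  const range = req.headers.range; // e.g. "bytes=0-1048575"
  const result = await s3.send(new GetObjectCommand({
    Bucket: "my-video-bucket", // placeholder bucket name
    Key: req.params.key,
    Range: range,              // forward the browser's byte range, if any
  }));
  res.status(range ? 206 : 200);
  if (result.ContentRange) res.setHeader("Content-Range", result.ContentRange);
  if (result.ContentLength) res.setHeader("Content-Length", String(result.ContentLength));
  res.setHeader("Content-Type", result.ContentType ?? "video/mp4");
  res.setHeader("Accept-Ranges", "bytes");
  (result.Body as Readable).pipe(res);
});
```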

I also did some research on node-http-proxy as a module to enable our simple and lightweight load balancers, specifically the random choice and round robin (fixed sequential choice) ones. Both of these load balancers can use high-level proxy functions for a fully pass-through request-response architecture in which the contents of the stream do not need to be read. I completed locally working versions of these load balancers, each able to proxy for the 10 AWS front-end servers described in the first paragraph. I'm currently working on extracting front-end response times for our first reinforcement learning load balancers, and I plan to build a latency-based framework around this response time to complete our first reinforcement learning algorithm by next week.
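A condensed version of that pass-through setup (the target addresses are placeholders for the 10 deployed front-end instances):

```ts
import http from "http";
import httpProxy from "http-proxy";

// Placeholder front-end addresses.
const targets = ["http://10.0.0.1:8080", "http://10.0.0.2:8080", "http://10.0.0.3:8080"];
const proxy = httpProxy.createProxyServer({});

let rr = 0;
function pickTarget(mode: "random" | "roundRobin"): string {
  if (mode === "random") return targets[Math.floor(Math.random() * targets.length)];
  return targets[rr++ % targets.length];
}

// Fully pass-through: the proxy never inspects the streamed video bytes.
http.createServer((req, res) => {
  proxy.web(req, res, { target: pickTarget("roundRobin") });
}).listen(80);
```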

Mitul’s Status Report for 4/2

This week, I proposed a system architecture change to the group. Because we are ultimately only monitoring the performance of a load balancer and its effect on the overall system, we can reduce our minimum viable product to a single user-facing application load balancer that connects users' browsers to a chosen server node from the front-end tier. This architecture allows us to simplify the back-end structure to one of two options: either each front-end node has a dedicated video database to retrieve from, or all front-end nodes share a scalable database that they can all retrieve from. I am debugging a MongoDB implementation of the first option as a test.

As we are waiting on Jason's group deployment automation for testing the system within an AWS environment, I also began researching our options for user simulation. Flood.io is a promising third-party solution I found and plan to bring up to the team soon. It is a scalable load testing platform that can be driven directly by user simulation scripts and provides its own automated application performance monitoring. Furthermore, Flood can be integrated directly with an AWS deployment. A considerable amount of testing (500 user hours) can be done for free before we may need to spend some of our budget on further tests.