Pankaj Tanwar
Published on

Building a tiny load balancing service using PID Controllers 🚘

––– views

Recently, I came across an engineering blog by Dropbox, talking about their latest iteration of Robinhood (an in-house built load balancing service). It is designed to dynamically adjust node weights for even load distribution by leveraging Proportional Integral Derivative (PID) controllers.

I always had a blurry memory of control theory - But I never really had the opportunity to put it into practice. So, I decided to spent my Christmas evening implementing a PID controller (a mini-version of Robinhood service in python) and observing how well it works for simulations. Yeah, my search skills are pathetic - on the web, I couldn't find any good resource explaining PID controller implementation. So, this post is for future me and my fellow strangers on the internet trying to understand PID controller.

What is a PID Controller?

A proportional–integral–derivative controller (PID controller or three-term controller) is a feedback-based control loop mechanism commonly used to manage machines and processes that require continuous control and automatic adjustment.

Thanks Wikipedia, but that went slightly over my head. Let's hear a simpler version:

PID controller is a beautiful system - It gets some input data, calculates the difference between the actual and desired output values - and automatically adjusts the control input to keep the resulting output close to the desired output value. In fancier terms, the input data are process variables, desired output value is the setpoint, difference between actual and desired value is the error rate - and PID controller adjusts the process variables to keep error rate as close to zero as possible.

As the name suggests - it consists of 3 components

  1. Proportional (P) reacts to the current error. A large error results in a larger adjustment
  2. Integral (I) accounts for past errors to get rid of accumulated steady-state errors
  3. Derivatives (D) predicts future errors by reacting to the rate of error change

PID controllers are the superheroes of industrial engineers BUT but they also have some interesting use cases in the computer science domain. Let's understand more by breaking this down into mathematical equations, and we'll keep an example handy as well to see how it actually works.

Example - We have a car currently running at 50 kmph, but we need to adjust its speed to maintain 60 kmph

1. Error Calculation

Easy-peasy, simply subtract the two values above to get the error

e(t)=set point−current valuee(t) = \text{set point} - \text{current value}

e(t) is error at time t. It will be 10 kmph (60 kmph - 50 kmph) as per our example.

2. Proportional (P)

To react to the current error, we need to have a scaling constant (let's call it Kp). It relates the size of the error to the size of the input.

P=Kp∗e(t)P = Kp * e(t)

So, if Kp = 2 then P = 2 * 10 = 20, this is the throttle increase we need to bring the speed closer to 60 kmph. Cool right? Wait, If current speed of a car is 59kmph, error is 1kmph. The proportional response may be too small to bridge this gap. How do we solve that? Welcome my another friend - Integral.

3. Integral (I)

As we talked above, it solves the problem of steady state errors. Error of 1 kmph is a steady error.

I=Ki∗∫0te(t) dtI = K_i * \int_0^t e(t) \, dt \quad

So, If we keep Ki = 0.5, then after 10 seconds

I=0.5∗(1∗10)=5 throttle increaseI = 0.5 * (1 * 10) = 5 \text{ throttle increase}

Perfect, we adjusted to 60 kmph now.

4. Derivative (D)

Derivative predicts future errors based on rate of change of the error. It will help us prevent overshooting as sometime P & I can mess up together.

Let's say the car is going from 55 kmph to 60 kmph, so overtime the value of error decreases really fast as the car nears the setpoint (60 kmph). At 55 kmph, error is 5, at 58 kmph its 2 and at 59.5 kmph its 0.5. So the system might overshoot 60 kmph as P & I do not slow down their corrections as the car nears the setpoint.

D=Kdâ‹…de(t)dtD = K_d \cdot \frac{d e(t)}{dt}

With Kd = 1, when error changes from 2 to 0.5 in 1 second, D will become -1.5 basically a throttle decrease.

5. PID output

It's nothing just the sum of each.

Output=P+I+DOutput = P + I + D

Ah ok, I know it's boring at this point. How about taking a practical example and try to code it out to see if this even make sense?

Building Mini-Robinhood

Let's build a mini-version of Robinhood. We have a bunch of nodes. Each node has a weight attached. Weight determines how much traffic that node will get - very famous weighted round robin LB algo. The goal is to make sure that each node's utilization is close to the avg utilization. We will achieve this by using a PID controller. It will dynamically adjust the weight of nodes.

Let's gooooo.

Step 1 - PID Controller class

class PIDController:
def __init__(self, kp, ki, kd):
self.kp = kp # proportional gain
self.ki = ki # integral gain
self.kd = kd # derivative gain
self.prev_error = 0 # previous error for derivative term
self.integral_error = 0 # commulative error for integral term
def compute(self, setpoint, current_value):
"""
It will compute the out of the PID controller
arguments
setpoint: target value (avg utilization)
current_value: current utilization of the node
returns
adjustment to be applied to the node's weight
"""
error = setpoint - current_value
self.integral_error += error
derivative = error - self.prev_error
pid_output = self.kp * error + self.ki * self.integral_error + self.kd * derivative
self.prev_error = error
return pid_output

Step 2 - Mini Robinhood Code & Demo Class

import random
from pid_controller import PIDController
# nodes - with their respective sample weights and utilization
nodes = [
{ "node_id": 1, "weight": 0.5, "utilization": 1 },
{ "node_id": 2, "weight": 0.7, "utilization": 1 },
{ "node_id": 3, "weight": 0.4, "utilization": 1 },
]
# distributing traffic among nodes
def distribute_traffic():
total_weight = sum(node["weight"] for node in nodes)
for node in nodes:
node["utilization"] += random.uniform(0.05, 0.15) * ( node["weight"] / total_weight ) # just throwing traffic as per their weights
def avg_utilization():
return sum(node["utilization"] for node in nodes) / len(nodes)
# --------- Mini Robinhood Demo ---- #
class MiniRobinhoodDemo:
@staticmethod
def run():
# setting up a pid_controller for each node
pid_controllers = [
PIDController(1, 0.1, 0.5) for _ in nodes
]
# simulating for 10 iterations
for itr in range(10):
print('Iteration: ', itr)
avg_util = avg_utilization()
print(f"Average Utilization: {avg_util}")
# now, adjusting weights using PID controllers
for index, node in enumerate(nodes):
adjustment = pid_controllers[index].compute(avg_util, node["utilization"])
node["weight"] += adjustment
print(f" Node {node['node_id']} - Utilization: {node['utilization']:.2f}, New Weight: {node['weight']:.2f}")
distribute_traffic()
print('----------------------------------')
print('Done')
if __name__ == "__main__":
MiniRobinhoodDemo.run()

Just go ahead and hit run, the output is gonna look like.

pankaj.tanwar@the2ndfloorguy-X0 mini-robinhood % python3 mini_robinhood.py
Iteration: 0
Average Utilization: 1.0
Node 1 - Utilization: 1.00, New Weight: 0.50
Node 2 - Utilization: 1.00, New Weight: 0.70
Node 3 - Utilization: 1.00, New Weight: 0.40
----------------------------------
Iteration: 1
Average Utilization: 1.036338004761167
Node 1 - Utilization: 1.03, New Weight: 0.51
Node 2 - Utilization: 1.06, New Weight: 0.66
Node 3 - Utilization: 1.02, New Weight: 0.43
----------------------------------
Iteration: 2
Average Utilization: 1.0746097021837298
Node 1 - Utilization: 1.07, New Weight: 0.52
Node 2 - Utilization: 1.10, New Weight: 0.63
Node 3 - Utilization: 1.06, New Weight: 0.45
----------------------------------
Iteration: 3
..................

It shows pretty clearly how we adjusted the weights dynamically. Feel free to make any tweaks to the solution on my GitHub.

Use cases of PID controller

  1. Self Driving Cars : Steering control is pretty popular. The idea is that the car must stay in the center of the lane while compensating for the deviations due to road curves, wind or may be even errors in the sensory system. The PID controller, automatically adjusts the steering angle to minimize the error (difference between car's position and the center of the lane). Other quite common scenarios are cruise control, path following and my favourite suspension control.
  2. Task Schedulers : The controller can prioritize and allocate tasks to ensure efficient use of resources.
  3. Cinnamon by Uber used this century old tech to build a Mean Load Shedder.

It was a surprisingly fun and rewarding exercise to dive into. Now that I’ve wrapped up my ramblings, I invite your comments on it. If you find any technical inaccuracies, let me know, please. I'm active on X (twitter) as @the2ndfloorguy and if you are interested in what an unfunny & strange programmer will do next, see you there!