Scientific and engineering research, along with many high-tech applications, demands efficient floating-point computation, which motivated the AVX512 instruction set. In practice, however, a problem arises: the throughput of non-AVX tasks drops unexpectedly when AVX512 tasks run on the same server. This project aims to avoid that throughput loss so that the server achieves the best overall efficiency.
The design consists of three parts: hardware, a reinforcement learning algorithm, and a user interface. On the hardware side, as the number of active cores increases, the corresponding frequency and throughput decrease, and the throughput also differs between AVX512 and non-AVX workloads.
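The hardware behaviour described above can be sketched as a small lookup model. This is only an illustration: the table `FREQ_TABLE`, the function names, and every frequency value are hypothetical placeholders rather than measured data, and throughput is assumed to scale linearly with frequency.

```python
# Hypothetical frequency table in GHz. The values are illustrative
# placeholders, NOT measurements from any real CPU.
# Each entry is (maximum number of active cores, assumed frequency).
FREQ_TABLE = {
    "non_avx": [(4, 3.5), (8, 3.2), (16, 2.9)],
    "avx512": [(4, 3.0), (8, 2.6), (16, 2.2)],
}


def core_frequency(active_cores, workload):
    """Return the assumed per-core frequency for a workload type."""
    for max_cores, ghz in FREQ_TABLE[workload]:
        if active_cores <= max_cores:
            return ghz
    raise ValueError("active_cores exceeds the modelled core count")


def throughput(active_cores, workload, work_per_cycle=1.0):
    """Assumption: per-core throughput is proportional to frequency."""
    return core_frequency(active_cores, workload) * work_per_cycle
```

In this toy model, both more active cores and an AVX512 workload lead to a lower frequency, and therefore to a lower per-core throughput.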
The CPU status data is reformatted into a state table that records the original throughput and the updated throughput. In the gym environment, an action that migrates a workload is generated and its reward is calculated; applying the action updates the CPU status. The user interface displays the CPU status and shows the learning curve.
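A minimal sketch of such an environment is shown below. It follows the Gym-style reset/step convention but is written in plain Python so that it has no dependencies. The class name `MigrationEnv`, the constants, the grouping of cores into frequency domains, and the reward (updated throughput minus original throughput) are all assumptions made for illustration, not the project's actual model.

```python
GROUP_SIZE = 3   # assumed number of cores that share one frequency domain
RATE = {"avx512": 1.5, "non_avx": 1.0}   # assumed work units per task
SLOWDOWN = 0.8   # assumed factor for non-AVX tasks next to an AVX512 task


def total_throughput(cores):
    """Sum the throughput of all tasks under the toy slowdown model."""
    total = 0.0
    for start in range(0, len(cores), GROUP_SIZE):
        group = cores[start:start + GROUP_SIZE]
        has_avx = "avx512" in group
        for task in group:
            if task == "idle":
                continue
            rate = RATE[task]
            if task == "non_avx" and has_avx:
                rate *= SLOWDOWN
            total += rate
    return total


class MigrationEnv:
    """Gym-style reset/step interface, written without the gym package."""

    def __init__(self, layout):
        self.layout = list(layout)
        self.cores = list(layout)

    def reset(self):
        self.cores = list(self.layout)
        return tuple(self.cores)

    def step(self, action):
        """action = (src, dst): move the task on core src to idle core dst."""
        src, dst = action
        before = total_throughput(self.cores)
        # Invalid moves leave the state unchanged and earn zero reward.
        if self.cores[src] != "idle" and self.cores[dst] == "idle":
            self.cores[dst], self.cores[src] = self.cores[src], "idle"
        after = total_throughput(self.cores)
        info = {"original": before, "updated": after}
        return tuple(self.cores), after - before, False, info


env = MigrationEnv(["avx512", "non_avx", "idle", "avx512", "non_avx", "idle"])
env.reset()
obs, reward, done, info = env.step((3, 2))
```

Here the action moves the second AVX512 task next to the first one, which frees the non-AVX task in the other group from the slowdown, so the reward is positive. The `info` dictionary plays the role of one row of the state table, holding the original and the updated throughput.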
The main factors that influence the convergence of training are the action space, the reward function, and the observation space; many attempts were made to modify the model along these three dimensions.
The intelligent tuning framework, built on reinforcement learning (RL), lets customers optimize their servers through general APIs or a user-friendly GUI. The key to achieving this goal is to construct a uniform and robust framework and to abstract server tuning problems as a universal RL environment.
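One way to picture this abstraction is a common interface that every tuning problem implements, so that the same training loop can drive any of them. The sketch below is hypothetical: `TuningEnv`, `KnobEnv`, and `run_episode` are illustrative names, not part of the framework's published API, and the knob problem is a stand-in for a real server setting.

```python
from abc import ABC, abstractmethod


class TuningEnv(ABC):
    """Hypothetical common interface for a server tuning problem."""

    @abstractmethod
    def actions(self):
        """Return the list of actions currently available."""

    @abstractmethod
    def reset(self):
        """Restore the initial configuration and return the observation."""

    @abstractmethod
    def step(self, action):
        """Apply an action and return (observation, reward)."""


class KnobEnv(TuningEnv):
    """Toy problem: move an integer knob towards an assumed best setting."""

    def __init__(self, best=3):
        self.best = best
        self.value = 0

    def actions(self):
        return [-1, +1]

    def reset(self):
        self.value = 0
        return self.value

    def step(self, action):
        # Reward is the reduction in distance to the assumed best setting.
        before = abs(self.value - self.best)
        self.value += action
        return self.value, before - abs(self.value - self.best)


def run_episode(env, policy, steps):
    """Problem-agnostic loop: works with any TuningEnv and any policy."""
    obs = env.reset()
    total = 0.0
    for _ in range(steps):
        obs, reward = env.step(policy(obs, env.actions()))
        total += reward
    return obs, total
```

For example, `run_episode(KnobEnv(best=3), lambda obs, acts: 1, 3)` drives the knob from 0 to 3 with a fixed policy. Because the loop only depends on the abstract interface, a different tuning problem could be swapped in without changing `run_episode`.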