Last time, I covered using the io controller from cgroup v2. I considered writing about the io controller again this time, but the io controller's features in cgroup v2 turned out to be more difficult than I had imagined and I did not have a deep enough understanding to write an article, so this time I would like to talk about using the CPU controller from cgroup v2[1].
I wrote about the CPU controller in this series back in 2014, in Chapter 4. At that time cgroup v2 did not exist yet, so that explanation used cgroup v1[2].
Later, cgroup v2 was implemented, along with many features that can only be used with cgroup v2. The cooperation between the io controller and the memory controller introduced last time is one of them.
However, the cgroup v1 and v2 CPU controllers actually have almost no functional differences. Features added after cgroup v2 became stable have generally been implemented in v1 as well[3]. That said, in cgroup v2 the interface files follow the conventions introduced in Chapter 49, and details of the specification have been changed for the better.
Therefore, this time I will introduce how to limit CPU bandwidth from cgroup v2, along with the features that have been added since then, and dig a little deeper into how the bandwidth-limiting feature of the CPU controller works.
Bandwidth limiting with the CPU controller
Let’s start with a quick explanation of how a bandwidth limit is set and how it works.
In cgroup v2, the files shown in Table 1 are the main files used for bandwidth limiting with the CPU controller. For how to configure this in v1, please see Chapter 4.
Table 1: Main files for bandwidth limiting with the CPU controller

| File name | Function | Operation |
|---|---|---|
| cpu.max | Sets the period used as the unit for bandwidth limiting and the limit value within that period. The default is "max 100000" | read/write |
| cpu.stat | Shows CPU usage statistics for tasks in the cgroup | read-only |
In cgroup v1, each setting had its own file, and each file held only one value. In contrast, v2 writes two values into a single file, and reading the file returns them in the same format. This follows the cgroup v2 convention explained in Chapter 49: multiple values separated by spaces.
In cpu.max, you set the period and the limit. Within each period of the set length, tasks can use the CPU only up to the set limit. This works the same way as in v1.
The default value is "max 100000": a period of 100 ms with no limit set. The string "max" means unlimited.
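For example, reading the file of a freshly created cgroup shows this default (a sketch assuming a cgroup named test01, as in the examples later in this article):

$ sudo mkdir /sys/fs/cgroup/test01 (create the test01 cgroup)
$ cat /sys/fs/cgroup/test01/cpu.max
max 100000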
The limit set here is per CPU. In an environment with multiple CPU cores, if you want to allow up to 2 CPUs' worth of time with a 100 ms period, write "200000 100000".
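Written out with tee, as in the later examples, that would look like this (again assuming the test01 cgroup):

$ echo "200000 100000" | sudo tee /sys/fs/cgroup/test01/cpu.max
200000 100000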
Figure 1 illustrates this simply. Here the period is set to "100000" (100 ms), and in 2) a limit of "50000" (50 ms) is set.
Suppose you have a task that needs 200 ms of CPU time to complete. If this task has the CPU to itself and no limit is set, it finishes in 200 ms (Figure 1-1)). However, if we set a limit of 50 ms, the task can run for only 50 ms in each 100 ms period, so it is throttled three times and finishes partway into the fourth period (Figure 1-2)).
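As a rough way to see this on a real machine, you can time a fixed amount of CPU work with and without the limit (a sketch: the loop count that amounts to about 200 ms of work is machine-dependent, and the test01 cgroup set up in the examples below is assumed). With the 50 ms limit in place, the wall-clock time grows from about 200 ms to about 350 ms, as in Figure 1-2):

$ time sh -c 'i=0; while [ "$i" -lt 2000000 ]; do i=$((i+1)); done'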
Having seen in Figure 1 how bandwidth limiting works, let's now look at the contents of the cpu.stat file.
Table 2: Contents of the cpu.stat file

| Key name | Meaning | Appears in |
|---|---|---|
| usage_usec | CPU time used by tasks in the cgroup | cgroup v2 only |
| user_usec | User CPU time used by tasks in the cgroup | cgroup v2 only |
| system_usec | System CPU time used by tasks in the cgroup | cgroup v2 only |
| nr_periods | The number of periods during which tasks in the cgroup could run | always |
| nr_throttled | The number of times tasks in the cgroup reached the limit and were throttled | always |
| throttled_usec (v1: throttled_time) | The total amount of time that tasks in the cgroup could not run because they reached the limit | always |
| nr_bursts | The number of periods during which a burst occurred in the cgroup | kernel 5.14 or later |
| burst_usec (v1: burst_time) | The total CPU time that tasks in the cgroup used beyond the limit by bursting | kernel 5.14 or later |
The first three items in Table 2 do not appear in cgroup v1's cpu.stat. This is probably because v1 has the cpuacct controller, which provides the same statistics. However, note that the values obtained from v1's cpuacct controller and the values in v2's cpu.stat file use different units.
The next three items appear regardless of the cgroup version or the kernel version. Among them, the item whose key name starts with throttled_ uses different units in v1 and v2, which is why the key names differ: v1 is in nanoseconds and v2 is in microseconds, so the key name ends in _time in v1 and in _usec in v2.
nr_bursts and burst_usec are values related to the burst feature (added in kernel 5.14).
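On a kernel that has the burst feature, these keys simply appear in cpu.stat; for a cgroup with no burst configured they stay at zero. A quick check might look like this (a sketch assuming a 5.14 or later kernel and the test01 cgroup used later):

$ grep -E '^(nr_bursts|burst_usec)' /sys/fs/cgroup/test01/cpu.stat
nr_bursts 0
burst_usec 0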
Now let's consider what values nr_periods, nr_throttled, and throttled_usec would take in the case of Figure 1-2):

- nr_periods: the task uses the CPU across 4 periods of 100 ms, so "4"
- nr_throttled: the task hits the limit 3 times, so "3"
- throttled_usec: the task is throttled for 50 ms in each of those 3 periods, 150 ms in total, so "150000"

Values like these would appear.
Setting a bandwidth limit in cgroup v2
So far, I have explained the files used for bandwidth limiting with the cgroup v2 CPU controller. Now, as we did in Chapter 4, let's set a limit value and watch how it behaves.
The examples here were run on Ubuntu 22.04.
Here we create a cgroup called "test01", set a period of 100 ms and a limit of 50 ms, and register the shell's PID in it.
$ sudo mkdir /sys/fs/cgroup/test01 (create the test01 cgroup)
$ echo "50000 100000" | sudo tee /sys/fs/cgroup/test01/cpu.max (set a 100 ms period and a 50 ms limit)
$ echo 5467 | sudo tee /sys/fs/cgroup/test01/cgroup.procs (register the shell's PID in the test01 cgroup)
Now run the following command in the shell with PID 5467.
$ while :; do true ; done
Start another shell and run the top command. You can see that the CPU usage is 50%, as shown below, so the limit is working as configured.
PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
5467 tenforw+  20   0    5044   4172   3504 R  50.0   0.1   0:05.24 bash
Next, let's run a task for a while to see what appears in the cpu.stat file. Create the "test01" cgroup as in the previous example, write "50000 100000" into cpu.max to set the period and limit, then register the shell's PID in "test01".
$ sudo mkdir /sys/fs/cgroup/test01/
$ echo "50000 100000" | sudo tee /sys/fs/cgroup/test01/cpu.max
50000 100000
$ echo 5467 | sudo tee /sys/fs/cgroup/test01/cgroup.procs
5467
In this state, execute the following command in the shell with PID 5467.
$ timeout 1 yes > /dev/null
This uses the CPU fully for 1 second. Immediately after it runs, move the shell from the "test01" cgroup back to the root cgroup so that its activity is no longer counted in "test01", then look at "test01"'s cpu.stat file.
$ cat /sys/fs/cgroup/test01/cpu.stat
:(snip)
nr_periods 17
nr_throttled 10
throttled_usec 495760
The period is 100 ms, so running for 1 second spans 10 periods. The limit should have been in effect the entire time, so the task should have been throttled 10 times. Checking the output, nr_throttled shows the expected value. The task should only have been able to use the CPU for half of each second, and the throttled_usec value is, as expected, almost 500 ms. As for nr_periods, it has been counting since immediately after the cgroup was created, so it does not come out as exactly 10.
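Incidentally, if you want the counters to cover only the measurement itself, one option (a sketch; a cgroup can only be removed once it holds no processes) is to remove and recreate the cgroup right before the run, since the cpu.stat counters start from zero when the cgroup is created:

$ sudo rmdir /sys/fs/cgroup/test01 (possible only when cgroup.procs is empty)
$ sudo mkdir /sys/fs/cgroup/test01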
Assign quotas to CPUs
Now let’s talk about how bandwidth throttling actually works.
The limit is not implemented by counting usage and capping it the moment it is about to be exceeded. Rather than restricting, it is closer to granting the right to use the CPU up to the limit value. This grant of CPU time up to the limit is called a quota.
On a large system with many CPUs, counting and limiting usage would require summing up the usage of every CPU. Doing that frequently can be a heavy burden. It is better to keep the allowance up to the limit value in a place independent of the CPUs in advance, and hand it out to the CPUs from there, which eliminates the aggregation load.
The quota assigned to a cgroup is managed as a CPU-independent global quota pool for each cgroup. CPU time is transferred from this pool to individual CPUs in units called slices. The slice size is defined by the sysctl parameter sched_cfs_bandwidth_slice_us.
$ sudo sysctl -a | grep sched_cfs_bandwidth_slice_us kernel.sched_cfs_bandwidth_slice_us = 5000
As shown above, the default is 5 milliseconds. In other words, CPU time is handed from the quota pool to the CPUs in 5 ms slices. Increasing this value reduces transfer overhead, while lowering it allows finer-grained control; depending on the nature of the workload, it may need to be tuned together with the period and limit.
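To try a different slice size, write the sysctl parameter as usual. Note that this is a system-wide setting that affects every cgroup; the value shown here is just an example:

$ sudo sysctl -w kernel.sched_cfs_bandwidth_slice_us=10000
kernel.sched_cfs_bandwidth_slice_us = 10000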
If the limit is set to 50 ms as in the earlier example, then as shown in Figure 2, up to ten 5 ms slices can be transferred from the global quota pool in each period. When the period expires, the pool is reset and refilled to the configured limit for the next period.
Now let's look at how slices are transferred from the global quota pool to multiple CPUs and consumed.
In Figure 3, a 20 ms limit is set for the cgroup and there are 2 CPUs.
- In (1), a task using CPU1 requested CPU time, a slice was transferred, and the task ran for 5 milliseconds, using up the transferred slice.
- In (2), a task using CPU2 requested CPU time, a slice was transferred, and the task ran for 5 milliseconds, using up the transferred slice.
- In (3), the task on CPU1 requested CPU time again and a slice was transferred; the task ran for 2.5 milliseconds, then later ran again on CPU1 and consumed the remaining 2.5 milliseconds of the slice, for 5 milliseconds in total.
- In (4), the task on CPU2 requested CPU time again, a slice was transferred, and the task ran for 5 milliseconds, using up the transferred slice.
- In (5), the task on CPU1 requested CPU time again. However, no slices remain in the global pool, so the task cannot run for the rest of this period.
- In (6), 100 milliseconds have passed, and as the next period begins the quota pool is refilled to 20 milliseconds.
We have now seen how slices are transferred from the global pool and consumed each time a CPU requests them. If transferred slices are not used up during the period, they are discarded when the period ends, and a new quota equal to the limit value is allocated for the next period.
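To observe this slice exhaustion on a real system, a rough experiment might look like the following (a sketch assuming a machine with at least 2 CPUs and a new cgroup named test02): pin one CPU hog to each of two CPUs under a 20 ms limit, and nr_throttled in cpu.stat should then increase almost every period.

$ sudo mkdir /sys/fs/cgroup/test02
$ echo "20000 100000" | sudo tee /sys/fs/cgroup/test02/cpu.max
$ echo $$ | sudo tee /sys/fs/cgroup/test02/cgroup.procs
$ taskset -c 0 yes > /dev/null &
$ taskset -c 1 yes > /dev/null &
$ sleep 1; grep -E '^(nr_periods|nr_throttled)' /sys/fs/cgroup/test02/cpu.stat
$ kill %1 %2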
Summary
So far, we have used a very simple, idealized case to illustrate how bandwidth limiting works in the CPU controller.
In Figure 3, every task used up the slice it was given, finishing in exactly 5 milliseconds each time. Many of you will have noticed that reality is not like this: there are cases where a task finishes without using up its slice, and cases where it runs for longer.
In fact, bandwidth limiting is not simply a matter of allocating slices to CPUs, resetting the allocations when the period arrives, and allocating new slices in the next period, as described in this article. However, the basic idea is as described here.
Next time, I will explain this slice allocation and return behavior, and take a closer look at bandwidth limits.