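Traffic control overview
On Linux, the tc (traffic control) framework, together with its implementations and tools, controls egress traffic and even ingress traffic. "Control" here means delaying packets,
dropping packets, corrupting packets, limiting bandwidth, applying limits to rules that match a particular IP, and so on, in order to simulate abnormal network conditions, send packets by priority, or do more besides.
According to the manual, it mainly provides the following kinds of control:
SHAPING: smooths traffic bursts, for example by limiting the transmission rate to below the available bandwidth; applies on egress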
       When traffic is shaped, its rate of transmission is under
       control. Shaping may be more than lowering the available
       bandwidth - it is also used to smooth out bursts in
       traffic for better network behaviour. Shaping occurs on
       egress.
SCHEDULING: applies on egress; schedules the transmission of packets, for example by priority
       By scheduling the transmission of packets it is possible
       to improve interactivity for traffic that needs it while
       still guaranteeing bandwidth to bulk transfers. Reordering
       is also called prioritizing, and happens only on egress.
POLICING: applies to ingress traffic
       Whereas shaping deals with transmission of traffic,
       policing pertains to traffic arriving. Policing thus
       occurs on ingress.
DROPPING: drops packets when traffic exceeds a threshold; applies on both ingress and egress
       Traffic exceeding a set bandwidth may also be dropped
       forthwith, both on ingress and on egress.
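The difference between shaping and policing/dropping can be sketched with a token bucket. The following is a minimal illustration, not kernel code: `struct token_bucket` and the `tb_*` functions are hypothetical names, time is passed in by the caller, and fractional tokens lost to integer division are ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical token bucket; names and fields are illustrative, not kernel API. */
struct token_bucket {
    uint64_t rate_bytes_per_sec; /* refill rate */
    uint64_t burst_bytes;        /* bucket depth, i.e. largest allowed burst */
    uint64_t tokens;             /* bytes that may be sent right now */
    uint64_t last_ns;            /* timestamp of the last refill */
};

void tb_init(struct token_bucket *tb, uint64_t rate, uint64_t burst, uint64_t now_ns)
{
    assert(rate > 0 && burst > 0);
    tb->rate_bytes_per_sec = rate;
    tb->burst_bytes = burst;
    tb->tokens = burst;          /* start with a full bucket */
    tb->last_ns = now_ns;
}

/* Credit tokens for the elapsed time, capped at the bucket depth. */
void tb_refill(struct token_bucket *tb, uint64_t now_ns)
{
    uint64_t elapsed = now_ns - tb->last_ns;
    uint64_t add = elapsed * tb->rate_bytes_per_sec / 1000000000ULL;

    tb->tokens = (tb->tokens + add > tb->burst_bytes) ? tb->burst_bytes
                                                      : tb->tokens + add;
    tb->last_ns = now_ns;
}

/* Policing/dropping: a packet either conforms (true) or is dropped (false). */
bool tb_police(struct token_bucket *tb, uint32_t len, uint64_t now_ns)
{
    tb_refill(tb, now_ns);
    if (tb->tokens < len)
        return false;
    tb->tokens -= len;
    return true;
}

/* Shaping: the packet is never dropped here; the return value is how long
 * it has to wait in the queue before enough tokens exist (0 = send now). */
uint64_t tb_shape_delay_ns(struct token_bucket *tb, uint32_t len, uint64_t now_ns)
{
    tb_refill(tb, now_ns);
    if (tb->tokens >= len) {
        tb->tokens -= len;
        return 0;
    }
    return (len - tb->tokens) * 1000000000ULL / tb->rate_bytes_per_sec;
}
```

With a rate of 1000 bytes/s and a burst of 1500 bytes, one 1500-byte packet passes immediately; a following 100-byte packet is dropped by the policer, whereas the shaper would hold it for 100 ms.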
Example:
ref: https://netbeez.net/blog/how-to-use-the-linux-traffic-control/
View: at this point only the default first-in, first-out (FIFO) rule exists
- Command breakdown:
 qdisc: modify the scheduler (aka queuing discipline); in practice everything relies on the qdisc mechanism
 add: add a new rule, i.e. add a queuing discipline
 dev eth0: rules will be applied on device eth0; a qdisc is normally attached to a network interface
 root: modify the outbound traffic scheduler (also known as the egress qdisc)
 netem: use the network emulator to emulate a WAN property
 delay: the network property that is modified
 200ms: introduce delay of 200 ms
tc is the user-space command that systems such as Linux provide; the commands used here are shell commands:
More: https://man7.org/linux/man-pages/man8/tc.8.html
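Basic implementation principle of traffic control
In the Linux kernel, traffic control is implemented as QoS, which in practice relies on qdisc queues, mainly the egress queue.
At the link layer, once a packet has passed through the neighbour subsystem (in other words, once it has left the protocol stack), dev_queue_xmit (dev.c) goes on to call the device driver's transmit function
to send it out. The qdisc queue and its queuing discipline sit between the call to dev_queue_xmit and the device driver's transmit function.
Implementation and basic flow of traffic control:
Relevant code:
sch_generic.c
Flow
Overall processing flow in the kernel, and where it sits:
net/core/dev.c
Structure of traffic control:
Traffic control is built from three basic elements: queuing disciplines (qdiscs), classes, and filters.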
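- Queuing discipline (qdisc):
 With traffic control enabled, every network device has at least one qdisc configured. Qdiscs range from a simple fifo buffer to a token bucket; more elaborate qdiscs usually have to manage several queues.
 Common qdiscs include fifo and the token bucket filter, tbf.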
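- Classes of qdisc:
 A qdisc has at least one queue. It may be simple, like the fifo qdisc, or complex, like the token bucket. Qdiscs are usually divided into classless and classful. A classless qdisc is simple and cannot contain configurable subclasses or inner qdiscs,
 whereas a classful qdisc can contain several classes, and each class can in turn hold a qdisc of its own. That qdisc is called the inner qdisc and may itself be classful or classless.
 A classless qdisc cannot be subdivided into classes by the user, whereas a classful one can.
- Implementations are split into classful and classless qdiscs:
 Classless: pfifo, pfifo_fast, red, sfq, tbf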
 Classful: cbq, htb, prio
- For example, the default:
 qdisc pfifo_fast 0: dev eth0 root refcnt 2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
- In the tc command shown below, the trailing qdisc [qdisc specific parameters] selects the concrete qdisc type:
 Creating a qdisc:
 tc [ OPTIONS ] qdisc [ add | change | replace | link | delete ]
 dev DEV [ parent qdisc-id | root ] [ handle qdisc-id ] [
 ingress_block BLOCK_INDEX ] [ egress_block BLOCK_INDEX ] qdisc [
 qdisc specific parameters ]
 eg:
 Currently supported qdiscs. Classless:
 The classless qdiscs are:
 choke CHOKe (CHOose and Keep for responsive flows, CHOose and
 Kill for unresponsive flows) is a classless qdisc designed
 to both identify and penalize flows that monopolize the
 queue. CHOKe is a variation of RED, and the configuration
 is similar to RED.
 codel CoDel (pronounced "coddle") is an adaptive "no-knobs"
 active queue management algorithm (AQM) scheme that was
 developed to address the shortcomings of RED and its
 variants.
 [p|b]fifo
 Simplest usable qdisc, pure First In, First Out behaviour.
 Limited in packets or in bytes.
 fq Fair Queue Scheduler realises TCP pacing and scales to
 millions of concurrent flows per qdisc.
 fq_codel
 Fair Queuing Controlled Delay is queuing discipline that
 combines Fair Queuing with the CoDel AQM scheme. FQ_Codel
 uses a stochastic model to classify incoming packets into
 different flows and is used to provide a fair share of the
 bandwidth to all the flows using the queue. Each such flow
 is managed by the CoDel queuing discipline. Reordering
 within a flow is avoided since Codel internally uses a
 FIFO queue.
 fq_pie FQ-PIE (Flow Queuing with Proportional Integral controller
 Enhanced) is a queuing discipline that combines Flow
 Queuing with the PIE AQM scheme. FQ-PIE uses a Jenkins
 hash function to classify incoming packets into different
 flows and is used to provide a fair share of the bandwidth
 to all the flows using the qdisc. Each such flow is
 managed by the PIE algorithm.
 gred Generalized Random Early Detection combines multiple RED
 queues in order to achieve multiple drop priorities. This
 is required to realize Assured Forwarding (RFC 2597).
 hhf Heavy-Hitter Filter differentiates between small flows and
 the opposite, heavy-hitters. The goal is to catch the
 heavy-hitters and move them to a separate queue with less
 priority so that bulk traffic does not affect the latency
 of critical traffic.
 ingress
 This is a special qdisc as it applies to incoming traffic
 on an interface, allowing for it to be filtered and
 policed.
 mqprio The Multiqueue Priority Qdisc is a simple queuing
 discipline that allows mapping traffic flows to hardware
 queue ranges using priorities and a configurable priority
 to traffic class mapping. A traffic class in this context
 is a set of contiguous qdisc classes which map 1:1 to a
 set of hardware exposed queues.
 multiq Multiqueue is a qdisc optimized for devices with multiple
 Tx queues. It has been added for hardware that wishes to
 avoid head-of-line blocking. It will cycle though the
 bands and verify that the hardware queue associated with
 the band is not stopped prior to dequeuing a packet.
 netem Network Emulator is an enhancement of the Linux traffic
 control facilities that allow to add delay, packet loss,
 duplication and more other characteristics to packets
 outgoing from a selected network interface.
 pfifo_fast
 Standard qdisc for 'Advanced Router' enabled kernels.
 Consists of a three-band queue which honors Type of
 Service flags, as well as the priority that may be
 assigned to a packet.
 pie Proportional Integral controller-Enhanced (PIE) is a
 control theoretic active queue management scheme. It is
 based on the proportional integral controller but aims to
 control delay.
 red Random Early Detection simulates physical congestion by
 randomly dropping packets when nearing configured
 bandwidth allocation. Well suited to very large bandwidth
 applications.
 rr Round-Robin qdisc with support for multiqueue network
 devices. Removed from Linux since kernel version 2.6.27.
 sfb Stochastic Fair Blue is a classless qdisc to manage
 congestion based on packet loss and link utilization
 history while trying to prevent non-responsive flows (i.e.
 flows that do not react to congestion marking or dropped
 packets) from impacting performance of responsive flows.
 Unlike RED, where the marking probability has to be
 configured, BLUE tries to determine the ideal marking
 probability automatically.
 sfq Stochastic Fairness Queueing reorders queued traffic so
 each 'session' gets to send a packet in turn.
 tbf The Token Bucket Filter is suited for slowing traffic down
 to a precisely configured rate. Scales well to large
 bandwidths.
 For classless qdiscs, note the following when adding one:
 In the absence of classful qdiscs, classless qdiscs can only be
 attached at the root of a device. Full syntax:
 tc qdisc add dev DEV root QDISC QDISC-PARAMETERS
 To remove, issue
 tc qdisc del dev DEV root
 The pfifo_fast qdisc is the automatic default in the absence of a
 configured qdisc.
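 The `bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1` part of the default pfifo_fast listing shown earlier can be read as a lookup table from packet priority to band. The sketch below copies that table verbatim; the function names are hypothetical, and the strict "lowest non-empty band first" dequeue order is a simplified model of the three-band behaviour the man page describes.

```c
#include <assert.h>
#include <stdint.h>

#define PRIO_MAX 15   /* priorities are masked to 0..15 before the lookup */
#define BANDS    3

/* The priomap from the `qdisc pfifo_fast ...` listing, copied verbatim. */
static const uint8_t priomap[PRIO_MAX + 1] = {
    1, 2, 2, 2, 1, 2, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1
};

/* Map a packet priority to one of the three bands (0 is served first). */
int band_for_priority(uint32_t priority)
{
    int band = priomap[priority & PRIO_MAX];

    assert(band >= 0 && band < BANDS);
    return band;
}

/* Given the current length of each band, return the band that would be
 * dequeued next, or -1 when every band is empty. */
int next_band_to_dequeue(const unsigned int qlen[BANDS])
{
    for (int band = 0; band < BANDS; band++)
        if (qlen[band] > 0)
            return band;
    return -1;
}
```

 Priority 0 (best effort) lands in band 1, priority 6 in band 0, and priority 1 in band 2, so band 2 only gets to send when bands 0 and 1 are empty.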
 Classful:
 ATM Map flows to virtual circuits of an underlying
 asynchronous transfer mode device.
 CBQ Class Based Queueing implements a rich linksharing
 hierarchy of classes. It contains shaping elements as
 well as prioritizing capabilities. Shaping is performed
 using link idle time calculations based on average packet
 size and underlying link bandwidth. The latter may be ill-
 defined for some interfaces.
 DRR The Deficit Round Robin Scheduler is a more flexible
 replacement for Stochastic Fairness Queuing. Unlike SFQ,
 there are no built-in queues -- you need to add classes
 and then set up filters to classify packets accordingly.
 This can be useful e.g. for using RED qdiscs with
 different settings for particular traffic. There is no
 default class -- if a packet cannot be classified, it is
 dropped.
 DSMARK Classify packets based on TOS field, change TOS field of
 packets based on classification.
 ETS The ETS qdisc is a queuing discipline that merges
 functionality of PRIO and DRR qdiscs in one scheduler. ETS
 makes it easy to configure a set of strict and bandwidth-
 sharing bands to implement the transmission selection
 described in 802.1Qaz.
 HFSC Hierarchical Fair Service Curve guarantees precise
 bandwidth and delay allocation for leaf classes and
 allocates excess bandwidth fairly. Unlike HTB, it makes
 use of packet dropping to achieve low delays which
 interactive sessions benefit from.
 HTB The Hierarchy Token Bucket implements a rich linksharing
 hierarchy of classes with an emphasis on conforming to
 existing practices. HTB facilitates guaranteeing bandwidth
 to classes, while also allowing specification of upper
 limits to inter-class sharing. It contains shaping
 elements, based on TBF and can prioritize classes.
 PRIO The PRIO qdisc is a non-shaping container for a
 configurable number of classes which are dequeued in
 order. This allows for easy prioritization of traffic,
 where lower classes are only able to send if higher ones
 have no packets available. To facilitate configuration,
 Type Of Service bits are honored by default.
 QFQ Quick Fair Queueing is an O(1) scheduler that provides
 near-optimal guarantees, and is the first to achieve that
 goal with a constant cost also with respect to the number
 of groups and the packet length. The QFQ algorithm has no
 loops, and uses very simple instructions and data
 structures that lend themselves very well to a hardware
 implementation.
 
 # Create a qdisc:
 tc qdisc add dev eth0 root handle 1:0 htb default 1
 # Adds an htb qdisc as the root of eth0 with handle 1:0; unclassified traffic goes to class 1:1 by default
 # handle: names this qdisc, or refers to an existing one
- Kernel representation of a qdisc:
 Structures describing a qdisc:
 struct Qdisc {
     int (*enqueue)(struct sk_buff *skb,   /* enqueue and dequeue: the two functions mentioned above */
                    struct Qdisc *sch,
                    struct sk_buff **to_free);
     struct sk_buff * (*dequeue)(struct Qdisc *sch);
     unsigned int flags;
     const struct Qdisc_ops *ops;          /* qdisc operations; every qdisc (e.g. pfifo, tbf) must implement this interface */
     struct qdisc_size_table __rcu *stab;
     struct list_head list;
     u32 handle;                           /* matches tc's handle; qdiscs, classes and filters are each identified by a 32-bit handle */
     u32 parent;                           /* parent handle */
     void *u32_node;
     struct netdev_queue *dev_queue;       /* ties the qdisc to the net device's queue */
     struct sk_buff_head q;                /* the packet queue itself; its length is the number of packets currently queued */
     /* ... */
 };
 struct Qdisc_ops {
     struct Qdisc_ops *next;               /* links the ops of all registered qdiscs */
     const struct Qdisc_class_ops *cl_ops; /* class operations provided by this qdisc */
     char id[IFNAMSIZ];
     int priv_size;
     int (*enqueue)(struct sk_buff *skb,   /* adds a packet to the qdisc */
                    struct Qdisc *sch,
                    struct sk_buff **to_free);
     struct sk_buff * (*dequeue)(struct Qdisc *);
     struct sk_buff * (*peek)(struct Qdisc *);
     int (*init)(struct Qdisc *, struct nlattr *arg);   /* initialises the qdisc */
     void (*reset)(struct Qdisc *);
     void (*destroy)(struct Qdisc *);
     int (*change)(struct Qdisc *, struct nlattr *arg);
     void (*attach)(struct Qdisc *);
     int (*dump)(struct Qdisc *, struct sk_buff *);
     int (*dump_stats)(struct Qdisc *, struct gnet_dump *);
     struct module *owner;
 };
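 The 32-bit `handle` field in `struct Qdisc` corresponds to the `MAJOR:MINOR` notation used on the tc command line, such as `1:0`. A minimal sketch of that mapping follows, assuming the usual layout of major number in the upper 16 bits and minor number in the lower 16 bits, with both parts written in hexadecimal. The function names are hypothetical helpers, not kernel or iproute2 API.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Pack major (upper 16 bits) and minor (lower 16 bits) into one handle. */
uint32_t make_handle(uint16_t major, uint16_t minor)
{
    return ((uint32_t)major << 16) | minor;
}

uint16_t handle_major(uint32_t h) { return (uint16_t)(h >> 16); }
uint16_t handle_minor(uint32_t h) { return (uint16_t)(h & 0xFFFF); }

/* Parse "MAJOR:MINOR" (hex digits). An empty minor, as in "1:", means 0.
 * Returns 0 on success and -1 on malformed input. */
int parse_handle(const char *s, uint32_t *out)
{
    char *end;
    unsigned long maj, min = 0;

    assert(s != NULL && out != NULL);
    maj = strtoul(s, &end, 16);
    if (end == s || *end != ':' || maj > 0xFFFF)
        return -1;

    const char *p = end + 1;
    if (*p != '\0') {
        min = strtoul(p, &end, 16);
        if (end == p || *end != '\0' || min > 0xFFFF)
            return -1;
    }
    *out = make_handle((uint16_t)maj, (uint16_t)min);
    return 0;
}
```

 Under these assumptions the root qdisc `1:0` becomes 0x00010000 and a class `1:1` under it becomes 0x00010001, which is why a class and its parent qdisc share the same major number.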
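- Class:
 A class is defined inside a qdisc. Filters inspect packets and assign them to different classes. A qdisc may have no classes at all, like the first-in, first-out fifo, or it may have several.
 A class can also hold an inner qdisc: once a filter has assigned a packet to a class, the packet leaves through that class's qdisc, fifo or otherwise. That qdisc is the inner qdisc.
 Creating a class:
 tc [ OPTIONS ] class [ add | change | replace | delete ] dev DEV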
 parent qdisc-id [ classid class-id ] qdisc [ qdisc specific
 parameters ]
 eg:
 # Create a class
 tc class add dev eth0 parent 1:0 classid 1:1 htb rate 10Mbit burst 15k
 # Adds a class under eth0's root qdisc 1:0, named 1:1, of type htb, with a rate of 10 Mbit/s
- Representation of a class:
 In Linux a class is represented by an xxx_class structure, for example for htb:
 struct htb_class {
 struct Qdisc_class_common common;
 struct psched_ratecfg rate;
 struct psched_ratecfg ceil;
 s64 buffer, cbuffer;/* token bucket depth/rate */
 s64 mbuffer; /* max wait time */
 u32 prio; /* these two are used only by leaves... */
 int quantum; /* but stored for parent-to-leaf return */
 struct tcf_proto __rcu *filter_list; /* class attached filters, i.e. the class's filter chain */
 int filter_cnt;
 int refcnt; /* usage count of this class */
 int level; /* our level (see above) */
 unsigned int children;
 struct htb_class *parent; /* parent class */
 struct gnet_stats_rate_est64 rate_est;
 /*
 * Written often fields
 */
 struct gnet_stats_basic_packed bstats;
 struct tc_htb_xstats xstats; /* our special stats */
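- Filter
 A filter is a concrete matching rule used for classification. It contains a number of match conditions, and a packet that satisfies them is assigned to a specific class.
A class typically has at least one filter feeding it, and may have several.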
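The way filters steer packets into classes can be sketched as a first-match lookup. This is an illustration only: `struct packet`, `struct filter` and `classify` are hypothetical and far simpler than the kernel's `struct tcf_proto` machinery, and the fallback class plays the role of the `default` parameter in the htb example earlier.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily simplified packet: destination IPv4 address only. */
struct packet {
    uint32_t dst_ip;
};

/* Hypothetical filter: match a destination prefix, then name a class. */
struct filter {
    uint32_t dst_ip;    /* prefix to match, host byte order */
    uint32_t dst_mask;  /* prefix mask, e.g. 0xFFFFFF00 for /24 */
    uint32_t classid;   /* target class handle, e.g. 0x00010002 for 1:2 */
};

/* Walk the filter list in order; the first match decides the class.
 * Packets that match nothing fall back to default_classid. */
uint32_t classify(const struct filter *filters, size_t n,
                  const struct packet *pkt, uint32_t default_classid)
{
    assert(pkt != NULL);
    for (size_t i = 0; i < n; i++) {
        const struct filter *f = &filters[i];

        if ((pkt->dst_ip & f->dst_mask) == (f->dst_ip & f->dst_mask))
            return f->classid;
    }
    return default_classid;
}
```

Because the first match wins, the order of the filters matters: a broad prefix placed before a narrower one would shadow it.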
tc command:
tc [ OPTIONS ] filter [ add | change | replace | delete | get ]
Kernel structure:
struct tcf_proto {
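Qdisc examples: pfifo, tbf, etc.
fifo is a good way to learn how a qdisc is implemented.
By default the qdisc is a pfifo variant (pfifo_fast, per the man page quoted above), which is attached to the device on the dev_open path. To use a different one, tc goes through netlink, which then updates the dev structure and related state.
struct Qdisc noop_qdisc = {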
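To make the fifo idea concrete, here is a user-space toy that mirrors the shape of `struct Qdisc` and `struct Qdisc_ops`: an ops table with `enqueue` and `dequeue` function pointers, and a packet-limited FIFO behind it. All `toy_*` names are hypothetical, and the fixed-size ring buffer is a simplification chosen for this sketch rather than how the kernel stores packets.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_LIMIT_MAX 64

struct toy_pkt { int id; };
struct toy_qdisc;

/* Counterpart of Qdisc_ops: what every toy qdisc has to provide. */
struct toy_qdisc_ops {
    const char *id;
    int (*enqueue)(struct toy_qdisc *q, struct toy_pkt *p);
    struct toy_pkt *(*dequeue)(struct toy_qdisc *q);
};

/* Counterpart of Qdisc: the ops pointer plus the queue state. */
struct toy_qdisc {
    const struct toy_qdisc_ops *ops;
    struct toy_pkt *ring[TOY_LIMIT_MAX];
    unsigned int head, qlen, limit, drops;
};

/* pfifo enqueue: append at the tail, or tail-drop once the limit is hit. */
int toy_pfifo_enqueue(struct toy_qdisc *q, struct toy_pkt *p)
{
    if (q->qlen >= q->limit) {
        q->drops++;
        return -1;
    }
    q->ring[(q->head + q->qlen) % TOY_LIMIT_MAX] = p;
    q->qlen++;
    return 0;
}

/* pfifo dequeue: hand out the oldest packet, or NULL when empty. */
struct toy_pkt *toy_pfifo_dequeue(struct toy_qdisc *q)
{
    struct toy_pkt *p;

    if (q->qlen == 0)
        return NULL;
    p = q->ring[q->head];
    q->head = (q->head + 1) % TOY_LIMIT_MAX;
    q->qlen--;
    return p;
}

const struct toy_qdisc_ops toy_pfifo_ops = {
    "pfifo", toy_pfifo_enqueue, toy_pfifo_dequeue
};

void toy_qdisc_init(struct toy_qdisc *q, const struct toy_qdisc_ops *ops,
                    unsigned int limit)
{
    assert(limit > 0 && limit <= TOY_LIMIT_MAX);
    q->ops = ops;
    q->head = q->qlen = q->drops = 0;
    q->limit = limit;
}
```

Callers only ever go through `q->ops->enqueue` and `q->ops->dequeue`, so a different discipline could be swapped in by pointing `ops` at another table, which is the role the ops interface plays in the structures shown earlier.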
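The netlink interface of the tc tool
It is defined in sch_api.c and mainly operates on the classes and filters within qdiscs.
tc communicates with the kernel over netlink, and that is how it creates and modifies QoS settings.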
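As a rough idea of what such a netlink message looks like, the sketch below only builds a "dump all qdiscs" request in memory using the rtnetlink definitions from the Linux UAPI headers (`struct nlmsghdr`, `struct tcmsg`, `RTM_GETQDISC`). The wrapper struct and function name are hypothetical, and sending the buffer over an `AF_NETLINK` / `NETLINK_ROUTE` socket and parsing the replies are deliberately left out. This is not a claim about how tc itself lays out its requests.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/* Hypothetical wrapper: netlink header followed by the traffic-control
 * message header, laid out back to back as they go on the wire. */
struct qdisc_dump_req {
    struct nlmsghdr nlh;
    struct tcmsg    tcm;
};

/* Fill in a request asking the kernel to dump qdiscs.
 * ifindex is stored in the tcmsg; a caller would typically still filter
 * the replies by interface itself. */
void build_qdisc_dump_request(struct qdisc_dump_req *req, int ifindex)
{
    assert(req != NULL);
    memset(req, 0, sizeof(*req));

    req->nlh.nlmsg_len   = NLMSG_LENGTH(sizeof(struct tcmsg));
    req->nlh.nlmsg_type  = RTM_GETQDISC;              /* "get qdisc" */
    req->nlh.nlmsg_flags = NLM_F_REQUEST | NLM_F_DUMP; /* return all entries */
    req->nlh.nlmsg_seq   = 1;

    req->tcm.tcm_family  = AF_UNSPEC;
    req->tcm.tcm_ifindex = ifindex;
}
```

Creating or changing a qdisc follows the same pattern with a different message type and extra attributes appended after the `tcmsg`, which is the part that carries the qdisc kind and its parameters.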
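This article only outlines the flow and gives a concrete picture of it, so that the reader knows tc's rough principles and framework; that should make further study easier and give a basis for looking things up.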