1
0
Fork 0
alistair23-linux/net/core
John Fastabend 4f57c087de net: implement mechanism for HW based QOS
This patch provides a mechanism for lower layer devices to
steer traffic using skb->priority to tx queues. This allows
for hardware based QOS schemes to use the default qdisc without
incurring the penalties related to global state and the qdisc
lock. While reliably receiving skbs on the correct tx ring
to avoid head of line blocking resulting from shuffling in
the LLD. Finally, all the goodness from txq caching and xps/rps
can still be leveraged.

Many drivers and hardware exist with the ability to implement
QOS schemes in the hardware but currently these drivers tend
to rely on firmware to reroute specific traffic, a driver
specific select_queue or the queue_mapping action in the
qdisc.

By using select_queue for this drivers need to be updated for
each and every traffic type and we lose the goodness of much
of the upstream work. Firmware solutions are inherently
inflexible. And finally if admins are expected to build a
qdisc and filter rules to steer traffic this requires knowledge
of how the hardware is currently configured. The number of tx
queues and the queue offsets may change depending on resources.
Also this approach incurs all the overhead of a qdisc with filters.

With the mechanism in this patch users can set skb priority using
expected methods ie setsockopt() or the stack can set the priority
directly. Then the skb will be steered to the correct tx queues
aligned with hardware QOS traffic classes. In the normal case with
single traffic class and all queues in this class everything
works as is until the LLD enables multiple tcs.

To steer the skb we mask out the lower 4 bits of the priority
and allow the hardware to configure upto 15 distinct classes
of traffic. This is expected to be sufficient for most applications
at any rate it is more then the 8021Q spec designates and is
equal to the number of prio bands currently implemented in
the default qdisc.

This in conjunction with a userspace application such as
lldpad can be used to implement 8021Q transmission selection
algorithms one of these algorithms being the extended transmission
selection algorithm currently being used for DCB.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-19 23:31:10 -08:00
..
Makefile net: support time stamping in phy devices. 2010-07-18 19:15:26 -07:00
datagram.c Fix a typo in datagram.c and sctp/socket.c. 2010-12-06 13:10:11 -08:00
dev.c net: implement mechanism for HW based QOS 2011-01-19 23:31:10 -08:00
dev_addr_lists.c net: include linux/proc_fs.h in dev_addr_lists.c 2010-04-07 16:46:36 -07:00
drop_monitor.c drop_monitor: use genl_register_family_with_ops() 2010-07-26 20:59:42 -07:00
dst.c net/dst: dst_dev_event() called after other notifiers 2010-11-09 12:17:16 -08:00
ethtool.c ethtool: Report link-down while interface is down 2010-12-10 15:55:23 -08:00
fib_rules.c Revert "ipv4: Allow configuring subnets as local addresses" 2010-12-23 12:03:57 -08:00
filter.c net: filter: dont block softirqs in sk_run_filter() 2011-01-18 21:33:05 -08:00
flow.c net: return operator cleanup 2010-09-23 14:33:39 -07:00
gen_estimator.c pkt_sched: remov unnecessary bh_disable 2010-09-10 12:47:59 -07:00
gen_stats.c net/core: EXPORT_SYMBOL cleanups 2010-07-12 12:57:55 -07:00
iovec.c net: Limit socket I/O iovec total length to INT_MAX. 2010-10-28 11:47:52 -07:00
kmap_skb.h [PATCH] severing skbuff.h -> highmem.h 2006-12-04 02:00:29 -05:00
link_watch.c net/core: EXPORT_SYMBOL cleanups 2010-07-12 12:57:55 -07:00
neighbour.c net: kill unused macros 2010-12-19 21:59:35 -08:00
net-sysfs.c net: use NUMA_NO_NODE instead of the magic number -1 2010-12-16 13:16:06 -08:00
net-sysfs.h xps: Add CONFIG_XPS 2010-11-28 18:24:14 -08:00
net-traces.c netdev: Add tracepoints to netdev layer 2010-09-07 17:51:33 +02:00
net_namespace.c net_ns: add __rcu annotations 2010-10-25 14:18:27 -07:00
netevent.c net/core: EXPORT_SYMBOL cleanups 2010-07-12 12:57:55 -07:00
netpoll.c Merge branch 'for-2.6.38' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq 2011-01-07 16:58:04 -08:00
pktgen.c pktgen: Remove unnecessary prefix from pr_<level> 2010-12-20 10:30:06 -08:00
request_sock.c Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2010-12-08 13:47:38 -08:00
rtnetlink.c netlink: support setting devgroup parameters 2011-01-19 23:31:10 -08:00
scm.c scm: lower SCM_MAX_FD 2010-11-24 11:16:43 -08:00
skbuff.c netfilter: fix compilation when conntrack is disabled but tproxy is enabled 2011-01-12 20:25:08 +01:00
sock.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 2011-01-13 10:25:58 -08:00
stream.c net: Fix the condition passed to sk_wait_event() 2010-10-03 20:41:32 -07:00
sysctl_net_core.c rps: add __rcu annotations 2010-10-25 14:18:27 -07:00
timestamping.c Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2010-12-17 12:27:22 -08:00
user_dma.c net/core/user_dma.c: Use frag list abstraction interfaces. 2009-06-09 00:19:10 -07:00
utils.c net: return operator cleanup 2010-09-23 14:33:39 -07:00