1
0
Fork 0
Commit Graph

634724 Commits (c540594f864bb4645573c2c0a304919fabb3d7ea)

Author SHA1 Message Date
Daniel Borkmann c540594f86 bpf, mlx4: fix prog refcount in mlx4_en_try_alloc_resources error path
Commit 67f8b1dcb9 ("net/mlx4_en: Refactor the XDP forwarding rings
scheme") added a bug in that the prog's reference count is not dropped
in the error path when mlx4_en_try_alloc_resources() is failing from
mlx4_xdp_set().

We previously took bpf_prog_add(prog, priv->rx_ring_num - 1), that we
need to release again. Earlier in the call path, dev_change_xdp_fd()
itself holds a reference to the prog as well (hence the '- 1' in the
bpf_prog_add()), so a simple atomic_sub() is safe to use here. When
an error is propagated, then bpf_prog_put() is called eventually from
dev_change_xdp_fd()

Fixes: 67f8b1dcb9 ("net/mlx4_en: Refactor the XDP forwarding rings scheme")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-12 23:31:57 -05:00
Ivan Khoronzhuk b602e491a5 net: ethernet: ti: davinci_cpdma: free memory while channel destroy
While create/destroy channel operation memory is not freed. It was
supposed that memory is freed while driver remove. But a channel
can be created and destroyed many times while changing number of
channels with ethtool.

Based on net-next/master

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-12 21:06:50 -05:00
David S. Miller 6b8609d859 Merge branch 'hns-fixes'
Salil Mehta says:

====================
Bug fixes & Code improvements in HNS driver

This patch-set introduces some bug fixes and code improvements.
These have been identified during internal review or testing of
the driver by internal Hisilicon teams.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-10 11:45:50 -05:00
Kejian Yan 66355f52ca net: hns: add the support to add/remove the ucast entry to/from table
This patch adds the support to add or remove the unicast entries
to the table and remove from the table.

Reported-by: Daode Huang <huangdaode@hisilicon.com>
Signed-off-by: Kejian Yan <yankejian@huawei.com>
Reviewed-by: Yisen Zhuang <yisen.zhuang@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-10 11:45:37 -05:00
Kejian Yan ec2cafe682 net: hns: add multicast tcam table clear
There is no clear operation before add a new multicast tcam table,
so the tcam table will be overflow when add more entries.

Reported-by: Daode Huang <huangdaode@hisilicon.com>
Signed-off-by: Kejian Yan <yankejian@huawei.com>
Reviewed-by: Yisen Zhuang <yisen.zhuang@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-10 11:45:37 -05:00
Qianqian Xie 590457f4ec net: hns: modify tcam table of mask_key
The packets of wrong mac address(only the last bit is different) can be
received in Big-endian by current definition of mask_key. Thus it needs
to be modified to support Big-endian and ensure Big-endian normal.

Signed-off-by: Qianqian Xie <xieqianqian@huawei.com>
Reviewed-by: Yisen Zhuang <yisen.zhuang@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-10 11:45:37 -05:00
Qianqian Xie 9d189b853c net: hns: modify tcam table of mac mc-entry
The current definition of mac_mc_entry is only suitable for
Little-endian. Thus it needs to modify tcam table of mac mc-entry
to support both Little-endian and Big-endian.

Signed-off-by: Qianqian Xie <xieqianqian@huawei.com>
Reviewed-by: Yisen Zhuang <yisen.zhuang@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-10 11:45:37 -05:00
Qianqian Xie 928971b6bc net: hns: modify tcam table of mac mc-port
Little-endian is only supported by current tcam table to add
or delete mac mc-port. This patch makes it support both
Little-endian and Big-endian.

Signed-off-by: Qianqian Xie <xieqianqian@huawei.com>
Reviewed-by: Yisen Zhuang <yisen.zhuang@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-10 11:45:37 -05:00
Qianqian Xie 39a6c9ebcb net: hns: modify table index to get mac entry
Big-endian is not supported by the current definition of table index to get
mac entry. It needs to be modified to support both Little-endian
and Big-endian.

Signed-off-by: Qianqian Xie <xieqianqian@huawei.com>
Reviewed-by: Yisen Zhuang <yisen.zhuang@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-10 11:45:37 -05:00
Qianqian Xie c9c0b37072 net: hns: modify tcam table of mac uc-entry
The current definition of mac_uc_entry is only suitable for
Little-endian. Thus it needs to modify tcam table of mac uc-entry
to support both Little-endian and Big-endian.

Signed-off-by: Qianqian Xie <xieqianqian@huawei.com>
Reviewed-by: Yisen Zhuang <yisen.zhuang@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-10 11:45:37 -05:00
Qianqian Xie 5483bfcb16 net: hns: modify tcam table and set mac key
The current definition of dsaf_drv_tbl_tcam_key is only suitable for
Little-endian. If data is stored in Big-endian, this may lead to
error in data use. Shift operation can make it work normally in both
Big-endian and Little-endian.

Signed-off-by: Qianqian Xie <xieqianqian@huawei.com>
Reviewed-by: Yisen Zhuang <yisen.zhuang@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-10 11:45:37 -05:00
Qianqian Xie d30721d459 net: hns: modify buffer format of cpu data to le64
Hardware ring buffer data is stored in Little-endian. Thus cpu data
should be modified to Little-endian.

Signed-off-by: Qianqian Xie <xieqianqian@huawei.com>
Reviewed-by: Yisen Zhuang <yisen.zhuang@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-10 11:45:37 -05:00
Daode Huang 20b3385aaa net: hns: fix to intimate the link-status change by adding LF/RF method
In current scenario, when the interface is disabled we reset the XGMAC
RX/TX functionality. This operation does not affects the PHY layer/SFP
and which appears UP to the remote end(this behaviour is unlike GMAC).
The result is remote end keeps on sending the packets which gets partly
processed by XMAC and dropped. Since these are partly processed these
appears as errored packets in the packet counter statistics.

This patch fixes this behaviour and adds local-fault and remote-fault
functionality which can be used to intimate the remote peer whenever
the state of the interface changes. This patch also removes the
existing hns_dsaf_xge_core_srst_by_port function which was being used
to reset the RX/TX functionality at XGE Core.

Reported-by: Jun He <hjat2005@huawei.com>
Signed-off-by: Daode Huang <huangdaode@hisilicon.com>
Reviewed-by: Yisen Zhuang <yisen.zhuang@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-10 11:45:37 -05:00
Qianqian Xie 28b3012400 net: hns: modify ethtool statistics value error
This patch modify the gmac_rx_filt_pkt and gmac_rx_octets_total_filt
statistics value. The two statistics is inconsistent with register,
and just the opposite.

Signed-off-by: Qianqian Xie <xieqianqian@huawei.com>
Signed-off-by: Jun He <hjat2005@huawei.com>
Reviewed-by: Yisen Zhuang <yisen.zhuang@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-10 11:45:37 -05:00
Qianqian Xie f165e03b94 net: hns: delete redundant macro definition
This patch deletes redundant macro definitions in hns drivers.
And change the .h file containing relation to make the layers
more clearly

Signed-off-by: Qianqian Xie <xieqianqian@huawei.com>
Signed-off-by: Weiwei Deng <dengweiwei@huawei.com>
Reviewed-by: Yisen Zhuang <yisen.zhuang@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-10 11:45:37 -05:00
Daode Huang da2ef1e558 net: hns: bug fix about restart auto-negotiation
When set auto-negotiation off and duplex half, if run "ethtool -r ethX"
on port with phy, then the port will be failed to work. It should
forbid to start auto-negotiation when auto-negotiate is off. This
patch add the limited condition.

Reported-by: Jinchuang Tian <tianjinchuang1@huawei.com>
Signed-off-by: Daode Huang <huangdaode@hisilicon.com>
Reviewed-by: Yisen Zhuang <yisen.zhuang@huawei.com>
Reviewed-by: lipeng <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-10 11:45:37 -05:00
Daode Huang 2e7c80577e net: hns: set default mac pause time to 0xffff
The default mac pause time set to 0xff which is too short for pausing,
this patch change it to the max value 0xffff.

Signed-off-by: Daode Huang <huangdaode@hisilicon.com>
Reviewed-by: Yisen Zhuang <yisen.zhuang@huawei.com>
Reviewed-by: lipeng <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-10 11:45:37 -05:00
Kejian Yan 1f5fa2dd1c net: hns: fix for promisc mode in HNS driver
If set promisc mode when there is some traffic, The service nic will
cause system halted. We reserve the last 6 tcam entry for the 6 ports.
If promisc mode is enabled, we can config the relative tcam as fuzzy
matching and set to be valid, or set the tcam to be invalid

Signed-off-by: Kejian Yan <yankejian@huawei.com>
Reviewed-by: Yisen Zhuang <yisen.zhuang@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-10 11:45:37 -05:00
Kejian Yan 153b1d4870 net: hns: add fuzzy match of tcam table for hns
Since there is not enough tcam table entries for vlan and multicast
address, HNSv2 needs to add support of fuzzy matching of TCAM tables.
To add fuzzy match of TCAM, we Add the property to mask the bits to
be fuzzy matched

Signed-off-by: Kejian Yan <yankejian@huawei.com>
Reviewed-by: Yisen Zhuang <yisen.zhuang@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-10 11:45:37 -05:00
Kejian Yan edd9a29829 Doc: hisi: hns adds mc-mac-mask property
Since there is not enough tcam table entries for every vlan and multicast
address, HNS needs to add support of fuzzy matching of TCAM tables. Adding
the property to mask the bits to be fuzzy matched, so update the bindings
document

Signed-off-by: Kejian Yan <yankejian@huawei.com>
Reviewed-by: Yisen Zhuang <yisen.zhuang@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-10 11:45:37 -05:00
Eric Dumazet f522a5fcf6 tcp: remove unaligned accesses from tcp_get_info()
After commit 6ed46d1247 ("sock_diag: align nlattr properly when
needed"), tcp_get_info() gets 64bit aligned memory, so we can avoid
the unaligned helpers.

Suggested-by: David Miller <davem@davemloft.net>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-09 22:51:33 -05:00
David S. Miller d401c1d1e8 This feature and cleanup patchset includes the following changes:
- netlink and code cleanups by Sven Eckelmann (3 patches)
 
  - Cleanup and minor fixes by Linus Luessing (3 patches)
 
  - Speed up multicast update intervals, by Linus Luessing
 
  - Avoid (re)broadcast in meshes for some easy cases,
    by Linus Luessing
 
  - Clean up tx return state handling, by Sven Eckelmann (6 patches)
 
  - Fix some special mac address handling cases, by Sven Eckelmann
    (3 patches)
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdBQJYI6GFFhxzd0BzaW1vbnd1bmRlcmxpY2guZGUACgkQoSvjmEKS
 nqHyEw/9GkYNRQJOk1JMuW0cDvj9uWqoendvXRNPVkCvqh4gjX4o+aQaeyumv1/v
 eYqpslQWmSsrIlGQ6UGCegzyzZ7jXo6ZijM7wvz2bWwB2C0NzUGlBBCzOeA6Bui/
 3Fq+Xmx0Xcf5+c82YmrLor/yYp4FIFTao4+a80vHzQeI/Hg8RuJTOFJdtVNV3JPP
 VrfzMAPLLXJPPKHjt1PN3lfANWqX6nWLUMhHBNkMpYB+mMdyaCve6X+MxPF+WYBH
 wBO8spU35chW7dp8HOncof5nRDv2xVHWs6TN2kdJ762YrZ1oL0GXwWXViKhWskSQ
 QEeOLboyj3IuwPsxOQOLQEbAMrp6jqj3L/6lYWRkV2U6Bbi8EYdozW8L3utxMcvA
 Dft8D2U5JAD5ja0VUFyGhwNaBFien2B9JSEwsyOLtUbaQSASNyvym75WrN2Ey/d7
 JhBzUt6Iwh8RNJylY3nG5OkoNnyXYv3VrQLsIW4QTHc8Um9eaiOeFHtuAi6WNBtI
 HgMwPcdErNbmPd3w9OM6kk6aBQ/DTUK/7CNUKYVoGDayGxKYGDwqhoog9zm18wrt
 wc/TtdIY+q95hgm8fDCJefrnkaIxDJrVtChs30N/pJ24MeKcHuibop3HzxIngze2
 zPZTuXRKA2VSt79+EV4KORAutexi1WQIN7nRH1a8zMsYfyMKG8Y=
 =1xrj
 -----END PGP SIGNATURE-----

Merge tag 'batadv-next-for-davem-20161108-v2' of git://git.open-mesh.org/linux-merge

Simon Wunderlich says:

====================
pull request for net-next: batman-adv 2016-11-08 v2

This feature and cleanup patchset includes the following changes:

 - netlink and code cleanups by Sven Eckelmann (3 patches)

 - Cleanup and minor fixes by Linus Luessing (3 patches)

 - Speed up multicast update intervals, by Linus Luessing

 - Avoid (re)broadcast in meshes for some easy cases,
   by Linus Luessing

 - Clean up tx return state handling, by Sven Eckelmann (6 patches)

 - Fix some special mac address handling cases, by Sven Eckelmann
   (3 patches)
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-09 22:15:28 -05:00
David S. Miller a6dfdb4e1c Merge branch 'PHC-freq-fine-tuning'
Richard Cochran says:

====================
PHC frequency fine tuning

This series expands the PTP Hardware Clock subsystem by adding a
method that passes the frequency tuning word to the the drivers
without dropping the low order bits.  Keeping those bits is useful for
drivers whose frequency resolution is higher than 1 ppb.

The appended script (below) runs a simple demonstration of the
improvement.  This test needs two Intel i210 PCIe cards installed in
the same PC, with their SDP0 pins connected by copper wire.  Measuring
the estimated offset (from the ptp4l servo) and the true offset (from
the PPS) over one hour yields the following statistics.

|        |   Est. Before |    Est. After |   True Before |    True After |
|--------+---------------+---------------+---------------+---------------|
| min    | -5.200000e+01 | -1.600000e+01 | -3.100000e+01 | -1.000000e+00 |
| max    | +5.700000e+01 | +2.500000e+01 | +8.500000e+01 | +4.000000e+01 |
| pk-pk: | +1.090000e+02 | +4.100000e+01 | +1.160000e+02 | +4.100000e+01 |
| mean   | +6.472222e-02 | +1.277778e-02 | +2.422083e+01 | +1.826083e+01 |
| stddev | +1.158006e+01 | +4.581982e+00 | +1.207708e+01 | +4.981435e+00 |

Here the numbers in units of nanoseconds, and the ~20 nanosecond PPS
offset is due to input/output delays on the i210's external interface
logic.

With the series applied, both the peak to peak error and the standard
deviation improve by a factor of more than two.  These two graphs show
the improvement nicely.

  http://linuxptp.sourceforge.net/fine-tuning/fine-est.png

  http://linuxptp.sourceforge.net/fine-tuning/fine-tru.png
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-09 21:20:01 -05:00
Richard Cochran e4788b800f ptp: dp83640: Use the high resolution frequency method.
The dp83640 has a frequency resolution of about 0.029 ppb.
This patch lets users of the device benefit from the
increased frequency resolution when tuning the clock.

Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-09 21:19:53 -05:00
Richard Cochran c79e975e1f ptp: igb: Use the high resolution frequency method.
The 82580 and related devices offer a frequency resolution of about
0.029 ppb.  This patch lets users of the device benefit from the
increased frequency resolution when tuning the clock.

Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-09 21:19:53 -05:00
Richard Cochran d8d2635419 ptp: Introduce a high resolution frequency adjustment method.
The internal PTP Hardware Clock (PHC) interface limits the resolution for
frequency adjustments to one part per billion.  However, some hardware
devices allow finer adjustment, and making use of the increased resolution
improves synchronization measurably on such devices.

This patch adds an alternative method that allows finer frequency tuning
by passing the scaled ppm value to PHC drivers.  This value comes from
user space, and it has a resolution of about 0.015 ppb.  We also deprecate
the older method, anticipating its removal once existing drivers have been
converted over.

Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Suggested-by: Ulrik De Bie <ulrik.debie-os@e2big.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-09 21:19:53 -05:00
Eric Dumazet 149d6ad836 net: napi_hash_add() is no longer exported
There are no more users except from net/core/dev.c
napi_hash_add() can now be static.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-09 21:16:05 -05:00
Eric Dumazet ef8d759b52 bnxt_en: do not call napi_hash_add()
This is automatically done from netif_napi_add(), and we want to not
export napi_hash_add() anymore in the following patch.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Michael Chan <michael.chan@broadcom.com>
Acked-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-09 21:16:05 -05:00
Tobias Klauser de464375da bpf: Remove unused but set variables
Remove the unused but set variables min_set and max_set in
adjust_reg_min_max_vals to fix the following warning when building with
'W=1':

  kernel/bpf/verifier.c:1483:7: warning: variable ‘min_set’ set but not used [-Wunused-but-set-variable]

There is no warning about max_set being unused, but since it is only
used in the assignment of min_set it can be removed as well.

They were introduced in commit 484611357c ("bpf: allow access into map
value arrays") but seem to have never been used.

Cc: Josef Bacik <jbacik@fb.com>
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-09 21:15:00 -05:00
Yotam Gigi f41cd11d64 tc_act: Remove tcf_act macro
tc_act macro addressed a non existing field, and was not used in the
kernel source.

Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-09 21:14:05 -05:00
David S. Miller 5db5b39515 Merge branch 'ipv6-sr'
David Lebrun says:

====================
net: add support for IPv6 Segment Routing

v5:
 - Check SRH validity when adding a new route with lwtunnels and
   when setting an IPV6_RTHDR socket option.
 - Check that hdr->segments_left is not out of bounds when processing
   an SR-enabled packet.
 - Add __ro_after_init attribute to seg6_genl_policy structure.
 - Add CONFIG_IPV6_SEG6_INLINE option to enable or disable
   direct header insertion.

v4:
 - Change @cleanup in ipv6_srh_rcv() from int to bool
 - Move checksum helper functions into header file
 - Add common definition for SR TLVs
 - Add comments for HMAC computation algorithm
 - Use rhashtable to store HMAC infos instead of linked list
 - Remove packed attribute for struct sr6_tlv_hmac
 - Use dst cache only if CONFIG_DST_CACHE is enabled

v3:
 - Fix compilation for CONFIG_IPV6={n,m}

v2:
 - Remove packed attribute from sr6 struct and replaced unaligned
   16-bit flags with two 8-bit flags.
 - SR code now included by default. Option CONFIG_IPV6_SEG6_HMAC
   exists for HMAC support (which requires crypto dependencies).
 - Replace "hidden" calls to mutex_{un,}lock to direct calls.
 - Fix reverse xmas tree coding style.
 - Fix cast-from-void*'s.
 - Update skb->csum to account for SR modifications.
 - Add dst_cache in seg6_output.

Segment Routing (SR) is a source routing paradigm, architecturally
defined in draft-ietf-spring-segment-routing-09 [1]. The IPv6 flavor of
SR is defined in draft-ietf-6man-segment-routing-header-02 [2].

The main idea is that an SR-enabled packet contains a list of segments,
which represent mandatory waypoints. Each waypoint is called a segment
endpoint. The SR-enabled packet is routed normally (e.g. shortest path)
between the segment endpoints. A node that inserts an SRH into a packet
is called an ingress node, and a node that is the last segment endpoint
is called an egress node.

From an IPv6 viewpoint, an SR-enabled packet contains an IPv6 extension
header, which is a Routing Header type 4, defined as follows:

struct ipv6_sr_hdr {
        __u8    nexthdr;
        __u8    hdrlen;
        __u8    type;
        __u8    segments_left;
        __u8    first_segment;
        __u8    flag_1;
        __u8    flag_2;
        __u8    reserved;

        struct in6_addr segments[0];
};

The first 4 bytes of the SRH is consistent with the Routing Header
definition in RFC 2460. The type is set to `4' (SRH).

Each segment is encoded as an IPv6 address. The segments are encoded in
reverse order: segments[0] is the last segment of the path, and
segments[first_segment] is the first segment of the path.

segments[segments_left] points to the currently active segment and
segments_left is decremented at each segment endpoint.

There exist two ways for a packet to receive an SRH, we call them
encap mode and inline mode. In the encap mode, the packet is encapsulated
in an outer IPv6 header that contains the SRH. The inner (original) packet
is not modified. A virtual tunnel is thus created between the ingress node
(the node that encapsulates) and the egress node (the last segment of the path).
Once an encapsulated SR packet reaches the egress node, the node decapsulates
the packet and performs a routing decision on the inner packet. This kind of
SRH insertion is intended to use for routers that encapsulates in-transit
packet.

The second SRH insertion method, the inline mode, acts by directly inserting
the SRH right after the IPv6 header of the original packet. For this method,
if a particular flag (SR6_FLAG_CLEANUP) is set, then the penultimate segment
endpoint must strip the SRH from the packet before forwarding it to the last
segment endpoint. This insertion method is intended to use for endhosts,
however it is also used for in-transit packets by some industry actors.
Note that directly inserting extension headers may break several mechanisms
such as Path MTU Discovery, IPSec AH, etc. For this reason, this insertion
method is only available if CONFIG_IPV6_SEG6_INLINE is enabled.

Finally, the SRH may contain TLVs after the segments list. Several types of
TLVs are defined, but we currently consider only the HMAC TLV. This TLV is
an answer to the deprecation of the RH0 and enables to ensure the authenticity
and integrity of the SRH. The HMAC text contains the flags, the first_segment
index, the full list of segments, and the source address of the packet. While
SR is intended to use mostly within a single administrative domain, the HMAC
TLV allows to verify SR packets coming from an untrusted source.

This patches series implements support for the IPv6 flavor of SR and is
logically divided into the following components:

        (1) Data plane support (patch 01). This patch adds a function
            in net/ipv6/exthdrs.c to handle the Routing Header type 4.
            It enables the kernel to act as a segment endpoint, by supporting
            the following operations: decrementation of the segments_left field,
            cleanup flag support (removal of the SRH if we are the penultimate
            segment endpoint) and decapsulation of the inner packet as an egress
            node.

        (2) Control plane support (patches 02..03 and 07..09). These patches enables
            to insert SRH on locally emitted and/or forwarded packets, both with
            encap mode and with inline mode. The SRH insertion is controlled through
            the lightweight tunnels mechanism. Furthermore, patch 08 enables the
            applications to insert an SRH on a per-socket basis, through the
            setsockopt() system call. The mechanism to specify a per-socket
            Routing Header was already defined for RH0 and no special modification
            was performed on this side. However, the code to actually push the RH
            onto the packets had to be adapted for the SRH specifications.

        (3) HMAC support (patches 04..06). These patches adds the support of the
            HMAC TLV verification for the dataplane part, and generation for
            the control plane part. Two hashing algorithms are supported
            (SHA-1 as legacy and SHA-256 as required by the IETF draft), but
            additional algorithms can be easily supported by simply adding an
            entry into an array.

[1] https://tools.ietf.org/html/draft-ietf-spring-segment-routing-09
[2] https://tools.ietf.org/html/draft-ietf-6man-segment-routing-header-02
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-09 20:40:13 -05:00
David Lebrun 8bc66a4423 ipv6: sr: add documentation file for per-interface sysctls
This patch adds documentation for some SR-related per-interface
sysctls.

Signed-off-by: David Lebrun <david.lebrun@uclouvain.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-09 20:40:06 -05:00
David Lebrun a149e7c7ce ipv6: sr: add support for SRH injection through setsockopt
This patch adds support for per-socket SRH injection with the setsockopt
system call through the IPPROTO_IPV6, IPV6_RTHDR options.
The SRH is pushed through the ipv6_push_nfrag_opts function.

Signed-off-by: David Lebrun <david.lebrun@uclouvain.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-09 20:40:06 -05:00
David Lebrun 613fa3ca9e ipv6: add source address argument for ipv6_push_nfrag_opts
This patch prepares for insertion of SRH through setsockopt().
The new source address argument is used when an HMAC field is
present in the SRH, which must be filled. The HMAC signature
process requires the source address as input text.

Signed-off-by: David Lebrun <david.lebrun@uclouvain.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-09 20:40:06 -05:00
David Lebrun 9baee83406 ipv6: sr: add calls to verify and insert HMAC signatures
This patch enables the verification of the HMAC signature for transiting
SR-enabled packets, and its insertion on encapsulated/injected SRH.

Signed-off-by: David Lebrun <david.lebrun@uclouvain.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-09 20:40:06 -05:00
David Lebrun 4f4853dc1c ipv6: sr: implement API to control SR HMAC structure
This patch provides an implementation of the genetlink commands
to associate a given HMAC key identifier with an hashing algorithm
and a secret.

Signed-off-by: David Lebrun <david.lebrun@uclouvain.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-09 20:40:06 -05:00
David Lebrun bf355b8d2c ipv6: sr: add core files for SR HMAC support
This patch adds the necessary functions to compute and check the HMAC signature
of an SR-enabled packet. Two HMAC algorithms are supported: hmac(sha1) and
hmac(sha256).

In order to avoid dynamic memory allocation for each HMAC computation,
a per-cpu ring buffer is allocated for this purpose.

A new per-interface sysctl called seg6_require_hmac is added, allowing a
user-defined policy for processing HMAC-signed SR-enabled packets.
A value of -1 means that the HMAC field will always be ignored.
A value of 0 means that if an HMAC field is present, its validity will
be enforced (the packet is dropped is the signature is incorrect).
Finally, a value of 1 means that any SR-enabled packet that does not
contain an HMAC signature or whose signature is incorrect will be dropped.

Signed-off-by: David Lebrun <david.lebrun@uclouvain.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-09 20:40:06 -05:00
David Lebrun 6c8702c60b ipv6: sr: add support for SRH encapsulation and injection with lwtunnels
This patch creates a new type of interfaceless lightweight tunnel (SEG6),
enabling the encapsulation and injection of SRH within locally emitted
packets and forwarded packets.

>From a configuration viewpoint, a seg6 tunnel would be configured as follows:

  ip -6 ro ad fc00::1/128 encap seg6 mode encap segs fc42::1,fc42::2,fc42::3 dev eth0

Any packet whose destination address is fc00::1 would thus be encapsulated
within an outer IPv6 header containing the SRH with three segments, and would
actually be routed to the first segment of the list. If `mode inline' was
specified instead of `mode encap', then the SRH would be directly inserted
after the IPv6 header without outer encapsulation.

The inline mode is only available if CONFIG_IPV6_SEG6_INLINE is enabled. This
feature was made configurable because direct header insertion may break
several mechanisms such as PMTUD or IPSec AH.

Signed-off-by: David Lebrun <david.lebrun@uclouvain.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-09 20:40:06 -05:00
David Lebrun 915d7e5e59 ipv6: sr: add code base for control plane support of SR-IPv6
This patch adds the necessary hooks and structures to provide support
for SR-IPv6 control plane, essentially the Generic Netlink commands
that will be used for userspace control over the Segment Routing
kernel structures.

The genetlink commands provide control over two different structures:
tunnel source and HMAC data. The tunnel source is the source address
that will be used by default when encapsulating packets into an
outer IPv6 header + SRH. If the tunnel source is set to :: then an
address of the outgoing interface will be selected as the source.

The HMAC commands currently just return ENOTSUPP and will be implemented
in a future patch.

Signed-off-by: David Lebrun <david.lebrun@uclouvain.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-09 20:40:06 -05:00
David Lebrun 1ababeba4a ipv6: implement dataplane support for rthdr type 4 (Segment Routing Header)
Implement minimal support for processing of SR-enabled packets
as described in
https://tools.ietf.org/html/draft-ietf-6man-segment-routing-header-02.

This patch implements the following operations:
- Intermediate segment endpoint: incrementation of active segment and rerouting.
- Egress for SR-encapsulated packets: decapsulation of outer IPv6 header + SRH
  and routing of inner packet.
- Cleanup flag support for SR-inlined packets: removal of SRH if we are the
  penultimate segment endpoint.

A per-interface sysctl seg6_enabled is provided, to accept/deny SR-enabled
packets. Default is deny.

This patch does not provide support for HMAC-signed packets.

Signed-off-by: David Lebrun <david.lebrun@uclouvain.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-09 20:40:06 -05:00
Arnd Bergmann dc0b2c9cb4 net: mii: report 0 for unknown lp_advertising
The newly introduced mii_ethtool_get_link_ksettings function sets
lp_advertising to an uninitialized value when BMCR_ANENABLE is not
set:

drivers/net/mii.c: In function 'mii_ethtool_get_link_ksettings':
drivers/net/mii.c:224:2: error: 'lp_advertising' may be used uninitialized in this function [-Werror=maybe-uninitialized]

As documented in include/uapi/linux/ethtool.h, the value is
expected to be zero when we don't know it, so let's initialize
it to that.

Fixes: bc8ee596af ("net: mii: add generic function to support ksetting support")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-09 20:26:58 -05:00
Jan Beulich 6c27f99d35 xen-netback: prefer xenbus_scanf() over xenbus_gather()
For single items being collected this should be preferred as being more
typesafe (as the compiler can check format string and to-be-written-to
variable match) and more efficient (requiring one less parameter to be
passed).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-09 20:24:35 -05:00
Hangbin Liu 1af92836e5 igmp: Document sysctl force_igmp_version
There is some difference between force_igmp_version and force_mld_version.
Add document to make users aware of this.

Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-09 20:22:55 -05:00
Asbjørn Sloth Tønnesen 3f9b9770b4 net: l2tp: fix negative assignment to unsigned int
recv_seq, send_seq and lns_mode mode are all defined as
unsigned int foo:1;

Signed-off-by: Asbjoern Sloth Toennesen <asbjorn@asbjorn.st>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-09 18:55:36 -05:00
Asbjørn Sloth Tønnesen 57ceb8611d net: l2tp: cleanup: remove redundant condition
These assignments follow this pattern:

	unsigned int foo:1;
	struct nlattr *nla = info->attrs[bar];

	if (nla)
		foo = nla_get_flag(nla); /* expands to: foo = !!nla */

This could be simplified to: if (nla) foo = 1;
but lets just remove the condition and use the macro,

	foo = nla_get_flag(nla);

Signed-off-by: Asbjoern Sloth Toennesen <asbjorn@asbjorn.st>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-09 18:55:36 -05:00
Asbjørn Sloth Tønnesen 97b7af097e net: l2tp: netlink: l2tp_nl_tunnel_send: set UDP6 checksum flags
This patch causes the proper attribute flags to be set,
in the case that IPv6 UDP checksums are disabled, so that
userspace ie. `ip l2tp show tunnel` knows about it.

Signed-off-by: Asbjoern Sloth Toennesen <asbjorn@asbjorn.st>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-09 18:55:36 -05:00
Asbjørn Sloth Tønnesen 7ff516ffe4 net: l2tp: only set L2TP_ATTR_UDP_CSUM if AF_INET
Only set L2TP_ATTR_UDP_CSUM in l2tp_nl_tunnel_send()
when it's running over IPv4.

This prepares the code to also have IPv6 specific attributes.

Signed-off-by: Asbjoern Sloth Toennesen <asbjorn@asbjorn.st>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-09 18:55:36 -05:00
Asbjørn Sloth Tønnesen 3f11ec045f net: l2tp: change L2TP_ATTR_UDP_ZERO_CSUM6_{RX, TX} attribute types
The attributes L2TP_ATTR_UDP_ZERO_CSUM6_RX and
L2TP_ATTR_UDP_ZERO_CSUM6_TX are used as flags,
but is defined as a u8 in a comment.

This patch redocuments them as flags.

Adding nla_policy entries would break API, so not doing that.

CC: Tom Herbert <therbert@google.com>
Signed-off-by: Asbjoern Sloth Toennesen <asbjorn@asbjorn.st>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-09 18:55:36 -05:00
Eric Dumazet d61d072e87 net-gro: avoid reorders
Receiving a GSO packet in dev_gro_receive() is not uncommon
in stacked devices, or devices partially implementing LRO/GRO
like bnx2x. GRO is implementing the aggregation the device
was not able to do itself.

Current code causes reorders, like in following case :

For a given flow where sender sent 3 packets P1,P2,P3,P4

Receiver might receive P1 as a single packet, stored in GRO engine.

Then P2-P4 are received as a single GSO packet, immediately given to
upper stack, while P1 is held in GRO engine.

This patch will make sure P1 is given to upper stack, then P2-P4
immediately after.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-09 18:48:54 -05:00
David S. Miller 8e6e596b06 Merge branch 'sfc-udp-rss'
Edward Cree says:

====================
sfc: enable 4-tuple UDP RSS hashing

EF10 based NICs have configurable RSS hash fields, and can be made to take the
ports into the hash on UDP (they already do so for TCP).  This patch series
enables this, in order to improve spreading of UDP traffic.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-09 13:59:17 -05:00