1
0
Fork 0
Commit Graph

52 Commits (redonkable)

Author SHA1 Message Date
Taehee Yoo d02ae00ab6 netfilter: nf_tables: fix NULL pointer dereference on nft_ct_helper_obj_dump()
commit b71534583f upstream.

In the nft_ct_helper_obj_dump(), always priv->helper4 is dereferenced.
But if family is ipv6, priv->helper6 should be dereferenced.

Steps to reproduces:

   #test.nft
   table ip6 filter {
	   ct helper ftp {
		   type "ftp" protocol tcp
	   }
	   chain input {
		   type filter hook input priority 4;
		   ct helper set "ftp"
	   }
   }

   %nft -f test.nft
   %nft list ruleset

we can see the below messages:

[  916.286233] kasan: GPF could be caused by NULL-ptr deref or user memory access
[  916.294777] general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
[  916.302613] Modules linked in: nft_objref nf_conntrack_sip nf_conntrack_snmp nf_conntrack_broadcast nf_conntrack_ftp nft_ct nf_conntrack nf_tables nfnetlink [last unloaded: nfnetlink]
[  916.318758] CPU: 1 PID: 2093 Comm: nft Not tainted 4.17.0-rc4+ #181
[  916.326772] Hardware name: To be filled by O.E.M. To be filled by O.E.M./Aptio CRB, BIOS 5.6.5 07/08/2015
[  916.338773] RIP: 0010:strlen+0x1a/0x90
[  916.342781] RSP: 0018:ffff88010ff0f2f8 EFLAGS: 00010292
[  916.346773] RAX: dffffc0000000000 RBX: ffff880119b26ee8 RCX: ffff88010c150038
[  916.354777] RDX: 0000000000000002 RSI: ffff880119b26ee8 RDI: 0000000000000010
[  916.362773] RBP: 0000000000000010 R08: 0000000000007e88 R09: ffff88010c15003c
[  916.370773] R10: ffff88010c150037 R11: ffffed002182a007 R12: ffff88010ff04040
[  916.378779] R13: 0000000000000010 R14: ffff880119b26f30 R15: ffff88010ff04110
[  916.387265] FS:  00007f57a1997700(0000) GS:ffff88011b800000(0000) knlGS:0000000000000000
[  916.394785] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  916.402778] CR2: 00007f57a0ac80f0 CR3: 000000010ff02000 CR4: 00000000001006e0
[  916.410772] Call Trace:
[  916.414787]  nft_ct_helper_obj_dump+0x94/0x200 [nft_ct]
[  916.418779]  ? nft_ct_set_eval+0x560/0x560 [nft_ct]
[  916.426771]  ? memset+0x1f/0x40
[  916.426771]  ? __nla_reserve+0x92/0xb0
[  916.434774]  ? memcpy+0x34/0x50
[  916.434774]  nf_tables_fill_obj_info+0x484/0x860 [nf_tables]
[  916.442773]  ? __nft_release_basechain+0x600/0x600 [nf_tables]
[  916.450779]  ? lock_acquire+0x193/0x380
[  916.454771]  ? lock_acquire+0x193/0x380
[  916.458789]  ? nf_tables_dump_obj+0x148/0xcb0 [nf_tables]
[  916.462777]  nf_tables_dump_obj+0x5f0/0xcb0 [nf_tables]
[  916.470769]  ? __alloc_skb+0x30b/0x500
[  916.474779]  netlink_dump+0x752/0xb50
[  916.478775]  __netlink_dump_start+0x4d3/0x750
[  916.482784]  nf_tables_getobj+0x27a/0x930 [nf_tables]
[  916.490774]  ? nft_obj_notify+0x100/0x100 [nf_tables]
[  916.494772]  ? nf_tables_getobj+0x930/0x930 [nf_tables]
[  916.502579]  ? nf_tables_dump_flowtable_done+0x70/0x70 [nf_tables]
[  916.506774]  ? nft_obj_notify+0x100/0x100 [nf_tables]
[  916.514808]  nfnetlink_rcv_msg+0x8ab/0xa86 [nfnetlink]
[  916.518771]  ? nfnetlink_rcv_msg+0x550/0xa86 [nfnetlink]
[  916.526782]  netlink_rcv_skb+0x23e/0x360
[  916.530773]  ? nfnetlink_bind+0x200/0x200 [nfnetlink]
[  916.534778]  ? debug_check_no_locks_freed+0x280/0x280
[  916.542770]  ? netlink_ack+0x870/0x870
[  916.546786]  ? ns_capable_common+0xf4/0x130
[  916.550765]  nfnetlink_rcv+0x172/0x16c0 [nfnetlink]
[  916.554771]  ? sched_clock_local+0xe2/0x150
[  916.558774]  ? sched_clock_cpu+0x144/0x180
[  916.566575]  ? lock_acquire+0x380/0x380
[  916.570775]  ? sched_clock_local+0xe2/0x150
[  916.574765]  ? nfnetlink_net_init+0x130/0x130 [nfnetlink]
[  916.578763]  ? sched_clock_cpu+0x144/0x180
[  916.582770]  ? lock_acquire+0x193/0x380
[  916.590771]  ? lock_acquire+0x193/0x380
[  916.594766]  ? lock_acquire+0x380/0x380
[  916.598760]  ? netlink_deliver_tap+0x262/0xa60
[  916.602766]  ? lock_acquire+0x193/0x380
[  916.606766]  netlink_unicast+0x3ef/0x5a0
[  916.610771]  ? netlink_attachskb+0x630/0x630
[  916.614763]  netlink_sendmsg+0x72a/0xb00
[  916.618769]  ? netlink_unicast+0x5a0/0x5a0
[  916.626766]  ? _copy_from_user+0x92/0xc0
[  916.630773]  __sys_sendto+0x202/0x300
[  916.634772]  ? __ia32_sys_getpeername+0xb0/0xb0
[  916.638759]  ? lock_acquire+0x380/0x380
[  916.642769]  ? lock_acquire+0x193/0x380
[  916.646761]  ? finish_task_switch+0xf4/0x560
[  916.650763]  ? __schedule+0x582/0x19a0
[  916.655301]  ? __sched_text_start+0x8/0x8
[  916.655301]  ? up_read+0x1c/0x110
[  916.655301]  ? __do_page_fault+0x48b/0xaa0
[  916.655301]  ? entry_SYSCALL_64_after_hwframe+0x59/0xbe
[  916.655301]  __x64_sys_sendto+0xdd/0x1b0
[  916.655301]  do_syscall_64+0x96/0x3d0
[  916.655301]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  916.655301] RIP: 0033:0x7f57a0ff5e03
[  916.655301] RSP: 002b:00007fff6367e0a8 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
[  916.655301] RAX: ffffffffffffffda RBX: 00007fff6367f1e0 RCX: 00007f57a0ff5e03
[  916.655301] RDX: 0000000000000020 RSI: 00007fff6367e110 RDI: 0000000000000003
[  916.655301] RBP: 00007fff6367e100 R08: 00007f57a0ce9160 R09: 000000000000000c
[  916.655301] R10: 0000000000000000 R11: 0000000000000246 R12: 00007fff6367e110
[  916.655301] R13: 0000000000000020 R14: 00007f57a153c610 R15: 0000562417258de0
[  916.655301] Code: ff ff ff 0f 1f 40 00 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 fa 53 48 c1 ea 03 48 b8 00 00 00 00 00 fc ff df 48 89 fd 48 83 ec 08 <0f> b6 04 02 48 89 fa 83 e2 07 38 d0 7f
[  916.655301] RIP: strlen+0x1a/0x90 RSP: ffff88010ff0f2f8
[  916.771929] ---[ end trace 1065e048e72479fe ]---
[  916.777204] Kernel panic - not syncing: Fatal exception
[  916.778158] Kernel Offset: 0x14000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)

Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-06-16 09:45:14 +02:00
Pablo M. Bermudo Garay dfc46034b5 netfilter: nf_tables: add select_ops for stateful objects
This patch adds support for overloading stateful objects operations
through the select_ops() callback, just as it is implemented for
expressions.

This change is needed for upcoming additions to the stateful objects
infrastructure.

Signed-off-by: Pablo M. Bermudo Garay <pablombg@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-09-04 13:25:09 +02:00
Liping Zhang d91fc59cd7 netfilter: introduce nf_conntrack_helper_put helper function
And convert module_put invocation to nf_conntrack_helper_put, this is
prepared for the followup patch, which will add a refcnt for cthelper,
so we can reject the deleting request when cthelper is in use.

Signed-off-by: Liping Zhang <zlpnobody@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-05-15 12:42:29 +02:00
Florian Westphal 694a0055f0 netfilter: nft_ct: allow to set ctnetlink event types of a connection
By default the kernel emits all ctnetlink events for a connection.
This allows to select the types of events to generate.

This can be used to e.g. only send DESTROY events but no NEW/UPDATE ones
and will work even if sysctl net.netfilter.nf_conntrack_events is set to 0.

This was already possible via iptables' CT target, but the nft version has
the advantage that it can also be used with already-established conntracks.

The added nf_ct_is_template() check isn't a bug fix as we only support
mark and labels (and unlike ecache the conntrack core doesn't copy those).

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-04-19 17:55:16 +02:00
Florian Westphal cc41c84b7e netfilter: kill the fake untracked conntrack objects
resurrect an old patch from Pablo Neira to remove the untracked objects.

Currently, there are four possible states of an skb wrt. conntrack.

1. No conntrack attached, ct is NULL.
2. Normal (kmem cache allocated) ct attached.
3. a template (kmalloc'd), not in any hash tables at any point in time
4. the 'untracked' conntrack, a percpu nf_conn object, tagged via
   IPS_UNTRACKED_BIT in ct->status.

Untracked is supposed to be identical to case 1.  It exists only
so users can check

-m conntrack --ctstate UNTRACKED vs.
-m conntrack --ctstate INVALID

e.g. attempts to set connmark on INVALID or UNTRACKED conntracks is
supposed to be a no-op.

Thus currently we need to check
 ct == NULL || nf_ct_is_untracked(ct)

in a lot of places in order to avoid altering untracked objects.

The other consequence of the percpu untracked object is that all
-j NOTRACK (and, later, kfree_skb of such skbs) result in an atomic op
(inc/dec the untracked conntracks refcount).

This adds a new kernel-private ctinfo state, IP_CT_UNTRACKED, to
make the distinction instead.

The (few) places that care about packet invalid (ct is NULL) vs.
packet untracked now need to test ct == NULL vs. ctinfo == IP_CT_UNTRACKED,
but all other places can omit the nf_ct_is_untracked() check.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-04-15 11:47:57 +02:00
Arushi Singhal d4ef383541 netfilter: Remove exceptional & on function name
Remove & from function pointers to conform to the style found elsewhere
in the file. Done using the following semantic patch

// <smpl>
@r@
identifier f;
@@

f(...) { ... }
@@
identifier r.f;
@@

- &f
+ f
// </smpl>

Signed-off-by: Arushi Singhal <arushisinghal19971997@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-04-07 18:24:47 +02:00
David S. Miller 16ae1f2236 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/ethernet/broadcom/genet/bcmmii.c
	drivers/net/hyperv/netvsc.c
	kernel/bpf/hashtab.c

Almost entirely overlapping changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-23 16:41:27 -07:00
Liping Zhang 4494dbc6de netfilter: nft_ct: do cleanup work when NFTA_CT_DIRECTION is invalid
We should jump to invoke __nft_ct_set_destroy() instead of just
return error.

Fixes: edee4f1e92 ("netfilter: nft_ct: add zone id set support")
Signed-off-by: Liping Zhang <zlpnobody@gmail.com>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-03-15 17:15:54 +01:00
Florian Westphal 1a64edf54f netfilter: nft_ct: add helper set support
this allows to assign connection tracking helpers to
connections via nft objref infrastructure.

The idea is to first specifiy a helper object:

 table ip filter {
    ct helper some-name {
      type "ftp"
      protocol tcp
      l3proto ip
    }
 }

and then assign it via

nft add ... ct helper set "some-name"

helper assignment works for new conntracks only as we cannot expand the
conntrack extension area once it has been committed to the main conntrack
table.

ipv4 and ipv6 protocols are tracked stored separately so
we can also handle families that observe both ipv4 and ipv6 traffic.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-03-13 13:42:09 +01:00
Liping Zhang 10596608c4 netfilter: nf_tables: fix mismatch in big-endian system
Currently, there are two different methods to store an u16 integer to
the u32 data register. For example:
  u32 *dest = &regs->data[priv->dreg];
  1. *dest = 0; *(u16 *) dest = val_u16;
  2. *dest = val_u16;

For method 1, the u16 value will be stored like this, either in
big-endian or little-endian system:
  0          15           31
  +-+-+-+-+-+-+-+-+-+-+-+-+
  |   Value   |     0     |
  +-+-+-+-+-+-+-+-+-+-+-+-+

For method 2, in little-endian system, the u16 value will be the same
as listed above. But in big-endian system, the u16 value will be stored
like this:
  0          15           31
  +-+-+-+-+-+-+-+-+-+-+-+-+
  |     0     |   Value   |
  +-+-+-+-+-+-+-+-+-+-+-+-+

So later we use "memcmp(&regs->data[priv->sreg], data, 2);" to do
compare in nft_cmp, nft_lookup expr ..., method 2 will get the wrong
result in big-endian system, as 0~15 bits will always be zero.

For the similar reason, when loading an u16 value from the u32 data
register, we should use "*(u16 *) sreg;" instead of "(u16)*sreg;",
the 2nd method will get the wrong value in the big-endian system.

So introduce some wrapper functions to store/load an u8 or u16
integer to/from the u32 data register, and use them in the right
place.

Signed-off-by: Liping Zhang <zlpnobody@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-03-13 13:30:28 +01:00
Florian Westphal 427345d612 netfilter: nft_ct: fix random validation errors for zone set support
Dan reports:
 net/netfilter/nft_ct.c:549 nft_ct_set_init()
 error: uninitialized symbol 'len'.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: edee4f1e92 ("netfilter: nft_ct: add zone id set support")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-02-23 21:50:28 +01:00
Florian Westphal edee4f1e92 netfilter: nft_ct: add zone id set support
zones allow tracking multiple connections sharing identical tuples,
this is needed e.g. when tracking distinct vlans with overlapping ip
addresses (conntrack is l2 agnostic).

Thus the zone has to be set before the packet is picked up by the
connection tracker.  This is done by means of 'conntrack templates' which
are conntrack structures used solely to pass this info from one netfilter
hook to the next.

The iptables CT target instantiates these connection tracking templates
once per rule, i.e. the template is fixed/tied to particular zone, can
be read-only and therefore be re-used by as many skbs simultaneously as
needed.

We can't follow this model because we want to take the zone id from
an sreg at rule eval time so we could e.g. fill in the zone id from
the packets vlan id or a e.g. nftables key : value maps.

To avoid cost of per packet alloc/free of the template, use a percpu
template 'scratch' object and use the refcount to detect the (unlikely)
case where the template is still attached to another skb (i.e., previous
skb was nfqueued ...).

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-02-08 14:16:23 +01:00
Florian Westphal 5c178d81b6 netfilter: nft_ct: prepare for key-dependent error unwind
Next patch will add ZONE_ID set support which will need similar
error unwind (put operation) as conntrack labels.

Prepare for this: remove the 'label_got' boolean in favor
of a switch statement that can be extended in next patch.

As we already have that in the set_destroy function place that in
a separate function and call it from the set init function.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-02-08 14:16:23 +01:00
Florian Westphal ab23821f7e netfilter: nft_ct: add zone id get support
Just like with counters the direction attribute is optional.
We set priv->dir to MAX unconditionally to avoid duplicating the assignment
for all keys with optional direction.

For keys where direction is mandatory, existing code already returns
an error.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-02-08 14:16:22 +01:00
Florian Westphal c74454fadd netfilter: add and use nf_ct_set helper
Add a helper to assign a nf_conn entry and the ctinfo bits to an sk_buff.
This avoids changing code in followup patch that merges skb->nfct and
skb->nfctinfo into skb->_nfct.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-02-02 14:31:54 +01:00
Liping Zhang 949a358418 netfilter: nft_ct: add average bytes per packet support
Similar to xt_connbytes, user can match how many average bytes per packet
a connection has transferred so far.

Signed-off-by: Liping Zhang <zlpnobody@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-01-03 14:33:26 +01:00
Florian Westphal ecb2421b5d netfilter: add and use nf_ct_netns_get/put
currently aliased to try_module_get/_put.
Will be changed in next patch when we add functions to make use of ->net
argument to store usercount per l3proto tracker.

This is needed to avoid registering the conntrack hooks in all netns and
later only enable connection tracking in those that need conntrack.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-12-04 21:16:50 +01:00
Pablo Neira Ayuso 254432613c netfilter: nft_ct: add notrack support
This patch adds notrack support.

I decided to add a new expression, given that this doesn't fit into the
existing set operation. Notrack doesn't need a source register, and an
hypothetical NFT_CT_NOTRACK key makes no sense since matching the
untracked state is done through NFT_CT_STATE.

I'm placing this new notrack expression into nft_ct.c, I think a single
module is too much.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-10-26 16:35:16 +02:00
Liping Zhang 7bfdde7045 netfilter: nft_ct: report error if mark and dir specified simultaneously
NFT_CT_MARK is unrelated to direction, so if NFTA_CT_DIRECTION attr is
specified, report EINVAL to the userspace. This validation check was
already done at nft_ct_get_init, but we missed it in nft_ct_set_init.

Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-09-25 14:54:04 +02:00
Liping Zhang d767ff2c84 netfilter: nft_ct: unnecessary to require dir when use ct l3proto/protocol
Currently, if the user want to match ct l3proto, we must specify the
direction, for example:
  # nft add rule filter input ct original l3proto ipv4
                                 ^^^^^^^^
Otherwise, error message will be reported:
  # nft add rule filter input ct l3proto ipv4
  nft add rule filter input ct l3proto ipv4
  <cmdline>:1:1-38: Error: Could not process rule: Invalid argument
  add rule filter input ct l3proto ipv4
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Actually, there's no need to require NFTA_CT_DIRECTION attr, because
ct l3proto and protocol are unrelated to direction.

And for compatibility, even if the user specify the NFTA_CT_DIRECTION
attr, do not report error, just skip it.

Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-09-25 14:54:02 +02:00
David S. Miller c42d7121fb Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
Pablo Neira Ayuso says:

====================
Netfilter/IPVS updates for net-next

The following patchset contains Netfilter/IPVS updates for net-next,
they are:

1) Count pre-established connections as active in "least connection"
   schedulers such that pre-established connections to avoid overloading
   backend servers on peak demands, from Michal Kubecek via Simon Horman.

2) Address a race condition when resizing the conntrack table by caching
   the bucket size when fulling iterating over the hashtable in these
   three possible scenarios: 1) dump via /proc/net/nf_conntrack,
   2) unlinking userspace helper and 3) unlinking custom conntrack timeout.
   From Liping Zhang.

3) Revisit early_drop() path to perform lockless traversal on conntrack
   eviction under stress, use del_timer() as synchronization point to
   avoid two CPUs evicting the same entry, from Florian Westphal.

4) Move NAT hlist_head to nf_conn object, this simplifies the existing
   NAT extension and it doesn't increase size since recent patches to
   align nf_conn, from Florian.

5) Use rhashtable for the by-source NAT hashtable, also from Florian.

6) Don't allow --physdev-is-out from OUTPUT chain, just like
   --physdev-out is not either, from Hangbin Liu.

7) Automagically set on nf_conntrack counters if the user tries to
   match ct bytes/packets from nftables, from Liping Zhang.

8) Remove possible_net_t fields in nf_tables set objects since we just
   simply pass the net pointer to the backend set type implementations.

9) Fix possible off-by-one in h323, from Toby DiPasquale.

10) early_drop() may be called from ctnetlink patch, so we must hold
    rcu read size lock from them too, this amends Florian's patch #3
    coming in this batch, from Liping Zhang.

11) Use binary search to validate jump offset in x_tables, this
    addresses the O(n!) validation that was introduced recently
    resolve security issues with unpriviledge namespaces, from Florian.

12) Fix reference leak to connlabel in error path of nft_ct, from Zhang.

13) Three updates for nft_log: Fix log prefix leak in error path. Bail
    out on loglevel larger than debug in nft_log and set on the new
    NF_LOG_F_COPY_LEN flag when snaplen is specified. Again from Zhang.

14) Allow to filter rule dumps in nf_tables based on table and chain
    names.

15) Simplify connlabel to always use 128 bits to store labels and
    get rid of unused function in xt_connlabel, from Florian.

16) Replace set_expect_timeout() by mod_timer() from the h323 conntrack
    helper, by Gao Feng.

17) Put back x_tables module reference in nft_compat on error, from
    Liping Zhang.

18) Add a reference count to the x_tables extensions cache in
    nft_compat, so we can remove them when unused and avoid a crash
    if the extensions are rmmod, again from Zhang.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-24 22:02:36 -07:00
Florian Westphal 23014011ba netfilter: conntrack: support a fixed size of 128 distinct labels
The conntrack label extension is currently variable-sized, e.g. if
only 2 labels are used by iptables rules then the labels->bits[] array
will only contain one element.

We track size of each label storage area in the 'words' member.

But in nftables and openvswitch we always have to ask for worst-case
since we don't know what bit will be used at configuration time.

As most arches are 64bit we need to allocate 24 bytes in this case:

struct nf_conn_labels {
    u8            words;   /*     0     1 */
    /* XXX 7 bytes hole, try to pack */
    long unsigned bits[2]; /*     8     24 */

Make bits a fixed size and drop the words member, it simplifies
the code and only increases memory requirements on x86 when
less than 64bit labels are required.

We still only allocate the extension if its needed.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-07-22 17:04:55 +02:00
Liping Zhang 590025a27f netfilter: nft_ct: fix unpaired nf_connlabels_get/put call
We only get nf_connlabels if the user add ct label set expr successfully,
but we will also put nf_connlabels if the user delete ct lable get expr.
This is mismathced, and will cause ct label expr cannot work properly.

Also, if we init something fail, we should put nf_connlabels back.
Otherwise, we may waste to alloc the memory that will never be used.

Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-07-19 19:52:03 +02:00
Liping Zhang 3f8b61b7f9 netfilter: nft_ct: make byte/packet expr more friendly
If we want to use ct packets expr, and add a rule like follows:
  # nft add rule filter input ct packets gt 1 counter

We will find that no packets will hit it, because
nf_conntrack_acct is disabled by default. So It will
not work until we enable it manually via
"echo 1 > /proc/sys/net/netfilter/nf_conntrack_acct".

This is not friendly, so like xt_connbytes do, if the user
want to use ct byte/packet expr, enable nf_conntrack_acct
automatically.

Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-07-11 12:16:02 +02:00
Florian Westphal c8607e0200 netfilter: nft_ct: fix expiration getter
We need to compute timeout.expires - jiffies, not the other way around.
Add a helper, another patch can then later change more places in
conntrack code where we currently open-code this.

Will allow us to only change one place later when we remove per-ct timer.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-07-08 14:55:14 +02:00
Florian Westphal 1ad8f48df6 netfilter: nftables: add connlabel set support
Conntrack labels are currently sized depending on the iptables
ruleset, i.e. if we're asked to test or set bits 1, 2, and 65 then we
would allocate enough room to store at least bit 65.

However, with nft, the input is just a register with arbitrary runtime
content.

We therefore ask for the upper ceiling we currently have, which is
enough room to store 128 bits.

Alternatively, we could alter nf_connlabel_replace to increase
net->ct.label_words at run time, but since 128 bits is not that
big we'd only save sizeof(long) so it doesn't seem worth it for now.

This follows a similar approach that xtables 'connlabel'
match uses, so when user inputs

    ct label set bar

then we will set the bit used by the 'bar' label and leave the rest alone.

This is done by passing the sreg content to nf_connlabels_replace
as both value and mask argument.
Labels (bits) already set thus cannot be re-set to zero, but
this is not supported by xtables connlabel match either.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-05-05 16:27:59 +02:00
Florian Westphal adff6c6560 netfilter: connlabels: change nf_connlabels_get bit arg to 'highest used'
nf_connlabel_set() takes the bit number that we would like to set.
nf_connlabels_get() however took the number of bits that we want to
support.

So e.g. nf_connlabels_get(32) support bits 0 to 31, but not 32.
This changes nf_connlabels_get() to take the highest bit that we want
to set.

Callers then don't have to cope with a potential integer wrap
when using nf_connlabels_get(bit + 1) anymore.

Current callers are fine, this change is only to make folloup
nft ct label set support simpler.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-04-18 20:39:48 +02:00
Pablo Neira Ayuso efaea94aaf netfilter: nft_ct: keep counters away from CONFIG_NF_CONNTRACK_LABELS
This is accidental, they don't depend on the label infrastructure.

Fixes: 48f66c905a ("netfilter: nft_ct: add byte/packet counter support")
Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Florian Westphal <fw@strlen.de>
2016-01-14 19:41:16 +01:00
David S. Miller 9b59377b75 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

The following patchset contains Netfilter updates for net-next, they are:

1) Release nf_tables objects on netns destructions via
   nft_release_afinfo().

2) Destroy basechain and rules on netdevice removal in the new netdev
   family.

3) Get rid of defensive check against removal of inactive objects in
   nf_tables.

4) Pass down netns pointer to our existing nfnetlink callbacks, as well
   as commit() and abort() nfnetlink callbacks.

5) Allow to invert limit expression in nf_tables, so we can throttle
   overlimit traffic.

6) Add packet duplication for the netdev family.

7) Add forward expression for the netdev family.

8) Define pr_fmt() in conntrack helpers.

9) Don't leave nfqueue configuration on inconsistent state in case of
   errors, from Ken-ichirou MATSUZAWA, follow up patches are also from
   him.

10) Skip queue option handling after unbind.

11) Return error on unknown both in nfqueue and nflog command.

12) Autoload ctnetlink when NFQA_CFG_F_CONNTRACK is set.

13) Add new NFTA_SET_USERDATA attribute to store user data in sets,
    from Carlos Falgueras.

14) Add support for 64 bit byteordering changes nf_tables, from Florian
    Westphal.

15) Add conntrack byte/packet counter matching support to nf_tables,
    also from Florian.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-08 20:53:16 -05:00
Florian Westphal 48f66c905a netfilter: nft_ct: add byte/packet counter support
If the accounting extension isn't present, we'll return a counter
value of 0.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-01-08 14:44:09 +01:00
Florian Westphal d5f79b6e4d netfilter: nft_ct: include direction when dumping NFT_CT_L3PROTOCOL key
one nft userspace test case fails with

'ct l3proto original ipv4' mismatches 'ct l3proto ipv4'

... because NFTA_CT_DIRECTION attr is missing.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-12-18 14:45:45 +01:00
Patrick McHardy 49499c3e6e netfilter: nf_tables: switch registers to 32 bit addressing
Switch the nf_tables registers from 128 bit addressing to 32 bit
addressing to support so called concatenations, where multiple values
can be concatenated over multiple registers for O(1) exact matches of
multiple dimensions using sets.

The old register values are mapped to areas of 128 bits for compatibility.
When dumping register numbers, values are expressed using the old values
if they refer to the beginning of a 128 bit area for compatibility.

To support concatenations, register loads of less than a full 32 bit
value need to be padded. This mainly affects the payload and exthdr
expressions, which both unconditionally zero the last word before
copying the data.

Userspace fully passes the testsuite using both old and new register
addressing.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-13 17:17:29 +02:00
Patrick McHardy b1c96ed37c netfilter: nf_tables: add register parsing/dumping helpers
Add helper functions to parse and dump register values in netlink attributes.
These helpers will later be changed to take care of translation between the
old 128 bit and the new 32 bit register numbers.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-13 17:17:28 +02:00
Patrick McHardy fad136ea0d netfilter: nf_tables: convert expressions to u32 register pointers
Simple conversion to use u32 pointers to the beginning of the registers
to keep follow up patches smaller.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-13 17:17:25 +02:00
Patrick McHardy a55e22e92f netfilter: nf_tables: get rid of NFT_REG_VERDICT usage
Replace the array of registers passed to expressions by a struct nft_regs,
containing the verdict as a seperate member, which aliases to the
NFT_REG_VERDICT register.

This is needed to seperate the verdict from the data registers completely,
so their size can be changed.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-13 17:17:07 +02:00
Patrick McHardy d07db9884a netfilter: nf_tables: introduce nft_validate_register_load()
Change nft_validate_input_register() to not only validate the input
register number, but also the length of the load, and rename it to
nft_validate_register_load() to reflect that change.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-13 16:25:50 +02:00
Patrick McHardy 27e6d2017a netfilter: nf_tables: kill nft_validate_output_register()
All users of nft_validate_register_store() first invoke
nft_validate_output_register(). There is in fact no use for using it
on its own, so simplify the code by folding the functionality into
nft_validate_register_store() and kill it.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-13 16:25:50 +02:00
Patrick McHardy 1ec10212f9 netfilter: nf_tables: rename nft_validate_data_load()
The existing name is ambiguous, data is loaded as well when we read from
a register. Rename to nft_validate_register_store() for clarity and
consistency with the upcoming patch to introduce its counterpart,
nft_validate_register_load().

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-13 16:25:49 +02:00
Patrick McHardy 45d9bcda21 netfilter: nf_tables: validate len in nft_validate_data_load()
For values spanning multiple registers, we need to validate that enough
space is available from the destination register onwards. Add a len
argument to nft_validate_data_load() and consolidate the existing length
validations in preparation of that.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-13 16:25:49 +02:00
David Miller c1f8667677 netfilter: Fix switch statement warnings with recent gcc.
More recent GCC warns about two kinds of switch statement uses:

1) Switching on an enumeration, but not having an explicit case
   statement for all members of the enumeration.  To show the
   compiler this is intentional, we simply add a default case
   with nothing more than a break statement.

2) Switching on a boolean value.  I think this warning is dumb
   but nevertheless you get it wholesale with -Wswitch.

This patch cures all such warnings in netfilter.

Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-08 15:20:50 -04:00
Patrick McHardy fe92ca45a1 netfilter: nft_ct: split nft_ct_init() into two functions for get/set
For value spanning multiple registers, we need to validate the length
of data loads. In order to add this to nft_ct, we need the length from
key validation. Split the nft_ct_init() function into two functions
for the get and set operations as preparation for that.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-04-02 21:29:45 +02:00
Patrick McHardy e88e514e1f netfilter: nft_ct: add missing ifdef for NFT_MARK setting
The set operation for ct mark is only valid if CONFIG_NF_CONNTRACK_MARK is
enabled.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-04-02 21:29:07 +02:00
Patrick McHardy d46f2cd260 netfilter: nft_ct: remove family from struct nft_ct
Since we have the context available during destruction again, we can
remove the family from the private structure.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-03-08 12:35:19 +01:00
Patrick McHardy 62472bcefb netfilter: nf_tables: restore context for expression destructors
In order to fix set destruction notifications and get rid of unnecessary
members in private data structures, pass the context to expressions'
destructor functions again.

In order to do so, replace various members in the nft_rule_trans structure
by the full context.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-03-08 12:35:17 +01:00
Florian Westphal d2bf2f34cc netfilter: nft_ct: labels get support
This also adds NF_CT_LABELS_MAX_SIZE so it can be re-used
as BUILD_BUG_ON in nft_ct.

At this time, nft doesn't yet support writing to the label area;
when this changes the label->words handling needs to be moved
out of xt_connlabel.c into nf_conntrack_labels.c.

Also removes a useless run-time check: words cannot grow beyond
4 (32 bit) or 2 (64bit) since xt_connlabel enforces a maximum of
128 labels.

Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-02-19 11:41:25 +01:00
Patrick McHardy 51292c0735 netfilter: nft_ct: fix missing NFT_CT_L3PROTOCOL key in validity checks
The key was missing in the list of valid keys, add it.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-02-06 00:05:33 +01:00
Arturo Borrero 2a53bfb3e0 netfilter: nft_ct: fix unconditional dump of 'dir' attr
We want to make sure that the information that we get from the kernel can
be reinjected without troubles. The kernel shouldn't return an attribute
that is not required, or even prohibited.

Dumping unconditionally NFTA_CT_DIRECTION could lead an application in
userspace to interpret that the attribute was originally set, while it
was not.

Signed-off-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-02-05 13:16:17 +01:00
Kristian Evensen 847c8e2959 netfilter: nft_ct: fix compilation warning if NF_CONNTRACK_MARK is not set
net/netfilter/nft_ct.c: In function 'nft_ct_set_eval':
net/netfilter/nft_ct.c:136:6: warning: unused variable 'value' [-Wunused-variable]

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Kristian Evensen <kristian.evensen@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-01-15 11:00:14 +01:00
Kristian Evensen c4ede3d382 netfilter: nft_ct: Add support to set the connmark
This patch adds kernel support for setting properties of tracked
connections. Currently, only connmark is supported. One use-case
for this feature is to provide the same functionality as
-j CONNMARK --save-mark in iptables.

Some restructuring was needed to implement the set op. The new
structure follows that of nft_meta.

Signed-off-by: Kristian Evensen <kristian.evensen@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-01-09 19:07:44 +01:00
Patrick McHardy 9638f33ecf netfilter: nft_ct: load both IPv4 and IPv6 conntrack modules for NFPROTO_INET
The ct expression can currently not be used in the inet family since
we don't have a conntrack module for NFPROTO_INET, so
nf_ct_l3proto_try_module_get() fails. Add some manual handling to
load the modules for both NFPROTO_IPV4 and NFPROTO_IPV6 if the
ct expression is used in the inet family.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-01-07 23:57:32 +01:00