redonkable/remarkable-linux

Author	SHA1	Message	Date
Vladimir Davydov	6de64beb34	memcg: remove KMEM_ACCOUNTED_ACTIVATED flag Currently we have two state bits in mem_cgroup::kmem_account_flags regarding kmem accounting activation, ACTIVATED and ACTIVE. We start kmem accounting only if both flags are set (memcg_can_account_kmem()), plus throughout the code there are several places where we check only the ACTIVE flag, but we never check the ACTIVATED flag alone. These flags are both set from memcg_update_kmem_limit() under the set_limit_mutex, the ACTIVE flag always being set after ACTIVATED, and they never get cleared. That said checking if both flags are set is equivalent to checking only for the ACTIVE flag, and since there is no ACTIVATED flag checks, we can safely remove the ACTIVATED flag, and nothing will change. Let's try to understand what was the reason for introducing these flags. The purpose of the ACTIVE flag is clear - it states that kmem should be accounting to the cgroup. The only requirement for it is that it should be set after we have fully initialized kmem accounting bits for the cgroup and patched all static branches relating to kmem accounting. Since we always check if static branch is enabled before actually considering if we should account (otherwise we wouldn't benefit from static branching), this guarantees us that we won't skip a commit or uncharge after a charge due to an unpatched static branch. Now let's move on to the ACTIVATED bit. As I proved in the beginning of this message, it is absolutely useless, and removing it will change nothing. So what was the reason introducing it? The ACTIVATED flag was introduced by commit `a8964b9b84` ("memcg: use static branches when code not in use") in order to guarantee that static_key_slow_inc(&memcg_kmem_enabled_key) would be called only once for each memory cgroup when its kmem accounting was activated. 
The point was that at that time the memcg_update_kmem_limit() function's work-flow looked like this: bool must_inc_static_branch = false; cgroup_lock(); mutex_lock(&set_limit_mutex); if (!memcg->kmem_account_flags && val != RESOURCE_MAX) { /* The kmem limit is set for the first time */ ret = res_counter_set_limit(&memcg->kmem, val); memcg_kmem_set_activated(memcg); must_inc_static_branch = true; } else ret = res_counter_set_limit(&memcg->kmem, val); mutex_unlock(&set_limit_mutex); cgroup_unlock(); if (must_inc_static_branch) { /* We can't do this under cgroup_lock */ static_key_slow_inc(&memcg_kmem_enabled_key); memcg_kmem_set_active(memcg); } So that without the ACTIVATED flag we could race with other threads trying to set the limit and increment the static branching ref-counter more than once. Today we call the whole memcg_update_kmem_limit() function under the set_limit_mutex and this race is impossible. As now we understand why the ACTIVATED bit was introduced and why we don't need it now, and know that removing it will change nothing anyway, let's get rid of it. Signed-off-by: Vladimir Davydov <vdavydov@parallels.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Glauber Costa <glommer@gmail.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Balbir Singh <bsingharora@gmail.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-23 16:36:51 -08:00
Vladimir Davydov	f8570263ee	memcg, slab: RCU protect memcg_params for root caches We relocate root cache's memcg_params whenever we need to grow the memcg_caches array to accommodate all kmem-active memory cgroups. Currently on relocation we free the old version immediately, which can lead to use-after-free, because the memcg_caches array is accessed lock-free (see cache_from_memcg_idx()). This patch fixes this by making memcg_params RCU-protected for root caches. Signed-off-by: Vladimir Davydov <vdavydov@parallels.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Glauber Costa <glommer@gmail.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Balbir Singh <bsingharora@gmail.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Christoph Lameter <cl@linux.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-23 16:36:51 -08:00
Vladimir Davydov	f717eb3abb	slab: do not panic if we fail to create memcg cache There is no point in flooding logs with warnings or especially crashing the system if we fail to create a cache for a memcg. In this case we will be accounting the memcg allocation to the root cgroup until we succeed to create its own cache, but it isn't that critical. Signed-off-by: Vladimir Davydov <vdavydov@parallels.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Glauber Costa <glommer@gmail.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Pekka Enberg <penberg@kernel.org> Cc: Christoph Lameter <cl@linux.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-23 16:36:51 -08:00
Vladimir Davydov	842e287369	memcg: get rid of kmem_cache_dup() kmem_cache_dup() is only called from memcg_create_kmem_cache(). The latter, in fact, does nothing besides this, so let's fold kmem_cache_dup() into memcg_create_kmem_cache(). This patch also makes the memcg_cache_mutex private to memcg_create_kmem_cache(), because it is not used anywhere else. Signed-off-by: Vladimir Davydov <vdavydov@parallels.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Glauber Costa <glommer@gmail.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Balbir Singh <bsingharora@gmail.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-23 16:36:51 -08:00
Vladimir Davydov	2edefe1155	memcg, slab: fix races in per-memcg cache creation/destruction We obtain a per-memcg cache from a root kmem_cache by dereferencing an entry of the root cache's memcg_params::memcg_caches array. If we find no cache for a memcg there on allocation, we initiate the memcg cache creation (see memcg_kmem_get_cache()). The cache creation proceeds asynchronously in memcg_create_kmem_cache() in order to avoid lock clashes, so there can be several threads trying to create the same kmem_cache concurrently, but only one of them may succeed. However, due to a race in the code, it is not always true. The point is that the memcg_caches array can be relocated when we activate kmem accounting for a memcg (see memcg_update_all_caches(), memcg_update_cache_size()). If memcg_update_cache_size() and memcg_create_kmem_cache() proceed concurrently as described below, we can leak a kmem_cache. Assume two threads schedule creation of the same kmem_cache. One of them successfully creates it. Another one should fail then, but if memcg_create_kmem_cache() interleaves with memcg_update_cache_size() as follows, it won't: memcg_create_kmem_cache() memcg_update_cache_size() (called w/o mutexes held) (called with slab_mutex, set_limit_mutex held) ------------------------- ------------------------- mutex_lock(&memcg_cache_mutex) s->memcg_params=kzalloc(...) new_cachep=cache_from_memcg_idx(cachep,idx) // new_cachep==NULL => proceed to creation s->memcg_params->memcg_caches[i] =cur_params->memcg_caches[i] // kmem_cache_create_memcg takes slab_mutex // so we will hang around until // memcg_update_cache_size finishes, but // nothing will prevent it from succeeding so // memcg_caches[idx] will be overwritten in // memcg_register_cache! new_cachep = kmem_cache_create_memcg(...) mutex_unlock(&memcg_cache_mutex) Let's fix this by moving the check for existence of the memcg cache to kmem_cache_create_memcg() to be called under the slab_mutex and make it return NULL if so. 
A similar race is possible when destroying a memcg cache (see kmem_cache_destroy()). Since memcg_unregister_cache(), which clears the pointer in the memcg_caches array, is called w/o protection, we can race with memcg_update_cache_size() and omit clearing the pointer. Therefore memcg_unregister_cache() should be moved before we release the slab_mutex. Signed-off-by: Vladimir Davydov <vdavydov@parallels.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Glauber Costa <glommer@gmail.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Balbir Singh <bsingharora@gmail.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Christoph Lameter <cl@linux.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-23 16:36:51 -08:00
Vladimir Davydov	96403da244	memcg: fix possible NULL deref while traversing memcg_slab_caches list All caches of the same memory cgroup are linked in the memcg_slab_caches list via kmem_cache::memcg_params::list. This list is traversed, for example, when we read memory.kmem.slabinfo. Since the list actually consists of memcg_cache_params objects, we have to convert an element of the list to a kmem_cache object using memcg_params_to_cache(), which obtains the pointer to the cache from the memcg_params::memcg_caches array of the corresponding root cache. That said the pointer to a kmem_cache in its parent's memcg_params must be initialized before adding the cache to the list, and cleared only after it has been unlinked. Currently it is vice-versa, which can result in a NULL ptr dereference while traversing the memcg_slab_caches list. This patch restores the correct order. Signed-off-by: Vladimir Davydov <vdavydov@parallels.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Glauber Costa <glommer@gmail.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Balbir Singh <bsingharora@gmail.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-23 16:36:51 -08:00
Vladimir Davydov	959c8963fc	memcg, slab: fix barrier usage when accessing memcg_caches Each root kmem_cache has pointers to per-memcg caches stored in its memcg_params::memcg_caches array. Whenever we want to allocate a slab for a memcg, we access this array to get per-memcg cache to allocate from (see memcg_kmem_get_cache()). The access must be lock-free for performance reasons, so we should use barriers to assert the kmem_cache is up-to-date. First, we should place a write barrier immediately before setting the pointer to it in the memcg_caches array in order to make sure nobody will see a partially initialized object. Second, we should issue a read barrier before dereferencing the pointer to conform to the write barrier. However, currently the barrier usage looks rather strange. We have a write barrier after setting the pointer and a read barrier before reading the pointer, which is incorrect. This patch fixes this. Signed-off-by: Vladimir Davydov <vdavydov@parallels.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Glauber Costa <glommer@gmail.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Balbir Singh <bsingharora@gmail.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Christoph Lameter <cl@linux.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-23 16:36:51 -08:00
Vladimir Davydov	1aa1325425	memcg, slab: clean up memcg cache initialization/destruction Currently, we have rather a messy function set relating to per-memcg kmem cache initialization/destruction. Per-memcg caches are created in memcg_create_kmem_cache(). This function calls kmem_cache_create_memcg() to allocate and initialize a kmem cache and then "registers" the new cache in the memcg_params::memcg_caches array of the parent cache. During its work-flow, kmem_cache_create_memcg() executes the following memcg-related functions: - memcg_alloc_cache_params(), to initialize memcg_params of the newly created cache; - memcg_cache_list_add(), to add the new cache to the memcg_slab_caches list. On the other hand, kmem_cache_destroy() called on a cache destruction only calls memcg_release_cache(), which does all the work: it cleans the reference to the cache in its parent's memcg_params::memcg_caches, removes the cache from the memcg_slab_caches list, and frees memcg_params. Such an inconsistency between destruction and initialization paths make the code difficult to read, so let's clean this up a bit. This patch moves all the code relating to registration of per-memcg caches (adding to memcg list, setting the pointer to a cache from its parent) to the newly created memcg_register_cache() and memcg_unregister_cache() functions making the initialization and destruction paths look symmetrical. Signed-off-by: Vladimir Davydov <vdavydov@parallels.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Glauber Costa <glommer@gmail.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Balbir Singh <bsingharora@gmail.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Christoph Lameter <cl@linux.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-23 16:36:51 -08:00
Vladimir Davydov	363a044f73	memcg, slab: kmem_cache_create_memcg(): fix memleak on fail path We do not free the cache's memcg_params if __kmem_cache_create fails. Fix this. Plus, rename memcg_register_cache() to memcg_alloc_cache_params(), because it actually does not register the cache anywhere, but simply initialize kmem_cache::memcg_params. [akpm@linux-foundation.org: fix build] Signed-off-by: Vladimir Davydov <vdavydov@parallels.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Glauber Costa <glommer@gmail.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Balbir Singh <bsingharora@gmail.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Christoph Lameter <cl@linux.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-23 16:36:51 -08:00
Vladimir Davydov	3965fc3652	slab: clean up kmem_cache_create_memcg() error handling Currently kmem_cache_create_memcg() backoffs on failure inside conditionals, without using gotos. This results in the rollback code duplication, which makes the function look cumbersome even though on error we should only free the allocated cache. Since in the next patch I am going to add yet another rollback function call on error path there, let's employ labels instead of conditionals for undoing any changes on failure to keep things clean. Signed-off-by: Vladimir Davydov <vdavydov@parallels.com> Reviewed-by: Pekka Enberg <penberg@kernel.org> Cc: Michal Hocko <mhocko@suse.cz> Cc: Glauber Costa <glommer@gmail.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Christoph Lameter <cl@linux.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-23 16:36:50 -08:00
Sasha Levin	309381feae	mm: dump page when hitting a VM_BUG_ON using VM_BUG_ON_PAGE Most of the VM_BUG_ON assertions are performed on a page. Usually, when one of these assertions fails we'll get a BUG_ON with a call stack and the registers. I've recently noticed based on the requests to add a small piece of code that dumps the page to various VM_BUG_ON sites that the page dump is quite useful to people debugging issues in mm. This patch adds a VM_BUG_ON_PAGE(cond, page) which beyond doing what VM_BUG_ON() does, also dumps the page before executing the actual BUG_ON. [akpm@linux-foundation.org: fix up includes] Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-23 16:36:50 -08:00
Naoya Horiguchi	e3bba3c3c9	fs/proc/page.c: add PageAnon check to surely detect thp stable_page_flags() checks !PageHuge && PageTransCompound && PageLRU to know that a specified page is thp or not. But sometimes it's not enough and we fail to detect thp when the thp is on pagevec. This happens only for a few seconds after LRU list operations, but it makes it difficult to control our applications depending on this flag. So this patch adds another check PageAnon to detect thps on pagevec. It might not give the future extensibility for thp pagecache, but it's OK at least for now. Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: David Rientjes <rientjes@google.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Wu Fengguang <fengguang.wu@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-23 16:36:50 -08:00
Vladimir Davydov	8ff69e2c85	memcg: do not use vmalloc for mem_cgroup allocations The vmalloc was introduced by `3332794878` ("memcgroup: use vmalloc for mem_cgroup allocation"), because at that time MAX_NUMNODES was used for defining the per-node array in the mem_cgroup structure so that the structure could be huge even if the system had the only NUMA node. The situation was significantly improved by commit `45cf7ebd5a` ("memcg: reduce the size of struct memcg 244-fold"), which made the size of the mem_cgroup structure calculated dynamically depending on the real number of NUMA nodes installed on the system (nr_node_ids), so now there is no point in using vmalloc here: the structure is allocated rarely and on most systems its size is about 1K. Signed-off-by: Vladimir Davydov <vdavydov@parallels.com> Acked-by: Michal Hocko <mhocko@suse.cz> Cc: Glauber Costa <glommer@openvz.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Balbir Singh <bsingharora@gmail.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-23 16:36:50 -08:00
Vlastimil Babka	01cc2e5869	mm: munlock: fix potential race with THP page split Since commit `ff6a6da60b` ("mm: accelerate munlock() treatment of THP pages") munlock skips tail pages of a munlocked THP page. There is some attempt to prevent bad consequences of racing with a THP page split, but code inspection indicates that there are two problems that may lead to a non-fatal, yet wrong outcome. First, __split_huge_page_refcount() copies flags including PageMlocked from the head page to the tail pages. Clearing PageMlocked by munlock_vma_page() in the middle of this operation might result in part of tail pages left with PageMlocked flag. As the head page still appears to be a THP page until all tail pages are processed, munlock_vma_page() might think it munlocked the whole THP page and skip all the former tail pages. Before `ff6a6da60`, those pages would be cleared in further iterations of munlock_vma_pages_range(), but NR_MLOCK would still become undercounted (related the next point). Second, NR_MLOCK accounting is based on call to hpage_nr_pages() after the PageMlocked is cleared. The accounting might also become inconsistent due to race with __split_huge_page_refcount() - undercount when HUGE_PMD_NR is subtracted, but some tail pages are left with PageMlocked set and counted again (only possible before `ff6a6da60`) - overcount when hpage_nr_pages() sees a normal page (split has already finished), but the parallel split has meanwhile cleared PageMlocked from additional tail pages This patch prevents both problems via extending the scope of lru_lock in munlock_vma_page(). This is convenient because: - __split_huge_page_refcount() takes lru_lock for its whole operation - munlock_vma_page() typically takes lru_lock anyway for page isolation As this becomes a second function where page isolation is done with lru_lock already held, factor this out to a new __munlock_isolate_lru_page() function and clean up the code around. 
[akpm@linux-foundation.org: avoid a coding-style ugly] Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Cc: Sasha Levin <sasha.levin@oracle.com> Cc: Michel Lespinasse <walken@google.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Hugh Dickins <hughd@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-23 16:36:50 -08:00
Dave Hansen	f0b791a34c	mm: print more details for bad_page() bad_page() is cool in that it prints out a bunch of data about the page. But, I can never remember which page flags are good and which are bad, or whether ->index or ->mapping is required to be NULL. This patch allows bad/dump_page() callers to specify a string about why they are dumping the page and adds explanation strings to a number of places. It also adds a 'bad_flags' argument to bad_page(), which it then dumps out separately from the flags which are actually set. This way, the messages will show specifically why the page was bad, specifically which flags it is complaining about, if it was a page flag combination which was the problem. [akpm@linux-foundation.org: switch to pr_alert] Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Christoph Lameter <cl@linux.com> Cc: Andi Kleen <andi@firstfloor.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-23 16:36:50 -08:00
Dan Streetman	12ab028be0	mm/zswap.c: change params from hidden to ro The "compressor" and "enabled" params are currently hidden, this changes them to read-only, so userspace can tell if zswap is enabled or not and see what compressor is in use. Signed-off-by: Dan Streetman <ddstreet@ieee.org> Cc: Vladimir Murzin <murzin.v@gmail.com> Cc: Bob Liu <bob.liu@oracle.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Weijie Yang <weijie.yang@samsung.com> Acked-by: Seth Jennings <sjennings@variantweb.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-23 16:36:50 -08:00
Dave Hansen	57ea8171d2	mm: documentation: remove hopelessly out-of-date locking doc Documentation/vm/locking is a blast from the past. In the entire git history, it has had precisely Three modifications. Two of those look to be pure renames, and the third was from 2005. The doc contains such gems as: > The page_table_lock is grabbed while holding the > kernel_lock spinning monitor. > Page stealers hold kernel_lock to protect against a bunch of > races. Or this which talks about mmap_sem: > 4. The exception to this rule is expand_stack, which just > takes the read lock and the page_table_lock, this is ok > because it doesn't really modify fields anybody relies on. expand_stack() doesn't take any locks any more directly, and the mmap_sem acquisition was long ago moved up in to the page fault code itself. It could be argued that we need to rewrite this, but it is dangerous to leave it as-is. It will confuse more people than it helps. Signed-off-by: Dave Hansen <dave.hansen@intel.com> Cc: Hugh Dickins <hughd@google.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-23 16:36:50 -08:00
Michal Simek	372c7209d6	microblaze: extable: sort the exception table at build time Sort the exception table at build-time rather than during boot. Microblaze is the same case as AARCH64 that's why EM_MICROBLAZE conditional check was added to allow cross-compilation on machines which are not running the latest libc-dev. Inspired by AARCH64 commit `adace89562` ("arm64: extable: sort the exception table at build time"). Signed-off-by: Michal Simek <michal.simek@xilinx.com> Acked-by: David Daney <david.daney@cavium.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-23 16:36:50 -08:00
Geert Uytterhoeven	3fdb38bd1f	cris: provide {in,out}[wl]_p() drivers/staging/comedi/drivers/das6402.c: In function 'intr_handler': drivers/staging/comedi/drivers/das6402.c:164:3: error: implicit declaration of function 'outw_p' [-Werror=implicit-function-declaration] drivers/staging/speakup/speakup_dtlk.c: In function 'synth_probe': drivers/staging/speakup/speakup_dtlk.c:362:2: error: implicit declaration of function 'inw_p' [-Werror=implicit-function-declaration] Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Mikael Starvik <starvik@axis.com> Cc: Jesper Nilsson <jesper.nilsson@axis.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-23 16:36:50 -08:00
Jiri Pirko	813f020c5d	rtnetlink: remove check for fill_slave_info in rtnl_have_link_slave_info This check is not needed because the same check is done before fill_slave_info is used in rtnl_link_slave_info_fill. Also, by removing this check, the kernel will fill in IFLA_INFO_SLAVE_KIND even for slaves of masters which do not implement fill_slave_info. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-23 16:21:48 -08:00
David S. Miller	9fac865db0	Merge branch 'qlcnic' Himanshu Madhani says: ==================== qlcnic: Refactoring and enhancements for all adapters. This patch series contains following patches. o Enhanced debug data collection when we are in Tx-timeout situation. o Enhanced MSI-x vector calculation for default load path as well as for TSS/RSS ring change path. o Refactored interrupt coalescing code for all adapters. o Refactored interrupt handling as well as cleanup of poll controller code patch for all adapters. o changed rx_mac_learn type to boolean. Please apply to net-next. ==================== Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-23 16:13:39 -08:00
Himanshu Madhani	18cae184e4	qlcnic: update version to 5.3.55 Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-23 16:13:09 -08:00
Himanshu Madhani	cb9327d567	qlcnic: Enhance logic to calculate msix vectors. o Refactored MSI-x vector calculation for All adapters. Decoupled logic in the code which was using same call to request MSI-x vectors in default driver load, as well as during set_channel() operation for TSS/RSS. This refactoring simplifies code for TSS/RSS code path as well as probe path of the driver load for all adapters. Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-23 16:13:09 -08:00
Himanshu Madhani	a514722afe	qlcnic: Refactor interrupt coalescing code for all adapters. o Refactor configuration of interrupt coalescing parameters for all supported adapters. Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-23 16:13:09 -08:00
Manish chopra	2b018ad9fe	qlcnic: Update poll controller code path Add support for MSI/MSI-X mode in poll controller routine. Signed-off-by: Manish Chopra <manish.chopra@qlogic.com> Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-23 16:13:09 -08:00
Manish chopra	2cc5752e49	qlcnic: Interrupt code cleanup o Added hardware ops for interrupt enable/disable functions Signed-off-by: Manish Chopra <manish.chopra@qlogic.com> Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-23 16:13:09 -08:00
Himanshu Madhani	95b3890ae3	qlcnic: Enhance Tx timeout debugging. o Dump each Tx queue details with all descriptors, queue indices and Tx queue stats to improve data collection in situations where Tx timeout occurs. Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-23 16:13:09 -08:00
Sucheta Chakraborty	72ebe3495f	qlcnic: Use bool for rx_mac_learn. o Use boolean type instead of u8. Signed-off-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com> Signed-off-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-23 16:13:08 -08:00
Nikolay Aleksandrov	0681a28264	bonding: fix u64 division After the option conversion downdelay and updelay divide a u64 and on a 32 bit this causes the following errors: ERROR: "__udivdi3" [drivers/net/bonding/bonding.ko] undefined! ERROR: "__umoddi3" [drivers/net/bonding/bonding.ko] undefined! Fix it by using a normal int instead because newval->value is capped at INT_MAX by the way the option is defined. Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-23 16:10:24 -08:00
Jiri Pirko	237266f76d	rtnetlink: add missing IFLA_BOND_AD_INFO_UNSPEC Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-23 13:40:51 -08:00
Ben Hutchings	d9317aea16	sfc: Use the correct maximum TX DMA ring size for SFC9100 As part of a workaround for a hardware erratum in the SFC9100 family (SF bug 35388), the TX_DESC_UPD_DWORD register address is also used for communicating with the event block, and only descriptor pointer values < 2048 are valid. If the TX DMA ring size is increased to 4096 descriptors (which the firmware still allows) then we may write a descriptor pointer value >= 2048, which has entirely different and undesirable effects! Limit the TX DMA ring size correctly when this workaround is in effect. Fixes: `8127d661e7` ('sfc: Add support for Solarflare SFC9100 family') Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: Shradha Shah <sshah@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-23 13:40:51 -08:00
Shradha Shah	8533ccf3e7	Add Shradha Shah as the sfc driver maintainer. I will be taking over the work from Ben Hutchings. Signed-off-by: Shradha Shah <sshah@solarflare.com> Acked-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-23 13:40:50 -08:00
Or Gerlitz	d0bc65557a	net/vxlan: Share RX skb de-marking and checksum checks with ovs Make sure the practice set by commit `0afb166` "vxlan: Add capability of Rx checksum offload for inner packet" is applied when the skb goes through the portion of the RX code which is shared between vxlan netdevices and ovs vxlan port instances. Cc: Joseph Gasparakis <joseph.gasparakis@intel.com> Cc: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-23 13:30:03 -08:00
Dan Carpenter	28f39ef6d8	tulip: cleanup by using ARRAY_SIZE() In this situation then ARRAY_SIZE() and sizeof() are the same, but we're really dealing with array indexes and not byte offsets so ARRAY_SIZE() is cleaner. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Grant Grundler <grundler@parisc-linux.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-23 13:28:29 -08:00
Duan Jiong	11c21a307d	ip_tunnel: clear IPCB in ip_tunnel_xmit() in case dst_link_failure() is called Commit a622260254ee48 ("ip_tunnel: fix kernel panic with icmp_dest_unreach") clears IPCB in ip_tunnel_xmit(), or else skb->cb[] may contain garbage from the GSO segmentation layer. But commit 0e6fbc5b6c621 ("ip_tunnels: extend iptunnel_xmit()") refactored the code so that IPCB is cleared after dst_link_failure(). So clear IPCB in ip_tunnel_xmit() just like commit a622260254ee48 ("ip_tunnel: fix kernel panic with icmp_dest_unreach"). Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-23 13:23:16 -08:00
Gavin Shan	9fe6cb5837	net/cxgb4: Don't retrieve stats during recovery We may retrieve the adapter's statistics during EEH recovery, and that should be disallowed. Otherwise it could trigger repeated EEH errors, and EEH recovery would eventually fail. The patch reuses the statistics lock and checks that the net_device is attached before retrieving statistics, so that the problem can be avoided. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-23 13:21:03 -08:00
Gavin Shan	144be3d9f7	net/cxgb4: Avoid disabling PCI device twice If an EEH error happens to the adapter and we have to remove it from the system for some reason (e.g. more than 5 EEH errors detected from the device in the last hour), the adapter will be disabled twice, separately by eeh_err_detected() and remove_one(), which triggers the following unexpected backtrace. The patch tries to avoid it. WARNING: at drivers/pci/pci.c:1431 CPU: 12 PID: 121 Comm: eehd Not tainted 3.13.0-rc7+ #1 task: c0000001823a3780 ti: c00000018240c000 task.ti: c00000018240c000 NIP: c0000000003c1e40 LR: c0000000003c1e3c CTR: 0000000001764c5c REGS: c00000018240f470 TRAP: 0700 Not tainted (3.13.0-rc7+) MSR: 8000000000029032 <SF,EE,ME,IR,DR,RI> CR: 28000024 XER: 00000004 CFAR: c000000000706528 SOFTE: 1 GPR00: c0000000003c1e3c c00000018240f6f0 c0000000010fe1f8 0000000000000035 GPR04: 0000000000000000 0000000000000000 00000000003ae509 0000000000000000 GPR08: 000000000000346f 0000000000000000 0000000000000000 0000000000003fef GPR12: 0000000028000022 c00000000ec93000 c0000000000c11b0 c000000184ac3e40 GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR24: 0000000000000000 c0000000009398d8 c00000000101f9c0 c0000001860ae000 GPR28: c000000182ba0000 00000000000001f0 c0000001860ae6f8 c0000001860ae000 NIP [c0000000003c1e40] .pci_disable_device+0xd0/0xf0 LR [c0000000003c1e3c] .pci_disable_device+0xcc/0xf0 Call Trace: [c0000000003c1e3c] .pci_disable_device+0xcc/0xf0 (unreliable) [d0000000073881c4] .remove_one+0x174/0x320 [cxgb4] [c0000000003c57e0] .pci_device_remove+0x60/0x100 [c00000000046396c] .__device_release_driver+0x9c/0x120 [c000000000463a20] .device_release_driver+0x30/0x60 [c0000000003bcdb4] .pci_stop_bus_device+0x94/0xd0 [c0000000003bcf48] .pci_stop_and_remove_bus_device+0x18/0x30 [c00000000003f548] .pcibios_remove_pci_devices+0xa8/0x140 [c000000000035c00] .eeh_handle_normal_event+0xa0/0x3c0 [c000000000035f50] .eeh_handle_event+0x30/0x2b0 [c0000000000362c4] .eeh_event_handler+0xf4/0x1b0 [c0000000000c12b8] .kthread+0x108/0x130 [c00000000000a168] .ret_from_kernel_thread+0x5c/0x74 Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-23 13:21:03 -08:00
Mugunthan V N	0cd8f9cc06	drivers: net: cpsw: enable promiscuous mode support Enable promiscuous mode support for CPSW. Signed-off-by: Mugunthan V N <mugunthanvnm@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-23 13:12:14 -08:00
Vlad Yasevich	6ef7b8a23a	net: Correctly sync addresses from multiple sources to single device When we have multiple devices attempting to sync the same address to a single destination, each device should be permitted to sync it once. To accomplish this, pass the 'sync_cnt' of the source address when adding the address to the lower device. 'sync_cnt' tracks how many times a given address has been successfully synced. This way, we know that if the 'sync_cnt' passed in is 0, we should sync this address. Also, turn the 'synced' member back into a counter, as was originally done in commit `4543fbefe6` ("net: count hw_addr syncs so that unsync works properly"). It tracks how many times a given address has been added via a 'sync' operation. For every successful 'sync' the counter is incremented, and for every 'unsync' the counter is decremented. This makes sure that the address will be properly removed from the lower device when all the upper devices have removed it. Reported-by: Andrey Dmitrov <andrey.dmitrov@oktetlabs.ru> CC: Andrey Dmitrov <andrey.dmitrov@oktetlabs.ru> CC: Alexandra N. Kossovsky <Alexandra.Kossovsky@oktetlabs.ru> CC: Konstantin Ushakov <Konstantin.Ushakov@oktetlabs.ru> Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-23 13:06:34 -08:00
Shlomo Pongratz	a1d0cd8ed5	net/udp_offload: Handle static checker complaints Fixed a few issues around using the __rcu prefix and rcu_assign_pointer, and also fixed a warning print to use ntohs(port) and not htons(port). net/ipv4/udp_offload.c:112:9: error: incompatible types in comparison expression (different address spaces) net/ipv4/udp_offload.c:113:9: error: incompatible types in comparison expression (different address spaces) net/ipv4/udp_offload.c:176:19: error: incompatible types in comparison expression (different address spaces) Signed-off-by: Shlomo Pongratz <shlomop@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-23 12:59:08 -08:00
Christoph Paasch	3ad88cf70a	tcp: metrics: Handle v6/v4-mapped sockets in tcp-metrics A socket may be v6/v4-mapped. In that case sk->sk_family is AF_INET6, but the IP being used is actually an IPv4 address. Currently, tcp-metrics will thus represent it as an IPv6 address: root@server:~# ip tcp_metrics ::ffff:10.1.1.2 age 22.920sec rtt 18750us rttvar 15000us cwnd 10 10.1.1.2 age 47.970sec rtt 16250us rttvar 10000us cwnd 10 This patch modifies tcp-metrics so that it handles v6/v4-mapped sockets correctly. Signed-off-by: Christoph Paasch <christoph.paasch@uclouvain.be> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-23 12:48:28 -08:00
Sujith Manoharan	a64e1a4506	ath9k: Fix RX interrupt mitigation The threshold values for RX interrupt mitigation are different for AR9003 and AR9002 families. Signed-off-by: Sujith Manoharan <c_manoha@qca.qualcomm.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2014-01-23 14:02:45 -05:00
Roman Dubtsov	7f6d740753	rt2x00: rt2800usb: mark D-Link DWA-137 as supported Signed-off-by: Roman Dubtsov <dubtsov@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2014-01-23 14:02:45 -05:00
Sujith Manoharan	3b745c7ba9	ath9k: Fix code mistake The commit "ath9k: Process GTT interrupts" accidentally had a line that was commented out. Signed-off-by: Sujith Manoharan <c_manoha@qca.qualcomm.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2014-01-23 14:02:45 -05:00
ZHAO Gang	64e5acb09c	b43: fix the wrong assignment of status.freq in b43_rx() Use the right function to update frequency value. If rx skb is probe response or beacon, the wrong frequency value can cause problem that bss info can't be updated when it should be. Cc: <stable@vger.kernel.org> Fixes: `8318d78a44` ("cfg80211 API for channels/bitrates, mac80211 and driver conversion") Signed-off-by: ZHAO Gang <gamerh2o@gmail.com> Acked-by: Larry Finger <Larry.Finger@lwfinger.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2014-01-23 14:02:45 -05:00
Andreas Fenkart	09e65f5b2b	mwifiex: fix wakeup on magic packet 8 bytes preamble 14 bytes src/dst/eth_type 6 bytes 0xff:0xff.. http://en.wikipedia.org/wiki/Wake-on-LAN#Magic_packet http://en.wikipedia.org/wiki/EtherType This will fail if we VLAN or the magic packet is encapsulated as a UDP packet... Signed-off-by: Andreas Fenkart <afenkart@gmail.com> Acked-by: Bing Zhao <bzhao@marvell.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2014-01-23 14:02:45 -05:00
John W. Linville	cfa9c3fba7	Merge branch 'for-john' of git://git.kernel.org/pub/scm/linux/kernel/git/iwlwifi/iwlwifi-next	2014-01-23 14:00:51 -05:00
Linus Torvalds	90804ed61f	Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull UDF & jbd fixes from Jan Kara: "A cleanup of JBD log messages and UDF fix of a lockdep warning" * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: udf: Fix lockdep warning from udf_symlink() jbd: Revise KERN_EMERG error messages	2014-01-23 10:49:23 -08:00
Masami Hiramatsu	4afc81cd1c	perf symbols: Load map before using map->map_ip() In map_groups__find_symbol() map->map_ip is used without ensuring the map is loaded. Then the address passed to map->map_ip isn't mapped at the first time. E.g. below code always fails to get a symbol at the first call; addr = /* Somewhere in the kernel text */ symbol_conf.try_vmlinux_path = true; symbol__init(); host_machine = machine__new_host(); sym = machine__find_kernel_function(host_machine, addr, NULL, NULL); /* Note that machine__find_kernel_function calls map_groups__find_symbol */ This ensures it by calling map__load before using it in map_groups__find_symbol(). Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: "David A. Long" <dave.long@linaro.org> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org> Cc: yrl.pp-manager.tt@hitachi.com Link: http://lkml.kernel.org/r/20140123022950.7206.17357.stgit@kbuild-fedora.yrl.intra.hitachi.co.jp Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2014-01-23 15:48:12 -03:00
Josh Boyer	b935a58dbf	perf tools: Fix traceevent plugin path definitions The plugindir_SQ definition contains $(prefix) which is not needed as the $(libdir) definition already contains prefix in it. This leads to the path including an extra prefix in it, e.g. /usr/usr/lib64. The -DPLUGIN_DIR definition includes DESTDIR. This is incorrect, as it sets the plugin search path to include the value of DESTDIR. DESTDIR is a mechanism to install in a non-standard location such as a chroot or an RPM build root. In the RPM case, this leads to the search path being incorrect after the resulting RPM is installed (or in some cases an RPM build failure). Remove both of these unnecessary inclusions. Signed-off-by: Josh Boyer <jwboyer@fedoraproject.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/20140122150147.GK16455@hansolo.jdub.homelinux.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2014-01-23 15:48:12 -03:00
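
The path reasoning in the last entry above (b935a58dbf, "perf tools: Fix traceevent plugin path definitions") can be illustrated with a small sketch. It is not taken from the perf Makefile: the function names, the `/traceevent/plugins` subdirectory, and the sample values are assumptions made for illustration. It only mimics Make's textual variable expansion to show why prepending `prefix` to a `libdir` that already contains it doubles the prefix, and why `DESTDIR` belongs in the install location but not in the compiled-in search path.

```python
# Hypothetical illustration of the two path problems described in the commit
# message. Make expands variables by plain text substitution, so plain string
# concatenation is used here on purpose.

PLUGIN_SUBDIR = "/traceevent/plugins"  # assumed subdirectory, for illustration only


def buggy_plugin_dir(prefix: str, libdir: str) -> str:
    """libdir already starts with prefix, so prepending prefix doubles it."""
    return prefix + libdir + PLUGIN_SUBDIR


def install_dir(destdir: str, libdir: str) -> str:
    """Where files are copied at install time: the staging root is prepended."""
    return destdir + libdir + PLUGIN_SUBDIR


def search_dir(libdir: str) -> str:
    """Path compiled into the binary: it must not contain the staging root."""
    return libdir + PLUGIN_SUBDIR


if __name__ == "__main__":
    prefix, libdir = "/usr", "/usr/lib64"
    destdir = "/tmp/buildroot"  # e.g. an RPM build root or chroot
    print(buggy_plugin_dir(prefix, libdir))
    print(install_dir(destdir, libdir))
    print(search_dir(libdir))
```

With these sample values the staged install goes under `/tmp/buildroot`, while the search path stays valid once the package is unpacked into the real root.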


423181 commits