alistair23-linux

redonkable

Author	SHA1	Message	Date
Arnd Bergmann	ddc212313f	blkcg: simplify statistic accumulation code Some older compilers (gcc-4.4 through 4.6 in particular) struggle with the way that blkg_rwstat_read() returns a structure, leading to excessive stack usage and rather inefficient code: block/blk-cgroup.c: In function 'blkg_destroy': block/blk-cgroup.c:354:1: error: the frame size of 1296 bytes is larger than 1024 bytes [-Werror=frame-larger-than=] block/cfq-iosched.c: In function 'cfqg_stats_add_aux': block/cfq-iosched.c:753:1: error: the frame size of 1928 bytes is larger than 1024 bytes [-Werror=frame-larger-than=] block/bfq-cgroup.c: In function 'bfqg_stats_add_aux': block/bfq-cgroup.c:299:1: error: the frame size of 1928 bytes is larger than 1024 bytes [-Werror=frame-larger-than=] I also notice that there is no point in using atomic accesses for the local variables, so storing the temporaries in simple 'u64' variables not only avoids the stack usage on older compilers but also improves the object code on modern versions. Fixes: `e6269c4454` ("blkcg: add blkg_[rw]stat->aux_cnt and replace cfq_group->dead_stats with it") Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-01-16 08:56:36 -07:00
Mike Snitzer	fa70d2e2c4	block: allow gendisk's request_queue registration to be deferred Since I can remember DM has forced the block layer to allow the allocation and initialization of the request_queue to be distinct operations. Reason for this is block/genhd.c:add_disk() has requires that the request_queue (and associated bdi) be tied to the gendisk before add_disk() is called -- because add_disk() also deals with exposing the request_queue via blk_register_queue(). DM's dynamic creation of arbitrary device types (and associated request_queue types) requires the DM device's gendisk be available so that DM table loads can establish a master/slave relationship with subordinate devices that are referenced by loaded DM tables -- using bd_link_disk_holder(). But until these DM tables, and their associated subordinate devices, are known DM cannot know what type of request_queue it needs -- nor what its queue_limits should be. This chicken and egg scenario has created all manner of problems for DM and, at times, the block layer. Summary of changes: - Add device_add_disk_no_queue_reg() and add_disk_no_queue_reg() variant that drivers may use to add a disk without also calling blk_register_queue(). Driver must call blk_register_queue() once its request_queue is fully initialized. - Return early from blk_unregister_queue() if QUEUE_FLAG_REGISTERED is not set. It won't be set if driver used add_disk_no_queue_reg() but driver encounters an error and must del_gendisk() before calling blk_register_queue(). - Export blk_register_queue(). These changes allow DM to use add_disk_no_queue_reg() to anchor its gendisk as the "master" for master/slave relationships DM must establish with subordinate devices referenced in DM tables that get loaded. Once all "slave" devices for a DM device are known its request_queue can be properly initialized and then advertised via sysfs -- important improvement being that no request_queue resource initialization performed by blk_register_queue() is missed for DM devices anymore. Signed-off-by: Mike Snitzer <snitzer@redhat.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-01-15 08:41:38 -07:00
Jens Axboe	7c3fb70f03	block: rearrange a few request fields for better cache layout Move completion related items (like the call single data) near the end of the struct, instead of mixing them in with the initial queueing related fields. Move queuelist below the bio structures. Then we have all queueing related bits in the first cache line. This yields a 1.5-2% increase in IOPS for a null_blk test, both for sync and for high thread count access. Sync test goes form 975K to 992K, 32-thread case from 20.8M to 21.2M IOPS. Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com> Reviewed-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-01-10 11:47:58 -07:00
Jens Axboe	e14575b3d4	block: convert REQ_ATOM_COMPLETE to stealing rq->__deadline bit We only have one atomic flag left. Instead of using an entire unsigned long for that, steal the bottom bit of the deadline field that we already reserved. Remove ->atomic_flags, since it's now unused. Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com> Reviewed-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-01-10 11:47:53 -07:00
Jens Axboe	0a72e7f449	block: add accessors for setting/querying request deadline We reduce the resolution of request expiry, but since we're already using jiffies for this where resolution depends on the kernel configuration and since the timeout resolution is coarse anyway, that should be fine. Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com> Reviewed-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-01-10 11:47:47 -07:00
Jens Axboe	76a86f9d02	block: remove REQ_ATOM_POLL_SLEPT We don't need this to be an atomic flag, it can be a regular flag. We either end up on the same CPU for the polling, in which case the state is sane, or we did the sleep which would imply the needed barrier to ensure we see the right state. Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com> Reviewed-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-01-10 11:47:43 -07:00
Keith Busch	9111e5686c	block: Provide blk_status_t decoding for path errors This patch provides a common decoder for block status path related errors that may be retried so various entities wishing to consult this do not have to duplicate this decision. Acked-by: Mike Snitzer <snitzer@redhat.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-01-10 10:52:16 -07:00
Tejun Heo	05707b64ae	blk-mq: rename blk_mq_hw_ctx->queue_rq_srcu to ->srcu The RCU protection has been expanded to cover both queueing and completion paths making ->queue_rq_srcu a misnomer. Rename it to ->srcu as suggested by Bart. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Bart Van Assche <Bart.VanAssche@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-01-09 09:31:15 -07:00
Tejun Heo	634f9e4631	blk-mq: remove REQ_ATOM_COMPLETE usages from blk-mq After the recent updates to use generation number and state based synchronization, blk-mq no longer depends on REQ_ATOM_COMPLETE except to avoid firing the same timeout multiple times. Remove all REQ_ATOM_COMPLETE usages and use a new rq_flags flag RQF_MQ_TIMEOUT_EXPIRED to avoid firing the same timeout multiple times. This removes atomic bitops from hot paths too. v2: Removed blk_clear_rq_complete() from blk_mq_rq_timed_out(). v3: Added RQF_MQ_TIMEOUT_EXPIRED flag. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: "jianchao.wang" <jianchao.w.wang@oracle.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-01-09 09:31:15 -07:00
Tejun Heo	1d9bd5161b	blk-mq: replace timeout synchronization with a RCU and generation based scheme Currently, blk-mq timeout path synchronizes against the usual issue/completion path using a complex scheme involving atomic bitflags, REQ_ATOM_*, memory barriers and subtle memory coherence rules. Unfortunately, it contains quite a few holes. There's a complex dancing around REQ_ATOM_STARTED and REQ_ATOM_COMPLETE between issue/completion and timeout paths; however, they don't have a synchronization point across request recycle instances and it isn't clear what the barriers add. blk_mq_check_expired() can easily read STARTED from N-2'th iteration, deadline from N-1'th, blk_mark_rq_complete() against Nth instance. In fact, it's pretty easy to make blk_mq_check_expired() terminate a later instance of a request. If we induce 5 sec delay before time_after_eq() test in blk_mq_check_expired(), shorten the timeout to 2s, and issue back-to-back large IOs, blk-mq starts timing out requests spuriously pretty quickly. Nothing actually timed out. It just made the call on a recycle instance of a request and then terminated a later instance long after the original instance finished. The scenario isn't theoretical either. This patch replaces the broken synchronization mechanism with a RCU and generation number based one. 1. Each request has a u64 generation + state value, which can be updated only by the request owner. Whenever a request becomes in-flight, the generation number gets bumped up too. This provides the basis for the timeout path to distinguish different recycle instances of the request. Also, marking a request in-flight and setting its deadline are protected with a seqcount so that the timeout path can fetch both values coherently. 2. The timeout path fetches the generation, state and deadline. If the verdict is timeout, it records the generation into a dedicated request abortion field and does RCU wait. 3. The completion path is also protected by RCU (from the previous patch) and checks whether the current generation number and state match the abortion field. If so, it skips completion. 4. The timeout path, after RCU wait, scans requests again and terminates the ones whose generation and state still match the ones requested for abortion. By now, the timeout path knows that either the generation number and state changed if it lost the race or the completion will yield to it and can safely timeout the request. While it's more lines of code, it's conceptually simpler, doesn't depend on direct use of subtle memory ordering or coherence, and hopefully doesn't terminate the wrong instance. While this change makes REQ_ATOM_COMPLETE synchronization unnecessary between issue/complete and timeout paths, REQ_ATOM_COMPLETE isn't removed yet as it's still used in other places. Future patches will move all state tracking to the new mechanism and remove all bitops in the hot paths. Note that this patch adds a comment explaining a race condition in BLK_EH_RESET_TIMER path. The race has always been there and this patch doesn't change it. It's just documenting the existing race. v2: - Fixed BLK_EH_RESET_TIMER handling as pointed out by Jianchao. - s/request->gstate_seqc/request->gstate_seq/ as suggested by Peter. - READ_ONCE() added in blk_mq_rq_update_state() as suggested by Peter. v3: - Fixed possible extended seqcount / u64_stats_sync read looping spotted by Peter. - MQ_RQ_IDLE was incorrectly being set in complete_request instead of free_request. Fixed. v4: - Rebased on top of hctx_lock() refactoring patch. - Added comment explaining the use of hctx_lock() in completion path. v5: - Added comments requested by Bart. - Note the addition of BLK_EH_RESET_TIMER race condition in the commit message. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: "jianchao.wang" <jianchao.w.wang@oracle.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Bart Van Assche <Bart.VanAssche@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-01-09 09:31:15 -07:00
Bart Van Assche	e80a0af475	lib/scatterlist: Introduce sgl_alloc() and sgl_free() Many kernel drivers contain code that allocates and frees both a scatterlist and the pages that populate that scatterlist. Introduce functions in lib/scatterlist.c that perform these tasks instead of duplicating this functionality in multiple drivers. Only include these functions in the build if CONFIG_SGL_ALLOC=y to avoid that the kernel size increases if this functionality is not used. Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-01-06 09:18:00 -07:00
Ming Lei	25d8be77e1	block: move bio_alloc_pages() to bcache bcache is the only user of bio_alloc_pages(), so move this function into bcache, and avoid it being misused in the future. Also rename it to bch_bio_allo_pages() since it is bcache only. Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-01-06 09:18:00 -07:00
Ming Lei	3c892a098b	block: bounce: don't access bio->bi_io_vec in copy_to_high_bio_irq Firstly this patch introduces BVEC_ITER_ALL_INIT for iterating one bio from start to end. As we need to support multipage bvecs, don't access bio->bi_io_vec in copy_to_high_bio_irq(), and just use the standard iterator for that. Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-01-06 09:18:00 -07:00
Ming Lei	86292abc5a	block: introduce bio helpers for converting to multipage bvec The following helpers are introduced for converting current users of direct access to bvec table, and prepares for supporting multipage bvec: bio_pages_all() bio_first_bvec_all() bio_first_page_all() bio_last_bvec_all() All are named as bio_*_all() to following bio_for_each_segment_all(), they can only be used on bio of !bio_flagged(bio, BIO_CLONED), that means the whole bvec table is covered. Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-01-06 09:18:00 -07:00
Christoph Hellwig	6cc77e9cb0	block: introduce zoned block devices zone write locking Components relying only on the request_queue structure for accessing block devices (e.g. I/O schedulers) have a limited knowledged of the device characteristics. In particular, the device capacity cannot be easily discovered, which for a zoned block device also result in the inability to easily know the number of zones of the device (the zone size is indicated by the chunk_sectors field of the queue limits). Introduce the nr_zones field to the request_queue structure to simplify access to this information. Also, add the bitmap seq_zone_bitmap which indicates which zones of the device are sequential zones (write preferred or write required) and the bitmap seq_zones_wlock which indicates if a zone is write locked, that is, if a write request targeting a zone was dispatched to the device. These fields are initialized by the low level block device driver (sd.c for ZBC/ZAC disks). They are not initialized by stacking drivers (device mappers) handling zoned block devices (e.g. dm-linear). Using this, I/O schedulers can introduce zone write locking to control request dispatching to a zoned block device and avoid write request reordering by limiting to at most a single write request per zone outside of the scheduler at any time. Based on previous patches from Damien Le Moal. Signed-off-by: Christoph Hellwig <hch@lst.de> [Damien] * Fixed comments and identation in blkdev.h * Changed helper functions * Fixed this commit message Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-01-05 09:22:17 -07:00
Javier González	e53927393b	lightnvm: set target over-provision on create ioctl Allow to set the over-provision percentage on target creation. In case that the value is not provided, fall back to the default value set by the target. In pblk, set the default OP to 11% of the total size of the device Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Hans Holmberg <hans.holmberg@cnexlabs.com> Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-01-05 08:50:12 -07:00
Matias Bjørling	fae7fae407	lightnvm: make geometry structures 2.0 ready Prepare for the 2.0 revision by adapting the geometry structures to coexist with the 1.2 revision. Signed-off-by: Matias Bjørling <m@bjorling.me> Reviewed-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-01-05 08:50:12 -07:00
Matias Bjørling	bb27aa9ecd	lightnvm: remove lower page tables The lower page table is unused. All page tables reported by 1.2 devices are all reporting a sequential 1:1 page mapping. This is also not used going forward with the 2.0 revision. Signed-off-by: Matias Bjørling <m@bjorling.me> Reviewed-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-01-05 08:50:12 -07:00
Javier González	98281a90ac	lightnvm: remove unnecessary field from nvm_rq Remove the wait filed in nvm_rq. It is not used anymore, as targets rely on the functionality provided by the LightNVM subsystem when sending sync I/O. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-01-05 08:50:12 -07:00
Matias Bjørling	e3e13bcc14	lightnvm: remove hybrid ocssd 1.2 support Now that rrpc have been removed. Also remove the hybrid 1.2 support from the core. Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-01-05 08:50:12 -07:00
Matias Bjørling	26f76dce60	lightnvm: use internal pblk methods Now that rrpc has been removed, the only users of the ppa helpers is pblk. However, pblk already defines similar functions. Switch pblk to use the internal ones, and remove the generic ppa helpers. Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-01-05 08:50:12 -07:00
Linus Torvalds	6ba64feff6	Merge branch 'WIP.x86-pti.base.prep-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull Page Table Isolation (PTI) preparatory tree from Ingo Molnar: "This does a rename to free up linux/pti.h to be used by the upcoming page table isolation feature" * 'WIP.x86-pti.base.prep-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: drivers/misc/intel/pti: Rename the header file to free up the namespace	2017-12-17 13:54:31 -08:00
Ingo Molnar	1784f9144b	drivers/misc/intel/pti: Rename the header file to free up the namespace We'd like to use the 'PTI' acronym for 'Page Table Isolation' - free up the namespace by renaming the <linux/pti.h> driver header to <linux/intel-pti.h>. (Also standardize the header guard name while at it.) Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: J Freyensee <james_p_freyensee@linux.intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-12-17 12:52:34 +01:00
Linus Torvalds	7a3c296ae0	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: 1) Clamp timeouts to INT_MAX in conntrack, from Jay Elliot. 2) Fix broken UAPI for BPF_PROG_TYPE_PERF_EVENT, from Hendrik Brueckner. 3) Fix locking in ieee80211_sta_tear_down_BA_sessions, from Johannes Berg. 4) Add missing barriers to ptr_ring, from Michael S. Tsirkin. 5) Don't advertise gigabit in sh_eth when not available, from Thomas Petazzoni. 6) Check network namespace when delivering to netlink taps, from Kevin Cernekee. 7) Kill a race in raw_sendmsg(), from Mohamed Ghannam. 8) Use correct address in TCP md5 lookups when replying to an incoming segment, from Christoph Paasch. 9) Add schedule points to BPF map alloc/free, from Eric Dumazet. 10) Don't allow silly mtu values to be used in ipv4/ipv6 multicast, also from Eric Dumazet. 11) Fix SKB leak in tipc, from Jon Maloy. 12) Disable MAC learning on OVS ports of mlxsw, from Yuval Mintz. 13) SKB leak fix in skB_complete_tx_timestamp(), from Willem de Bruijn. 14) Add some new qmi_wwan device IDs, from Daniele Palmas. 15) Fix static key imbalance in ingress qdisc, from Jiri Pirko. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (76 commits) net: qcom/emac: Reduce timeout for mdio read/write net: sched: fix static key imbalance in case of ingress/clsact_init error net: sched: fix clsact init error path ip_gre: fix wrong return value of erspan_rcv net: usb: qmi_wwan: add Telit ME910 PID 0x1101 support pkt_sched: Remove TC_RED_OFFLOADED from uapi net: sched: Move to new offload indication in RED net: sched: Add TCA_HW_OFFLOAD net: aquantia: Increment driver version net: aquantia: Fix typo in ethtool statistics names net: aquantia: Update hw counters on hw init net: aquantia: Improve link state and statistics check interval callback net: aquantia: Fill in multicast counter in ndev stats from hardware net: aquantia: Fill ndev stat couters from hardware net: aquantia: Extend stat counters to 64bit values net: aquantia: Fix hardware DMA stream overload on large MRRS net: aquantia: Fix actual speed capabilities reporting sock: free skb in skb_complete_tx_timestamp on error s390/qeth: update takeover IPs after configuration change s390/qeth: lock IP table while applying takeover changes ...	2017-12-15 13:08:37 -08:00
Linus Torvalds	1f76a75561	Merge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fixes from Ingo Molnar: "Misc fixes: - Fix a S390 boot hang that was caused by the lock-break logic. Remove lock-break to begin with, as review suggested it was unreasonably fragile and our confidence in its continued good health is lower than our confidence in its removal. - Remove the lockdep cross-release checking code for now, because of unresolved false positive warnings. This should make lockdep work well everywhere again. - Get rid of the final (and single) ACCESS_ONCE() straggler and remove the API from v4.15. - Fix a liblockdep build warning" * 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: tools/lib/lockdep: Add missing declaration of 'pr_cont()' checkpatch: Remove ACCESS_ONCE() warning compiler.h: Remove ACCESS_ONCE() tools/include: Remove ACCESS_ONCE() tools/perf: Convert ACCESS_ONCE() to READ_ONCE() locking/lockdep: Remove the cross-release locking checks locking/core: Remove break_lock field when CONFIG_GENERIC_LOCKBREAK=y locking/core: Fix deadlock during boot on systems with GENERIC_LOCKBREAK	2017-12-15 11:44:59 -08:00
Yuval Mintz	4a98795bc8	pkt_sched: Remove TC_RED_OFFLOADED from uapi Following the previous patch, RED is now using the new uniform uapi for indicating it's offloaded. As a result, TC_RED_OFFLOADED is no longer utilized by kernel and can be removed [as it's still not part of any stable release]. Fixes: `602f3baf22` ("net_sch: red: Add offload ability to RED qdisc") Signed-off-by: Yuval Mintz <yuvalm@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-15 13:35:37 -05:00
Yuval Mintz	7a4fa29106	net: sched: Add TCA_HW_OFFLOAD Qdiscs can be offloaded to HW, but current implementation isn't uniform. Instead, qdiscs either pass information about offload status via their TCA_OPTIONS or omit it altogether. Introduce a new attribute - TCA_HW_OFFLOAD that would form a uniform uAPI for the offloading status of qdiscs. Signed-off-by: Yuval Mintz <yuvalm@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-15 13:35:36 -05:00
Linus Torvalds	032b4cc8ff	Power management fix for v4.15-rc4 This fixes an issue in two recent commits that may cause pm_runtime_enable() to be called for too many times for some devices during the "thaw" transition belonging to hibernation. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAABCAAGBQJaMyQRAAoJEILEb/54YlRx8C0P/j4/LO0dLY4bNPVfeMDpZrcz iXozHkIg2wKztuO7j+A3sRhShzu90wGjYsN9t3rrEsdqQn5GlB98vU+uak6heyRI kY/yLPYMnzyZbJ7Rlix6BtZgykhs7w20qxTMuy2ZV7AbeZwODVlm9swDswLcpI1D tZwbbj+g47I0uC2cuhc0GlcFWcTEyBp9RjsdRKbrFnrDPDOBhJg2MkQwPZ1deG0a q5PmTI0uskrqJif5ovEhMgQ24pE4C6t88HZj5Dg8JO3H4vaeNrTgHSFE8UjID3fM PH2DspE9vyTa0irxEcLQmq6rzHSZZkqdirzPn5HP8gz6zQVBMeAe277r6P2DrxxJ vygiG0pnDsntiA3ygfVMkVr/pL35Y6TPy2kJpKN1+LUDC5If4iObW91qu0CN5pWJ QxUS76TIGXTEIPhHLiy+OKPFTAYAADSeZAaedhdHLE2a3iPGB6TxT4RlPU0h3txD 4+aQ4cbi2cyGTbPD83jaoKbzL5oRUxnnZYGJEXu/Opbay8LDTSHaTTYxAohTMUj5 46eBEJ+hX8SDqVeAlX1UL37Wjx0zHYPyJSjYFIm5Imr/JovftG6X7lCDlFBA86VN DKw5Swc12QmaGmC9vQPMUwtn/QN2R0WCBm6Hisi/3OIu2/+pY/xqtBI2eEgExsgZ PYVvmIwqAl2+M6HcweqJ =0D8g -----END PGP SIGNATURE----- Merge tag 'pm-4.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fix from Rafael Wysocki: "This fixes an issue in two recent commits that may cause pm_runtime_enable() to be called for too many times for some devices during the "thaw" transition belonging to hibernation" * tag 'pm-4.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: PM / sleep: Avoid excess pm_runtime_enable() calls in device_resume()	2017-12-14 18:25:03 -08:00
Linus Torvalds	0424378781	Various fix ups. - Comment fixes - Build fix - Better memory alloction (don't use NR_CPUS) - Configuration fix - Build warning fix - Enhanced callback parameter (to simplify users of trace hooks) - Give up on stack tracing when RCU isn't watching (it's a lost cause) -----BEGIN PGP SIGNATURE----- iQHIBAABCgAyFiEEPm6V/WuN2kyArTUe1a05Y9njSUkFAlozKm0UHHJvc3RlZHRA Z29vZG1pcy5vcmcACgkQ1a05Y9njSUmhXwv7BEY923K3Nl3qC6LeYmNyrZ4g1PsD nbZ+ZjU3KlMPugGbnJCJbfsS0utUp2Wd9gHT32O4BUf0/Pxjo3utXvkzRQJ3SwHT X7QhXROkicAKRFrPxj0BaiLexC+yJR23wGp2YUVHLO4Aa/ptN8BJvH22+eDpsCLc f1DWJdvdbyPBUeoHNKevjvccsUYMlnBfe1jhJ9nRWHnq1axGV3bllcd6v4T07LgK LO28Krp4/V3tVN9Sq6jBoGTrULf5O1xtuBDVtANeXdha1oMUTYr4TUzTuOQxFgNu sCIpUWTKu+PNGmU5bhlryb20C2+PveE4EMK0InVVlqrhYJ/XjXFyRaZYYkM2NAKq XmeHVjMfc6Wmrd/nepuaTZvGGrK2kK/pCX/XuQthKKjAU6rO5X+FfGiSGodLPIYB 1m+QvfX8Re2IJkswm3lS68LKtG7SnjcSB9sY6PhVe6C6oP2ya2O1hthJ+TkrfNbc lhE5HzfCJZ9ujSjcQoUGwjrhnIezQX4KnfJZ =WdMQ -----END PGP SIGNATURE----- Merge tag 'trace-v4.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fixes from Steven Rostedt: "Various fix-ups: - comment fixes - build fix - better memory alloction (don't use NR_CPUS) - configuration fix - build warning fix - enhanced callback parameter (to simplify users of trace hooks) - give up on stack tracing when RCU isn't watching (it's a lost cause)" * tag 'trace-v4.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing: Have stack trace not record if RCU is not watching tracing: Pass export pointer as argument to ->write() ring-buffer: Remove unused function __rb_data_page_index() tracing: make PREEMPTIRQ_EVENTS depend on TRACING tracing: Allocate mask_str buffer dynamically tracing: always define trace_{irq,preempt}_{enable_disable} tracing: Fix code comments in trace.c	2017-12-14 18:21:33 -08:00
Linus Torvalds	c4f988ee51	pci-v4.15-fixes-1 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJaMwhcAAoJEFmIoMA60/r88ygP/3fuGrYrolx1yOM/XKjiG1Za B+FF5IK0xcV4Xrdg4K9TrqCkVBlSmTkF2wvD1raFQhvPvzNnTMI0JvI5iwUogSlA EI8rKvKRT0QRtT27URhGWCZOrPMgylE7Vtq7ZGLin2o19A4JQw3TC8mz9QirSbii C5H+WZ6iAQ3ISjOuiFUREqQDCE8FzbYQC69uh/xMRshZFG1WPsX96ZRSvVfPQUeC XwvKqiO4s9I7zyxhS/WSRnnEHO7LNSRRiq8kzP0VKKMAfTo8gLYPoeIPakvXSsQZ j7N1ycWub9DSn7wjh9Zd922wp1EYfxkCW6jilVOd+BlmOxLjyPPGgahbNIrIojwa qiTY5SNdty3h27IT3Hep4yDBoqhXXPORXgh7QRlzizu8Te2SjHY7n9qi1S8Uobn+ s98PoNYU+oelivi9MtKLKcqCaNL0RE8aCQfuA3TzCU/dzR4Trx8obV50f67vsL6e eK67S/KpOVUjHQmGec8iqnrJfHPZkxgaRTOVVYCXKidtJxTf/LyjDKYh8A2Q27A5 wXdRaSxLw25r0alAWzUEIVj/iUPd4nYKY9RTeRPWiXU+wsEUcBO+js6HE1xZAAfk ULDqZKnkKsgW9aWlQU6DzzLsCjxPTRcC8SfK79K47Otua2laABzL47mzo2sSKKFs mNq3i4OiNFq8XCsN18lo =o3Rn -----END PGP SIGNATURE----- Merge tag 'pci-v4.15-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI fixes from Bjorn Helgaas: - add a pci_get_domain_bus_and_slot() stub for the CONFIG_PCI=n case to avoid build breakage in the v4.16 merge window if a pci_get_bus_and_slot() -> pci_get_domain_bus_and_slot() patch gets merged before the PCI tree (Randy Dunlap) - fix an AMD boot regression in the 64bit BAR support added in v4.15 (Christian König) - fix an R-Car use-after-free that causes a crash if no PCIe card is present (Geert Uytterhoeven) * tag 'pci-v4.15-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: PCI: rcar: Fix use-after-free in probe error path x86/PCI: Only enable a 64bit BAR on single-socket AMD Family 15h x86/PCI: Fix infinite loop in search for 64bit BAR placement PCI: Add pci_get_domain_bus_and_slot() stub	2017-12-14 17:02:39 -08:00
Linus Torvalds	18d40eae7f	Merge branch 'akpm' (patches from Andrew) Merge misc fixes from Andrew Morton: "17 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: arch: define weak abort() mm, oom_reaper: fix memory corruption kernel: make groups_sort calling a responsibility group_info allocators mm/frame_vector.c: release a semaphore in 'get_vaddr_frames()' tools/slabinfo-gnuplot: force to use bash shell kcov: fix comparison callback signature mm/slab.c: do not hash pointers when debugging slab mm/page_alloc.c: avoid excessive IRQ disabled times in free_unref_page_list() mm/memory.c: mark wp_huge_pmd() inline to prevent build failure scripts/faddr2line: fix CROSS_COMPILE unset error Documentation/vm/zswap.txt: update with same-value filled page feature exec: avoid gcc-8 warning for get_task_comm autofs: fix careless error in recent commit string.h: workaround for increased stack usage mm/kmemleak.c: make cond_resched() rate-limiting more efficient lib/rbtree,drm/mm: add rbtree_replace_node_cached() include/linux/idr.h: add #include <linux/bug.h>	2017-12-14 16:35:20 -08:00
Michal Hocko	4837fe37ad	mm, oom_reaper: fix memory corruption David Rientjes has reported the following memory corruption while the oom reaper tries to unmap the victims address space BUG: Bad page map in process oom_reaper pte:6353826300000000 pmd:00000000 addr:00007f50cab1d000 vm_flags:08100073 anon_vma:ffff9eea335603f0 mapping: (null) index:7f50cab1d file: (null) fault: (null) mmap: (null) readpage: (null) CPU: 2 PID: 1001 Comm: oom_reaper Call Trace: unmap_page_range+0x1068/0x1130 __oom_reap_task_mm+0xd5/0x16b oom_reaper+0xff/0x14c kthread+0xc1/0xe0 Tetsuo Handa has noticed that the synchronization inside exit_mmap is insufficient. We only synchronize with the oom reaper if tsk_is_oom_victim which is not true if the final __mmput is called from a different context than the oom victim exit path. This can trivially happen from context of any task which has grabbed mm reference (e.g. to read /proc/<pid>/ file which requires mm etc.). The race would look like this oom_reaper oom_victim task mmget_not_zero do_exit mmput __oom_reap_task_mm mmput __mmput exit_mmap remove_vma unmap_page_range Fix this issue by providing a new mm_is_oom_victim() helper which operates on the mm struct rather than a task. Any context which operates on a remote mm struct should use this helper in place of tsk_is_oom_victim. The flag is set in mark_oom_victim and never cleared so it is stable in the exit_mmap path. Debugged by Tetsuo Handa. Link: http://lkml.kernel.org/r/20171210095130.17110-1-mhocko@kernel.org Fixes: `2129258024` ("mm: oom: let oom_reap_task and exit_mmap run concurrently") Signed-off-by: Michal Hocko <mhocko@suse.com> Reported-by: David Rientjes <rientjes@google.com> Acked-by: David Rientjes <rientjes@google.com> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Andrea Argangeli <andrea@kernel.org> Cc: <stable@vger.kernel.org> [4.14] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-12-14 16:00:49 -08:00
Thiago Rafael Becker	bdcf0a423e	kernel: make groups_sort calling a responsibility group_info allocators In testing, we found that nfsd threads may call set_groups in parallel for the same entry cached in auth.unix.gid, racing in the call of groups_sort, corrupting the groups for that entry and leading to permission denials for the client. This patch: - Make groups_sort globally visible. - Move the call to groups_sort to the modifiers of group_info - Remove the call to groups_sort from set_groups Link: http://lkml.kernel.org/r/20171211151420.18655-1-thiago.becker@gmail.com Signed-off-by: Thiago Rafael Becker <thiago.becker@gmail.com> Reviewed-by: Matthew Wilcox <mawilcox@microsoft.com> Reviewed-by: NeilBrown <neilb@suse.com> Acked-by: "J. Bruce Fields" <bfields@fieldses.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-12-14 16:00:49 -08:00
Arnd Bergmann	3756f6401c	exec: avoid gcc-8 warning for get_task_comm gcc-8 warns about using strncpy() with the source size as the limit: fs/exec.c:1223:32: error: argument to 'sizeof' in 'strncpy' call is the same expression as the source; did you mean to use the size of the destination? [-Werror=sizeof-pointer-memaccess] This is indeed slightly suspicious, as it protects us from source arguments without NUL-termination, but does not guarantee that the destination is terminated. This keeps the strncpy() to ensure we have properly padded target buffer, but ensures that we use the correct length, by passing the actual length of the destination buffer as well as adding a build-time check to ensure it is exactly TASK_COMM_LEN. There are only 23 callsites which I all reviewed to ensure this is currently the case. We could get away with doing only the check or passing the right length, but it doesn't hurt to do both. Link: http://lkml.kernel.org/r/20171205151724.1764896-1-arnd@arndb.de Signed-off-by: Arnd Bergmann <arnd@arndb.de> Suggested-by: Kees Cook <keescook@chromium.org> Acked-by: Kees Cook <keescook@chromium.org> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Serge Hallyn <serge@hallyn.com> Cc: James Morris <james.l.morris@oracle.com> Cc: Aleksa Sarai <asarai@suse.de> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Frederic Weisbecker <frederic@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-12-14 16:00:48 -08:00
Arnd Bergmann	146734b091	string.h: workaround for increased stack usage The hardened strlen() function causes rather large stack usage in at least one file in the kernel, in particular when CONFIG_KASAN is enabled: drivers/media/usb/em28xx/em28xx-dvb.c: In function 'em28xx_dvb_init': drivers/media/usb/em28xx/em28xx-dvb.c:2062:1: error: the frame size of 3256 bytes is larger than 204 bytes [-Werror=frame-larger-than=] Analyzing this problem led to the discovery that gcc fails to merge the stack slots for the i2c_board_info[] structures after we strlcpy() into them, due to the 'noreturn' attribute on the source string length check. I reported this as a gcc bug, but it is unlikely to get fixed for gcc-8, since it is relatively easy to work around, and it gets triggered rarely. An earlier workaround I did added an empty inline assembly statement before the call to fortify_panic(), which works surprisingly well, but is really ugly and unintuitive. This is a new approach to the same problem, this time addressing it by not calling the 'extern __real_strnlen()' function for string constants where __builtin_strlen() is a compile-time constant and therefore known to be safe. We do this by checking if the last character in the string is a compile-time constant '\0'. If it is, we can assume that strlen() of the string is also constant. As a side-effect, this should also improve the object code output for any other call of strlen() on a string constant. [akpm@linux-foundation.org: add comment] Link: http://lkml.kernel.org/r/20171205215143.3085755-1-arnd@arndb.de Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82365 Link: https://patchwork.kernel.org/patch/9980413/ Link: https://patchwork.kernel.org/patch/9974047/ Fixes: `6974f0c455` ("include/linux/string.h: add the option of fortified string.h functions") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Kees Cook <keescook@chromium.org> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Daniel Micay <danielmicay@gmail.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Martin Wilck <mwilck@suse.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-12-14 16:00:48 -08:00
Chris Wilson	338f1d9d1b	lib/rbtree,drm/mm: add rbtree_replace_node_cached() Add a variant of rbtree_replace_node() that maintains the leftmost cache of struct rbtree_root_cached when replacing nodes within the rbtree. As drm_mm is the only rb_replace_node() being used on an interval tree, the mistake looks fairly self-contained. Furthermore the only user of drm_mm_replace_node() is its testsuite... Testcase: igt/drm_mm/replace Link: http://lkml.kernel.org/r/20171122100729.3742-1-chris@chris-wilson.co.uk Link: https://patchwork.freedesktop.org/patch/msgid/20171109212435.9265-1-chris@chris-wilson.co.uk Fixes: `f808c13fd3` ("lib/interval_tree: fast overlap detection") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Acked-by: Davidlohr Bueso <dbueso@suse.de> Cc: Jérôme Glisse <jglisse@redhat.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-12-14 16:00:48 -08:00
Wei Wang	c47d7f56e9	include/linux/idr.h: add #include <linux/bug.h> The <linux/bug.h> was removed from radix-tree.h by commit `f5bba9d11a` ("include/linux/radix-tree.h: remove unneeded #include <linux/bug.h>"). Since that commit, tools/testing/radix-tree/ couldn't pass compilation due to tools/testing/radix-tree/idr.c:17: undefined reference to WARN_ON_ONCE. This patch adds the bug.h header to idr.h to solve the issue. Link: http://lkml.kernel.org/r/1511963726-34070-2-git-send-email-wei.w.wang@intel.com Fixes: `f5bba9d11a` ("include/linux/radix-tree.h: remove unneeded #include <linux/bug.h>") Signed-off-by: Wei Wang <wei.w.wang@intel.com> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: Jan Kara <jack@suse.cz> Cc: Eric Biggers <ebiggers@google.com> Cc: Tejun Heo <tj@kernel.org> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: Michal Hocko <mhocko@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-12-14 16:00:48 -08:00
Linus Torvalds	e375922fc5	Merge tag 'drm-misc-fixes-2017-12-14' of git://anongit.freedesktop.org/drm/drm-misc Pull drm fixes from Daniel Vetter: - two fixes for new core features - a corner case fix for the connnector_iter fix from last week (this one is cc: stable) - one vc4 fix * tag 'drm-misc-fixes-2017-12-14' of git://anongit.freedesktop.org/drm/drm-misc: drm/drm_lease: Prevent deadlock in case drm_lease_create() fails drm: rework delayed connector cleanup in connector_iter drm: Update edid-derived drm_display_info fields at edid property set [v2] drm/vc4: Release fence after signalling	2017-12-14 11:45:53 -08:00
Daniel Vetter	ea497bb920	drm: rework delayed connector cleanup in connector_iter PROBE_DEFER also uses system_wq to reprobe drivers, which means when that again fails, and we try to flush the overall system_wq (to get all the delayed connectore cleanup work_struct completed), we deadlock. Fix this by using just a single cleanup work, so that we can only flush that one and don't block on anything else. That means a free list plus locking, a standard pattern. v2: - Correctly free connectors only on last ref. Oops (Chris). - use llist_head/node (Chris). v3 - Add init_llist_head (Chris). Fixes: `a703c55004` ("drm: safely free connectors from connector_iter") Fixes: `613051dac4` ("drm: locking&new iterators for connector_list") Cc: Ben Widawsky <ben@bwidawsk.net> Cc: Dave Airlie <airlied@gmail.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Sean Paul <seanpaul@chromium.org> Cc: <stable@vger.kernel.org> # v4.11+: `613051dac4` ("drm: locking&new iterators for connector_list" Cc: <stable@vger.kernel.org> # v4.11+ Cc: Daniel Vetter <daniel.vetter@intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Gustavo Padovan <gustavo@padovan.org> Cc: David Airlie <airlied@linux.ie> Cc: Javier Martinez Canillas <javier@dowhile0.org> Cc: Shuah Khan <shuahkh@osg.samsung.com> Cc: Guillaume Tucker <guillaume.tucker@collabora.com> Cc: Mark Brown <broonie@kernel.org> Cc: Kevin Hilman <khilman@baylibre.com> Cc: Matt Hart <matthew.hart@linaro.org> Cc: Thierry Escande <thierry.escande@collabora.co.uk> Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com> Cc: Enric Balletbo i Serra <enric.balletbo@collabora.com> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171213124936.17914-1-daniel.vetter@ffwll.ch	2017-12-13 22:59:00 +01:00
Eric Dumazet	b5476022bb	ipv4: igmp: guard against silly MTU values IPv4 stack reacts to changes to small MTU, by disabling itself under RTNL. But there is a window where threads not using RTNL can see a wrong device mtu. This can lead to surprises, in igmp code where it is assumed the mtu is suitable. Fix this by reading device mtu once and checking IPv4 minimal MTU. This patch adds missing IPV4_MIN_MTU define, to not abuse ETH_MIN_MTU anymore. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-13 13:13:58 -05:00
Keith Packard	4b4df570b4	drm: Update edid-derived drm_display_info fields at edid property set [v2] There are a set of values in the drm_display_info structure for each connector which hold information derived from EDID. These are computed in drm_add_display_info. Before this patch, that was only called in drm_add_edid_modes. This meant that they were only set when EDID was present and never reset when EDID was not, as happened when the display was disconnected. One of these fields, non_desktop, is used from drm_mode_connector_update_edid_property, the function responsible for assigning the new edid value to the application-visible property. Various drivers call these two functions (drm_add_edid_modes and drm_mode_connector_update_edid_property) in different orders. This means that even when EDID is present, the drm_display_info fields may not have been computed at the time that drm_mode_connector_update_edid_property used the non_desktop value to set the non_desktop property. I've added a public function (drm_reset_display_info) that resets the drm_display_info field values to default values and then made the drm_add_display_info function public. These two functions are now called directly from drm_mode_connector_update_edid_property so that the drm_display_info fields are always computed from the current EDID information before being used in that function. This means that the drm_display_info values are often computed twice, once when the EDID property it set and a second time when EDID is used to compute modes for the device. The alternative would be to uniformly ensure that the values were computed once before being used, which would require that all drivers reliably invoke the two paths in the same order. The computation is inexpensive enough that it seems more maintainable in the long term to simply compute them in both paths. The API to drm_add_display_info has been changed so that it no longer takes the set of edid-based quirks as a parameter. Rather, it now computes those quirks itself and returns them for further use by drm_add_edid_modes. This patch also includes a number of 'const' additions caused by drm_mode_connector_update_edid_property taking a 'const struct edid *' parameter and wanting to pass that along to drm_add_display_info. v2: after review by Daniel Vetter <daniel.vetter@ffwll.ch> Removed EXPORT_SYMBOL_GPL for drm_reset_display_info and drm_add_display_info. Added FIXME in drm_mode_connector_update_edid_property about potentially merging that with drm_add_edid_modes to avoid the need for two driver calls. Signed-off-by: Keith Packard <keithp@keithp.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20171213084427.31199-1-keithp@keithp.com (danvet: cherry picked from commit 12a889bf4bca ("drm: rework delayed connector cleanup in connector_iter") from drm-misc-next since functional conflict with changes in -next and we need to make sure both have the right version and nothing gets lost.) Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2017-12-13 14:51:37 +01:00
Mark Rutland	b899a85043	compiler.h: Remove ACCESS_ONCE() There are no longer any kernelspace uses of ACCESS_ONCE(), so we can remove the definition from <linux/compiler.h>. This patch removes the ACCESS_ONCE() definition, and updates comments which referred to it. At the same time, some inconsistent and redundant whitespace is removed from comments. Tested-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Joe Perches <joe@perches.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: apw@canonical.com Link: http://lkml.kernel.org/r/20171127103824.36526-4-mark.rutland@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-12-12 13:22:10 +01:00
Ingo Molnar	e966eaeeb6	locking/lockdep: Remove the cross-release locking checks This code (CONFIG_LOCKDEP_CROSSRELEASE=y and CONFIG_LOCKDEP_COMPLETIONS=y), while it found a number of old bugs initially, was also causing too many false positives that caused people to disable lockdep - which is arguably a worse overall outcome. If we disable cross-release by default but keep the code upstream then in practice the most likely outcome is that we'll allow the situation to degrade gradually, by allowing entropy to introduce more and more false positives, until it overwhelms maintenance capacity. Another bad side effect was that people were trying to work around the false positives by uglifying/complicating unrelated code. There's a marked difference between annotating locking operations and uglifying good code just due to bad lock debugging code ... This gradual decrease in quality happened to a number of debugging facilities in the kernel, and lockdep is pretty complex already, so we cannot risk this outcome. Either cross-release checking can be done right with no false positives, or it should not be included in the upstream kernel. ( Note that it might make sense to maintain it out of tree and go through the false positives every now and then and see whether new bugs were introduced. ) Cc: Byungchul Park <byungchul.park@lge.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-12-12 12:38:51 +01:00
Will Deacon	d89c70356a	locking/core: Remove break_lock field when CONFIG_GENERIC_LOCKBREAK=y When CONFIG_GENERIC_LOCKBEAK=y, locking structures grow an extra int ->break_lock field which is used to implement raw_spin_is_contended() by setting the field to 1 when waiting on a lock and clearing it to zero when holding a lock. However, there are a few problems with this approach: - There is a write-write race between a CPU successfully taking the lock (and subsequently writing break_lock = 0) and a waiter waiting on the lock (and subsequently writing break_lock = 1). This could result in a contended lock being reported as uncontended and vice-versa. - On machines with store buffers, nothing guarantees that the writes to break_lock are visible to other CPUs at any particular time. - READ_ONCE/WRITE_ONCE are not used, so the field is potentially susceptible to harmful compiler optimisations, Consequently, the usefulness of this field is unclear and we'd be better off removing it and allowing architectures to implement raw_spin_is_contended() by providing a definition of arch_spin_is_contended(), as they can when CONFIG_GENERIC_LOCKBREAK=n. Signed-off-by: Will Deacon <will.deacon@arm.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Sebastian Ott <sebott@linux.vnet.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1511894539-7988-3-git-send-email-will.deacon@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-12-12 11:24:01 +01:00
Linus Torvalds	916b20e02e	Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto fixes from Herbert Xu: "This push fixes the following issues: - buffer overread in RSA - potential use after free in algif_aead. - error path null pointer dereference in af_alg - forbid combinations such as hmac(hmac(sha3)) which may crash - crash in salsa20 due to incorrect API usage" * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: salsa20 - fix blkcipher_walk API usage crypto: hmac - require that the underlying hash algorithm is unkeyed crypto: af_alg - fix NULL pointer dereference in crypto: algif_aead - fix reference counting of null skcipher crypto: rsa - fix buffer overread when stripping leading zeroes	2017-12-11 16:32:45 -08:00
Xin Long	200809716a	fou: fix some member types in guehdr guehdr struct is used to build or parse gue packets, which are always in big endian. It's better to define all guehdr members as __beXX types. Also, in validate_gue_flags it's not good to use a __be32 variable for both Standard flags(__be16) and Private flags (__be32), and pass it to other funcions. This patch could fix a bunch of sparse warnings from fou. Fixes: `5024c33ac3` ("gue: Add infrastructure for flags and options") Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-11 14:10:06 -05:00
Michael S. Tsirkin	a8ceb5dbfd	ptr_ring: add barriers Users of ptr_ring expect that it's safe to give the data structure a pointer and have it be available to consumers, but that actually requires an smb_wmb or a stronger barrier. In absence of such barriers and on architectures that reorder writes, consumer might read an un=initialized value from an skb pointer stored in the skb array. This was observed causing crashes. To fix, add memory barriers. The barrier we use is a wmb, the assumption being that producers do not need to read the value so we do not need to order these reads. Reported-by: George Cherian <george.cherian@cavium.com> Suggested-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-11 10:52:23 -05:00
Rafael J. Wysocki	3487972d7f	PM / sleep: Avoid excess pm_runtime_enable() calls in device_resume() Middle-layer code doing suspend-time optimizations for devices with the DPM_FLAG_SMART_SUSPEND flag set (currently, the PCI bus type and the ACPI PM domain) needs to make the core skip ->thaw_early and ->thaw callbacks for those devices in some cases and it sets the power.direct_complete flag for them for this purpose. However, it turns out that setting power.direct_complete outside of the PM core is a bad idea as it triggers an excess invocation of pm_runtime_enable() in device_resume(). For this reason, provide a helper to clear power.is_late_suspended and power.is_suspended to be invoked by the middle-layer code in question instead of setting power.direct_complete and make that code call the new helper. Fixes: `c4b65157ae` (PCI / PM: Take SMART_SUSPEND driver flag into account) Fixes: `05087360fd` (ACPI / PM: Take SMART_SUSPEND driver flag into account) Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Acked-by: Bjorn Helgaas <bhelgaas@google.com>	2017-12-11 14:32:56 +01:00
Linus Torvalds	c465fc11e5	KVM fixes for v4.15-rc3 ARM: * A number of issues in the vgic discovered using SMATCH * A bit one-off calculation in out stage base address mask (32-bit and 64-bit) * Fixes to single-step debugging instructions that trap for other reasons such as MMMIO aborts * Printing unavailable hyp mode as error * Potential spinlock deadlock in the vgic * Avoid calling vgic vcpu free more than once * Broken bit calculation for big endian systems s390: * SPDX tags * Fence storage key accesses from problem state * Make sure that irq_state.flags is not used in the future x86: * Intercept port 0x80 accesses to prevent host instability (CVE) * Use userspace FPU context for guest FPU (mainly an optimization that fixes a double use of kernel FPU) * Do not leak one page per module load * Flush APIC page address cache from MMU invalidation notifiers -----BEGIN PGP SIGNATURE----- iQEcBAABCAAGBQJaLA93AAoJEED/6hsPKofo9msH/2DrqT2FOKfLuxNR2FeUGWr3 lqFoBRUXrVDMINGStnWrV36h/xYzlgJl9jtSDS8dr3VxLqtrNLlDg9NmGeogoZ+k /xewr/jFYoSRfffsvrbkzORUfvu6zqvJwufiwBEJwAfcswiLqPizdFXcxtUL4eZE 9s9sIweo5zp2Xjg5yLOEkyanePKMEht/81zPkHyM+g0ZMoaPam3qZHA0lLzdyRgd G9LpSyiMFHguYYgbwipaVue3zgMY1EdmKQ8C2hEPmZd8nVau26YDwRnAwwLrmVkW sFhGO1Xi18TzQPokzALC25c9v0fqgxL5+fNyFNgWwTc2n9PSwO+IHcy699UH+3A= =Qcqd -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull KVM fixes from Radim Krčmář: "ARM: - A number of issues in the vgic discovered using SMATCH - A bit one-off calculation in out stage base address mask (32-bit and 64-bit) - Fixes to single-step debugging instructions that trap for other reasons such as MMMIO aborts - Printing unavailable hyp mode as error - Potential spinlock deadlock in the vgic - Avoid calling vgic vcpu free more than once - Broken bit calculation for big endian systems s390: - SPDX tags - Fence storage key accesses from problem state - Make sure that irq_state.flags is not used in the future x86: - Intercept port 0x80 accesses to prevent host instability (CVE) - Use userspace FPU context for guest FPU (mainly an optimization that fixes a double use of kernel FPU) - Do not leak one page per module load - Flush APIC page address cache from MMU invalidation notifiers" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (28 commits) KVM: x86: fix APIC page invalidation KVM: s390: Fix skey emulation permission check KVM: s390: mark irq_state.flags as non-usable KVM: s390: Remove redundant license text KVM: s390: add SPDX identifiers to the remaining files KVM: VMX: fix page leak in hardware_setup() KVM: VMX: remove I/O port 0x80 bypass on Intel hosts x86,kvm: remove KVM emulator get_fpu / put_fpu x86,kvm: move qemu/guest FPU switching out to vcpu_run KVM: arm/arm64: Fix broken GICH_ELRSR big endian conversion KVM: arm/arm64: kvm_arch_destroy_vm cleanups KVM: arm/arm64: Fix spinlock acquisition in vgic_set_owner kvm: arm: don't treat unavailable HYP mode as an error KVM: arm/arm64: Avoid attempting to load timer vgic state without a vgic kvm: arm64: handle single-step of hyp emulated mmio instructions kvm: arm64: handle single-step during SError exceptions kvm: arm64: handle single-step of userspace mmio instructions kvm: arm64: handle single-stepping trapped instructions KVM: arm/arm64: debug: Introduce helper for single-step arm: KVM: Fix VTTBR_BADDR_MASK BUG_ON off-by-one ...	2017-12-10 08:24:16 -08:00
Michal Hocko	f335195adf	kmemcheck: rip it out for real Commit `4675ff05de` ("kmemcheck: rip it out") has removed the code but for some reason SPDX header stayed in place. This looks like a rebase mistake in the mmotm tree or the merge mistake. Let's drop those leftovers as well. Signed-off-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-12-08 13:40:17 -08:00

1 2 3 4 5 ...

97471 Commits (de99a346884f019387230bc549de74456daca248)