1
0
Fork 0
Commit Graph

63626 Commits (0d945c1f966b2bcb67bb12be749da0a7fb00201b)

Author SHA1 Message Date
Christoph Hellwig 0d945c1f96 block: remove the queue_lock indirection
With the legacy request path gone there is no good reason to keep
queue_lock as a pointer, we can always use the embedded lock now.

Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

Fixed floppy and blk-cgroup missing conversions and half done edits.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-15 12:17:28 -07:00
Christoph Hellwig 6d46964230 block: remove the lock argument to blk_alloc_queue_node
With the legacy request path gone there is no real need to override the
queue_lock.

Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-15 12:13:35 -07:00
Christoph Hellwig 57d74df907 block: use atomic bitops for ->queue_flags
->queue_flags is generally not set or cleared in the fast path, and also
generally set or cleared one flag at a time.  Make use of the normal
atomic bitops for it so that we don't need to take the queue_lock,
which is otherwise mostly unused in the core block layer now.

Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-15 12:13:19 -07:00
Christoph Hellwig 079076b341 block: remove deadline __deadline manipulation helpers
No users left since the removal of the legacy request interface, we can
remove all the magic bit stealing now and make it a normal field.

But use WRITE_ONCE/READ_ONCE on the new deadline field, given that we
don't seem to have any mechanism to guarantee a new value actually
gets seen by other threads.

Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-15 12:13:16 -07:00
Christoph Hellwig 8f4236d900 block: remove QUEUE_FLAG_BYPASS and ->bypass
Unused since the removal of the legacy request code.

Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-15 12:13:15 -07:00
Jens Axboe 7ff4f80356 block: remove dead queue members
No more users of ->in_flight[] or ->nr_sorted, get rid of them.

Fixes: a1ce35fa49 ("block: remove dead elevator code")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-14 15:22:49 -07:00
Christoph Hellwig 22ce0a7ccf ide: don't use req->special
Just replace it with a field of the same name in struct ide_req.

Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-10 08:03:52 -07:00
Christoph Hellwig 0e17e06cbf block: remove the BLKPREP_* values.
Unused now.

Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-09 19:17:14 -07:00
Christoph Hellwig 535ac5d3fe ide: cleanup ->prep_rq calling convention
The return value is just used as a binary yes/no decision, so switch
it to a bool instead of the old BLKPREP_* values returned as an int.

Also clean up a few related comments.

Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-09 19:17:13 -07:00
Christoph Hellwig 9d037ad707 block: remove req->timeout_list
Unused now that the legacy request path is gone.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-09 11:44:10 -07:00
Jens Axboe ae8799125d blk-mq: provide a helper to check if a queue is busy
Returns true if the queue currently has requests pending,
false if not.

DM can use this to replace the atomic_inc/dec they do per device
to see if a device is busy.

Reviewed-by: Mike Snitzer <snitzer@redhat.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-08 10:24:23 -07:00
Jens Axboe 7baa85727d blk-mq-tag: change busy_iter_fn to return whether to continue or not
We have this functionality in sbitmap, but we don't export it in
blk-mq for users of the tags busy iteration. This can be useful
for stopping the iteration, if the caller doesn't need to find
more requests.

Reviewed-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-08 10:24:07 -07:00
Jens Axboe 4b04cc6a8f nvme: add separate poll queue map
Adds support for defining a variable number of poll queues, currently
configurable with the 'poll_queues' module parameter. Defaults to
a single poll queue.

And now we finally have poll support without triggering interrupts!

Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-07 13:45:00 -07:00
Jens Axboe d1e36282b0 block: add REQ_HIPRI and inherit it from IOCB_HIPRI
We use IOCB_HIPRI to poll for IO in the caller instead of scheduling.
This information is not available for (or after) IO submission. The
driver may make different queue choices based on the type of IO, so
make the fact that we will poll for this IO known to the lower layers
as well.

Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-07 13:45:00 -07:00
Jens Axboe 843477d4cc blk-mq: initial support for multiple queue maps
Add a queue offset to the tag map. This enables users to map
iteratively, for each queue map type they support.

Bump maximum number of supported maps to 2, we're now fully
able to support more than 1 map.

Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-07 13:45:00 -07:00
Jens Axboe ea4f995ee8 blk-mq: cache request hardware queue mapping
We call blk_mq_map_queue() a lot, at least two times for each
request per IO, sometimes more. Since we now have an indirect
call as well in that function. cache the mapping so we don't
have to re-call blk_mq_map_queue() for the same request
multiple times.

Reviewed-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-07 13:44:59 -07:00
Jens Axboe b3c661b15d blk-mq: support multiple hctx maps
Add support for the tag set carrying multiple queue maps, and
for the driver to inform blk-mq how many it wishes to support
through setting set->nr_maps.

This adds an mq_ops helper for drivers that support more than 1
map, mq_ops->rq_flags_to_type(). The function takes request/bio
flags and CPU, and returns a queue map index for that. We then
use the type information in blk_mq_map_queue() to index the map
set.

Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-07 13:44:59 -07:00
Jens Axboe f31967f0e4 blk-mq: allow software queue to map to multiple hardware queues
The mapping used to be dependent on just the CPU location, but
now it's a tuple of (type, cpu) instead. This is a prep patch
for allowing a single software queue to map to multiple hardware
queues. No functional changes in this patch.

This changes the software queue count to an unsigned short
to save a bit of space. We can still support 64K-1 CPUs,
which should be enough. Add a check to catch a wrap.

Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-07 13:44:59 -07:00
Jens Axboe ed76e329d7 blk-mq: abstract out queue map
This is in preparation for allowing multiple sets of maps per
queue, if so desired.

Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-07 13:44:59 -07:00
Jens Axboe a8908939af blk-mq: kill q->mq_map
It's just a pointer to set->mq_map, use that instead. Move the
assignment a bit earlier, so we always know it's valid.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-07 13:44:59 -07:00
Jens Axboe a0fedc857d Merge branch 'irq/for-block' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into for-4.21/block
Pull in the irq affinity commits, that are staged through Thomas's
tree.

* 'irq/for-block' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  genirq/affinity: Add support for allocating interrupt sets
  genirq/affinity: Pass first vector to __irq_build_affinity_masks()
  genirq/affinity: Move two stage affinity spreading into a helper function
  genirq/affinity: Spread IRQs to all available NUMA nodes
2018-11-07 13:43:54 -07:00
Jens Axboe 9cf2bab630 block: kill request ->cpu member
This was used for completion placement for the legacy path,
but for mq we have rq->mq_ctx->cpu for that. Add a helper
to get the request CPU assignment, as the mq_ctx type is
private to blk-mq.

Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-07 13:42:33 -07:00
Jens Axboe c7bb9ad174 block: get rid of q->softirq_done_fn()
With the legacy path gone, all we do is funnel it through the
mq_ops->complete() operation.

Tested-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-07 13:42:33 -07:00
Jens Axboe 7d692330e7 block: get rid of blk_queued_rq()
No point in hiding what this does, just open code it in the
one spot where we are still using it.

Reviewed-by: Hannes Reinecke <hare@suse.com>
Tested-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-07 13:42:33 -07:00
Jens Axboe db6d995235 block: remove request_list code
It's now dead code, nobody uses it.

Reviewed-by: Hannes Reinecke <hare@suse.com>
Tested-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-07 13:42:33 -07:00
Jens Axboe 1028e4b335 bsg: move bsg-lib parts outside of request queue
Get rid of the special bsg job fn and timeout handler, move them
into a private bsg_set instead.

Mostly from Christoph, with fixes for error handling and cleanups.

Reviewed-by: Hannes Reinecke <hare@suse.com>
Tested-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-07 13:42:33 -07:00
Jens Axboe 4316b79e43 block: kill legacy parts of timeout handling
The only user of legacy timing now is BSG, which is invoked
from the mq timeout handler. Kill the legacy code, and rename
the q->rq_timed_out_fn to q->bsg_job_timeout_fn.

Reviewed-by: Hannes Reinecke <hare@suse.com>
Tested-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-07 13:42:33 -07:00
Jens Axboe 92bc5a2484 block: remove __blk_put_request()
Now there's no difference between blk_put_request() and
__blk_put_request() anymore, get rid of the underscore version and
convert the few callers.

Reviewed-by: Hannes Reinecke <hare@suse.com>
Tested-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-07 13:42:32 -07:00
Jens Axboe f9cd4bfe96 block: get rid of MQ scheduler ops union
This is a remnant of when we had ops for both SQ and MQ
schedulers. Now it's just MQ, so get rid of the union.

Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-07 13:42:32 -07:00
Jens Axboe a1ce35fa49 block: remove dead elevator code
This removes a bunch of core and elevator related code. On the core
front, we remove anything related to queue running, draining,
initialization, plugging, and congestions. We also kill anything
related to request allocation, merging, retrieval, and completion.

Remove any checking for single queue IO schedulers, as they no
longer exist. This means we can also delete a bunch of code related
to request issue, adding, completion, etc - and all the SQ related
ops and helpers.

Also kill the load_default_modules(), as all that did was provide
for a way to load the default single queue elevator.

Tested-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-07 13:42:32 -07:00
Jens Axboe 7ca0192646 block: remove legacy rq tagging
It's now unused, kill it.

Reviewed-by: Hannes Reinecke <hare@suse.com>
Tested-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-07 13:42:32 -07:00
Jens Axboe 771a93c489 block: remove blk_complete_request()
It's now unused.

Reviewed-by: Hannes Reinecke <hare@suse.com>
Tested-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-07 13:42:32 -07:00
Jens Axboe 5e28b8d8a1 bsg: provide bsg_remove_queue() helper
All drivers do unregister + cleanup, provide a helper for that.

Cc: linux-scsi@vger.kernel.org
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Tested-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Tested-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-07 13:42:32 -07:00
Jens Axboe aae3b069d5 bsg: pass in desired timeout handler
This will ease in the conversion to blk-mq, where we can't set
a timeout handler after queue init.

Cc: Johannes Thumshirn <jthumshirn@suse.de>
Cc: linux-scsi@vger.kernel.org
Reviewed-by: Hannes Reinecke <hare@suse.com>
Tested-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Tested-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-07 13:42:32 -07:00
Jens Axboe c6f2882691 block: remove q->lld_busy_fn()
Nobody is using the legacy path for blk_lld_busy() anymore, remove
it.

Reviewed-by: Hannes Reinecke <hare@suse.com>
Tested-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-07 13:42:32 -07:00
Jens Axboe 9ba20527f4 blk-mq: provide mq_ops->busy() hook
We'll hook into this from blk_lld_busy(), allowing blk-mq to also
return whether or not a given queue currently has requests in
progress.

Reviewed-by: Hannes Reinecke <hare@suse.com>
Tested-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-07 13:42:31 -07:00
Jens Axboe 600335205b ide: convert to blk-mq
ide-disk and ide-cd tested as working just fine, ide-tape and
ide-floppy haven't. But the latter don't require changes, so they
should work without issue.

Add helper function to insert a request from a work queue, since we
cannot invoke the blk-mq request insertion from IRQ context.

Cc: David Miller <davem@davemloft.net>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Tested-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-07 13:42:31 -07:00
Linus Torvalds ecb4d529f1 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid
Pull HID fixes from Jiri Kosina:

 - hid.git is moving towards group maintainership (where group is myself
   and Benjamin Tissoires), therefore this pull request updates
   MAINTAINERS accordingly

 - fix for hid-asus config dependency from Arnd Bergmann

 - two device-specific quirks for i2c-hid from Julian Sax and Kai-Heng
   Feng

 - other few small assorted fixes

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
  HID: fix up .raw_event() documentation
  HID: asus: fix build warning wiht CONFIG_ASUS_WMI disabled
  HID: i2c-hid: add Direkt-Tek DTLAPY133-1 to descriptor override
  HID: moving to group maintainership model
  HID: alps: allow incoming reports when only the trackstick is opened
  Revert "HID: add NOGET quirk for Eaton Ellipse MAX UPS"
  HID: i2c-hid: Add a small delay after sleep command for Raydium touchpanel
  HID: hiddev: fix potential Spectre v1
2018-11-07 09:05:58 -08:00
Linus Torvalds a13511dfa8 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Handle errors mid-stream of an all dump, from Alexey Kodanev.

 2) Fix build of openvswitch with certain combinations of netfilter
    options, from Arnd Bergmann.

 3) Fix interactions between GSO and BQL, from Eric Dumazet.

 4) Don't put a '/' in RTL8201F's sysfs file name, from Holger
    Hoffstätte.

 5) S390 qeth driver fixes from Julian Wiedmann.

 6) Allow ipv6 link local addresses for netconsole when both source and
    destination are link local, from Matwey V. Kornilov.

 7) Fix the BPF program address seen in /proc/kallsyms, from Song Liu.

 8) Initialize mutex before use in dsa microchip driver, from Tristram
    Ha.

 9) Out-of-bounds access in hns3, from Yunsheng Lin.

10) Various netfilter fixes from Stefano Brivio, Jozsef Kadlecsik, Jiri
    Slaby, Florian Westphal, Eric Westbrook, Andrey Ryabinin, and Pablo
    Neira Ayuso.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (50 commits)
  net: alx: make alx_drv_name static
  net: bpfilter: fix iptables failure if bpfilter_umh is disabled
  sock_diag: fix autoloading of the raw_diag module
  net: core: netpoll: Enable netconsole IPv6 link local address
  ipv6: properly check return value in inet6_dump_all()
  rtnetlink: restore handling of dumpit return value in rtnl_dump_all()
  net/ipv6: Move anycast init/cleanup functions out of CONFIG_PROC_FS
  bonding/802.3ad: fix link_failure_count tracking
  net: phy: realtek: fix RTL8201F sysfs name
  sctp: define SCTP_SS_DEFAULT for Stream schedulers
  sctp: fix strchange_flags name for Stream Change Event
  mlxsw: spectrum: Fix IP2ME CPU policer configuration
  openvswitch: fix linking without CONFIG_NF_CONNTRACK_LABELS
  qed: fix link config error handling
  net: hns3: Fix for out-of-bounds access when setting pfc back pressure
  net/mlx4_en: use __netdev_tx_sent_queue()
  net: do not abort bulk send on BQL status
  net: bql: add __netdev_tx_sent_queue()
  s390/qeth: report 25Gbit link speed
  s390/qeth: sanitize ARP requests
  ...
2018-11-06 07:44:04 -08:00
Linus Walleij aa9b760cec HID: fix up .raw_event() documentation
The documentation for the .raw_event() callback says that if the
driver return 1, there will be no further processing of the event,
but this is not true, the actual code in hid-core.c looks like this:

  if (hdrv && hdrv->raw_event && hid_match_report(hid, report)) {
           ret = hdrv->raw_event(hid, report, data, size);
           if (ret < 0)
                   goto unlock;
   }

   ret = hid_report_raw_event(hid, type, data, size, interrupt);

The only return value that has any effect on the processing is
a negative error.

Correct this as it seems to confuse people: I found bogus code in
the Razer out-of-tree driver attempting to return 1 here.

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2018-11-06 13:59:08 +01:00
David S. Miller a422757e8c Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following patchset contains the first batch of Netfilter fixes for
your net tree:

1) Fix splat with IPv6 defragmenting locally generated fragments,
   from Florian Westphal.

2) Fix Incorrect check for missing attribute in nft_osf.

3) Missing INT_MIN & INT_MAX definition for netfilter bridge uapi
   header, from Jiri Slaby.

4) Revert map lookup in nft_numgen, this is already possible with
   the existing infrastructure without this extension.

5) Fix wrong listing of set reference counter, make counter
   synchronous again, from Stefano Brivio.

6) Fix CIDR 0 in hash:net,port,net, from Eric Westbrook.

7) Fix allocation failure with large set, use kvcalloc().
   From Andrey Ryabinin.

8) No need to disable BH when fetch ip set comment, patch from
   Jozsef Kadlecsik.

9) Sanity check for valid sysfs entry in xt_IDLETIMER, from
   Taehee Yoo.

10) Fix suspicious rcu usage via ip_set() macro at netlink dump,
    from Jozsef Kadlecsik.

11) Fix setting default timeout via nfnetlink_cttimeout, this
    comes with preparation patch to add nf_{tcp,udp,...}_pernet()
    helper.

12) Allow ebtables table nat to be of filter type via nft_compat.
    From Florian Westphal.

13) Incorrect calculation of next bucket in early_drop, do no bump
    hash value, update bucket counter instead. From Vasily Khoruzhick.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-05 17:19:25 -08:00
Martin Schwidefsky 163c8d54a9 compiler: remove __no_sanitize_address_or_inline again
The __no_sanitize_address_or_inline and __no_kasan_or_inline defines
are almost identical. The only difference is that __no_kasan_or_inline
does not have the 'notrace' attribute.

To be able to replace __no_sanitize_address_or_inline with the older
definition, add 'notrace' to __no_kasan_or_inline and change to two
users of __no_sanitize_address_or_inline in the s390 code.

The 'notrace' option is necessary for e.g. the __load_psw_mask function
in arch/s390/include/asm/processor.h. Without the option it is possible
to trace __load_psw_mask which leads to kernel stack overflow.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Pointed-out-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-11-05 08:14:18 -08:00
Jens Axboe 6da4b3ab9a genirq/affinity: Add support for allocating interrupt sets
A driver may have a need to allocate multiple sets of MSI/MSI-X interrupts,
and have them appropriately affinitized.

Add support for defining a number of sets in the irq_affinity structure, of
varying sizes, and get each set affinitized correctly across the machine.

[ tglx: Minor changelog tweaks ]

Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Cc: linux-block@vger.kernel.org
Link: https://lkml.kernel.org/r/20181102145951.31979-5-ming.lei@redhat.com
2018-11-05 12:16:27 +01:00
Linus Torvalds 4710e78940 NFS client bugfixes for Linux 4.20
Highlights include:
 
 Bugfixes:
 - Fix build issues on architectures that don't provide 64-bit cmpxchg
 
 Cleanups:
 - Fix a spelling mistake
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJb3vl/AAoJEA4mA3inWBJc5J0P/1zjDSsf/H4/Pa3aktfgwMds
 Z1clRgBJrqBRodF78ARcNI7OfZroHFYJHQVq+E0HwXbzFj4/YZGfXkKhRYSgCZyT
 uZKCNY42DirHuWR852ukQhdmskD/lWVlI4LIiwOpDpTD7v/GX5hFXpbTkHgKswDP
 G+euxbovzu7IgJP6Ww0XfGCGgBq2H8r0AitF9uSpgVmJOTjpRisodJZy94xvy0e8
 HVo6BxtBVle6N43qymO4cdssgLdAgyL+2NAhb36PL7xEthPMZvUWaPDswjro4Iir
 wAhIYmqcOXD/D8U8DcvkATkcaN9adVpmkznp+aqVE423XQy62k+J7+2d8uWbjBig
 FfdiYTxnL5RZgdSl/1JknHCxI1eEIhqiR1R0bqj50+aHR/QI4lZ7SsHQVV4y1gJL
 b96igefbzLBYKp9UN4fNHsjADvtZS5vCzjm2ep/aESP7gWB/v/UmNmMHe3y7nNnt
 mxd++0O4N6WFEf7GQljbfOtnZZGqmONw3QJV01EHqcVvn65mUkzbGq0CX9+GN17v
 sk4ThqSjHpfyla6Ih+6E9efdWOMTH/Kg+fb9ZXkcwxmde0Wl/dfQCw7iTZTGHifv
 /rmGHHvrM2uNLgWt6eE/MJ2Jb0Aq78eOAtt2zGN+tSJTThOBK20vNAK79CFIhrfj
 lKcjOb0hM+xJAt7Y9MpT
 =O9mS
 -----END PGP SIGNATURE-----

Merge tag 'nfs-for-4.20-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client bugfixes from Trond Myklebust:
 "Highlights include:

  Bugfix:
   - Fix build issues on architectures that don't provide 64-bit cmpxchg

  Cleanups:
   - Fix a spelling mistake"

* tag 'nfs-for-4.20-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  NFS: fix spelling mistake, EACCESS -> EACCES
  SUNRPC: Use atomic(64)_t for seq_send(64)
2018-11-04 08:20:09 -08:00
Linus Torvalds 35e7452442 Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull more timer updates from Thomas Gleixner:
 "A set of commits for the new C-SKY architecture timers"

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  dt-bindings: timer: gx6605s SOC timer
  clocksource/drivers/c-sky: Add gx6605s SOC system timer
  dt-bindings: timer: C-SKY Multi-processor timer
  clocksource/drivers/c-sky: Add C-SKY SMP timer
2018-11-04 08:15:15 -08:00
Linus Torvalds 601a88077c Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
 "A number of fixes and some late updates:

   - make in_compat_syscall() behavior on x86-32 similar to other
     platforms, this touches a number of generic files but is not
     intended to impact non-x86 platforms.

   - objtool fixes

   - PAT preemption fix

   - paravirt fixes/cleanups

   - cpufeatures updates for new instructions

   - earlyprintk quirk

   - make microcode version in sysfs world-readable (it is already
     world-readable in procfs)

   - minor cleanups and fixes"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  compat: Cleanup in_compat_syscall() callers
  x86/compat: Adjust in_compat_syscall() to generic code under !COMPAT
  objtool: Support GCC 9 cold subfunction naming scheme
  x86/numa_emulation: Fix uniform-split numa emulation
  x86/paravirt: Remove unused _paravirt_ident_32
  x86/mm/pat: Disable preemption around __flush_tlb_all()
  x86/paravirt: Remove GPL from pv_ops export
  x86/traps: Use format string with panic() call
  x86: Clean up 'sizeof x' => 'sizeof(x)'
  x86/cpufeatures: Enumerate MOVDIR64B instruction
  x86/cpufeatures: Enumerate MOVDIRI instruction
  x86/earlyprintk: Add a force option for pciserial device
  objtool: Support per-function rodata sections
  x86/microcode: Make revision and processor flags world-readable
2018-11-03 18:25:17 -07:00
Ingo Molnar 23a12ddee1 Merge branch 'core/urgent' into x86/urgent, to pick up objtool fix
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-11-03 23:42:16 +01:00
Eric Dumazet 3e59020abf net: bql: add __netdev_tx_sent_queue()
When qdisc_run() tries to use BQL budget to bulk-dequeue a batch
of packets, GSO can later transform this list in another list
of skbs, and each skb is sent to device ndo_start_xmit(),
one at a time, with skb->xmit_more being set to one but
for last skb.

Problem is that very often, BQL limit is hit in the middle of
the packet train, forcing dev_hard_start_xmit() to stop the
bulk send and requeue the end of the list.

BQL role is to avoid head of line blocking, making sure
a qdisc can deliver high priority packets before low priority ones.

But there is no way requeued packets can be bypassed by fresh
packets in the qdisc.

Aborting the bulk send increases TX softirqs, and hot cache
lines (after skb_segment()) are wasted.

Note that for TSO packets, we never split a packet in the middle
because of BQL limit being hit.

Drivers should be able to update BQL counters without
flipping/caring about BQL status, if the current skb
has xmit_more set.

Upper layers are ultimately responsible to stop sending another
packet train when BQL limit is hit.

Code template in a driver might look like the following :

	send_doorbell = __netdev_tx_sent_queue(tx_queue, nr_bytes, skb->xmit_more);

Note that __netdev_tx_sent_queue() use is not mandatory,
since following patch will change dev_hard_start_xmit()
to not care about BQL status.

But it is highly recommended so that xmit_more full benefits
can be reached (less doorbells sent, and less atomic operations as well)

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-03 15:40:01 -07:00
Michal Hocko 89c83fb539 mm, thp: consolidate THP gfp handling into alloc_hugepage_direct_gfpmask
THP allocation mode is quite complex and it depends on the defrag mode.
This complexity is hidden in alloc_hugepage_direct_gfpmask from a large
part currently. The NUMA special casing (namely __GFP_THISNODE) is
however independent and placed in alloc_pages_vma currently. This both
adds an unnecessary branch to all vma based page allocation requests and
it makes the code more complex unnecessarily as well. Not to mention
that e.g. shmem THP used to do the node reclaiming unconditionally
regardless of the defrag mode until recently. This was not only
unexpected behavior but it was also hardly a good default behavior and I
strongly suspect it was just a side effect of the code sharing more than
a deliberate decision which suggests that such a layering is wrong.

Get rid of the thp special casing from alloc_pages_vma and move the
logic to alloc_hugepage_direct_gfpmask. __GFP_THISNODE is applied to the
resulting gfp mask only when the direct reclaim is not requested and
when there is no explicit numa binding to preserve the current logic.

Please note that there's also a slight difference wrt MPOL_BIND now. The
previous code would avoid using __GFP_THISNODE if the local node was
outside of policy_nodemask(). After this patch __GFP_THISNODE is avoided
for all MPOL_BIND policies. So there's a difference that if local node
is actually allowed by the bind policy's nodemask, previously
__GFP_THISNODE would be added, but now it won't be. From the behavior
POV this is still correct because the policy nodemask is used.

Link: http://lkml.kernel.org/r/20180925120326.24392-3-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Stefan Priebe - Profihost AG <s.priebe@profihost.ag>
Cc: Zi Yan <zi.yan@cs.rutgers.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-11-03 10:09:37 -07:00
Sam Protsenko 94e297c50b include/linux/notifier.h: SRCU: fix ctags
ctags indexing ("make tags" command) throws this warning:

    ctags: Warning: include/linux/notifier.h:125:
    null expansion of name pattern "\1"

This is the result of DEFINE_PER_CPU() macro expansion.  Fix that by
getting rid of line break.

Similar fix was already done in commit 25528213fe ("tags: Fix
DEFINE_PER_CPU expansions"), but this one probably wasn't noticed.

Link: http://lkml.kernel.org/r/20181030202808.28027-1-semen.protsenko@linaro.org
Fixes: 9c80172b90 ("kernel/SRCU: provide a static initializer")
Signed-off-by: Sam Protsenko <semen.protsenko@linaro.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-11-03 10:09:37 -07:00