Commit graph

201 commits

Author SHA1 Message Date
Nick Piggin f5b3db0017 [PATCH] as: cooperating processes
Introduce the notion of cooperating processes (those that submit requests
close to one another), and use these statistics to make better choices about
whether or not to do anticipatory waiting.

Help and analysis from Seetharami Seelam <seelam@cs.utep.edu>

Performance testing from Seelam:

I set up my system and executed a couple of tests that I used for OLS.  I
tested with AS, cooperative process patch merged in -mm tree (which I called
Nick, below) and the cooperative patch with modifications to as_update_iohist
(which I called Seelam).

I used a dual-processor (2.28GHz Pentium 4 Xeon) system, with 1 GB main memory
and 1 MB L2 cache, running Linux 2.6.9.  Only a single processor is used for
the experiments.  I used 7.2K RPM Maxtor 10GB drive configured with ext2 file
system.

Experiment 1 (ex1) consists of reading  one Linux source trees using

  find . -type f -exec cat '{}' ';' > /dev/null.

Experiment 2 (ex2) consists of reading two disjoint Linux source trees
using

  find . -type f -exec cat '{}' ';' > /dev/null.

Experiment 3 (ex3) consists of streaming read of a 2GB file in the background
and 1 instance of the chunk reads in Experiment 1.

Timings for reading the Linux source are shown below:

             AS                     Nick          Seelam
ex1:      0m25.813s               0m27.859s      0m27.640s
ex2:      1m11.468s               1m13.918s      1m5.869s
ex3:      81m44.352s             10m38.572s      6m47.994s

The difference between the numbers in Experiment 3 must be due to the code in
as_update_iohist.  (akpm: that's not part of this patch.  So this patch is
"Nick").

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:53:43 -08:00
Linus Torvalds 602d4a7e2f Merge master.kernel.org:/pub/scm/linux/kernel/git/paulus/powerpc-merge 2005-11-04 16:27:50 -08:00
Linus Torvalds ec1890c5df Merge git://brick.kernel.dk/data/git/linux-2.6-block 2005-11-02 08:06:02 -08:00
Stephen Rothwell 2be7a90675 Merge Paulus' tree 2005-11-02 18:15:43 +11:00
Tejun Heo ca23509fba [PATCH] blk: fix dangling pointer access in __elv_add_request
cfq's add_req_fn callback may invoke q->request_fn directly and
depending on low-level driver used and timing, a queued request may be
finished & deallocated before add_req_fn callback returns.  So,
__elv_add_request must not access rq after it's passed to add_req_fn
callback.

This patch moves rq_mergeable test above add_req_fn().  This may
result in q->last_merge pointing to REQ_NOMERGE request if add_req_fn
callback sets it but as RQ_NOMERGE is checked again when blk layer
actually tries to merge requests, this does not cause any problem.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-01 21:58:06 -08:00
Kelly Daly b420677870 merge filename and modify references to iseries/vio.h
Signed-off-by: Kelly Daly <kelly@au.ibm.com>
2005-11-02 15:13:57 +11:00
Kelly Daly 1ec65d76f3 merge filename and modify references to iseries/hv_types.h
Signed-off-by: Kelly Daly <kelly@au.ibm.com>
2005-11-02 13:46:07 +11:00
Kelly Daly e45423eac2 merge filename and modify references to iseries/hv_lp_event.h
Signed-off-by: Kelly Daly <kelly@au.ibm.com>
2005-11-02 12:08:31 +11:00
Kelly Daly 15b1718948 merge filename and modify reference to iseries/hv_lp_config.h
Signed-off-by: Kelly Daly <kelly@au.ibm.com>
2005-11-02 11:55:28 +11:00
Jens Axboe 496456c24f [BLOCK] aoe: update for combined io statistics
Signed-off-by: Jens Axboe <axboe@suse.de>
2005-11-01 09:54:23 +01:00
Jens Axboe a362357b6c [BLOCK] Unify the seperate read/write io stat fields into arrays
Instead of having ->read_sectors and ->write_sectors, combine the two
into ->sectors[2] and similar for the other fields. This saves a branch
several places in the io path, since we don't have to care for what the
actual io direction is. On my x86-64 box, that's 200 bytes less text in
just the core (not counting the various drivers).

Signed-off-by: Jens Axboe <axboe@suse.de>
2005-11-01 09:26:16 +01:00
Jens Axboe d72d904a53 [BLOCK] Update read/write block io statistics at completion time
Right now we do it at queueing time, which works alright for reads
(since they are usually sync), but not for async writes since we can
queue io a lot faster than we can complete it. This makes the vmstat
output look extremely bursty.

Signed-off-by: Jens Axboe <axboe@suse.de>
2005-11-01 08:35:42 +01:00
Jens Axboe 581c1b1439 [PATCH] noop-iosched: avoid corrupted request merging
Tejun Heo notes:

   "I'm currently debugging this.  The problem is that we are using the
    generic dispatch queue directly in the noop sched and merging is NOT
    allowed on dispatch queues but generic handling of last_merge tries
    to merge requests.  I'm still trying to verify this, so I'll be back
    with results soon."

In the meantime, disable merging for noop by setting REQ_NOMERGE in
elevator_noop_add_request().

Eventually, we should add a noop_list and do the dispatching like in the
other io schedulers.  Merging is still beneficial for noop (and it has
always done it).

Signed-off-by: Jens Axboe <axboe@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-31 07:46:28 -08:00
Jens Axboe 4fc207419d [PATCH] Fix on-the-fly switch from cfq i/o scheduler
Don't clear ->elevator_data on exit, if we are switching queues we are
overwriting the data of the new io scheduler.

Signed-off-by: Jens Axboe <axboe@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-31 07:41:28 -08:00
Linus Torvalds 4fd5f8267d Merge master.kernel.org:/home/rmk/linux-2.6-drvmodel
Manual #include fixups for clashes - there may be some unnecessary
2005-10-31 07:32:56 -08:00
Paul Mackerras 23fd07750a Merge ../linux-2.6 by hand 2005-10-31 13:37:12 +11:00
Tim Schmielau 4e57b68178 [PATCH] fix missing includes
I recently picked up my older work to remove unnecessary #includes of
sched.h, starting from a patch by Dave Jones to not include sched.h
from module.h. This reduces the number of indirect includes of sched.h
by ~300. Another ~400 pointless direct includes can be removed after
this disentangling (patch to follow later).
However, quite a few indirect includes need to be fixed up for this.

In order to feed the patches through -mm with as little disturbance as
possible, I've split out the fixes I accumulated up to now (complete for
i386 and x86_64, more archs to follow later) and post them before the real
patch.  This way this large part of the patch is kept simple with only
adding #includes, and all hunks are independent of each other.  So if any
hunk rejects or gets in the way of other patches, just drop it.  My scripts
will pick it up again in the next round.

Signed-off-by: Tim Schmielau <tim@physik3.uni-rostock.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:32 -08:00
Nate Diller 2ca7d93bb2 [PATCH] block cleanups: Fix iosched module refcount leak
If the requested I/O scheduler is already in place, elevator_switch simply
leaves the queue alone, and returns.  However, it forgets to call
elevator_put, so

'echo [current_sched] > /sys/block/[dev]/queue/scheduler'

will leak a reference, causing the current_sched module to be permanently
pinned in memory.

Signed-off-by: Nate Diller <nate@namesys.com>
Acked-by: Jens Axboe <axboe@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:20 -08:00
Nate Diller 131dda7f89 [PATCH] block cleanups: Add kconfig default iosched submenu
Add a kconfig submenu to select the default I/O scheduler, in case
anticipatory is not compiled in or another default is preferred.  Also,
since no-op is always available, we should use it whenever the selected
default is not.

Signed-off-by: Nate Diller <nate@namesys.com>
Acked-by: Jens Axboe <axboe@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:20 -08:00
Norbert Kiesel 3ccc7f293f [PATCH] delete 2 unreachable statements in drivers/block/paride/pf.c
The last patch from Jens Axboe for drivers/block/paride/pf.c introduced
pf_end_request() which sets pf_req to NULL.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:18 -08:00
Christoph Hellwig 83521d3eb8 [PATCH] cfq-iosched: move tasklist walk to elevator.c
We're trying to get rid of as much as possible tasklist walks, or at
least moving them to core code.  This patch falls into the second
category.

Instead of walking the tasklist in cfq-iosched move that into
elv_unregister.  The added benefit is that with this change the as
ioscheduler might be might unloadable more easily aswell.

The new code uses read_lock instead of read_lock_irq because the
tasklist_lock only needs irq disabling for writers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jens Axboe <axboe@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:17 -08:00
Russell King d052d1beff Create platform_device.h to contain all the platform device details.
Convert everyone who uses platform_bus_type to include
linux/platform_device.h.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-10-29 19:07:23 +01:00
Tejun Heo 47e627ce83 [PATCH] blk: fix merge bug in as-iosched
as-iosched deals with aliased requests differently from other ioscheds.

It links together aliased requests using rq->queuelist instead of
spilling alises to dispatch queue like other ioscheds do.  Requests
linked in this way cannot be merged.

Unfortunately, generic q->last_merge handling patch didn't take this
into account and q->last_merge could be set to an aliased request
resulting in Badness, corrupt list and eventually panic.

This explicitly marks aliased requests to be unmergeable.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 10:28:13 -07:00
Pete Zaitcev 38ffdd62b0 [PATCH] ub: suppress gcc warnings for pointer casts
When building on a 64-bit platform, gcc produces a warning
"cast of a pointer to an integer of a different size".
The scatterlist.offset on the LHS is unsigned int, so I used
that originally.

Signed-off-by: Pete Zaitcev <zaitcev@yahoo.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

 drivers/block/ub.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
2005-10-28 16:47:38 -07:00
Greg KH 6fbfddcb52 Merge ../bleed-2.6 2005-10-28 10:13:16 -07:00
Greg Kroah-Hartman 53f4654272 [PATCH] Driver Core: fix up all callers of class_device_create()
The previous patch adding the ability to nest struct class_device
changed the paramaters to the call class_device_create().  This patch
fixes up all in-kernel users of the function.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-10-28 09:52:52 -07:00
Kay Sievers a7fd67062e [PATCH] add sysfs attr to re-emit device hotplug event
A "coldplug + udevstart" can be simple like this:
  for i in /sys/block/*/*/uevent; do echo 1 > $i; done
  for i in /sys/class/*/*/uevent; do echo 1 > $i; done
  for i in /sys/bus/*/devices/*/uevent; do echo 1 > $i; done

Signed-off-by: Kay Sievers <kay.sievers@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-10-28 09:52:51 -07:00
Ed L. Cashin 3dc7c55563 [PATCH] aoe: update to version 14
Signed-off-by: "Ed L. Cashin" <ecashin@coraid.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

Update driver version number to 14.
2005-10-28 09:52:50 -07:00
Ed L. Cashin 475172fb18 [PATCH] aoe: use get_unaligned for accesses in ATA id buffer
Signed-off-by: "Ed L. Cashin" <ecashin@coraid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

Use get_unaligned for possibly-unaligned multi-byte accesses to the
ATA device identify response buffer.
2005-10-28 09:52:49 -07:00
Linus Torvalds 9be16a0392 Merge branch 'sx8' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/misc-2.6 2005-10-28 09:16:58 -07:00
Linus Torvalds 5dd962494f Merge branch 'elevator-switch' of git://brick.kernel.dk/data/git/linux-2.6-block
Manual fixup for trivial "gfp_t" changes.
2005-10-28 08:56:34 -07:00
Linus Torvalds 28d721e24c Merge branch 'generic-dispatch' of git://brick.kernel.dk/data/git/linux-2.6-block 2005-10-28 08:53:49 -07:00
Linus Torvalds 0ee40c6628 Merge branch 'for-linus' of git://brick.kernel.dk/data/git/linux-2.6-block 2005-10-28 08:53:00 -07:00
Al Viro b4e3ca1ab1 [PATCH] gfp_t: remaining bits of drivers/*
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-28 08:16:51 -07:00
Al Viro 8267e268e0 [PATCH] gfp_t: block layer core
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-28 08:16:47 -07:00
Jens Axboe 772eca7825 [BLOCK] Leftover reference to ->max_back_kb
Signed-off-by: Jens Axboe <axboe@suse.de>
2005-10-28 09:01:17 +02:00
Jens Axboe 64521d1a3b [BLOCK] elevator switch fixes/cleanup
- 100msec sleep is a little excessive, lots of requests can complete
  in that timeframe. Use 10msec instead.
- Rename QUEUE_FLAG_BYPASS to QUEUE_FLAG_ELVSWITCH to indicate what
  is going on.

Signed-off-by: Jens Axboe <axboe@suse.de>
2005-10-28 08:48:23 +02:00
Tejun Heo cb98fc8bb9 [BLOCK] Reimplement elevator switch
This patch reimplements elevator switch.  This patch assumes generic
dispatch queue patchset is applied.

 * Each request is tagged with REQ_ELVPRIV flag if it has its elevator
   private data set.
 * Requests which doesn't have REQ_ELVPRIV flag set never enter
   iosched.  They are always directly back inserted to dispatch queue.
   Of course, elevator_put_req_fn is called only for requests which
   have its REQ_ELVPRIV set.
 * Request queue maintains the current number of requests which have
   its elevator data set (elevator_set_req_fn called) in
   q->rq->elvpriv.
 * If a request queue has QUEUE_FLAG_BYPASS set, elevator private data
   is not allocated for new requests.

 To switch to another iosched, we set QUEUE_FLAG_BYPASS and wait until
elvpriv goes to zero; then, we attach the new iosched and clears
QUEUE_FLAG_BYPASS.  New implementation is much simpler and main code
paths are less cluttered, IMHO.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jens Axboe <axboe@suse.de>
2005-10-28 08:48:12 +02:00
Tejun Heo cb19833dcc [BLOCK] kill generic max_back_kb handling
This patch kills max_back_kb handling from elv_dispatch_sort() and
kills max_back_kb field from struct request_queue.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jens Axboe <axboe@suse.de>
2005-10-28 08:46:01 +02:00
Tejun Heo 98b11471d7 [PATCH] 04/05 remove last_merge handling from ioscheds
Remove last_merge handling from all ioscheds.  This patch
removes merging capability of noop iosched.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2005-10-28 08:45:35 +02:00
Tejun Heo 06b86245c0 [PATCH] 03/05 move last_merge handlin into generic elevator code
Currently, both generic elevator code and specific ioscheds
participate in the management and usage of last_merge.  This
and the following patches move last_merge handling into
generic elevator code.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jens Axboe <axboe@suse.de>
2005-10-28 08:45:20 +02:00
Jens Axboe b4878f245e [PATCH] 02/05: update ioscheds to use generic dispatch queue
This patch updates all four ioscheds to use generic dispatch
queue.  There's one behavior change in as-iosched.

* In as-iosched, when force dispatching
  (ELEVATOR_INSERT_BACK), batch_data_dir is reset to REQ_SYNC
  and changed_batch and new_batch are cleared to zero.  This
  prevernts AS from doing incorrect update_write_batch after
  the forced dispatched requests are finished.

* In cfq-iosched, cfqd->rq_in_driver currently counts the
  number of activated (removed) requests to determine
  whether queue-kicking is needed and cfq_max_depth has been
  reached.  With generic dispatch queue, I think counting
  the number of dispatched requests would be more appropriate.

* cfq_max_depth can be lowered to 1 again.

Original from Tejun Heo, modified version applied.

Signed-off-by: Jens Axboe <axboe@suse.de>
2005-10-28 08:45:08 +02:00
Jens Axboe d9ebb192aa [PATCH] elevator: leftover function declaration
Signed-off-by: Jens Axboe <axboe@suse.de>
2005-10-28 08:44:52 +02:00
Jens Axboe 1b47f531e2 [PATCH] generic dispatch fixes
- Split elv_dispatch_insert() into two functions
- Rename rq_last_sector() to rq_end_sector()

Signed-off-by: Jens Axboe <axboe@suse.de>
2005-10-28 08:44:37 +02:00
Tejun Heo 8922e16cf6 [PATCH] 01/05 Implement generic dispatch queue
Implements generic dispatch queue which can replace all
dispatch queues implemented by each iosched.  This reduces
code duplication, eases enforcing semantics over dispatch
queue, and simplifies specific ioscheds.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jens Axboe <axboe@suse.de>
2005-10-28 08:44:24 +02:00
Tejun Heo 2824bc9328 [PATCH] fix try_module_get race in elevator_find
This patch removes try_module_get race in elevator_find.
try_module_get should always be called with the spinlock protecting
what the module init/cleanup routines register/unregister to held. In
the case of elevators, we should be holding elv_list to avoid it going
away between spin_unlock_irq and try_module_get.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jens Axboe <axboe@suse.de>
2005-10-28 08:15:58 +02:00
Chen, Kenneth W b2982649ce Following the same idea, it occurs to me that we should only update
disk stat when "now" is different from disk->stamp.  Otherwise, we
are again needlessly adding zero to the stats.

Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
Signed-off-by: Jens Axboe <axboe@suse.de>
2005-10-28 08:15:43 +02:00
Chen, Kenneth W 20e5c81fcf [patch] remove gendisk->stamp_idle field
struct gendisk has these two fields: stamp, stamp_idle.  Update to
stamp_idle is always in sync with stamp and they are always the same.
Therefore, it does not add any value in having two fields tracking
same timestamp.  Suggest to remove it.

Also, we should only update gendisk stats with non-zero value.
Advantage is that we don't have to needlessly calculate memory address,
and then add zero to the content.

Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
Signed-off-by: Jens Axboe <axboe@suse.de>
2005-10-28 08:15:30 +02:00
Stephen Rothwell 915124d811 powerpc: set the driver.owner field for all vio drivers
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
2005-10-24 16:59:13 +10:00
Stephen Rothwell 6fdf5392ca powerpc: don't duplicate name between vio_driver and device_driver
Just set the name field directly in the device_driver structure
contained in the vio_driver struct.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
2005-10-24 15:42:12 +10:00