1
0
Fork 0
Commit Graph

127 Commits (3bd3706251ee8ab67e69d9340ac2abdca217e733)

Author SHA1 Message Date
Bob Nelson aed3a8c9bb [POWERPC] Oprofile: Remove dependency on spufs module
This removes an OProfile dependency on the spufs module.  This
dependency was causing a problem for multiplatform systems that are
built with support for Oprofile on Cell but try to load the oprofile
module on a non-Cell system.

Signed-off-by: Bob Nelson <rrnelson@us.ibm.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Acked-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-12-28 15:07:52 +11:00
Aegis Lin 90608a2928 [POWERPC] spufs: Use separate timer for /proc/spu_loadavg calculation
The original spusched_timer was designed to take effect only when
a context is waiting in the runqueue.

This change adds an additional lower-freq timer has been added to
purely handle the spu_load updates. The new timer will be triggered
per LOAD_FREQ ticks.

Signed-off-by: Aegis Lin <aegislin@gmail.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-12-21 19:46:21 +11:00
Christoph Hellwig c9101bdb1b [POWERPC] spufs: make state_mutex interruptible
Make most places that use spu_acquire/spu_acquire_saved interruptible,
this allows getting out of the spufs code when e.g. pressing ctrl+c.
There are a few places where we get called e.g. from spufs teardown
routines were we can't simply err out so these are left with a comment.
For now I've also not touched the poll routines because it's open what
libspe would expect in terms of interrupted system calls.

Acked-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-12-21 19:46:21 +11:00
Luke Browning e65c2f6fce [POWERPC] spufs: decouple spu scheduler from spufs_spu_run (asynchronous scheduling)
Change spufs_spu_run so that the context is queued directly to the
scheduler and the controlling thread advances directly to spufs_wait()
for spe errors and exceptions.

nosched contexts are treated the same as before.

Fixes from Christoph Hellwig <hch@lst.de>

Signed-off-by: Luke Browning <lukebr@linux.vnet.ibm.com>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-12-21 19:46:21 +11:00
Luke Browning b192541b39 [POWERPC] spufs: spu_find_victim may choose wrong victim
Need to re-check priority after dropping lock.  Otherwise, a
more favored context may be preempted.

Signed-off-by: Luke Browning <lukebr@linux.vnet.ibm.com>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-12-21 19:46:20 +11:00
Luke Browning 91569531d1 [POWERPC] spufs: reorganize spu_run_init
This cleans up spu_run_init so that it does all of the spu
initialization for spufs_run_spu.  It initializes the spu context as
much as possible before it activates the spu and writes the runcntl
register.

Signed-off-by: Luke Browning <lukebr@linux.vnet.ibm.com>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-12-21 19:46:20 +11:00
Jeremy Kerr d6ad39bc53 [POWERPC] spufs: rework class 0 and 1 interrupt handling
Based on original patches from
 Arnd Bergmann <arnd.bergman@de.ibm.com>; and
 Luke Browning <lukebr@linux.vnet.ibm.com>

Currently, spu contexts need to be loaded to the SPU in order to take
class 0 and class 1 exceptions.

This change makes the actual interrupt-handlers much simpler (ie,
set the exception information in the context save area), and defers the
handling code to the spufs_handle_class[01] functions, called from
spufs_run_spu.

This should improve the concurrency of the spu scheduling leading to
greater SPU utilization when SPUs are overcommited.

Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-12-21 19:46:20 +11:00
Arnd Bergmann 33bfd7a738 [POWERPC] spufs: block fault handlers in spu_acquire_runnable
This change disables the logic that faults-in spu contexts under the
covers from the page fault handler.  When a fault requires a runnable
context, the handler will block until the context is scheduled by
other means.

Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-12-21 19:46:19 +11:00
Jeremy Kerr 7cd58e4381 [POWERPC] spufs: move fault, lscsa_alloc and switch code to spufs module
Currently, part of the spufs code (switch.o, lscsa_alloc.o and fault.o)
is compiled directly into the kernel.

This change moves these components of spufs into the kernel.

The lscsa and switch objects are fairly straightforward to move in.

For the fault.o module, we split the fault-handling code into two
parts: a/p/p/c/spu_fault.c and a/p/p/c/spufs/fault.c. The former is for
the in-kernel spu_handle_mm_fault function, and we move the rest of the
fault-handling code into spufs.

Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-12-21 19:46:19 +11:00
Julio M. Merino Vidal 9b1d21f858 [POWERPC] spufs: fix typos in sched.c comments
Fix a few typos in the spufs scheduler comments

Signed-off-by: Julio M. Merino Vidal <jmerino@ac.upc.edu>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-12-21 19:46:18 +11:00
Paul Mackerras 0ce49a3945 Merge branch 'linux-2.6' 2007-09-20 10:09:27 +10:00
Christoph Hellwig c0e7b4aa1c [POWERPC] spusched: Fix null pointer dereference in find_victim
find_victim can dereference a NULL pointer when iterating over the list
of victim spus because list_mutex only guarantees spu->ct to be stable,
but of course not to be non-NULL.

Also fix find_victim to not call spu_unbind_context without list_mutex
because that violates the above guarantee.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-09-19 15:26:29 +10:00
Andre Detsch 36ddbb1380 [POWERPC] spufs: Fix race condition on gang->aff_ref_spu
Affinity reference point location (gang->aff_ref_spu) is reset
when the whole gang is descheduled. However, the last member of
a gang can be descheduled while we are trying to schedule another
member of the gang. This was leading to a race condition, and
the code was using gang->aff_ref_spu in an unsafe manner.

By holding the gang->aff_mutex a little bit longer, and increment
gang->aff_sched_count (which controls when gang->aff_ref_spu
should be reset) a little bit earlier, the problem is fixed.

Signed-off-by: Andre Detsch <adetsch@br.ibm.com>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-09-19 15:12:16 +10:00
Andre Detsch 683e3ab2b5 [POWERPC] spufs: Fix affinity after introduction of node_allowed() calls
This patch fixes affinity reference point placement, which was not being
done in some situations, after the introduction of node_allowed() calls.

The previously used parameter, 'ctx', is just the iterator of the
previous list_for_each_entry_reverse loop, and its value might be
invalid at the end of the loop. Also, the right context to seek
for information when defining the reference ctx location
_is_ the reference ctx.

Signed-off-by: Andre Detsch <adetsch@br.ibm.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-08-03 19:18:10 +10:00
Masato Noguchi 6f6a6dc0c8 [POWERPC] spufs: Fix incorrect initialization of cbe_spu_info.spus
We currently initialize cbe_spu_info[].spus in both init_spu_base and
spu_sched_init. The initialise in spu_sched_init clears the SPU list,
so we end up with no physical SPUs. Because of this, the spu_run
syscall will block forever.

This change removes the unnecessary initialization in spu_sched_init.

Signed-off-by: Masato Noguchi <Masato.Noguchi@jp.sony.com>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-07-26 16:17:55 +10:00
Christoph Hellwig 486acd4850 [CELL] spufs: rework list management and associated locking
This sorts out the various lists and related locks in the spu code.

In detail:

 - the per-node free_spus and active_list are gone.  Instead struct spu
   gained an alloc_state member telling whether the spu is free or not
 - the per-node spus array is now locked by a per-node mutex, which
   takes over from the global spu_lock and the per-node active_mutex
 - the spu_alloc* and spu_free function are gone as the state change is
   now done inline in the spufs code.  This allows some more sharing of
   code for the affinity vs normal case and more efficient locking
 - some little refactoring in the affinity code for this locking scheme

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
2007-07-20 21:42:28 +02:00
Bob Nelson 1474855d08 [CELL] oprofile: add support to OProfile for profiling CELL BE SPUs
From: Maynard Johnson <mpjohn@us.ibm.com>

This patch updates the existing arch/powerpc/oprofile/op_model_cell.c
to add in the SPU profiling capabilities.  In addition, a 'cell' subdirectory
was added to arch/powerpc/oprofile to hold Cell-specific SPU profiling code.
Exports spu_set_profile_private_kref and spu_get_profile_private_kref which
are used by OProfile to store private profile information in spufs data
structures.

Also incorporated several fixes from other patches (rrn).  Check pointer
returned from kzalloc.  Eliminated unnecessary cast.  Better error
handling and cleanup in the related area.  64-bit unsigned long parameter
was being demoted to 32-bit unsigned int and eventually promoted back to
unsigned long.

Signed-off-by: Carl Love <carll@us.ibm.com>
Signed-off-by: Maynard Johnson <mpjohn@us.ibm.com>
Signed-off-by: Bob Nelson <rrnelson@us.ibm.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Acked-by: Paul Mackerras <paulus@samba.org>
2007-07-20 21:42:24 +02:00
Bob Nelson 36aaccc1e9 [CELL] oprofile: enable SPU switch notification to detect currently active SPU tasks
From: Maynard Johnson <mpjohn@us.ibm.com>

This patch adds to the capability of spu_switch_event_register so that
the caller is also notified of currently active SPU tasks.
Exports spu_switch_event_register and spu_switch_event_unregister so
that OProfile can get access to the notifications provided.

Signed-off-by: Maynard Johnson <mpjohn@us.ibm.com>
Signed-off-by: Carl Love <carll@us.ibm.com>
Signed-off-by: Bob Nelson <rrnelson@us.ibm.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Acked-by: Paul Mackerras <paulus@samba.org>
2007-07-20 21:42:20 +02:00
Arnd Bergmann cbc23d3e7c [CELL] spufs: integration of SPE affinity with the scheduller
This patch makes the scheduller honor affinity information for each
context being scheduled. If the context has no affinity information,
behaviour is unchanged. If there are affinity information, context is
schedulled to be run on the exact spu recommended by the affinity
placement algorithm.

Signed-off-by: Andre Detsch <adetsch@br.ibm.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
2007-07-20 21:42:18 +02:00
Arnd Bergmann c5fc8d2a92 [CELL] cell: add placement computation for scheduling of affinity contexts
This patch provides the spu affinity placement logic for the spufs scheduler.
Each time a gang is going to be scheduled, the placement of a reference
context is defined. The placement of all other contexts with affinity from
the gang is defined based on this reference context location and on a
precomputed displacement offset.

Signed-off-by: Andre Detsch <adetsch@br.ibm.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
2007-07-20 21:42:17 +02:00
Arnd Bergmann aa6d5b2025 [CELL] cell: add per BE structure with info about its SPUs
Addition of a spufs-global "cbe_info" array. Each entry contains information
about one Cell/B.E. node, namelly:
* list of spus (both free and busy spus are in this list);
* list of free spus (replacing the static spu_list from spu_base.c)
* number of spus;
* number of reserved (non scheduleable) spus.

SPE affinity implementation actually requires only access to one spu per
BE node (since it implements its own pointer to walk through the other spus
of the ring) and the number of scheduleable spus (n_spus - non_sched_spus)
However having this more general structure can be useful for other
functionalities, concentrating per-cbe statistics / data.

Signed-off-by: Andre Detsch <adetsch@br.ibm.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
2007-07-20 21:42:11 +02:00
Masato Noguchi 7e90b74967 [CELL] spufs: use find_first_bit() instead of sched_find_first_bit()
spu_sched->bitmap has MAX_PRIO(=140) width in bits.However, since
ff80a77f20, sched_find_first_bit()
only supports 100-bit bitmaps.

Thus, spu_sched->bitmap should be treated by generic find_first_bit().

Signed-off-by: Masato Noguchi <Masato.Noguchi@jp.sony.com>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
2007-07-20 21:42:07 +02:00
Andre Detsch 27ec41d3a1 [CELL] spufs: add spu stats in sysfs and ctx stat file in spufs
This patch exports per-context statistics in spufs as long as spu
statistics in sysfs.

It was formed by merging:
"spufs: add spu stats in sysfs"   From: Christoph Hellwig
"spufs: add stat file to spufs"   From: Christoph Hellwig
"spufs: fix libassist accounting" From: Jeremy Kerr
"spusched: fix spu utilization statistics" From: Luke Browning
And some adjustments by myself, after suggestions on cbe-oss-dev.

Having separate patches was making the review process harder
than it should, as we end up integrating spus and ctx statistics
accounting much more than it was on the first implementation.

Signed-off-by: Andre Detsch <adetsch@br.ibm.com>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
2007-07-20 21:41:50 +02:00
Jeremy Kerr e840cfe681 [CELL] spufs: Remove spurious WARN_ON for spu_deactivate for NOSCHED contexts
In 6cbf93960e64f313f6e247cbca7afaa50e3ee2c we added a WARN_ON for
calling spu_deactivate on contexts created with the SPU_CREATE_NOSCHED
flag. However, all NOSCHED contexts will need to be deactivated when
the context is destroyed, so this gives a spurious warning when any
NOSCHED context is closed.

This change removes the WARN_ON.

Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
2007-07-20 21:41:49 +02:00
Sebastian Siewior d145031755 [CELL] spufs: remove section mismatch warning
WARNING: arch/powerpc/platforms/cell/spufs/spufs.o(.init.text+0x158): Section
mismatch: reference to .exit.text:.spu_sched_exit (between '.init_module' and
'.spu_sched_init')

was introduced by c99c1994a2
This patch removes the warning.

Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sebastian Siewior <sebastian@breakpoint.cc>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
2007-07-20 21:41:46 +02:00
Christoph Hellwig fe2f896d67 [POWERPC] spufs: Add spu stats in sysfs
Export spu statistics in sysfs.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-07-03 15:24:46 +10:00
Christoph Hellwig 27449971e6 [POWERPC] spusched: Fix runqueue corruption
spu_activate can be called from multiple threads at the same time on
behalf of the same spu context.  We need to make sure to only add it
once to avoid runqueue corruption.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-07-03 15:24:46 +10:00
Christoph Hellwig c77239b8be [POWERPC] spusched: Disable tick when not needed
Only enable the scheduler tick if we have any context waiting to be
scheduled.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-07-03 15:24:46 +10:00
Christoph Hellwig e9f8a0b65a [POWERPC] spufs: Add stat file to spufs
Export per-context statistics in spufs.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-07-03 15:24:46 +10:00
Christoph Hellwig 65de66f0b8 [POWERPC] spufs: Implement /proc/spu_loadavg
Provide load average information for spu context.  The format
is identical to /proc/loadavg, which is also where a lot of code
and concepts is borrowed from.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-07-03 15:24:46 +10:00
Christoph Hellwig 476273adc7 [POWERPC] spufs: Add tid file
The new tid file contains the ID of the thread currently running the
context, if any.  This is used so that the new spu-top and spu-ps
tools can find the thread in /proc.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-07-03 15:24:45 +10:00
Jeremy Kerr 7022543ee4 [POWERPC] spufs: Trivial whitespace fixes
Remove redundant whitespace in arch/powerpc/platforms/cell/spufs/

Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-07-03 15:24:45 +10:00
Christoph Hellwig df09cf3e2c [POWERPC] spusched: No preemption for nosched contexts
And last but not least we need to make sure the scheduler tick never
preempts a nosched context.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-07-03 15:24:45 +10:00
Christoph Hellwig 46cbf93960 [POWERPC] spusched: Catch nosched contexts in spu_deactivate
spu_deactivate should never be called for nosched contets.  Put in
a check so we can print a stacktrace and exit early in case it
happes erroneously.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-07-03 15:24:45 +10:00
Christoph Hellwig ea1ae5949d [POWERPC] spusched: fix cpu/node binding
Add a cpus_allowed allowed filed to struct spu_context so that we always
use the cpu mask of the owning thread instead of the one happening to
call into the scheduler.  Also use this information in
grab_runnable_context to avoid spurious wakeups.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-07-03 15:24:45 +10:00
Christoph Hellwig 2cf2b3b49f [POWERPC] spusched: Update scheduling paramters on every spu_run
Update scheduling information on every spu_run to allow for setting
threads to realtime priority just before running them.  This requires
some slightly ugly code in spufs_run_spu because we can just update
the information unlocked if the spu is not runnable, but we need to
acquire the active_mutex when it is runnable to protect against
find_victim.  This locking scheme requires opencoding
spu_acquire_runnable in spufs_run_spu which actually is a nice cleanup
all by itself.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-07-03 15:24:45 +10:00
Jeremy Kerr f3f59bec0c [POWERPC] spusched: Print out scheduling tunables with DEBUG
Print out a few scheduler tuning parameters when we've compiled
with DEBUG defined.

Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-07-03 15:24:45 +10:00
Jeremy Kerr 60e2423933 [POWERPC] spusched: Fix timeslice calculations
The current timeslice code mixes 'jiffies' up with 'spesched ticks'. This
change correctly defines the number of time slices each SPE contexts is
given, and clarifies the comment.

This brings the default timeslice for SPE contexts into a reasonable
range.

Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-07-03 15:24:44 +10:00
Christoph Hellwig fe443ef2ac [POWERPC] spusched: Dynamic timeslicing for SCHED_OTHER
Enable preemptive scheduling for non-RT contexts.

We use the same algorithms as the CPU scheduler to calculate the time
slice length, and for now we also use the same timeslice length as the
CPU scheduler. This might be not enough for good performance and can be
changed after some benchmarking.

Note that currently we do not boost the priority for contexts waiting
on the runqueue for a long time, so contexts with a higher nice value
could starve ones with less priority.  This could easily be fixed once
the rework of the spu lists that Luke and I discussed is done.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-07-03 15:24:44 +10:00
Christoph Hellwig 3790180220 [POWERPC] spusched: Switch from workqueues to kthread + timer tick
Get rid of the scheduler workqueues that complicated things a lot to
a dedicated spu scheduler thread that gets woken by a traditional
scheduler tick.  By default this scheduler tick runs a HZ * 10, aka
one spu scheduler tick for every 10 cpu ticks.

Currently the tick is not disabled when we have less context than
available spus, but I will implement this later.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-07-03 15:24:44 +10:00
Christoph Hellwig e5c0b9ec53 [POWERPC] spufs: Don't yield nosched context
Nosched context sould never be scheduled out, thus we must not
deactivate them in spu_yield ever.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-06-07 11:44:40 +10:00
Christoph Hellwig bb5db29aa0 [POWERPC] spufs scheduler: Fix wakeup races
Fix the race between checking for contexts on the runqueue and actually
waking them in spu_deactive and spu_yield.

The guts of spu_reschedule are split into a new helper called
grab_runnable_context which shows if there is a runnable thread below
a specified priority and if yes removes if from the runqueue and uses
it.  This function is used by the new __spu_deactivate hepler shared
by preemption and spu_yield to grab a new context before deactivating
a specified priority and if yes removes if from the runqueue and uses
it.  This function is used by the new __spu_deactivate hepler shared
by preemption and spu_yield to grab a new context before deactivating
the old one.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-06-07 11:44:39 +10:00
Randy Dunlap e63340ae6b header cleaning: don't include smp_lock.h when not used
Remove includes of <linux/smp_lock.h> where it is not used/needed.
Suggested by Al Viro.

Builds cleanly on x86_64, i386, alpha, ia64, powerpc, sparc,
sparc64, and arm (all 59 defconfigs).

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:15:07 -07:00
Luke Browning 4e0f4ed0df [POWERPC] spu sched: make addition to stop_wq and runque atomic vs wakeup
Addition to stop_wq needs to happen before adding to the runqeueue and
under the same lock so that we don't have a race window for a lost
wake up in the spu scheduler.

Signed-off-by: Luke Browning <lukebrowning@us.ibm.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
2007-04-23 21:18:55 +02:00
Christoph Hellwig a475c2f435 [POWERPC] spufs: remove woken threads from the runqueue early
A single context should only be woken once, and we should not have
more wakeups for a given priority than the number of contexts on
that runqueue position.

Also add some asserts to trap future problems in this area more
easily.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
2007-04-23 21:18:54 +02:00
Arnd Bergmann 390c534304 [POWERPC] spufs: add memory barriers after set_bit
set_bit does not guarantee ordering on powerpc, so using it
for communication between threads requires explicit
mb() calls.

Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
2007-04-23 21:18:54 +02:00
Christoph Hellwig e097b51328 [POWERPC] spu sched: ensure preempted threads are put back on the runqueue, part2
To not lose a spu thread we need to make sure it always gets put back
on the runqueue.  In find_victim aswell as in the scheduler tick as done
in the previous patch.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
2007-04-23 21:18:53 +02:00
Christoph Hellwig b3e76cc324 [POWERPC] spu sched: ensure preempted threads are put back on the runqueue
To not lose a spu thread we need to make sure it always gets put back
on the runqueue.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
2007-04-23 21:18:53 +02:00
Christoph Hellwig 0887309589 [POWERPC] spufs: use cancel_rearming_delayed_workqueue when stopping spu contexts
The scheduler workqueue may rearm itself and deadlock when we try to stop
it.  Put a flag in place to avoid skip the work if we're tearing down
the context.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
2007-04-23 21:18:52 +02:00
Christoph Hellwig dbf8eefa2b [POWERPC] spufs: don't yield CPU in spu_yield
There is no reason to yield the CPU in spu_yield - if the backing
thread reenters spu_run it gets added to the end of the runqueue for
it's priority.  So the yield is just a slowdown for the case where
we have higher priority contexts waiting.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-04-13 03:55:15 +10:00
Benjamin Herrenschmidt 94b2a4393c [POWERPC] Fix spu SLB invalidations
The SPU code doesn't properly invalidate SPUs SLBs when necessary,
for example when changing a segment size from the hugetlbfs code. In
addition, it saves and restores the SLB content on context switches
which makes it harder to properly handle those invalidations.

This patch removes the saving & restoring for now, something more
efficient might be found later on. It also adds a spu_flush_all_slbs(mm)
that can be used by the core mm code to flush the SLBs of all SPEs that
are running a given mm at the time of the flush.

In order to do that, it adds a spinlock to the list of all SPEs and move
some bits & pieces from spufs to spu_base.c

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2007-03-10 00:07:50 +01:00
Christoph Hellwig 50b520d4ef [POWERPC] avoid SPU_ACTIVATE_NOWAKE optimization
This optimization was added recently but is still buggy,
so back it out for now.

Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
2007-03-10 00:07:49 +01:00
Christoph Hellwig 2eb1b12049 [POWERPC] spu sched: static timeslicing for SCHED_RR contexts
For SCHED_RR tasks we can do some really trivial timeslicing.  Basically
we fire up a time for every scheduler tick that searches for a higher
or same priority thread that is on the runqueue and if there is one
context switches to it.  Because we can't lock spus from timer context
we actually run this from a delayed runqueue instead of a timer.

A nice optimization would be to skip the actual priority bitmap search
when there are less contexts than physical spus available.  To implement
this I need a so far unpublished patch from Andre, and it will be added
after we have that patch in.

Note that right now we only do the time slicing for SCHED_RR tasks.
The code would work for SCHED_OTHER tasks aswell, but their prio
value is defered from the one the PPU thread has at time of spu_run,
and using this for spu scheduling decisions would make the code very
unfair.  SCHED_OTHER support will be enabled once we the spu scheduler
knows how to calculcate cpu_context.prio (very soon)

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
2007-02-13 21:55:43 +01:00
Christoph Hellwig 72cb360839 [POWERPC] spu sched: use DECLARE_BITMAP
use DECLARE_BITMAP in the spu scheduler instead of reimplementing it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
2007-02-13 21:55:42 +01:00
Christoph Hellwig 52f04fcf66 [POWERPC] spu sched: forced preemption at execution
If we start a spu context with realtime priority we want it to run
immediately and not wait until some other lower priority thread has
finished.  Try to find a suitable victim and use it's spu in this
case.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
2007-02-13 21:55:42 +01:00
Christoph Hellwig ae7b4c5284 [POWERPC] spu sched: update some comments
Give spu_yield a kerneldoc comment and remove the old comment
documenting spu_activate, spu_deactive and spu_yield as all of them
now have descriptive kerneldoc comments of their own.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
2007-02-13 21:55:42 +01:00
Christoph Hellwig 678b2ff1e6 [POWERPC] spu sched: simplity spu_remove_from_active_list
If we call spu_remove_from_active_list that spu is always guaranteed
to be on the active list and in runnable state, so we can simply
do a list_del to remove it and unconditionally take the was_active
codepath.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
2007-02-13 21:55:41 +01:00
Christoph Hellwig 26bec67386 [POWERPC] spufs: optimize spu_run
There is no need to directly wake up contexts in spu_activate when
called from spu_run, so add a flag to surpress this wakeup.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
2007-02-13 21:55:41 +01:00
Christoph Hellwig 079cdb6161 [POWERPC] spufs: runqueue simplification
This is the biggest patch in this series, and it reworks the guts of
the spu scheduler runqueue mechanism:

 - instead of embedding a waitqueue in the runqueue there is now a
   simple doubly-linked list, the actual wakeups happen by reusing
   the stop_wq in the spu context (maybe we should rename it one day)
 - spu_free and spu_prio_wakeup are merged into a single spu_reschedule
   function
 - various functionality is split out into small helpers, and kerneldoc
   comments are added in various places to document what's going on.
 - spu_activate is rewritten into a tight loop by removing test for
   various impossible conditions and using the infrastructure in this
   patch.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
2007-02-13 21:55:41 +01:00
Christoph Hellwig 8389998ae9 [POWERPC] spufs: move prio to spu_context
It doesn't make any sense to have a priority field in the physical spu
structure.  Move it into the spu context instead.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
2007-02-13 21:55:40 +01:00
Christoph Hellwig 650f8b0291 [POWERPC] spufs: simplify state_mutex
The r/w semaphore to lock the spus was overkill and can be replaced
with a mutex to make it faster, simpler and easier to debug.  It also
helps to allow making most spufs interruptible in future patches.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
2007-02-13 21:52:37 +01:00
Christoph Hellwig 202557d29e [POWERPC] spufs: sched.c cleanups
Various cleanups to sched.c that don't change the global control flow:

 - add kerneldoc comments to various functions
 - add spu_ prefixes to various functions
 - add/remove context from the runqueue in bind/unbind_context as
   it's part of the logical operation
 - add a call to put_active_spu to spu_unbind_contex as it's logically
   part of the unbind operation

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
2007-02-13 21:52:36 +01:00
Christoph Hellwig 81998bafe2 [POWERPC] spufs: bind_context sets SPU_STATE_RUNNABLE
Only bind_context/unbind_context change the spu context state.  Thus
we can move all assignents of SPU_STATE_RUNNABLE into bind_context,
which parallels the unbind side aswell.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
2007-02-13 21:52:36 +01:00
Christoph Hellwig aa56c16807 [POWERPC] spufs: remove superfluous SPU_STATE_SAVED assignments
unbind_context already sets the context state to SPU_STATE_SAVED, thus
the spu_deactivate callers don't need to do it again.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
2007-02-13 21:52:36 +01:00
Arnd Bergmann 8676727779 [POWERPC] spufs: add infrastructure for finding elf objects
This adds an 'object-id' file that the spe library can
use to store a pointer to its ELF object. This was
originally meant for use by oprofile, but is now
also used by the GNU debugger, if available.

In order for oprofile to find the location in an spu-elf
binary where an event counter triggered, we need a way
to identify the binary in the first place.

Unfortunately, that binary itself can be embedded in a
powerpc ELF binary. Since we can assume it is mapped into
the effective address space of the running process,
have that one write the pointer value into a new spufs
file.

When a context switch occurs, pass the user value to
the profiler so that can look at the mapped file (with
some care).

Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-10-05 09:21:02 +10:00
Arnd Bergmann 9add11daee [POWERPC] spufs: implement error event delivery to user space
This tries to fix spufs so we have an interface closer to what is
specified in the man page for events returned in the third argument of
spu_run.

Fortunately, libspe has never been using the returned contents of that
register, as they were the same as the return code of spu_run (duh!).

Unlike the specification that we never implemented correctly, we now
require a SPU_CREATE_EVENTS_ENABLED flag passed to spu_create, in
order to get the new behavior. When this flag is not passed, spu_run
will simply ignore the third argument now.

Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-10-05 09:21:01 +10:00
Mark Nutter a68cf983f6 [POWERPC] spufs: scheduler support for NUMA.
This patch adds NUMA support to the the spufs scheduler.

The new arch/powerpc/platforms/cell/spufs/sched.c is greatly
simplified, in an attempt to reduce complexity while adding
support for NUMA scheduler domains.  SPUs are allocated starting
from the calling thread's node, moving to others as supported by
current->cpus_allowed.  Preemption is gone as it was buggy, but
should be re-enabled in another patch when stable.

The new arch/powerpc/platforms/cell/spu_base.c maintains idle
lists on a per-node basis, and allows caller to specify which
node(s) an SPU should be allocated from, while passing -1 tells
spu_alloc() that any node is allowed.

Since the patch removes the currently implemented preemptive
scheduling, it is technically a regression, but practically
all users have since migrated to this version, as it is
part of the IBM SDK and the yellowdog distribution, so there
is not much point holding it back while the new preemptive
scheduling patch gets delayed further.

Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-10-05 09:21:00 +10:00
Jörn Engel 6ab3d5624e Remove obsolete #include <linux/config.h>
Signed-off-by: Jörn Engel <joern@wohnheim.fh-wedel.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-06-30 19:25:36 +02:00
Geoff Levand a91942ae7e [POWERPC] spufs: fix spu irq affinity setting
This changes the hypervisor abstraction of setting cpu affinity to a
higher level to avoid platform dependent interrupt controller
routines.  I replaced spu_priv1_ops:spu_int_route_set() with a
new routine spu_priv1_ops:spu_cpu_affinity_set().

As a by-product, this change eliminated what looked like an
existing bug in the set affinity code where spu_int_route_set()
mistakenly called int_stat_get().

Signed-off-by: Geoff Levand <geoffrey.levand@am.sony.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-06-21 15:01:31 +10:00
Arnd Bergmann a33a7d7309 [PATCH] spufs: implement mfc access for PPE-side DMA
This patch adds a new file called 'mfc' to each spufs directory.
The file accepts DMA commands that are a subset of what would
be legal DMA commands for problem state register access. Upon
reading the file, a bitmask is returned with the completed
tag groups set.

The file is meant to be used from an abstraction in libspe
that is added by a different patch.

From the kernel perspective, this means a process can now
offload a memory copy from or into an SPE local store
without having to run code on the SPE itself.

The transfer will only be performed while the SPE is owned
by one thread that is waiting in the spu_run system call
and the data will be transferred into that thread's
address space, independent of which thread started the
transfer.

Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-03-27 14:48:26 +11:00
Arnd Bergmann 2fb9d20636 [PATCH] spufs: set irq affinity for running threads
For far, all SPU triggered interrupts always end up on
the first SMT thread, which is a bad solution.

This patch implements setting the affinity to the
CPU that was running last when entering execution on
an SPU. This should result in a significant reduction
in IPI calls and better cache locality for SPE thread
specific data.

Signed-off-by: Arnd Bergmann <arndb@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09 15:44:57 +11:00
Arnd Bergmann 8837d9216f [PATCH] spufs: clean up use of bitops
checking bits manually might not be synchonized with
the use of set_bit/clear_bit. Make sure we always use
the correct bitops by removing the unnecessary
identifiers.

Signed-off-by: Arnd Bergmann <arndb@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09 15:44:43 +11:00
Arnd Bergmann 7945a4a27d [PATCH] spufs: trivial compile fix
One of my last patches contained a broken line
from splitting out some other changes, this
restores a working version.

Signed-off-by: Arnd Bergmann <arndb@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09 14:53:14 +11:00
Arnd Bergmann 2a911f0bb7 [PATCH] spufs: Improved SPU preemptability [part 2].
This patch reduces lock complexity of SPU scheduler, particularly
for involuntary preemptive switches.  As a result the new code
does a better job of mapping the highest priority tasks to SPUs.

Lock complexity is reduced by using the system default workqueue
to perform involuntary saves.  In this way we avoid nasty lock
ordering problems that the previous code had.  A "minimum timeslice"
for SPU contexts is also introduced.  The intent here is to avoid
thrashing.

While the new scheduler does a better job at prioritization it
still does nothing for fairness.

From: Mark Nutter <mnutter@us.ibm.com>
Signed-off-by: Arnd Bergmann <arndb@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09 14:52:58 +11:00
Arnd Bergmann 5110459f18 [PATCH] spufs: Improved SPU preemptability.
This patch makes it easier to preempt an SPU context by
having the scheduler hold ctx->state_sema for much shorter
periods of time.

As part of this restructuring, the control logic for the "run"
operation is moved from arch/ppc64/kernel/spu_base.c to
fs/spufs/file.c.  Of course the base retains "bottom half"
handlers for class{0,1} irqs.  The new run loop will re-acquire
an SPU if preempted.

From: Mark Nutter <mnutter@us.ibm.com>
Signed-off-by: Arnd Bergmann <arndb@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09 14:52:55 +11:00
Arnd Bergmann 3b3d22cb84 [PATCH] spufs: Turn off debugging output
spufs is rather noisy when debugging is enabled, this
turns off the messages for production use.

Signed-off-by: Arnd Bergmann <arndb@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09 14:52:51 +11:00
Arnd Bergmann 8b3d6663c6 [PATCH] spufs: cooperative scheduler support
This adds a scheduler for SPUs to make it possible to use
more logical SPUs than physical ones are present in the
system.

Currently, there is no support for preempting a running
SPU thread, they have to leave the SPU by either triggering
an event on the SPU that causes it to return to the
owning thread or by sending a signal to it.

This patch also adds operations that enable accessing an SPU
in either runnable or saved state. We use an RW semaphore
to protect the state of the SPU from changing underneath
us, while we are holding it readable. In order to change
the state, it is acquired writeable and a context save
or restore is executed before downgrading the semaphore
to read-only.

From: Mark Nutter <mnutter@us.ibm.com>,
      Uli Weigand <Ulrich.Weigand@de.ibm.com>
Signed-off-by: Arnd Bergmann <arndb@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09 14:49:30 +11:00