Commit graph

863 commits

Author SHA1 Message Date
Dave Airlie efa27f9cec Merge tag 'drm-intel-next-2013-08-23' of git://people.freedesktop.org/~danvet/drm-intel into drm-next
Need to get my stuff out the door ;-) Highlights:
- pc8+ support from Paulo
- more vma patches from Ben.
- Kconfig option to enable preliminary support by default (Josh
  Triplett)
- Optimized cpu cache flush handling and support for write-through caching
  of display planes on Iris (Chris)
- rc6 tuning from Stéphane Marchesin for more stability
- VECS seqno wrap/semaphores fix (Ben)
- a pile of smaller cleanups and improvements all over

Note that I've ditched Ben's execbuf vma conversion for 3.12 since not yet
ready. But there's still other vma conversion stuff in here.

* tag 'drm-intel-next-2013-08-23' of git://people.freedesktop.org/~danvet/drm-intel: (62 commits)
  drm/i915: Print seqnos as unsigned in debugfs
  drm/i915: Fix context size calculation on SNB/IVB/VLV
  drm/i915: Use POSTING_READ in lcpll code
  drm/i915: enable Package C8+ by default
  drm/i915: add i915.pc8_timeout function
  drm/i915: add i915_pc8_status debugfs file
  drm/i915: allow package C8+ states on Haswell (disabled)
  drm/i915: fix SDEIMR assertion when disabling LCPLL
  drm/i915: grab force_wake when restoring LCPLL
  drm/i915: drop WaMbcDriverBootEnable workaround
  drm/i915: Cleaning up the relocate entry function
  drm/i915: merge HSW and SNB PM irq handlers
  drm/i915: fix how we mask PMIMR when adding work to the queue
  drm/i915: don't queue PM events we won't process
  drm/i915: don't disable/reenable IVB error interrupts when not needed
  drm/i915: add dev_priv->pm_irq_mask
  drm/i915: don't update GEN6_PMIMR when it's not needed
  drm/i915: wrap GEN6_PMIMR changes
  drm/i915: wrap GTIMR changes
  drm/i915: add the FCLK case to intel_ddi_get_cdclk_freq
  ...
2013-08-30 09:47:41 +10:00
Paulo Zanoni c67a470b1d drm/i915: allow package C8+ states on Haswell (disabled)
This patch allows PC8+ states on Haswell. These states can only be
reached when all the display outputs are disabled, and they allow some
more power savings.

The fact that the graphics device is allowing PC8+ doesn't mean that
the machine will actually enter PC8+: all the other devices also need
to allow PC8+.

For now this option is disabled by default. You need i915.allow_pc8=1
if you want it.

This patch adds a big comment inside i915_drv.h explaining how it
works and how it tracks things. Read it.

v2: (this is not really v2, many previous versions were already sent,
     but they had different names)
    - Use the new functions to enable/disable GTIMR and GEN6_PMIMR
    - Rename almost all variables and functions to names suggested by
      Chris
    - More WARNs on the IRQ handling code
    - Also disable PC8 when there's GPU work to do (thanks to Ben for
      the help on this), so apps can run caster
    - Enable PC8 on a delayed work function that is delayed for 5
      seconds. This makes sure we only enable PC8+ if we're really
      idle
    - Make sure we're not in PC8+ when suspending
v3: - WARN if IRQs are disabled on __wait_seqno
    - Replace some DRM_ERRORs with WARNs
    - Fix calls to restore GT and PM interrupts
    - Use intel_mark_busy instead of intel_ring_advance to disable PC8
v4: - Use the force_wake, Luke!
v5: - Remove the "IIR is not zero" WARNs
    - Move the force_wake chunk to its own patch
    - Only restore what's missing from RC6, not everything

Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-08-23 14:52:33 +02:00
Ben Widawsky accfef2e5a drm/i915: prepare bind_to_vm for preallocated vma
In the new execbuf code we want to track buffers using the vmas even
before they're all properly mapped. Which means that bind_to_vm needs
to deal with buffers which have preallocated vmas which aren't yet
bound.

This patch implements this prep work and adjusts our WARN/BUG checks.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
[danvet: Split out from Ben's big execbuf patch. Also move one BUG
back to its original place to deflate the diff a notch.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-08-22 13:31:53 +02:00
Ben Widawsky 82a55ad1a0 drm/i915: Switch eviction code to use vmas
The execbuf wants to do relocations usings vmas, so we need a
vma->exec_list. The eviction code also uses the old obj execbuf list
for it's own book-keeping, but would really prefer to deal in vmas
only. So switch it over to the new list.

Again this is just a prep patch for the big execbuf vma conversion.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
[danvet: Split out from Ben's big execbuf vma patch.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-08-22 13:31:52 +02:00
Ben Widawsky b25cb2f882 drm/i915: s/obj->exec_list/obj->obj_exec_link in debugfs
To convert the execbuf code over to use vmas natively we need to
shuffle the exec_list a bit. This patch here just prepares things with
the debugfs code, which also uses the old exec_list list_head, newly
called obj_exec_link.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
[danvet: Split out from Ben's big patch.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-08-22 13:31:51 +02:00
Chris Wilson 4b6d846e9a drm/i915: Drop the overzealous warning from i915_gem_set_cache_level
By our earlier reckoning, move from a snooped/llc setting to an uncached
setting, leaves the CPU cache in a consistent state irrespective of our
domain tracking - so we can forgo the warning about the lack of
invalidation. Similarly for any writes posted to the snooped CPU domain,
we know will be safely clflushed to the uncached PTEs after forcing the
domain change.

This WARN started to pop up with

commit d46f1c3f13
Author:     Chris Wilson <chris@chris-wilson.co.uk>
AuthorDate: Thu Aug 8 14:41:06 2013 +0100

    drm/i915: Allow the GPU to cache stolen memory

Ville brought up a scenario where the interaction of a set_caching
ioctl call from userspace on a scanout buffer (i.e. obj->pin_display
is set) resulted in the code getting confused and not properly
flushing stale cpu cachelines. Luckily we already prevent this by
rejecting caching changes when obj->pin_count is set.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68040
Tested-by: cancan,feng <cancan.feng@intel.com>
[danvet: Add buglink, bisect result and explain why Ville's scenario
is already taken care of.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-08-22 13:31:46 +02:00
Daniel Vetter 49987099e2 drm/i915: use vma->node directly and rewrap map&fence in bind
Use () to make for neater alignment of the split lines, too. With this
we ditch another jump through the obj_gtt_size/offset indirection
maze.

Cc: Ben Widawsky <benjamin.widawsky@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-08-22 13:31:45 +02:00
Ben Widawsky 4bd561b3e8 drm/i915: cleanup map&fence in bind
Cleanup the map and fenceable setting during bind to make more sense,
and not check i915_is_ggtt() 2 unnecessary times

v2: Move the bools into the if block (Chris) - There are ways to tidy
this function (fence calculations for instance) even further, but they
are quite invasive, so I am punting on those unless specifically asked.

v3: Add newline between variable declaration and logic (Chris)

Recommended-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-08-22 13:31:45 +02:00
Ben Widawsky 433544bd25 drm/i915: Remove node only when allocated
VMAs can be created and not bound. One may think of it as lazy cleanup,
and safely gloss over the conditions which manufacture it. In either
case, when the object backing the i915 vma is destroyed, we must cleanup
the vma without stumbling into a bunch of pitfalls that assume the vma
is bound.

NOTE: I was pretty certain the above condition could only happen when we
introduced the use of VMAs being looked up at execbuf, and already
existing. Paulo has hit this though, so I must be missing something. As
I believe the patch is correct anyway, therefore I won't scratch my head
too hard.

v2: use goto destroy as a compromise (Chris)

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-08-22 13:31:44 +02:00
Chris Wilson 4257d3ba3b drm/i915: Allow the user to set bo into the DISPLAY cache domain
This is primarily for the benefit of the create2 ioctl so that the
caller can avoid the later step of rebinding the bo with new PTE bits.
After introducing WT (and possibly GFDT) cacheing for display targets,
not everything in the display is earmarked as UC, and more importantly
what is is controlled by the kernel.

Note that set_cache_level/get_cache_level for DISPLAY is not necessarily
idempotent; get_cache_level may return UC for architectures that have no
special cache domain for the display engine.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-08-22 13:31:39 +02:00
Chris Wilson 651d794fae drm/i915: Use Write-Through cacheing for the display plane on Iris
Haswell GT3e has the unique feature of supporting Write-Through cacheing
of objects within the eLLC/LLC. The purpose of this is to enable the display
plane to remain coherent whilst objects lie resident in the eLLC/LLC - so
that we, in theory, get the best of both worlds, perfect display and fast
access.

However, we still need to be careful as the CPU does not see the WT when
accessing the cache. In particular, this means that we need to flush the
cache lines after writing to an object through the CPU, and on
transitioning from a cached state to WT.

v2: Actually do the clflush on transition to WT, nagging by Ville.
v3: Flush the CPU cache after writes into WT objects.
v4: Rease onto LLC updates and report WT as "uncached" for
get_cache_level_ioctl to remain symmetric with set_cache_level_ioctl.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-08-22 13:31:38 +02:00
Jani Nikula f2f4d82faf drm/i915: give more distinctive names to ring hangcheck action enums
The short lowercase names are bound to collide. The default warnings
don't even warn about shadowing.

Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-08-22 13:31:37 +02:00
Ben Widawsky 7ace7ef2f5 drm/i915: WARN_ON failed map_and_fenceable
I just noticed in our code we don't really check the assertion, and
given some of the code I am changing in this area, I feel a WARN is very
nice to have.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
[danvet: s/&/&&/ to fix typo on the check.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-08-22 13:31:35 +02:00
Chris Wilson 000433b67e drm/i915: Only do a chipset flush after a clflush
Now that we skip clflushes more often, return a boolean indicating
whether the clflush was actually performed, and only if it was do the
chipset flush. (Though on most of the architectures where the clflush will
be skipped, the chipset flush is a no-op!)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-08-22 13:31:34 +02:00
Dave Airlie 9712def2b3 Merge tag 'drm-intel-next-2013-08-09' of git://people.freedesktop.org/~danvet/drm-intel into drm-next
Daniel writes:
New pile of stuff for -next:
- Cleanup of the old crtc helper callbacks, all encoders are now converted
  to the i915 modeset infrastructure.
- Massive amount of wm patches from Ville for ilk, snb, ivb, hsw, this is
  prep work to eventually get things going for nuclear pageflips where we
  need to adjust watermarks on the fly.
- More vm/vma patches from Ben. This refactoring isn't yet fully rolled
  out, we miss the execbuf conversion and some of the low-level
  bind/unbind support code.
- Convert our hdmi infoframe code to use the new common helper functions
  (Damien). This contains some bugfixes for the common infoframe helpers.
- Some cruft removal from Damien.
- Various smaller bits&pieces all over, as usual.

* tag 'drm-intel-next-2013-08-09' of git://people.freedesktop.org/~danvet/drm-intel: (105 commits)
  drm/i915: Fix FB WM for HSW
  drm/i915: expose HDMI connectors on port C on BYT
  drm/i915: fix a limit check in hsw_compute_wm_results()
  drm/i915: unbreak i915_gem_object_ggtt_unbind()
  drm/i915: Make intel_set_mode() static
  drm/i915: Remove intel_modeset_disable()
  drm/i915: Make intel_encoder_dpms() static
  drm/i915: Make i915_hangcheck_elapsed() static
  drm/i915: Fix #endif comment
  drm/i915: Remove i915_gem_object_check_coherency()
  drm/i915: Remove stale prototypes
  drm/i915: List objects allocated from stolen memory in debugfs
  drm/i915: Always call intel_update_sprite_watermarks() when disabling a plane
  drm/i915: Pass plane and crtc to intel_update_sprite_watermarks
  drm/i915: Don't try to disable plane if it's already disabled
  drm/i915: Pass crtc to our update/disable_plane hooks
  drm/i915: Split plane watermark parameters into a separate struct
  drm/i915: Pull some watermarks state into a separate structure
  drm/i915: Calculate max watermark levels for ILK+
  drm/i915: Rename hsw_lp_wm_result to intel_wm_level
  ...
2013-08-21 12:48:59 +10:00
Chris Wilson 2c22569bba drm/i915: Update rules for writing through the LLC with the cpu
As mentioned in the previous commit, reads and writes from both the CPU
and GPU go through the LLC. This gives us coherency between the CPU and
GPU irrespective of the attribute settings either device sets. We can
use to avoid having to clflush even uncached memory.

Except for the scanout.

The scanout resides within another functional block that does not use
the LLC but reads directly from main memory. So in order to maintain
coherency with the scanout, writes to uncached memory must be flushed.
In order to optimize writes elsewhere, we start tracking whether an
framebuffer is attached to an object.

v2: Use pin_display tracking rather than fb_count (to ensure we flush
cursors as well etc) and only force the clflush along explicit writes to
the scanout paths (i.e. pin_to_display_plane and pwrite into scanout).

v3: Force the flush after hitting the slowpath in pwrite, as after
dropping the lock the object's cache domain may be invalidated. (Ville)

Based on a patch by Ville Syrjälä.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-08-10 11:20:49 +02:00
Chris Wilson cc98b413c1 drm/i915: Track when an object is pinned for use by the display engine
The display engine has unique coherency rules such that it requires
special handling to ensure that all writes to cursors, scanouts and
sprites are clflushed. This patch introduces the infrastructure to
simply track when an object is being accessed by the display engine.

v2: Explain the is_pin_display() magic as the sources for obj->pin_count
and their individual rules is not obvious. (Ville)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-08-10 11:19:51 +02:00
Chris Wilson c76ce038e3 drm/i915: Update rules for reading cache lines through the LLC
The LLC is a fun device. The cache is a distinct functional block within
the SA that arbitrates access from both the CPU and GPU cores. As such
all writes to memory land first in the LLC before further action is
taken. For example, an uncached write from either the CPU or GPU will
then proceed to memory and evict the cacheline from the LLC. This means that
a read from the LLC always returns the correct information even if the PTE
bit in the GPU differs from the PAT bit in the CPU. For the older
snooping architecture on non-LLC, the fundamental principle still holds
except that some coordination is required between the CPU and GPU to
explicitly perform the snooping (which is handled by our request
tracking).

The upshot of this is that we know that we can issue a read from either
LLC devices or snoopable memory and trust the contents of the cache -
i.e. we can forgo a clflush before a read in these circumstances.
Writing to memory from the CPU is a little more tricky as we have to
consider that the scanout does not read from the CPU cache at all, but
from main memory. So we have to currently treat all requests to write to
uncached memory as having to be flushed to main memory for coherency
with all consumers.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-08-10 11:19:50 +02:00
Dan Carpenter 58e73e1570 drm/i915: unbreak i915_gem_object_ggtt_unbind()
There is an extra semi-colon here so we just leak and never unbind
anything.

This regression has been introduced in

commit 07fe0b1280
Author: Ben Widawsky <ben@bwidawsk.net>
Date:   Wed Jul 31 17:00:10 2013 -0700

    drm/i915: plumb VM into bind/unbind code

Cc: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-08-09 12:04:53 +02:00
Ben Widawsky 8b9c2b9411 drm/i915: Add vma to list at creation
With the current code there shouldn't be a distinction - however with an
upcoming change we intend to allocate a vma much earlier, before it's
actually bound anywhere.

To do this we have to check node allocation as well for the _bound()
check.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
[danvet: move list_del(&vma->vma_link) from vma_unbind to vma_destroy,
again fallout from the loss of "rm/i915: Cleanup more of VMA in
destroy".]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

fixup for drm/i915: Add vma to list at creation
2013-08-08 14:10:20 +02:00
Ben Widawsky ca191b1313 drm/i915: mm_list is per VMA
formerly: "drm/i915: Create VMAs (part 5) - move mm_list"

The mm_list is used for the active/inactive LRUs. Since those LRUs are
per address space, the link should be per VMx .

Because we'll only ever have 1 VMA before this point, it's not incorrect
to defer this change until this point in the patch series, and doing it
here makes the change much easier to understand.

Shamelessly manipulated out of Daniel:
"active/inactive stuff is used by eviction when we run out of address
space, so needs to be per-vma and per-address space. Bound/unbound otoh
is used by the shrinker which only cares about the amount of memory used
and not one bit about in which address space this memory is all used in.
Of course to actual kick out an object we need to unbind it from every
address space, but for that we have the per-object list of vmas."

v2: only bump GGTT LRU in i915_gem_object_set_to_gtt_domain (Chris)

v3: Moved earlier in the series

v4: Add dropped message from v3

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
[danvet: Frob patch to apply and use vma->node.size directly as
discused with Ben. Also drop a needles BUG_ON before move_to_inactive,
the function itself has the same check.]
[danvet 2nd: Rebase on top of the lost "drm/i915: Cleanup more of VMA
in destroy", specifically unlink the vma from the mm_list in
vma_unbind (to keep it symmetric with bind_to_vm) instead of
vma_destroy.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-08-08 14:06:58 +02:00
Ben Widawsky 5cacaac77c drm/i915: Fix up map and fenceable for VMA
formerly: "drm/i915: Create VMAs (part 3.5) - map and fenceable
tracking"

The map_and_fenceable tracking is per object. GTT mapping, and fences
only apply to global GTT. As such,  object operations which are not
performed on the global GTT should not effect mappable or fenceable
characteristics.

Functionally, this commit could very well be squashed in to a previous
patch which updated object operations to take a VM argument.  This
commit is split out because it's a bit tricky (or at least it was for
me).

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
[danvet: Drop the bogus hunk in i915_vma_unbind as discussed with
Ben.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-08-08 14:04:55 +02:00
Ben Widawsky 9843877d10 drm/i915: turn bound_ggtt checks to bound_any
In some places, we want to know if an object is bound in any address
space, and not just the global GTT. This often applies when there is a
single global resource (object, pages, etc.)

function                             |      reason
--------------------------------------------------
i915_gem_object_is_inactive          | global object
i915_gem_object_put_pages            | object's pages
915_gem_object_unpin                 | global object
i915_gem_execbuffer_unreserve_object | temporary until we plumb vma
pread/pwrite                         | see the note below

Note: set_to_gtt_domain in pwrite/pread is abused as a wait_rendering
call - but that once only worked if the object is bound. We really
should replace this with a plain wait_rendering call, which would have
the upside that in pread it would be clearer that we actually only
wait for oustanding gpu writes.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
[danvet: Explain the set_to_gtt_domain in pwrite/pread and volunteer
Ben to replace those with wait_rendering calls.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-08-08 14:04:43 +02:00
Ben Widawsky f6cd1f15d3 drm/i915: Use new bind/unbind in eviction code
Eviction code, like the rest of the converted code needs to be aware of
the address space for which it is evicting (or the everything case, all
addresses). With the updated bind/unbind interfaces of the last patch,
we can now safely move the eviction code over.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-08-08 14:04:43 +02:00
Ben Widawsky 07fe0b1280 drm/i915: plumb VM into bind/unbind code
As alluded to in several patches, and it will be reiterated later... A
VMA is an abstraction for a GEM BO bound into an address space.
Therefore it stands to reason, that the existing bind, and unbind are
the ones which will be the most impacted. This patch implements this,
and updates all callers which weren't already updated in the series
(because it was too messy).

This patch represents the bulk of an earlier, larger patch. I've pulled
out a bunch of things by the request of Daniel. The history is preserved
for posterity with the email convention of ">" One big change from the
original patch aside from a bunch of cropping is I've created an
i915_vma_unbind() function. That is because we always have the VMA
anyway, and doing an extra lookup is useful. There is a caveat, we
retain an i915_gem_object_ggtt_unbind, for the global cases which might
not talk in VMAs.

> drm/i915: plumb VM into object operations
>
> This patch was formerly known as:
> "drm/i915: Create VMAs (part 3) - plumbing"
>
> This patch adds a VM argument, bind/unbind, and the object
> offset/size/color getters/setters. It preserves the old ggtt helper
> functions because things still need, and will continue to need them.
>
> Some code will still need to be ported over after this.
>
> v2: Fix purge to pick an object and unbind all vmas
> This was doable because of the global bound list change.
>
> v3: With the commit to actually pin/unpin pages in place, there is no
> longer a need to check if unbind succeeded before calling put_pages().
> Make put_pages only BUG() after checking pin count.
>
> v4: Rebased on top of the new hangcheck work by Mika
> plumbed eb_destroy also
> Many checkpatch related fixes
>
> v5: Very large rebase
>
> v6:
> Change BUG_ON to WARN_ON (Daniel)
> Rename vm to ggtt in preallocate stolen, since it is always ggtt when
> dealing with stolen memory. (Daniel)
> list_for_each will short-circuit already (Daniel)
> remove superflous space (Daniel)
> Use per object list of vmas (Daniel)
> Make obj_bound_any() use obj_bound for each vm (Ben)
> s/bind_to_gtt/bind_to_vm/ (Ben)
>
> Fixed up the inactive shrinker. As Daniel noticed the code could
> potentially count the same object multiple times. While it's not
> possible in the current case, since 1 object can only ever be bound into
> 1 address space thus far - we may as well try to get something more
> future proof in place now. With a prep patch before this to switch over
> to using the bound list + inactive check, we're now able to carry that
> forward for every address space an object is bound into.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
[danvet: Rebase on top of the loss of "drm/i915: Cleanup more of VMA
in destroy".]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-08-08 14:04:20 +02:00
Ben Widawsky 80dcfdbd68 drm/i915: Rework __i915_gem_shrink
In order to do this for all VMs, it's convenient to rework the logic a
bit. This should have no functional impact.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-08-08 14:02:41 +02:00
Dave Airlie 32c913e436 Merge tag 'drm-intel-next-2013-07-26-fixed' of git://people.freedesktop.org/~danvet/drm-intel into drm-next
Neat that QA (and Ben) keeps on humming along while I'm on vacation, so
you already get the next feature pull request:
- proper eLLC support for HSW from Ben
- more interrupt refactoring
- add w/a tags where we implement them already (Damien)
- hangcheck fixes (Chris) + hangcheck stats (Mika)
- flesh out the new vm structs for ppgtt and ggtt (Ben)
- PSR for Haswell, still disabled by default (Rodrigo et al.)
- pc8+ refclock sequence code from Paulo
- more interrupt refactoring from Paulo, unifying ilk/snb with the ivb/hsw
  interrupt code
- full solution for the Haswell concurrent reg access issues (Chris)
- fix racy object accounting, used by some new leak tests
- fix sync polarity settings on ch7xxx dvo encoder
- random bits&pieces, little fixes and better debug output all over

[airlied: fix conflict with drm_mm cleanups]

* tag 'drm-intel-next-2013-07-26-fixed' of git://people.freedesktop.org/~danvet/drm-intel: (289 commits)
  drm/i915: Do not dereference NULL crtc or fb until after checking
  drm/i915: fix pnv display core clock readout out
  drm/i915: Replace open-coded offset_in_page()
  drm/i915: Retry DP aux_ch communications with a different clock after failure
  drm/i915: Add messages useful for HPD storm detection debugging (v2)
  drm/i915: dvo_ch7xxx: fix vsync polarity setting
  drm/i915: fix the racy object accounting
  drm/i915: Convert the register access tracepoint to be conditional
  drm/i915: Squash gen lookup through multiple indirections inside GT access
  drm/i915: Use the common register access functions for NOTRACE variants
  drm/i915: Use a private interface for register access within GT
  drm/i915: Colocate all GT access routines in the same file
  drm/i915: fix reference counting in i915_gem_create
  drm/i915: Use Graphics Base of Stolen Memory on all gen3+
  drm/i915: disable stolen mem for OVERLAY_NEEDS_PHYSICAL
  drm/i915: add functions to disable and restore LCPLL
  drm/i915: disable CLKOUT_DP when it's not needed
  drm/i915: extend lpt_enable_clkout_dp
  drm/i915: fix up error cleanup in i915_gem_object_bind_to_gtt
  drm/i915: Add some debug breadcrumbs to connector detection
  ...
2013-08-07 18:11:35 +10:00
David Herrmann 31e5d7c67b drm/mm: add "best_match" flag to drm_mm_insert_node()
Add a "best_match" flag similar to the drm_mm_search_*() helpers so we
can convert TTM to use them in follow up patches. We can also inline the
non-generic helpers and move them into the header to allow compile-time
optimizations.

To make calls to drm_mm_{search,insert}_node() more readable, this
converts the boolean argument to a flagset. There are pending patches that
add additional flags for top-down allocators and more.

v2:
 - use flag parameter instead of boolean "best_match"
 - convert *_search_free() helpers to also use flags argument

Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2013-08-07 10:08:58 +10:00
Daniel Vetter 43387b37fa drm/gem: create drm_gem_dumb_destroy
All the gem based kms drivers really want the same function to
destroy a dumb framebuffer backing storage object.

So give it to them and roll it out in all drivers.

This still leaves the option open for kms drivers which don't use GEM
for backing storage, but it does decently simplify matters for gem
drivers.

Acked-by: Inki Dae <inki.dae@samsung.com>
Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Cc: Intel Graphics Development <intel-gfx@lists.freedesktop.org>
Cc: Ben Skeggs <skeggsb@gmail.com>
Reviwed-by: Rob Clark <robdclark@gmail.com>
Cc: Alex Deucher <alexdeucher@gmail.com>
Acked-by: Patrik Jakobsson <patrik.r.jakobsson@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2013-08-07 09:59:24 +10:00
Ben Widawsky 637efacf8f drm/i915: eliminate dead domain clearing on reset
The code itself is no longer accurate without updating once we have
multiple address space since clearing the domains of every object
requires scanning the inactive list for all VMs.

"This code is dead. Just remove it rather than port it to vma." - Chris
Wilson

Recommended-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-08-05 19:12:25 +02:00
Ben Widawsky d1ccbb5d71 drm/i915: make reset&hangcheck code VM aware
Hangcheck, and some of the recent reset code for guilty batches need to
know which address space the object was in at the time of a hangcheck.
This is because we use offsets in the (PP|G)GTT to determine this
information, and those offsets can differ depending on which VM they are
bound into.

Since we still only have 1 VM ever, this code shouldn't yet have any
impact.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-08-05 19:04:13 +02:00
Ben Widawsky 3e12302705 drm/i915: BUG_ON put_pages later
With multiple VMs, the eviction code benefits from being able to blindly
put pages without needing to know if there are any entities still
holding on to those pages. As such it's preferable to return the -EBUSY
before the BUG.

Eviction code is the only user for now, but overall it makes sense
anyway, IMO.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-08-05 19:04:13 +02:00
Ben Widawsky 3089c6f239 drm/i915: make caching operate on all address spaces
For now, objects will maintain the same cache levels amongst all address
spaces. This is to limit the risk of bugs, as playing with cacheability
in the different domains can be very error prone.

In the future, it may be optimal to allow setting domains per VMA (ie.
an object bound into an address space).

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-08-05 19:04:12 +02:00
Ben Widawsky c37e220461 drm/i915: Add VM to pin
To verbalize it, one can say, "pin an object into the given address
space." The semantics of pinning remain the same otherwise.

Certain objects will always have to be bound into the global GTT.
Therefore, global GTT is a special case, and keep a special interface
around for it (i915_gem_obj_ggtt_pin).

v2: s/i915_gem_ggtt_pin/i915_gem_obj_ggtt_pin

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-08-05 19:04:09 +02:00
Ben Widawsky fcb4a57805 drm/i915: Use bound list for inactive shrink
Do to the move active/inactive lists, it no longer makes sense to use
them for shrinking, since shrinking isn't VM specific (such a need may
also exist, but doesn't yet).

What we can do instead is use the global bound list to find all objects
which aren't active.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-08-05 19:04:09 +02:00
Ben Widawsky a70a3148b0 drm/i915: Make proper functions for VMs
Earlier in the conversion sequence we attempted to quickly wedge in the
transitional interface as static inlines.

Now that we're sure these interfaces are sane, for easier debug and to
decrease code size (since many of these functions may be called quite a
bit), make them real functions

While at it, kill off the set_color interface. We'll always have the
VMA, or easily get to it.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-08-05 19:04:08 +02:00
Ben Widawsky fc8c067eee drm/i915: Create an init vm
Move all the similar address space (VM) initialization code to one
function. Until we have multiple VMs, there should only ever be 1 VM.
The aliasing ppgtt is a special case without it's own VM (since it
doesn't need it's own address space management).

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-08-05 19:04:07 +02:00
Daniel Vetter c20e835586 drm/i915: fix the racy object accounting
Just use a spinlock to protect them.

v2: Rebase onto the new object create refcount fix patch.

v3: Don't kill dev_priv->mm.object_memory as requested by Chris and
hence just use a spinlock instead of atomic_t.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67287
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-25 15:30:54 +02:00
Daniel Vetter cb54b53ada Merge commit 'Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux'
This backmerges Linus' merge commit of the latest drm-fixes pull:

commit 549f3a1218
Merge: 42577ca 058ca4a
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date:   Tue Jul 23 15:47:08 2013 -0700

    Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux

We've accrued a few too many conflicts, but the real reason is that I
want to merge the 100% solution for Haswell concurrent registers
writes into drm-intel-next. But that depends upon the 90% bandaid
merged into -fixes:

commit a7cd1b8fea
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Fri Jul 19 20:36:51 2013 +0100

    drm/i915: Serialize almost all register access

Also, we can roll up on accrued conflicts.

Usually I'd backmerge a tagged -rc, but I want to get this done before
heading off to vacations next week ;-)

Conflicts:
	drivers/gpu/drm/i915/i915_dma.c
	drivers/gpu/drm/i915/i915_gem.c

v2: For added hilarity we have a init sequence conflict around the
gt_lock, so need to move that one, too. Spotted by Jani Nikula.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-25 15:18:41 +02:00
David Herrmann 51335df9f0 drm/vma: provide drm_vma_node_unmap() helper
Instead of unmapping the nodes in TTM and GEM users manually, we provide
a generic wrapper which does the correct thing for all vma-nodes.

v2: remove bdev->dev_mapping test in ttm_bo_unmap_virtual_unlocked() as
ttm_mem_io_free_vm() does nothing in that case (io_reserved_vm is 0).
v4: Fix docbook comments
v5: use drm_vma_node_size()

Cc: Dave Airlie <airlied@redhat.com>
Cc: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@gmail.com>
2013-07-25 20:47:08 +10:00
David Herrmann 0de23977cf drm/gem: convert to new unified vma manager
Use the new vma manager instead of the old hashtable. Also convert all
drivers to use the new convenience helpers. This drops all the
(map_list.hash.key << PAGE_SHIFT) non-sense.

Locking and access-management is exactly the same as before with an
additional lock inside of the vma-manager, which strictly wouldn't be
needed for gem.

v2:
 - rebase on drm-next
 - init nodes via drm_vma_node_reset() in drm_gem.c
v3:
 - fix tegra
v4:
 - remove duplicate if (drm_vma_node_has_offset()) checks
 - inline now trivial drm_vma_node_offset_addr() calls
v5:
 - skip node-reset on gem-init due to kzalloc()
 - do not allow mapping gem-objects with offsets (backwards compat)
 - remove unneccessary casts

Cc: Inki Dae <inki.dae@samsung.com>
Cc: Rob Clark <robdclark@gmail.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Thierry Reding <thierry.reding@gmail.com>
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Acked-by: Patrik Jakobsson <patrik.r.jakobsson@gmail.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@gmail.com>
2013-07-25 20:47:06 +10:00
Daniel Vetter d861e33876 drm/i915: fix reference counting in i915_gem_create
This function is called without the dev->struct_mutex held, hence we
need to use the _unlocked unreference variants.

As soon as the object is registered userspace can sneak in here with a
gem_close ioctl call, so the object can (and with my new evil tests
actually does) get the final unreference in this place. The lack of
locking then results in hilarity and some good leakage.

To fix this we simply need to revert

Chris Wilson <chris@chris-wilson.co.uk>

v2: We need to make the trace call _before_ we drop our ref - the
object might very well be gone by then already.

v3: Just revert the original patch as suggested by Chris Wilson.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Remove the added white line again to tighten the return
block, requested by Chris.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-24 23:25:03 +02:00
Daniel Vetter bc6bc15bd7 drm/i915: fix up error cleanup in i915_gem_object_bind_to_gtt
This has been broken in

commit 2f63315692
Author: Ben Widawsky <ben@bwidawsk.net>
Date:   Wed Jul 17 12:19:03 2013 -0700

    drm/i915: Create VMAs

which resulted in an OOPS the first time around we've hit -ENOSPC.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67156
Cc: Imre Deak <imre.deak@intel.com>
Cc: Ben Widawsky <ben@bwidawsk.net>
Tested-by: meng <mengmeng.meng@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-24 10:37:08 +02:00
Xiong Zhang 0b74b508f7 drm/i915: add prefault_disable module option
prefault is stll enabled by default which prevent most of pwrite/pread/reloc
from running slow path, in order to verify these slow pathes, prefault need
to be disabled.

Signed-off-by: Xiong Zhang <xiong.y.zhang@intel.com>
[danvet: Make checkpatch happy and bikeshed the module option help
text a bit.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-19 09:29:26 +02:00
Dan Carpenter 6286ef9b56 drm/i915: use after free on error path
i915_gem_vma_destroy() frees its argument so we have to move the
drm_mm_remove_node() call up a few lines.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-19 08:58:42 +02:00
Dan Carpenter db473b36d4 drm/i915: checking for NULL instead of IS_ERR()
i915_gem_vma_create() returns and ERR_PTR() or a valid pointer, it never
returns NULL.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-19 08:58:33 +02:00
Dave Airlie e13af9a834 Merge tag 'drm-intel-next-2013-07-12' of git://people.freedesktop.org/~danvet/drm-intel into drm-next
Highlights:
- follow-up refactoring after the shared dpll rework that landed in 3.11
- oddball prep cleanups from Ben for ppgtt
- encoder->get_config state tracking infrastructure from Jesse
- used by the experimental fastboot support from Jesse (disabled by
  default)
- make the error state file official and add it to our sysfs interface
  (Mika)
- drm_mm prep changes from Ben, prepares to embedd the drm_mm_node (which
  will be used by the vma rework later on)
- interrupt handling rework, follow up cleanups to the VECS enabling, hpd
  storm handling and fifo underrun reporting.
- Big pile of smaller cleanups, code improvements and related stuff.

* tag 'drm-intel-next-2013-07-12' of git://people.freedesktop.org/~danvet/drm-intel: (72 commits)
  drm/i915: clear DPLL reg when disabling i9xx dplls
  drm/i915: Fix up cpt pixel multiplier enable sequence
  drm/i915: clean up vlv ->pre_pll_enable and pll enable sequence
  drm/i915: move error state to own compilation unit
  drm/i915: Don't attempt to read an unitialized stack value
  drm/i915: Use for_each_pipe() when possible
  drm/i915: don't enable PM_VEBOX_CS_ERROR_INTERRUPT
  drm/i915: unify ring irq refcounts (again)
  drm/i915: kill dev_priv->rps.lock
  drm/i915: queue work outside spinlock in hsw_pm_irq_handler
  drm/i915: streamline hsw_pm_irq_handler
  drm/i915: irq handlers don't need interrupt-safe spinlocks
  drm/i915: kill lpt pch transcoder->crtc mapping code for fifo underruns
  drm/i915: improve GEN7_ERR_INT clearing for fifo underrun reporting
  drm/i915: improve SERR_INT clearing for fifo underrun reporting
  drm/i915: extract ibx_display_interrupt_update
  drm/i915: remove unused members from drm_i915_private
  drm/i915: don't frob mm.suspended when not using ums
  drm/i915: Fix VLV DP RBR/HDMI/DAC PLL LPF coefficients
  drm/i915: WARN if the bios reserved range is bigger than stolen size
  ...

Conflicts:
	drivers/gpu/drm/i915/i915_gem.c
2013-07-19 12:12:21 +10:00
Daniel Vetter 94a335dba3 drm/i915: correctly restore fences with objects attached
To avoid stalls we delay tiling changes and especially hold of
committing the new fence state for as long as possible.
Synchronization points are in the execbuf code and in our gtt fault
handler.

Unfortunately we've missed that tricky detail when adding proper fence
restore code in

commit 19b2dbde57
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Wed Jun 12 10:15:12 2013 +0100

    drm/i915: Restore fences after resume and GPU resets

The result was that we've restored fences for objects with no tiling,
since the object<->fence link still existed after resume. Now that
wouldn't have been too bad since any subsequent access would have
fixed things up, but if we've changed from tiled to untiled real havoc
happened:

The tiling stride is stored -1 in the fence register, so a stride of 0
resulted in all 1s in the top 32bits, and so a completely bogus fence
spanning everything from the start of the object to the top of the
GTT. The tell-tale in the register dumps looks like:

                 FENCE START 2: 0x0214d001
                 FENCE END 2: 0xfffff3ff

Bit 11 isn't set since the hw doesn't store it, even when writing all
1s (at least on my snb here).

To prevent such a gaffle in the future add a sanity check for fences
with an untiled object attached in i915_gem_write_fence.

v2: Fix the WARN, spotted by Chris.

v3: Trying to reuse get_fences looked ugly and obfuscated the code.
Instead reuse update_fence and to make it really dtrt also move the
fence dirty state clearing into update_fence.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Stéphane Marchesin <marcheu@chromium.org>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=60530
Cc: stable@vger.kernel.org (for 3.10 only)
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Tested-by: Matthew Garrett <matthew.garrett@nebula.com>
Tested-by: Björn Bidar <theodorstormgrade@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-19 00:08:16 +02:00
Daniel Vetter 8157ee2115 Linux 3.10
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.19 (GNU/Linux)
 
 iQEcBAABAgAGBQJR0K2gAAoJEHm+PkMAQRiGWsEH+gMZSN1qRm34hZ82q1Tx7HvL
 Eb/Gsl3Qw/7G2TlTqgjBUs36IdqV9O2cui/aa3/TfXvdvrx+0GlhRkEwQPc+ygcO
 Mvoyoke4tT4+4jVFdCg1J8avREsa28/6oaHs0ZZxuVmJBBLTJH7aXaNsGn6eU1q9
 9+p798MQis6naIiPC63somlZcCIiBhsuWCPWpEfLMn8G1HWAFTM3xXIbNBqe/brS
 bmIOfhomlIZ5dcdaXGvjtP3+KJhkNDwhkPC4tVYu8JqqgSlrE+a+EGyEuuGqKk10
 U+swiqyuD31uBI9ga54u/2FzSqDiAu6YOcMXevjo/m3g9XLdYbYLvN+nvN8alCQ=
 =Ob6Z
 -----END PGP SIGNATURE-----

Merge tag 'v3.10' into drm-intel-fixes

Backmerge Linux 3.10 to get at

commit 19b2dbde57
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Wed Jun 12 10:15:12 2013 +0100

    drm/i915: Restore fences after resume and GPU resets

That commit is not in my current -fixes pile since that's based on my
-next queue for 3.11. And the above mentioned fix was merged really
late into 3.10 (and blew up, bad me) so was on a diverging branch.

Option B would have been to rebase my current pile of fixes onto
Dave's drm-fixes branch. But since some of the patches here are a bit
tricky I've decided not to void all the testing by moving over the
entire merge window.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-18 12:03:29 +02:00
Ben Widawsky 2f63315692 drm/i915: Create VMAs
Formerly: "drm/i915: Create VMAs (part 1)"

In a previous patch, the notion of a VM was introduced. A VMA describes
an area of part of the VM address space. A VMA is similar to the concept
in the linux mm. However, instead of representing regular memory, a VMA
is backed by a GEM BO. There may be many VMAs for a given object, one
for each VM the object is to be used in. This may occur through flink,
dma-buf, or a number of other transient states.

Currently the code depends on only 1 VMA per object, for the global GTT
(and aliasing PPGTT). The following patches will address this and make
the rest of the infrastructure more suited

v2: s/i915_obj/i915_gem_obj (Chris)

v3: Only move an object to the now global unbound list if there are no
more VMAs for the object which are bound into a VM (ie. the list is
empty).

v4: killed obj->gtt_space
some reworks due to rebase

v5: Free vma on error path (Imre)

v6: Another missed vma free in i915_gem_object_bind_to_gtt error path
(Imre)
Fixed vma freeing in stolen preallocation (Imre)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Imre Deak <imre.deak@intel.com>
[danvet: Squash in fixup from Ben to not deref a non-existing vma in
set_cache_level, reported by Chris.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-18 08:46:13 +02:00
Ben Widawsky 5cef07e162 drm/i915: Move active/inactive lists to new mm
Shamelessly manipulated out of Daniel :-)
"When moving the lists around explain that the active/inactive stuff is
used by eviction when we run out of address space, so needs to be
per-vma and per-address space. Bound/unbound otoh is used by the
shrinker which only cares about the amount of memory used and not one
bit about in which address space this memory is all used in. Of course
to actual kick out an object we need to unbind it from every address
space, but for that we have the per-object list of vmas."

v2: Leave the bound list as a global one. (Chris, indirectly)

v3: Rebased with no i915_gtt_vm. In most places I added a new *vm local,
since it will eventually be replaces by a vm argument.
Put comment back inline, since it no longer makes sense to do otherwise.

v4: Rebased on hangcheck/error state movement

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-17 22:24:32 +02:00
Ben Widawsky 93bd8649db drm/i915: Put the mm in the parent address space
Every address space should support object allocation. It therefore makes
sense to have the allocator be part of the "superclass" which GGTT and
PPGTT will derive.

Since our maximum address space size is only 2GB we're not yet able to
avoid doing allocation/eviction; but we'd hope one day this becomes
almost irrelvant.

v2: Rebased

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-17 22:23:43 +02:00
Ben Widawsky 853ba5d223 drm/i915: Move gtt and ppgtt under address space umbrella
The GTT and PPGTT can be thought of more generally as GPU address
spaces. Many of their actions (insert entries), state (LRU lists), and
many of their characteristics (size) can be shared. Do that.

The change itself doesn't actually impact most of the VMA/VM rework
coming up, it just fits in with the grand scheme of abstracting the GPU
VM operations. GGTT will usually be a special case where we either know
an object must be in the GGTT (dislay engine, workarounds, etc.).

The scratch page is left as part of the VM (even though it's currently
shared with the ppgtt code) because in the future when we have Full
PPGTT, I intend to create a separate scratch page for each.

v2: Drop usage of i915_gtt_vm (Daniel)
Make cleanup also part of the parent class (Ben)
Modified commit msg
Rebased

v3: Properly share scratch page (Imre)
Finish commit message (Daniel, Imre)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-17 22:21:47 +02:00
Dave Airlie 6bd2cab2c1 Merge tag 'drm-intel-fixes-2013-07-11' of git://people.freedesktop.org/~danvet/drm-intel
One feature latecomer, I've forgotten to merge the patch to reeanble the
Haswell power well feature now that the audio interaction is fixed up.
Since that was the only unfixed issue with it I've figured I could throw
it in a bit late, and it's trivial to revert in case I'm wrong.

Otherwise all bug/regression fixes:
- Fix status page reinit after gpu hangs, spotted by more paranoid igt
  checks.
- Fix object list walking fumble regression in the shrinker (only the
  counting part, the actual shrinking code was correct so no Oops
  potential), from Xiong Zhang.
- Fix DP 1.2 bw limits (Imre).
- Restore legacy forcewake on ivb, too many broken biosen out there. We
  dump a warn though that recent userspace might fall over with that
  config (Guenter Roeck).
- Patch up the gen2 cs tlb w/a.
- Improve the fence coherency w/a now that we have a better understanding
  what's going on. The removed wbinvd+ipi should make -rt folks happy. Big
  thanks to Jon Bloomfield for figuring this out, patches from Chris.
- Fix write-read race when switching ring (Chris). Spotted with code
  inspection, but now we also have an igt for it.

There's an ugly regression we're still working on introduced between
3.10-rc7 and 3.10.0. Unfortunately we can't just revert the offender since
that one fixes another regression :( I've asked Steven to include my
-fixes branch into linux-next to prevent such fallout in the future,
hopefully.

* tag 'drm-intel-fixes-2013-07-11' of git://people.freedesktop.org/~danvet/drm-intel:
  Revert "drm/i915: Workaround incoherence between fences and LLC across multiple CPUs"
  drm/i915: Fix incoherence with fence updates on Sandybridge+
  drm/i915: Fix write-read race with multiple rings
  Partially revert "drm/i915: unconditionally use mt forcewake on hsw/ivb"
  drm/i915: fix lane bandwidth capping for DP 1.2 sinks
  drm/i915: fix up ring cleanup for the i830/i845 CS tlb w/a
  drm/i915: Correct obj->mm_list link to dev_priv->dev_priv->mm.inactive_list
  drm/i915: switch disable_power_well default value to 1
  drm/i915: reinit status page registers after gpu reset
2013-07-17 08:40:49 +10:00
Mika Kuoppala 10cd45b6e8 drm/i915: introduce i915_queue_hangcheck
To run hangcheck in near future.

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-16 12:44:02 +02:00
Ben Widawsky 59124506ba drm/i915: store eLLC size
The eLLC cannot be determined by PCIID because as far as we know, even
machines supporting eLLC may not have it enabled, or fused off or
whatever. It's possible this isn't actually true, and at that point we
can switch to a DEV_INFO flag instead.

I've defined everything where the docs are clear, and left the rest as
magic.

But we need it before we set the pte_encode function pointers, which
happens really early, in gtt_init.

The problem with just doing the normal sequence earlier is we don't have
the ability to use forcewake until after the pte functions have been set
up.

Since all solutions are somewhat ugly (barring rewriting all the init
ordering), I've opted to do the detection really early, and the enabling
later - since the register to detect doesn't require forcewake.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
[danvet: Move dev_priv->ellc_size away from the dri1 dungeon to a nice
place right next to the l3 parity stuff. Also squash in the follow-up
commit to read out the eLLC size a bit earlier.]
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-16 08:08:21 +02:00
Ben Widawsky 05e21cc43d drm/i915: Define some of the eLLC magic
The EDRAM present register isn't really defined in the docs. It just
says check to see if it's set to 1. So I haven't defined the 1 value not
knowing what it actually means.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-16 08:00:52 +02:00
Chris Wilson 46a0b638f3 Revert "drm/i915: Workaround incoherence between fences and LLC across multiple CPUs"
This reverts commit 25ff119 and the follow on for Valleyview commit 2dc8aae.

commit 25ff1195f8
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Thu Apr 4 21:31:03 2013 +0100

    drm/i915: Workaround incoherence between fences and LLC across multiple CPUs

commit 2dc8aae06d
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Wed May 22 17:08:06 2013 +0100

    drm/i915: Workaround incoherence with fence updates on Valleyview

Jon Bloomfield came up with a plausible explanation and cheap fix
(drm/i915: Fix incoherence with fence updates on Sandybridge+) for the
race condition, so lets run with it.

This is a candidate for stable as the old workaround incurs a
significant cost (calling wbinvd on all CPUs before performing the
register write) for some workloads as noted by Carsten Emde.

Link: http://lists.freedesktop.org/archives/intel-gfx/2013-June/028819.html
References: https://www.osadl.org/?id=1543#c7602
References: https://bugs.freedesktop.org/show_bug.cgi?id=63825
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Carsten Emde <C.Emde@osadl.org>
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-10 15:31:12 +02:00
Chris Wilson d18b961903 drm/i915: Fix incoherence with fence updates on Sandybridge+
This hopefully fixes the root cause behind the workaround added in

commit 25ff1195f8
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Thu Apr 4 21:31:03 2013 +0100

    drm/i915: Workaround incoherence between fences and LLC across multiple CPUs

Thanks to further investigation by Jon Bloomfield, he realised that
the 64-bit register might be broken up by the hardware into two 32-bit
writes (a problem we have encountered elsewhere). This non-atomicity
would then cause an issue where a second thread would see an
intermediate register state (new high dword, old low dword), and this
register would randomly be used in preference to its own thread register.
This would cause the second thread to read from and write into a fairly
random tiled location.  Breaking the operation into 3 explicit 32-bit
updates (first disable the fence, poke the upper bits, then poke the lower
bits and enable) ensures that, given proper serialisation between the
32-bit register write and the memory transfer, that the fence value is
always consistent.

Armed with this knowledge, we can explain how the previous workaround
work. The key to the corruption is that a second thread sees an
erroneous fence register that conflicts and overrides its own. By
serialising the fence update across all CPUs, we have a small window
where no GTT access is occurring and so hide the potential corruption.
This also leads to the conclusion that the earlier workaround was
incomplete.

v2: Be overly paranoid about the order in which fence updates become
visible to the GPU to make really sure that we turn the fence off before
doing the update, and then only switch the fence on afterwards.

Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Carsten Emde <C.Emde@osadl.org>
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-10 14:41:46 +02:00
Daniel Vetter db1b76ca6a drm/i915: don't frob mm.suspended when not using ums
In kernel modeset driver mode we're in full control of the chip,
always. So there's no need at all to set mm.suspended in
i915_gem_idle. Hence move that out into the leavevt ioctl. Since
i915_gem_idle doesn't suspend gem any more we can also drop the
re-enabling for KMS in the thaw function.

Also clean up the handling of mm.suspend at driver load by coalescing
all the assignments.

Stumbled over while reading through our resume code for unrelated
reasons.

v2: Shovel mm.suspended into the (newly created) ums dungeon as
suggested by Chris Wilson. The plan is that once we've completely
stopped relying on the register save/restore code we could shovel even
that in there.

v3: Improve the locking for the entervt/leavevt ioctls a bit by moving
the dev->struct_mutex locking outside of i915_gem_idle. Also don't
clear dev_priv->ums.mm_suspended for the kms case, we allocate it with
kzalloc. Both suggested by Chris Wilson.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v2)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-10 14:30:25 +02:00
Chris Wilson 02978ff57a drm/i915: Fix write-read race with multiple rings
Daniel noticed a problem where is we wrote to an object with ring A in
the middle of a very long running batch, then executed a quick batch on
ring B before a batch that reads from the same object, its obj->ring would
now point to ring B, but its last_write_seqno would be still relative to
ring A. This would allow for the user to read from the object before the
GPU had completed the write, as set_domain would only check that ring B
had passed the last_write_seqno.

To fix this simply (and inelegantly), we bump the last_write_seqno when
switching rings so that the last_write_seqno is always relative to the
current obj->ring.

This fixes igt/tests/gem_write_read_ring_switch.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: stable@vger.kernel.org
[danvet: Add note about the newly created igt which exercises this
bug.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-10 10:41:55 +02:00
Linus Torvalds 2e17c5a97e Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linux
Pull drm updates from Dave Airlie:
 "Okay this is the big one, I was stalled on the fbdev pull req as I
  stupidly let fbdev guys merge a patch I required to fix a warning with
  some patches I had, they ended up merging the patch from the wrong
  place, but the warning should be fixed.  In future I'll just take the
  patch myself!

  Outside drm:

  There are some snd changes for the HDMI audio interactions on haswell,
  they've been acked for inclusion via my tree.  This relies on the
  wound/wait tree from Ingo which is already merged.

  Major changes:

  AMD finally released the dynamic power management code for all their
  GPUs from r600->present day, this is great, off by default for now but
  also a huge amount of code, in fact it is most of this pull request.

  Since it landed there has been a lot of community testing and Alex has
  sent a lot of fixes for any bugs found so far.  I suspect radeon might
  now be the biggest kernel driver ever :-P p.s.  radeon.dpm=1 to enable
  dynamic powermanagement for anyone.

  New drivers:

  Renesas r-car display unit.

  Other highlights:

   - core: GEM CMA prime support, use new w/w mutexs for TTM
     reservations, cursor hotspot, doc updates
   - dvo chips: chrontel 7010B support
   - i915: Haswell (fbc, ips, vecs, watermarks, audio powerwell),
     Valleyview (enabled by default, rc6), lots of pll reworking, 30bpp
     support (this time for sure)
   - nouveau: async buffer object deletion, context/register init
     updates, kernel vp2 engine support, GF117 support, GK110 accel
     support (with external nvidia ucode), context cleanups.
   - exynos: memory leak fixes, Add S3C64XX SoC series support, device
     tree updates, common clock framework support,
   - qxl: cursor hotspot support, multi-monitor support, suspend/resume
     support
   - mgag200: hw cursor support, g200 mode limiting
   - shmobile: prime support
   - tegra: fixes mostly

  I've been banging on this quite a lot due to the size of it, and it
  seems to okay on everything I've tested it on."

* 'drm-next' of git://people.freedesktop.org/~airlied/linux: (811 commits)
  drm/radeon/dpm: implement vblank_too_short callback for si
  drm/radeon/dpm: implement vblank_too_short callback for cayman
  drm/radeon/dpm: implement vblank_too_short callback for btc
  drm/radeon/dpm: implement vblank_too_short callback for evergreen
  drm/radeon/dpm: implement vblank_too_short callback for 7xx
  drm/radeon/dpm: add checks against vblank time
  drm/radeon/dpm: add helper to calculate vblank time
  drm/radeon: remove stray line in old pm code
  drm/radeon/dpm: fix display_gap programming on rv7xx
  drm/nvc0/gr: fix gpc firmware regression
  drm/nouveau: fix minor thinko causing bo moves to not be async on kepler
  drm/radeon/dpm: implement force performance level for TN
  drm/radeon/dpm: implement force performance level for ON/LN
  drm/radeon/dpm: implement force performance level for SI
  drm/radeon/dpm: implement force performance level for cayman
  drm/radeon/dpm: implement force performance levels for 7xx/eg/btc
  drm/radeon/dpm: add infrastructure to force performance levels
  drm/radeon: fix surface setup on r1xx
  drm/radeon: add support for 3d perf states on older asics
  drm/radeon: set default clocks for SI when DPM is disabled
  ...
2013-07-09 16:04:31 -07:00
Xiong Zhang 067556084a drm/i915: Correct obj->mm_list link to dev_priv->dev_priv->mm.inactive_list
obj->mm_list link to dev_priv->mm.inactive_list/active_list
obj->global_list link to dev_priv->mm.unbound_list/bound_list

This regression has been introduced in

commit 93927ca52a
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Thu Jan 10 18:03:00 2013 +0100

    drm/i915: Revert shrinker changes from "Track unbound pages"

Cc: stable@vger.kernel.org
Signed-off-by: Xiong Zhang <xiong.y.zhang@intel.com>
[danvet: Add regression notice.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-09 16:31:48 +02:00
Ben Widawsky c6cfb32567 drm/i915: Embed drm_mm_node in i915 gem obj
Embedding the node in the obj is more natural in the transition to VMAs
which will also have embedded nodes. This change also helps transition
away from put_block to remove node.

Though it's quite an uncommon occurrence, it's somewhat convenient to not
fail at bind time because we cannot allocate the node. Though in
practice there are other allocations (like the request structure) which
would probably make this point not terribly useful.

Quoting Daniel:
Note that the only difference between put_block and remove_node is
that the former fills up the preallocation cache. Which we don't need
anyway and hence is just wasted space.

v2: Clean up the stolen preallocation code.
Rebased on the reserve_node patches
renames ggtt_ stuff to gtt_ stuff
WARN_ON if the object is already bound (which doesn't mean it's in the
bound list, tricky)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-08 22:04:36 +02:00
Ben Widawsky edd41a870f drm/i915: Kill obj->gtt_offset
With the getters in place from the previous patch this members serves no
purpose other than saving one spare pointer chase, which will be killed
in the next patch anyway.

Moving to VMAs, this members adds unnecessary confusion since an object
may exist at different offsets in different VMs.

v2: Properly preserve the stolen offset. This code is a bit hacky but it
all goes away when we embed the drm_mm_node and removes the need for the
incorrect patch I submitted previously: "Use gtt_space->start for stolen
reservation"

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-08 22:04:35 +02:00
Ben Widawsky f343c5f647 drm/i915: Getter/setter for object attributes
Soon we want to gut a lot of our existing assumptions how many address
spaces an object can live in, and in doing so, embed the drm_mm_node in
the object (and later the VMA).

It's possible in the future we'll want to add more getter/setter
methods, but for now this is enough to enable the VMAs.

v2: Reworked commit message (Ben)
Added comments to the main functions (Ben)
sed -i "s/i915_gem_obj_set_color/i915_gem_obj_ggtt_set_color/" drivers/gpu/drm/i915/*.[ch]
sed -i "s/i915_gem_obj_bound/i915_gem_obj_ggtt_bound/" drivers/gpu/drm/i915/*.[ch]
sed -i "s/i915_gem_obj_size/i915_gem_obj_ggtt_size/" drivers/gpu/drm/i915/*.[ch]
sed -i "s/i915_gem_obj_offset/i915_gem_obj_ggtt_offset/" drivers/gpu/drm/i915/*.[ch]
(Daniel)

v3: Rebased on new reserve_node patch
Changed DRM_DEBUG_KMS to actually work (will need fixing later)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-08 22:04:34 +02:00
Chris Wilson d26e3af842 drm/i915: Refactor the wait_rendering completion into a common routine
Harmonise the completion logic between the non-blocking and normal
wait_rendering paths, and move that logic into a common function.

In the process, we note that the last_write_seqno is by definition the
earlier of the two read/write seqnos and so all successful waits will
have passed the last_write_seqno. Therefore we can unconditionally clear
the write seqno and its domains in the completion logic.

v2: Add the missing ring parameter, because sometimes it is good to have
things compile.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-01 11:15:01 +02:00
Chris Wilson daa13e1ca5 drm/i915: Only clear write-domains after a successful wait-seqno
In the introduction of the non-blocking wait, I cut'n'pasted the wait
completion code from normal locked path. Unfortunately, this neglected
that the normal path returned early if the wait returned early. The
result is that read-only waits may return whilst the GPU is still
writing to the bo.

Fixes regression from
commit 3236f57a01 [v3.7]
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Fri Aug 24 09:35:09 2012 +0100

    drm/i915: Use a non-blocking wait for set-to-domain ioctl

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66163
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-01 11:15:00 +02:00
Jani Nikula 3765f30486 drm/i915: fix build warning on format specifier mismatch
drivers/gpu/drm/i915/i915_gem.c: In function ‘i915_gem_object_bind_to_gtt’:
drivers/gpu/drm/i915/i915_gem.c:3002:3: warning: format ‘%ld’ expects
argument of type ‘long int’, but argument 5 has type ‘size_t’ [-Wformat]

v2: Use %zu instead of %d. Two char patch, and 100% wrong. (Ville)

Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-01 11:14:43 +02:00
Konrad Rzeszutek Wilk 1625e7e549 drm/i915: make compact dma scatter lists creation work with SWIOTLB backend.
Git commit 90797e6d1e
("drm/i915: create compact dma scatter lists for gem objects") makes
certain assumptions about the under laying DMA API that are not always
correct.

On a ThinkPad X230 with an Intel HD 4000 with Xen during the bootup
I see:

[drm:intel_pipe_set_base] *ERROR* pin & fence failed
[drm:intel_crtc_set_config] *ERROR* failed to set mode on [CRTC:3], err = -28

Bit of debugging traced it down to dma_map_sg failing (in
i915_gem_gtt_prepare_object) as some of the SG entries were huge (3MB).

That unfortunately are sizes that the SWIOTLB is incapable of handling -
the maximum it can handle is a an entry of 512KB of virtual contiguous
memory for its bounce buffer. (See IO_TLB_SEGSIZE).

Previous to the above mention git commit the SG entries were of 4KB, and
the code introduced by above git commit squashed the CPU contiguous PFNs
in one big virtual address provided to DMA API.

This patch is a simple semi-revert - were we emulate the old behavior
if we detect that SWIOTLB is online. If it is not online then we continue
on with the new compact scatter gather mechanism.

An alternative solution would be for the the '.get_pages' and the
i915_gem_gtt_prepare_object to retry with smaller max gap of the
amount of PFNs that can be combined together - but with this issue
discovered during rc7 that might be too risky.

Reported-and-Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
CC: Chris Wilson <chris@chris-wilson.co.uk>
CC: Imre Deak <imre.deak@intel.com>
CC: Daniel Vetter <daniel.vetter@ffwll.ch>
CC: David Airlie <airlied@linux.ie>
CC: <dri-devel@lists.freedesktop.org>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-07-01 11:14:42 +02:00
Dave Airlie 28419261b0 Merge tag 'drm-intel-next-2013-06-18' of git://people.freedesktop.org/~danvet/drm-intel into drm-next
Last 3.11 feature pull. I have a few odds bits and pieces and fixes in my
queue, I'll sort them out later on to see what's for 3.11-fixes and what's
for 3.12. But nothing to hold this here up imo.

Highlights:
- more hangcheck work from Mika and Chris to prepare for arb robustness
- trickle feed fixes from Ville
- first parts of the shared pch pll rework, with some basic hw state
  readout and cross-checking (this shuts up the confused pch pll refcount
  WARN that Linus just recently forwarded)
- Haswell audio power well support from Wang Xingchao (alsa bits acked by
  Takashi)
- some cleanups and asserts sprinkling around the plane/gamma enabling
  sequence from Ville
- more gtt refactoring from Ben
- clear up the adjusted->mode vs. pixel clock vs. port clock confusion
- 30bpp support, this time for real hopefully

* tag 'drm-intel-next-2013-06-18' of git://people.freedesktop.org/~danvet/drm-intel: (97 commits)
  drm/i915: remove a superflous semi-colon
  drm/i915: Kill useless "Enable panel fitter" comments
  drm/i915: Remove extra "ring" from error message
  drm/i915: simplify the reduced clock handling for pch plls
  drm/i915: stop killing pfit on i9xx
  drm/i915: explicitly set up PIPECONF (and gamma table) on haswell
  drm/i915: set up PIPECONF explicitly for i9xx/vlv platforms
  drm/i915: set up PIPECONF explicitly on ilk-ivb
  drm/i915: find guilty batch buffer on ring resets
  drm/i915: store ring hangcheck action
  drm/i915: add batch bo to i915_add_request()
  drm/i915: change i915_add_request to macro
  drm/i915: add i915_gem_context_get_hang_stats()
  drm/i915: add struct i915_ctx_hang_stats
  drm/i915: Try harder to disable trickle feed on VLV
  drm/i915: fix up pch pll enabling for pixel multipliers
  drm/i915: hw state readout and cross-checking for shared dplls
  drm/i915: WARN on lack of shared dpll
  drm/i915: split up intel_modeset_check_state
  drm/i915: extract readout_hw_state from setup_hw_state
  ...

Conflicts:
	drivers/gpu/drm/i915/intel_display.c
	drivers/gpu/drm/i915/intel_fb.c
	drivers/gpu/drm/i915/intel_sdvo.c
2013-06-28 09:50:34 +10:00
Dave Airlie 4300a0f8bd Linux 3.10-rc7
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.13 (GNU/Linux)
 
 iQEbBAABAgAGBQJRxf9cAAoJEHm+PkMAQRiGMWkH911xM4gRmFgE7SqVW4F4AWBm
 ngcqMqNy9IdqKfibORUUDvVfEa5gjD5ai2quIKpfQiaukbpQJ696H90ijuAkajLn
 DQBrN243s0pzhhc/quWINnWxsFQ613JjdUMUMaD7e9A1aKjYzWrPGt/tSjrFXGCP
 tArTupVzc/iOmnEQDKiROI/Nokq44QJ36aTGPM7n08xMtpKmkCXM+9/UosBteB0O
 HVI33dmjwz7i55fI53XAWyuZCE+gSEnA4z8spJ9LfXso2W14V+roc+GuL6OyeeTI
 pCn/+4niVPb4B0ROZlpyVmdZjbPPcMMEK5o+BSJI68SH6LHZTQh2iVuqYfpSyA==
 =uUH5
 -----END PGP SIGNATURE-----

Merge tag 'v3.10-rc7' into drm-next

Linux 3.10-rc7

The sdvo lvds fix in this -fixes pull

commit c3456fb3e4
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Mon Jun 10 09:47:58 2013 +0200

    drm/i915: prefer VBT modes for SVDO-LVDS over EDID

has a silent functional conflict with

commit 990256aec2
Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
Date:   Fri May 31 12:17:07 2013 +0000

    drm: Add probed modes in probe order

in drm-next. W simply need to add the vbt modes before edid modes, i.e. the
other way round than now.

Conflicts:
	drivers/gpu/drm/drm_prime.c
	drivers/gpu/drm/i915/intel_sdvo.c
2013-06-27 20:40:44 +10:00
Konrad Rzeszutek Wilk 426729dcc7 drm/i915: make compact dma scatter lists creation work with SWIOTLB backend.
Git commit 90797e6d1e
("drm/i915: create compact dma scatter lists for gem objects") makes
certain assumptions about the under laying DMA API that are not always
correct.

On a ThinkPad X230 with an Intel HD 4000 with Xen during the bootup
I see:

[drm:intel_pipe_set_base] *ERROR* pin & fence failed
[drm:intel_crtc_set_config] *ERROR* failed to set mode on [CRTC:3], err = -28

Bit of debugging traced it down to dma_map_sg failing (in
i915_gem_gtt_prepare_object) as some of the SG entries were huge (3MB).

That unfortunately are sizes that the SWIOTLB is incapable of handling -
the maximum it can handle is a an entry of 512KB of virtual contiguous
memory for its bounce buffer. (See IO_TLB_SEGSIZE).

Previous to the above mention git commit the SG entries were of 4KB, and
the code introduced by above git commit squashed the CPU contiguous PFNs
in one big virtual address provided to DMA API.

This patch is a simple semi-revert - were we emulate the old behavior
if we detect that SWIOTLB is online. If it is not online then we continue
on with the new compact scatter gather mechanism.

An alternative solution would be for the the '.get_pages' and the
i915_gem_gtt_prepare_object to retry with smaller max gap of the
amount of PFNs that can be combined together - but with this issue
discovered during rc7 that might be too risky.

Reported-and-Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
CC: Chris Wilson <chris@chris-wilson.co.uk>
CC: Imre Deak <imre.deak@intel.com>
CC: Daniel Vetter <daniel.vetter@ffwll.ch>
CC: David Airlie <airlied@linux.ie>
CC: <dri-devel@lists.freedesktop.org>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2013-06-25 10:39:57 +10:00
Chris Wilson 19b2dbde57 drm/i915: Restore fences after resume and GPU resets
Stéphane Marchesin found that fences for pinned objects (i.e. the
scanout) were not being restored upon resume, leading to corruption on
the display and reference counting issues. This is due to a bug in

commit 312817a39f [2.6.38]
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Mon Nov 22 11:50:11 2010 +0000

    drm/i915: Only save and restore fences for UMS

that zapped the pinned fences even though they were in use.
Fortuitously, whilst we forced a VT switch during suspend and resume,
no fences were ever pinned at the time. However, we now can do
switchless S3 transitions and so the old bug finally surfaces.

Reported-by: Stéphane Marchesin <marcheu@chromium.org>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Stéphane Marchesin <marcheu@chromium.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-06-16 01:10:45 +02:00
Mika Kuoppala aa60c664e6 drm/i915: find guilty batch buffer on ring resets
After hang check timer has declared gpu to be hung,
rings are reset. In ring reset, when clearing
request list, do post mortem analysis to find out
the guilty batch buffer.

Select requests for further analysis by inspecting
the completed sequence number which has been updated
into the HWS page. If request was completed, it can't
be related to the hang.

For noncompleted requests mark the batch as guilty
if the ring was not waiting and the ring head was
stuck inside the buffer object or in the flush region
right after the batch. For everything else, mark
them as innocents.

v2: Fixed a typo in commit message (Ville Syrjälä)

v3: - more descriptive function parameters (Chris Wilson)
    - use masked head address when inspecting if request is in ring
    - s/hangcheck.last_action/hangcheck.action
    - added comment about unmasked head hitting batch_obj range

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Acked-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-06-13 17:42:17 +02:00
Mika Kuoppala 7d736f4f0b drm/i915: add batch bo to i915_add_request()
In order to track down a batch buffer and context which
caused the ring to hang, store reference to bo into the request struct.
Request can also cause gpu to hang after the batch in the flush section
in the ring. To detect this add start of the flush portion offset into the
request.

v2: Included comment about request vs batch_obj lifetimes (Chris Wilson)

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-06-13 17:42:16 +02:00
Mika Kuoppala 0025c0772d drm/i915: change i915_add_request to macro
Only execbuffer needed all the parameters on i915_add_request().
By putting __i915_add_request behind macro, all current callsites
become cleaner. Following patch will introduce a new parameter
for __i915_add_request. With this patch, only the relevant callsite
will reflect the change making commit smaller and easier to understand.

v2: _i915_add_request as function name (Chris Wilson)

v3: change name __i915_add_request and fix ordering of params (Ben Widawsky)

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-06-13 17:42:15 +02:00
Dave Airlie e6dfcc5303 Merge tag 'drm-intel-next-2013-06-01' of git://people.freedesktop.org/~danvet/drm-intel into drm-next
Daniel writes:
Another round of drm-intel-next for 3.11. Highlights:
- Haswell IPS support (Paulo Zanoni)
- VECS support on Haswell (Ben Widawsky, Xiang Haihao, ...)
- Haswell watermark fixes (Paulo Zanoni)
- "Make the gun bigger again" multithread fence fix from Chris.
- i915_error_state finnally no longer fails with -ENOMEM! Big thanks to
  Mika for tackling this.
- vlv sideband locking fixes from Jani
- Hangcheck prep work for arb_robustness support (Mika&Chris)
- edp vs cpu port confusion clean-up from Imre
- pile of smaller fixes and cleanups all over.

* tag 'drm-intel-next-2013-06-01' of git://people.freedesktop.org/~danvet/drm-intel: (70 commits)
  drm/i915: add i915_ips_status debugfs entry
  drm/i915: add enable_ips module option
  drm/i915: implement IPS feature
  drm/i915: fix up the edp power well check
  drm/i915: add I915_PARAM_HAS_VEBOX to i915_getparam
  drm/i915: add I915_EXEC_VEBOX to i915_gem_do_execbuffer()
  drm/i915: add VEBOX into debugfs
  drm/i915: Enable vebox interrupts
  drm/i915: vebox interrupt get/put
  drm/i915: consolidate interrupt naming scheme
  drm/i915: Convert irq_refounct to struct
  drm/i915: make PM interrupt writes non-destructive
  drm/i915: Add PM regs to pre/post install
  drm/i915: Create an ivybridge_irq_preinstall
  drm/i915: Create a more generic pm handler for hsw+
  drm/i915: add support for 5/6 data buffer partitioning on Haswell
  drm/i915: properly set HSW WM_LP watermarks
  drm/i915: properly set HSW WM_PIPE registers
  drm/i915: fix pch_nop support
  drm/i915: Vebox ringbuffer init
  ...
2013-06-11 08:38:56 +10:00
Daniel Vetter 7abb690a0e drm/i915: Fix spurious -EIO/SIGBUS on wedged gpus
Chris Wilson noticed that since

commit 1f83fee08d [v3.9]
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Thu Nov 15 17:17:22 2012 +0100

    drm/i915: clear up wedged transitions

X can again get -EIO when it does not expect it. And even worse score
a SIGBUS when accessing gtt mmaps. The established ABI is that we
_only_ return an -EIO from execbuf - all other ioctls should just
work. And since the reset code moves all bos out of gpu domains and
clears out all the last_seqno/ring tracking there really shouldn't be
any reason for non-execbuf code to ever touch the hw and see an -EIO.

After some extensive discussions we've noticed that these spurios -EIO
are caused by i915_gem_wait_for_error:

http://www.mail-archive.com/intel-gfx@lists.freedesktop.org/msg20540.html

That is easy to fix by returning 0 instead of -EIO, since grabbing the
dev->struct_mutex does not yet mean that we actually want to touch the
hw. And so there is no reason at all to fail with -EIO.

But that's not the entire since, since often (at least it's easily
googleable) dmesg indicates that the reset fails and we declare the
gpu wedged. Then, quite a bit later X wakes up with the "Timed out
waiting for the gpu reset to complete" DRM_ERROR message in
wait_for_errror and brings down the desktop with an -EIO/SIGBUS.

So clearly we're missing a wakeup somewhere, since the gpu reset just
doesn't take 10 seconds to complete. And indeed we're do handle the
terminally wedged state wrong.

Fix this all up.

References: https://bugs.freedesktop.org/show_bug.cgi?id=63921
References: https://bugs.freedesktop.org/show_bug.cgi?id=64073
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Damien Lespiau <damien.lespiau@intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-06-03 14:35:18 +02:00
Ben Widawsky 35c20a60c7 drm/i915: Rename the gtt_list to global_list
Since it will be used for the global bound/unbound list with full PPGTT,
this helps clarify things for upcoming code rework.

Recommended-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-06-03 10:51:14 +02:00
Ben Widawsky 401c29f607 drm/i915: unpin pages at unbind
If we properly keep track of the pages_pin_count, then when we later add
multiple address spaces, the put_pages doesn't need any special checks
to be able to perform it's job.

CC: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Rebased on top of the fix for stolen memory pinning.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-06-03 10:50:22 +02:00
Ben Widawsky 1d64ae719b drm/i915: Unpin stolen pages
The way the stolen handling works is we take a pin on the backing pages,
but we never actually get a reference to the bo. On freeing objects
allocated with stolen memory, the final unref will end up freeing the
object with pinned pages count left. To enable an assertion to catch
bugs in this code path, this patch cleans up that remaining pin.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-06-03 10:49:08 +02:00
Ben Widawsky 9a8a2213a7 drm/i915: Vebox ringbuffer init
v2: Add set_seqno which didn't exist before rebase (Haihao)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Xiang, Haihao <haihao.xiang@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-05-31 20:54:12 +02:00
Ben Widawsky 0a9ae0d7f8 drm/i915: pre-fixes for checkpatch
Since I'll need to modify i915_gem_object_bind_to_gtt(), fix the errors
now to get checkpatch to not complain.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
[danvet: Resolve conflict with Chris' improved debug output, and
bikeshed the new variable with s/max/gtt_max/ a bit while at it.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-05-31 20:53:56 +02:00
Dave Airlie e81f3d81e2 Merge tag 'drm-intel-next-2013-05-20-merged' of git://people.freedesktop.org/~danvet/drm-intel into drm-next
Daniel writes:
Highlights (copy-pasted from my testing cycle mails):
- fbc support for Haswell (Rodrigo)
- streamlined workaround comments, including an igt tool to grep for
  them (Damien)
- sdvo and TV out cleanups, including a fixup for sdvo multifunction devices
- refactor our eDP mess a bit (Imre)
- don't register the hdmi connector on haswell when desktop eDP is present
- vlv support is no longer preliminary!
- more vlv fixes from Jesse for stolen and dpll handling
- more flexible power well checking infrastructure from Paulo
- a few gtt patches from Ben
- a bit of OCD cleanups for transcoder #defines and an assorted pile
  of smaller things.
- fixes for the gmch modeset sequence
- a bit of OCD around plane/pipe usage (Ville)
- vlv turbo support (Jesse)
- tons of vlv modeset fixes (Jesse et al.)
- vlv pte write fixes (Kenneth Graunke)
- hpd filtering to avoid costly probes on unaffected outputs (Egbert Eich)
- intel dev_info cleanups and refactorings (Damien)
- vlv rc6 support (Jesse)
- random pile of fixes around non-24bpp modes handling
- asle/opregion cleanups and locking fixes (Jani)
- dp dpll refactoring
- improvements for reduced_clock computation on g4x/ilk+
- pfit state refactored to use pipe_config (Jesse)
- lots more computed modeset state moved to pipe_config, including readout
  and cross-check support
- fdi auto-dithering for ivb B/C links, using the neat pipe_config
  improvements
- drm_rect helpers plus sprite clipping fixes (Ville)
- hw context refcounting (Mika + Ben)

* tag 'drm-intel-next-2013-05-20-merged' of git://people.freedesktop.org/~danvet/drm-intel: (155 commits)
  drm/i915: add support for dvo Chrontel 7010B
  drm/i915: Use pipe config state to control gmch pfit enable/disable
  drm/i915: Use pipe_config state to disable ilk+ pfit
  drm/i915: panel fitter hw state readout&check support
  drm/i915: implement WADPOClockGatingDisable for LPT
  drm/i915: Add missing platform tags to FBC workaround comments
  drm/i915: rip out an unused lvds_reg variable
  drm/i915: Compute WR PLL dividers dynamically
  drm/i915: HSW FBC WaFbcDisableDpfcClockGating
  drm/i915: HSW FBC WaFbcAsynchFlipDisableFbcQueue
  drm/i915: Enable FBC at Haswell.
  drm/i915: IVB FBC WaFbcDisableDpfcClockGating
  drm/i915: IVB FBC WaFbcAsynchFlipDisableFbcQueue
  drm/i915: Add support for FBC on Ivybridge.
  drm/i915: Organize VBT stuff inside drm_i915_private
  drm/i915: make SDVO TV-out work for multifunction devices
  drm/i915: rip out now unused is_foo tracking from crtc code
  drm/i915: rip out TV-out lore ...
  drm/i915: drop TVclock special casing on ilk+
  drm/i915: move sdvo TV clock computation to intel_sdvo.c
  ...
2013-05-31 12:56:05 +10:00
Chris Wilson 2dc8aae06d drm/i915: Workaround incoherence with fence updates on Valleyview
In commit 25ff1195f8
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Thu Apr 4 21:31:03 2013 +0100

    drm/i915: Workaround incoherence between fences and LLC across multiple CPUs

we introduced an empirical workaround for memory corruption when using
fences from multiple CPUs. At the time, we did not have any results for
Valleyview, so the presumption was that it was limited to recent
generations using LLC. Now we have evidence that Valleyview also suffers
incoherence and requires a similar but different workaround. For
Valleyview, the wbinvd instruction is insufficient and we require the
serialising register write per-CPU. Conversely, that serialising
register write is not enough for SNB/IVB/HSW. To compromise and keep the
code relatively clean, employ both serialisation techniques in the same
workaround.

Reported-by: Jon Bloomfield <jon.bloomfield@intel.com>
Tested-by: Jon Bloomfield <jon.bloomfield@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=62191
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-05-23 12:51:31 +02:00
Chris Wilson a36689cb77 drm/i915: Be more informative when reporting "too large for aperture" error
This should help debugging the truly unexpected cases where it occurs -
in particular to see which value is garbage.

References: https://bugzilla.kernel.org/show_bug.cgi?id=58511
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: s/%ld/%zd/ as spotted by Wu Fengguang's autobuilder.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-05-23 12:51:29 +02:00
Imre Deak e054cc3937 drm/i915: avoid premature timeouts in __wait_seqno()
At the moment wait_event_timeout/wait_event_interruptible_timeout may
time out 1 jiffy too early, as the calculated expiry time is 1 less than
needed. Besides timing out too early this also means that the
calculation of the remaining time will be incorrect and we will pass a
non-zero remaining time to user space in case of a time out. This is one
reason for the following bugzilla report:

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64270

Signed-off-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-05-22 13:51:23 +02:00
Daniel Vetter e1b73cba13 Linux 3.10-rc2
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.19 (GNU/Linux)
 
 iQEcBAABAgAGBQJRmpexAAoJEHm+PkMAQRiGrRIH/1uWFW38RvaCV/PXm/ia6Z+x
 BfBJfBIvPxGwb4n7aQNQlhU25xkfrPZ6szO4WiBH5/KPH3xYi2I2OZ1AzffkYqMF
 BWkPmsPK6EsTdp16zsi6JtH2aXArG4SpYA7ZamPvDkmfigHuiZg7GlL/9eHTRPNV
 P7Q8JToOrcnP8RoGgNj0uFiQeQbc62Kmoq7WuPtUhVlpQCCCknXgOJiYgz9w6Xe9
 /i79YFS8WRrzAquExT1NbIOh4ZMqB9MvuroaVWy8JDDLUyz7QUvOCe3tCDNguwgi
 FdWvU6nfkdQq5SLaWCWXDE9Rp/pL1MvfBn9vCOwFcp42aw0aQ0PgJVIXvsqufd0=
 =jgDI
 -----END PGP SIGNATURE-----

Merge tag 'v3.10-rc2' into drm-intel-next-queued

Backmerge Linux 3.10-rc2 since the various (rather trivial) conflicts
grew a bit out of hand. intel_dp.c has the only real functional
conflict since the logic changed while dev_priv->edp.bpp was moved
around.

Also squash in a whitespace fixup from Ben Widawsky for
i915_gem_gtt.c, git seems to do something pretty strange in there
(which I don't fully understand tbh).

Conflicts:
	drivers/gpu/drm/i915/i915_reg.h
	drivers/gpu/drm/i915/intel_dp.c

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-05-21 09:52:16 +02:00
Mika Kuoppala 0e50e96bf2 drm/i915: add context into request struct
Storing context reference into request struct
allows us to inspect context and its associated
objects when requests are retired.

Both ppgtt and arb robustness work will need
this.

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-05-06 11:21:51 +02:00
Chris Wilson 4f42f4ef0d drm/i915: Always normalize return timeout for wait_timeout_ioctl
As we recompute the remaining timeout after waiting, there is a
potential for that timeout to be less than zero and so need sanitizing.
The timeout is always returned to userspace and validated, so we should
always perform the sanitation.

v2 [vsyrjala]: Only normalize the timespec if it's invalid
v3: Add a comment to clarify the situation and remove the now
    useless WARN_ON() (ickle)

Cc: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Tested-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-04-30 10:50:32 +02:00
Ville Syrjälä 42b5aeabe9 drm/i915: IVB/HSW have 32 fence register
Increase the number of fence registers to 32 on IVB/HSW. VLV however
only has 16 fence registers according to the docs.

Increasing the number of fences was attempted before [1], but there was
some uncertainty about the maximum CPU fence number for FBC. Since then
BSpec has been updated to state that there are in fact 32 fence registers,
and the CPU fence number field in the SNB_DPFC_CTL_SA register is 5 bits,
and the CPU fence number field in the ILK_DPFC_CONTROL register must be
zero. So now it all makes sense.

[1] http://lists.freedesktop.org/archives/intel-gfx/2011-October/012865.html

v2: Include some background information based on the previous attempt

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-04-18 09:43:21 +02:00
Ben Widawsky b7c36d2546 drm/i915: Allow PPGTT enable to fail
I'm really not happy that we have to support this, but this will be the
simplest way to handle cases where PPGTT init can fail, which I promise
will be coming in the future.

v2: Resolve conflicts due to patch series reordering.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v1)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-04-18 09:43:16 +02:00
Ben Widawsky 6197349bde drm/i915: Abstract PPGTT enabling
Since we've already set up a nice vtable to abstract other PPGTT
functions, also abstract the actual register programming to enable
things.

This function will probably need to change a bit as we implement real
processes.

v2: Resolve conflicts due to patch series reordering.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v1)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-04-18 09:43:15 +02:00
Chris Wilson 25ff1195f8 drm/i915: Workaround incoherence between fences and LLC across multiple CPUs
In order to fully serialize access to the fenced region and the update
to the fence register we need to take extreme measures on SNB+, and
manually flush writes to memory prior to writing the fence register in
conjunction with the memory barriers placed around the register write.

Fixes i-g-t/gem_fence_thrash

v2: Bring a bigger gun
v3: Switch the bigger gun for heavier bullets (Arjan van de Ven)
v4: Remove changes for working generations.
v5: Reduce to a per-cpu wbinvd() call prior to updating the fences.
v6: Rewrite comments to ellide forgotten history.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=62191
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Tested-by: Jon Bloomfield <jon.bloomfield@intel.com> (v2)
Cc: stable@vger.kernel.org
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-04-18 09:43:10 +02:00
Ben Widawsky 88a2b2a32d drm/i915: Don't wait for PCH on reset
BIOS should be setting this, but in case it doesn't...

v2: Define the bits we actually want to clear (Jesse)
Make it an RMW op (Jesse)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-04-08 20:53:05 +02:00
Imre Deak 2db76d7c3c lib/scatterlist: sg_page_iter: support sg lists w/o backing pages
The i915 driver uses sg lists for memory without backing 'struct page'
pages, similarly to other IO memory regions, setting only the DMA
address for these. It does this, so that it can program the HW MMU
tables in a uniform way both for sg lists with and without backing pages.

Without a valid page pointer we can't call nth_page to get the current
page in __sg_page_iter_next, so add a helper that relevant users can
call separately. Also add a helper to get the DMA address of the current
page (idea from Daniel).

Convert all places in i915, to use the new API.

Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-03-27 17:13:44 +01:00
Chris Wilson f9c513e9d6 drm/i915: Always call fence-lost prior to removing the fence
There is a minute window for a race between put-fence removing the fence
and for a new transaction by an external party on the GTT mmap. That is
we must zap the mmap prior to removing the fence and not afterwards.

Fixes regression from
commit 61050808bb
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Tue Apr 17 15:31:31 2012 +0100

    drm/i915: Refactor put_fence() to use the common fence writing routine

v2: Remember the fence to remove with a local variable (gcc)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-03-26 20:16:18 +01:00
Imre Deak 90797e6d1e drm/i915: create compact dma scatter lists for gem objects
So far we created a sparse dma scatter list for gem objects, where each
scatter list entry represented only a single page. In the future we'll
have to handle compact scatter lists too where each entry can consist of
multiple pages, for example for objects imported through PRIME.

The previous patches have already fixed up all other places where the
i915 driver _walked_ these lists. Here we have the corresponding fix to
_create_ compact lists. It's not a performance or memory footprint
improvement, but it helps to better exercise the new logic.

Reference: http://www.spinics.net/lists/dri-devel/msg33917.html
Signed-off-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-03-23 12:17:09 +01:00
Imre Deak 67d5a50c04 drm/i915: handle walking compact dma scatter lists
So far the assumption was that each dma scatter list entry contains only
a single page. This might not hold in the future, when we'll introduce
compact scatter lists, so prepare for this everywhere in the i915 code
where we walk such a list.

We'll fix the place _creating_ these lists separately in the next patch
to help the reviewing/bisectability.

Reference: http://www.spinics.net/lists/dri-devel/msg33917.html
Signed-off-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-03-23 12:16:36 +01:00