Commit graph

3619 commits

Author SHA1 Message Date
Dmitrii Tcvetkov 52dfcc5ccf drm/nouveau: fix for disabled fbdev emulation
Hello,

after this commit:

commit f045f459d9
Author: Ben Skeggs <bskeggs@redhat.com>
Date:   Thu Jun 2 12:23:31 2016 +1000
    drm/nouveau/fbcon: fix out-of-bounds memory accesses

kernel started to oops when loading nouveau module when using GTX 780 Ti
video adapter. This patch fixes the problem.

Bug report: https://bugzilla.kernel.org/show_bug.cgi?id=120591

Signed-off-by: Dmitrii Tcvetkov <demfloro@demfloro.ru>
Suggested-by: Ilia Mirkin <imirkin@alum.mit.edu>
Fixes: f045f459d9 ("nouveau_fbcon_init()")
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Cc: stable@vger.kernel.org
2016-06-24 07:51:32 +10:00
Ben Skeggs 6aa85f1129 drm/nouveau/iccsense: fix memory leak
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-06-15 16:18:28 +10:00
Robin Murphy 539aae6e3a drm/nouveau/Revert "drm/nouveau/device/pci: set as non-CPU-coherent on ARM64"
This reverts commit 1733a2ad36.

There is apparently something amiss with the way the TTM code handles
DMA buffers, which the above commit was attempting to work around for
arm64 systems with non-coherent PCI. Unfortunately, this completely
breaks systems *with* coherent PCI (which appear to be the majority).

Booting a plain arm64 defconfig + CONFIG_DRM + CONFIG_DRM_NOUVEAU on
a machine with a PCI GPU having coherent dma_map_ops (in this case a
7600GT card plugged into an ARM Juno board) results in a fatal crash:

[    2.803438] nouveau 0000:06:00.0: DRM: allocated 1024x768 fb: 0x9000, bo ffffffc976141c00
[    2.897662] Unable to handle kernel NULL pointer dereference at virtual address 000001ac
[    2.897666] pgd = ffffff8008e00000
[    2.897675] [000001ac] *pgd=00000009ffffe003, *pud=00000009ffffe003, *pmd=0000000000000000
[    2.897680] Internal error: Oops: 96000045 [#1] PREEMPT SMP
[    2.897685] Modules linked in:
[    2.897692] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.6.0-rc5+ #543
[    2.897694] Hardware name: ARM Juno development board (r1) (DT)
[    2.897699] task: ffffffc9768a0000 ti: ffffffc9768a8000 task.ti: ffffffc9768a8000
[    2.897711] PC is at __memcpy+0x7c/0x180
[    2.897719] LR is at OUT_RINGp+0x34/0x70
[    2.897724] pc : [<ffffff80083465fc>] lr : [<ffffff800854248c>] pstate: 80000045
[    2.897726] sp : ffffffc9768ab360
[    2.897732] x29: ffffffc9768ab360 x28: 0000000000000001
[    2.897738] x27: ffffffc97624c000 x26: 0000000000000000
[    2.897744] x25: 0000000000000080 x24: 0000000000006c00
[    2.897749] x23: 0000000000000005 x22: ffffffc97624c010
[    2.897755] x21: 0000000000000004 x20: 0000000000000004
[    2.897761] x19: ffffffc9763da000 x18: ffffffc976b2491c
[    2.897766] x17: 0000000000000007 x16: 0000000000000006
[    2.897771] x15: 0000000000000001 x14: 0000000000000001
[    2.897777] x13: 0000000000e31b70 x12: ffffffc9768a0080
[    2.897783] x11: 0000000000000000 x10: fffffffffffffb00
[    2.897788] x9 : 0000000000000000 x8 : 0000000000000000
[    2.897793] x7 : 0000000000000000 x6 : 00000000000001ac
[    2.897799] x5 : 00000000ffffffff x4 : 0000000000000000
[    2.897804] x3 : 0000000000000010 x2 : 0000000000000010
[    2.897810] x1 : ffffffc97624c010 x0 : 00000000000001ac
...
[    2.898494] Call trace:
[    2.898499] Exception stack(0xffffffc9768ab1a0 to 0xffffffc9768ab2c0)
[    2.898506] b1a0: ffffffc9763da000 0000000000000004 ffffffc9768ab360 ffffff80083465fc
[    2.898513] b1c0: ffffffc976801e00 ffffffc9762b8000 ffffffc9768ab1f0 ffffff80080ec158
[    2.898520] b1e0: ffffffc9768ab230 ffffff8008496d04 ffffffc975ce6d80 ffffffc9768ab36e
[    2.898527] b200: ffffffc9768ab36f ffffffc9768ab29d ffffffc9768ab29e ffffffc9768a0000
[    2.898533] b220: ffffffc9768ab250 ffffff80080e70c0 ffffffc9768ab270 ffffff8008496e44
[    2.898540] b240: 00000000000001ac ffffffc97624c010 0000000000000010 0000000000000010
[    2.898546] b260: 0000000000000000 00000000ffffffff 00000000000001ac 0000000000000000
[    2.898552] b280: 0000000000000000 0000000000000000 fffffffffffffb00 0000000000000000
[    2.898558] b2a0: ffffffc9768a0080 0000000000e31b70 0000000000000001 0000000000000001
[    2.898566] [<ffffff80083465fc>] __memcpy+0x7c/0x180
[    2.898574] [<ffffff800853e164>] nv04_fbcon_imageblit+0x1d4/0x2e8
[    2.898582] [<ffffff800853d6d0>] nouveau_fbcon_imageblit+0xd8/0xe0
[    2.898591] [<ffffff80083c4db4>] soft_cursor+0x154/0x1d8
[    2.898598] [<ffffff80083c47b4>] bit_cursor+0x4fc/0x538
[    2.898605] [<ffffff80083c0cfc>] fbcon_cursor+0x134/0x1a8
[    2.898613] [<ffffff800841c280>] hide_cursor+0x38/0xa0
[    2.898620] [<ffffff800841d420>] redraw_screen+0x120/0x228
[    2.898628] [<ffffff80083bf268>] fbcon_prepare_logo+0x370/0x3f8
[    2.898635] [<ffffff80083bf640>] fbcon_init+0x350/0x560
[    2.898641] [<ffffff800841c634>] visual_init+0xac/0x108
[    2.898648] [<ffffff800841df14>] do_bind_con_driver+0x1c4/0x3a8
[    2.898655] [<ffffff800841e4f4>] do_take_over_console+0x174/0x1e8
[    2.898662] [<ffffff80083bf8c4>] do_fbcon_takeover+0x74/0x100
[    2.898669] [<ffffff80083c3e44>] fbcon_event_notify+0x8cc/0x920
[    2.898680] [<ffffff80080d7e38>] notifier_call_chain+0x50/0x90
[    2.898685] [<ffffff80080d8214>] __blocking_notifier_call_chain+0x4c/0x90
[    2.898691] [<ffffff80080d826c>] blocking_notifier_call_chain+0x14/0x20
[    2.898696] [<ffffff80083c5e1c>] fb_notifier_call_chain+0x1c/0x28
[    2.898703] [<ffffff80083c81ac>] register_framebuffer+0x1cc/0x2e0
[    2.898712] [<ffffff800845da80>] drm_fb_helper_initial_config+0x288/0x3e8
[    2.898719] [<ffffff800853da20>] nouveau_fbcon_init+0xe0/0x118
[    2.898727] [<ffffff800852d2f8>] nouveau_drm_load+0x268/0x890
[    2.898734] [<ffffff8008466e24>] drm_dev_register+0xbc/0xc8
[    2.898740] [<ffffff8008468a88>] drm_get_pci_dev+0xa0/0x180
[    2.898747] [<ffffff800852cb28>] nouveau_drm_probe+0x1a0/0x1e0
[    2.898755] [<ffffff80083a32e0>] pci_device_probe+0x98/0x110
[    2.898763] [<ffffff800858e434>] driver_probe_device+0x204/0x2b0
[    2.898770] [<ffffff800858e58c>] __driver_attach+0xac/0xb0
[    2.898777] [<ffffff800858c3e0>] bus_for_each_dev+0x60/0xa0
[    2.898783] [<ffffff800858dbc0>] driver_attach+0x20/0x28
[    2.898789] [<ffffff800858d7b0>] bus_add_driver+0x1d0/0x238
[    2.898796] [<ffffff800858ed50>] driver_register+0x60/0xf8
[    2.898802] [<ffffff80083a20dc>] __pci_register_driver+0x3c/0x48
[    2.898809] [<ffffff8008468eb4>] drm_pci_init+0xf4/0x120
[    2.898818] [<ffffff8008c56fc0>] nouveau_drm_init+0x21c/0x230
[    2.898825] [<ffffff80080829d4>] do_one_initcall+0x8c/0x190
[    2.898832] [<ffffff8008c31af4>] kernel_init_freeable+0x14c/0x1f0
[    2.898839] [<ffffff80088a0c20>] kernel_init+0x10/0x100
[    2.898845] [<ffffff8008085e10>] ret_from_fork+0x10/0x40
[    2.898853] Code: a88120c7 a8c12027 a88120c7 a8c12027 (a88120c7)
[    2.898871] ---[ end trace d5713dcad023ee04 ]---
[    2.898888] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b

In a toss-up between the GPU seeing stale data artefacts on some systems
vs. catastrophic kernel crashes on other systems, the latter would seem
to take precedence, so revert this change until the real underlying
problem can be fixed.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Alexandre Courbot <acourbot@nvidia.com>
[acourbot@nvidia.com: port to Nouveau tree, remove bits in lib/]
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Cc: stable@vger.kernel.org
2016-06-15 16:16:13 +10:00
Ben Skeggs 4691409b3e drm/nouveau/disp/sor/gm107: training pattern registers are like gm200
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Cc: stable@vger.kernel.org
2016-06-07 08:11:25 +10:00
Ben Skeggs a8953c52b9 drm/nouveau/disp/sor/gf119: both links use the same training register
It appears that, for whatever reason, both link A and B use the same
register to control the training pattern.  It's a little odd, as the
GPUs before this (Tesla/Fermi1) have per-link registers, as do newer
GPUs (Maxwell).

Fixes the third DP output on NVS 510 (GK107).

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Cc: stable@vger.kernel.org
2016-06-07 08:11:14 +10:00
Ben Skeggs 77154fd969 drm/nouveau/core: swap the order of imem/fb
Fixes a use-after-free reported by valgrind and KASAN.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-06-02 13:54:07 +10:00
Ben Skeggs f045f459d9 drm/nouveau/fbcon: fix out-of-bounds memory accesses
Reported by KASAN.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Cc: stable@vger.kernel.org
2016-06-02 13:53:44 +10:00
Ben Skeggs 383d0a419f drm/nouveau/gr/gf100-: update sm error decoding from gk20a nvgpu headers
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Cc: stable@vger.kernel.org
2016-06-02 13:53:41 +10:00
Ben Skeggs 9057c8d750 drm/nouveau/ltc/gm107-: fix typo in the address of NV_PLTCG_LTC0_LTS0_INTR
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Cc: stable@vger.kernel.org
2016-06-02 13:53:38 +10:00
Ben Skeggs bc9139d23f drm/nouveau/bios/disp: fix handling of "match any protocol" entries
As it turns out, a value of 0xff means "any protocol" and not "VGA".

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Cc: stable@vger.kernel.org
2016-06-02 13:53:30 +10:00
Dave Airlie d5fa33f284 Merge branch 'linux-4.7' of git://github.com/skeggsb/linux into drm-next
Nothing too exciting here, there's a larger chunk of work that still
needs more testing but not likely to get that done today - so - here's
the rest of it.  Assuming nothing else goes horribly wrong, I should be
able to send the rest Monday if it isn't too late....

Changes:
- Improvements to power sensor support
- Initial attempt at GM108 support
- Minor fixes to GR init + ucode
- Make use of topology information (provided by the GPU) in various
places, should at least fix some fault recovery issues and
engine/runlist mapping confusion on newer GPUs.

* 'linux-4.7' of git://github.com/skeggsb/linux: (51 commits)
  drm/nouveau/gr/gf100-: fix race condition in fecs/gpccs ucode
  drm/nouveau/core: recognise GM108 chipsets
  drm/nouveau/gr/gm107-: fix touching non-existent ppcs in attrib cb setup
  drm/nouveau/gr/gk104-: share implementation of ppc exception init
  drm/nouveau/gr/gk104-: move rop_active_fbps init to nonctx
  drm/nouveau/bios/pll: check BIT table version before trying to parse it
  drm/nouveau/bios/pll: prevent oops when limits table can't be parsed
  drm/nouveau/volt/gk104: round up in gk104_volt_set
  drm/nouveau/fb/gm200: setup mmu debug buffer registers at init()
  drm/nouveau/fb/gk20a,gm20b: setup mmu debug buffer registers at init()
  drm/nouveau/fb/gf100-: allocate mmu debug buffers
  drm/nouveau/fb: allow chipset-specific actions for oneinit()
  drm/nouveau/gr/gm200-: fix bad hardcoding of a max-tpcs-per-gpc value
  drm/nouveau/gr/gm200-: rop count == ltc count
  drm/nouveau/gr/gm200: modify the mask when copying mmu settings from fb
  drm/nouveau/gr/gm200: move some code into init_gpc_mmu() hook
  drm/nouveau/gr/gm200: make generate_main() static
  drm/nouveau/gr/gf100-: abstract fetching rop count
  drm/nouveau/gr/gf100-: rename magic_not_rop_nr to screen_tile_row_offset
  drm/nouveau/gr/gf100-: remove hardcoded idle_timeout values
  ...
2016-05-21 06:12:13 +10:00
Ben Skeggs ca79e49d6a drm/nouveau/gr/gf100-: fix race condition in fecs/gpccs ucode
This is a simplied version of the fix by Roy in fdo#93629.  While this
doesn't appear to fix the issues for the users in that report, it's a
real issue that deserves to be resolved.

Reported-by: Roy Spliet <rspliet@eclipso.eu>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-05-20 14:43:04 +10:00
Ben Skeggs f9e2029443 drm/nouveau/core: recognise GM108 chipsets
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-05-20 14:43:04 +10:00
Ben Skeggs 00f50c662c drm/nouveau/gr/gm107-: fix touching non-existent ppcs in attrib cb setup
Also removes an XXX; according to nvgpu headers the field is called
NV_PGRAPH_GPCS_SWDX_TC_BETA_CB_SIZE_DIV3, so, apparently not some
magic we need to figure out :)

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-05-20 14:43:04 +10:00
Ben Skeggs a00ecf2212 drm/nouveau/gr/gk104-: share implementation of ppc exception init
This was really inconsistent, some implementations could touch PPCs
that didn't exist, others neglected to touch ones that did.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-05-20 14:43:04 +10:00
Ben Skeggs 87ac331e3f drm/nouveau/gr/gk104-: move rop_active_fbps init to nonctx
Matches newer RM.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-05-20 14:43:04 +10:00
Ben Skeggs 4d3df19a8e drm/nouveau/bios/pll: check BIT table version before trying to parse it
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-05-20 14:43:04 +10:00
Ben Skeggs 2781c928b1 drm/nouveau/bios/pll: prevent oops when limits table can't be parsed
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-05-20 14:43:04 +10:00
Karol Herbst d07a97e939 drm/nouveau/volt/gk104: round up in gk104_volt_set
We always want a equal or higher voltage than the requested ones, otherwise
nouveau undervolts.

Signed-off-by: Karol Herbst <nouveau@karolherbst.de>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-05-20 14:43:04 +10:00
Ben Skeggs e976278ad2 drm/nouveau/fb/gm200: setup mmu debug buffer registers at init()
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-05-20 14:43:04 +10:00
Ben Skeggs 834b21f5e9 drm/nouveau/fb/gk20a,gm20b: setup mmu debug buffer registers at init()
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-05-20 14:43:04 +10:00
Ben Skeggs 99c5917253 drm/nouveau/fb/gf100-: allocate mmu debug buffers
Later chipsets require setting this up both in FB and GR, so let's just
move the allocation to FB.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-05-20 14:43:04 +10:00
Ben Skeggs 917d95a86e drm/nouveau/fb: allow chipset-specific actions for oneinit()
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-05-20 14:43:04 +10:00
Ben Skeggs 06d4f26cc3 drm/nouveau/gr/gm200-: fix bad hardcoding of a max-tpcs-per-gpc value
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-05-20 14:43:04 +10:00
Ben Skeggs 734a0aa669 drm/nouveau/gr/gm200-: rop count == ltc count
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-05-20 14:43:04 +10:00
Ben Skeggs c83e7d6836 drm/nouveau/gr/gm200: modify the mask when copying mmu settings from fb
Appears to more closely match what RM does.

For GM20B, now also copying bit 12 from NV_PFB_MMU_CTRL as upcoming
changes will require it.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-05-20 14:43:04 +10:00
Ben Skeggs 54aa38a8ad drm/nouveau/gr/gm200: move some code into init_gpc_mmu() hook
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-05-20 14:43:04 +10:00
Ben Skeggs 560e6da267 drm/nouveau/gr/gm200: make generate_main() static
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-05-20 14:43:04 +10:00
Ben Skeggs 64cb5a31f4 drm/nouveau/gr/gf100-: abstract fetching rop count
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-05-20 14:43:04 +10:00
Ben Skeggs 5ec3def735 drm/nouveau/gr/gf100-: rename magic_not_rop_nr to screen_tile_row_offset
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-05-20 14:43:04 +10:00
Ben Skeggs 933ad44594 drm/nouveau/gr/gf100-: remove hardcoded idle_timeout values
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-05-20 14:43:04 +10:00
Ben Skeggs 0cdc3fdfb7 drm/nouveau/fifo/gm107-: remove engines from mmu engine mapping array
These are specified by PTOP on Maxwell GPUs.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-05-20 14:43:04 +10:00
Ben Skeggs 289e082706 drm/nouveau/fifo/gk104-: identify mmu engine ids for host faults
It appears these don't map to PBDMAs (at least on Kepler, it may or may
be valid for Fermi - this hasn't been checked), but to runlists.

This drops the NVKM_ENGINE_FIFO data from the entries too, as resetting
all of PFIFO is *not* the way to handle such faults.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-05-20 14:43:04 +10:00
Ben Skeggs e50d0237fc drm/nouveau/fifo/gk104-: implement support for PTOP fault info
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-05-20 14:43:04 +10:00
Ben Skeggs 91419acf78 drm/nouveau/fifo/gk104-: abstract mmu fault data structures
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-05-20 14:43:04 +10:00
Ben Skeggs 98ac3f061a drm/nouveau/fifo/gk104-: subclass func
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-05-20 14:43:04 +10:00
Ben Skeggs e93e198d46 drm/nouveau/fifo/gk104-: use device info from top subdev
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-05-20 14:43:04 +10:00
Ben Skeggs 56d06fa29e drm/nouveau/core: remove pmc_enable argument from subdev ctor
These are now specified directly in the MC subdev.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-05-20 14:43:04 +10:00
Ben Skeggs d85e2a8dd8 drm/nouveau/mc/nv04: define reset masks + intr cleanup
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-05-20 14:43:04 +10:00
Ben Skeggs 667e99ab23 drm/nouveau/mc/nv11: define reset masks + intr cleanup
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-05-20 14:43:04 +10:00
Ben Skeggs 79360b7d5f drm/nouveau/mc/nv17: define reset masks + intr cleanup
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-05-20 14:43:04 +10:00
Ben Skeggs 9199fbdbf8 drm/nouveau/mc/nv50: define reset masks + intr cleanup
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-05-20 14:43:04 +10:00
Ben Skeggs 7354902001 drm/nouveau/mc/g84: define reset masks + intr cleanup
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-05-20 14:43:04 +10:00
Ben Skeggs e56f90fe17 drm/nouveau/mc/g98: define reset masks + intr cleanup
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-05-20 14:43:04 +10:00
Ben Skeggs 88c0de2cdb drm/nouveau/mc/gt215: define reset masks + intr cleanup
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-05-20 14:43:04 +10:00
Ben Skeggs a6bb38e902 drm/nouveau/mc/gf100: define reset masks + intr cleanup
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-05-20 14:43:04 +10:00
Ben Skeggs 33537d6fdc drm/nouveau/mc/gk104: define reset masks + intr cleanup
Engine fields have been removed, as they're specified by PTOP.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-05-20 14:43:04 +10:00
Ben Skeggs 921be10d85 drm/nouveau/mc: implement support for PTOP interrupt routing
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-05-20 14:43:04 +10:00
Ben Skeggs 583f8e4ea2 drm/nouveau/mc: implement support for PTOP reset info
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-05-20 14:43:04 +10:00
Ben Skeggs 70b01f07db drm/nouveau/mc: allow for local definition of reset bits
With the addition of PTOP-specified reset bits, it makes more sense to
move the definitions here rather than in individual subdev
implementations.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-05-20 14:43:04 +10:00